Visiting www.ttypp.nl/arch/ is a time machine. This is where the webpages of TYP/Typografisch papier are archived. TYP was a forward-thinking magazine initiated by Dutch graphic designer Max Kisman, published on paper, on floppy disks and online. Surfing through the online editions, one is transported twenty years back, to a period when the World Wide Web had only just begun to arrive on people’s computers. The enthusiasm displayed by designers looking for ways to exploit the new medium is contagious. In the transient space of the internet, it seems like a small miracle that this cultural moment is still accessible.
Surfing around websites from the nineties on the Internet Archive, one notices that quite a few no longer display, among them many sites built with technologies like Shockwave or Flash. Websites using HTML tags, the native tongue of the browser, have generally aged much better. This is no accident: backwards compatibility has always been important to web browser vendors. Since browser vendors have so little control over the markup people write, browsers are very forgiving about what they accept. With HTML5, this tradition has been codified: the parsing of malformed or non-conforming HTML has been standardised.1
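The same leniency is easy to demonstrate outside the browser. Here is a minimal sketch, using Python’s standard-library HTML parser, of how a parser in the HTML tradition recovers from decidedly non-conforming markup rather than rejecting it:

    from html.parser import HTMLParser

    class TagLogger(HTMLParser):
        # Print each tag the parser manages to recognise.
        def handle_starttag(self, tag, attrs):
            print("open:", tag)
        def handle_endtag(self, tag):
            print("close:", tag)

    # Uppercase tags, unquoted attributes, unclosed <p> and <b>:
    # none of this is conforming, yet it parses without complaint.
    TagLogger().feed("<BODY BGCOLOR=black><p>Hello <b>world")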
In recent years there has been a return to creating solutions built on static HTML files, because hosting HTML files is easier, cheaper and more secure than hosting a dynamic system. Since the website does not need to store information in a database, the ‘attack surface’ of a website hosting static files is much smaller. The website does not need to expose an editing interface that can be hacked. Also, a dynamic system will have to be kept up to date to fix known security holes. No such maintenance is needed for static HTML files, and because they do not use any specific capacities of the server, the cheapest hosting solution will generally suffice.
For many of the first websites, HTML was not just the format in which they were delivered; it was the format in which they came about. The first websites were ‘hand-crafted HTML’: created as a series of HTML pages, with occasional updates (the person designing the site might then charge for each update!). This did not mean writing code was necessary: tools like Adobe Dreamweaver provided both a visual view and a code view. The democratisation of Content Management Systems (CMS) like WordPress and Joomla changed the equation. In these systems, a general design is encoded into a template, and the contents of individual pages are stored in a database that is easily editable by the user. For clients this saves time and money. The downside is that a CMS requires shoehorning every page into a template: those early HTML pages offered much more freedom in this respect, as potentially every page could be modified and adapted to the designer’s whims.
This suggests that HTML has qualities beyond making it the right format for delivering and archiving websites: HTML files also provide a very powerful authoring format. The logic of CMS’s (and indeed, the intended logic of CSS) is to pull form and content apart. Yet traditionally, the intelligence of designers has resided in creating links between form and content. This is what working in separate HTML files enables: moving beyond the template, and allowing authors and designers to modify the design of each specific page.
If such an approach is to be viable today, new tools will have to be developed. With tools like Dreamweaver having fallen from grace, the only tool we have left to edit HTML files seems to be the code editor. Yet the popularity of database-driven CMS’s stems from the fact that they can provide different interfaces for the different people involved in creating a website. A developer might need a specific view, an editor might require a specific angle, as might a designer, even if one person combines these roles. New tools will have to provide different views for editing the HTML document. Instead of generating the HTML, as conventional CMS’s do, these tools should work on the files themselves.
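What such a view could look like is easy to prototype, precisely because the source is plain HTML. The following is a hedged sketch, not an existing tool: it reads an HTML file (the name page.html is an assumption) and presents an editor’s view of the document, an outline of its headings, while the developer keeps the code view of the very same file.

    from html.parser import HTMLParser

    class OutlineView(HTMLParser):
        # An 'editorial' view: show only the document's headings,
        # indented by their level, and ignore everything else.
        LEVELS = {"h1": 0, "h2": 1, "h3": 2}

        def __init__(self):
            super().__init__()
            self.level = None

        def handle_starttag(self, tag, attrs):
            if tag in self.LEVELS:
                self.level = self.LEVELS[tag]

        def handle_endtag(self, tag):
            if tag in self.LEVELS:
                self.level = None

        def handle_data(self, data):
            if self.level is not None and data.strip():
                print("  " * self.level + data.strip())

    with open("page.html", encoding="utf-8") as f:
        OutlineView().feed(f.read())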
Even if there is something of a revival of HTML-based websites, these are often built with tools that do not exploit the full authorial potential of HTML. At the time of writing, some 398 ‘static site generators’ were listed on the staticsitegenerators.net registry. These generators suffer from some of the same drawbacks as conventional CMS’s: they often work with a template through which all content has to be pushed. The advantage, then, is that such tools are always well equipped to generate indexes. How to syndicate, index and provide navigation for a collection of static HTML files: that is the second challenge for HTML as an authoring format.
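The core of that challenge is small enough to sketch. The following is a hedged illustration, not one of the listed generators: it walks a directory of HTML files (the directory name site/ is an assumption), extracts each page’s <title>, and writes an index page linking them together.

    import pathlib
    from html.parser import HTMLParser

    class TitleFinder(HTMLParser):
        # Collect the text inside the page's <title> element.
        def __init__(self):
            super().__init__()
            self.in_title = False
            self.title = ""

        def handle_starttag(self, tag, attrs):
            if tag == "title":
                self.in_title = True

        def handle_endtag(self, tag):
            if tag == "title":
                self.in_title = False

        def handle_data(self, data):
            if self.in_title:
                self.title += data

    links = []
    for page in sorted(pathlib.Path("site").glob("*.html")):
        if page.name == "index.html":
            continue  # do not index the index itself
        finder = TitleFinder()
        finder.feed(page.read_text(encoding="utf-8"))
        title = finder.title.strip() or page.name
        links.append('<li><a href="%s">%s</a></li>' % (page.name, title))

    index = "<!DOCTYPE html>\n<title>Index</title>\n<ul>\n%s\n</ul>" % "\n".join(links)
    pathlib.Path("site/index.html").write_text(index, encoding="utf-8")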
Having HTML files as the source upon which tools operate also has implications for interoperability. The workflows for creating websites today show an unprecedented fragmentation. Every web project has its own toolchain: the development team picks a back-end language, a database and a specific set of front-end ‘frameworks’ and ‘pre-processors’ with which to implement the design. This means the content will not be encoded in HTML, but in some abstraction of HTML specific to the project. Fashions change quickly, and getting up to speed with the technologies used within a specific project can prove daunting. Skills do not transfer easily, and tools developed for a project will work for that project only.
This is a pressing issue in digital publishing. Whereas traditional publishing workflows are often built around bespoke XML formats, a new generation of technologists is discovering the flexibility of re-purposing web technologies for creating hybrid publications: available as websites, ePubs and printable PDFs. Currently, many parties each build their own database-driven solution. The design will be encoded in a custom template format. The text will be encoded in a custom markup format, often based on Markdown but never exactly the same. This makes it hard for third-party service providers to interact with these systems, which in turn makes it harder for an economy to form around digital publishing.
There are counter-examples. Atlas, the platform for creating technical publications built by the American publisher O’Reilly, goes a long way towards basing a workflow on de facto standards. The source is a series of HTML files, stored in a Git repository.2 Atlas provides a visual editor for the HTML, and a review and editing workflow built on top of Git’s built-in capacity to split and merge parallel versions of files (branches). Because the solution is built upon Git and the file system, one can use any other program to deal with the HTML files. The only downside to Atlas’ approach is that it deals with snippets of HTML rather than complete files. These snippets will not display correctly by themselves, and they will not validate unless one wraps them in some boilerplate code.
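That boilerplate is trivial to supply, which is why this downside remains a small one. As a hedged sketch (the wrapper below is my own minimal example, not Atlas’ actual template), a snippet can be promoted to a complete, valid document programmatically:

    # Wrap an HTML snippet in the minimal boilerplate of a complete page.
    BOILERPLATE = """<!DOCTYPE html>
    <html lang="en">
    <head>
    <meta charset="utf-8">
    <title>%(title)s</title>
    </head>
    <body>
    %(body)s
    </body>
    </html>"""

    def wrap_snippet(snippet, title="Untitled"):
        return BOILERPLATE % {"title": title, "body": snippet}

    print(wrap_snippet("<h1>Chapter 1</h1>\n<p>It begins.</p>", title="Chapter 1"))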
Another interesting product is the OERPUB Textbook Editor, which has the advantage of being fully Open Source. It embraces the ePub standard. In most workflows, ePubs are generated artifacts, but in this case they are the source. This makes sense, since at their basis ePubs are collections of XHTML5 files with a mandatory set of metadata.3 This means that ePubs as a source format combine the expressiveness of HTML with some of the rigour demanded by conventional publishing workflows. The OERPUB editor requires an ePub to be stored in a Git repository, available on GitHub. The editor reads in the ePub and allows one to edit the text with a WYSIWYG editor, create chapters and sections, and edit metadata. Because the file structure is fully standard, existing tools like the IDPF EpubCheck validation software can easily be used alongside the editor.
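Because the format is standard, even the standard library of a general-purpose language can look inside an ePub. A minimal sketch (the file name book.epub is an assumption): an ePub is a zip archive that must contain META-INF/container.xml, which points to the package document holding the metadata; the content itself is a collection of XHTML files.

    import zipfile

    with zipfile.ZipFile("book.epub") as epub:
        # The mandatory container.xml points to the package document
        # (the .opf file) that holds the publication's metadata.
        print(epub.read("META-INF/container.xml").decode("utf-8"))
        # The content itself is a collection of (X)HTML files.
        for name in epub.namelist():
            if name.endswith((".xhtml", ".html")):
                print(name)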
A decoupling of tool and content can be beneficial for the ecosystem of digital publishing. Having the content stored in a more standardised way allows tools to be more specialised. This has multiple advantages. First, standardisation will make it easier for newcomers to move between projects, lowering the barrier to entry and making it easier for a more diverse set of practitioners to enter a field that currently seems technocratic. At the same time, tools can become more specialised, allowing new tools to be developed more rapidly and allowing experienced practitioners to focus on one aspect of the craft of digital publishing, pushing the state of the art in that area. Freed from the obligation to provide a monolithic solution that handles publishing from start to finish, service providers will be able to focus on the part of the chain where they see their added value.
What the preceding solutions have in common is that they separate the editing tool from the content being edited, and that they build on existing, well-known technologies. Even Git, the technologically advanced software chosen by both projects to store and exchange the projects’ contents, itself re-uses the proven abstraction of the file system: low-tech but ubiquitous.
It is in this sense that I think that to imagine the future of digital publishing, we can take inspiration from the past, and more precisely from the cultural moment so lovingly described by Olia Lialina and Dragan Espenschied in the ‘Digital Folklore Reader’. In the late 1990s, at the height of the dot-com bubble, the nature of online publishing was of course shaped by companies backed by billions in venture capital. Yet it was shaped at least in equal measure by passionate amateurs like the denizens of GeoCities: the true authors of the vernacular of the internet. The situation also provided ample opportunity for curious professionals like Max Kisman to experiment with and re-imagine the new medium. What levelled the playing field for all involved was that there was a lingua franca: HTML. Browsers could display the source code for a page, and authors could learn by copy-pasting. I think our success in imagining new forms of digital publishing depends on whether we can enable such a hands-on approach, and envision solutions that allow for such bottom-up creativity. Taking a cue from the web’s formative moment, we are gonna publish like it is 1999.
Some misplaced puritanism has caused HTML standards writers and browser vendors to remove the blink tag. This might have to do with a narrative in which GeoCities-style, amateur-driven web design had created a chaos from which we all had to be saved by standards-loving professionals. In this sense, the blink tag becomes a pars pro toto for an approach to web design built on Comic Sans and MIDI files, one that ‘professional’ web users suspect they can kill off by sacrificing <blink>. But it stands as a curious omission in a technology that has otherwise been remarkably caring of its past.↩
Git is software for dealing with multiple versions of files: it tracks the versions and allows users to merge their changed files. Initially available only as a command-line utility, it is starting to be built into content creation tools.↩
XHTML5 is HTML5 with additional restrictions to make sure the HTML is also valid XML, XML being a more generic standard for markup languages. This makes it possible to re-use tools developed for XML on the HTML.↩
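For example (a minimal sketch; the file name chapter1.xhtml is an assumption), a generic XML parser from Python’s standard library can read an XHTML5 file directly:

    import xml.etree.ElementTree as ET

    # Works because XHTML5 is well-formed XML; ordinary HTML5 would not parse.
    tree = ET.parse("chapter1.xhtml")
    print(tree.getroot().tag)  # '{http://www.w3.org/1999/xhtml}html'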