Decoupled Rendering


The documentation system tenet of decoupled rendering states that the documentation system should generate documentation in multiple formats automatically, from a single canonical source, without forcing authors to reauthor documents for each format. This tenet supports the principle of truth proximity by promoting the generation of documents from a single canonical source.

Decoupled rendering involves adapting the content to fit each medium. In this example, hyperlinks are converted to page references when the Customer document view is rendered as PDF.

While a tenet like this could be expressed as something more mundane such as ‘content/layout separation’, this would confuse the means with the end. It is most definitely useful to use document formats that separate content structure from its presentation, but the ultimate goal is to be able to treat rendering as a discrete stage which produces—as required—multiple document file formats. For example:

  1. HTML
  2. PDF
  3. E-Book
  4. Legacy Word Processor Format

The point of decoupled rendering is not the thoughtless generation of the target file format in the quickest way possible, but embracing the advantages, capabilities, and best practices inherent to each format.
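
To make the idea of rendering as a discrete stage more concrete, here is a minimal Python sketch. The block model (Heading, Paragraph), the renderer functions, and the RENDERERS registry are all hypothetical names invented for illustration; a real documentation system would have a far richer model. The point is only that one canonical source feeds several format-specific renderers, each applying its own conventions such as escaping rules.

```python
from dataclasses import dataclass
from html import escape
from typing import List, Union


@dataclass
class Heading:
    text: str


@dataclass
class Paragraph:
    text: str


# Hypothetical canonical source: a format-neutral list of blocks.
Document = List[Union[Heading, Paragraph]]

# Characters that have a special meaning in LaTeX and must be escaped.
_LATEX_SPECIALS = str.maketrans({
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
})


def render_html(doc: Document) -> str:
    """Render the blocks as an HTML fragment, using semantic tags."""
    parts = []
    for block in doc:
        tag = "h1" if isinstance(block, Heading) else "p"
        parts.append(f"<{tag}>{escape(block.text)}</{tag}>")
    return "\n".join(parts)


def render_latex(doc: Document) -> str:
    """Render the blocks as a LaTeX fragment (a paper-oriented target)."""
    parts = []
    for block in doc:
        text = block.text.translate(_LATEX_SPECIALS)
        if isinstance(block, Heading):
            parts.append(r"\section{" + text + "}")
        else:
            parts.append(text)
    return "\n\n".join(parts)


# The rendering stage: one entry per target format.
RENDERERS = {"html": render_html, "latex": render_latex}


def render(doc: Document, fmt: str) -> str:
    return RENDERERS[fmt](doc)


doc = [Heading("Setup & Use"), Paragraph("100% done")]
print(render(doc, "html"))
print(render(doc, "latex"))
```

Note that the author writes "Setup & Use" once; each renderer decides how the ampersand must be expressed in its own medium.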

Let’s now spend some time looking at the peculiarities of some of the most widespread file formats. 

HTML

In a contemporary documentation system, we assume that HTML is the primary, or ‘default’, format. All documentation is rendered (though not necessarily authored) in HTML by default. In fact, we use the term ‘documentation platform’ to refer to the primary web-based application in a documentation system.

However, the fact that our documentation platform renders its content as HTML doesn’t mean that we necessarily have much control over its HTML rendering process. For example:

  1. We may be limited to setting a color palette on a global template rather than having tight control of each CSS class or referenceable identifier.
  2. We may not be able to control which HTML elements are used for each logical document component. For example, headings may be rendered using a tag such as <div class="heading-1"> rather than <h1>. 
  3. The online documentation system may use HTML only for its own display purposes, without advanced capabilities to generate offline HTML files that can be browsed without the aid of an application server, or that can be restructured for a different base domain name and/or specific directory structure. 
  4. The online documentation system may be unable to generate consistent relative links, so HREF references need to be preprocessed before the HTML can be used for other purposes.
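
When the platform's HTML output cannot be controlled at the source, limitations such as items 2 and 4 above can sometimes be worked around with a post-processing step. The following Python sketch is only an illustration under several assumptions: the function names are hypothetical, the heading markup is assumed to look exactly like the <div class="heading-1"> example with no nested div elements, and the site is assumed to live at a single base URL (docs.example.com is a placeholder). Regular expressions are fragile on real-world HTML, so a proper HTML parser would be a safer choice in production.

```python
import posixpath
import re
from urllib.parse import urlsplit

# Assumes headings are emitted exactly as <div class="heading-N">...</div>.
HEADING_DIV = re.compile(r'<div class="heading-([1-6])">(.*?)</div>', re.DOTALL)
HREF = re.compile(r'href="([^"]*)"')


def normalize_headings(html: str) -> str:
    """Replace <div class="heading-N"> with the semantic <hN> element."""
    return HEADING_DIV.sub(r"<h\1>\2</h\1>", html)


def relativize_links(html: str, base_url: str, page_path: str) -> str:
    """Rewrite absolute links under base_url as links relative to page_path.

    page_path is the location of the current page inside the site,
    for example "guide/install.html".
    """
    base = urlsplit(base_url)
    page_dir = posixpath.dirname(page_path) or "."

    def rewrite(match: re.Match) -> str:
        target = urlsplit(match.group(1))
        if target.netloc != base.netloc:
            return match.group(0)  # external or already relative: keep as-is
        # Assumption: a link to the site root means "index.html".
        site_path = target.path.lstrip("/") or "index.html"
        rel = posixpath.relpath(site_path, start=page_dir)
        if target.query:
            rel += "?" + target.query
        if target.fragment:
            rel += "#" + target.fragment
        return f'href="{rel}"'

    return HREF.sub(rewrite, html)


page = (
    '<div class="heading-1">Install</div>'
    '<a href="https://docs.example.com/guide/config.html#tls">Config</a>'
    '<a href="https://docs.example.com/index.html">Home</a>'
)
page = normalize_headings(page)
page = relativize_links(page, "https://docs.example.com/", "guide/install.html")
print(page)
```

After this step the page uses a real <h1> element and its internal links no longer depend on the original domain name, which is what makes the files browsable offline or relocatable to a different directory structure.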

We may also want to apply responsive web design and accessibility best practices which may not be directly supported by our documentation platform.

PDF

Portable Document Format (PDF) is the de facto file format for documents whose ultimate destination is paper. However, most PDF documents are read directly on screen and never printed, which is good for the environment, although PDF is not always the ideal format for on-screen reading.

PDF is an interoperable file format which can be opened on Windows, Linux, macOS, mobile phones and pretty much any modern computing device, without the need to install third-party software. As just mentioned, users and enterprises use this format for a variety of reasons which might be unrelated to the explicit choice of the paper medium:

  • Making document files read-only, preventing tampering.
  • Including designated areas that the user can fill or sign.
  • Hiding the authoring tool, making the document look more professional, as if designed by an agency—as opposed to a ‘homebrew’ Microsoft Office document.
  • Including any typefaces used by a document so that it does not look different if the user is missing one of the used typefaces.
  • Including vector images natively, allowing scaling without pixelation or loss of resolution.
  • Sharing information ‘offline’ in contexts in which access to the internet is either restricted, or simply not available—a remote island, a plane, etc.

Simply converting an HTML document to PDF produces unprofessional results at best and a broken document at worst. Instead, we may need to translate the documents into a format whose native medium is paper (such as LaTeX) as an intermediate step, rather than ‘printing’ a web page to a PDF file directly. Why go through this trouble? Because paper works differently from computer screens:

  • Links: A link in HTML is interactive. A ‘link’ on paper is a written reference to a chapter, section, or page. A table of contents, for example, needs to point to the page number on which each section is found. Links in PDF files are actually quite useful, given that PDFs are nearly always browsed on computers, but they should lead to the correct page on which the relevant content is found.
  • Text layout: Text in a paper document assumes a fixed paper size. In professional-looking documents, paragraphs should ideally be justified (aligned to both margins) and use a font size large enough that each line contains fewer than 80 characters. A common telltale sign of ‘cheaply’ converted documents is that the font is illegible or that the text does not wrap tidily between the margins and instead overflows to either side.
  • Cover Page(s): Paper documents have the notion of a first page which acts as a cover, which usually contains the document’s title, date of publication, and author’s name.
  • Headers and Footers: Paper documents don’t have a web browser surrounding them so it is necessary to include ‘navigation’ information using the headers and footers to provide the reader with spatial context: the current page number, the name of the current section, and so on.
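
The link handling described in the first bullet can be sketched as a small translation step that runs before the paper-oriented render. The Python example below is illustrative only: it assumes, purely for the sake of the sketch, that the canonical source writes links in a Markdown-like form, [text](#anchor) for internal links and [text](url) for external ones, and that the LaTeX target loads the hyperref package and defines a matching \label for each anchor. The function name is hypothetical, and escaping of LaTeX special characters in the link text is deliberately left out.

```python
import re

# Assumed source syntax: [link text](target), with no nested brackets.
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def links_to_latex(text: str) -> str:
    """Turn source links into LaTeX links that also cite a page number."""

    def convert(match: re.Match) -> str:
        label, target = match.group(1), match.group(2)
        if target.startswith("#"):
            anchor = target[1:]
            # Clickable on screen (hyperref) and still useful when printed
            # (the page number is resolved by LaTeX at build time).
            return rf"\hyperref[{anchor}]{{{label}}} (page~\pageref{{{anchor}}})"
        # External link: keep it clickable for on-screen readers.
        return rf"\href{{{target}}}{{{label}}}"

    return LINK.sub(convert, text)


print(links_to_latex("See [Installation](#install) for details."))
```

Because LaTeX resolves \pageref during typesetting, the page numbers stay correct even when the content is reflowed for a different paper size.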

Paper-specific styling can also be expressed in CSS itself using @media print rules, but these differ from the rules that apply to screen-bound use cases. In other words, screen and printer render targets require separate considerations and strategies.

E-Book

It is a pity that most enterprises assume that PDF suffices as the sole ‘offline’ documentation file format. Given that most PDFs are never printed, e-books offer several advantages over PDF for offline use cases:

  • Paper-like experience without killing trees.
  • Accessibility: e-book readers allow changing the font size, the typeface, as well as the color palette, for a tailored and accessible reading experience.
  • Conversion ease: e-books are easier to create than PDFs given that their internal format works similarly to HTML; text flows dynamically depending on the device’s screen size and the user’s font and layout preferences. 
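
To illustrate the last point, the Python sketch below packages a single XHTML chapter as an EPUB file, on the assumption that EPUB 3 is the e-book format in question. An EPUB is essentially a ZIP archive of XHTML files plus a few small XML descriptors. The function name, the file layout under OEBPS/, the placeholder identifier, and the fixed modification timestamp are all assumptions made for the example, and the output has not been validated with a tool such as EPUBCheck.

```python
import io
import zipfile
from html import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub(title: str, body_xhtml: str,
               identifier: str = "urn:uuid:00000000-0000-0000-0000-000000000000") -> bytes:
    """Return the bytes of a one-chapter EPUB. body_xhtml must be valid XHTML."""
    safe_title = escape(title)
    chapter = f"""<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>{safe_title}</title></head>
  <body><h1>{safe_title}</h1>{body_xhtml}</body>
</html>
"""
    nav = f"""<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body><nav epub:type="toc"><ol>
    <li><a href="chapter.xhtml">{safe_title}</a></li>
  </ol></nav></body>
</html>
"""
    # Package document: metadata, manifest of files, and reading order (spine).
    # The dcterms:modified value is a placeholder, not a real timestamp.
    opf = f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">{escape(identifier)}</dc:identifier>
    <dc:title>{safe_title}</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2024-01-01T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="chapter" href="chapter.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter"/>
  </spine>
</package>
"""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as epub:
        # The mimetype entry must come first and must be stored uncompressed.
        epub.writestr("mimetype", "application/epub+zip",
                      compress_type=zipfile.ZIP_STORED)
        epub.writestr("META-INF/container.xml", CONTAINER_XML)
        epub.writestr("OEBPS/content.opf", opf)
        epub.writestr("OEBPS/nav.xhtml", nav)
        epub.writestr("OEBPS/chapter.xhtml", chapter)
    return buffer.getvalue()


with open("guide.epub", "wb") as handle:
    handle.write(build_epub("User Guide", "<p>Hello, reader.</p>"))
```

Notice that the chapter contains no page sizes, margins, or font sizes; those decisions are left to the reading device, which is what allows the text to reflow.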

Word Processor Format

Popular word processors include Microsoft Word, Google Docs, and OpenOffice Writer. The problem with their formats, unlike HTML, PDF, and e-books, is that they are authoring formats that entangle content and presentation properties, making them inconvenient to process programmatically.

Such formats may still be necessary because a legacy business or engineering process demands documents in them, or simply because they can serve as an offline, editable ‘backup’ version of online documents, unlike PDFs, in which copy and paste does not always work consistently.


© 2022-2024 Ernesto Garbarino | Contact me at ernesto@garba.org