Composability

The documentation system tenet of composability states that most documentation should be produced by composing existing documents, rather than by starting new documents from scratch. This tenet directly supports the principle of truth proximity by providing the necessary technology support to avoid copy-pasting and retyping. It also supports the principle of shared responsibility by allowing the breakdown of a large document into multiple document parts, so that each part can be worked on independently.

The notion of composite documents is decades old. In fact, it is one of the central pillars of Ted Nelson’s Xanadu project (1987):

“Another form of text that is becoming increasingly important is compound text, where materials are viewed and combined with others. (This term too has recently become common.) A good way of visualizing this is as a set of windows to original materials from the compound texts themselves.”

It is bizarre that we have had subroutines in programming languages for over 70 years—so that we can reuse existing code—but the way in which many enterprises write documents is by doing so from scratch every single time.

In this respect, a text document is no different from a computer program. There is no reason why a document should be produced from scratch. The first thing you see at the top of a computer program is usually a set of imports: directives to include code from other files. Most documents should work in the same way, unless they are one-off pieces of writing like memos, communiqués, and so on.

Some may argue that composability may be achieved through templating, and most documentation systems—even Microsoft Word—support some form of templating. In its most basic implementation, templating allows separating the content from its presentation, so in this case it is definitely not a means for composability. A similar, but better approach is that taken by static site generators, or content management systems, in which templates assume that a document view will be composed of a number of document parts. Said document parts may also be templates which in turn include other document parts. This approach is a bit closer to this tenet’s spirit.

Yet, conventional ‘advanced’ templating is still a rigid and inaccessible mechanism. It is rigid in the sense that templates are predefined in advance and document parts must fit the predefined template’s compartments. It is inaccessible because there is often a ‘church and state’ separation between regular content and template ‘code’. This is most definitely the case in static site generators in which the content is in the form of Markdown files while the templates are in HTML, each sitting in different folders.

As understood by visionaries like Engelbart (1993), composability is meant to be implemented as a form of linking, rather than templating, in such a way that links “may specify views so traversal retrieves the destination object with a prespecified presentation view (e.g., display as a high-level outline or display only a particular statement).” Engelbart expected such a feature to become a “natural and constantly employed part of a user’s vocabulary.”

While this tenet is not prescriptive about how composability is achieved (traditional templating mechanisms may still apply), we identify two broad capabilities: intra-document composability and inter-document composability.

Intra-document composability: the composition of documents from the perspective of one composite document by using compositional directives.
Inter-document composability: the composition of documents implemented as a process—e.g., a DocOps pipeline in a CI/CD platform.

Let’s now look at each of them, in reference to the example presented in Figure TBC-Composability.

Intra-document Composability

Intra-document composability is the ability to include other documents within an existing document. Documents that include other documents should be indistinguishable from those that do not. We typically use the term document part when referring to a document that is known to be a subset of a composite document—or that will be used for this purpose in the future—but this is just for didactic purposes, given that composite documents may be virtually infinitely nested. As noted by Nelson (1987):

“A document may have a window to another document, and that one to yet another, indefinitely. Thus A contains part of B, and so on. One document can be built upon another, and yet another document can be built upon that one, indefinitely: each having links to what was already in place.”

If we consider how computer programs are ‘composed’, we can observe that using modularized code is a two-step process. The first step is declaring what modules we want to use, and the second step is using specific functions from said modules by indicating that said functions are found in the relevant module’s namespace. A similar approach is useful for composite documents.

In the example below, we first declare the dependency upon the documents front_matter.md, business_plan.md, and next_steps.md, and bind them to the namespaces front_matter, business_plan, and next_steps, respectively.

Then we specify which component we want from each module. In our example, these are: the heading named Business Context from front_matter.md, the heading named Business Plan (but not including the heading itself) from business_plan.md, and the entirety of next_steps.md.

---
title: Business Proposal
author: Sarah Jones
date: 2024-09-11
imports:
   front_matter: /common/front_matter.md
   business_plan: /plans/2024/business_plan.md
   next_steps: /common/next_steps.md
---

# Introduction

@front_matter[Business Context]

# Business Plan

@business_plan[Business Plan](exclude_heading=yes)

# Next Steps

@next_steps

The ability to select specific components within a document is fundamental. This is because once we have established a dependency upon a secondary document, we may not necessarily want the entire document, but just a subset of it. For example:

the title
the author’s name
the heading named Business Context
all headings under Business Plan not including the selected heading itself
a demarcated area called blurb
the document’s labels
the value of a property called status
an anchor to the locations where the word regulatory appears

In the previous example, the directive @business_plan[Business Plan](exclude_heading=yes) would include all the headings under Business Plan not including the selected heading itself.
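To make the behaviour of these directives concrete, here is a minimal Python sketch of a resolver. It is an illustration under stated assumptions, not a reference implementation: the directive grammar is inferred from the example above, the SOURCES dictionary is a hypothetical stand-in for the files listed under imports, and the function names select_heading and resolve are invented for this sketch. A selected section is assumed to run from its heading up to the next heading of the same or a higher level, and headings inside code blocks are not treated specially.

```python
import re

# Hypothetical in-memory stand-ins for the files declared under "imports".
SOURCES = {
    "front_matter": "# Front Matter\n\n## Business Context\n\nWe sell widgets.\n\n## Legal\n\nAll rights reserved.\n",
    "business_plan": "# Business Plan\n\nGrow by 10%.\n\n## Budget\n\nTBD.\n",
    "next_steps": "Sign the contract.\n",
}

BODY = """# Introduction

@front_matter[Business Context]

# Business Plan

@business_plan[Business Plan](exclude_heading=yes)

# Next Steps

@next_steps
"""

HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
# Assumed grammar: @namespace, optional [Heading name], optional (key=value,...)
DIRECTIVE = re.compile(r"^@(\w+)(?:\[([^\]]+)\])?(?:\(([^)]*)\))?\s*$")


def select_heading(text, name, exclude_heading=False):
    """Return the section titled `name`, up to the next heading of the
    same or a higher level (or the end of the document)."""
    lines = text.splitlines()
    start = level = None
    end = len(lines)
    for i, line in enumerate(lines):
        m = HEADING.match(line)
        if not m:
            continue
        if start is None:
            if m.group(2) == name:
                start, level = i, len(m.group(1))
        elif len(m.group(1)) <= level:
            end = i
            break
    if start is None:
        raise KeyError(f"heading not found: {name}")
    if exclude_heading:
        start += 1
    return "\n".join(lines[start:end]).strip()


def resolve(body, sources):
    """Replace each directive line in `body` with the selected content."""
    out = []
    for line in body.splitlines():
        m = DIRECTIVE.match(line)
        if not m:
            out.append(line)  # ordinary content passes through unchanged
            continue
        namespace, heading, raw_opts = m.groups()
        opts = dict(p.strip().split("=", 1) for p in raw_opts.split(",")) if raw_opts else {}
        text = sources[namespace]
        if heading is None:
            out.append(text.strip())  # whole document
        else:
            out.append(select_heading(text, heading, opts.get("exclude_heading") == "yes"))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(resolve(BODY, SOURCES))
```

Keeping the selection logic (select_heading) separate from the directive parsing (resolve) means the same selector could be reused by a tool that never sees the directive syntax at all.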

Similarly to advanced templating systems, we may also want to evaluate the included components, or use them in loops, so syntax support for logic conditions would make the system more versatile.

Inter-document Composability

There are a number of scenarios in which it may not be possible, or convenient, to extend existing document formats to include compositional directives. Whatever the circumstances, we may be forced to compose across documents rather than from within a specific document. This is often the least intrusive way to achieve composition.

When composing on an inter-document basis, composition may be achieved using a myriad of techniques, from simple shell scripts to complex, purpose-built programs. In turn, such programs may be run from a CI/CD automation pipeline.

The example below illustrates how the example from the previous section could be composed using a shell script.

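#!/bin/sh
# Compose final.md from three source documents.
# printf is used because `echo -e` is not portable under /bin/sh.
printf '# Introduction\n\n' > final.md
# Copy from the "Business Context" heading to the end of the file.
grep -A9999 '^### Business Context' front_matter.md >> final.md
printf '\n# Business Plan\n\n' >> final.md
# Same for "Business Plan", dropping the heading line itself.
grep -A9999 '^# Business Plan' business_plan.md | \
       tail -n +2 >> final.md
printf '\n# Next Steps\n\n' >> final.md
cat next_steps.md >> final.md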

I wouldn’t be surprised if you feel perplexed at this point. Shouldn’t composability be an intrinsic feature of any decent documentation platform? Is composability really about implementing and parsing complex syntax extensions? Do I have to write convoluted shell scripts? Isn’t there a better way? All valid questions.

There isn’t a simple answer, but the short one is that, as of today, there are neither agreed conventions nor standardized tooling for implementing composability. Some documentation platforms that offer some basic composability capabilities do so only within the confines of their closed-loop ecosystem.

Another noticeable limitation in both open source and commercial documentation platforms is that compositional features are implemented in the context of the final document view (e.g., the HTML rendering of the document) rather than as a means to generate intermediate documents for further processing.

In DocOps, documents are meant to be treated as data. As such, composition is expected to work similarly to an Extract-Transform-Load (ETL) process. We get data in, we get data out: we get a document in, we get a document out. The rendering of a document as HTML, PDF, and so on, is a separate concern which occurs at a later stage, as far as DocOps is concerned.
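The "document in, document out" idea can be sketched in Python as a chain of stages, each of which takes a document and returns a document. This is only a sketch under simplifying assumptions: a document is modelled as a plain Markdown string, and the names Stage, strip_front_matter, append_part, and run_pipeline are hypothetical, not part of any existing tool.

```python
from functools import reduce
from typing import Callable

# A stage takes a document (here, a Markdown string) and returns a document.
Stage = Callable[[str], str]


def strip_front_matter(doc: str) -> str:
    """Drop a leading '---' ... '---' metadata block, if present."""
    lines = doc.splitlines(keepends=True)
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return "".join(lines[i + 1:]).lstrip("\n")
    return doc  # no (closed) front matter: leave the document untouched


def append_part(part: str) -> Stage:
    """Build a stage that appends `part` to the incoming document."""
    def stage(doc: str) -> str:
        return doc.rstrip("\n") + "\n\n" + part.strip("\n") + "\n"
    return stage


def run_pipeline(doc: str, stages: list[Stage]) -> str:
    """Feed the document through each stage in order."""
    return reduce(lambda current, stage: stage(current), stages, doc)


if __name__ == "__main__":
    source = "---\ntitle: Business Proposal\n---\n\n# Introduction\n"
    stages = [strip_front_matter, append_part("# Next Steps\n\nSign the contract.\n")]
    print(run_pipeline(source, stages))
```

Because the output of the pipeline is itself a document, it can be stored, diffed, or fed into another pipeline; rendering to HTML or PDF would be one more consumer of that output rather than part of the composition step.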

Common Composability Capabilities

As we mentioned earlier, simply ‘pasting’ an external document in some designated compartment is not sufficient to achieve effective composition. For both intra-document and inter-document composability, we need specific capabilities to select components of a document. The following is a non-exhaustive list of such common capabilities:

Intelligent heading selection: Headings may be selected by name, by regular expression, by number, etc. Once selected we may want just the content below, the child headings below (with a maximum depth), or a combination thereof.
Heading level shifting: The heading levels used by document parts may not align with those used in the composite document, so they may need to be increased or decreased before inclusion.
Label-based selection: One or more documents may be selected by label, including compound membership expressions (e.g., contains labels A and B but not label C).
Search-based content selection: A component may be selected relative to the result of a search query (e.g., the paragraph in which the term appears, an anchor to the first heading above, etc.)
Metadata (labels, properties) selection: A document’s labels and properties can be used as variables.
Demarcated-area selection: Depending on the documentation format, it may be possible to demarcate components explicitly without the use of headings or properties. For example, in Markdown an HTML comment (<!-- ... -->) would be ignored by most Markdown parsers and can be used to indicate a demarcated area.
Control flow statements: Without control flow statements such as ‘if-then-else’ and for/while loops, it is difficult to handle special situations. While these certainly add noise to documents that would otherwise be completely legible to lay users, they are often more practical than creating auxiliary text conversion code.
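Three of the capabilities listed above (heading level shifting, label-based selection, and demarcated-area selection) are small enough to sketch in Python. The sketch rests on several assumptions: documents are Markdown strings with '#'-style headings, labels are available as a set per document, and demarcated areas use a begin/end comment pair. That marker convention and all function names are hypothetical, chosen only for illustration. Lines starting with '# ' inside code blocks would also be shifted; a real tool would need a proper Markdown parser.

```python
import re
from typing import Iterable


def shift_headings(text: str, delta: int) -> str:
    """Shift every '#'-style heading by `delta` levels, clamped to 1..6."""
    def repl(m: re.Match) -> str:
        level = min(6, max(1, len(m.group(1)) + delta))
        return "#" * level + m.group(2)
    return re.sub(r"^(#{1,6})(\s)", repl, text, flags=re.MULTILINE)


def select_by_labels(docs: dict, required: Iterable[str],
                     excluded: Iterable[str] = ()) -> list:
    """Names of documents carrying every `required` label and no `excluded` one.

    `docs` maps a document name to its set of labels.
    """
    required, excluded = set(required), set(excluded)
    return sorted(name for name, labels in docs.items()
                  if required <= labels and not (excluded & labels))


def select_area(text: str, name: str) -> str:
    """Return the text between <!-- begin:NAME --> and <!-- end:NAME -->.

    The begin/end marker convention is an assumption made for this sketch.
    """
    key = re.escape(name)
    m = re.search(rf"<!--\s*begin:{key}\s*-->(.*?)<!--\s*end:{key}\s*-->",
                  text, flags=re.DOTALL)
    if m is None:
        raise KeyError(f"demarcated area not found: {name}")
    return m.group(1).strip()


if __name__ == "__main__":
    part = "# Business Plan\n\nGrow by 10%.\n\n## Budget\n"
    print(shift_headings(part, 1))

    library = {"a.md": {"A", "B"}, "b.md": {"A", "B", "C"}, "c.md": {"A"}}
    print(select_by_labels(library, required={"A", "B"}, excluded={"C"}))

    page = "Intro\n<!-- begin:blurb -->\nShort pitch.\n<!-- end:blurb -->\nRest\n"
    print(select_area(page, "blurb"))
```

Each function takes plain data and returns plain data, so the same helpers could sit behind an in-document directive or be called from a pipeline script.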