Embedding and Blending


The documentation system tenet of embedding and blending states that external information should be embedded and seamlessly blended with the relevant document rather than forcing users to look up missing information on external systems. This tenet extends the tenet of composability by including all content types—rather than just documents—as the potential components of a composite document. As such, it supports the same principles of shared responsibility and truth proximity. 

Embedding is not just about placing external content in a document, but transforming its data into native documentation elements. In this example, an Excel table is blended either as a table or as a series of headings.

While embedding is essentially a form of composition, the difference is that the composed content comes from external systems rather than the documentation system itself. Embedding is not necessarily meant to work like an automated form of copy-pasting, nor like an HTML IFRAME. This is why the tenet does not just use the word embedding; it is embedding and blending.

Blending means that the embedding mechanism should allow the flexible querying and extracting of components from the relevant external source in such a way that they appear as regular documentation elements, so that “all knowledge workers—authors and users—modify and incorporate other knowledge products into their own information bases and knowledge products”. (Engelbart, 1995). We will illuminate this concept in a moment.

Given that embedding is a form of composition, we can also see the capability from two angles: Intra-document embedding and blending, and inter-document embedding and blending.

  • Intra-document embedding and blending: the embedding of content from the perspective of one document using embedding and blending directives.
  • Inter-document embedding and blending: the embedding of content implemented as a process—e.g., a DocOps pipeline in a CI/CD platform.

Intra-document Embedding and Blending

Similarly to intra-document composability, intra-document embedding and blending refers to the ability to embed external sources directly from a given document.

In this example, the document describes the notion of a Discount, for which it references an Excel file in which the discount types are maintained. 
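
A minimal sketch of what such a document might look like follows. The YAML front matter key and the bracketed directive syntax are illustrative, assuming a hypothetical Markdown preprocessor rather than any specific real-world tool:

```markdown
---
discounts: discounts.xlsx
---

# Discount Types

These are the types of discounts that we offer:

@discounts[sheet=discount_table,hide=[C,D]]
```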

In the above example, the directive @discounts references the file discounts.xlsx specified as YAML metadata. The directive selects the sheet named discount_table, and hides columns C and D.

Inter-Document Embedding and Blending

Implementing embedding and blending across documents is a common DocOps approach in its own right, not merely a fallback for when we are unable to extend the primary or existing document formats to support embedding and blending directives. Typical inter-document embedding and blending use cases include:

  1. Appending the content of any referenced attachments to the end of existing documents
  2. Displaying referenced pictures in-line whenever their names appear
  3. Asking co-creators to use a keyword, say, EXT:discount_table, and replacing all instances of said keyword with embedded content

The next example illustrates the case of appending the Discount Types heading to an existing document using shell scripting. The open-source xlsx2csv and csv2md command line utilities are used to convert an Excel file to a comma-separated values (CSV) text file, and a CSV text file to Markdown, respectively.
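
A minimal sketch of such a script follows; the file names are illustrative, and it is assumed that the csv2md variant in use reads CSV from standard input:

```sh
#!/bin/sh
# Append the new heading to the end of the existing document
printf '\n## Discount Types\n\n' >> document.md

# Convert the selected sheet to CSV (-n selects a sheet by name),
# turn the CSV stream into a Markdown table, and append the result
xlsx2csv -n discount_table discounts.xlsx | csv2md >> document.md
```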

Common Capabilities

So far we have seen examples of embedding an external source (an Excel spreadsheet), but nothing about blending, other than mentioning that it is not necessarily supposed to work like an automated form of copy-pasting. To illustrate what we mean by blending, the Markdown file below shows the result of applying either of the processing techniques seen earlier.
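
Assuming the same discounts.xlsx source as in the earlier examples, the blended result might look as follows (the column headers are illustrative):

```markdown
# Discount Types

These are the types of discounts that we offer:

| Type | Description          |
|------|----------------------|
| L1   | A discount up to 5%  |
| L2   | A discount up to 15% |
| L3   | A discount up to 20% |
```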

What we see here is that rather than creating a screenshot of the Excel table, or creating an IFRAME-like component which displays an Excel widget, we have extracted the content from Excel and translated it to a regular markdown table. The benefit is that once this markdown document is rendered, the look and feel will be seamless and indistinguishable from a table created by hand. 

For blending to be effective, it needs to be implemented in such a way that it can satisfy various use cases rather than converting the underlying data in only one way. For example, the directive @discounts[sheet=discount_table,headings=[(2,A,B)]] could tell the Markdown preprocessor to treat column A as a level 2 heading, and column B as the content for said heading. The next code snippet exemplifies the expected result.

```markdown
# Discount Types

These are the types of discounts that we offer:

## L1

A discount up to 5%

## L2

A discount up to 15%

## L3

A discount up to 20%
```

To recap, embedding is effectively a form of composition—except that it uses external sources—and blending is the ability to structure the embedded data using native document elements. Blending also supports the tenet of consistent layout.

The exact embedding capabilities will vary depending on the type of external source being embedded.

Data Source Embedding and Blending Capabilities

We consider ‘data’ to be any source of tabular or structured information, whether in the form of static files (spreadsheets, CSV files, JSON files, etc.), full-fledged database management systems (relational, NoSQL, etc.), or systems that offer querying capabilities, such as bug tracking and service management systems.

The general capabilities that we need for such systems are as follows:

  1. Source subset selection: Depending on the source system, once connected or referenced, we may still need to select a specific sheet, table, row range, JSON object, etc.
  2. Query formulation: Many external systems, in addition to conventional relational databases, offer the ability to formulate complex queries. The query syntax will depend on the data source. For example, relational databases use SQL while JSON-based sources may use JSONPath.
  3. Column or object-subset selection: Using flexible strategies such as:
    1. Use all columns/object paths by default, only specify hidden ones
    2. Only specify selected columns/object paths
    3. Select column by name or value match (e.g., similar to VLOOKUP in Excel)
  4. Structural mapping: Blending data from a table does not necessarily mean that we want the raw table itself. A table may be used to codify a semantic structure that we may want to unpack and blend with the document. As such, capabilities like the following are necessary in the case of tabular structures; similar notions apply to object-based ones:
    1. Treating columns as representing outlines rather than rows: headings (indicating the starting heading level), numbered lists, or bullet points.
    2. Specifying the heading-content relationship: the content for a heading may be found in the adjacent column, but it may also be found in the row below.
    3. Specifying the rendering of non-outline columns: they may contain plain text, markdown text, etc.
  5. Bulk processing and for-loop scoped attributes: Extracting columns in a for-loop scope so that the composite document can specify how to organize each of the columns (see the sketch after this list).
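
As a sketch of the last capability, the hypothetical @discounts directive used earlier might accept a block that is repeated once per row, where ${row.A} and ${row.B} are illustrative placeholders for the values of columns A and B:

```markdown
@discounts[sheet=discount_table,foreach=row]{
  ## ${row.A}

  ${row.B}
}
```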

Image Embedding and Blending Capabilities

Image processing is one of the most neglected aspects in current documentation systems. The capability of ‘displaying’ an image is often assumed to be about the inclusion of a file that is already saved in a format compatible with the final rendering format, such as a JPEG or PNG file in an HTML-based view. The problem is that images are rarely authored in such formats; manual ‘export’, conversion, or copy-paste steps are required before they can be used. This conflicts with the principle of truth proximity.

Broadly speaking, images are encoded using pixel or vector formats. Let us briefly talk about them. 

Pixel image formats (also called ‘raster’) are authored in applications such as Adobe Photoshop or GIMP. In their original format, pixel images may contain multiple layers and use higher resolutions and color depths than appropriate for direct inclusion in web-based documents.

Vector image formats consist of a set of geometrical directives that allow drawing images in a resolution-independent manner. Printers use these kinds of directives so that the result improves as the DPI (dots per inch) increases. Popular vector image editors include Adobe Freehand and Inkscape.

We also have diagram-oriented tools such as Microsoft Visio, LucidChart and Draw.io whereby images are meant to serve as blueprints: floor plans, business flows, and so on. These tools use proprietary vector-based formats but can also export files in pixel-based formats. Microsoft PowerPoint could be described as a hybrid tool given that it typically combines both pixel and vector-based elements.

It is also worth noting that there are a variety of engineering and business applications, such as computer aided design (CAD) tools (for example, Sparx EA), spreadsheets such as Google Sheets, and business intelligence applications like Microsoft Power BI, whose goal is not general ‘image authoring’ but whose output is visual in nature.

As explained earlier, embedding an image produced by any external source should not require accessing the external system (e.g., Microsoft Visio) and manually exporting said image in a document-friendly format. The conversion from Visio to PNG—if that is the ‘friendly’ format—is a DocOps problem, rather than an end-user one. 

To honor the principle of truth proximity, embedding and blending an image should be done in such a way that either the original image is referenced directly without any manual ‘export’ steps, or, alternatively, the conversion is performed automatically so that images in documentation-compatible formats (PNG, JPEG, SVG, etc.) are made available on a zero-touch basis.

There will always be specific needs depending on the image-producing source system’s peculiarities, but these are the general capabilities that apply to image embedding:

  1. Source subset selection: Depending on the source system or file, once connected or referenced, we may still need to select a specific image, tile, layer, etc.
  2. Format conversion: We often need to convert a file from a proprietary format to one that is a first-class citizen in our documentation system. For example, from Visio to SVG, or from PSD (Photoshop) to PNG (see the sketch after this list).
  3. Cropping: Many images come with large chunks of white areas, which result in the relevant image appearing smaller once embedded.
  4. Resampling: We may want to resample the image to a lower resolution or use a different color depth. Resampling is not the same as resizing, which only has an effect during display but does not change the underlying image asset.
  5. Metadata extraction: Leveraging the image’s built-in metadata capabilities may help produce more elegant automation workflows. For this, we need the ability to treat selected metadata attributes as either headings, captions, or content.
  6. Layout (optional): Depending on the documentation system format, ‘layout’ may be a property that can only be controlled in the rendering stage rather than the document composition and embedding one. Layout properties are also problematic because they normally apply to a specific device or rendered format. If for some reason layout properties were to be applicable at this stage, they would be:
    1. Size relative to the page or screen size
    2. Alignment (left, center, right)
    3. Wrapping
  7. Bulk processing and for-loop scoped attributes: It is often useful to treat images as collections, extracting metadata in a for-loop scope so that the composite document can specify how to organize each of the metadata attributes. This is useful not only for creating galleries or albums but also for deriving a document’s semantic structure (heading levels, captions, body text, etc.) from metadata. When treating images as collections, there may be different ways to specify the grouping criteria: by folder, by a common metadata property, by date range, etc. In bulk processing, there may also be the need for resampling, which is useful to create thumbnails.
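
As a minimal sketch of capabilities 2-4 in a DocOps pipeline step, the following assumes ImageMagick 7 (the magick command) is available; the file names are illustrative:

```sh
# Convert a Photoshop asset to PNG (format conversion), trim the
# surrounding white border (cropping), and downsample the image to
# half its original resolution (resampling). The [0] index selects
# the flattened composite of the PSD's layers.
magick source.psd[0] -trim -resize 50% source.png
```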

Video and Audio Embedding and Blending Capabilities

Video and audio are closely related in the sense that a video stream includes both motion pictures and an audio stream—often encoded using the same formats as stand-alone audio streams. Also, it is common to take a video stream and extract still pictures and audio from it.

There are a number of complex automation workflows that may be applied to both video and audio which are beyond the scope of this book. However, some of the fundamental capabilities include:

  1. Source subset selection: Depending on the source system or file, once connected or referenced, we may still need to select a specific video/audio stream.
  2. Format conversion and resampling: Audio and especially video content may take minutes or even hours to convert and/or resample. It is generally not a good idea to drive the conversion from within an embedding directive itself. It is preferable, instead, that the embedding directive selects from a predefined list of formats which have already been pre-converted by pipelines triggered when the assets were uploaded (see the sketch after this list). For example:
    1. Video streams in 480p, 720p, 1080p, and 2K renditions derived from a 4K canonical format
    2. Lossy audio streams in 64k, 128k, and 256k bit rates derived from a canonical lossless format.
  3. Metadata extraction: Leveraging the stream’s built-in metadata capabilities may help produce more elegant automation workflows. For this, we need the ability to treat selected metadata attributes as either headings, captions, or content. In both video and audio streams, a key metadata property is the time-wise division of the stream into named sections. For example, the business context discussion starts at 15:09.59 and finishes at 21:15.23.
  4. Transcription and translation extraction: A transcription—or a translation of the original transcription—is effectively text which may blend with the document in smart ways. For example:
    1. Organizing the transcription into headings by breaking up the transcription into blocks demarcated by time or text match.
    2. Selecting transcription passages based on subject (leveraging natural language processing capabilities)
    3. Highlighting transcription text based on text or subject match
  5. Bulk processing and for-loop scoped attributes: This capability is similar to the one that applies to images. We need the capability to treat audio and video streams as collections based on a common grouping criterion, and then extract metadata in each iteration. For-loop processing is also applicable in the context of a single stream which is broken up into sections. We may want, for example, to generate a heading, a short description, and a thumbnail for each section.
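
As a minimal sketch of such an upload-triggered pre-conversion pipeline, the following assumes ffmpeg is available and that a 4K canonical file named canonical_4k.mp4 has just been uploaded; the file names and the rendition ladder are illustrative:

```sh
#!/bin/sh
# Pre-convert the 4K canonical video into lower-resolution
# renditions from which embedding directives can later select.
# scale=-2:<height> preserves the aspect ratio (with the width
# rounded to an even number); the audio stream is copied as-is.
for height in 480 720 1080; do
  ffmpeg -i canonical_4k.mp4 -vf "scale=-2:${height}" -c:a copy "rendition_${height}p.mp4"
done
```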

Code Embedding and Blending Capabilities

When documenting ‘code’ we may be writing something as simple as a ‘getting started’ guide or as complex as a complete API reference manual. Each use case calls for a different approach. For example, for introductory material such as tutorials and getting-started guides, we might embed small code fragments within a body of text, whereas for formal reference documentation, it may be preferable to maintain the narrative together with the code itself and generate the documentation using DocOps automation techniques.

Code is not only used in the form of code listings but is also referenced within sentences and paragraphs, where there is a chance of typos and/or inaccuracy—e.g., referencing a function name that has been renamed or no longer exists.

Truth proximity is paramount in the particular case of code embedding and blending. The following is a list of the most common capabilities, each of which would apply depending on the nature of the documentation objective and the shape and nature of the associated code base.

  1. Code color highlighting: Any embedded standalone code fragment—not in the middle of a sentence—should provide language-specific color highlighting. This is not a cosmetic feature. The use of colors to differentiate language-reserved keywords from user definitions dramatically reduces cognitive load by allowing the reader to understand the source code’s logical structure at a glance.
  2. Flexible code fragment selection: The embedding mechanism should allow the flexible selection of files and specific code content within selected files. Line-based code fragment selection, for example, is fragile because line numbers might change when the source code is modified. As such, the selection should be based on functions, methods, or custom demarcated areas (see the sketch after this list).
  3. Query-based code fragment selection: It is often the case in documentation that the author wants to talk about the methods or functions in a specific package, module, or class, or perhaps those that start with a given prefix. These should be extracted using a pattern-based query, with the possibility to ‘blend’ the result in flexible manners. For example, the matched methods or functions may appear as a table or as level 2 headings.
  4. Comment blending: Especially in combination with query-based code fragment selection, we may want to treat comments in a flexible manner to enrich the destination document. For example, comments may be treated as being formatted in the primary documentation format—in the case of Markdown, they may include the likes of **bold** or _italics_—which would then be seamlessly blended with the document, maintaining the relevant function/method documentation with the code itself.
  5. Literate programming support: The code fragments in a document are processed directly as source. There are not two separate files, one for the documentation and the other containing the source code; they are one and the same.
  6. Code snippet validation support: Code snippets may point to a file (or be provided with a non-renderable context) which allows checking their syntactic validity. For example, in a Jupyter Notebook it is possible to include code blocks which introduce imports or variables into scope but which are not rendered in the final HTML version of the source notebook.
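
As a minimal sketch of capability 2, the following shell fragment extracts a custom demarcated area rather than relying on line numbers; the pricing.py file and the docs:begin/docs:end comment markers are hypothetical conventions:

```sh
# Print the region demarcated by the "docs:begin pricing" and
# "docs:end pricing" comment markers in pricing.py, then drop the
# marker lines themselves. The selection survives edits elsewhere
# in the file because no line numbers are involved.
sed -n '/docs:begin pricing/,/docs:end pricing/p' pricing.py | sed '1d;$d'
```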
