Narrative Programming

2022-04-23

Treating computer programming as literature

I’ve been playing around with the idea of mixing code and documentation. It’s not an original idea, but I think my implementation may be somewhat unique.

The idea is combining a cup of Literate Programming (an idea created by Donald Knuth), with a sprinkle of javadoc / Doxygen.

I’ll start with an example output first. The following link is the output from a small example program that teaches how to compile C to WASM:

=> https://raw.githubusercontent.com/robrohan/wefx/main/docs/manual.pdf Wefx Narrative Manual

What is Literate Programming

Literate programming is a programming methodology that combines a programming language with a documentation language, making programs more robust, more portable, and more easily maintained than programs written only in a high-level language.

The idea was created by Donald Knuth. Literate Programming is the concept of writing a document that can be understood by both a human and a computer. The main output of the document is an academic paper for highfalutin fancy pants PhD types.

=> https://www-cs-faculty.stanford.edu/~knuth/musings.html Donald Knuth

The closest thing to Literate Programming in the real world might be Jupyter Notebooks, but a Jupyter Notebook is closer to an interactive environment than it is to Literate Programming. Literate Programming, from my understanding, is more like writing a research paper that just so happens to compile into computer code. In some interviews I’ve seen with Knuth, he said Literate Programming should be like reading a novel or a story of the application.

=> https://jupyter.org/ Jupyter Notebooks

The way Knuth describes it, you create a single source document that is then sent through two processes: one process to create code that can compile, and one to create a human readable document - a research paper, book, article, etc.

Knuth’s implementation runs within TeX, and requires learning a macro language on top of whatever the underlying programming language is. To me this extra layer seems a bit unsustainable, or perhaps impractical, for real world usage. I can imagine basic issues, such as trying to debug production code. Since the source would be in some macro language on top of the normal language… it just seems to me to be a bit too much to ask from an everyday, “I have a deadline” software engineer.

Knuth’s names for the two processes are:

Weave - To get the human readable text
Tangle - To get the application

You can see Knuth describe it in the linked video:

=> https://www.youtube.com/watch?v=bTkXg2LZIMQ

What Problem am I Solving

The idea is to help fix two problems, and maybe create a new idea:

First, and most importantly, Automatic Manuals for IaC Systems
Useful documentation for computer programs
A way to treat source code like thesis papers (imagine bibliographies for computer programs)

Manuals for IaC Systems

Back in the day, when one bought a computer, it often came with a manual.

=> https://archive.org/details/commodore_c64_manuals Commodore 64 Users Guide

Even crazier, these manuals often came with circuit board diagrams… but that is for another day.

When a company bought a computer, not only did the computer come with a manual from the manufacturer, but the software that ran on the computer also had manuals. On top of that, the sysadmin would often create a “manual manual” (generally in a 3 ring binder) of the custom utilities and day to day operations of the computer (back then, a whole company would run off a single computer with many terminals attached, but that’s for another day too).

Today almost everything runs in the cloud. If you think of the cloud as replacing that single big iron computer for the company, wouldn’t it be nice to have that manual? In fact, today it is even more important, as every company puts the “cloud parts” together in a different way. Saying “I run my infrastructure on AWS” doesn’t really say much. You can combine AWS services in a practically endless number of ways.

Imagine starting a DevOps job, and someone hands you a manual and says, “this is how our system works”. I think most places are very far from that ideal.

At my day job, we have started playing around with this, and so far I think it’s quite cool. The documentation is in one place, and it is always up to date, because the important parts - like, say, a CIDR address - show you the actual variable in the documentation. It’s not some mystery value stuffed away in Confluence, never to be updated. Unfortunately, I can’t show you an example. However…

Useful Documentation for Programs

Below is an example output of the process for an open source software project. If you use your imagination, you can hopefully extrapolate what this documentation might look like for other types of processes.

=> https://raw.githubusercontent.com/robrohan/wefx/main/docs/manual.pdf Wefx Narrative Manual

The benefits are similar to those described in the IaC section, and they also apply to basic APIs. The output is similar to Javadoc-like outputs, but the intention behind it is quite different. It is about the software engineer teaching the user how to use their code - not simply documenting it.

In fact, one of the main problems I’ve found with fresh-grad engineers is they don’t tend to write comments about why the code is doing what it’s doing - they just tend to re-describe what the code is doing. When the build output is a book, and you have to go through that as a PR, the comments tend to make way more sense.

To be fair though, so far I’ve found this more useful for infrastructure code, but I still think it could be useful for application code, especially if…

Source Codes as Papers

I don’t know how this would work, but I think it would be neat if you could reuse code from other places and reference it like you would a paper. There could be a “publish or die” kind of metric for code - or you could somehow see how often a sort idea was used. I am not sure how this would work, but it seems fun.

How it Works

The idea is only a PoC at the moment. We are just trying it out at work, so the implementation is very, very basic. However, the basic concept is:

You write code as you normally would, and around the code you comment using Markdown. Something like:

/*

# Very Important Thing

This does a very important thing, and I _very_ much like it.

*/
int main(void) {
 return 0;
}

There is a process, called Narrative, that goes through the source and just makes a single, large markdown file (like Knuth’s weave).

=> https://github.com/robrohan/narrative
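To make the weave-like step a bit more concrete, here is a minimal sketch in C. It is not how Narrative itself is implemented. The function names (`weave`, `emit_prose`, `emit_code`, `append`) are hypothetical, and wrapping the non-comment code in Markdown fences is my assumption about what a useful output might look like. It also ignores `//` comments and any `/*` that appears inside a string literal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append n bytes of s to out, keeping it NUL-terminated.
   Returns 0 on success, -1 if the result would not fit in cap bytes. */
static int append(char *out, size_t cap, size_t *len, const char *s, size_t n) {
    if (*len + n + 1 > cap) return -1;
    memcpy(out + *len, s, n);
    *len += n;
    out[*len] = '\0';
    return 0;
}

static int is_space(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/* Comment bodies are treated as Markdown prose and copied out as-is
   (minus surrounding whitespace), followed by a blank line. */
static int emit_prose(char *out, size_t cap, size_t *len, const char *s, size_t n) {
    while (n > 0 && is_space(s[n - 1])) n--;
    while (n > 0 && is_space(*s)) { s++; n--; }
    if (n == 0) return 0;
    if (append(out, cap, len, s, n)) return -1;
    return append(out, cap, len, "\n\n", 2);
}

/* Everything outside a comment is code. Assumption: it is wrapped in a
   Markdown fence. Chunks that are only whitespace are skipped. */
static int emit_code(char *out, size_t cap, size_t *len, const char *s, size_t n) {
    while (n > 0 && is_space(s[n - 1])) n--;
    while (n > 0 && (*s == '\n' || *s == '\r')) { s++; n--; }
    if (n == 0) return 0;
    if (append(out, cap, len, "```c\n", 5)) return -1;
    if (append(out, cap, len, s, n)) return -1;
    return append(out, cap, len, "\n```\n\n", 6);
}

/* Walk the source once, alternating between code chunks and block comments.
   Returns 0 on success, -1 on an unterminated comment or a full buffer. */
int weave(const char *src, char *out, size_t cap) {
    size_t len = 0;
    const char *p = src;

    assert(src != NULL && out != NULL);
    if (cap == 0) return -1;
    out[0] = '\0';

    for (;;) {
        const char *open = strstr(p, "/*");
        if (open == NULL) return emit_code(out, cap, &len, p, strlen(p));
        if (emit_code(out, cap, &len, p, (size_t)(open - p))) return -1;

        const char *close = strstr(open + 2, "*/");
        if (close == NULL) return -1;
        if (emit_prose(out, cap, &len, open + 2, (size_t)(close - (open + 2)))) return -1;
        p = close + 2;
    }
}
```

Calling `weave` on the example source above would give a Markdown document with the heading and sentence as prose, followed by the `main` function in a fenced block. The source file itself is never rewritten, so it still compiles as is.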

Then that large markdown file is fed to Pandoc, which outputs PDF, man pages, EPUB, HTML, or several other formats.

=> https://github.com/robrohan/alpine-pandoc

Profit.

There is no tangle step, because what you are writing should compile as is. That is very helpful for debugging, CI/CD pipelines, using other tools, IDEs, etc.

Fin

If you want to learn more about Literate Programming, you can sell your car and buy the book:

=> https://web.stanford.edu/group/cslipublications/cslipublications/site/0937073806.shtml