Building a Static Documentation Site with Metalsmith

At work, my company's product team has been using the GitHub wiki for years for all of our user-facing documentation. As we have grown from a small open source project into a much larger team with a more fully featured enterprise offering, we have outgrown the GitHub wiki. We went in search of a set of tools to build our own self-hosted documentation website, with the following requirements:

  • An easy-to-use workflow for documentation authors that doesn't require a programmer or designer to write
  • The ability to version our documentation
  • Quick deployments
  • A technology stack that our developers know and can support
  • Serverless deployment

The go-to for documentation, and the default for GitHub Pages, is Jekyll, which we looked at first. While Jekyll has a great community and would have been the path of least resistance, the fact that no one on our team had any Ruby experience made us look for more options. Our core product is written in Java, but we already have some of our supporting infrastructure written in Node.js, so we started there when looking for tools and found Metalsmith to be the most popular option. While Metalsmith has plugins for days, it is closer to a box of Lego than a fully assembled system.

Luckily I found and heavily cribbed from the open source documentation for the fantastic Particle microcontroller board. Check them out on GitHub. Their working example of Metalsmith gave me enough of a reference to get started.

Project Structure

Our initial project structure looks something like this:

docs
├── public
│   └── components - Bower working directory
├── scripts - All of the actual Metalsmith code
├── src - Source of all content
│   ├── assets 
│   │   ├── doc-media - Images used in docs
│   │   └── images - Images used for all pages
│   ├── css
│   └── markdown - The actual docs, subdirectories correspond to topnav
│       ├── api
│       ├── development
│       ├── guide
│       ├── index.md
│       └── install
└── templates - The Bootstrap layouts for all pages

Setting up Metalsmith Pipeline

Metalsmith works as a chain of filters that transform an input directory (in our case, a bunch of Markdown in /src/markdown) into the output directory. Nothing says that the input of Metalsmith has to be Markdown, nor that the output needs to be a static HTML site, but it is important to remember that at its core, Metalsmith is transforming the source files, so trying to force it to work on another set of data outside of the source files can be difficult. At one point we tried to have Metalsmith bulk resize the screenshots used in our documentation as part of the build, and it proved problematic.
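
To make the "chain of filters" idea concrete, here is a minimal sketch of what a single filter (a Metalsmith plugin) looks like. A plugin is a function that receives the files object, the Metalsmith instance, and a done callback. The `wordCount` plugin and the `guide/intro.md` file are hypothetical, invented for this illustration, and are not part of our site.

```javascript
// A Metalsmith plugin is a function (files, metalsmith, done).
// `files` maps each source path to an object holding its contents and metadata.
// `wordCount` is a hypothetical plugin written only for this illustration.
function wordCount() {
  return function (files, metalsmith, done) {
    Object.keys(files).forEach(function (name) {
      var text = files[name].contents.toString();
      var words = text.split(/\s+/).filter(Boolean);
      // Attach the result as metadata so later plugins and templates can read it
      files[name].wordCount = words.length;
    });
    done();
  };
}

// Invoke the plugin directly on a hand-built files object. Inside a real
// build it would be added to the chain with .use(wordCount()) instead.
var files = {
  'guide/intro.md': { contents: Buffer.from('Hello static site world') }
};
var finished = false;
wordCount()(files, null, function () { finished = true; });
console.log(files['guide/intro.md'].wordCount); // prints 4
```

Because every plugin only sees the files object built from the source directory, anything living outside that directory is out of its reach, which is why working on outside data is awkward.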

In /scripts/metalsmith.js we script out the core rendering flow as follows:

var ms = Metalsmith(__dirname)
  .source('../src/markdown')
  .destination('../build')
  .use(paths())
  .use(helpers({
    directory: './hbs-helpers'
  }))
  .use(collections({
      home: {
        pattern: 'index.md',
        metadata: {
          name: "Home"
        }
      },
      installation: {
        pattern: 'install/*.md',
        sortBy: 'order',
        metadata: {
          name: "Installation"
        }
      },
      guide: {
        pattern: 'guide/*.md',
        sortBy: 'order',
        metadata: {
          name: "Guide"
        }
      },
      development: {
        pattern: 'development/*.md',
        sortBy: 'order',
        metadata: {
          name: "Development"
        }
      },
      api: {
        pattern: 'api/*.md',
        sortBy: 'order',
        metadata: {
          name: "API"
        }
      }
    }))
  .use(markdown())
  .use(layouts({
    engine: 'handlebars',
    directory: '../templates',
    default: 'template.hbs'
  }))
  .use(assets({
    src: '../src/assets',
    dest: '../build/assets'
  }))
  .use(assets({
    src: '../src/css',
    dest: '../build/assets/css'
  }))
  .use(assets({
    src: '../public/components/bootstrap/dist',
    dest: '../build/assets/bootstrap'
  }))
  .use(assets({
    src: '../public/components/jquery/dist',
    dest: '../build/assets/jquery'
  }))
  .use(permalinks({
    relative: false
  }))

At a high level, here is what our rendering pipeline is doing:

  1. Configure source and destination directories
  2. Add file path information for each source file to the Metalsmith metadata. This helps us build links and the table of contents.
  3. Allow JavaScript helpers exported from /scripts/hbs-helpers to be invoked by the Handlebars template. We use this for a few simple things like highlighting the active collection on the topnav.
  4. Split apart source files into collections based on a matching pattern. These are used for the topnav and the sidebar navigation as well as the directory each individual page gets rendered into.
  5. Render Markdown into HTML
  6. Inject rendered HTML into the Handlebars template
  7. Force copy the static assets outside of the "source" directory into the appropriate output directory.
  8. Move all HTML files not named index.html into a subdirectory with the same name, and rename them to index.html inside that directory. This gives us pretty URLs in our static site.
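
The pretty URL rewrite in step 8 can be sketched as a small path mapping. This is only an approximation of the behavior described above, not the permalinks plugin's actual code; `prettyPath` and the `guide/getting-started.html` file name are hypothetical.

```javascript
var path = require('path');

// Hypothetical helper approximating step 8: map a rendered HTML path to its
// "pretty URL" location. The real rewrite is done by the permalinks plugin.
function prettyPath(file) {
  var parsed = path.posix.parse(file);
  // Leave non-HTML files and existing index.html files where they are
  if (parsed.ext !== '.html' || parsed.base === 'index.html') {
    return file;
  }
  // foo/bar.html becomes foo/bar/index.html
  return path.posix.join(parsed.dir, parsed.name, 'index.html');
}

console.log(prettyPath('guide/getting-started.html')); // guide/getting-started/index.html
console.log(prettyPath('index.html'));                 // index.html
```

Since web servers and S3 website hosting serve index.html for a directory request, the page is then reachable at /guide/getting-started/ without the .html suffix.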

The pipeline is then exported so we can reuse it from our separate build scripts.

Build Scripts

The Metalsmith pipeline we built will compile the entire static site into the /build directory when invoked, but that's usually not what we want to do. We built a series of scripts on top of our master pipeline that lets us do a few fun things like:

  • Just render the whole thing and quit
  • Render the site, start a web server to host the content, then watch for changes and rebuild the site. This is a great workflow for our documentation writers, as all they need to do is save their Markdown file and hit F5 in their browser to see how their work looks.
  • Render the site, then deploy it.

All of these scripts are run from package.json by doing something like npm run www.

Adding extra filters to these scripts is pretty straightforward, like this development server script:

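ms
  .use(watch({
    paths: {
      // The watch plugin substitutes the ${source} placeholder itself,
      // so this is a plain string, not a template literal
      "${source}/**/*": true,
      "../templates/**/*": true
    },
    livereload: true
  }))
  .use(serve({
    port: 3000
  }))
  .build(function (err) {
    // Surface build errors instead of silently ignoring them
    if (err) { throw err; }
  });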

Versioning

Eventually we want to host different versions of our docs that correspond to different releases of our application. For now we are just tagging the git repo that hosts our content.

Deployments

The great thing about static sites is they are dead simple to host. In our case we copy the site to an AWS S3 bucket and put a CloudFront CDN in front of it.

While Metalsmith has an S3 plugin, I found it easier to roll my own using the Node S3 library, which compares checksums for all of your files so that only changed files need to be transferred; it syncs our entire site in just a few seconds. Once the upload is done, the script follows up by sending a cache invalidation request to CloudFront.

Here are the details of the deployment script:

ms
    .build(function(err){
        if(err) {
            return fatal(err.message);
        }
        else {
            var client = s3.createClient({
                s3Options: {
                    region:'us-west-2'
                }
            });
            
            var params = {
                localDir: __dirname + '/../build',
                deleteRemoved: true, // delete S3 objects that no longer exist locally
                s3Params: {
                    Bucket:'docs-site'
                }
            };

            var uploader = client.uploadDir(params);
            uploader.on('error', function(err) {
                console.error("unable to sync:", err.stack);
            });
            uploader.on('progress', function() {
                console.log("progress", uploader.progressAmount, uploader.progressTotal);
            });
            uploader.on('end', function() {
                console.log("done uploading");
            });
        }
    });
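
The reason a sync like this is quick is that unchanged files can be skipped. Here is a minimal sketch of that idea, assuming the remote object's ETag is the hex MD5 of its contents (true for simple, non-multipart S3 uploads, but not for every object). `md5Hex` and `needsUpload` are hypothetical helpers for illustration, not the library's actual code.

```javascript
var crypto = require('crypto');

// Hex-encoded MD5 digest of a buffer
function md5Hex(buffer) {
  return crypto.createHash('md5').update(buffer).digest('hex');
}

// Decide whether a local file differs from what is already in the bucket.
// Assumption: remoteETag is the quoted MD5 of the remote object's contents.
function needsUpload(localContents, remoteETag) {
  if (!remoteETag) {
    return true; // nothing remote yet, so upload
  }
  var remoteHash = remoteETag.replace(/"/g, ''); // ETags arrive wrapped in quotes
  return md5Hex(localContents) !== remoteHash;
}

var page = Buffer.from('<h1>Docs</h1>');
var etag = '"' + md5Hex(page) + '"';
console.log(needsUpload(page, etag));                            // false
console.log(needsUpload(Buffer.from('<h1>Docs v2</h1>'), etag)); // true
console.log(needsUpload(page, undefined));                       // true
```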

If you don't already have it set up from the AWS CLI tool, you'll need to create a ~/.aws/credentials file with your AWS credentials to get the deployments to work.
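
The deployment script shown above stops at the upload, so here is a rough sketch of the CloudFront cache invalidation request mentioned earlier. It only builds the request parameters in the shape the CreateInvalidation API expects. `buildInvalidation`, the `docs-deploy-` prefix, and `EXAMPLE_DISTRIBUTION_ID` are placeholders I made up for illustration, and invalidating `/*` is just one possible choice.

```javascript
// Build the request for CloudFront's CreateInvalidation API.
// 'EXAMPLE_DISTRIBUTION_ID' below is a placeholder, not a real distribution.
function buildInvalidation(distributionId, paths) {
  return {
    DistributionId: distributionId,
    InvalidationBatch: {
      // CallerReference must be unique per request; a timestamp is one simple choice
      CallerReference: 'docs-deploy-' + Date.now(),
      Paths: {
        Quantity: paths.length,
        Items: paths
      }
    }
  };
}

var params = buildInvalidation('EXAMPLE_DISTRIBUTION_ID', ['/*']);
console.log(JSON.stringify(params, null, 2));

// With the AWS SDK for JavaScript (v2) installed, the request could be sent
// from the uploader's 'end' handler along these lines:
//   var AWS = require('aws-sdk');
//   new AWS.CloudFront().createInvalidation(params, function (err, data) { ... });
```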

Conclusion

In the end, our Metalsmith-based documentation website probably took a bit more work to get set up than we would have liked, but now that it's done, we are really happy with the results. The documentation writers have had a great time with the quick feedback loop of the auto-updating server. Using git has given us a great way to review documentation updates through pull requests and to version the documentation. And the deployments are so fast it almost seems like something went wrong.

For the full working example, check out this GitHub repo.