Building fastai's Documentation

What is this about?

I've decided to build as much of the documentation I use all the time as I can on my local system, not just so that I'll have it if my internet connection goes down but also so that I won't be distracted by what's happening on the web. This post is about building fastai's documentation, which was a little trickier than I thought it would be, so I decided it would be worth making a note for the future.

You can skip to the In A Nutshell section of the post to get a summary of the steps without all the exposition that the middle section has.

What happened?

The Repository

The first thing I did was clone the fastai git repository from GitHub. Inside it there's a folder called docs_src, which logically seemed to be where the source files for the documentation live, but when you go in there you won't find an index.html file, which seemed peculiar. There's a Makefile at the root of the repository, so I inspected it and found this rule:

docs: $(SRC)
        rsync -a docs_src/ docs
        nbdev_build_docs

So I tried a naive make docs, but of course it failed because there was no nbdev_build_docs command on my system. Searching online, I found out that nbdev is a fastai project that turns Jupyter notebooks into a Literate Programming system and that nbdev_build_docs is one of its command-line commands, so I installed it through pip:

pip install nbdev

Then I re-ran the make command, which did nothing: the rsync command had already created the docs folder, and (presumably because make saw the docs target as already up to date) nbdev_build_docs never ran. So I removed the docs folder and re-ran it, which produced a big dump of errors because, in converting the notebooks, nbdev was importing a bunch of python code that wasn't installed. Interestingly, at this point the docs folder actually has enough to run the site, despite all the error messages, but if you just try to load the files into a browser you can see that it's kind of broken, so I went looking for what was going on.
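
If you end up repeating the remove-and-rebuild step a lot, it could be wrapped in a small python helper. This is only a sketch: rebuild_docs is a made-up name, and the run parameter is just there so the make call can be swapped out. The only things taken from the repository are the docs folder and the make docs target.

```python
import shutil
import subprocess
from pathlib import Path


def rebuild_docs(repo_root, run=subprocess.run):
    """Delete a stale docs folder, then run `make docs` in repo_root.

    `run` defaults to subprocess.run; it is a parameter only so the
    command can be replaced (hypothetical convenience, not part of fastai).
    """
    repo_root = Path(repo_root)
    docs = repo_root / "docs"
    # A leftover docs folder is what kept make from doing anything.
    if docs.is_dir():
        shutil.rmtree(docs)
    # check=False: the build may print import errors and exit non-zero
    # while still producing a usable docs folder.
    return run(["make", "docs"], cwd=repo_root, check=False)
```

From the root of the cloned repository you would call rebuild_docs(".").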

Jekyll and Hide

For some reason I couldn't find anything in the documentation about building it, but searching for "fastai build documentation" brought up an outdated page that explains how to build the documentation. It was written for the prior version of fastai (v1), so much of it doesn't make sense for v2 (e.g. it refers to a non-existent tools folder). I didn't figure this out at first because the sites for v1 and v2 don't really identify their version, except in the URL for the old site.

All You Need

Reading that documentation, I found out that they're using Jekyll, so once you install it you just need to run Jekyll in the docs folder.

cd docs
bundle exec jekyll serve

The site is then ready to read at http://localhost:4000, and at this point you're good to go. But, of course, I didn't realize that and tried to fix the error messages first, which is what the rest of this post is about.

Fixing the Imports

There are three things you need to do to fix the imports:

Installing fastai

The old documentation recommended installing it in development mode. I don't know if that's strictly necessary, but it fixed a lot of things so it seems like a good idea.

In the root of the fastai repository, run pip.

pip install -e ".[dev]"

This installs a lot of stuff so you might want to go get a cup of coffee (or maybe a cocktail) at this point while it does its thing. The settings.ini file lists the dev_requirements and the regular requirements if you want to see what needs to be installed in either case.
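
If you'd rather look at those two lists from python than open the file, something like this should work. It's a sketch that assumes settings.ini keeps both lists as space-separated values in its [DEFAULT] section (the usual nbdev layout, as far as I know). The sample text and package names below are placeholders, not the real contents of fastai's file, and list_requirements is a made-up name.

```python
import configparser

# Placeholder excerpt: the real settings.ini has many more keys and
# real package names with their own version pins.
SAMPLE = """
[DEFAULT]
lib_name = example
requirements = pkg-a>=1.0 pkg-b
dev_requirements = pkg-c pkg-d>=2.0
"""


def list_requirements(ini_text):
    """Return the regular and dev requirement lists from settings.ini text."""
    # Only "=" is treated as a delimiter so version pins such as
    # "pkg-a>=1.0" stay intact inside the value.
    parser = configparser.ConfigParser(delimiters=["="])
    parser.read_string(ini_text)
    defaults = parser["DEFAULT"]
    return {
        key: defaults.get(key, "").split()
        for key in ("requirements", "dev_requirements")
    }


requirements = list_requirements(SAMPLE)
for key, packages in requirements.items():
    print(key, "->", ", ".join(packages))
```

To try it on the real file, pass it the text of settings.ini instead of SAMPLE.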

Installing Flask Compress

This is pretty straightforward, just use pip.

pip install flask-compress

Installing azureml-core

  • The Problem

    This wasn't quite so straightforward, which is why I put it in a separate section. If you try to install it in Ubuntu 21.04 (or anything else that ships python 3.9 or later) you will get a big blob of error messages ending in this.

    ERROR: Command errored out with exit status 1: /home/hades/.virtualenvs/fastai-clean/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-srnkqokl/ruamel-yaml_803314568e8f4fa49015a45528d277b2/setup.py'"'"'; __file__='"'"'/tmp/pip-install-srnkqokl/ruamel-yaml_803314568e8f4fa49015a45528d277b2/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-yfvqflby/install-record.txt --single-version-externally-managed --compile --install-headers /home/hades/.virtualenvs/fastai-clean/include/site/python3.9/ruamel.yaml Check the logs for full command output
    

    Which isn't really all that helpful. Scrolling up, it looked like the problem was with something called ruamel.yaml, so investigating that seemed like a place to start. But the error messages were completely inscrutable to me, since I haven't programmed in C for so many years, so I decided to search the web instead of trying to debug it directly, figuring that someone else must have had this problem.

    This led to a long search through various posts, but what it turned out to be was that neither ruamel.yaml nor azureml-core supports python 3.9 yet (there are some bug reports on GitHub for it already), so you can't install them with the version that currently ships with Ubuntu (3.9.5) or with anything newer than python 3.8.

  • The Fix

    The fix I decided to use was to install pyenv using their installer. Once you run the installer and follow the rest of their installation instructions it's fairly straightforward to set up, so I won't go into it.

    I decided to use python 3.8.10 so to install it you do this:

    pyenv install 3.8.10
    

    The only thing that didn't work for me was their pyenv which command, which is supposed to show you the location of the python installation. The command might work, but I couldn't figure out the arguments to use (updating the example they gave didn't work for me). It turned out the python binary was at:

    ~/.pyenv/versions/3.8.10/bin/python
    

    pyenv has its own system for creating a virtual environment, but since I'm already using virtualfish and didn't want to troubleshoot yet another method, I created a virtual environment the way I usually do it.

    ~/.pyenv/versions/3.8.10/bin/python -m venv fastai-doc
    

    At this point I activated the new virtual environment and re-did the previous installation steps (for fastai and flask_compress) along with the azureml-core installation.

    pip install -e ".[dev]"
    pip install flask-compress azureml-core
    

    The installation of fastai installs nbdev as one of the requirements, so that didn't have to be re-done. With that done, I built the documentation and ran the Jekyll server. Easy-peasy.

    make docs
    cd docs
    bundle exec jekyll serve
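
The two facts that cost me the most time here (where pyenv puts the interpreter, and that python 3.9 was too new) fit in a few lines of python. This is a sketch under assumptions: pyenv_python and predates_3_9 are made-up names, the ~/.pyenv default with a PYENV_ROOT override is pyenv's standard layout, and the 3.9 cutoff only reflects what I ran into at the time of writing.

```python
import os
from pathlib import Path


def pyenv_python(version, pyenv_root=None):
    """Return the path where pyenv places the interpreter for `version`.

    Uses pyenv_root if given, then the PYENV_ROOT environment variable,
    and falls back to ~/.pyenv.
    """
    root = Path(pyenv_root or os.environ.get("PYENV_ROOT") or Path.home() / ".pyenv")
    return root / "versions" / version / "bin" / "python"


def predates_3_9(version):
    """True if a version string like "3.8.10" is older than python 3.9."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) < (3, 9)


# The path is only computed here; nothing checks that the file exists.
print(pyenv_python("3.8.10"))
print(predates_3_9("3.8.10"), predates_3_9("3.9.5"))
```

The returned path is the one you would hand to -m venv when creating the virtual environment.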
    

In A Nutshell

The Minimum to Get the Documentation

  • Clone the fastai git repository from GitHub
  • Install Jekyll and nbdev
  • Change into the root of the fastai repository you cloned
  • Run make docs and ignore the error messages
  • Change into the docs folder that was created and run the Jekyll server (bundle exec jekyll serve)

To Fix All the Errors

This isn't really necessary to get the documentation, but I think it's better, since you don't have to ignore all the error messages.

  • Clone the fastai git repository from GitHub
  • Install Jekyll
  • Get python 3.8 working (I used pyenv)
  • Use pip to install fastai in development mode
  • Use pip to install flask_compress and azureml-core
  • Change into the root of the fastai repository you cloned
  • Run make docs
  • Change into the docs folder that was created and run the Jekyll server (bundle exec jekyll serve)
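
Before starting, it might help to check that the command-line tools these steps rely on are actually on your PATH. Here's a small sketch: missing_tools is a made-up helper, and the tool list is just the commands mentioned in this post.

```python
import shutil

# Commands that appear in the steps above.
REQUIRED_TOOLS = ("make", "rsync", "nbdev_build_docs", "bundle")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

Run it inside the virtual environment you plan to build from, since that's where nbdev_build_docs gets installed.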