Matteo Franchin's corner

Sphinx: missing index in singlehtml output

Introduction

This document reports a problem with the Sphinx documentation system. Sphinx is becoming increasingly popular because it lets you write documentation and webpages in reStructuredText, an easy-to-read plain-text markup language. Sphinx is used to document the Python language and libraries (and I used Sphinx to generate the webpages that you are currently reading).

Problem

The index of the document (usually generated by the toctree directive) is not shown when the documentation is generated as a single HTML page. The problem only occurs with the singlehtml builder (e.g. sphinx-build -b singlehtml ...) and does not affect the standard html builder. The index matters most when the output HTML page is long: it provides links to the sections and subsections of the document, which make it much quicker to jump to the right part.

By solving the problem I managed to create the Nmag manual in a single HTML page and with the index. Click here to see what I mean.

Why?

To understand why the index is automatically removed when generating the documentation in singlehtml mode (SingleFileHTMLBuilder), we need to review briefly how Sphinx works.

Sphinx generates the documentation in two phases. First, the input files (containing the documentation) are scanned and an abstract syntax tree is generated (a docutils document tree, or doctree). This tree is a large data structure which contains all the information necessary to generate the output files in any of the available formats. The second phase consists of translating the tree into one or more output files (HTML, PDF, etc.).
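The two phases can be illustrated with a small toy model. This is only a sketch and not Sphinx's actual code: the input convention (lines starting with "# " are section titles) and the function names parse, to_html and to_text are assumptions made up for the example. The point is that the tree is built once and can then be translated into more than one output format.

```python
# Toy two-phase pipeline (hypothetical, not Sphinx's implementation).
import html


def parse(source):
    """Phase 1: turn the input text into a format-independent tree.

    Assumed convention: a line starting with '# ' opens a section,
    any other non-empty line is a paragraph of the current section.
    """
    tree = {"type": "document", "children": []}
    current = tree
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("# "):
            current = {"type": "section", "title": line[2:], "children": []}
            tree["children"].append(current)
        else:
            current["children"].append({"type": "paragraph", "text": line})
    return tree


def to_html(node):
    """Phase 2, first output format: HTML."""
    if node["type"] == "paragraph":
        return "<p>%s</p>" % html.escape(node["text"])
    body = "".join(to_html(child) for child in node["children"])
    if node["type"] == "section":
        return "<h1>%s</h1>%s" % (html.escape(node["title"]), body)
    return body


def to_text(node):
    """Phase 2, second output format: plain text, built from the same tree."""
    if node["type"] == "paragraph":
        return node["text"] + "\n"
    body = "".join(to_text(child) for child in node["children"])
    if node["type"] == "section":
        return node["title"].upper() + "\n" + body
    return body


tree = parse("# Intro\nHello.\n# Usage\nRun it.")
print(to_html(tree))
print(to_text(tree))
```

Both translators read the same tree, which is why a change made to the tree before the second phase affects every output produced from it.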

At the moment, there are a few different HTML builders in Sphinx (see here). One is the html builder (class sphinx.builders.html.StandaloneHTMLBuilder), which produces several HTML files together with a global index which refers to them. The user is meant to start browsing the index and click on the links to navigate throughout the whole document. There is then a singlehtml builder (class sphinx.builders.html.SingleFileHTMLBuilder) which is similar to the html builder, but puts all the documentation in one single HTML file.

Both builders are implemented in the file Sphinx-1.0.7/sphinx/builders/html.py, which belongs to the Sphinx source code. The single-file HTML builder (class SingleFileHTMLBuilder) is derived from the standard HTML builder (class StandaloneHTMLBuilder) and implements a few changes over its parent class. In order to create a single HTML file with all the documentation, the tree is manipulated: the toctree directives, which are nodes in the documentation tree, are found and replaced with the content they refer to. In other words, each entry in a toctree is replaced with the document tree of the file it names. As a consequence, every toctree node is consumed by the substitution and a single large, global documentation tree is created. This is why the index is not shown: the index appears in the HTML output only as an effect of rendering the toctree nodes, and none are left.
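The substitution described above can be sketched with a toy model. The data layout (documents as lists of (kind, value) tuples) and the function name inline_toctrees are hypothetical and only loosely mimic what the builder does; the sketch also assumes that no document includes itself, directly or indirectly.

```python
# Toy model of toctree inlining (hypothetical layout, not Sphinx's API).

# Each document is a list of nodes. A ("toctree", [names]) node refers
# to other documents by name.
docs = {
    "index": [("title", "Manual"), ("toctree", ["intro", "usage"])],
    "intro": [("title", "Introduction"), ("text", "Welcome.")],
    "usage": [("title", "Usage"), ("toctree", ["advanced"])],
    "advanced": [("title", "Advanced usage"), ("text", "Details.")],
}


def inline_toctrees(docname, docs):
    """Return a new node list in which every toctree node is replaced by
    the (recursively inlined) contents of the documents it lists."""
    result = []
    for kind, value in docs[docname]:
        if kind == "toctree":
            for child in value:
                result.extend(inline_toctrees(child, docs))
        else:
            result.append((kind, value))
    return result


merged = inline_toctrees("index", docs)
for node in merged:
    print(node)

# No toctree node survives the merge, so nothing is left to render
# as an index.
print(any(kind == "toctree" for kind, _ in merged))
```

The merged list contains all titles and text in reading order, but the information about where the table of contents used to sit is gone, which mirrors the missing index in the singlehtml output.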

Reading the source code in html.py, one is tempted to say that the SingleFileHTMLBuilder looks like a hack. In particular, it would be cleaner to create a SingleFileHTMLBuilder class which generates documentation without having to manipulate the source documentation tree (even though the class operates on a copy of the tree).

The solution

I created an extension for Sphinx. This is a Python file which adds a new directive globalindex to Sphinx. This directive can be used to explicitly insert the index in the SingleFileHTMLBuilder output. Instructions on how to use the extension are given below. The source code of the extension is shown at the bottom of the page.

In order to use the extension you should create a directory extensions in the top directory of your Sphinx project (where the Makefile generated by the sphinx-quickstart script is):

cd my-doc-root
mkdir extensions

Enter the directory and download the extension file:

cd extensions
wget http://fnch.users.sf.net/data/globalindex.py

Edit your conf.py so that it starts with the following two lines:

import sys, os
sys.path.insert(0, os.path.abspath('./extensions'))

This is necessary in order for Python to find the Sphinx extension. Edit conf.py again to enable the extension:

extensions = ['globalindex']

That’s it! You can now use the new directive in your documentation in the following way:

.. globalindex::
   :maxdepth: 3
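As a rough illustration of what a depth limit such as maxdepth does to a nested index, here is a toy renderer. It is a sketch under stated assumptions, not how Sphinx renders its table of contents: the outline data and the function render_index are hypothetical.

```python
# Toy depth-limited index renderer (not Sphinx's implementation).
import html

# Hypothetical section outline: (title, anchor, subsections).
outline = [
    ("Introduction", "intro", []),
    ("Usage", "usage", [
        ("Installation", "install", [
            ("From source", "source", []),
        ]),
    ]),
]


def render_index(entries, maxdepth, depth=1):
    """Render nested <ul> lists, dropping entries deeper than maxdepth."""
    if depth > maxdepth or not entries:
        return ""
    items = []
    for title, anchor, children in entries:
        link = '<a href="#%s">%s</a>' % (anchor, html.escape(title))
        nested = render_index(children, maxdepth, depth + 1)
        items.append("<li>%s%s</li>" % (link, nested))
    return "<ul>%s</ul>" % "".join(items)


# With maxdepth=2 the third-level entry "From source" is left out.
print(render_index(outline, maxdepth=2))
# With maxdepth=3 it is included.
print(render_index(outline, maxdepth=3))
```

In this toy model a larger limit gives a more detailed but longer index, so for a long single-page manual the value is a trade-off between completeness and readability.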

Note on the solution

I didn’t try to solve the problem in the right way (rewriting or adjusting the SingleFileHTMLBuilder class and sending the patch to the Sphinx developers), but this works for me with the current version of Sphinx.

This way I managed to insert the index in the documentation for the Nmag micromagnetic simulator. You can see the result by clicking here.

Source code of the extension

# Copyright (C) 2011 by Matteo Franchin
#
#   This file is free software: you can redistribute it and/or modify it
#   under the terms of the GNU General Public License as published
#   by the Free Software Foundation, either version 3 of the License, or
#   (at your option) any later version.
#
#   This file is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#   GNU General Public License for more details.
#
#   A copy of the GNU General Public License is available at
#   <http://www.gnu.org/licenses/>.

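# Directive is defined in docutils; sphinx.util.compat only re-exported it
# and is no longer available in later Sphinx releases.
from docutils import nodes
from docutils.parsers.rst import Directive, directives
from sphinx.builders.html import SingleFileHTMLBuilder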

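class globalindex(nodes.General, nodes.Element):
    """Placeholder node marking where the global index should appear."""
    pass

def visit_globalindex_node(self, node):
    # Emit the HTML fragment stored on the node by process_globalindex_nodes.
    self.body.append(node['content'])

def depart_globalindex_node(self, node):
    # Nothing to close: the whole fragment was written on visit.
    pass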

class GlobalIndexDirective(Directive):
    required_arguments = 0
    optional_arguments = 1
    final_argument_whitespace = True
    option_spec = \
      {'maxdepth': directives.nonnegative_int,
       'collapse': directives.flag,
       'titlesonly': directives.flag}

    def run(self):
        node = globalindex('')
        node['maxdepth'] = self.options.get('maxdepth', 2)
        node['collapse'] = 'collapse' in self.options
        node['titlesonly'] = 'titlesonly' in self.options
        return [node]

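def process_globalindex_nodes(app, doctree, fromdocname):
    """Fill in (or drop) the globalindex placeholders once the tree is resolved."""
    builder = app.builder
    if builder.name != SingleFileHTMLBuilder.name:
        # Other builders already render the toctree, so the placeholder
        # is simply removed.
        for node in doctree.traverse(globalindex):
            node.parent.remove(node)

    else:
        # Render the toctree of the master document and store the resulting
        # HTML on the node. _get_local_toctree is a private builder method,
        # so it may change between Sphinx versions.
        docname = builder.config.master_doc
        for node in doctree.traverse(globalindex):
            kwargs = dict(maxdepth=node['maxdepth'],
                          collapse=node['collapse'],
                          titles_only=node['titlesonly'])
            rendered_toctree = builder._get_local_toctree(docname, **kwargs)
            node['content'] = rendered_toctree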

def setup(app):
    app.add_node(globalindex,
                 html=(visit_globalindex_node, depart_globalindex_node))
    app.add_directive('globalindex', GlobalIndexDirective)
    app.connect('doctree-resolved', process_globalindex_nodes)