Automatically Generating index for documentation in Yaydoc

Yaydoc which uses Sphinx Documentation Generator internally needs a document named index.rst describing the overall layout of the documentation to generate a proper table of contents. Without an index.rst present, the build fails. With this week’s update that constraint has been relaxed. Now if yaydoc detects that index.rst has not been supplied, it automatically generates a minimal index for basic use. Although it is still recommended to provide your own index, you won’t be punished for its absence. The following sections show how this was implemented and also shows this feature in action.

Implementation

For generating a minimal index.rst, we perform the following steps:

  • If the repository has a README.rst or a README.md, we include it in the index
  • Several toctrees are generated as per how the documents in the repository are arranged.

The following code snippet returns a valid rst block which includes the document dirpath/filename

def get_include(dirpath, filename):
    ext = os.path.splitext(filename)[1]
    if ext == '.md':
        directive = 'mdinclude'
    else:
        directive = 'include'
    template = '.. {directive}:: {document}'
    path = os.path.relpath(os.path.join(dirpath, filename))
    document = path.replace(os.path.sep, '/')
    return template.format(directive=directive, document=document)

The following code snippet returns a valid rst block which creates a toctree of dirpath.

def get_toctree(dirpath, filenames):
    toctree = ['.. toctree::', '   :maxdepth: 1']
    caption_template = '   :caption: {caption}'
    content_template = '   {document}'

    caption = os.path.basename(dirpath).replace('_', ' ').title()
    if caption == os.curdir:
        caption = 'Contents'
    toctree.append(caption_template.format(caption=caption))
    # Inserting a blank line
    toctree.append('')

    valid = False
    for filename in filenames:
        path, ext = os.path.splitext(os.path.join(dirpath, filename))
        if ext not in ('.md', '.rst'):
            continue
        document = path.replace(os.path.sep, '/')
        document = document.lstrip('./').rstrip('/')
        toctree.append(content_template.format(document=document))
        valid = True

    if valid:
        return '\n'.join(toctree)
    else:
        return ''

The following code snippet walks the documentation directory and returns a valid content to be written to index.rst.

def get_index(root):
    index = []
    # Include README from root
    root_files = next(os.walk(root))[2]
    if 'README.rst' in root_files:
        index.append(get_include(root, 'README.rst'))
    elif 'README.md' in root_files:
        index.append(get_include(root, 'README.md'))
    # Add toctrees as per the directory structure
    for (dirpath, dirnames, filenames) in os.walk(os.curdir):
    if filenames:
        toctree = get_toctree(dirpath, filenames)
        if toctree:
            index.append(toctree)
    return '\n\n'.join(index) + '\n'

Result

Let’s assume that a sample project has the following directory tree for documentation.

+---_README.md
+---_docs/
|   +---_installation_guide/
|   |   +--- setup_heroku.md
|   |   +--- setup_docker.md
|   +---_tutorial/
|   |   +--- basic.md
|   |   +--- advanced.md

The following index.rst would be generated from the above tree

.. mdinclude:: ../README.md

.. toctree::
   :caption: Installation Guide
   :maxdepth: 1

   setup_heroku
   setup_docker

.. toctree::
   :caption: Tutorial
   :maxdepth:

   basic
   advanced

As you can see, this index.rst would be enough for most use cases. This update decreases the entry barrier for yaydoc. More features are on the way.

Resources

Continue ReadingAutomatically Generating index for documentation in Yaydoc

Using Root Directory as the Documentation Directory with Yaydoc

In our test builds for Yaydoc, we found that If we set the root as the documentation directory, the build would fail with a very long build log. In the build process, we create some temporary directories such as a virtual environment and the build directory in the root. After some inspection of the build logs, we found out that when the root is itself used as the documentation directory, we were accidently recursively copying the build directory into itself which led to build failure. Together with this, since the virtual environment directory was also being accidently copied to the build directory, we were actually building the documentation of the entire Python standard library on each build.

Once the problem and It’s cause was known, the course of action to be taken was clear. We needed to ensure that any temporary directories which we create as part of the build process was not being copied to the build directory. The following changes were made to achieve that.

  • The virtual environment directory was now being created in the HOME directory instead of the root.
  • Any other temporary directories which except the main build directory was now deleted before copying.
  • To prevent the recursive copying, we used the –exclude parameter of rsync.
rsync --exclude=BUILD_DIR DOCS_DIR/ BUILD_DIR/

After this patch, root can also be used as the documentation directory with Yaydoc. To do so, just set the environment variable DOCPATH as “.”

Continue ReadingUsing Root Directory as the Documentation Directory with Yaydoc

Advanced customization of the Yaydoc Build Process

Although, Yaydoc exposes many environment variables which can be used to configure various aspects of the build process, there may be cases where a user needs much more finer control over the build process. Yaydoc uses sphinx under the hood which uses a file named conf.py to allow users to customize the build. As part of the build process, Yaydoc generates a file named conf.py from a custom made jinja2 template. With this week’s update, now a user can extend the generated conf.py by providing their own conf.py whose content would be appended to the generated conf.py.

Why append you may ask. Why not just overwrite? This is because the generated conf.py has a lot of boilerplate code which when overwritten will need to be rewritten by the user. That is why the contents are appended so that the user will only need to specify any extra configuration options they may wish to add or override. This approach has the following advantages:

  • Ability to override or add any configuration option during build.
  • Since the conf.py file is execfile`d by sphinx during build, the user has the ability to execute arbitrary code to customize any part of the build process.

The following block of code implements this feature.

if [ -f $DOCPATH/conf.py ]; then
  echo >> BUILD_DIR/conf.py
  cat $DOCPATH/conf.py >> BUILD_DIR/conf.py
  rsync -a $DOCPATH. BUILD_DIR/ --exclude=conf.py
else
  cp -a $DOCPATH. BUILD_DIR/
fi

Here we check if user has provided a conf.py, we append it to the generated conf.py. To append we used the >> shell redirection feature. It redirects stdout to a file similar to > but instead of overwriting the file, it appends to it.

This brings us on parity with sphinx as  far as customization goes. We may expose some more configuration variables for easier setup in the future, but now you can always modify any aspects of the build process even if it is not exposed via a variable. This should be enough for most use cases. More changes are on the way. Stay tuned for more updates.

Continue ReadingAdvanced customization of the Yaydoc Build Process

Adding support for Markdown in Yaydoc

Yaydoc being based on sphinx natively supports reStructuredText. From the official docs:

reStructuredText is an easy-to-read, what-you-see-is-what-you-get plaintext markup syntax and parser system. It is useful for quickly creating simple web pages, and for standalone documents. reStructuredText is designed for extensibility for specific application domains.

Although it being superior to markdown in terms of features, Markdown is still the most heavily used markup language out there. This week we added support for markdown into Yaydoc. Now you can use Markdown to document your project and Yaydoc would create a site with no changes required from your end. To achieve this, we used recommonmark, which enables sphinx to parse CommonMark, a strongly defined, highly compatible specification of Markdown. It solved most of the problem with 3 lines of code in our customized conf.py .

from recommonmark.parser import CommonMarkParser

source_parsers = {
'.md': CommonMarkParser,
}

source_suffix = ['.rst', '.md']

With this addition, sphinx can now use recommonmark to convert markdown to html. But not everything has been solved. Here is an excerpt from a previous blogpost which explains a problem yet to be solved.

Now sphinx requires an index.rst file within docs directory  which it uses to generate the first page of the site. A very obvious way to fill it which helps us avoid unnecessary duplication is to use the include directive of reStructuredText to include the README file from the root of the repository. But the Include directive can only properly include a reStructuredText, not a markdown document. Given a markdown document, it tries to parse the markdown as  reStructuredText which leads to errors.

To solve this problem, a custom directive mdinclude was created. Directives are the primary extension mechanism of reStructuredText. Most of it’s implementation is a copy of the built in Include directive from the docutils package. Before including in the doctree, mdinclude converts the docs from markdown to reStructuredText using pypandoc. The implementation is similar to the one also discussed in a previous blogpost.

class MdInclude(rst.Directive):

required_arguments = 1
optional_arguments = 0

def run(self):
    if not self.state.document.settings.file_insertion_enabled:
        raise self.warning('"%s" directive disabled.' % self.name)
    source = self.state_machine.input_lines.source(
        self.lineno - self.state_machine.input_offset - 1)
    source_dir = os.path.dirname(os.path.abspath(source))
    path = rst.directives.path(self.arguments[0])
    path = os.path.normpath(os.path.join(source_dir, path))
    path = utils.relative_path(None, path)
    path = nodes.reprunicode(path)

    encoding = self.options.get(
        'encoding', self.state.document.settings.input_encoding)
    e_handler = self.state.document.settings.input_encoding_error_handler
    tab_width = self.options.get(
        'tab-width', self.state.document.settings.tab_width)

    try:
        self.state.document.settings.record_dependencies.add(path)
        include_file = io.FileInput(source_path=path,
                                    encoding=encoding,
                                    error_handler=e_handler)
    except UnicodeEncodeError as error:
        raise self.severe('Problems with "%s" directive path:\n'
                          'Cannot encode input file path "%s" '
                          '(wrong locale?).' %
                          (self.name, SafeString(path)))
    except IOError as error:
        raise self.severe('Problems with "%s" directive path:\n%s.' %
                          (self.name, ErrorString(error)))

    try:
        rawtext = include_file.read()
    except UnicodeError as error:
        raise self.severe('Problem with "%s" directive:\n%s' %
                          (self.name, ErrorString(error)))

    output = md2rst(rawtext)
    include_lines = statemachine.string2lines(output,
                                              tab_width, 
                                              convert_whitespace=True)
    self.state_machine.insert_input(include_lines, path)
    return []

With this, Yaydoc can now be used on projects that exclusively use markdown. There are some more hurdles which we need to cross in the following weeks. Stay tuned for more updates.

Continue ReadingAdding support for Markdown in Yaydoc

Using custom themes with Yaydoc to build documentation

What is Yaydoc?

Yaydoc aims to be a one stop solution for all your documentation needs. It is continuously integrated to your repository and builds the site on each commit. One of it’s primary aim is to minimize user configuration. It is currently in active development.

Why Themes?

Themes gives the user ability to generate visually different sites with the same markup documents without any configuration. It is one of the many features Yaydoc inherits from sphinx.

Now sphinx comes with 10 built in themes but there are much more custom themes available on PyPI, the official Python package repository. To use these custom themes, sphinx requires some setup. But Yaydoc being an automated system needs to performs those tasks automatically.

To use a custom theme which has been installed, sphinx needs to know the name of the theme and where to find it. We do that by specifying two variables in the sphinx configuration file. html_theme and html_theme_path respectively. Custom themes provide a method that can be called to get the html_theme_path of the theme. Usually that method is named get_html_theme_path . But that is not always the case. We have no way find the appropriate method automatically.

So how do we get the path of an installed theme just by it’s name and how do we add it to the generated configuration file.

The configuration file is generated by the sphinx-quickstart command which Yaydoc uses to initialize the documentation directory. We can override the default generated files by providing our own project templates. The templates are based on the Jinja2 template engine.

Firstly, I replaced

html_theme = ‘alabaster’

With

html_theme = ‘{{ html_theme }}’

This provides us the ability to pass the name of the theme as a parameter to sphinx-quickstart. Now the user has an option to choose between 10 built-in themes. For custom themes however there is a different story. I had to solve two major issues.

  • The name of the package and the theme may differ.
  • We also need the absolute path to the theme.

The following code snippet solves the above mentioned problems.

{% if html_theme in (['alabaster', 'classic', 'sphinxdoc', 'scrolls',
'agogo', 'traditional', 'nature', 'haiku',
'pyramid', 'bizstyle'])
%}
# Theme is builtin. Just set the name
html_theme = '{{ html_theme }}'
{% else %}
# Theme is a custom python package. Lets install it.
import pip
exitcode = pip.main(['install', '{{ html_theme }}'])
if exitcode:
    # Non-zero exit code
    print("""{0} is not available on pypi. Please ensure the theme can be installed using 'pip install {0}'.""".format('{{ html_theme }}'), file=sys.stderr)
else:
    import {{ html_theme }}
    def get_path_to_theme():
        package_path = os.path.dirname({{ html_theme }}.__file__)
        for root, dirs, files in os.walk(package_path):
            if 'theme.conf' in files:
                return root
    path_to_theme = get_path_to_theme()
    if path_to_theme is None:
        print("\n{0} does not appear to be a sphinx theme.".format('{{ html_theme }}'), file=sys.stderr)
        html_theme = 'alabaster'
    else:
        html_theme = os.path.basename(path_to_theme)
        html_theme_path = [os.path.abspath(os.path.join(path_to_theme, os.pardir))]
{% endif %}

It performs the following tasks in order:

  • It first checks if the provided theme is one of the built in themes. If that is indeed the case, we just set the html_theme config value to the name of the theme.
  • Otherwise, It installs the package using pip.
  • Now __file__ has a special meaning in python. It returns us the path of the module. We use it to get the path of the installed package.
  • Now each sphinx theme must have a file named `theme.conf` which defines several properties of the theme. We do a recursive search for that file.
  • We set html_theme to be the name of the directory which contains that file, and html_theme_path to be it’s parent directory.

Now let’s see everything in action. Here are four pages created by Yaydocs from a single markup document with no user configuration.

 

Now you can choose between many of the themes available on PyPI. You can even create your own theme. Follow this blog to get more insights and latest news about Yaydoc.

 

Continue ReadingUsing custom themes with Yaydoc to build documentation

Generating a documentation site from markup documents with Sphinx and Pandoc

Generating a fully fledged website from a set of markup documents is no easy feat. But thanks to the wonderful tool sphinx, it certainly makes the task easier. Sphinx does the heavy lifting of generating a website with built in javascript based search. But sometimes it’s not enough.

This week we were faced with two issues related to documentation generation on loklak_server and susi_server. First let me give you some context. Now sphinx requires an index.rst file within /docs/  which it uses to generate the first page of the site. A very obvious way to fill it which helps us avoid unnecessary duplication is to use the include directive of reStructuredText to include the README file from the root of the repository.

This leads to the following two problems:

  • Include directive can only properly include a reStructuredText, not a markdown document. Given a markdown document, it tries to parse the markdown as  reStructuredText which leads to errors.
  • Any relative links in README break when it is included in another folder.

To fix the first issue, I used pypandoc, a thin wrapper around Pandoc. Pandoc is a wonderful command line tool which allows us to convert documents from one markup format to another. From the official Pandoc website itself,

If you need to convert files from one markup format into another, pandoc is your swiss-army knife.

pypandoc requires a working installation of Pandoc, which can be downloaded and installed automatically using a single line of code.

pypandoc.download_pandoc()

This gives us a cross-platform way to download pandoc without worrying about the current platform. Now, pypandoc leaves the installer in the current working directory after download, which is fine locally, but creates a problem when run on remote systems like Travis. The installer could get committed accidently to the repository. To solve this, I had to take a look at source code for pypandoc and call an internal method, which pypandoc basically uses to set the name of the installer. I use that method to find out the name of the file and then delete it after installation is over. This is one of many benefits of open-source projects. Had pypandoc not been open source, I would not have been able to do that.

url = pypandoc.pandoc_download._get_pandoc_urls()[0][pf]
filename = url.split(‘/’)[-1]
os.remove(filename)

Here pf is the current platform which can be one of ‘win32’, ‘linux’, or ‘darwin’.

Now let’s take a look at our second issue. To solve that, I used regular expressions to capture any relative links. Capturing links were easy. All links in reStructuredText are in the same following format.

`Title <url>`__

Similarly links in markdown are in the following format

[Title](url)

Regular expressions were the perfect candidate to solve this. To detect which links was relative and need to be fixed, I checked which links start with the \docs\ directory and then all I had to do was remove the \docs prefix from those links.

A note about loklak and susi server project

Loklak is a server application which is able to collect messages from various sources, including twitter.

SUSI AI is an intelligent Open Source personal assistant. It is capable of chat and voice interaction and by using APIs to perform actions such as music playback, making to-do lists, setting alarms, streaming podcasts, playing audiobooks, and providing weather, traffic, and other real time information

Continue ReadingGenerating a documentation site from markup documents with Sphinx and Pandoc