Pipelining Bash Script’s output to Webapp using Socket.io

Yaydoc, our automatic documentation generator, includes a web user interface among its components. This UI has a form that takes certain information about a user’s project as input, and the backend generates documentation from it with the help of a Bash script. The caveat of executing such a script is that the user has to wait for processing to complete before any output appears in the web app. This is a problem because the user cannot tell whether the process is running properly. Furthermore, servers used to deploy such web applications allow only a limited time span in which a response to a received GET or POST request must be sent. Since executing scripts may take some time, the process may lead to a request timeout.

We faced exactly this problem with Yaydoc while deploying it to Heroku. Heroku times out requests after 30 seconds, and the documentation generation script takes longer than that to run, so executing it led to a request timeout. After a bit of research, we were introduced to Socket.IO, a JavaScript library that enables real-time, bidirectional, event-based communication.

On the client side, we emit an “execute” event carrying the form data when the “Generate Docs” button is clicked. On the server side, we handle the event by calling generator.executeScript(...) with the socket and formData as its arguments.

/**
 * Client-side Event Handling
 */

$(function () {
 var socket = io();
 $("#btnGenerate").click(function () {
   var formData = getData();
   socket.emit("execute", formData);
 });
 ...
 ...
 ...
});

/**
 * Server-side Event Handling
 */
io.on("connection", function (socket) {
 socket.on("execute", function (formData) {
   generator.executeScript(socket, formData);
 });
});

 

Bash scripts are executed in Node.js by creating child processes using the `child_process` module. This module provides four different methods for executing external applications. They are:

  1. execFile
  2. exec
  3. spawn
  4. fork

Of these, exec() and execFile() buffer the script’s output and hand it to a callback only once the script has finished. They do not work for us because we need to keep sending output to the client while the script is still running. Thus, we opt for spawn(), which returns a child process object whose stdout and stderr are streams that emit a “data” event every time the script produces output. The spawn method is called in the executeScript method.

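var spawn = require("child_process").spawn;

exports.executeScript = function (socket, formData) {
 ...
 ...
 var process = spawn("./generate.sh", args);
 process.stdout.on("data", function (data) {
   // data is a Buffer; convert it to a string before sending it to the browser.
   socket.emit("logs", {data: data.toString()});
 });
 ...
 ...
};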
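
The buffered-versus-streamed distinction is not specific to Node.js. As a rough analogue (written in Python rather than JavaScript, and not part of Yaydoc), the sketch below contrasts collecting all output at the end with yielding each line as it arrives. The function names run_buffered and run_streamed are hypothetical, and the child command is only a stand-in for generate.sh.

```python
import subprocess
import sys


def run_buffered(cmd):
    """Wait for the command to finish, then return all of its output at once."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE, text=True, check=True)
    return result.stdout


def run_streamed(cmd):
    """Yield each line of output as it arrives on the pipe."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # A stand-in for generate.sh: a child process that prints two lines.
    demo = [sys.executable, "-c", "print('step 1'); print('step 2')"]
    for line in run_streamed(demo):
        print("log:", line)
```

In the streamed variant, each yielded line plays the role of one “logs” event emitted over the socket.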

The emitted logs are then received at the client-side for display in the web application.

/**
 * Client-side Event Handling
 */
$(function () {
 ...
 ...
 socket.on("logs", function (data) {
   $("#messages").append($("<li>").text(data.data));
 });
 ...
 ...
});

A minimal sample of this application can be found at: https://github.com/imujjwal96/socket-bashing

Automatically Generating index for documentation in Yaydoc

Yaydoc, which uses the Sphinx documentation generator internally, needs a document named index.rst that describes the overall layout of the documentation in order to generate a proper table of contents. Without an index.rst, the build used to fail. With this week’s update, that constraint has been relaxed: if Yaydoc detects that index.rst has not been supplied, it automatically generates a minimal index for basic use. Although it is still recommended to provide your own index, you won’t be punished for its absence. The following sections show how this was implemented and show the feature in action.

Implementation

For generating a minimal index.rst, we perform the following steps:

  • If the repository has a README.rst or a README.md, we include it in the index.
  • Several toctrees are generated according to how the documents in the repository are arranged.

The following code snippet returns a valid rst block which includes the document dirpath/filename

def get_include(dirpath, filename):
    ext = os.path.splitext(filename)[1]
    if ext == '.md':
        directive = 'mdinclude'
    else:
        directive = 'include'
    template = '.. {directive}:: {document}'
    path = os.path.relpath(os.path.join(dirpath, filename))
    document = path.replace(os.path.sep, '/')
    return template.format(directive=directive, document=document)

The following code snippet returns a valid rst block which creates a toctree of dirpath.

def get_toctree(dirpath, filenames):
    toctree = ['.. toctree::', '   :maxdepth: 1']
    caption_template = '   :caption: {caption}'
    content_template = '   {document}'

    caption = os.path.basename(dirpath).replace('_', ' ').title()
    if caption == os.curdir:
        caption = 'Contents'
    toctree.append(caption_template.format(caption=caption))
    # Inserting a blank line
    toctree.append('')

    valid = False
    for filename in filenames:
        path, ext = os.path.splitext(os.path.join(dirpath, filename))
        if ext not in ('.md', '.rst'):
            continue
        document = path.replace(os.path.sep, '/')
        document = document.lstrip('./').rstrip('/')
        toctree.append(content_template.format(document=document))
        valid = True

    if valid:
        return '\n'.join(toctree)
    else:
        return ''
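
One detail in get_toctree is worth a second look: str.lstrip('./') removes any leading run of '.' and '/' characters, not the literal prefix './'. That is harmless for a path such as ./docs/intro, but it would also eat the leading dot of a hidden directory. A minimal sketch of a stricter alternative follows; strip_curdir_prefix is a hypothetical helper, not part of Yaydoc, and the paths are made up.

```python
def strip_curdir_prefix(document):
    """Remove a single leading './' and any trailing '/' from a document path."""
    if document.startswith('./'):
        document = document[2:]
    return document.rstrip('/')


if __name__ == '__main__':
    # lstrip treats its argument as a set of characters to drop.
    print('./.github/contributing'.lstrip('./'))          # github/contributing
    print(strip_curdir_prefix('./.github/contributing'))  # .github/contributing
    print(strip_curdir_prefix('./docs/tutorial/basic'))   # docs/tutorial/basic
```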

The following code snippet walks the documentation directory and returns valid content to be written to index.rst.

def get_index(root):
    index = []
    # Include README from root
    root_files = next(os.walk(root))[2]
    if 'README.rst' in root_files:
        index.append(get_include(root, 'README.rst'))
    elif 'README.md' in root_files:
        index.append(get_include(root, 'README.md'))
    # Add toctrees as per the directory structure
    for (dirpath, dirnames, filenames) in os.walk(os.curdir):
        if filenames:
            toctree = get_toctree(dirpath, filenames)
            if toctree:
                index.append(toctree)
    return '\n\n'.join(index) + '\n'
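
get_index relies on the shape of the tuples that os.walk yields: (dirpath, dirnames, filenames), starting with the top directory itself. The sketch below, which uses a throwaway directory and made-up file names, shows why next(os.walk(root))[2] is the list of files directly under root. The helpers top_level_files and make_sample_tree are hypothetical.

```python
import os
import tempfile


def top_level_files(root):
    """Return the names of the files that sit directly inside root."""
    # The first tuple yielded by os.walk describes root itself;
    # index 2 of that tuple is its list of file names.
    return sorted(next(os.walk(root))[2])


def make_sample_tree(root):
    """Create a tiny, hypothetical documentation layout under root."""
    os.makedirs(os.path.join(root, 'docs', 'tutorial'))
    for relpath in ('README.md', os.path.join('docs', 'tutorial', 'basic.md')):
        with open(os.path.join(root, relpath), 'w') as handle:
            handle.write('# placeholder\n')


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as root:
        make_sample_tree(root)
        print(top_level_files(root))  # ['README.md']
        for dirpath, dirnames, filenames in os.walk(root):
            print(os.path.relpath(dirpath, root), filenames)
```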

Result

Let’s assume that a sample project has the following directory tree for documentation.

+---_README.md
+---_docs/
|   +---_installation_guide/
|   |   +--- setup_heroku.md
|   |   +--- setup_docker.md
|   +---_tutorial/
|   |   +--- basic.md
|   |   +--- advanced.md

The following index.rst would be generated from the above tree

.. mdinclude:: ../README.md

.. toctree::
   :caption: Installation Guide
   :maxdepth: 1

   setup_heroku
   setup_docker

.. toctree::
   :caption: Tutorial
   :maxdepth: 1

   basic
   advanced

As you can see, this index.rst would be enough for most use cases. This update lowers the barrier to entry for Yaydoc. More features are on the way.

Using Root Directory as the Documentation Directory with Yaydoc

In our test builds for Yaydoc, we found that if we set the root as the documentation directory, the build would fail with a very long build log. During the build, we create some temporary directories in the root, such as a virtual environment and the build directory. After inspecting the build logs, we found that when the root itself is used as the documentation directory, we were accidentally copying the build directory into itself recursively, which made the build fail. On top of that, since the virtual environment directory was also being copied into the build directory by accident, we were actually building the documentation of the entire Python standard library on each build.

Once the problem and its cause were known, the course of action was clear. We needed to ensure that none of the temporary directories we create as part of the build process were copied to the build directory. The following changes were made to achieve that.

  • The virtual environment directory is now created in the HOME directory instead of the root.
  • Any other temporary directories, except the main build directory, are now deleted before copying.
  • To prevent the recursive copying, we use the --exclude parameter of rsync.

rsync --exclude=BUILD_DIR DOCS_DIR/ BUILD_DIR/
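
The effect of --exclude can be illustrated without rsync. The following Python sketch (an analogue built on shutil, not the command Yaydoc runs) copies a documentation root into a build directory that lives inside that same root, while skipping the build directory itself. The function name copy_docs and the directory and file names are made up for the example, and dirs_exist_ok assumes Python 3.8 or newer.

```python
import os
import shutil
import tempfile


def copy_docs(docs_dir, build_dir):
    """Copy docs_dir into build_dir, skipping anything named like build_dir."""
    build_name = os.path.basename(os.path.normpath(build_dir))
    # ignore_patterns matches entry names at every depth of the tree.
    shutil.copytree(docs_dir, build_dir,
                    ignore=shutil.ignore_patterns(build_name),
                    dirs_exist_ok=True)


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, 'index.rst'), 'w') as handle:
            handle.write('Welcome\n')
        build_dir = os.path.join(root, 'yaydocbuild')
        os.mkdir(build_dir)
        copy_docs(root, build_dir)
        print(sorted(os.listdir(build_dir)))  # ['index.rst']
```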

After this patch, the root can also be used as the documentation directory with Yaydoc. To do so, just set the environment variable DOCPATH to “.”.

Deploy Static Web Pages In Six Keystrokes

I added two fairly young projects, Query Server and Yaydoc, to the projects list on http://labs.fossasia.org/. I pulled the code from GitHub, made the changes and it worked fine. Now, to get it reviewed by a co-developer, I needed to host my changes somewhere on the web.

The fossasia-labs repository runs on gh-pages by GitHub. Hence, one way of hosting my changes was to use gh-pages on my fork, but I tried surge instead to deploy my site in six keystrokes.

Deploying the static webpage took nothing more than my command line. Let’s dive into how this tool is as easy as it gets.

What is surge?
surge is a web-publishing tool aimed at front-end developers to help them get their static web pages up and running easily. It can be used to deploy HTML, CSS and JS with the ease of a single command.

How to use surge?
surge is quite an easy tool to use. It has been developed as an npm package. For those who don’t know what npm is: npm is the JavaScript package manager.

To have surge running, you need to have Node.js installed. Run these in the terminal:

sudo apt-get update 
sudo apt-get install nodejs
sudo apt-get install npm 

Now you have Node.js as well as npm installed. Let’s move on to the main course: installing surge.

npm install --global surge

You have installed surge!
(You may need to preface this command with sudo.)

So let’s go to the directory where we have our files to deploy. Here I have the labs.fossasia.org repository which we’ll try to deploy.

To clone this repo, run this command:

git clone git@github.com:fossasia/labs.fossasia.org.git

After changing into the directory named labs.fossasia.org, type

surge

and hit enter.

You’ll be prompted to sign up with your email and to choose a password. After that, surge lists the properties of the directory, namely its path and size, along with a domain. This is a domain randomly generated by surge. You can stick with it, or delete it and type whatever domain you like. surge will deploy your directory to that domain, provided that it is available.

In this example, I chose to drop the generated elfin-education domain and go with my-labs.surge.sh.

Press enter after typing in the desired domain name and you’ll see surge uploading files to the domain. Once the deployment succeeds, surge prints a confirmation message.


That’s it. Finally it’s time to check my-labs.surge.sh .

Saving your Domain with CNAME

Next up we take a look at making surge remember the domain.

By default, you’ll be prompted for a domain name every time you run surge inside the same directory. This can be avoided by adding a CNAME file to your directory root. Let’s say that you want to stick with ‘my-labs.surge.sh’ from the above example. You can add it to the CNAME file by running this in the terminal:

  echo my-labs.surge.sh > CNAME  

surge also offers adding your own custom domain for deployments. To know about this and read further about surge, visit surge.sh .


Handle sequential execution of scripts in Travis CI

Many projects on GitHub use Travis to automatically execute certain scripts on every build. Among these projects is Yaydoc, an automated documentation generation and deployment project. At the crux of Yaydoc are scripts that generate and deploy documentation. It uses Travis to execute these scripts on every build to keep the generated website in sync with the documentation in the markup files.

It is possible that, due to some issue, the scripts fail to execute. Unfortunately, a Travis build does not fail fast: Travis continues the build even after it encounters an error that will ultimately cause the build to fail.

There are many projects that involve the execution of multiple scripts sequentially, each dependent on the proper execution of the previous one. This requires that if one of the scripts fails, the process stops there and none of the scripts following it execute. Failing to achieve this can lead to unexpected outcomes.

One solution would be to handle, in every script, each statement that could lead to a failure in the build process. Opting for this approach could be burdensome, as there can be multiple scripts with a huge number of commands. It is also hard to know in advance which command could fail. Instead, our approach is to use the ‘Build Stages’ feature offered by Travis CI.

Travis’ build stages

Travis offers `build stages`, a way to group jobs so that the jobs within each stage run in parallel while the stages themselves run one after another. Put simply, `Build Stages` allows us to make one job run only if several other, parallel jobs have completed successfully.

These build stages can be used to execute one script at each stage, with Travis exiting at the stage in which the errored script is executed.

Consider the Travis configuration of a real-world project, Yaydoc, which is used to automatically generate and deploy documentation to GitHub Pages. Its configuration defines three stages, matching the three critical steps involved in generating documentation with Yaydoc:

  1. Installing and Setting Up Virtual Environment
  2. Generation of Documentation
  3. Publishing Documentation

It would not be wise for the system to publish documentation that was not generated properly. Hence, these three scripts are critical, with the execution of each script dependent on the successful execution of the previous one. Each script is defined in a separate stage, and thus a failed ‘Generate documentation’ script stops the build. If the scripts were executed in the usual way instead, the ‘Publish documentation’ script would run even after the ‘Generate documentation’ script had failed.
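
The behaviour described here, where a later stage runs only if every earlier one succeeded, can be sketched outside Travis. The Python snippet below only illustrates the fail-fast idea and is not Travis configuration; the stage names loosely mirror the three steps listed earlier, run_stages is a hypothetical helper, and the commands are stand-ins for the real scripts.

```python
import subprocess
import sys


def run_stages(stages):
    """Run (name, command) pairs in order and stop at the first failure.

    Returns the names of the stages that were actually started.
    """
    started = []
    for name, command in stages:
        started.append(name)
        if subprocess.run(command).returncode != 0:
            break  # Later stages depend on this one, so skip them.
    return started


if __name__ == '__main__':
    ok = [sys.executable, '-c', 'raise SystemExit(0)']
    fail = [sys.executable, '-c', 'raise SystemExit(1)']
    stages = [('install', ok), ('generate', fail), ('publish', ok)]
    print(run_stages(stages))  # ['install', 'generate']
```

Because the ‘generate’ stand-in exits with a non-zero status, ‘publish’ is never started.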

Creating a Custom Theme Template in sphinx for the yaydoc automatic documentation generator

Sphinx is one of the most widely used documentation generators out there, and we can customize it to match the needs of Yaydoc, the automatic documentation generator we are building at FOSSASIA. Sphinx comes with lots of themes, and you can also create your own. This blog post will guide you through setting up your own custom theme template and making use of the sphinx-quickstart tool, which lets you create a boilerplate in a few seconds.

In Yaydoc, we have a feature for generating documentation from Markdown. To do that, conf.py has to be modified. At first, I modified the Bash script to add the necessary parser to conf.py, but my co-contributor came up with a better idea: create a template file and pass the path of the template files to sphinx-quickstart using the -t flag.

Below are the steps on how you can create your own sphinx template.

The command for initializing the basic template is as follows:

pip install sphinx
sphinx-quickstart

After you run the commands above, sphinx-quickstart will ask you a series of questions and then create your basic template. You can customize the generated files by providing your own templates and asking Sphinx to generate the boilerplate from them. Sphinx uses Jinja for templating. Let’s start creating our own customized template. The basic files needed to create a new Sphinx template are as follows:

  • Makefile.new_t
  • Makefile_t
  • conf.py_t
  • make.bat.new_t
  • make.bat_t
  • master_doc.rst_t

conf.py_t contains all the configuration for documentation generation. Say you have to generate documentation from Markdown files: you will have to add the recommonmark parser. Instead of adding the parser after the boilerplate is generated, you can simply add it to the template beforehand.

from recommonmark.parser import CommonMarkParser

With the help of Jinja templating, we can create the boilerplate according to our own logic. For example, if you want to hard code the copyright, you can do it simply by changing conf.py_t:

copyright = u'{{ copyright_str }}'

master_doc.rst_t holds the default index page generated by Sphinx. You can edit that too, according to your needs. The remaining files are the basic Makefiles for Sphinx; there is no need to alter them. You can see example snippets in the Yaydoc repository. After you are done with your templating, you can generate the boilerplate using the -t flag, specifying the template folder.

sphinx-quickstart -t <template folder path>

Advanced customization of the Yaydoc Build Process

Although Yaydoc exposes many environment variables which can be used to configure various aspects of the build process, there may be cases where a user needs much finer control over it. Yaydoc uses Sphinx under the hood, which uses a file named conf.py to allow users to customize the build. As part of the build process, Yaydoc generates conf.py from a custom-made Jinja2 template. With this week’s update, a user can now extend the generated conf.py by providing their own conf.py, whose content is appended to the generated one.

Why append, you may ask. Why not just overwrite? Because the generated conf.py has a lot of boilerplate code which, if overwritten, would need to be rewritten by the user. Appending means the user only needs to specify the extra configuration options they wish to add or override. This approach has the following advantages:

  • Ability to override or add any configuration option during build.
  • Since the conf.py file is executed (execfile’d) by Sphinx during the build, the user can execute arbitrary code to customize any part of the build process.

The following block of code implements this feature.

if [ -f $DOCPATH/conf.py ]; then
  echo >> BUILD_DIR/conf.py
  cat $DOCPATH/conf.py >> BUILD_DIR/conf.py
  rsync -a $DOCPATH. BUILD_DIR/ --exclude=conf.py
else
  cp -a $DOCPATH. BUILD_DIR/
fi

Here we check whether the user has provided a conf.py and, if so, append it to the generated conf.py. To append, we use the >> shell redirection operator. Like >, it redirects stdout to a file, but instead of overwriting the file, it appends to it.
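
Appending is enough for overriding because a Python file is executed from top to bottom, so a later assignment to the same name wins. A minimal sketch of that behaviour follows, using made-up configuration values rather than Yaydoc’s actual generated conf.py; load_conf is a hypothetical helper.

```python
GENERATED_CONF = """
project = 'Sample Project'
html_theme = 'alabaster'
"""

USER_CONF = """
html_theme = 'classic'
html_show_sourcelink = False
"""


def load_conf(*sources):
    """Execute the concatenated sources and return the resulting settings."""
    namespace = {}
    exec('\n'.join(sources), namespace)
    # Drop names such as __builtins__ that exec adds on its own.
    return {key: value for key, value in namespace.items()
            if not key.startswith('__')}


if __name__ == '__main__':
    conf = load_conf(GENERATED_CONF, USER_CONF)
    print(conf['project'])     # Sample Project
    print(conf['html_theme'])  # classic
```

Values the user does not mention, such as project here, keep whatever the generated part assigned.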

This brings us on par with Sphinx as far as customization goes. We may expose some more configuration variables for easier setup in the future, but you can now always modify any aspect of the build process even if it is not exposed via a variable. This should be enough for most use cases. More changes are on the way. Stay tuned for more updates.

Adding support for Markdown in Yaydoc

Yaydoc, being based on Sphinx, natively supports reStructuredText. From the official docs:

reStructuredText is an easy-to-read, what-you-see-is-what-you-get plaintext markup syntax and parser system. It is useful for quickly creating simple web pages, and for standalone documents. reStructuredText is designed for extensibility for specific application domains.

Although reStructuredText is superior to Markdown in terms of features, Markdown is still the most heavily used markup language out there. This week we added support for Markdown to Yaydoc. Now you can use Markdown to document your project, and Yaydoc will create a site with no changes required on your end. To achieve this, we used recommonmark, which enables Sphinx to parse CommonMark, a strongly defined, highly compatible specification of Markdown. It solved most of the problem with 3 lines of code in our customized conf.py.

from recommonmark.parser import CommonMarkParser

source_parsers = {
    '.md': CommonMarkParser,
}

source_suffix = ['.rst', '.md']

With this addition, Sphinx can now use recommonmark to convert Markdown to HTML. But not everything has been solved. Here is an excerpt from a previous blog post which explains a problem yet to be solved.

Now sphinx requires an index.rst file within docs directory which it uses to generate the first page of the site. A very obvious way to fill it which helps us avoid unnecessary duplication is to use the include directive of reStructuredText to include the README file from the root of the repository. But the Include directive can only properly include a reStructuredText, not a markdown document. Given a markdown document, it tries to parse the markdown as reStructuredText which leads to errors.

To solve this problem, a custom directive, mdinclude, was created. Directives are the primary extension mechanism of reStructuredText. Most of its implementation is a copy of the built-in Include directive from the docutils package. Before inserting the content into the doctree, mdinclude converts it from Markdown to reStructuredText using pypandoc. The implementation is similar to the one discussed in a previous blog post.

class MdInclude(rst.Directive):

    required_arguments = 1
    optional_arguments = 0

    def run(self):
        if not self.state.document.settings.file_insertion_enabled:
            raise self.warning('"%s" directive disabled.' % self.name)
        source = self.state_machine.input_lines.source(
            self.lineno - self.state_machine.input_offset - 1)
        source_dir = os.path.dirname(os.path.abspath(source))
        path = rst.directives.path(self.arguments[0])
        path = os.path.normpath(os.path.join(source_dir, path))
        path = utils.relative_path(None, path)
        path = nodes.reprunicode(path)

        encoding = self.options.get(
            'encoding', self.state.document.settings.input_encoding)
        e_handler = self.state.document.settings.input_encoding_error_handler
        tab_width = self.options.get(
            'tab-width', self.state.document.settings.tab_width)

        try:
            self.state.document.settings.record_dependencies.add(path)
            include_file = io.FileInput(source_path=path,
                                        encoding=encoding,
                                        error_handler=e_handler)
        except UnicodeEncodeError as error:
            raise self.severe('Problems with "%s" directive path:\n'
                              'Cannot encode input file path "%s" '
                              '(wrong locale?).' %
                              (self.name, SafeString(path)))
        except IOError as error:
            raise self.severe('Problems with "%s" directive path:\n%s.' %
                              (self.name, ErrorString(error)))

        try:
            rawtext = include_file.read()
        except UnicodeError as error:
            raise self.severe('Problem with "%s" directive:\n%s' %
                              (self.name, ErrorString(error)))

        output = md2rst(rawtext)
        include_lines = statemachine.string2lines(output,
                                                  tab_width,
                                                  convert_whitespace=True)
        self.state_machine.insert_input(include_lines, path)
        return []

With this, Yaydoc can now be used on projects that exclusively use markdown. There are some more hurdles which we need to cross in the following weeks. Stay tuned for more updates.

Using custom themes with Yaydoc to build documentation

What is Yaydoc?

Yaydoc aims to be a one-stop solution for all your documentation needs. It is continuously integrated with your repository and builds the site on each commit. One of its primary aims is to minimize user configuration. It is currently in active development.

Why Themes?

Themes give the user the ability to generate visually different sites from the same markup documents without any configuration. It is one of the many features Yaydoc inherits from Sphinx.

Sphinx comes with 10 built-in themes, but there are many more custom themes available on PyPI, the official Python package repository. To use these custom themes, Sphinx requires some setup. But Yaydoc, being an automated system, needs to perform those tasks automatically.

To use a custom theme which has been installed, Sphinx needs to know the name of the theme and where to find it. We provide these by setting two variables in the Sphinx configuration file, html_theme and html_theme_path respectively. Custom themes provide a method that can be called to get the html_theme_path of the theme. Usually that method is named get_html_theme_path, but that is not always the case, and we have no way to find the appropriate method automatically.

So how do we get the path of an installed theme just from its name, and how do we add it to the generated configuration file?

The configuration file is generated by the sphinx-quickstart command which Yaydoc uses to initialize the documentation directory. We can override the default generated files by providing our own project templates. The templates are based on the Jinja2 template engine.

Firstly, I replaced

html_theme = 'alabaster'

with

html_theme = '{{ html_theme }}'

This gives us the ability to pass the name of the theme as a parameter to sphinx-quickstart. Now the user can choose between the 10 built-in themes. For custom themes, however, it is a different story. I had to solve two major issues.

  • The name of the package and the theme may differ.
  • We also need the absolute path to the theme.

The following code snippet solves the above mentioned problems.

{% if html_theme in (['alabaster', 'classic', 'sphinxdoc', 'scrolls',
'agogo', 'traditional', 'nature', 'haiku',
'pyramid', 'bizstyle'])
%}
# Theme is builtin. Just set the name
html_theme = '{{ html_theme }}'
{% else %}
# Theme is a custom python package. Lets install it.
import pip
exitcode = pip.main(['install', '{{ html_theme }}'])
if exitcode:
    # Non-zero exit code
    print("""{0} is not available on pypi. Please ensure the theme can be installed using 'pip install {0}'.""".format('{{ html_theme }}'), file=sys.stderr)
else:
    import {{ html_theme }}
    def get_path_to_theme():
        package_path = os.path.dirname({{ html_theme }}.__file__)
        for root, dirs, files in os.walk(package_path):
            if 'theme.conf' in files:
                return root
    path_to_theme = get_path_to_theme()
    if path_to_theme is None:
        print("\n{0} does not appear to be a sphinx theme.".format('{{ html_theme }}'), file=sys.stderr)
        html_theme = 'alabaster'
    else:
        html_theme = os.path.basename(path_to_theme)
        html_theme_path = [os.path.abspath(os.path.join(path_to_theme, os.pardir))]
{% endif %}

It performs the following tasks in order:

  • It first checks whether the provided theme is one of the built-in themes. If that is indeed the case, we just set the html_theme config value to the name of the theme.
  • Otherwise, it installs the package using pip.
  • __file__ has a special meaning in Python: it holds the path of the file from which a module was loaded. We use it to get the path of the installed package.
  • Each Sphinx theme must have a file named `theme.conf` which defines several properties of the theme. We do a recursive search for that file.
  • We set html_theme to the name of the directory which contains that file, and html_theme_path to its parent directory.
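
The search for theme.conf can be tried in isolation. The sketch below builds a fake package directory (the names fake_theme_pkg and shiny are invented, as are the helper functions) and derives html_theme and html_theme_path from it in the same way the template does.

```python
import os
import tempfile


def find_theme_dir(package_path):
    """Return the first directory under package_path that holds theme.conf."""
    for root, dirs, files in os.walk(package_path):
        if 'theme.conf' in files:
            return root
    return None


def theme_settings(package_path):
    """Return (html_theme, html_theme_path), or None if no theme is found."""
    theme_dir = find_theme_dir(package_path)
    if theme_dir is None:
        return None
    parent = os.path.abspath(os.path.join(theme_dir, os.pardir))
    return os.path.basename(theme_dir), [parent]


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        package = os.path.join(tmp, 'fake_theme_pkg')
        os.makedirs(os.path.join(package, 'shiny'))
        with open(os.path.join(package, 'shiny', 'theme.conf'), 'w') as handle:
            handle.write('[theme]\ninherit = basic\n')
        print(theme_settings(package))
```

Here the theme name comes out as shiny even though the package is called fake_theme_pkg, which is the first of the two issues listed earlier.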

Now let’s see everything in action. Here are four pages created by Yaydoc from a single markup document with no user configuration.

 

Now you can choose between many of the themes available on PyPI. You can even create your own theme. Follow this blog to get more insights and latest news about Yaydoc.