gh-pages Publishing in Yaydoc’s Web UI

A few weeks back we rolled out the web interface for Yaydoc. The web UI lets users generate documentation with one click and download it as a zip file. We have now added the ability to deploy the generated documentation to GitHub Pages. To push the generated documentation we need the user's access token, so I used Passport's GitHub strategy to obtain it.

var passport = require("passport");
var githubStrategy = require("passport-github").Strategy;

passport.use(new githubStrategy({
  clientID: process.env.CLIENTID,
  clientSecret: process.env.CLIENTSECRET,
  callbackURL: process.env.CALLBACKURL
}, function (accessToken, refreshToken, profile, cb) {
  // Keep the OAuth token on the profile so later handlers can read it.
  profile.token = accessToken;
  cb(null, profile);
}));

passport.serializeUser(function (user, cb) {
  cb(null, user);
});

passport.deserializeUser(function (obj, cb) {
  cb(null, obj);
});

After setting the necessary environment variables, we have to pass the strategy to the Express route handler.

router.get("/github", function (req, res, next) {
  req.session.uniqueId = req.query.uniqueId;
  req.session.gitURL = req.query.gitURL;
  // Hand over to passport.authenticate below.
  next();
}, passport.authenticate('github', {
  scope: [
    // ...
  ]
}));

To maintain state, I keep the necessary information in the session so that the callback handler knows which repository to push to.


router.get("/callback", passport.authenticate('github'), function (req, res, next) {
  req.session.username = req.user.username;
  req.session.token = req.user.token;
  // ...
});

router.get("/deploy", function (req, res, next) {
  res.render("deploy", {
    gitURL: req.session.gitURL,
    uniqueId: req.session.uniqueId,
    // The token is encrypted before it is embedded in the page.
    token: crypter.encrypt(req.session.token),
    username: req.session.username
  });
});

GitHub sends the access token to our callback. After that, I pass the necessary information to the Jade deploy template, which invokes the deploy function over sockets. We then stream the bash output log to the website.

io.on('connection', function (socket) {
  socket.on('execute', function (formData) {
    generator.executeScript(socket, formData);
  });
  socket.on('deploy', function (data) {
    ghdeploy.deployPages(socket, data);
  });
});

var spawn = require("child_process").spawn;

exports.deployPages = function (socket, data) {
  var donePercent = 0;
  // Fifth "/"-separated piece of the URL, cut at the first dot.
  var repoName = data.gitURL.split("/")[4].split(".")[0];
  var webUI = "true";
  var username = data.username;
  var oauthToken = crypter.decrypt(data.encryptedToken);
  const args = [
    "-i", data.uniqueId,
    "-w", webUI,
    "-n", username,
    "-o", oauthToken,
    "-r", repoName
  ];
  // Named `child` rather than `process` to avoid shadowing Node's global.
  var child = spawn("./", args);

  // Forward every chunk of script output to the browser.
  child.stdout.on('data', function (chunk) {
    socket.emit('deploy-logs', {donePercent: donePercent, data: chunk.toString()});
    donePercent += 18;
  });

  child.stderr.on('data', function (chunk) {
    socket.emit('err-logs', chunk.toString());
  });

  child.on('exit', function (code) {
    console.log('child process exited with code ' + code);
    if (code === 0) {
      socket.emit('deploy-success', {pagesURL: "https://" + data.username + "" + repoName});
    }
  });
};

Once documentation is pushed to gh-pages, the documentation URL will get appended to the web UI.
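
The repository name in the deploy function is taken with split("/")[4].split(".")[0], which would cut a name such as foo.js short at its first dot. As a hedged sketch only (written in Python for illustration, not Yaydoc's code; the helper name repo_name is hypothetical), the same extraction can be done by parsing the URL and stripping just a trailing ".git":

```python
from urllib.parse import urlparse


def repo_name(git_url):
    """Return the repository name from an HTTPS clone URL (hypothetical helper)."""
    # The path looks like "/<owner>/<repo>" or "/<owner>/<repo>.git".
    parts = urlparse(git_url).path.strip("/").split("/")
    if len(parts) < 2 or not parts[1]:
        raise ValueError("expected a URL of the form https://host/<owner>/<repo>")
    name = parts[1]
    # Strip only a trailing ".git", so dots inside the name survive.
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return name
```

Raising on a malformed URL, instead of indexing blindly, makes a bad form submission fail before the deploy script is started.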



Using Express to show previews in Yaydoc

In the Yaydoc web UI, documentation is generated using Sphinx and can be downloaded as a zip file. To preview the documentation, users had to unzip the file and open the generated website after every build. So we decided to add a preview feature that shows the generated documentation, giving users an idea of how the website will look. Since the web UI is built with Express, we implemented the preview using Express's static function. It is mostly used for serving static assets, but we use it to serve our generated sites, since they are all static. Every generated documentation set gets a unique id, generated as per the UUID v4 spec, and is moved to a folder named after that id.

mv $BUILD_DIR/_build/html $ROOT_DIR/../${UNIQUEID}_preview && cd $_/../

var express = require("express");
var path = require("path");
var favicon = require("serve-favicon");
var logger = require("morgan");
var cookieParser = require("cookie-parser");
var bodyParser = require("body-parser");
var uuidV4 = require("uuid/v4");

var app = express();
app.use(bodyParser.urlencoded({ extended: false }));
app.use(cookieParser());
app.use(express.static(path.join(__dirname, "public")));
// Serve every generated site from the temp folder under /preview.
app.use("/preview", express.static(path.join(__dirname, "temp")));


The above snippet is just a basic Express server with a /preview route backed by the Express static handler. Pass the path of your generated website as the argument, and your sites are served under the /preview route.
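
The unique-folder naming described earlier can be sketched as follows. This is an illustration in Python rather than the Node.js code the web UI uses; the function names are hypothetical, and the "<id>_preview" folder name is taken from the mv command shown earlier.

```python
import uuid
from pathlib import Path


def new_unique_id():
    """Generate a random (version 4) UUID as a string."""
    return str(uuid.uuid4())


def preview_dir(root, unique_id):
    """Folder that would be served for one build: <root>/<id>_preview."""
    # Parsing the id rejects anything that is not a well-formed UUID,
    # which also keeps path separators out of the folder name.
    parsed = uuid.UUID(unique_id)
    return Path(root) / f"{parsed}_preview"
```

Validating the id before building the path matters for a preview route, because the id ends up in a URL that anyone can type.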



How to write your own custom AST parser?

  • Post category:GSoC

In Yaydoc, we use pandoc to convert text from one format to another. Pandoc is one of the best text conversion tools and helps users convert text between different markup formats. It is written in Haskell, and wrapper libraries are available for many programming languages, including Python, Node.js and Ruby. In Yaydoc, however, a few particular scenarios require us to customize the conversion to meet our needs, so I started to build a custom parser. The parser I made converts yml code blocks to yaml code blocks, because Sphinx needs yaml code blocks for rendering. To parse, we first have to split the text into tokens, so initially we have to write a lexer. Here is a sample snippet for a basic lexer.

import re


class Node:
    def __init__(self, text, token):
        self.text = text
        self.token = token

    def __str__(self):
        return self.text + ' ' + self.token


def lexer(text):
    def syntax_highliter_lexer(nodes, words):
        # Text before the ``` marker becomes a WORD; the rest is the fence token.
        splitted_syntax_highligter = words.split('```')
        if splitted_syntax_highligter[0] != '':
            nodes.append(Node(splitted_syntax_highligter[0], 'WORD'))
        splitted_syntax_highligter[0] = '```'
        words = ''.join([x for x in splitted_syntax_highligter])
        nodes.append(Node(words, 'SYNTAX HIGHLIGHTER'))
        return nodes

    syntax_re = re.compile('```')
    nodes = []
    pos = 0
    words = ''
    while pos < len(text):
        if text[pos] == ' ':
            # A space ends the current word, if any.
            if len(words) > 0:
                if syntax_re.search(words) is not None:
                    nodes = syntax_highliter_lexer(nodes, words)
                else:
                    nodes.append(Node(words, 'WORD'))
                words = ''
            nodes.append(Node(text[pos], 'SPACE'))
            pos = pos + 1
        elif text[pos] == '\n':
            # A newline ends the current word as well.
            if len(words) > 0:
                if syntax_re.search(words) is not None:
                    nodes = syntax_highliter_lexer(nodes, words)
                else:
                    nodes.append(Node(words, 'WORD'))
                words = ''
            nodes.append(Node(text[pos], 'NEWLINE'))
            pos = pos + 1
        else:
            words += text[pos]
            pos = pos + 1
    # Flush whatever is left after the last separator.
    if len(words) > 0:
        if syntax_re.search(words) is not None:
            nodes = syntax_highliter_lexer(nodes, words)
        else:
            nodes.append(Node(words, 'WORD'))
    return nodes
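
For comparison, the same token kinds can be produced with one regular expression. This is a hedged sketch, not the lexer used above: the names tokenize, TOKEN_RE and FENCE are made up for illustration, tokens are plain (text, kind) tuples instead of Node objects, and a word with more than one fence marker in it is treated slightly differently.

```python
import re

# One alternative per token kind; CHUNK is any run without spaces or newlines.
TOKEN_RE = re.compile(r"(?P<NEWLINE>\n)|(?P<SPACE> )|(?P<CHUNK>[^ \n]+)")
FENCE = "`" * 3  # the three-backtick code fence marker


def tokenize(text):
    """Return (text, kind) pairs using the token names from the lexer above."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind, value = match.lastgroup, match.group()
        if kind != "CHUNK":
            tokens.append((value, kind))
        elif FENCE in value:
            # Text before the fence stays a WORD; the fence and whatever
            # follows it (for example the language tag) form one token.
            before, _, after = value.partition(FENCE)
            if before:
                tokens.append((before, "WORD"))
            tokens.append((FENCE + after, "SYNTAX HIGHLIGHTER"))
        else:
            tokens.append((value, "WORD"))
    return tokens
```

Calling tokenize on a short Markdown string and printing the pairs is a quick way to see how a fenced block is split before any parsing happens.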

Once the text is converted into tokens, we have to parse the tokens to match our need. In this case a simple parser is enough.

I chose an abstract syntax tree (AST) approach to build the parser. An AST is a simple tree built around a root node expression. The left node is evaluated first, then the right node. If only one node follows the root node, its value is simply returned. Here is a sample snippet for the AST parser:

def parser(nodes, index):
    if nodes[index].token == 'NEWLINE':
        if index + 1 < len(nodes):
            return nodes[index].text + parser(nodes, index + 1)
        else:
            return nodes[index].text
    elif nodes[index].token == 'WORD':
        if index + 1 < len(nodes):
            return nodes[index].text + parser(nodes, index + 1)
        else:
            return nodes[index].text
    elif nodes[index].token == 'SYNTAX HIGHLIGHTER':
        if index + 1 < len(nodes):
            word = ''
            j = index + 1
            end_highligher = False
            end_pos = 0
            # Look for a later fence token that closes this one.
            while j < len(nodes):
                if nodes[j].token == 'SYNTAX HIGHLIGHTER':
                    end_pos = j
                    end_highligher = True
                j = j + 1
            if end_highligher:
                for k in range(index, end_pos + 1):
                    word += nodes[k].text
                # Make sure the code block starts and ends on its own line.
                if index != 0:
                    if nodes[index - 1].token != 'NEWLINE':
                        word = '\n' + word
                if end_pos + 1 < len(nodes):
                    if nodes[end_pos + 1].token != 'NEWLINE':
                        word = word + '\n'
                    return word + parser(nodes, end_pos + 1)
                else:
                    return word
            else:
                return nodes[index].text + parser(nodes, index + 1)
        else:
            return nodes[index].text
    elif nodes[index].token == 'SPACE':
        if index + 1 < len(nodes):
            return nodes[index].text + parser(nodes, index + 1)
        else:
            return nodes[index].text
In the end we didn’t use the parser in Yaydoc, because maintaining a custom parser is a huge hurdle, but it was a good learning experience.
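
To show how small the yml to yaml rewrite is on its own, here is a hedged regex sketch. It is not what Yaydoc ships, the names are hypothetical, and it only handles unindented backtick fences; it would also rewrite a matching line that happens to sit inside another code block.

```python
import re

FENCE = "`" * 3  # the three-backtick code fence marker
# An opening fence tagged exactly "yml", alone on its line.
YML_FENCE_RE = re.compile(r"^" + re.escape(FENCE) + r"yml[ \t]*$", re.MULTILINE)


def yml_to_yaml_fences(markdown):
    """Rename the language tag of yml-tagged fences to yaml."""
    return YML_FENCE_RE.sub(FENCE + "yaml", markdown)
```

Those limitations are the kind of edge cases that make a hand-written approach hard to maintain compared with working on a real document tree.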



How to add a custom filter to pypandoc

In Yaydoc, we ran into the problem of converting Markdown files into reStructuredText, because Sphinx needs reStructuredText.

Let us say we have a yml CodeBlock in Yaydoc. Sphinx uses Pygments for code highlighting, which needs yaml instead of yml for proper documentation generation. Pandoc has an excellent feature that allows us to plug our own custom logic into its AST pipeline.

INPUT --reader--> AST --filter--> AST --writer--> OUTPUT

Let me explain this in a few steps:

  1. Initially pandoc reads the file and converts it into nodes.
  2. Then the nodes are sent to the Pandoc AST for parsing the Markdown to reStructuredText.
  3. The parsed nodes then go to the filter, which converts them according to the logic implemented.
  4. Then the Pandoc AST performs further parsing, joins the nodes into text, and writes it to the file.
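
The filter step can be pictured without pandoc at all. The sketch below is a hand-rolled stand-in, not the pandocfilters implementation: it assumes pandoc's JSON form, where each node is an object with a "t" (type) key and usually a "c" (content) key, and the function names are illustrative.

```python
def walk(node, action):
    """Apply action(type, content) to every typed node; None means 'keep as is'."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replaced = action(node["t"], node.get("c"))
            if replaced is not None:
                return replaced
        return {key: walk(value, action) for key, value in node.items()}
    return node


def yml_to_yaml(key, value):
    # A CodeBlock's content is [[identifier, classes, key-value pairs], code].
    if key == "CodeBlock":
        (ident, classes, keyvals), code = value
        if classes and classes[0] == "yml":
            new_attr = [ident, ["yaml"] + classes[1:], keyvals]
            return {"t": "CodeBlock", "c": [new_attr, code]}
    return None
```

A real filter would read the document as JSON from standard input, run a walk like this over it, and write the result back to standard output as JSON.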

One important point to remember: pandoc reads the converted AST from the filter's output stream, so don’t write print statements in the filter. If you do, pandoc cannot parse the JSON. For debugging, use the logging module from the Python standard library. Here is a sample pypandoc filter:

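#!/usr/bin/env python
from pandocfilters import CodeBlock, toJSONFilter


def yml_to_yaml(key, val, fmt, meta):
    # Only CodeBlock nodes are rewritten; returning None leaves other nodes unchanged.
    if key == 'CodeBlock':
        [[indent, classes, keyvals], code] = val
        if len(classes) > 0:
            if classes[0] == u'yml':
                classes[0] = u'yaml'
        val = [[indent, classes, keyvals], code]
        return CodeBlock(val[0], val[1])


if __name__ == "__main__":
    # Read the JSON AST from stdin, apply the filter, write JSON to stdout.
    toJSONFilter(yml_to_yaml)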

The above snippet checks whether the node is a CodeBlock or not. If it is a CodeBlock, it changes yml to yaml and prints it as a JSON in the output stream. It is then parsed by pandoc.

Finally, all you have to do is add your filter to pypandoc’s filters argument.

output = pypandoc.convert_text(text, 'rst', format='md', filters=[os.path.join('filters', '')])



Creating a Custom Theme Template in sphinx for the yaydoc automatic documentation generator

Sphinx is one of the most popular documentation generators out there, and we can customize it to match the needs of the Yaydoc automatic documentation generator we are building at FOSSASIA. Sphinx comes with lots of themes, and you can also create your own. This post will guide you on how to set up your own custom theme template and how to make use of the sphinx-quickstart tool, which lets you create a boilerplate in a few seconds.

In Yaydoc, we have a feature for generating documentation from Markdown, which means the generated configuration has to be modified to handle Markdown. I first modified the bash script to add the necessary parser, but my co-contributor came up with a better idea: create template files and pass their path to sphinx-quickstart using the -t flag.

Below are the steps on how you can create your own sphinx template.

The commands for installing Sphinx and initializing the basic template are as follows:

pip install sphinx
sphinx-quickstart

sphinx-quickstart will ask you a series of questions and then create your basic template. You can customize the generated files by providing your own templates and asking Sphinx to generate the boilerplate from them. Sphinx uses Jinja for templating; see the Jinja documentation to learn more. Let’s start creating our own customized template. The basic files needed to create a new Sphinx template are as follows:

  • Makefile.new_t
  • Makefile_t
  • conf.py_t
  • make.bat.new_t
  • make.bat_t
  • master_doc.rst_t

conf.py_t contains all the configuration for documentation generation. Say you have to generate documentation from Markdown files: you will have to add the recommonmark parser. Instead of adding the parser after boilerplate generation, you can simply add it in the template beforehand.

from recommonmark.parser import CommonMarkParser
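
For reference, here is a hedged sketch of what the Markdown-related part of a generated conf.py might look like. It uses the extension-style setup that recommonmark documents for newer Sphinx releases (1.8 or later) rather than the explicit CommonMarkParser import above. Which form Yaydoc's template uses is not shown here, so treat this as an assumption.

```python
# Markdown support via the recommonmark extension
# (assumes the recommonmark package is installed).
extensions = ["recommonmark"]

# Map file suffixes to the parser that should handle them.
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}
```

Placing these lines in conf.py_t means every project generated from the template can mix .rst and .md sources without further edits.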

With the help of Jinja templating we can create the boilerplate according to our own logic. For example, if you want to hard-code the copyright, you can do it simply by changing conf.py_t.

copyright = u'{{ copyright_str }}'

master_doc.rst_t holds the default index page generated by Sphinx. You can edit that too, according to your needs. The remaining files are the basic makefiles for Sphinx and need no changes. You can see example snippets in the Yaydoc repository. After you are done with your templating, you can generate the boilerplate using the -t flag by specifying the template folder.

sphinx-quickstart -t <template folder path>