Setting up Yaydoc on Heroku

Yaydoc takes as its input information about a user's repository containing documentation in markup files and generates a static website from it. The website also includes search functionality within the documentation. It supports various built-in and custom Sphinx themes.

Since the Web User Interface is now prepared with some solid features, it was time to deploy. We chose Heroku because of the ease with which we can build and scale the application free of cost.

Yaydoc consists of two components: a Web User Interface, and the Generation and Deployment Scripts. Since the Web UI is developed with Node.js and the scripts involve Python modules, we need to include the following buildpacks:

  • heroku/nodejs
  • heroku/python
  • https://github.com/imujjwal96/heroku-buildpack-pandoc.git

We need to set certain Environment Variables in Heroku for Yaydoc to function properly. These include:

  • CALLBACKURL – URL where Github must return to after successful authentication
  • CLIENTID – Unique Client-Id generated by Github OAuth Application
  • CLIENTSECRET – Unique Client-Secret generated by Github OAuth Application
  • ENCRYPTION_KEY – Required to encrypt Personal Access Token of the user
  • ON_HEROKU – True, since the application is deployed to Heroku
  • PYPANDOC_PANDOC – Location of Pandoc binaries
  • SECRET – A very secret token
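Within the Node.js application these values are then read from the process environment. The following is a minimal sketch of how that might look; the variable names on the left are illustrative, not Yaydoc's actual code.

// Illustrative only – reading the configured Environment Variables in Node.js
var clientId = process.env.CLIENTID;
var clientSecret = process.env.CLIENTSECRET;
var callbackUrl = process.env.CALLBACKURL;
var encryptionKey = process.env.ENCRYPTION_KEY;
var onHeroku = !!process.env.ON_HEROKU;  // truthy when the variable is set on Heroku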

Steps for Manual Deployment

  1. Install the Heroku CLI on your local machine.

    • If you have a Linux-based operating system, type the following commands in the terminal
wget -qO- https://cli-assets.heroku.com/install-ubuntu.sh | sh
heroku login
    • Enter your credentials and log in.
  2. Deploy Yaydoc to Heroku

    • Clone the original yaydoc repository or your own fork
git clone https://github.com/<username>/yaydoc.git
    • Move to the directory of the cloned repository
cd yaydoc/
    • Create a Heroku application using the following command
heroku create <your-app-name>
    • Add buildpacks to the application using the following commands
heroku buildpacks:set heroku/nodejs
heroku buildpacks:add --index 2 heroku/python
heroku buildpacks:add --index 3 https://github.com/imujjwal96/heroku-buildpack-pandoc.git
    • Set Environment Variables using the following commands
heroku config:set CALLBACKURL=https://<your-app-name>.herokuapp.com/callback
heroku config:set CLIENTID=<github-generated>
heroku config:set CLIENTSECRET=<github-generated>
heroku config:set ENCRYPTION_KEY=AVERYSECRETTOKENOFSPECIFICLENGTH
heroku config:set ON_HEROKU=true
heroku config:set PYPANDOC_PANDOC=~/vendor/pandoc/bin/pandoc
heroku config:set SECRET=averysecrettoken
    • Now deploy your code
git push heroku master
    • Visit the app at the URL generated by its app name
heroku open

Web User Interface for Yaydoc

Yaydoc consists of two components:

  1. A configuration for various Continuous Integration software including Travis CI among others.
  2. A Web User Interface

Since the initial stage of its development, the team has been focused on developing a `documentation generation` script and a `publish to Github Pages` script. These scripts have been developed and tested using Travis CI.

We are now at the stage in the development of the project where we can generate the documentation of a project and keep it updated with every push to the Github repository that contains changes to the documentation. A sample of this can be seen at https://yaydoc.fossasia.org, which is a deployment of the documentation of Yaydoc using its own scripts.

After having gained enough confidence in the working of the script, we have now shifted our focus to developing a Web User Interface for the app. The WebUI is intended to provide various functionalities. These include, among others:

  • Generate the documentation and download the static files in a compressed format
  • Generate the documentation and make them available for a preview
  • Generate the documentation and deploy them to Heroku
  • Generate the documentation and deploy them to a web server using SFTP

NOTE: The aforementioned functionalities are not exhaustive. They are also not certain to be developed if they do not prove fruitful for the users of Yaydoc. We do not intend to bloat the application with features and functionalities that may never be used.

Technology Stack

The first issue that comes with developing any Web Application is the selection of its technology stack. With a huge number of languages and their web application frameworks, it becomes very difficult to reach a conclusion. After a lot of discussions, NodeJS was selected.

The User Interface involves various technologies including

  1. NodeJS – A JavaScript runtime.
  2. ExpressJS – A minimal and flexible Node.js web application framework.
  3. Pug (formerly Jade) – A high-performance template engine implemented for NodeJS.
  4. Socket.IO – A JavaScript library for realtime web applications that enables realtime, bi-directional communication between web clients and servers.

ExpressJS is set up using the express-generator, as it prepares a proper minimal architecture which makes it easy to scale up the application. Since the HTML part of the application will be minimal, Pug was chosen as it has a very clean and easy-to-read syntax. The use of Socket.IO became necessary as the app needs bidirectional communication, with the `GENERATE` script sending its log output to the front-end.
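The following is a minimal sketch of how such a stack can be wired together. It is illustrative only and not Yaydoc's actual server code; the route, the view name, and the event handling are assumptions.

// app.js – a minimal sketch, not Yaydoc's actual server setup
var express = require("express");
var http = require("http");
var path = require("path");

var app = express();
app.set("views", path.join(__dirname, "views"));
app.set("view engine", "pug");          // render views/*.pug templates

// Serve the form (assumes a views/index.pug exists)
app.get("/", function (req, res) {
  res.render("index", { title: "Yaydoc" });
});

// Attach Socket.IO to the same HTTP server for bidirectional communication
var server = http.createServer(app);
var io = require("socket.io")(server);

io.on("connection", function (socket) {
  // realtime events (e.g. streaming the GENERATE script's logs) are handled here
});

server.listen(process.env.PORT || 3000);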

Components of the Web User Interface

The UI consists of a form that asks the user to input

  1. Email address – To provide a unique identity for the user in order to isolate their documentation
  2. GITURL – URL of the repository which contains the docs to be generated
  3. Doc Theme – A dropdown consisting of built-in Sphinx themes

Out of the various arguments used to generate documentation with Sphinx, the following are assumed:

  • AUTHOR – Name of the user/organization of the repository
  • PROJECTNAME – Name of the repository
  • DOCPATH – Documentations are assumed to be stored at “docs/”

Apart from the form, the UI also has a block that is used to display the logs while the bash script is running in the backend.
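As an illustration, the form and the log block could be expressed in Pug roughly as follows. This is only a sketch and not Yaydoc's actual markup; the field names and theme values are placeholders, while the #btnGenerate and #messages ids are the ones referenced by the client-side code shown in the next section.

//- index.pug – an illustrative sketch of the input form and log block
form#generateForm
  input(type="email", name="email", placeholder="Email address")
  input(type="url", name="gitUrl", placeholder="URL of the repository")
  select(name="docTheme")
    option(value="alabaster") alabaster
    option(value="sphinx_rtd_theme") sphinx_rtd_theme
  button#btnGenerate(type="button") Generate Docs

//- Block used to display logs streamed from the backend script
ul#messages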

The components defined above are those that have been developed and are being tested rigorously. Since the app is constantly being developed with new features added almost daily, new components will be added to the User Interface.


Pipelining Bash Script’s output to Webapp using Socket.io

Yaydoc, our automatic documentation generator, consists, among other components, of a Web User Interface. This UI has a form that takes as its input certain information about a user's project and generates documentation from this information in the backend with the help of a Bash script. The caveat of executing such a Bash script is that a user has to wait for the processing to complete in order to get any output on the web app. This creates a problem, as the user may not know whether the process is executing properly. Furthermore, servers that are used to deploy such web applications have a limited time span within which they must send a response to a received GET or POST request. Since executing scripts may take some time, the process may lead to a Request Timeout.

We faced a similar problem with Yaydoc while deploying it to Heroku. Since Heroku has a timeout of 30 seconds, executing the documentation generation script led to a Request Timeout, as the execution takes more than 30 seconds. After doing a bit of research, we were introduced to Socket.IO. Socket.IO is one of the most powerful JavaScript frameworks, enabling real-time bidirectional event-based communication.

At the client side, we define an “execute” event which emits the form data when the “Generate Docs” button is clicked. At the server side, we handle the event by executing a generator.executeScript(...) function with the socket and formData as its arguments.

/**
 * Client-side Event Handling
 */

$(function () {
  var socket = io();
  $("#btnGenerate").click(function () {
    var formData = getData();
    socket.emit("execute", formData);
  });
  ...
  ...
  ...
});

/**
 * Server-side Event Handling
 */
io.on("connection", function (socket) {
  socket.on("execute", function (formData) {
    generator.executeScript(socket, formData);
  });
});


Bash scripts are executed in NodeJS by creating child processes using the `child_process` module. This module provides four different methods for executing external applications. They are:

  1. execFile
  2. exec
  3. spawn
  4. fork

Out of these, the exec() and execFile() methods return buffered data once the script finishes executing. We cannot use them as a solution because we need to continuously receive responses from the server as the script executes its commands. Thus, we opt for spawn(), which provides a stream-based interface that emits data whenever the script produces output. The spawn method is called in the executeScript method.

var spawn = require("child_process").spawn;

exports.executeScript = function (socket, formData) {
  ...
  ...
  var process = spawn("./generate.sh", args);
  process.stdout.on("data", function (data) {
    socket.emit("logs", {data: data});
  });
  ...
  ...
};

The emitted logs are then received at the client-side for display in the web application.

/**
 * Client-side Event Handling
 */
$(function () {
  ...
  ...
  socket.on("logs", function (data) {
    $("#messages").append($("<li>").text(data.data));
  });
  ...
  ...
});

A minimal sample of this application can be found at: https://github.com/imujjwal96/socket-bashing


Automatic deployment of Github repositories to a web server using SFTP

Git and Github are amazing tools for managing projects and are widely used to manage various web applications, among other projects. Even though they are very efficient for managing the codebase of a project, there is one concern. Since a project is generally developed at a different location and then deployed to the production server, it becomes burdensome to keep both instances in sync.

One of the most basic methods to achieve this is to push the changes at the developing instance to Github and then pull them to the instance on the server. A slightly better and less stringent option would be to use tools like git-ftp.

From git-ftp’s documentation

If you use Git and you need to upload your files to an FTP server, Git-ftp can save you some time and bandwidth by uploading only those files that changed since the last upload.

Although it decreases the strenuous work of accessing the server and pulling the changes, there is a caveat. In the world of version control, a lot of projects are developed in collaboration with various contributors, and it wouldn't be wise to give everybody access to the server.

Among the different ways to facilitate the task of keeping the two instances in sync, there is one way using which we can automate the deployment process of a Github repository to a web server and keep it in sync. We can achieve this by using a script that executes git-ftp commands in every CI build.

The script does include some dependencies but at the crux of it are the following commands.

git config git-ftp.url $FTP_URL
git config git-ftp.user $FTP_USER
git config git-ftp.password $FTP_PASSWORD

if ! git ftp push ; then
  git ftp init
fi
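The FTP_URL, FTP_USER and FTP_PASSWORD values are not committed to the repository; they would typically be supplied as hidden environment variables in the CI settings or encrypted into the Travis configuration with the Travis CLI. For example (the values below are placeholders):

travis encrypt FTP_USER=deploy-user --add env.global
travis encrypt FTP_PASSWORD=s3cr3t --add env.global
travis encrypt FTP_URL=sftp://example.org/var/www/site --add env.global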

Travis configuration

sudo: required

before_script:
- sudo apt-get install build-essential debhelper libssh2-1-dev
- sudo apt-get source libcurl13
- sudo apt-get build-dep libcurl13
- sudo add-apt-repository -y ppa:git-ftp/ppa
- sudo apt-get update
- sudo apt-get install git-ftp

script:
- source <(curl -s https://raw.githubusercontent.com/imujjwal96/sftp-audodeploy/master/init.sh)

Handle sequential execution of scripts in Travis CI

Many projects on Github use Travis to automatically execute certain scripts on every build. Among these projects is Yaydoc, an Automated Documentation Generation and Deployment Project. At the crux of Yaydoc are scripts that generate and deploy documentations. It uses Travis to execute these scripts on every build to keep the generated website in sync with the documentation in the markup files.

It is possible that, due to some issues, the scripts may fail to execute. Unfortunately, a Travis build does not fail fast. What this means is that Travis continues the build even after it encounters errors in the build process that cause it to fail.

There are many projects that involve the sequential execution of multiple scripts, with each dependent on the proper execution of the previous one. This requires that if one of the scripts fails, the process should stop there and none of the scripts following it should execute. Failing to achieve this can lead to unexpected outcomes.

One solution would be to handle, in all the scripts, those statements that could lead to a failure in the build process. Opting for this approach could be burdensome, as there can be multiple scripts with a huge number of commands, and it is hard to anticipate which command could fail to execute. Instead of opting for this, our approach is to use 'Build Stages' offered in Travis CI.

Travis’ build stages

Travis offers `build stages`, which is a way to group jobs and run the jobs in each stage in parallel, but run one stage after another sequentially. Put simply, `Build Stages` allow us to make one job run only if several other, parallel jobs have completed successfully.

These build stages can be used to execute one script at each stage, with Travis exiting at the stage in which the errored script is executed.
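The snippet below is a minimal sketch of such a staged configuration; the stage names and script paths are illustrative rather than the exact ones used by the project.

jobs:
  include:
    - stage: setup
      script: ./setup_environment.sh   # install and set up the virtual environment
    - stage: generate
      script: ./generate.sh            # generate the documentation
    - stage: publish
      script: ./publish_docs.sh        # publish the documentation to GitHub Pages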

Consider the Travis configuration defined above. This configuration describes the three stages that are involved in a real-world project, Yaydoc, which is used to Automatically Generate and Deploy the Documentation to Github Pages. It is clear from the configuration block that the three critical stages involved in the process of generating documentations using Yaydoc are

  1. Installing and Setting Up Virtual Environment
  2. Generation of Documentation
  3. Publishing Documentation

It would not be wise for the system to publish documentation that was not generated properly. Hence, these three scripts are critical, with the execution of each script dependent on the successful execution of the previous one. Each script is defined in a separate stage, so a failed 'Generate documentation' script stops the build. Had the scripts been executed in a single job instead, the 'Publish documentation' script would have run even after the 'Generate documentation' script failed.


Link one repository’s subdirectory to another

Loklak, a distributed social media message search server, generates its documentation automatically using Continuous Integration. It generates the documentation in the gh-pages branch of its repository. From there, the documentation is linked and updated in the gh-pages branch of dev.loklak.org repository.

The method with which Loklak links the documentation from the respective subprojects to the dev.loklak.org repository is by using `subtrees`. Quoting the git documentation:

“Subtrees allow subprojects to be included within a subdirectory of the main project, optionally including the subproject’s entire history. A subtree is just a subdirectory that can be committed to, branched, and merged along with your project in any way you want.”

What this means is that with the help of subtrees, we can link any project to our project keeping it locally in a subdirectory of our main repository. This is done in such a way that the local content of that project can easily be synced with the original project through a simple git command.

Representation:

A representation of how Loklak currently uses subtrees to link the repositories is as follows.

git clone --quiet --branch=gh-pages git@github.com:loklak/dev.loklak.org.git central-docs
cd central-docs

git subtree pull --prefix=server https://github.com/loklak/loklak_server.git gh-pages --squash -m "Update server subtree"

git subtree split:

The problem with the way documentation is generated currently is that the method is very bloated. The markup of the documentation is first compiled in each sub-project's repository and then aggregated in the dev.loklak.org repository. What we intend to do now is to keep all the markup documentation in the docs folder of the respective repositories and then create subtrees from these folders to a master repository. From the master repository, we then intend to compile the markup to generate the HTML pages.

Intended Implementation:

To implement this we will be making use of the subtree split command.

    1. Run the following command in the subproject's repository. It creates a subtree from the docs directory and moves it to a new branch, documentation.
git subtree split --prefix=docs/ -b documentation
    2. Clone the master repository. (This is the repository we intend to link the subdirectory with.)
git clone --quiet --branch=master git@github.com:loklak/dev.loklak.org.git loklak_docs
cd loklak_docs
    3. Retrieve the latest content from the subproject.
      • During the first run
git subtree add --prefix=raw/server ../loklak_server documentation --squash -m "Update server subtree"
      • During following runs
git subtree pull --prefix=raw/server ../loklak_server documentation --squash -m "Update server subtree"
    4. Push the changes.
git push -fq origin master > /dev/null 2>&1