Store Log History for Repositories Registered to Yaydoc

Yaydoc, our automatic documentation generation and deployment project, generates and deploys documentation for each of its registered repository. For every commit made to the registered repository, there is a corresponding build process running at Yaydoc. These build processes have their own logs which are stored as text files. However, until now, these commits were never visible to the user. So, if there would have been a failed build process, the user would never know the reason behind, rendering the user unable to rectify the error.

Hence, there was a need to make these logs available to users. The initial thought was to store only the latest log overriding all the previous logs for the repository. However, it was unanimously decided by the developers to store a history of logs for the repository. The main motive behind this was to enable users to compare logs between different commits.

The content from the log files created is stored in a MongoDB collection. Following is the schema defined for the build logs.

const BuildLog = mongoose.model(‘BuildLog’, mongoose.Schema({
  repository: String,  // `full_name` of the repository
  buildNumber: {       // Incrementing number for each build
    type: Number,
    default: 0,
  },
  generate: {
    data: Buffer,      // Generate Logs content
    datetime: Date,    // Date time of generate log creation
  },
  ghpages: {
    data: Buffer,      // Github Pages Logs content
    datetime: Date,    // Date time of github pages log creation
  },
}));

The repository collection is also updated, adding a builds key storing the number of times the build process was triggered for a given repository. This key is incremented on every new build and the new value is stored along with the builds as buildNumber.

The build process involves a documentation generation and a documentation deployment script. The process of incrementing the build number in the repository occurs when we store the documentation generation logs. After that, Github pages logs are stored when the documentation deployment process is completed.

Since the logs are stored in a text file at the location temp/<email>/<filename>.txt, we had to read the file using NodeJS File system. The file is read synchronously using the fs.readFileSync(filename) function and then stored in the MongoDB collection.

/**
 * Store logs created while generating docs for a given repository
 * @param name: `full_name` of the repository
 * @param filepath: file path of the generate logs
 * @param callback
 */
module.exports.storeGenerateLogs = function (name, filepath, callback) {
  Repository.incrementBuildNumber(name, function (error, repository) {
    var buildlog = new BuildLog({
      repository: name,
      buildNumber: repository.builds,
      generate: {
        data: fs.readFileSync(filepath),
        datetime: new Date()
      }
    });
    buildlog.save(function (error, repository) {
      callback(error, repository);
    });
  });
};

/**
 * Store logs created while deploying docs for a given repository
 * @param name: `full_name` of the repository
 * @param filepath: file of the ghpages deploy logs
 * @param callback
 */
module.exports.storeGithubPagesLogs = function (name, filepath, callback) {
  Repository.getRepositoryByName(name, function (error, repository) {
    if (error) {
      callback(error);
    } else {
      BuildLog.getParticularBuildLog(repository.name, repository.builds, 
      function (error, buildLog) {
        buildLog.ghpages.data = fs.readFileSync(filepath);
        buildLog.ghpages.datetime = new Date();
        buildLog.save(function (error, buildLog) {
          callback(error, buildLog);
        });
      });
    }
  });
};

The stored logs can then be retrieved at two different routes, with /:owner/:name/logs showing a list to logs generated in at most 10 builds and /:owner/:name showing the latest log. Similar to logs generated by Travis, accessing these routes doesn’t require the user to login to Yaydoc.

/**
 * Get a single repository with a log history of 10
 * @param name: `full_name` of the repository
 * @param callback
 */
module.exports.getRepositoryWithLogs = function (name, callback) {
  Repository.aggregate([
    { $match: {name: name}},
    {
      $lookup: {
        from: ‘buildLogs’,
        localField: ‘name’,
        foreignField: ‘repository’,
        as: ‘logs’
      }
    },
    { $unwind: ‘$logs’ },
    { $sort: { ‘logs.buildNumber’: -1 } },
    { $limit: 10 }
  ]).exec(function (error, results){
    callback(error, results);
  });
};

In order to retrieve a repository along with its logs, we perform an aggregation in MongoDB which is similar to a Left Join in SQL. This is the $lookup aggregation and it performs a left outer join to an unsharded collection in the same database to filter in documents from the “joined” collection for processing. A similar method is used to retrieve the latest log by setting the limit aggregation to 1.

Resources:

  1. MongoDB Aggregation Lookup: https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/
  2. Mongoose Aggregate Constructor: http://mongoosejs.com/docs/api.html#index_Mongoose-Aggregate
  3. NodeJS File System: https://nodejs.org/api/fs.html#fs_fs_readfilesync_path_options