Creating a command line tool to initiate loklak app development

There are various apps presently hosted on apps.loklak.org which show several interesting things that can be done using the loklak API. Now the question is: how do you create a loklak app? Previously there were two ways to create a loklak app. You could either create an app from scratch, creating all the necessary files yourself, including index.html (the app entry point), app.json (required for maintaining app metadata and the store listing), getStarted.md, appUse.md and other necessary files, or you could use the boilerplate app, which provides a ready-to-go app template that you edit to create your app.

Recently there has been a new addition to the repository: a command line tool called loklakinit, built with Python, which automates the process of initiating a loklak app on the go. The tool creates the entire loklak app directory structure, including app.json with all the necessary details (as provided by the user) and defaults, index.html, css and js subdirectories with a style.css and script.js in the respective folders, a getStarted.md file, an appUse.md file and an others.md file, with proper references in app.json. All the developer needs to do is execute the following from the root of the apps.loklak.org repository:

```shell
bin/loklakinit.sh
```

This will start the process and initiate the app.

Creating the loklakinit tool

Now let us delve into the code. What does the script actually do, and how does it work? The tool acts very much like the popular npm init command. At first the script creates a dictionary which stores the defaults for app.json. Next the script takes a number of inputs from the user and suggests some default values; if the user does not provide any input for a particular parameter, the default value for that parameter is used.
```python
app_json = collections.OrderedDict()
app_json["@context"] = "http://schema.org"
app_json["@type"] = "SoftwareApplication"
app_json["permissions"] = "/api/search.json"
app_json["name"] = "myloklakapp"
app_json["headline"] = "My first loklak app"
app_json["alternativeHeadline"] = app_json["headline"]
app_json["applicationCategory"] = "Misc"
app_json["applicationSubCategory"] = ""
app_json["operatingSystem"] = "http://loklak.org"
app_json["promoImage"] = "promo.png"
app_json["appImages"] = ""
app_json["oneLineDescription"] = ""
app_json["getStarted"] = "getStarted.md"
app_json["appUse"] = "appUse.md"
app_json["others"] = "others.md"

author = collections.OrderedDict()
author["@type"] = "Person"
author["name"] = ""
author["email"] = ""
author["url"] = ""
author["sameAs"] = ""
app_json["author"] = author
```

The first part of the script inserts some default values into the app_json dictionary. Now what is an OrderedDict? It is nothing but a Python dictionary in which the order in which the keys are inserted is maintained. An ordinary Python dictionary does not maintain the order of the keys.

```python
while True:
    app_name = raw_input("name (" + app_json["name"] + ") : ")
    if app_name:
        app_json["name"] = app_name

    app_context = raw_input("@context (" + app_json["@context"] + ") : ")
    if app_context:
        app_json["@context"] = app_context

    app_type = raw_input("@type (" + app_json["@type"] + ") : ")
    if app_type:
        app_json["@type"] = app_type

    app_permissions = raw_input("permissions (" + app_json["permissions"] + ") : ")
    if app_permissions:
        app_json["permissions"] = app_permissions.split(",")

    app_headline = raw_input("headline (" + app_json["headline"] + ") : ")
    if app_headline:
        app_json["headline"] = app_headline
    …
```
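Since app.json is eventually written out as JSON, the insertion order of the keys determines the layout of the generated file. The following is a minimal, stand-alone sketch (not part of the loklakinit script, and written for Python 3 while the script above uses Python 2's raw_input) showing how an OrderedDict preserves that order when serialized:

```python
import collections
import json

# An OrderedDict remembers the order in which keys were inserted,
# so the serialized app.json keeps a predictable, readable layout.
app_json = collections.OrderedDict()
app_json["@context"] = "http://schema.org"
app_json["@type"] = "SoftwareApplication"
app_json["name"] = "myloklakapp"

serialized = json.dumps(app_json, indent=4)
print(serialized)  # keys appear exactly in insertion order
```

Note that in Python 3.7+ plain dicts also preserve insertion order, but OrderedDict makes the intent explicit and works on older interpreters.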

Continue Reading: Creating a command line tool to initiate loklak app development

Enhancing Github profile scraper service in loklak

The Github profile scraper is one of the several scrapers present in the loklak project, other scrapers being the Quora profile scraper, Wordpress profile scraper, Instagram profile scraper, etc. The Github profile scraper scrapes the profile of a given github user and provides the data in JSON format. The scraped data contains the user_id, followers, users following the given user, starred repositories of a user, basic information like the user's name and bio, and much more. It uses the popular Java Jsoup library for scraping. The entire source code for the scraper can be found here.

Changes made

The scraper previously provided very limited information, like the name of the user, description, starred repository url, followers url, following url, user_id, etc. One major problem was that the API accepted only the profile name as a parameter and returned the entire data set; that is, the scraper scraped all the data even if the user did not ask for it. Moreover, the service provided certain urls as data (for example the starred repository url, followers url and following url) instead of providing the actual data present at those urls. The scraper contained only one big function in which all the scraping was performed; the code was not modular. The scraper has been enhanced in the following ways:

- The entire code has been refactored. Code scraping user-specific data has been separated from code scraping organization-specific data. Also, a separate method has been created for accessing the Github API.
- Apart from the profile parameter, the API now accepts another parameter called terms. It is a list of fields the user wants information on. In this way the scraper scrapes only as much data as is required by the user. This allows better response time and prevents unnecessary scraping.
- The scraper now provides more information, like user gists, subscriptions, events, received_events and repositories.

Code refactoring

The code has been refactored into smaller methods.
The getDataFromApi method has been designed to access the Github API. All the other methods which want to make a request to api.github.com/ now call getDataFromApi with the required parameter. The method is shown below.

```java
private static JSONArray getDataFromApi(String url) {
    URI uri = null;
    try {
        uri = new URI(url);
    } catch (URISyntaxException e1) {
        e1.printStackTrace();
    }
    JSONTokener tokener = null;
    try {
        tokener = new JSONTokener(uri.toURL().openStream());
    } catch (Exception e1) {
        e1.printStackTrace();
    }
    JSONArray arr = new JSONArray(tokener);
    return arr;
}
```

For example, if we want to make a request to the endpoint https://api.github.com/users/djmgit/followers then we can use the above method, and we get a JSONArray in return.

All the code which scrapes user-related data has been moved to the scrapeGithubUser method. This method scrapes basic user information like the full name of the user, bio, user name, atom feed link, and location.

```java
String fullName = html.getElementsByAttributeValueContaining("class", "vcard-fullname").text();
githubProfile.put("full_name", fullName);

String userName = html.getElementsByAttributeValueContaining("class", "vcard-username").text();
githubProfile.put("user_name", userName);

String bio = html.getElementsByAttributeValueContaining("class", "user-profile-bio").text();
githubProfile.put("bio", bio);

String atomFeedLink = html.getElementsByAttributeValueContaining("type", "application/atom+xml").attr("href");
githubProfile.put("atom_feed_link", "https://github.com" + atomFeedLink);

String worksFor = html.getElementsByAttributeValueContaining("itemprop", "worksFor").text();
githubProfile.put("works_for", worksFor);

String homeLocation = html.getElementsByAttributeValueContaining("itemprop", "homeLocation").attr("title");
githubProfile.put("home_location", homeLocation);
…
```
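For readers more comfortable with Python, here is a rough, hedged sketch of what getDataFromApi does; the helper names are my own, and only the endpoint URL pattern (for example https://api.github.com/users/djmgit/followers) comes from the scraper itself:

```python
import json
import urllib.request

GITHUB_API_BASE = "https://api.github.com/users/"

def build_api_url(profile, endpoint):
    # e.g. build_api_url("djmgit", "followers")
    #  -> "https://api.github.com/users/djmgit/followers"
    return GITHUB_API_BASE + profile + "/" + endpoint

def get_data_from_api(url):
    # Python analog of the Java getDataFromApi: open the URL and
    # decode the response body as a JSON array (a Python list).
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))

# Usage (requires network access):
# followers = get_data_from_api(build_api_url("djmgit", "followers"))
```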

Continue Reading: Enhancing Github profile scraper service in loklak

Link one repository’s subdirectory to another

Loklak, a distributed social media message search server, generates its documentation automatically using Continuous Integration. It generates the documentation in the gh-pages branch of its repository. From there, the documentation is linked and updated in the gh-pages branch of the dev.loklak.org repository. The method with which Loklak links the documentation from the respective subprojects to the dev.loklak.org repository is by using subtrees. Quoting the git documentation:

"Subtrees allow subprojects to be included within a subdirectory of the main project, optionally including the subproject's entire history. A subtree is just a subdirectory that can be committed to, branched, and merged along with your project in any way you want."

What this means is that with the help of subtrees we can link any project to our project, keeping it locally in a subdirectory of our main repository. This is done in such a way that the local content of that project can easily be synced with the original project through a simple git command.

Representation:

A representation of how Loklak currently uses subtrees to link the repositories is as follows.

```shell
git clone --quiet --branch=gh-pages git@github.com:loklak/dev.loklak.org.git central-docs
cd central-docs
git subtree pull --prefix=server https://github.com/loklak/loklak_server.git gh-pages --squash -m "Update server subtree"
```

git subtree split:

The problem with the way documentation is generated currently is that the method is very bloated. The markup of the documentation is first compiled in each subproject's repository and then aggregated in the dev.loklak.org repository. What we intend to do now is to keep all the markup documentation in the docs folder of the respective repositories and then create subtrees from these folders to a master repository. From the master repository, we then intend to compile the markup to generate the HTML pages.

Intended Implementation:

To implement this we will be making use of the subtree split command.
Run the following command in the subproject's repository. It creates a subtree from the docs directory and moves it to a new branch, documentation.

```shell
git subtree --prefix=docs/ split -b documentation
```

Clone the master repository. (This is the repository we intend to link the subdirectory with.)

```shell
git clone --quiet --branch=master git@github.com:loklak/dev.loklak.org.git loklak_docs
cd loklak_docs
```

Retrieve the latest content from the subproject. During the first run:

```shell
git subtree add --prefix=raw/server ../loklak_server documentation --squash -m "Update server subtree"
```

During following runs:

```shell
git subtree pull --prefix=raw/server ../loklak_server documentation --squash -m "Update server subtree"
```

Push the changes:

```shell
git push -fq origin master > /dev/null 2>&1
```

Continue Reading: Link one repository’s subdirectory to another

Continuous Deployment Implementation in Loklak Search

At the current pace of web technology, quick response times and low downtime are core goals of any project. To achieve a continuous deployment scheme, the most important factor is how efficiently contributors and maintainers are able to test and deploy the code with every PR. We faced this question when we started building loklak search. As Loklak Search is a data-driven, client-side web app, GitHub Pages is the simplest way to set it up. At FOSSASIA, apps are developed by many developers working together on different features. This makes it all the more important to have a unified flow of control and simple integration with GitHub Pages as the continuous deployment pipeline. So the broad concept of continuous deployment boils down to three basic requirements:

- Automatic unit testing.
- Automatic builds of the application on the successful merge of a PR, and deployment on the gh-pages branch.
- Easy provision of demo links for developers to test and share the features they are working on before the PR is actually merged.

Automatic Unit Testing

At Loklak Search we use karma unit tests. For loklak search, we get major help from angular/cli, which helps in running the unit tests. The main part of the unit testing is Travis CI, which is used as the CI solution. All these things are pretty easy to set up and use. Travis CI has a particular advantage: the ability to run custom shell scripts at different stages of the build process, and we use this capability for our continuous deployment.

Automatic Builds of PRs and Deploy on Merge

This is the main requirement of our CD scheme, and we meet it by setting up a shell script. This file is deploy.sh in the project repository root. There are a few critical sections of the deploy script. The script starts with the initialisation instructions, which set up the appropriate variables and also decrypt the SSH key which Travis uses for pushing to the gh-pages branch (we will set up this key later).
Here we also check that we run our deploy script only when the build is for the master branch, and we do this by exiting early from the script if it is not so.

```shell
#!/bin/bash
SOURCE_BRANCH="master"
TARGET_BRANCH="gh-pages"

# Pull requests and commits to other branches shouldn't try to deploy.
if [ "$TRAVIS_PULL_REQUEST" != "false" -o "$TRAVIS_BRANCH" != "$SOURCE_BRANCH" ]; then
    echo "Skipping deploy; The request or commit is not on master"
    exit 0
fi
```

We also store important information regarding the deploy keys, which are generated manually and are encrypted using Travis.

```shell
# Save some useful information
REPO=`git config remote.origin.url`
SSH_REPO=${REPO/https:\/\/github.com\//git@github.com:}
SHA=`git rev-parse --verify HEAD`

# Decryption of the deploy_key.enc
ENCRYPTED_KEY_VAR="encrypted_${ENCRYPTION_LABEL}_key"
ENCRYPTED_IV_VAR="encrypted_${ENCRYPTION_LABEL}_iv"
ENCRYPTED_KEY=${!ENCRYPTED_KEY_VAR}
ENCRYPTED_IV=${!ENCRYPTED_IV_VAR}
openssl aes-256-cbc -K $ENCRYPTED_KEY -iv $ENCRYPTED_IV -in deploy_key.enc -out deploy_key -d
chmod 600 deploy_key
eval `ssh-agent -s`
ssh-add deploy_key
```

We clone our repo from GitHub and then go to the target branch, which is gh-pages in our case.

```shell
# Cloning the repository to repo/ directory,
#…
```

Continue Reading: Continuous Deployment Implementation in Loklak Search

Build a simple Angular 4 message publisher

Introduction

The differences between Angular 1, Angular 2 and Angular 4: AngularJS/Angular 1 is a very popular framework based on the MVC model; it was released in October 2010. Angular 2, also simply called Angular, is component-based and completely different from Angular 1; it was released in September 2016. Angular 4 is simply an update of Angular 2, released in March 2017. Note that Angular 3 was skipped due to version number conflicts.

Installation

Prerequisites:

1. Install TypeScript: npm install -g typescript
2. Install a code editor: vscode, sublime, intellij.
3. Use Google Chrome and its development tools to debug.

Install Angular CLI by command line

Angular CLI is a very useful tool which contains a bunch of commands to work with Angular 4. In order to install Angular CLI, the version of Node should be 6.9.0 or higher, and npm needs to be version 3 or higher.

```shell
npm install -g @angular/cli
```

Create a new project

```shell
ng new loklak-message-publisher
cd loklak-message-publisher
ng serve
```

The command will automatically launch the application in the web browser at localhost:4200, and it will detect any change in the files. Of course, you can change the default port. The basic project structure is like this:

- package.json: standard node configuration file, which includes the name, version and dependencies of the project.
- tsconfig.json: configuration file for the TypeScript compiler.
- typings.json: another configuration file for TypeScript, mainly used for type checking.
- The app folder contains the main ts files, boot.ts and app.component.ts.

Next we will show how to build a simple message publisher. The final application looks like this: we can post new tweets in the timeline, and remove any tweet by clicking the 'x' button. The first step is to create new elements in the app.component.html file. The HTML file looks quite simple; when we save the file and switch to the browser, the application shows a very simple interface!
In the next step we need to add some style for the elements. Edit the app.component.css file:

```css
.block {
  width: 800px;
  border: 1px solid #E8F5FD;
  border-bottom: 0px;
  text-align: center;
  margin: 0 auto;
  padding: 50px;
  font-family: monospace;
  background-color: #F5F8FA;
  border-radius: 5px;
}

h1 {
  text-align: center;
  color: #0084B4;
  margin-top: 0px;
}

.post {
  width: 600px;
  height: 180px;
  text-align: center;
  margin: 0 auto;
  background-color: #E8F5FD;
  border-radius: 5px;
}

.addstory {
  width: 80%;
  height: 100px;
  font-size: 15px;
  margin-top: 20px;
  resize: none;
  -webkit-border-radius: 5px;
  -moz-border-radius: 5px;
  border-radius: 5px;
  border-color: #fff;
  outline: none;
}

.post button {
  height: 30px;
  width: 75px;
  margin-top: 5px;
  position: relative;
  left: 200px;
  background-color: #008CBA;
  border: none;
  color: white;
  display: inline-block;
  border-radius: 6px;
  outline: none;
}

.post button:active {
  border: 0.2em solid #fff;
  opacity: 0.6;
}

.stream-container {
  width: 600px;
  border-bottom: 1px solid #E8F5FD;
  background-color: white;
  padding: 10px 0px;
  margin: 0 auto;
  border-radius: 5px;
}

.tweet {
  border: 1px solid;
  border-color: #E8F5FD;
  border-radius: 3px;
}

p {
  text-align: left;
  text-indent: 25px;
  display: inline-block;
  width: 500px;
}

span {
  display: inline-block;
  width: 30px;
  height: 30px;
  font-size: 30px;
  cursor: pointer;
  -webkit-text-fill-color: #0084B4;
}
```

The app is…

Continue Reading: Build a simple Angular 4 message publisher

Using LokLak to Scrape Profiles from Quora, GitHub, Weibo and Instagram

Most of us are really curious to know about one's social life. Taking this as a key point, LokLak has many profile scrapers in it. The profile scrapers now available in LokLak help us to know about the posts and followers one has. A few of the profile scrapers available in LokLak are the Quora profile, GitHub profile, Weibo profile and Instagram profile scrapers.

How do the scrapers work?

In loklak we are using Java to get the JSON objects of the scraped profiles from the different websites mentioned above. Here is a simple explanation of how one of the scrapers works. In this post I am going to give you a gist of how the Github profile scraper API works. With the github profile scraper one can search for a profile without logging in and get contents like the followers, repositories and gists of that profile, and much more. The simple queries which can be used are:

To scrape individual profiles:
https://loklak.org/api/githubprofilescraper.json?profile=kavithaenair

To scrape organization profiles:
https://loklak.org/api/githubprofilescraper.json?profile=fossasia

Jsoup is a Java library that offers one of the easiest ways for Java developers to do web scraping. This API is used for manipulating and extracting data using DOM and CSS-like methods. So in here, the Jsoup API helps us to extract the HTML data, and with the help of the tags used in the extracted HTML we try to get the relevant data which is needed.

How do we get the matching elements?

We are using special methods like getElementsByAttributeValueContaining() of the org.jsoup.nodes.Element class to get the data.
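To sketch how a client could put these query URLs together (this is illustrative code, not part of loklak; the terms parameter is the optional field list the scraper accepts, as described in the Github scraper section):

```python
from urllib.parse import urlencode

SCRAPER_API = "https://loklak.org/api/githubprofilescraper.json"

def build_scraper_url(profile, terms=None):
    # profile: the GitHub user or organization name to scrape.
    # terms: optional list of requested fields, e.g. ["gists", "repos"];
    #        when omitted, the scraper returns the full profile.
    params = {"profile": profile}
    if terms:
        params["terms"] = ",".join(terms)
    return SCRAPER_API + "?" + urlencode(params)

print(build_scraper_url("kavithaenair"))
# https://loklak.org/api/githubprofilescraper.json?profile=kavithaenair
```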
For instance, to get the email from the extracted data the code is written as:

```java
String email = html.getElementsByAttributeValueContaining("itemprop", "email").text();
if (!email.contains("@"))
    email = "";
githubProfile.put("email", email);
```

Code:

Here is the Java code which imports and extracts the data.

Importing the HTML document:

```java
html = Jsoup.connect("https://github.com/" + profile).get();
```

Extracting the data for an individual user:

```java
/* If individual */
if (html.getElementsByAttributeValueContaining("class", "user-profile-nav").size() != 0) {
    scrapeGithubUser(githubProfile, terms, profile, html);
}
if (terms.contains("gists") || terms.contains("all")) {
    String gistsUrl = GITHUB_API_BASE + profile + GISTS_ENDPOINT;
    JSONArray gists = getDataFromApi(gistsUrl);
    githubProfile.put("gists", gists);
}
if (terms.contains("subscriptions") || terms.contains("all")) {
    String subscriptionsUrl = GITHUB_API_BASE + profile + SUBSCRIPTIONS_ENDPOINT;
    JSONArray subscriptions = getDataFromApi(subscriptionsUrl);
    githubProfile.put("subscriptions", subscriptions);
}
if (terms.contains("repos") || terms.contains("all")) {
    String reposUrl = GITHUB_API_BASE + profile + REPOS_ENDPOINT;
    JSONArray repos = getDataFromApi(reposUrl);
    githubProfile.put("repos", repos);
}
if (terms.contains("events") || terms.contains("all")) {
    String eventsUrl = GITHUB_API_BASE + profile + EVENTS_ENDPOINT;
    JSONArray events = getDataFromApi(eventsUrl);
    githubProfile.put("events", events);
}
if (terms.contains("received_events") || terms.contains("all")) {
    String receivedEventsUrl = GITHUB_API_BASE + profile + RECEIVED_EVENTS_ENDPOINT;
    JSONArray receivedEvents = getDataFromApi(receivedEventsUrl);
    githubProfile.put("received_events", receivedEvents);
}
```

Extracting the data for an organization:

```java
/* If organization */
if (html.getElementsByAttributeValue("class", "orgnav").size() != 0) {
    scrapeGithubOrg(profile, githubProfile, html);
}
```

And this is the sample output for the query:
https://loklak.org/api/githubprofilescraper.json?profile=kavithaenair

```json
{
  "data": [{
    "joining_date": "2016-04-12",
    "gists_url": "https://api.github.com/users/kavithaenair/gists",
    "repos_url": "https://api.github.com/users/kavithaenair/repos",
    "user_name": "kavithaenair",
    "bio": "GSoC'17 @loklak @fossasia ; Developer @fossasia ; Intern @amazon",
    "subscriptions_url": "https://api.github.com/users/kavithaenair/subscriptions",
    "received_events_url": "https://api.github.com/users/kavithaenair/received_events",
    "full_name": "Kavitha E Nair",
    "avatar_url": "https://avatars0.githubusercontent.com/u/18421291",
    "user_id": "18421291",
    "events_url": "https://api.github.com/users/kavithaenair/events",
    "organizations": [
      {
        "img_link": "https://avatars1.githubusercontent.com/u/6295529?v=3&s=70",
        "link": "https://github.com/fossasia",
        "label": "fossasia",
…
```

Continue Reading: Using LokLak to Scrape Profiles from Quora, GitHub, Weibo and Instagram

Automatic Imports of Events to Open Event from online event sites with Query Server and Event Collect

One goal for the next version of the Open Event project is to allow an automatic import of events from various event listing sites. We will implement this using the Open Event Import APIs and two additional modules: Query Server and Event Collect. The idea is to run the modules as micro-services or as stand-alone solutions.

Query Server

The query server is, as the name suggests, a query processor. As we are moving towards an API-centric approach for the server, query-server also has API endpoints (v1). Using this API you can get the data from the server in the specified format. The API itself is quite intuitive.

API to get data from query-server:

```
GET /api/v1/search/<search-engine>/query=query&format=format
```

Sample response header:

```
Cache-Control: no-cache
Connection: keep-alive
Content-Length: 1395
Content-Type: application/xml; charset=utf-8
Date: Wed, 24 May 2017 08:33:42 GMT
Server: Werkzeug/0.12.1 Python/2.7.13
Via: 1.1 vegur
```

The server is built in Flask. The GitHub repository of the server contains a simple Bootstrap front-end, which is used as a testing ground for results. The query string calls the search engine result scraper scraper.py, which is based on the scraper at searss. This scraper takes a search engine (presently Google, Bing, DuckDuckGo or Yahoo) as additional input and searches on that search engine. The output from the scraper, which can be in XML or in JSON depending on the API parameters, is returned, while the search query is stored in a MongoDB database with the query string indexed. This is done keeping in mind the capabilities to be added later in order to use the Kibana analysis tools. The front-end prettifies results with the help of PrismJS. The query-server will be used for the initial listing of events from different search engines. This will be accessed through the following API. The query server app can be accessed on Heroku.

➢ api/list: to provide an initial list of events (titles and links) to be displayed on Open Event search results.
When an event is searched for on Open Event, the query is passed on to query-server, where a search is made by calling scraper.py, appending some details for better event hunting. Recent developments at Google include their event search feature: in the Google search app, event searches take over when Google detects that a user is looking for an event. The feed from the scraper is parsed for events inside the query server to generate a list containing event titles and links. Each event in this list is then searched for in the database to check if it already exists. We will be using Elasticsearch to achieve fuzzy searching for events in the Open Event database, as Elasticsearch is planned for the API. One example of what we wish to achieve by implementing this type of search in the database follows. The user may search for:

- Google Cloud Event Delhi
- Google Event, Delhi
- Google Cloud, Delhi
- google cloud delhi
- Google Cloud Onboard Delhi
- Google Delhi Cloud event

All these searches should match "Google Cloud Onboard Event, Delhi" with good accuracy.…
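Elasticsearch will handle the fuzzy matching in production; purely to illustrate the idea, here is a small sketch using Python's difflib standard library (not the planned implementation) that scores the example query variants against the stored title:

```python
from difflib import SequenceMatcher

def similarity(query, title):
    # Case-insensitive similarity ratio in [0, 1]; higher means closer.
    return SequenceMatcher(None, query.lower(), title.lower()).ratio()

title = "Google Cloud Onboard Event, Delhi"
queries = [
    "Google Cloud Event Delhi",
    "Google Event, Delhi",
    "google cloud delhi",
    "Google Cloud Onboard Delhi",
]

# Each variant scores well against the canonical title, while an
# unrelated query scores much lower.
for query in queries:
    print(query, round(similarity(query, title), 2))
```

A real fuzzy search would also handle token reordering ("Google Delhi Cloud event"), which is where Elasticsearch's analyzers and fuzzy queries come in.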

Continue Reading: Automatic Imports of Events to Open Event from online event sites with Query Server and Event Collect

Writing Simple Unit-Tests with JUnit

In the Loklak Server project, we use a number of automation tools, like the build testing tool Travis CI, the automated code review tool Codacy, and Gemnasium. We are also using JUnit, a Java-based unit-testing framework, for writing automated unit tests for the project. It can be used to test methods and check their behaviour whenever there is any change in implementation. These unit tests are handy and are coded specifically for the project. In the Loklak Server project JUnit is used to test the web scrapers. Generally JUnit is used to check that there is no change in the behaviour of methods, but in this project it also helps to check whether the website code has been modified in a way that affects the data that is scraped.

Let's start with the basics: first setting up, then writing simple unit tests, and then test runners. Here we will refer to how unit tests have been implemented in Loklak Server to get familiar with the JUnit framework.

Setting Up

Setting up JUnit with gradle is easy; you have to do just two things:

1) Add the JUnit dependency in build.gradle:

```groovy
dependencies {
    . . . . . .<other compile groups>. . .
    compile group: 'com.twitter', name: 'jsr166e', version: '1.1.0'
    compile group: 'com.vividsolutions', name: 'jts', version: '1.13'
    compile group: 'junit', name: 'junit', version: '4.12'
    compile group: 'org.apache.logging.log4j', name: 'log4j-1.2-api', version: '2.6.2'
    compile group: 'org.apache.logging.log4j', name: 'log4j-api', version: '2.6.2'
    . . . . . .
}
```

2) Add a source for the 'test' task from where tests are built (like here). Save all tests in the test directory and keep its internal directory structure identical to the src directory structure. Then set the path in build.gradle so that the tests can be compiled:

```groovy
sourceSets.test.java.srcDirs = ['test']
```

Writing Unit-Tests

In the JUnit framework a unit test is a method that tests a particular behaviour of a section of code. Test methods are identified by the annotation @Test.
A unit test exercises methods of the source files to test their behaviour. This is done by fetching the output and comparing it with expected outputs. The following test checks whether the twitter url that is created, and which is to be scraped, is valid or not.

```java
/**
 * This unit-test tests twitter url creation
 */
@Test
public void testPrepareSearchURL() {
    String url;
    String[] query = {
        "fossasia",
        "from:loklak_test",
        "spacex since:2017-04-03 until:2017-04-05"
    };
    String[] filter = {"video", "image", "video,image", "abc,video"};
    String[] out_url = {
        "https://twitter.com/search?f=tweets&vertical=default&q=fossasia&src=typd",
        "https://twitter.com/search?f=tweets&vertical=default&q=from%3Aloklak_test&src=typd",
        "and other output url strings to be matched....."
    };

    // checking simple urls
    for (int i = 0; i < query.length; i++) {
        url = TwitterScraper.prepareSearchURL(query[i], "");
        // compare urls with urls created
        assertThat(out_url[i], is(url));
    }

    // checking urls having filters
    for (int i = 0; i < filter.length; i++) {
        url = TwitterScraper.prepareSearchURL(query[0], filter[i]);
        // compare urls with urls created
        assertThat(out_url[i + 3], is(url));
    }
}
```

Testing the implementation of code, rather than its behaviour, is useless, as it will either make the code more difficult to change or make the tests useless. So be cautious while writing tests and keep the difference between implementation and behaviour in mind. This is a good example of a simple unit test. As we see there are some points,…

Continue Reading: Writing Simple Unit-Tests with JUnit

Displaying error notifications in whatsTrending? app

The issue I am solving in the whatsTrending app is to display error notifications when the date fields and the count field are not validated and a user enters invalid data. Specifically, we want to display error notifications for junk values, dates with formats other than YYYY-MM-DD, and any other invalid data in the whatsTrending app's filter option. The whatsTrending app is a web app that shows the top trending hashtags of twitter messages in a given date range, using tweets collected by the loklak search engine. Users can also limit the number of top hashtags they want to see and use filters with start and end dates.

What is the problem?

The date fields and the count field are not validated, which means junk values and dates with formats other than YYYY-MM-DD do not show any error.

So how can the problem be solved?

The format (pattern) of the date can be verified by a regular expression. A regular expression describes a pattern in a given text. So the format checking problem can be described as finding the pattern YYYY-MM-DD in the input date, where Y, M and D are digits. The regex should specify that the pattern be present at the beginning of the text. More detailed information about regular expressions can be found here. The regex for this pattern is:

/^\d{4}-\d{2}-\d{2}$/

The pattern says there should be 4 digits, followed by '-', then two digits, then '-' again, and then two digits again.
This can be implemented the following way:

```javascript
$scope.isValidDate = function(dateString) {
    var regEx = /^\d{4}-\d{2}-\d{2}$/;
    if (dateString.match(regEx) === null) {
        return false;
    }
    dateComp = dateString.split('-');
    var i = 0;
    for (i = 0; i < dateComp.length; i++) {
        dateComp[i] = parseInt(dateComp[i]);
    }
    if (dateComp.length > 3) {
        return false;
    }
    if (dateComp[1] > 12 || dateComp[1] <= 0) {
        return false;
    }
    if (dateComp[2] > 31 || dateComp[2] <= 0) {
        return false;
    }
    if (((dateComp[1] === 4) || (dateComp[1] === 6) ||
         (dateComp[1] === 9) || (dateComp[1] === 11)) && (dateComp[2] > 30)) {
        return false;
    }
    if (dateComp[1] === 2) {
        if (((dateComp[0] % 4 === 0) && (dateComp[0] % 100 !== 0)) || (dateComp[0] % 400 === 0)) {
            if (dateComp[2] > 29) {
                return false;
            }
        } else {
            if (dateComp[2] > 28) {
                return false;
            }
        }
    }
    return true;
}
```

The first part of the code checks for the above mentioned pattern in the input. If it is not found, the function returns false. If it is found, we split the entire date into a list containing the year, month and day, and any remaining part is removed. Each component is converted to an integer. Then further validation is done on the month and day, as can be seen from the code above: the ranges of the month and day are checked, and leap year checking is done. In the same way the count field is also validated. The regex for this field is much simpler; we just need to check that the input consists only of numbers and…
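The same validation can be expressed more compactly by delegating the calendar logic (month ranges, day ranges, leap years) to a standard library. Here is a hedged Python sketch of that approach, not the app's actual code:

```python
import re
from datetime import datetime

DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def is_valid_date(date_string):
    # First the shape check, mirroring /^\d{4}-\d{2}-\d{2}$/ above;
    # strptime alone would also accept loosely formatted dates like 2016-2-9.
    if not DATE_PATTERN.match(date_string):
        return False
    # Then the calendar check: strptime rejects month 13, day 32,
    # and Feb 29 in non-leap years by raising ValueError.
    try:
        datetime.strptime(date_string, "%Y-%m-%d")
        return True
    except ValueError:
        return False

print(is_valid_date("2016-02-29"))  # True  (2016 is a leap year)
print(is_valid_date("2017-02-29"))  # False (2017 is not)
```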

Continue Reading: Displaying error notifications in whatsTrending? app

Generating a documentation site from markup documents with Sphinx and Pandoc

Generating a fully fledged website from a set of markup documents is no easy feat. But thanks to the wonderful tool Sphinx, the task certainly becomes easier. Sphinx does the heavy lifting of generating a website with built-in JavaScript-based search. But sometimes it's not enough.

This week we were faced with two issues related to documentation generation on loklak_server and susi_server. First let me give you some context. Sphinx requires an index.rst file within /docs/ which it uses to generate the first page of the site. A very obvious way to fill it, which helps us avoid unnecessary duplication, is to use the include directive of reStructuredText to include the README file from the root of the repository. This leads to the following two problems:

- The include directive can only properly include a reStructuredText file, not a markdown document. Given a markdown document, it tries to parse the markdown as reStructuredText, which leads to errors.
- Any relative links in the README break when it is included in another folder.

To fix the first issue, I used pypandoc, a thin wrapper around Pandoc. Pandoc is a wonderful command line tool which allows us to convert documents from one markup format to another. From the official Pandoc website itself:

If you need to convert files from one markup format into another, pandoc is your swiss-army knife.

pypandoc requires a working installation of Pandoc, which can be downloaded and installed automatically using a single line of code:

```python
pypandoc.download_pandoc()
```

This gives us a cross-platform way to download pandoc without worrying about the current platform. Now, pypandoc leaves the installer in the current working directory after download, which is fine locally, but creates a problem when run on remote systems like Travis: the installer could get committed accidentally to the repository.
To solve this, I had to take a look at the source code for pypandoc and call an internal method which pypandoc uses to set the name of the installer. I use that method to find out the name of the file and then delete it after the installation is over. This is one of the many benefits of open-source projects: had pypandoc not been open source, I would not have been able to do that.

```python
url = pypandoc.pandoc_download._get_pandoc_urls()[0][pf]
filename = url.split('/')[-1]
os.remove(filename)
```

Here pf is the current platform, which can be one of 'win32', 'linux', or 'darwin'.

Now let's take a look at our second issue. To solve it, I used regular expressions to capture any relative links. Capturing links was easy; all links in reStructuredText are in the following format:

```
`Title <url>`__
```

Similarly, links in markdown are in the following format:

```
[Title](url)
```

Regular expressions were the perfect candidate for this. To detect which links were relative and needed to be fixed, I checked which links start with the /docs/ directory, and then all I had to do was remove the /docs prefix from those links.

A note about the loklak and susi server projects: Loklak is a server application which is able…
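As a sketch of that link-fixing step (my own regex and helper names, not the project's exact code), the markdown flavour could look like this; the reStructuredText pattern would be handled analogously:

```python
import re

# Markdown links look like [Title](url); capture the title and the url.
MD_LINK = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")

def fix_relative_links(text, prefix="docs/"):
    # Strip the leading docs/ prefix from relative links only;
    # absolute http(s) links are left untouched.
    def repl(match):
        title, url = match.group(1), match.group(2)
        if not url.startswith(("http://", "https://")) and url.startswith(prefix):
            url = url[len(prefix):]
        return "[{}]({})".format(title, url)
    return MD_LINK.sub(repl, text)

print(fix_relative_links("See the [install guide](docs/installation.md)."))
# See the [install guide](installation.md).
```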

Continue Reading: Generating a documentation site from markup documents with Sphinx and Pandoc