Exporting Speakers and Sessions Lists as CSV in the Open Event Server

In an event management system, organizers need to get speaker data out of the system and be able to use and process it elsewhere. Organizers might need a list of emails, phone numbers or other information. An “export” of speakers and sessions in a CSV file that can be opened in any spreadsheet application is a good way to obtain this data. Therefore we implemented an export-as-CSV function in the Open Event Server.

 

Now the Open Event Orga Server allows event organizers to export a list of speakers and details about their sessions, such as the session status, title, speakers and track.

The speakers CSV includes details about each speaker, such as name, sessions, email, status, mobile number, organisation and position.

 

How we implemented it:

Clicking the Export As CSV button on either page calls download_speakers_as_csv or download_sessions_as_csv from the Speakers or Sessions tab respectively.

The Speaker’s Tab

Session’s Tab

Both functions are defined in the sessions.py file.

First we fetch the sessions for the given event_id, and the event details (used for the filename):

sessions = DataGetter.get_sessions_by_event_id(event_id)
# This is for getting the event name to use as the filename
event = DataGetter.get_event(event_id)

 

Then we go on to create the CSV file.

We iterate through all the sessions and find the speakers associated with each. We save each row in a new list named data and, after each iteration, add it to a main list.

main = [["Session Title", "Session Speakers", \
         "Session Track", "Session Abstract", "Email Sent"]]
for session in sessions:
    if not session.deleted_at:
        data = [session.title + " (" + session.state + ")" if session.title else '']
        inSession = ''
        if session.speakers:
            for speaker in session.speakers:
                if speaker.name:
                    inSession += (speaker.name + ', ')
        data.append(inSession[:-2])  # always append the speakers column, even if empty
        data.append(session.track.name if session.track else '')
        data.append(strip_tags(session.short_abstract) if session.short_abstract else '')
        data.append('Yes' if session.state_email_sent else 'No')
        main.append(data)

In the last part of the code, Python’s csv module is used to create a CSV file from the nested lists. The csv module takes care of properly escaping characters such as commas and of appending a newline character at the end of each row.

Snippet of Code from sessions.py file

 

In the last few lines of this code section, you can see the response headers being added, which are necessary for the file to be downloaded on the user’s end.

The make_response function is imported from the flask package.

make_response: converts the return value from a view function into a real response object (as documented here).
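To illustrate how these pieces fit together, here is a minimal sketch (not the exact code from sessions.py; the helper name and filename handling are assumptions) of building such a CSV response with Python’s csv module and Flask’s make_response:

import csv
from io import StringIO

from flask import make_response


def csv_response(rows, filename):
    # Write the nested lists into an in-memory buffer;
    # csv.writer handles quoting/escaping and row newlines.
    buffer = StringIO()
    writer = csv.writer(buffer)
    writer.writerows(rows)

    # Wrap the CSV text in a response and set the headers that
    # tell the browser to download it as a file.
    response = make_response(buffer.getvalue())
    response.headers['Content-Disposition'] = 'attachment; filename=' + filename
    response.headers['Content-Type'] = 'text/csv'
    return response


# Hypothetical usage inside the view function:
# return csv_response(main, event.name + '-Sessions.csv')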

 

The exported filename format is:

‘%event-name%-Speakers.csv’ or ‘%event-name%-Sessions.csv’

Thus we get the list of speakers and session details as CSV files.

Speaker’s CSV

Session’s CSV

 

How to teach SUSI skills calling an External API

SUSI is an intelligent personal assistant. SUSI can learn skills to understand and respond to user queries better. A skill is taught using rules. Writing rules is an easy task and one doesn’t need any programming background either. Anyone can start contributing. Check out these tutorials and do watch this video to get started and start teaching SUSI.

SUSI can be taught to call external APIs to answer user queries.

While writing skills we first mention string patterns to match the user’s query and then tell SUSI what to do with the matched pattern. The pattern matching is similar to regular expressions and we can also retrieve the matched parameters using $<parameter number>$ notation.

Example :

 My name is *
 Hi $1$!

When the user inputs “My name is Uday”, it is matched with “My name is *” and “Uday” is stored in $1$. So the output given is “Hi Uday!”.
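As a rough analogy (this is not SUSI’s actual implementation), the wildcard behaves like a regular-expression capture group, and $1$, $2$, … refer to the captured pieces. In Python this could be sketched as:

import re

# "*" becomes a capturing group; $1$ corresponds to group(1).
pattern = re.compile('^' + 'My name is *'.replace('*', '(.*)') + '$')
match = pattern.match('My name is Uday')
print('Hi ' + match.group(1) + '!')  # -> Hi Uday!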

SUSI can call an external API to reply to a user query. An API endpoint or URL, when called, must return a JSON or JSONP response for SUSI to be able to parse it and retrieve the answer.

Rule Format for a skill calling an external API

The rule format for calling an external API is :

<regular expression for pattern matching>
!console: <return answer using $object$ or $required_key$>
{
  "url": "<API endpoint or url>",
  "path": "$.<key in the API response to find the answer>"
}
eol
  • Url is the API endpoint to be called, which must return a JSON or JSONP response.
    Any parameters to the url can be added using the $<parameter number>$ notation.
  • Path is used to help SUSI know where to look for the answer in the returned response.
    If the path points to a root element, then the answer is stored in $object$, otherwise we can query $key$ to get the answer which is a value to the key under the path.
  • eol or end of line indicates the end of the rule.

Understanding the Path Attribute

Let us understand the Path attribute better through some test cases.

In each of the test cases we discuss what the path should be and how to retrieve the answer for a given required answer from the json response of an API.

  1. API response in json :

    {
       "Key1" : "Value1"
    }

Required answer : Value1
Path : $.Key1   =>   Retrieve Answer: $object$

 

  2. API response in json :

    {
      "Key1" : [{"Key11" : "Value11"}]
    }

Required answer : Value11
Path : $.Key1[0]   =>  Retrieve Answer: $Key11$
Path : $.Key1[0].Key11   => Retrieve Answer: $object$

 

  3. API response in json :

    {
      "Key1" : {"Key11" : "Value11"}
    }


Required answer : Value11
Path : $.Key1  => Retrieve Answer:  $Key11$
Path : $.Key1.Key11  => Retrieve Answer: $object$

 

  4. API response in json :

    {
      "Key1" : {
                 "Key11" : "Value11",
                 "Key12" : "Value12"
               }
    }

Required answer : Value11 , Value12
Path : $.Key1  => Retrieve Answer:  $Key11$ , $Key12$
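To make the path semantics concrete, here is a rough Python sketch (not SUSI’s actual code; the helper name is made up for illustration) of how a path such as $.Key1[0].Key11 can be resolved against a parsed JSON response:

import json
import re


def resolve_path(response_text, path):
    """Walk a parsed JSON response following a $.a.b[0].c style path."""
    node = json.loads(response_text)
    # Split "$.Key1[0].Key11" into segments like "Key1[0]" and "Key11".
    for segment in path.lstrip('$.').split('.'):
        match = re.match(r'(\w+)(?:\[(\d+)\])?$', segment)
        key, index = match.group(1), match.group(2)
        node = node[key]
        if index is not None:
            node = node[int(index)]
    return node  # a leaf value plays the role of $object$


# Test case 2 above:
api_response = '{"Key1": [{"Key11": "Value11"}]}'
print(resolve_path(api_response, '$.Key1[0].Key11'))  # Value11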

Where to write these rules?

Now, since we know how to write rules let’s see where to write them.

We use etherpads to write and test rules and once we finish testing our rule we can push those rules to the repo.

Steps to open, write and test rules:

  1. Open a new etherpad with a desired name <etherpad name> at http://dream.susi.ai/
  2. Write your skills code in the etherpad following the code format explained above.
  3. Now, to test your skill, chat with SUSI. Start a conversation with SUSI at http://susi.ai/chat to test your skills.
  4. Load your skills by typing dream <etherpad name> and wait for a response saying dreaming enabled for <etherpad name>.
  5. Test your skill, and repeat step 4 every time you make changes to the code in your etherpad.
  6. After you are done testing, type stop dreaming and, if you are satisfied with your skill, do send a PR to help SUSI learn.

Examples

Let us try an example to understand this better.

1. Plot of a TV Show

Tvmaze is an open TV API that provides information about TV shows. Let us write a rule to find the plot of a TV show. We can find many such APIs; check out this link listing a few of them.

  1.  Open an etherpad at http://dream.susi.ai/ named tvshowplot. 

  2.  Enter the code to query the plot of a TV show in the etherpad at http://dream.susi.ai/p/tvshowplot

  * plot of *|* summary of *
  !console:$object$
  {
      "url":"http://api.tvmaze.com/singlesearch/shows?q=$2$",
      "path":"$.summary"
  }
  eol
  3. Now let’s test our skill by starting a conversation with SUSI at http://susi.ai/chat.
  • User Query: dream tvshowplot
    Response:  dreaming enabled  for tvshowplot
  • User Query: what is the plot of legion
    Response: Legion introduces the story of David Haller: Since he was a teenager, David has struggled with mental illness. Diagnosed as schizophrenic, David has been in and out of psychiatric hospitals for years. But after a strange encounter with a fellow patient, he’s confronted with the possibility that the voices he hears and the visions he sees might be real. He’s based on the Marvel comics character Legion, the son of X-Men founder Charles Xavier (played by Patrick Stewart and James McAvoy in the films), first introduced in 1985.

Intermediate Processing:

Pattern Matching : $1$ = “what is the” ; $2$ = “legion”

Url : http://api.tvmaze.com/singlesearch/shows?q=legion

API response:

{
   "id": 6393,
   "url": "http:\/\/www.tvmaze.com\/shows\/6393\/legion",
   "name": "Legion",
   "type": "Scripted",
   "language": "English",
   "genres": [
     "Drama",
     "Action",
     "Science-Fiction"
   ], 
   "summary": "<p><strong>Legion<\/strong> introduces the story of David Haller: Since he was a teenager, David has struggled with mental illness. Diagnosed as schizophrenic, David has been in and out of psychiatric hospitals for years. But after a strange encounter with a fellow patient, he's confronted with the possibility that the voices he hears and the visions he sees might be real. He's based on the Marvel comics character Legion, the son of X-Men founder Charles Xavier (played by Patrick Stewart and James McAvoy in the films), first introduced in 1985.<\/p>",
   "updated": 1491955072,  
 }

Note: The API response has been trimmed to show only the relevant content.

Path : $.summary

Retrieving Answer: our required answer in the API response is under the key summary and is retrieved using $object$ since it is a root element.

 

Screenshots:

2. Cooking Recipes

Let us try it out with another API.
Recipepuppy is a cooking recipe API where users can query various recipes.

  1.  Open an etherpad at http://dream.susi.ai/ named recipe.

  2.  Enter the code to query a recipe in the etherpad at http://dream.susi.ai/p/recipe
#Gives recipes and links to cook a dish
* cook *
!console:<p>To cook <strong>$title$</strong> : <br>The ingredients required are: $ingredients$. <br> For instructions to prepare the dish $href$ </p>
{
  "url":"http://www.recipepuppy.com/api/?q=$2$",
  "path":"$.results"
}
eol
  3. Now let’s test our skill by starting a conversation with SUSI at http://susi.ai/chat.
  • User Query: dream recipe
    Response:  dreaming enabled  for recipe
  • User Query: how to cook chicken biryani
    Response: To cook Chicken Biryani Recipe :
    The ingredients required are: chicken, seeds, chicken broth, rice, butter, peas, garlic, red onions, cardamom, curry paste, olive oil, tomato, coriander, cumin, brown sugar, tumeric.
    For instructions to prepare the dish Click Here!

Intermediate Processing:

Pattern Matching : $1$ = “how to” ; $2$ = “chicken biryani”

Url : http://www.recipepuppy.com/api/?q=chicken biryani

API response:

{
   "title": "Recipe Puppy",
   "version": 0.1,
   "href": "http:\/\/www.recipepuppy.com\/",
   "results": [
     {
       "title": "Chicken Biryani Recipe",
       "href": "http:\/\/www.grouprecipes.com\/53040\/chicken-biryani.html",
       "ingredients": "chicken, seeds, chicken broth, rice, butter, peas, garlic, red onions, cardamom, curry paste, olive oil, tomato, coriander, cumin, brown sugar, tumeric",
       "thumbnail": "http:\/\/img.recipepuppy.com\/413822.jpg"
     },
 ]
 }

Note: The API response has been trimmed to show only the relevant content.

Path : $.results[0]

Retrieving Answer: our required answer in the API response is under the key results. Since it’s an array, we use the first element, and since that element is a dictionary too, we use its keys to build the answer. The $href$ is rendered as “Click Here”, hyperlinked to the actual URL.

 

Screenshots:

 

We have successfully taught SUSI a skill which tells users about the plot of a TV show and a skill to query recipes.
Cheers!
Following a similar procedure, we can make use of other APIs and teach SUSI several new skills.

 

Deploying Susi Server on Google Cloud with Kubernetes

Susi (acronym for Scientific User Support Intelligence) is an advanced AI made by people at FOSSASIA. It is an AI made by the people and for the people.
Susi is an Open Source Project under LGPL Licence.

SUSI.AI already has many Skills and anyone can add new skills through simple console rules.

If you want to participate in the development of the SUSI server you can start by learning to deploy it on a cloud system like Google Cloud.

This way whenever you make a change to Susi Server, you can test it out on various Susi Apps instantly.

Google Cloud with Kubernetes provides this ability. Let’s dig deeper into what Google Cloud Platform and Kubernetes are.

What is Google Cloud Platform ?

Google Cloud Platform lets you build and host applications and websites, store data, and analyze data on Google’s scalable infrastructure.
Google Cloud Platform (at the time of writing this article) also provides free credits worth $300 for 1 year for testing out the platform and your applications.

What is Kubernetes ?

Kubernetes is an open-source system for automatic deployment, management and scaling of containerized applications. It makes it easy to roll out updates to your application with simple commands from your development machine and to scale horizontally by adding more nodes as demand increases.

Deploying Susi Server on Kubernetes

Deploying Susi Server on Kubernetes is a fairly easy task. Follow these steps to get it running.

Create a Google Cloud Account

Sign up for a Google Cloud Account (https://cloud.google.com/free-trial/) and get $300 in credits for initial use.

Create a New Project

After successful sign up, create a new project on Google Cloud Console.
Let’s name it Susi-Kubernetes.

You will be provided a ProjectID. Remember it for further reference.

Install Google Cloud SDK and kubectl

Go to https://cloud.google.com/sdk/ and see instructions to setup Google Cloud SDK on your respective OS.

After installing the Google Cloud SDK, run

gcloud components install kubectl

This will install kubectl for interacting with Kubernetes.

Login and setup project

  1. Login to your Google Cloud Account using
$ gcloud auth login

2. List all the projects using

$ gcloud config list project
[core]
project = <PROJECT_ID>

3. Select your project

$ gcloud config set project <PROJECT_ID>

4. Install JDK8 for susi_server setup and set it as default.

5. Clone your fork of the Susi Server Repository

$ git clone https://github.com/<your_username>/susi_server.git
$ cd susi_server/

6. Build project and run Susi Server locally

$ ./gradlew build
$ bin/start.sh

The Susi server should now have started and the web interface is accessible at http://localhost:4000.

Install Docker and build Docker image for Susi

  1. Install Docker.
    Debian and derivatives:  sudo apt install docker
    Arch Linux:   sudo pacman -S docker 
  2. Build Docker Image for Susi
    $ docker build -t gcr.io/<Project_id>/susi:v1 .
  3. Push Image to Google Container Registry private to your project.
$ gcloud docker -- push gcr.io/<Project_id>/susi:v1

Create Cluster and Deploy your Susi Server there

  1. Create Cluster. You may specify different zone, number of nodes and machine type depending upon requirement.
    $ gcloud container clusters create <Cluster-Name> --num-nodes 2 --machine-type n1-standard-1 --zone us-central1-c
  2. Run your deployment. You may specify any name for deployment.
    $ kubectl run <deployment_name> --image=gcr.io/<Project_id>/susi:v1 --port=80
    $ kubectl get deployments
    $ kubectl expose deployment susi --type=LoadBalancer
  3. Check your deployment and get Public IP for Access.
    $ kubectl get services
    NAME         CLUSTER-IP     EXTERNAL-IP     PORT(S)       AGE
    kubernetes   10.3.240.1     <none>          443/TCP        1d
    susi         10.3.241.145   <PUBLIC_IP>     80:31155/TCP   1d
  4. Go to provided public IP to check, if Susi Server is running.

Congratulations, you successfully setup Susi Server on Google Cloud with Kubernetes.

Updating the deployment

The next step is to update the deployment when you wish to roll out changes. To do so:

Build Docker Image and Push it to Google Container Registry

$ docker build -t gcr.io/<Project_Id>/susi:v2 .
$ gcloud docker -- push gcr.io/<Project_Id>/susi:v2

Update Deployment Image with Kubernetes

$ kubectl set image deployment/<Deployment_Name> \
  <Deployment_Name>=gcr.io/<Project_id>/susi:v2
deployment "<Deployment_Name>" image updated

Go to the public IP to see the changes.

That’s it. Now you have a fully running Susi Server on your own Google Cloud cluster using Kubernetes.

Susi AI Skill Development

What is Susi?

Susi is an open source intelligent personal assistant which has the capability to learn and respond better to queries. It is also capable of making to-do lists, setting alarms, providing weather and traffic info all in real time. Susi responds based on skills.

What is a skill? How do we teach a skill?

A skill is a piece of code which performs a set of actions in order to respond to the user’s query. These skills are based on pattern matching, which helps map the user’s query to a specific skill and respond accordingly. Teaching a skill to Susi is surprisingly easy. One can take a look at the Susi Skill Development Tutorial and a video workshop by Michael Christen.

I will try to give a basic idea of how to create a skill, its basic structure and some of the skills I developed in the first week.

Prepare to create a skill:

  • Head over to http://dream.susi.ai
  • Create an etherpad with some relevant name
  • Delete all text currently present in there
  • Start writing your skill

Adding to this, for testing a skill one can head over to Susi Web Chat Interface.

Basic Structure for calling an API:

<Regular expression to be matched here>

!console:<response given to the user>
 {
 "url":"<API endpoint>",
 "path":"<Json path here>"
 }
 eol

So, let me explain this line by line.

  1. The regular expression is the one to which the user’s query is matched first.
  2. The console is meant to output the actual response the user sees.
  3. In place of the “url”, the API endpoint is passed in.
  4. “path” here specifies how we traverse through the response Json or Jsonp to get the object, starts with “$.”.
  5. At last, “eol” which is the end-of-line marks the end of a skill.

Let’s take an example for better understanding of this:

random gif
!console: $url$
{
    "url" : "http://api.giphy.com/v1/gifs/trending?api_key=dc6zaTOxFJmzC",
    "path" : "$.data[0].images.fixed_height"
}
eol 

 

This skill responds with a link to a random gif.

Steps involved:

  1. Match the string “random gif” with the user’s query.
  2. On successful match, make an API call to the API endpoint specified in “url”.
  3. On response, extract the object at the path specified in “path” from the JSON.
  4. Respond to the user with the “url” key’s value, which here is the URL of a GIF (see the sketch below).
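Roughly, these steps correspond to something like the following Python sketch (this only illustrates the lookup; it is not SUSI’s internal code):

import json
import urllib.request

# Call the endpoint, follow the "path", then read the "url" key.
endpoint = 'http://api.giphy.com/v1/gifs/trending?api_key=dc6zaTOxFJmzC'
with urllib.request.urlopen(endpoint) as response:
    data = json.load(response)

fixed_height = data['data'][0]['images']['fixed_height']  # $.data[0].images.fixed_height
print(fixed_height['url'])  # the $url$ value sent back to the user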

Let’s try it out on Susi Web Chat. For this, you will first have to load your skill using the dream command followed by etherpad name: dream <etherpad name>. And then you can start testing your skill.

So, we queried “random gif” and we got a response “Click Here!”. The complete URL didn’t show up because all the URLs are currently parsed and a hyperlink for each is created. So try clicking on it to find a GIF.

 

Now, let’s look at one more skill I developed during this period.

# Returns the name of the president of a country

 president of *|who is the president of *| president *
 !console:$plaintext$
 {      "url":"https://api.wolframalpha.com/v2/query?input=president+$1$&output=JSON&appid=9WA6XR-26EWTGEVTE&includepodid=Result",
   "path" : "$.queryresult.pods[0].subpods[0]"
 }
 eol

 

Let’s understand this step by step:

  1. We have here “president of *|who is the president of *| president *”, which means the user’s query can match any one of these patterns because of the pipe symbol “|”. The “*” here stands for a word or a list of words, which can be accessed as “${index}$”, where index is the position of the “*” in the expression, starting from 1.
  2. Now we have something new in the URL. See that $1$ inside the URL? At runtime, it is replaced with the content matched by “*”. So if a user puts in a query like “president of usa”, “usa” is mapped to $1$, substituted into the URL, and the appropriate API request is made.
  3. Then the path is traversed in the JSON response and the value of the “plaintext” key is used to respond to the user.

 

It’s now time to try it out on Susi Web Chat.

So, we got our desired response here, i.e., the name of the president of the USA.

Displaying error notifications in whatsTrending? app

The issue I am solving in the whatsTrending app is to display error notifications when a user enters invalid data in the date fields or the count field, which are currently not validated. Specifically, we want to display error notifications for junk values, dates in formats other than YYYY-MM-DD and any other invalid data in the whatsTrending app’s filter options.

The whatsTrending app is a web app that shows the top trending hashtags of twitter messages in a given date range using tweets collected by the loklak search engine. Users can also limit the number of top hash tags they want to see and use filters with start and end dates.

App to know trending hashtags on twitter

What is the problem? The date fields and the count field are not validated, which means junk values and dates with formats other than YYYY-MM-DD do not show any error.

So how can the problem be solved? Well, the format (pattern) of the date can be verified with a regular expression. A regular expression describes a pattern in a given text, so the format-checking problem can be described as finding the pattern YYYY-MM-DD in the input date, where Y, M and D are digits. The regex should anchor the pattern to the whole input, from beginning to end.

More detailed information about regex can be found here.

The regex for this pattern is :

/^\d{4}-\d{2}-\d{2}$/

The pattern says there should be four digits, followed by ‘-’, then two digits, another ‘-’, and then two more digits.

This can be implemented in the following way:

$scope.isValidDate = function(dateString) {
        var regEx = /^\d{4}-\d{2}-\d{2}$/;
        if (dateString.match(regEx) === null) {
            return false;
        }

        var dateComp = dateString.split('-');
        var i = 0;
        for (i = 0; i < dateComp.length; i++) {
            dateComp[i] = parseInt(dateComp[i], 10);
        }
        if (dateComp.length > 3) {
            return false;
        }

        if (dateComp[1] > 12 || dateComp[1] <= 0) {
            return false;
        }
        if (dateComp[2] > 31 || dateComp[2] <= 0) {
            return false;
        }
        if (((dateComp[1] === 4) || (dateComp[1] === 6) || (dateComp[1] === 9) || (dateComp[1] === 11)) && (dateComp[2] > 30)) {
            return false;
        }

        if (dateComp[1] === 2) {
            if (((dateComp[0] % 4 === 0) && (dateComp[0] % 100 !== 0)) || (dateComp[0] % 400 === 0)) {
                if (dateComp[2] > 29) {
                    return false;
                }
            } else {
                if (dateComp[2] > 28) {
                    return false;
                }
            }
        }

        return true;
    }

So the first part of the code checks for the above-mentioned pattern in the input. If it is not found, the function returns false. If it is found, we split the date into a list containing year, month and day (rejecting the input if there are more than three parts) and convert each component to an integer. Then further validation is done on the month and day, as can be seen in the code above: the ranges of the month and day are checked, and leap-year checking is done for February.

In the same way the count field is also validated. The regex for this field is much simpler. We just need to check that the input consists only of numbers and nothing else.
So the regex for this is :

 /^[0-9]+$/

This means a sequence of digits in the range 0-9. We search for this pattern in the text; if it is found we return true, else false. The function for this is as follows:

$scope.isNumber = function(numString) {
        var regEx = /^[0-9]+$/;
        return String(numString).match(regEx) != null;
    }

Next we need to call these functions and see if there is any error. If there is an error, we need to display it. This can be done using a modal. Bootstrap has an inbuilt modal which can be invoked using JavaScript.

Showing error using modal

So at first we need to define the modal and its content (empty if necessary, as in this case) using HTML. The HTML code for this can be found here.

A small yet nice tutorial on Bootstrap modal can be found here
Next we need to set the content of the modal and invoke it from our JS file on encountering an error.

$scope.displayErrorModal = function(val, type) {
        if (type === 0) {
            if (!$scope.isValidDate(val)) {
                $scope.loading = false;
                $('.modal-body').html('Please enter valid date in YYYY-MM-DD format');
                $('#myModal').modal('show');
                return false;
            }
        } else {
            if (!$scope.isNumber(val)) {
                $scope.loading = false;
                $('.modal-body').html('Please enter a valid number');
                $('#myModal').modal('show');
                return false;
            }
        }
        return true;
}

The above function accepts a parameter val and another parameter type. The parameter type tells which validation needs to be performed, date validation or number validation; the function calls one of the two previous methods accordingly and passes val, which is the value to be validated. If the validation fails, it sets the content of the modal using $('.modal-body').html("your content") and then invokes it using $('#modalID').modal('show'). This displays a nice modal on the page and the user is notified about the error.

So this is it for this post. Thanks for reading it. My next post will be on fixing the design of the boilerplate app.

Using Cloud storage for event exports

The Open Event Orga Server provides organizers the ability to create a complete export of the event they created. Currently, when an organizer triggers the export in the orga server, a Celery job is set up to complete the export task, so the job completes asynchronously. The organizer gets the download button enabled once the export is ready.

Until now the main issue was the storage of those export zip files. All exported zip files were stored directly in local storage, and not even by using the storage module created in the orga server.

local storage path

To solve this issue, I followed three simple steps.

These three steps were:

  1. Wait for shutil.make_archive to complete the archive and store it in local storage.
  2. Copy the created archive to the storage specified by the user.
  3. Delete the locally created archive (see the sketch below).
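A rough sketch of these three steps (the function name create_export_archive is mine for illustration; upload and UploadedFile are the storage helpers discussed below) could look like this:

import os
import shutil


def create_export_archive(dir_path, storage_path):
    # 1. Build the zip archive in local storage and wait for it to finish.
    zip_path = shutil.make_archive(dir_path, 'zip', dir_path)

    # 2. Copy the archive to the storage specified by the user (s3, gs or local).
    #    upload() and UploadedFile come from the orga server's storage helper.
    uploaded_file = UploadedFile(zip_path, os.path.basename(zip_path))
    storage_url = upload(uploaded_file, storage_path)

    # 3. Delete the locally created archive.
    os.remove(zip_path)
    return storage_url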

The easiest part here was uploading these files to the different storage backends (s3, gs, local), as we already have a storage helper:

def upload(uploaded_file, key, **kwargs):
    """
    Upload handler
    """

The most important logic of this issue resides in this code snippet:

    dir_path = dir_path + ".zip"

    storage_path = UPLOAD_PATHS['exports']['zip'].format(
        event_id=event_id
    )
    uploaded_file = UploadedFile(dir_path, dir_path.rsplit('/', 1)[1])
    storage_url = upload(uploaded_file, storage_path)

    if get_settings()['storage_place'] != "s3" and get_settings()['storage_place'] != 'gs':
        storage_url = app.config['BASE_DIR'] + storage_url.replace("/serve_", "/")
    return storage_url

From the above snippet, it is clear that we are extending the process of creating the zip. Once the zip is created, we build the storage path for cloud storage and upload the file there. The only part that takes a moment to understand is the condition in the last lines of the snippet:

if get_settings()['storage_place'] != "s3" and get_settings()['storage_place'] != 'gs':
    storage_url = app.config['BASE_DIR'] + storage_url.replace("/serve_", "/")

Initially, the plan was simply to serve the files through “serve_static”, but the test cases expected a file at this location, so I had to remove the “serve_” part for local storage; with that change, the three steps work fine.

The next thing to discuss about this storage process is a feature to delete old exports. One reason to keep them, though, is that an old backup of your event will always be available in our cloud storage.

Generating a documentation site from markup documents with Sphinx and Pandoc

Generating a fully fledged website from a set of markup documents is no easy feat. But the wonderful tool Sphinx certainly makes the task easier. Sphinx does the heavy lifting of generating a website with built-in JavaScript-based search. But sometimes it’s not enough.

This week we were faced with two issues related to documentation generation on loklak_server and susi_server. First let me give you some context. Sphinx requires an index.rst file within /docs/ which it uses to generate the first page of the site. A very obvious way to fill it, which helps us avoid unnecessary duplication, is to use the include directive of reStructuredText to include the README file from the root of the repository.

This leads to the following two problems:

  • The include directive can only properly include a reStructuredText document, not a Markdown one. Given a Markdown document, it tries to parse the Markdown as reStructuredText, which leads to errors.
  • Any relative links in README break when it is included in another folder.

To fix the first issue, I used pypandoc, a thin wrapper around Pandoc. Pandoc is a wonderful command line tool which allows us to convert documents from one markup format to another. From the official Pandoc website itself,

If you need to convert files from one markup format into another, pandoc is your swiss-army knife.

pypandoc requires a working installation of Pandoc, which can be downloaded and installed automatically using a single line of code.

pypandoc.download_pandoc()

This gives us a cross-platform way to download pandoc without worrying about the current platform. Now, pypandoc leaves the installer in the current working directory after download, which is fine locally, but creates a problem when run on remote systems like Travis. The installer could get committed accidentally to the repository. To solve this, I had to take a look at the source code for pypandoc and call an internal method, which pypandoc basically uses to set the name of the installer. I use that method to find out the name of the file and then delete it after installation is over. This is one of many benefits of open-source projects. Had pypandoc not been open source, I would not have been able to do that.

url = pypandoc.pandoc_download._get_pandoc_urls()[0][pf]
filename = url.split('/')[-1]
os.remove(filename)

Here pf is the current platform which can be one of ‘win32’, ‘linux’, or ‘darwin’.
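With Pandoc available, the Markdown README can then be converted to reStructuredText so that the include directive works. A minimal sketch (the output path is an assumption, not necessarily the one used in the project):

import pypandoc

# Convert the Markdown README into reStructuredText so that
# Sphinx's include directive can pull it into index.rst.
pypandoc.convert_file('README.md', 'rst', outputfile='docs/README.rst')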

Now let’s take a look at our second issue. To solve it, I used regular expressions to capture any relative links. Capturing links was easy. All links in reStructuredText are in the following format:

`Title <url>`__

Similarly, links in Markdown are in the following format:

[Title](url)

Regular expressions were the perfect candidate to solve this. To detect which links were relative and needed to be fixed, I checked which links start with the /docs/ directory, and then all I had to do was remove the /docs prefix from those links, as sketched below.
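Here is a minimal sketch of that idea for reStructuredText links; the regular expression and function name are my own illustration, not necessarily the exact code used in the project:

import re

# Matches reStructuredText links of the form `Title <url>`__
RST_LINK = re.compile(r'`([^<`]+) <([^>]+)>`__')


def fix_relative_links(text):
    """Strip the /docs prefix from links that point inside the docs folder."""
    def repl(match):
        title, url = match.group(1), match.group(2)
        if url.startswith('/docs/'):
            url = url[len('/docs'):]  # '/docs/installation.md' -> '/installation.md'
        return '`{} <{}>`__'.format(title, url)

    return RST_LINK.sub(repl, text)


print(fix_relative_links('See `Installation </docs/installation.md>`__ for details.'))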

A note about the loklak and SUSI server projects

Loklak is a server application which is able to collect messages from various sources, including twitter.

SUSI AI is an intelligent Open Source personal assistant. It is capable of chat and voice interaction and can use APIs to perform actions such as music playback, making to-do lists, setting alarms, streaming podcasts, playing audiobooks, and providing weather, traffic, and other real-time information.

Using NodeBuilder to instantiate node based Elasticsearch client and Visualising data

As elastic.co mentions, Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. But in many setups, it is not possible to manually install an Elasticsearch node on a machine. To handle these types of scenarios, Elasticsearch provides the NodeBuilder module, which can be used to spawn an Elasticsearch node programmatically. Let’s see how.

Getting Dependencies

In order to get the ES Java API, we need to add the following line to dependencies.

compile group: 'org.elasticsearch', name: 'securesm', version: '1.0'

The required packages will be fetched the next time we run gradle build.

Configuring Settings

In the Elasticsearch Java API, Settings are used to configure the node(s). To create a node, we first need to define its properties.

Settings.Builder settings = new Settings.Builder();

settings.put("cluster.name", "cluster_name");  // The name of the cluster

// Configuring HTTP details
settings.put("http.enabled", "true");
settings.put("http.cors.enabled", "true");
settings.put("http.cors.allow-origin", "https?:\/\/localhost(:[0-9]+)?/");  // Allow requests from localhost
settings.put("http.port", "9200");

// Configuring TCP and host
settings.put("transport.tcp.port", "9300");
settings.put("network.host", "localhost");

// Configuring node details
settings.put("node.data", "true");
settings.put("node.master", "true");

// Configuring index
settings.put("index.number_of_shards", "8");
settings.put("index.number_of_replicas", "2");
settings.put("index.refresh_interval", "10s");
settings.put("index.max_result_window", "10000");

// Defining paths
settings.put("path.conf", "/path/to/conf/");
settings.put("path.data", "/path/to/data/");
settings.put("path.home", "/path/to/data/");

settings.build();  // Build with the assigned configurations

There are many more settings that can be tuned in order to get desired node configuration.

Building the Node and Getting Clients

The Java API makes it very simple to launch an Elasticsearch node. This example will make use of settings that we just built.

Node elasticsearchNode = NodeBuilder.nodeBuilder().local(false).settings(settings).node();

A piece of cake. Isn’t it? Let’s get a client now, on which we can execute our queries.

Client elasticsearchClient = elasticsearchNode.client();

Shutting Down the Node

elasticsearchNode.close();

A nice implementation of using the module can be seen at ElasticsearchClient.java in the loklak project. It uses the settings from a configuration file and builds the node using it.


Visualisation using elasticsearch-head

So by now, we have an Elasticsearch client which is capable of doing all sorts of operations on the node. But how do we visualise the data that is being stored? Writing code and running it every time to check results is a lengthy thing to do and significantly slows down development/debugging cycle.

To overcome this, we have a web frontend called elasticsearch-head which lets us execute Elasticsearch queries and monitor the cluster.

To run elasticsearch-head, we first need to have grunt-cli installed –

$ sudo npm install -g grunt-cli

Next, we will clone the repository using git and install dependencies –

$ git clone git://github.com/mobz/elasticsearch-head.git
$ cd elasticsearch-head
$ npm install

Next, we simply need to run the server and go to indicated address on a web browser –

$ grunt server

At the top, enter the location at which elasticsearch-head can interact with the cluster and Connect.

Upon connecting, the dashboard appears telling about the status of cluster –

The dashboard shown above is from the loklak project (will talk more about it).

There are 5 major sections in the UI –
1. Overview: The above screenshot, gives details about the indices and shards of the cluster.
2. Index: Gives an overview of all the indices. It also allows adding new indices from the UI.
3. Browser: Gives a browser window for all the documents in the cluster. It looks something like this –

The left pane allows us to set the filter (index, type and field). The table listed is sortable. But we don’t always get what we are looking for manually. So, we have the following two sections.
4. Structured Query: Gives a dead simple UI that can be used to make a well structured request to Elasticsearch. This is what we need to search for to get Tweets from @gsoc that are indexed –

5. Any Request: Gives an advance console that allows executing any query allowable by Elasticsearch API.

A little about the loklak project and Elasticsearch

loklak is a server application which is able to collect messages from various sources, including twitter. The server contains a search index and a peer-to-peer index sharing interface. All messages are stored in an elasticsearch index.

Source: github/loklak/loklak_server

The project uses Elasticsearch to index all the data that it collects. It uses NodeBuilder to create Elasticsearch node and process the index. It is flexible enough to join an existing cluster instead of creating a new one, just by changing the configuration file.

Conclusion

This blog post tries to explain how NodeBuilder can be used to create Elasticsearch nodes and how they can be configured using Elasticsearch Settings.

It also demonstrates the installation and basic usage of elasticsearch-head, which is a great library to visualise and check queries against an Elasticsearch cluster.

The official Elasticsearch documentation is a good source of reference for its Java API and all other aspects.

This API or that Library – which one?

Last week, I was playing with a scraper program in the Loklak Server project when I came across the library Boilerpipe. There were some issues in the program related to its implementation. It worked well. I implemented it and opened a pull request, but it was rejected due to the library’s maintenance issues. This wasn’t the first time an API (or a library) had let me down, but it added one more point to my ‘Linear Selection Algorithm’ for picking one.

Libraries once revolutionized software projects, and now APIs are taking abstraction to an even greater level. One can find many APIs and libraries on GitHub or on their respective websites, but they may be buggy. This can waste one’s time and work. I am not blogging to suggest which of the two to choose, but what to check before putting them to use in development.

So let us select a bunch of these and give a score of +1 if it satisfies a point, 0 for a don’t-care condition and -1 for a BIG NO.

Now initialize the variable score to zero and let’s begin.

1. First things first: is it easy to understand?

Does this library’s code belong to your knowledge domain? Can you use it without any issues? Also consider your project’s platform compatibility with the library. If you are developing a prototype or a small piece of software (say, for an event like a hackathon), you should give an easy-to-read tutorial higher priority and score++. But if you are working on a longer-term project, you shouldn’t shy away from going the extra mile, and keep the score unchanged.

2. Does it have any documentation or examples of implementation

The documentation should be well written and well maintained. If it isn’t, I am OK with examples. Choose according to your comfort. If there is neither, at least the code should be easy to understand.

3. Does it fulfill all my needs?

Test and try to implement all the methods/API calls needed for the project. Sometimes it may not have all the methods you need for your application, or maybe some methods are buggy. Take care with this point; a faulty library can ruin all your hard work.

4. Efficiency and performance (BONUS POINT for this one)

Really important for projects with high capacity/performance requirements.

5. Look at the apps where it is implemented

If you are in a hackathon or a dev sprint, checking for applications built on this API should be enough. Just skip the rest of the steps (except the first).

6. Can you find blogs, Stack Overflow questions and tutorials?

If yes, this is a score++.

7. An Active Community, a Super GO!

Yaay! An extra plus with the previous point.

8. Don’t tell me it isn’t maintained

This is important because if the library isn’t maintained, you are prone to bugs that may pop up in the future and never get fixed, and its performance can never be improved. If there is no alternative, it is better to use the relevant parts of its code in your own codebase so that you can work on them if needed.

Now calculate the scores, choose the fittest one and get to work.

So with the deserving library in your hand, my first blog post here ends.

Ticket Ordering or Positioning (back-end)

One of the many feature requests that we got for our Open Event organizer server, the eventyay website, is ticket ordering. Event organizers wanted to show the tickets in a particular order on the website and wanted to control that ordering. This was a common request from many users and also an important enhancement. There were two main things to deal with where ticket ordering was concerned. Firstly, how do we store the position of a ticket within the set of tickets? Secondly, we needed to provide a UI in the event creation/edit wizard to control the order or position of a ticket. In this blog, I will talk about how we store the position of the tickets in the back-end and use it to display them on the event’s public page.
