How SUSI Analyzes A Given Response

Ever wondered where SUSI’s answers come from? SUSI can now analyze its own answers. To get that analysis, just ask SUSI “analysis”. This puts SUSI into analysis mode: it tells you where the latest answer came from and gives you a link for improving the skill.

Let’s check out how SUSI’s analysis works. The skill for analysis is defined in en_0001_foundation.txt as follows:

analysis|analyse|analyze|* analysis|* analyse|* analyze|analysis *|analyse *|analyze *
My previous answer is defined in the skill $skill$. You can help to improve this skill and <a href="$skill_link$" target="_blank"> edit it in the code repository here.</a>

$skill$ and $skill_link$ are variables; they are recognized using the following compiled pattern:

public static final Pattern variable_pattern = Pattern.compile("\\$.*?\\$");
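To see what this pattern matches, here is a minimal, self-contained sketch. The class name VariablePatternDemo, the helper findVariables and the sample answer text are assumptions made for illustration; only the regular expression itself comes from the snippet above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VariablePatternDemo {
    // Same regular expression as above: a "$", then as few characters
    // as possible, then the closing "$".
    static final Pattern VARIABLE_PATTERN = Pattern.compile("\\$.*?\\$");

    // Collects every "$name$" placeholder found in an answer template.
    // (Hypothetical helper, not part of the SUSI code base.)
    static List<String> findVariables(String template) {
        List<String> found = new ArrayList<>();
        Matcher matcher = VARIABLE_PATTERN.matcher(template);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String answer = "My previous answer is defined in the skill $skill$. Edit it at $skill_link$.";
        System.out.println(findVariables(answer)); // prints [$skill$, $skill_link$]
    }
}
```

Because the quantifier is reluctant (.*?), each match stops at the nearest closing $, so two variables in one answer are found separately instead of being merged into one long match.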

These variables are memorized in SUSI’s cognition. A cognition is the combination of a user’s query with SUSI’s response.

SusiThought dispute = new SusiThought();
List<String> skills = clonedThought.getSkills();
if (skills.size() > 0) {
    dispute.addObservation("skill", skills.get(0));
    dispute.addObservation("skill_link", getSkillLink(skills.get(0)));
}

A SusiThought is a piece of data that can be remembered. A thought is modeled as a table in which the information is organized in rows and columns.

 public SusiThought addObservation(String featureName, String observation) ;

One can memorize using the addObservation() method. It takes two parameters: featureName, the object key, and observation, the object value. The thought is a table of information pieces, stored as a set of rows that all have the same column names. New data is always inserted in front of existing similar data rather than overwriting it.

public String getSkillLink(String skillPath) {
    String link = skillPath;
    if (skillPath.startsWith("/susi_server")) {
        link = "https://github.com/fossasia/susi_server/blob/development" + skillPath.substring("/susi_server".length());
    } else if (skillPath.startsWith("/susi_skill_data")) {
        link = "https://github.com/fossasia/susi_skill_data/blob/master" + skillPath.substring("/susi_skill_data".length());
    }
    return link;
}

getSkillLink is a utility method that returns the link to the skill’s source in the GitHub repository, based on skillPath.
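As a quick usage illustration, the sketch below copies the getSkillLink logic into a standalone class (SkillLinkDemo is a name made up for this example) and applies it to the skill path that appears in the JSON output later in this post.

```java
final class SkillLinkDemo {
    // Copy of the getSkillLink logic from this post, made static for the demo.
    static String getSkillLink(String skillPath) {
        String link = skillPath;
        if (skillPath.startsWith("/susi_server")) {
            link = "https://github.com/fossasia/susi_server/blob/development"
                    + skillPath.substring("/susi_server".length());
        } else if (skillPath.startsWith("/susi_skill_data")) {
            link = "https://github.com/fossasia/susi_skill_data/blob/master"
                    + skillPath.substring("/susi_skill_data".length());
        }
        return link;
    }

    public static void main(String[] args) {
        String path = "/susi_skill_data/models/general/entertainment/en/quotes.txt";
        // prints https://github.com/fossasia/susi_skill_data/blob/master/models/general/entertainment/en/quotes.txt
        System.out.println(getSkillLink(path));
    }
}
```

A path that starts with neither prefix is returned unchanged.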

private String skill;
SusiThought recall;
final SusiArgument flow = new SusiArgument().think(recall);
this.skill = origin.getAbsolutePath();
 if (this.skill != null && this.skill.length() > 0) flow.addSkill(this.skill);

The source of the skill gets added in SusiIntent.java using the getAbsolutePath() method, which resolves the skill path in the filesystem. The intent considers the key from the user query, matches the intent tokens to get the optimum result, and produces JSON like this:

 "data": [
      {
        "object": "If you spend too much time thinking about a thing, you'll never get it done.",
        "0": "tell me a quote",
        "token_original": "quote",
        "token_canonical": "quote",
        "token_categorized": "quote",
        "timezoneOffset": "-330",
        "answer": "When you discover your mission, you will feel its demand. It will fill you with enthusiasm and a burning desire to get to work on it. ",
        "skill_link": "https://github.com/fossasia/susi_skill_data/blob/master/models/general/entertainment/en/quotes.txt",
        "query": "tell me a quote",
        "skill": "/susi_skill_data/models/general/entertainment/en/quotes.txt"
      },

The getSkills() method returns the list of skills from the JSON; these are later added for memorization.

    public List<String> getSkills() {
        List<String> skills = new ArrayList<>();
        getSkillsJSON().forEach(skill -> skills.add((String) skill));
        return skills;
    }

This is how SUSI is able to tell where an answer came from. The next time you chat with SUSI, do check the skill analysis and add your ideas to improve the skill. Take a look at susi_skill_data for more skills, and read this tutorial on creating skills for SUSI.


Create Discount Code Component in Open-Event-Frontend

In Open-Event-Frontend, we have given the event organiser the ability to create discount coupons for his or her event. The organiser can enter either a discount amount or a discount percentage and can also set the total number of coupons to make available to customers. We also automatically generate a unique link for each discount coupon.

We’ll create a separate component, create-discount-code, for creating discount codes. To create the component, we’ll run the following command:

ember g component forms/events/view/create-discount-code

This will create

1.Create-discount-code.hbs

Here we have designed our form. We have nested all the fields inside Semantic UI’s ui form class. Some of the helpers used in the form are described below.

We have used the Ember input helper in the following way for all the input fields. The name and value attributes correspond to the id and value attached to the helper.

{{input type='text' name='discount_code' value=data.code}}

Ember radio buttons are used by the organizer to select the discount type (amount or percentage).

{{ui-radio label=(t 'Amount (US$)')
           name='discount_type'
           value='amount'
           current='amount'
           onChange=(action (mut selectedMode))}}

 

We have given the organizer an additional option to set the validity period of the discount code. For this we have used the date-picker and time-picker components already present in Open-Event-Frontend, in the following manner.

<div class="fields">
  <div class="wide field {{if device.isMobile 'sixteen' 'five'}}">
    <label>{{t 'Valid from'}}</label>
    {{widgets/forms/date-picker id='start_date' value=data.validFromDate rangePosition='start'}}
    <div class="ui hidden divider"></div>
    {{widgets/forms/time-picker id='start_time' value=data.validFromTime rangePosition='start'}}
  </div>
  <div class="wide field {{if device.isMobile 'sixteen' 'five'}}">
    <label>{{t 'Expires on'}}</label>
    {{widgets/forms/date-picker id='end_date' value=data.validTillDate rangePosition='end'}}
    <div class="ui hidden divider"></div>
    {{widgets/forms/time-picker id='end_time' value=data.validTillTime rangePosition='end'}}
  </div>
</div>

The above snippet renders two groups of date and time pickers, one for the start and one for the end of the validity period.

2.Create-discount-code.js

Here we validate the form and provide it with a unique discount code URL. We generate the URL using the event id and the discount code.

discountLink: computed('data.code', function() {
  const params = this.get('routing.router.router.state.params');
  return location.origin + this.get('routing.router')
    .generate('public', params['events.view'].event_id,
      { queryParams: { discount_code: this.get('data.code') } });
}),
actions: {
  submit() {
    this.onValid(() => {
    });
  }
}

3.Create-discount-code-test.js

This is where we check whether our component works together with the rest of the system. For now, we just make sure that our component renders, by checking for the presence of ‘Save’.

import { test } from 'ember-qunit';
import moduleForComponent from 'open-event-frontend/tests/helpers/component-helper';
import hbs from 'htmlbars-inline-precompile';

moduleForComponent('forms/events/view/create-discount-code', 'Integration | Component | forms/events/view/create discount code');

test('it renders', function(assert) {
  this.render(hbs`{{forms/events/view/create-discount-code routing=routing}}`);
  assert.ok(this.$().html().trim().includes('Save'));
});

Now our component is ready, and the only part remaining is to place it in our application. We place it in app/templates/events/view/tickets/discount-codes/create.hbs in the following form:

{{forms/events/view/create-discount-code data=model}}

Here we have passed the model from create-discount-code.js as the data used in create-discount-code.hbs.

Now our create discount code page is up and running.


Adding Messaging Route in Ember.js for Admin UI of Open Event Frontend

In this blog post I explain how we implemented a messages page that lets admins keep track of all types of system messages sent to users in the Open Event Frontend. The page shows, in one place, the types of messages sent out to various users, along with additional details. It offers configuration options to control which messages are sent out as emails, as notifications, or as both. The page also shows when and what message should be sent via notification or mail.
To create the messages page, we’ll run the following command:

ember generate route admin/messages

This will create the route file (messages.js), its template (messages.hbs) and a unit test (messages-test.js).

This command will also add this.route('messages'); to router.js. As admin is the parent route for messages, messages will be nested inside admin in router.js:

this.route('admin', function() {
  this.route('messages');
});

Let’s now understand the content of each of the above files.

  1. Messages.js

In admin/messages.js we have used the titleToken method to set the title of the tab. Here we have created the message model with attributes such as recipient, trigger, emailMessage, notificationMessage, option and sentAt. We return this model from the route’s JS file to the template.

import Ember from 'ember';
const { Route } = Ember;
export default Route.extend({
  titleToken() {
    return this.l10n.t('Messages');
  },
  model() {
    return [{
      recipient: [
        {
          name: 'Organizer'
        },
        {
          name: 'Speaker'
        }
      ],
      trigger      : 'Title1',
      emailMessage : {
        subject : 'Email subject1',
        message : 'Hi, the schedule for session1 has been changed'
      },
      notificationMessage: {
        subject : 'Notification subject1',
        message : 'Hi, the schedule for session1 has been changed'
      },
      option: {
        mail         : true,
        notification : false,
        userControl  : true
      },
      sentAt: new Date()
    }];
  }
});

 

  2. Messages.hbs

In the template we have created a table and added classes such as stackable and compact. The stackable class makes the table responsive and stacks all the rows on devices with smaller screens. The compact class helps to show more rows at a time.

Then in the template we iterate through the model using a loop. Here we have used other Semantic UI elements such as ui ordered list, ui header and ui-checkbox inside the table. The options column has three attributes denoting how the admin wants to send the message to the user. We have grouped the three fields using the class grouped fields. In each field we have used Semantic’s ui-checkbox. In the checkbox we mutate the value on click by using the mut helper.

<div class="grouped fields">
  <div class="field">
    {{ui-checkbox checked=message.option.mail
                  label=(t 'Mail')
                  onChange=(action (mut message.option.mail))}}
  </div>
  <div class="field">
    {{ui-checkbox checked=message.option.notification
                  label=(t 'Notification')
                  onChange=(action (mut message.option.notification))}}
  </div>
  <div class="field">
    {{ui-checkbox checked=message.option.userControl
                  label=(t 'User Control')
                  onChange=(action (mut message.option.userControl))}}
  </div>
</div>

We get a Date object from the JS file; to convert it into a human-readable format, we have used moment, like this: {{moment-format message.sentAt 'dddd, DD MMMM YYYY'}}

  3. Messages-test.js

import { test } from 'ember-qunit';
import moduleFor from 'open-event-frontend/tests/helpers/unit-helper';

moduleFor('route:admin/messages', 'Unit | Route | admin/messages', []);

test('it exists', function(assert) {
  let route = this.subject();
  assert.ok(route);
});

Using this we can test the existence of our route. These tests are run using the command ember t.

Our messages page is ready now. The admin can check all the messages sent to users.


How to use Digital Ocean and Docker to setup Test CMS for Phimpme

One of the core features of the Phimpme app is sharing images to many different accounts, including open source CMSs such as WordPress and Drupal, and open source data storage services such as OwnCloud and NextCloud.

One cannot have all of these in place, but for development and testing purposes we need them on our end. The problem I faced was getting this done in the most efficient way. I thought setting things up on a hosted server would be good, because it saves a lot of time compared with setting everything up locally and adding all the dependencies. Also, we cannot share an account that is limited to our local environment.

DigitalOcean caught my attention as a hosting service. It is very easy to use with its droplet creation: select a droplet according to your system and service requirements, and it will be ready in a few moments and usable from anywhere.

Note: DigitalOcean is a paid service. Students can use the GitHub Education Pack for free credits on DigitalOcean. I did the same.

I recently worked on the Nextcloud integration, so in this blog post I will explain how to quickly create a Nextcloud server using DigitalOcean and Docker.

Step 1: Creating Droplet

DigitalOcean works entirely with droplets, and one can create and destroy the droplets associated with an account at any time.

Choose an Image

There are three options for choosing the image of a droplet.

Distributions: the operating system you want to use.

One-click apps: a very good feature, as it sets up everything for use in just one click. But it doesn’t provide everything; for example, there is no NextCloud. That’s why I used Docker to get its image.

Snapshots: if you have already saved a droplet, this picks it up and creates a droplet similar to the saved image.

Here I selected Docker from the one-click apps section.

Selecting the size

This is for selecting the size of the server we are creating. For small development purposes, the $5 plan is good. A good thing about DigitalOcean is that it doesn’t charge users on a monthly basis: usage is billed by the hour. It also provides SSD disks for fast I/O operations.

Choose a datacenter Region

Add SSH

It is very important to add an SSH key. Otherwise you have to set up a root password or use the shell they provide. To work from the terminal on your own computer, it is good to set up an SSH key beforehand and add it to the account.

How to get an SSH key on your system: https://help.github.com/

Choose the number of droplets, name the droplet, and create it.

It will now show up in your droplets section.

Step 2: Access the Server

As we have already added the SSH key to our droplet, we can now access it from our terminal. Open the terminal and type this:

➜  ~ ssh root@<your IP> 

This logs you in to the server:

root@docker-512mb-blr1-01:~# 

Our objective is to set up a NextCloud account.

Here I will use Docker. First, what is Docker?

Go here to read: https://www.docker.com/what-docker

I will explain Docker in my own words. Say I have set up everything I need. Now suppose I have to destroy all of it and want to use it again after some days, or my friends want to use the platform I set up. What are the options?

Recreate everything every time? No.

Just create a Docker image, save it, pull the image when you want, and run it on the server. If your friends need it, give them the Docker image.

Isn’t that cool, and a big time saver?

Browse the Docker Hub

In the hub we can find Docker images for various platforms, officially maintained by their authors.

Nextcloud has an official account on Docker Hub to provide the latest images to developers.

Here is the link: https://hub.docker.com/_/nextcloud/

Pull the image onto your server.

root@docker-512mb-blr1-01:~# docker pull nextcloud
Using default tag: latest
latest: Pulling from library/nextcloud
9f0706ba7422: Pull complete
4c407763908f: Pull complete
82e2bc3a45c1: Pull complete
c84e1013aed1: Pull complete
a3b5e03d7e24: Pull complete
917f836a88be: Pull complete
b2dc54431819: Pull complete
a60b574790b8: Pull complete
49ef0f1aff88: Pull complete
7773a865ee49: Pull complete
9e0e5cc56a9d: Pull complete
bfade1c7421e: Pull complete
ece8ceb33bed: Pull complete
c691d2747a3e: Pull complete
4b5e96bf54c9: Pull complete
6fbe30ae456b: Pull complete
e0c534b35a6b: Pull complete
4d2687f4b6f3: Pull complete
00197422846a: Pull complete
6ab57168c49c: Pull complete
9e1260db005f: Pull complete
Digest: sha256:1bb5c256f19dcec60d8468c00bc7dc74efdf93390666cb82e20bcacbbbd9746c
Status: Downloaded newer image for nextcloud:latest
root@docker-512mb-blr1-01:~#

Following the documentation, I need to run this command:

$ docker run -d -p 8080:80 nextcloud

It serves Nextcloud on port 8080 of the server.

Check it at http://<IP>:8080

In this way I can easily set up different accounts for testing and integration purposes in the Phimpme Android app. It really saves a lot of time and speeds up the process.

You can easily destroy the droplet after the work is done.

Students can use the free credits from the GitHub Education Pack.


Using Variables in a SUSI skill

One of the best features of skill creation is the ease of using variables. From storing the user’s favourite book, to the most recent movie they searched for, to the mood they are in, variables play an indispensable part. If you run into any problem with the code, the skill referred to in this blog post is available in the susi_skill_data repository.

The official SUSI docs walk you through some basic examples of how to use variables in a SUSI skill. Great skills can be built with them, such as a skill that remembers things for the user.

It’s easy to make such skills by using variables. Let’s check out how this skill can be built.

To store a value in a variable, we use this syntax during skill development:

^value^>_variableName

First, let’s save the favourite dish of the user and then we will try to surprise him/her with a witty answer.

I love * dish
^$1$^>_userFavouriteDish

So, if the user types “I love biryani dish”, $1$ will be equal to biryani, which we save to the _userFavouriteDish variable.

Now if the user asks SUSI “What should i eat”, I bet SUSI will give a well-calculated answer!

What should i eat?
I am sure you will love $_userFavouriteDish$!

Another example that can answer back the user efficiently:

How to cook biryani?

#Gives recipes and links to cook a dish
* cook *
!console:To cook  $title$ , check out $href$ and make sure you have $ingredients$! ^$2$^>_recentSearch
{
"url":"http://www.recipepuppy.com/api/?q=$2$",
"path":"$.results"
}
eol

In the above code, we saved the dish searched for at the end of the output.

If the user ends up asking “what is the most recent dish i searched for”, the skill for it will be:

what is the most recent dish I searched for?
It was $_recentSearch$

If, before asking this question, the user asks “how to cook sushi”, the _recentSearch variable is overwritten with the value “sushi” instead of “biryani”. Hence SUSI makes no mistake and answers “sushi” as the most recent dish!

Now I think we are a bit more comfortable with the use of variables in a skill. Let’s get back to our target skill, i.e. the remembering skill. We store the thing the user asks us to remember in a variable that has the same name as that thing, with the statement related to it as the variable’s value. Examples:

Remember that my keys are on the table. So the variable will be named “keys” and its value will be “on the table”.

Remember that my birthday is on 20th of December. So the variable will be named “birthday” and its value will be “on 20th of December”.

Remember that my meetings are at 8 pm with mentors and at 9:30 pm with Shruti. So the variable will be named “meetings” and its value will be “at 8 pm with mentors and at 9:30 pm with Shruti”.

Hence the skill:

Remember that my * is * | Remember that my * are *
Okay, remembered!^$2$^>_$1$

When the user asks about any of these things, we just show the value of the variable that has the same name as the thing asked about. Examples:

#$_keys$ will be our answer
Where are my keys?
On the table                   

#$_meetings$ will be our answer
When are my meetings?
at 8 pm with mentors and at 9:30 pm with Shruti

Hence the skill which answers the question is:

when are my * | where is my * | where are my *
$_$1$$

So the skill as a whole will be:

Remember that my * is * | Remember that my * are *
Okay, remembered!^$2$^>_$1$

when are my * | where is my * | where are my *
$_$1$$
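To make the store-and-recall idea concrete outside of the skill language, here is a rough Java simulation. It is only a sketch under assumptions: the class RememberSkillSketch, its regular expressions and the fallback replies are invented for illustration and are not how SUSI itself implements skills. A plain map stands in for SUSI's variable memory.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RememberSkillSketch {
    // Stands in for SUSI's variable memory (an assumption of this sketch).
    private final Map<String, String> variables = new HashMap<>();

    // Rough equivalents of the two skill patterns, written as regular expressions.
    private static final Pattern REMEMBER = Pattern.compile(
            "remember that my (.+?) (?:is|are) (.+)", Pattern.CASE_INSENSITIVE);
    private static final Pattern RECALL = Pattern.compile(
            "(?:when are|where is|where are) my (.+?)\\??", Pattern.CASE_INSENSITIVE);

    String reply(String query) {
        Matcher remember = REMEMBER.matcher(query.trim());
        if (remember.matches()) {
            // Like ^$2$^>_$1$: store the second wildcard under the first one's name.
            variables.put(remember.group(1).toLowerCase(), remember.group(2));
            return "Okay, remembered!";
        }
        Matcher recall = RECALL.matcher(query.trim());
        if (recall.matches()) {
            // Like $_$1$$: look up the variable named after the wildcard.
            return variables.getOrDefault(recall.group(1).toLowerCase(), "I don't know yet.");
        }
        return "Sorry, I did not understand that.";
    }

    public static void main(String[] args) {
        RememberSkillSketch susi = new RememberSkillSketch();
        System.out.println(susi.reply("Remember that my keys are on the table")); // Okay, remembered!
        System.out.println(susi.reply("Where are my keys?"));                     // on the table
    }
}
```

The key point mirrored here is that the variable name itself comes from the user's sentence, so one pair of rules covers keys, birthdays, meetings and anything else.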


Generating responsive email using mjml in Yaydoc

In Yaydoc, an email with download, preview and deploy links is sent to the user after the documentation is generated. Initially, Yaydoc sent this email as plain text without any styling, so I decided to make an attractive HTML email template for it. The problem with HTML email is adding custom CSS and making it responsive, because the emails will be read on various devices such as mobiles, tablets and desktops. While going through the GitHub trending list, I came across mjml and was stunned by its capabilities. Mjml is a responsive email generation framework built using React (a popular front-end framework maintained by Facebook).

Install mjml on your system using npm.

npm init -y && npm install mjml

Then add mjml to your path

export PATH="$PATH:./node_modules/.bin"

Mjml has a lot of pre-built React components for creating responsive emails, for example mj-text, mj-image and mj-section.

Here I’m sharing the snippet used for generating email in Yaydoc.

<mjml>
  <mj-head>
    <mj-attributes>
      <mj-all padding="0" />
      <mj-class name="preheader" color="#CB202D" font-size="11px" font-family="Ubuntu, Helvetica, Arial, sans-serif" padding="0" />
    </mj-attributes>
    <mj-style inline="inline">
      a { text-decoration: none; color: inherit; }
 
    </mj-style>
  </mj-head>
  <mj-body>
    <mj-container background-color="#ffffff">
 
      <mj-section background-color="#CB202D" padding="10px 0">
        <mj-column>
          <mj-text align="center" color="#ffffff" font-size="20px" font-family="Lato, Helvetica, Arial, sans-serif" padding="18px 0px">Hey! Your documentation generated successfully<i class="fa fa-address-book-o" aria-hidden="true"></i>
 
          </mj-text>
        </mj-column>
      </mj-section>
      <mj-section background-color="#ffffff" padding="20px 0">
        <mj-column>
          <mj-image src="http://res.cloudinary.com/template-gdg/image/upload/v1498552339/play_cuqe89.png" width="85px" padding="0 25px">
</mj-image>
 
          <mj-text align="center" color="#EC652D" font-size="20px" font-family="Lato, Helvetica, Arial, sans-serif" vertical-align="top" padding="20px 25px">
            <strong><a>Preview it</a></strong>
            <br />
          </mj-text>
        </mj-column>
        <mj-column>
          <mj-image src="http://res.cloudinary.com/template-gdg/image/upload/v1498552331/download_ktlqee.png" width="100px" padding="0 25px" >
        </mj-image>
          <mj-text align="center" color="#EC652D" font-size="20px" font-family="Lato, Helvetica, Arial, sans-serif" vertical-align="top" padding="20px 25px">
            <strong><a>Download it</a></strong>
            <br />
          </mj-text>
        </mj-column>
        <mj-column>
          <mj-image src="http://res.cloudinary.com/template-gdg/image/upload/v1498552325/deploy_yy3oqw.png" width="100px" padding="0px 25px" >
        </mj-image>
          <mj-text align="center" color="#EC652D" font-size="20px" font-family="Lato, Helvetica, Arial, sans-serif" vertical-align="top" padding="20px 25px">
 
            <strong><a>Deploy it</a></strong>
            <br />
          </mj-text>
        </mj-column>
      </mj-section>
      <mj-section background-color="#333333" padding="10px">
        <mj-column>
          <mj-text align="center" color="#ffffff" font-size="20px" font-family="Lato, Helvetica, Arial, sans-serif" padding="18px 0px">Thanks for using Yaydoc<i class="fa fa-address-book-o" aria-hidden="true"></i>
          </mj-text>
        </mj-column>
      </mj-section>
    </mj-container>
  </mj-body>
</mjml>

The main goal of this example is to make a responsive email that looks like the image given below. In the mj-head tag, I set the shared attributes, including the font family, using the mj-attributes and mj-class tags, and wrote my custom CSS in mj-style.

Then I made a section with one column using the mj-container, mj-section and mj-column tags, and changed its background color to #CB202D using the background-color attribute. In that section I wrote a heading which says `Hey! Your documentation generated successfully` with the mj-text tag. This gives the red top bar with the success message.

Moving on to the second part, I made a section with three columns and added one image to each column using the mj-image tag, specifying the image URL as the src attribute, and added the corresponding text below each image using the mj-text tag. Finally, I made one more section like the first one, with background color #333333 and a different message saying `Thanks for using Yaydoc`.

At last, transpile your mjml code to HTML by executing the following command.

mjml -r index.mjml -o index.html

Rendered Email

Testing child process using Mocha in Yaydoc

Mocha is a JavaScript testing framework. It can be used both in Node.js and in the browser, and it is one of the most popular testing frameworks available. Mocha is widely used for Behavior Driven Development (BDD). In Yaydoc, we use Mocha to test our web UI. One of the main tasks in Yaydoc is documentation generation, for which we built a bash script. We run the bash script using Node’s child_process module, so in order to test it, the child process has to be executed before the test runs. This can be achieved with Mocha’s before hook. Install Mocha on your system:

npm install -g mocha

Here is the test case which I wrote in the Yaydoc test file.

const assert = require('assert')
const spawn = require('child_process').spawn
const uuidV4 = require("uuid/v4")
describe('WebUi Generator', () => {
  let uniqueId = uuidV4()
  let email = 'fossasia@gmail.com'
  let args = [
    "-g", "https://github.com/fossasia/yaydoc.git",
    "-t", "alabaster",
    "-m", email,
    "-u", uniqueId,
    "-w", "true"
  ]
  let exitCode

  before((done) => {
    let process = spawn('./generate.sh', args)
    process.on('exit', (code) => {
      exitCode = code
      done()
    })
  })
  it('exit code should be zero', () => {
    assert.equal(exitCode, 0)
  })
 })

The describe() function is used to describe our test case. In our scenario we’re testing the generate script, so we name it ‘WebUi Generator’. As mentioned above, we have to run our child process in the before hook. The it() function is where we write our test case. If the test case fails, an error is thrown. We use Node’s built-in assert module for the assertion. You can see our assertion in the it() block, which checks whether the exit code is zero.

mocha test.js --timeout 1500000

Since documentation generation takes time, we have to specify a timeout when running Mocha. If your test case passes, you will get output similar to this:

WebUi Generator
    ✓ exit code should be zero


Storing a Data List in Phimpme Android

In Phimpme Android, we need to store all the available camera parameters, such as the list of ISO values and the available camera resolutions, so that they can be displayed to the user in the camera settings. In Phimpme, we store these lists of data in SharedPreferences with some modifications. As we cannot store a list directly in SharedPreferences, in this post I will discuss how we achieved this in the Phimpme Android application.

To store an ArrayList, you have to create a function that converts the list into a string by joining its items with some separator symbol.

Step – 1

First, create a class, say TinyDB, which will contain the functions to store a list in SharedPreferences.

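import android.content.Context;
import android.content.SharedPreferences;
import android.preference.PreferenceManager;

public class TinyDB {
    // Default SharedPreferences of the application
    private SharedPreferences preferences;

    public TinyDB(Context appContext) {
        preferences = PreferenceManager.getDefaultSharedPreferences(appContext);
    }
}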

Step – 2

Create functions to convert the list into a string and store it in SharedPreferences.

The putListInt() method converts an integer ArrayList to a string and stores it in SharedPreferences.

Similarly, the putListString() method converts a string ArrayList to a single string and stores it in SharedPreferences.

public void putListInt(String key, ArrayList<Integer> intList) {
  if (key == null) return;
  if (intList == null) return;
  Integer[] myIntList = intList.toArray(new Integer[intList.size()]);
  preferences.edit().putString(key, TextUtils.join("‚‗‚", myIntList)).apply();
}

public void putListString(String key, ArrayList<String> stringList) {
  if (key == null) return;
  if (stringList == null) return;
  String[] myStringList = stringList.toArray(new String[stringList.size()]);
  preferences.edit().putString(key, TextUtils.join("‚‗‚", myStringList)).apply();
}

 

 

Now create an object of the TinyDB class and call the above functions on it.

Our data is now saved in SharedPreferences. To read it back, we have to create getters for the ArrayList.

 

Step-3

Add two functions to the TinyDB class to get the string and integer ArrayLists.

public ArrayList<String> getListString(String key) {
  return new ArrayList<String>(Arrays.asList(
      TextUtils.split(preferences.getString(key, ""), "‚‗‚")));
}

public ArrayList<Integer> getListInt(String key) {
  String[] myList = TextUtils.split(preferences.getString(key, ""), "‚‗‚");
  ArrayList<String> arrayToList = new ArrayList<String>(Arrays.asList(myList));
  ArrayList<Integer> newList = new ArrayList<Integer>();

  for (String item : arrayToList)
      newList.add(Integer.parseInt(item));

  return newList;
}

Now, to get the saved integer and string ArrayLists, simply call these functions on an instance of the TinyDB class.
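The join-and-split round trip can also be tried outside Android. The following is a minimal sketch in plain Java, assuming String.join and String.split as stand-ins for TextUtils.join and TextUtils.split; the class ListCodecSketch and its method names are hypothetical. The separator is written with Unicode escapes and is assumed to match the one used in the snippets above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class ListCodecSketch {
    // Separator written with Unicode escapes (U+201A, U+2017, U+201A);
    // assumed to be the same unusual sequence used by TinyDB.
    static final String SEPARATOR = "\u201A\u2017\u201A";

    // Joins the integers into one string, which is what gets stored.
    static String encode(List<Integer> values) {
        List<String> parts = new ArrayList<>();
        for (Integer value : values) {
            parts.add(String.valueOf(value));
        }
        return String.join(SEPARATOR, parts);
    }

    // Splits the stored string and parses each item back into an integer.
    static List<Integer> decode(String stored) {
        List<Integer> result = new ArrayList<>();
        if (stored.isEmpty()) {
            return result; // nothing was saved, so return an empty list
        }
        for (String item : stored.split(Pattern.quote(SEPARATOR))) {
            result.add(Integer.parseInt(item));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> isoValues = Arrays.asList(100, 200, 400);
        String stored = encode(isoValues);
        System.out.println(decode(stored)); // prints [100, 200, 400]
    }
}
```

An unusual separator is chosen so that it is very unlikely to occur inside a stored value; a value that did contain it would be split into two items on the way back.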

The screenshot below depicts how we have stored the list of camera resolutions in SharedPreferences using the TinyDB class.

So this is how you can store an entire ArrayList in SharedPreferences. For more detail, you can see the TinyDB class in our Phimpme project.

Resources:  

https://stackoverflow.com/questions/7057845/save-arraylist-to-sharedpreferences

http://blog.nkdroidsolutions.com/arraylist-in-sharedpreferences/

http://findnerd.com/list/view/Save-ArrayList-of-Object-into-Shared-Preferences-in-Android/510?page=10&ppage=3

https://github.com/fossasia/phimpme-android/blob/development/app/src/main/java/org/fossasia/phimpme/opencamera/Camera/TinyDB.java

 

 


Introducing Priority Kaizen Harvester for loklak server

In the previous blog post, I discussed the changes made in loklak’s Kaizen harvester so that it could be extended and other harvesting strategies could be introduced. Those changes made it possible to introduce a new harvesting strategy, the PriorityKaizen harvester, which uses a priority queue to store the queries that are to be processed. In this blog post, I will discuss the process through which this new harvesting strategy was introduced in loklak.

Background, motivation and approach

Before jumping into the changes, we first need to understand why we need this new harvesting strategy. Let us start by discussing the issue with the Kaizen harvester.

The producer-consumer imbalance in Kaizen harvester

Kaizen uses a simple hash queue to store queries. When the queue is full, new queries are dropped. But the number of queries produced by searching for one query is much higher than the rate at which queries are consumed, so the queue is bound to overflow and newly arriving queries get dropped. (See loklak/loklak_server#1156)

Learnings from the attempt to add a blocking queue for queries

As a solution to this problem, I first tried using a blocking queue to store the queries. In this implementation, a producer blocks before putting a query into the queue if it is full and waits until there is space for more. The idea was to balance consumers and producers, as the producers would wait until the consumers free up space for them:

public class BlockingKaizenHarvester extends KaizenHarvester {
   ...
   public BlockingKaizenHarvester() {
       super(new KaizenQueries() {
           ...
           private BlockingQueue<String> queries = new ArrayBlockingQueue<>(maxSize);

           @Override
           public boolean addQuery(String query) {
               if (this.queries.contains(query)) {
                   return false;
               }
               try {
                   // offer() returns false if the timeout elapses before space frees up
                   return this.queries.offer(query, this.blockingTimeout, TimeUnit.SECONDS);
               } catch (InterruptedException e) {
                   DAO.severe("BlockingKaizen Couldn't add query: " + query, e);
                   return false;
               }
           }
           @Override
           public String getQuery() {
               try {
                   return this.queries.take();
               } catch (InterruptedException e) {
                   DAO.severe("BlockingKaizen Couldn't get any query", e);
                   return null;
               }
           }
           ...
       });
   }
}

[SOURCE, loklak/loklak_server#1210]

But there is an issue here. The consumers are themselves producers, and they produce at an even higher rate. When a search is performed, the resulting queries are appended to the harvester’s KaizenQueries instance (which here is backed by a blocking queue). Consider the case where the queue is full and a thread takes a query from the queue and scrapes data. When the scraping finishes, many new queries need to be inserted, so most of them get blocked (because the queue is full again after a single query is inserted).

Therefore, a blocking queue is not a good fit for KaizenQueries.

Other considerations

After the Blocking Kaizen harvester failed, we looked for other ways of storing queries. We came across multilevel queues, persistent disk queues and priority queues.

Multilevel queues sounded like a good idea at first, as we would have multiple queues for storing queries. But this would eventually boil down to how much total queue size we allow, and queries would still get dropped.

Persistent disk queues would allow us to store a greater number of queries, but the major disadvantage was lookup time. It would be terribly slow to check whether a query already exists in the disk queue when the queue is large. Also, since the number of queries would in practice keep growing, the disk queue would get out of hand at some point.

So by now, it was clear that never dropping queries was not an option. What we had to do was use the limited-size queue smartly, so that we do not drop queries that are important.

Solution: Priority Queue

A good solution to our problem was a priority queue. We could assign a higher score to queries that come from more popular Tweets, so that they move higher in the queue and are not dropped until even higher priority queries arrive.

Assigning score to a Tweet

The score for a Tweet is computed using the following formula:

α = 5 * (retweet count) + (favourite count)

score = α / (α + 10 * exp(-0.01 * α))

This maps the retweet and favourite counts of a Tweet to a score between zero and one. The normalisation ensures that Tweets with very high retweet and favourite counts do not receive a disproportionately large score.

[Graph: behaviour of the second equation, score plotted against α]
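The formula can be sketched in a few lines of Java. This is only an illustration of the two equations above. The class and method names (TweetScore, score) are hypothetical and not taken from loklak.

```java
// Hypothetical helper illustrating the scoring formula; not loklak code.
class TweetScore {

    // alpha = 5 * retweets + favourites
    // score = alpha / (alpha + 10 * exp(-0.01 * alpha))
    static double score(long retweetCount, long favouriteCount) {
        double alpha = 5.0 * retweetCount + favouriteCount;
        return alpha / (alpha + 10.0 * Math.exp(-0.01 * alpha));
    }

    public static void main(String[] args) {
        // A Tweet with no engagement scores 0; the score rises towards 1
        // as the retweet and favourite counts grow.
        System.out.println(score(0, 0));
        System.out.println(score(1, 0));
        System.out.println(score(10, 0));
        System.out.println(score(1000, 1000));
    }
}
```

Because the exponential term shrinks quickly, the score saturates near one for popular Tweets, which keeps all scores on a comparable scale.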

Changes required in existing Kaizen harvester

To take a score into account, it became necessary to extend the interface so that a score can also be passed as a parameter to the addQuery() method in KaizenQueries. Not every query can have a score associated with it, though. For example, if we add a query that searches for Tweets older than the oldest one in the current timeline, it is not associated with a single Tweet, so it cannot be given a score. To handle this, such queries get a default score of 0.5:

public abstract class KaizenQueries {

   public boolean addQuery(String query) {
       return this.addQuery(query, 0.5);
   }

   public abstract boolean addQuery(String query, double score);
   ...
}

[SOURCE]

Defining appropriate KaizenQueries object

The KaizenQueries object for a priority queue had to define a wrapper class that would hold the query and its score together so that they could be inserted in a queue as a single object.

ScoreWrapper and comparator

The ScoreWrapper is a simple class that stores score and query object together –

private class ScoreWrapper {

   private double score;
   private String query;

   ScoreWrapper(String m, double score) {
       this.query = m;
       this.score = score;
   }

}

[SOURCE]

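To define how the ScoreWrapper objects are ordered in the priority queue, we need a Comparator for them. The scores are doubles between zero and one, so they are compared with Double.compare(); casting their difference to int would always yield zero. java.util.PriorityQueue keeps the least element at its head, so the arguments are reversed to place the highest score first:

private Comparator<ScoreWrapper> scoreComparator = (scoreWrapper, t1) -> Double.compare(t1.score, scoreWrapper.score);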

[SOURCE]

Putting things together

Now that we have all the ingredients to declare our priority queue, we can also define the strategy for getQuery and addQuery in the corresponding KaizenQueries object:

public class PriorityKaizenHarvester extends KaizenHarvester {

   private static class PriorityKaizenQueries extends KaizenQueries {
       ...
       private Queue<ScoreWrapper> queue;
       private int maxSize;

       public PriorityKaizenQueries(int size) {
           this.maxSize = size;
           queue = new PriorityQueue<>(size, scoreComparator);
       }

       @Override
       public boolean addQuery(String query, double score) {
           ScoreWrapper sw = new ScoreWrapper(query, score);
           if (this.queue.contains(sw)) {
               return false;
           }
           try {
               this.queue.add(sw);
               return true;
           } catch (IllegalStateException e) {
               return false;
           }
       }

       @Override
       public String getQuery() {
           // poll() returns null when the queue is empty
           ScoreWrapper sw = this.queue.poll();
           return sw == null ? null : sw.query;
       }
       ...
   }
}

[SOURCE]
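As a runnable illustration of the idea, the following sketch keeps a fixed number of scored queries, rejects duplicates, and evicts the lowest-scored query when a better one arrives. It is not the loklak implementation. The class BoundedScoredQueries and its members are hypothetical, and it swaps java.util.PriorityQueue for a TreeSet, because PriorityQueue is unbounded and gives direct access only to its head.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a size-limited, score-ordered query store.
class BoundedScoredQueries {

    private static final class Entry {
        final String query;
        final double score;

        Entry(String query, double score) {
            this.query = query;
            this.score = score;
        }
    }

    // Highest score first; ties are broken by query text so that two
    // different queries never compare as equal inside the TreeSet.
    private static final Comparator<Entry> BY_SCORE_DESC =
            Comparator.comparingDouble((Entry e) -> e.score).reversed()
                    .thenComparing(e -> e.query);

    private final TreeSet<Entry> entries = new TreeSet<>(BY_SCORE_DESC);
    private final Set<String> known = new HashSet<>(); // for duplicate checks
    private final int maxSize;

    BoundedScoredQueries(int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        this.maxSize = maxSize;
    }

    boolean addQuery(String query, double score) {
        if (known.contains(query)) {
            return false; // already queued
        }
        if (entries.size() >= maxSize) {
            Entry lowest = entries.last();
            if (lowest.score >= score) {
                return false; // not important enough to displace anything
            }
            entries.pollLast(); // drop the least important query
            known.remove(lowest.query);
        }
        entries.add(new Entry(query, score));
        known.add(query);
        return true;
    }

    // Returns the highest-scored query, or null if none is queued.
    String getQuery() {
        Entry best = entries.pollFirst();
        if (best == null) {
            return null;
        }
        known.remove(best.query);
        return best.query;
    }
}
```

Tracking the query strings in a separate set makes the duplicate check independent of the score, so the same query is not queued twice under two different scores.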

Conclusion

In this blog post, I discussed the process through which the PriorityKaizen harvester was introduced to loklak. This strategy is a flavour of the Kaizen harvester that uses a priority queue to store the queries that are to be processed. These changes were possible because of a previous patch that made the Kaizen harvester extensible.

The changes were introduced in pull request loklak/loklak#1240 by @singhpratyush (me).


Fetching URL for Embedded Twitter Videos in loklak server

The primary web service that loklak scrapes is Twitter. Being a news and social networking service, Twitter allows its users to post videos directly, and videos can convey more than text alone. But for an automated scraper, getting the video links is not a simple task.

Let us see what problems we faced with videos and how we solved them in the loklak server project.

Previous setup and embedded videos

In the previous version of loklak server, the TwitterScraper searched for videos in two ways:

  1. YouTube links
  2. HTML5 video links

To fetch the video URL from an HTML5 video, the following snippet was used:

if ((p = input.indexOf("<source video-src")) >= 0 && input.indexOf("type=\"video/") > p) {
   // Read the video-src attribute of the <source> tag on this line
   String video_url = new prop(input, p, "video-src").value;
   videos.add(video_url);
   continue;
}

Here, input is the current line from raw HTML that is being processed and prop is a class defined in loklak that is useful in parsing HTML attributes. So in this way, the HTML5 videos were extracted.

The Problem – Embedded videos

The previous setup had no bugs as such, but it was of little use in practice: Twitter embeds its videos in an iFrame, so they can’t be fetched using simple HTML5 tag extraction.

If we take the following Tweet for example,

the HTML returned from the search page contains the video in the following format:

<iframe src="https://twitter.com/i/videos/tweet/881946694413422593?embed_source=clientlib&player_id=0&rpc_init=1" allowfullscreen="" id="player_tweet_881946694413422593" style="width: 100%; height: 100%; position: absolute; top: 0; left: 0;">

So we needed to come up with a better technique to get those videos.

Parsing video URL from iFrame

The <div> that contains the video is marked with the AdaptiveMedia-videoContainer class. So if a Tweet has an iFrame containing a video, it will also have this class.

Also, the source of the iFrame is of the form https://twitter.com/i/videos/tweet/{Tweet-ID}. So we can now programmatically go to any Tweet’s video and parse it to get results.

Extracting video URL from iFrame source

Now that we have the source of the iFrame, we can get the video source using the following flow:

public final static Pattern videoURL = Pattern.compile("video_url\\\":\\\"(.*?)\\\"");

private static String[] fetchTwitterIframeVideos(String iframeURL) {
   // Read from iframeURL line by line through a BufferedReader br
   String line;
   while ((line = br.readLine()) != null) {
       int index;
       if ((index = line.indexOf("data-config=")) >= 0) {
           // data-config holds HTML-escaped JSON describing the player
           String jsonEscHTML = (new prop(line, index, "data-config")).value;
           String jsonUnescHTML = HtmlEscape.unescapeHtml(jsonEscHTML);
           Matcher m = videoURL.matcher(jsonUnescHTML);
           if (!m.find()) {
               return new String[]{};
           }
           String url = m.group(1);
           url = url.replace("\\/", "/");  // Clean URL
           /*
            * Play with url and return results
            */
       }
   }
   return new String[]{};  // No data-config attribute found
}
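The regular expression step can be tried in isolation. The sketch below applies the same pattern to an already unescaped JSON string and cleans the escaped slashes. The class VideoUrlExtractor, its extract method and the sample URL are made up for illustration, and the real player configuration contains many more fields.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper showing only the regex step of the flow above.
class VideoUrlExtractor {

    // Same pattern as above: captures the value of the "video_url" key
    static final Pattern VIDEO_URL = Pattern.compile("video_url\\\":\\\"(.*?)\\\"");

    static Optional<String> extract(String unescapedConfig) {
        Matcher m = VIDEO_URL.matcher(unescapedConfig);
        if (!m.find()) {
            return Optional.empty();
        }
        // JSON escapes forward slashes as \/ ; turn them back into /
        return Optional.of(m.group(1).replace("\\/", "/"));
    }

    public static void main(String[] args) {
        // Sample input with an invented URL
        String config = "{\"video_url\":\"https:\\/\\/video.twimg.com\\/ext_tw_video\\/example.m3u8\",\"loop\":false}";
        System.out.println(extract(config).orElse("no video found"));
    }
}
```

The lazy quantifier in (.*?) is what stops the capture at the first closing quote instead of running on to the end of the JSON string.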

MP4 and M3U8 URLs

If we encounter an mp4 URL, we’re fine, as it is a direct link to the video. But if we encounter an m3u8 URL, we need to process it further before we can get to the videos.

For Twitter, a hosted m3u8 file contains links to further m3u8 files for different resolutions. These in turn contain links to the .ts files that hold the actual video in parts of 3 seconds each, to support a better streaming experience on the web.

To resolve videos in such a setup, we need to parse the m3u8 files recursively and collect all the .ts videos.

private static String[] extractM3u8(String url) {
   return extractM3u8(url, "https://video.twimg.com/");
}

private static String[] extractM3u8(String url, String baseURL) {
   // Read the playlist at url (resolved against baseURL) line by line through a BufferedReader br
   List<String> links = new ArrayList<>();
   String line;
   while ((line = br.readLine()) != null) {
       if (line.startsWith("#")) {  // Skip comments and tags in m3u8
           continue;
       }
       // Entries may be relative, so resolve them against the base URL
       String currentURL = (new URL(new URL(baseURL), line)).toString();
       if (currentURL.endsWith(".m3u8")) {
           String[] more = extractM3u8(currentURL, baseURL);  // Recursively add all
           Collections.addAll(links, more);
       } else {
           links.add(currentURL);
       }
   }
   return links.toArray(new String[links.size()]);
}
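To experiment with the recursion without touching the network, the fetching step can be passed in as a function. The following is a minimal sketch under that assumption. M3u8Resolver, its resolve method and the fetch parameter are hypothetical, and java.net.URI is used for resolving relative entries.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, network-free sketch of recursive m3u8 resolution.
class M3u8Resolver {

    // fetch maps an absolute playlist URL to the text of that playlist.
    static List<String> resolve(String url, String baseURL, Function<String, String> fetch) {
        List<String> links = new ArrayList<>();
        String playlistURL = URI.create(baseURL).resolve(url).toString();
        for (String line : fetch.apply(playlistURL).split("\\R")) {
            String entry = line.trim();
            if (entry.isEmpty() || entry.startsWith("#")) {
                continue; // skip blank lines, comments and tags
            }
            String currentURL = URI.create(baseURL).resolve(entry).toString();
            if (currentURL.endsWith(".m3u8")) {
                links.addAll(resolve(currentURL, baseURL, fetch)); // nested playlist
            } else {
                links.add(currentURL); // media segment
            }
        }
        return links;
    }

    public static void main(String[] args) {
        // Two invented playlists: a master playlist pointing to one variant
        Map<String, String> playlists = new HashMap<>();
        playlists.put("https://video.twimg.com/v/master.m3u8",
                "#EXTM3U\n#EXT-X-STREAM-INF:BANDWIDTH=256000\n/v/low.m3u8\n");
        playlists.put("https://video.twimg.com/v/low.m3u8",
                "#EXTM3U\n#EXTINF:3.0,\n/v/low/0.ts\n#EXTINF:3.0,\n/v/low/1.ts\n");
        System.out.println(resolve("/v/master.m3u8", "https://video.twimg.com/", playlists::get));
    }
}
```

The sketch assumes well-formed playlists that never reference themselves. A production version would also want a depth limit or a set of visited URLs.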

And then in fetchTwitterIframeVideos, we can return all the .ts URLs for the video:

if (url.endsWith(".mp4")) {
   return new String[]{url};
} else if (url.endsWith(".m3u8")) {
   return extractM3u8(url);
}

Putting things together

Finally, with a small tweak, the TwitterScraper can discover the video links:

if (input.indexOf("AdaptiveMedia-videoContainer") >= 0) {
   // Fetch Tweet ID
   String tweetURL = props.get("tweetstatusurl").value;
   int slashIndex = tweetURL.lastIndexOf('/');
   if (slashIndex < 0) {
       continue;
   }
   String tweetID = tweetURL.substring(slashIndex + 1);
   String iframeURL = "https://twitter.com/i/videos/tweet/" + tweetID;
   String[] videoURLs = fetchTwitterIframeVideos(iframeURL);
   Collections.addAll(videos, videoURLs);
}

Conclusion

This blog post explained the process of extracting video URLs from Twitter and the problems faced along the way. The change discussed here enabled loklak to extract and serve video URLs for Tweets. It was introduced in PR loklak/loklak_server#1193 by me (@singhpratyush).

The service was further enhanced to collect a single mp4 link for videos (see PR loklak/loklak_server#1206), which is discussed in another blog post.
