Versioning of SUSI Skills

This is a concept for the management of the skill repository aka The “SUSI Skill CMS.

With SUSI we are building a personal assistant where the users are able to write and edit skills in the easiest way that we can think of. To do that we have to develop a very simple skill language and a very simple skill editor

The skill editor should be done as a ‘wiki’-like content management system (cms). To create the wiki, we follow an API-centric approach. The SUSI server acts as an API server with a web front-end which acts as a client of the API and provides the user interface.

The skill editor will be ‘expert-centric’, an expert is a set of skills. That means if we edit one text file, that text file represents one expert, it may contain several skills which all belong together.

An ‘expert’ is stored within the following ontology:

model  >  group  >  language  >  expert  >  skill

To Implement the CMS wiki system we need versioning with a working AAA System. To implement versioning we used JGit. JGit is an EDL licensed, lightweight, pure Java library implementing the Git version control system.

So I included a Gradle dependency to add JGit to the SUSI Server project.

compile 'org.eclipse.jgit:org.eclipse.jgit:4.6.1.201703071140-r'

Now the task was to execute git commands when the authorised user makes changes in any of the expert. The possible changes in an expert can be

1. Creating an Expert
2. Modifying an existing Expert
3. Deleting an Expert

1. git add <filename>

2. git commit -m “commit message”

Real Example in SUSI Server

This is the code that every servlet shares. It defines the base user role set a URL endpoint to trigger the endpoint

public class ModifyExpertService extends AbstractAPIHandler implements APIHandler {
    @Override
    public String getAPIPath() {
        return "/cms/modifyExpert.json";
    }

This is the part where we do all the processing of the URL parameters and store their versions. This method takes the “Query call” and then extracts the “get” parameters from it.
For the functioning of this service, we need 5 things, “model”, “group”, “language”, “expert” and the “commit message”.

@Override
public ServiceResponse serviceImpl(Query call, HttpServletResponse response, Authorization rights, final JsonObjectWithDefault permissions) {

    String model_name = call.get("model", "general");
    File model = new File(DAO.model_watch_dir, model_name);
    String group_name = call.get("group", "knowledge");
    File group = new File(model, group_name);
    String language_name = call.get("language", "en");
    File language = new File(group, language_name);
    String expert_name = call.get("expert", null);
    File expert = new File(language, expert_name + ".txt");

Then we need to open your SUSI Skill DATA repository and commit the new file in it. Here we call the functions of JGit, which do the work of git add and git commit.

FileRepositoryBuilder builder = new FileRepositoryBuilder();
Repository repository = null;
try {

    repository = builder.setGitDir((DAO.susi_skill_repo))
            .readEnvironment() // scan environment GIT_* variables
            .findGitDir() // scan up the file system tree
            .build();

    try (Git git = new Git(repository)) {

The code above opens our local git repository and creates an object “git”. Then we perform further operations on “git” object. Now we add our changes to “git”. This is similar to when we run “git add . ”

    git.add()
                .addFilepattern(expert_name)
                .call();

Finally, we commit the changes. This is similar to “git commit -m “message”.

    git.commit()
                .setMessage(commit_message)
                .call();

At last, we return the success object and set the “accepted” value as “true”.

    json.put("accepted", true);
        return new ServiceResponse(json);
    } catch (GitAPIException e) {
        e.printStackTrace();
    }

Resources

JGit documentation : https://eclipse.org/jgit/documentation/

SUSI Server : https://github.com/fossasia/susi_server

 

Servlets and Containers in SUSI Server

The core of SUSI Clients is the SUSI server that holds the “intelligence” and “personality” of SUSI AI. SUSI Server is in JAVA and it uses the concepts of Servlets heavily. While implementing server for SUSI, we have used JAVA as the backend language and hence are using servlets and Servlet containers heavily.

The problem that servlets are solving is regarding how to generate dynamic content on every request if the request comes with different parameters.  This is not possible by simply using HTML.

Servlets are widely used for dynamic content and have inbuilt support for HTTP (Hypertext Transfer Protocol).

In this blog post, along with basics of servlets and servlet containers, I am explaininghow to make classes for custom servlets. I am using an example of AbstractAPIHandler class that we have used in SUSI server for clear picture.

Web Server

Before we can understand a  servlet container, we need to understand what is a web server.

A web server uses HTTP protocol to transfer data. When a user types in a URL in a browser or any client, he first sees loading and then see the content of the page. So actually there is a server that is sending the content of the page to the user. The transfer of data is in HTTP protocol.

Servlet Container

From the example above, a user can request static pages from the server. Hence the pages will be very basic and we cannot do much with user inputs and dynamic data such as making a real time chat bot. So the idea of having Servlet container is using Java to dynamically generate the content of a web page from the server side.  

The flow of Control

The servlet container manages servlets through servlet life cycles. The container calls a servlet as soon as it receives an HTTP request. Then the servlet does the processing of the data which it has received from the request. the container sends the HTTP response back to the client and then the clients render it to show the final output.

The flowchart below explains how a request is processed from a browser to a servlet container and then to a servlet. It shows the usage of a generic Servlet.

Similarly in SUSI-Server we have a “AbstractAPIHandler” which is a generic servlet that every other servlet inherits.

Structure of a Servlet

public class ServletName extends HttpServlet {

public void init() throws ServletException {
// Servlet Initialization
}

protected void service(HttpServletRequest req,
HttpServletResponse resp)
throws ServletException, IOException {
// Code for the Service Method
}
/**
* Process a GET request
*/
protected void doGet(HttpServletRequest request,
HttpServletResponse response)
throws IOException, ServletException {
// Code for the doGet() method
}
/**
* Process a POST request
*/
protected void doPost(HttpServletRequest request,
HttpServletResponse response)
throws IOException, ServletException {
// Code for the doPost() method

}
/**
* Process a PUT request
*/
protected void doPut(HttpServletRequest req,
HttpServletResponse resp)
throws ServletException, IOException {
//Code for the doPut() method

}
}

Abstract API Handler

public abstract class AbstractAPIHandler extends HttpServlet implements APIHandler {
}

The function below sets the Minimum Base Role of the SUSI user. We can set which servlet is accessible by which user.

@Override
public abstract BaseUserRole getMinimalBaseUserRole();

The function below gets the Default Permissions of a user Role in SUSI Server. The permissions for each SUSI user can be different.

@Override
public abstract JSONObject getDefaultPermissions(BaseUserRole baseUserRole);

The function below finally sets whatever output we want to show as a response when we query the SUSI Server.

   public abstract ServiceResponse serviceImpl(Query post,  HttpServletResponse response, Authorization rights, final JsonObjectWithDefault permissions) throws APIException;

Each Servlet in SUSI Server extends this class (AbstractAPIHandler.java). This also increases code reusability.

A Servlet is not a just a simple java class. You cannot run a servlet using javac or by running main() function. In order to run a servlet, you have to deploy this Servlet on a web server.

References

SUSI Server : https://github.com/fossasia/susi_server/

Docs: http://docs.oracle.com/javaee/6/tutorial/doc/bnafd.html

Creating and Maintaining User Sessions Using Universal-Cookies in SUSI Web Chat

If you login to SUSI Web Chat, and come back again after some days, you find that you didn’t have to login and all your previous sent messages are in there in the message pane. To achieve this, SUSI Web Chat uses cookies stored in your browser which is featured in this blog.  

In ReactJS, it’s highly misleading and extensive to use the conventional Javascript methodology of saving tokens and deleting them. However, universal-cookie, a node package allows you to store and get cookies with the least possible confusion. In the following examples, I have made use of the get, set and remove functions of the universal-cookie package which have documentations easily available at this link. The basic problem one finds while setting cookies and maintaining sessions is the time through which it should be valid and to secure the routes. We shall look at the steps below to figure out how was it implemented in SUSI Web Chat.

1. The first step is to install the packages by using the following command in your project’s root-

npm install universal-cookie --save

2. Import the Cookies Class, where-ever you want to create a Cookie object in your repository.

import Cookies from 'universal-cookie';

Create a Cookie object at the same time in the file you want to use it,

const cookies = new Cookies();

3. We make use of the set function of the package first, where we try to set the cookie while the User is trying to login to the account.

Note – The cookie value can be set to any value one wants, however, here I am setting it to the access token which is generated by the server so that I can access it throughout the application.

$.ajax({ options: options,
        success: function (response) {
//Get the response token generated from the server
                let accessToken = response.access_token;                       // store the current state
                 let state = this.state;
// set the time for which the session needs to be valid
            let time = response.valid_seconds;
//set the access token in the state
             state.accessToken = accessToken;
// set the time in the state
             state.time = time;           
// Pass the accessToken and the time through the binded function
             this.handleOnSubmit(accessToken, time);
            }.bind(this),
        error: function ( jqXHR, textStatus, errorThrown) {
                   // Handle errors
                   }
        });

Function –  handleOnSubmit()

// Receive the accessToken and the time for which it needs to be valid
handleOnSubmit = (loggedIn, time) => {
        let state = this.state;
        if (state.success) {
              // set the cookie of with the value of the access token at path ‘/’ and set the time using the parameter ‘maxAge’
            cookies.set('loggedIn', loggedIn, { path: '/', maxAge: time });
// Redirect the user to logged in state and reload
            this.props.history.push('/', { showLogin: false });
            window.location.reload();
        }
        else {
        // Handle errors
    }
}

4.  To access the value set to the cookie, we make use of the get function. To check the logged in state of the User we check if get method is returning a null value or an undefined value, this helps in maintaining the User behaviour at every point in the application.

if(cookies.get('loggedIn')===null||
    cookies.get('loggedIn')===undefined) {
    // Handle User behaviours do not send chat queries with access token if the cookie is null
    url = BASE_URL+'/susi/chat.json?q='+
          createdMessage.text+
          '&language='+locale;
  }
  else{
   //  Send the messages with User’s access token
    url = BASE_URL+'/susi/chat.json?q='
          +createdMessage.text+'&language='
          +locale+'&access_token='
          +cookies.get('loggedIn');
  }

5. To delete the cookies, we make use of the remove function, which deletes that cookie. This function is called while logging the user out of the application.

cookies.remove('loggedIn');
this.props.history.push('/');
window.location.reload();

Here’s the full code in the repository. Feel free to contribute:https://github.com/fossasia/chat.susi.ai

Resources

Using SUSI AI Accounting Object to Write User Settings

SUSI Server uses DAO in which accounting object is stored as JSONTray. SUSI clients are using this accounting object for user settings data. In this blogpost we will focus on how to use accounting JSONTray to write the user settings, so that a client can use such endpoint to store the user related settings in Susi server. The Susi server provides the required API endpoints to its web and mobile clients. Before starting with the implementation of servlet let’s take a look at Accounting.java file, to check how Susi server stores the accounting data.

public class Accounting {

        private JsonTray parent;
        private JSONObject json;
        private UserRequests requests;
        private ClientIdentity identity;
    ...
}

 

The JsonTray is class to hold the volume as <String,JsonObject> pairs as a Json file. The UserRequests  class holds all the user activities. The ClientIdentity class extend the base class Client and provides an Identification String for authentication of users. Now that we have understood about accounting in SUSI server let’s proceed for making an API endpoint to Store Webclient User settings. To make an endpoint we will use the HttpServlet class which provides methods, such as doGet and doPost, for handling HTTP-specific services. We will inherit our ChangeUserSettings class from AbstractAPIHandler yand implement APIhandler interface. In Susi Server the AbsrtactAPI handler extends a HTTPServlet which implements doGet and doPost ,all servlet in SUSI Server extends this class to increase code reusability.  

Since a User has to store its setting, set the minimum base role to access this endpoint to User. Apart from ‘User’ there are Admin and Anonymous roles too.

   @Override
    public BaseUserRole getMinimalBaseUserRole() {
        return BaseUserRole.USER;
    }

Next set the path for using this endpoint, by overriding getAPIPath method().

 @Override
    public String getAPIPath() {
        return "/aaa/changeUserSettings.json";
    }

We won’t be dealing with getdefault permissions so null can be return.

  @Override
    public JSONObject getDefaultPermissions(BaseUserRole baseUserRole) {
        return null;
    }

Next we implement serviceImpl method which takes four parameters the query, response, authorization and default permissions.

@Override
    public ServiceResponse serviceImpl(Query query, HttpServletResponse response, Authorization authorization, JsonObjectWithDefault permissions) throws APIException {
       String key = query.get("key", null);
       String value =query.get("value", null);
       if (key == null || value == null ) {
           throw new APIException(400, "Bad Service call, key or value parameters not provided");
       } else {
           if (authorization.getIdentity() == null) {
               throw new APIException(400, "Specified User Setting not found, ensure you are logged in");
           } else {
               Accounting accounting = DAO.getAccounting(authorization.getIdentity());
               JSONObject jsonObject = new JSONObject();
               jsonObject.put(key, value);
               if (accounting.getJSON().has("settings")) {
                   accounting.getJSON().getJSONObject("settings").put(key, value);
               } else {
                   accounting.getJSON().put("settings", jsonObject);
               }
               JSONObject result = new JSONObject();
               result.put("message", "You successfully changed settings to your account!");
               return new ServiceResponse(result);
           }
       }

    }

We will be storing the setting in Json object using key, value pairs. Take the values from user using query.get(“param”,”default value”) and set the default value to null. So that in case the parameters are not present the servlet can return “Bad service call”. To get the accounting object user identity string given by authorization.getIdentity() method is used. Now check if the same user settings is already present, if yes, overwrite it and if not append a new Json object with received key and value. And return the success message through ServiceResponse method.

Proceed to test the working of endpoint at http://127.0.0.1:4000/aaa/changeUserSettings.json?key=theme&value=dark and see if it’s stored using   http://127.0.0.1:4000/aaa/listUserSettings.json.

You have successfully created an endpoint to store user settings and  enhanced Susi Server, take a look and contribute to Susi Server.

Resources

Implementing Search Feature In SUSI Web Chat

SUSI WebChat now has a search feature. Users now have an option to filter or find messages. The user can enter a keyword or phrase in the search field and all the matched messages are highlighted with the given keyword and the user can then navigate through the results.

Lets visit SUSI WebChat and try it out.

  1. Clicking on the search icon on the top right corner of the chat app screen, we’ll see a search field expand to the left from the search icon.
  2. Type any word or phrase and you see that all the matches are highlighted in yellow and the currently focused message is highlighted in orange
  3. We can use the up and down arrows to navigate between previous and recent messages containing the search string.
  4. We can also choose to search case sensitively using the drop down provided by clicking on the vertical dots icon to the right of the search component.
  5. Click on the `X` icon or the search icon to exit from the search mode. We again see that the search field contracts to the right, back to its initial state as a search icon.

How does the search feature work?

We first make our search component with a search field, navigation arrow icon buttons and exit icon button. We then listen to input changes in our search field using onChange function, and on input change, we collect the search string and iterate through all the existing messages checking if the message contains the search string or not, and if present, we mark that message before passing it to MessageListItem to render the message.

let match = msgText.indexOf(matchString);

  if (match !== -1) {
    msgCopy.mark = {
    matchText: matchString,
    isCaseSensitive: isCaseSensitive
  };

}

We alse need to pass the message ID of the currently focused message to MessageListItem as we need to identify that message to highlight it in orange instead of yellow differentiating between all matches and the current match.

function getMessageListItem(messages, markID) {
  if(markID){
    return messages.map((message) => {
      return (
        <MessageListItem
          key={message.id}
          message={message}
          markID={markID}
        />
      );
    });
  }
}

We also store the indices of the messages marked in the MessageSection Component state which is later used to iterate through the highlighted results.

searchTextChanged = (event) => {

  let matchString = event.target.value;
  let messages = this.state.messages;
  let markingData = searchMsgs(messages, matchString,
    this.state.searchState.caseSensitive);

  if(matchString){

    let searchState = {
      markedMsgs: markingData.allmsgs,
      markedIDs: markingData.markedIDs,
      markedIndices: markingData.markedIndices,
      scrollLimit: markingData.markedIDs.length,
      scrollIndex: 0,
      scrollID: markingData.markedIDs[0],
      caseSensitive: this.state.searchState.caseSensitive,
      open: false,
      searchText: matchString
    };

    this.setState({
      searchState: searchState
    });

  }
}

After marking the matched messages with the search string, we pass the messages array into MessageListItem Component where the messages are processed and rendered. Here, we check if the message being received from MessageSection is marked or not and if marked, we then highlight the message. To highlight all occurrences of the search string in the message text, I used a module called react-text-highlight.

import TextHighlight from 'react-text-highlight';

if(this.props.message.id === markMsgID){
  markedText.push(
    <TextHighlight
      key={key}
      highlight={matchString}
      text={part}
      markTag='em'
      caseSensitive={isCaseSensitive}
    />
  );
}
else{
  markedText.push(
    <TextHighlight
      key={key}
      highlight={matchString}
      text={part}
      caseSensitive={isCaseSensitive}/>
  );
}

Here, we are using the message ID of the currently focused message, sent as props to MessageListItem to identify the currently focused message and highlight it specifically in orange instead of the default yellow color for all other matches.

I used ‘em’ tag to emphasise the currently highlighted message and colored it orange using CSS attributes.

em{
  background-color: orange;
}

We next need to add functionality to navigate through the matched results. The arrow buttons are used to navigate. We stored all the marked messages in the MessageSection state as `markedIDs` and their corresponding indices as `markedIndices`. Using the length of this array, we get the `scrollLimit` i.e we know the bounds to apply while navigating through the search results.

On clicking the up or down arrows, we update the currently highlighted message through `scrollID` and `scrollIndex`, and also check for bounds using `scrollLimit`  in the searchState. Once these are updated, the chat app must automatically scroll to the new currently highlighted message. Since findDOMNode is being deprecated, I used the custom scrollbar to find the node of the currently highlighted message without using findDOMNode. The custom scrollbar was implemented using the module react-custom-scrollbars. Once the node is found, we use the inbuilt HTML DOM method, scrollIntoView()  to automatically scroll to that message.

if(this.state.search){
  if (this.state.searchState.scrollIndex === -1
      || this.state.searchState.scrollIndex === null) {
      this._scrollToBottom();
  }
  else {
    let markedIDs = this.state.searchState.markedIDs;
    let markedIndices = this.state.searchState.markedIndices;
    let limit = this.state.searchState.scrollLimit;
    let ul = this.messageList;

    if (markedIDs && ul && limit > 0) {
      let currentID = markedIndices[this.state.searchState.scrollIndex];
      this.scrollarea.view.childNodes[currentID].scrollIntoView();
    }
  }
}

Let us now see how the search field was animated. I used a CSS transition property along width to get the search field animation to work. This gives the animation when there is a change of width for the search field. I fixed the width to be zero when the search mode is not activated, so only the search icon is displayed. When the search mode is activated i.e the user clicks on the search field, I fixed the width as 125px. Since the width has changed, the increase in width is displayed as an expanding animation due to the CSS transition property.

const animationStyle = {
  transition: 'width 0.75s cubic-bezier(0.000, 0.795, 0.000, 1.000)'
};

const baseStyles = {
  open: { width: 125 },
  closed: { width: 0 },
}

We also have a case sensitive option which is displayed on clicking the rightmost button i.e the three vertical dots button. We can toggle between case sensitive option, whose value is stored in MessageSection searchState and is passed along with the messages to MessageListItem where it is used by react-text-highlight to highlight text accordingly and render the highlighted messages.

This is how the search feature was implemented in SUSI WebChat. You can find the complete code at SUSI WebChat.

Resources

Processing Text Responses in SUSI Web Chat

SUSI Web Chat client now supports emojis, images, links and special characters. However, these aren’t declared as separate action types i.e the server doesn’t explicitly tell the client that the response contains any of the above features when it sends the JSON response. So the client must parse the text response from server and add support for each of the above mentioned features instead of rendering the plain text as is, to ensure good UX.

SUSI Web Chat client parses the text responses to support :

  • HTML Special Entities
  • Images and GIFs
  • URLs and Mail IDs
  • Emojis and Symbols
// Proccess the text for HTML Spl Chars, Images, Links and Emojis

function processText(text){

  if(text){
    let htmlText = entities.decode(text);
    let imgText = imageParse(htmlText);
    let replacedText = parseAndReplace(imgText);

    return <Emojify>{replacedText}</Emojify>;

  };
  return text;
}

Let us write sample skills to test these out. Visit http://dream.susi.ai/ and enter textprocessing.

You can then see few sample queries and responses at http://dream.susi.ai/p/textprocessing.

Lets visit SUSI WebChat and try it out.

Query : dream textprocessing

Response: dreaming enabled for textprocessing

Query : text with special characters

Response:  &para; Here are few “Special Characters&rdquo;!

All the special entities notations have been parsed and rendered accordingly!

Sometimes we might need to use HTML special characters due to reasons like

  • You need to escape HTML special characters like <, &, or .
  • Your keyboard does not support the required character. For example, many keyboards do not have em-dash or the copyright symbol.

You might be wondering why the client needs to handle this separately as it is generally, automatically converted to relevant HTML character while rendering the HTML. SUSI Web Chat client uses reactjs which has JSX and not HTML. So JSX doesn’t support HTML special characters i.e they aren’t automatically converted to relevant characters while rendering. Hence, the client needs to handle this explicitly.

We used the module, html-entities to decode all types of special HTML characters and entities. This module parses the text for HTML entities and replaces them with the relevant character for rendering when used to decode text.

import {AllHtmlEntities} from 'html-entities';
const entities = new AllHtmlEntities();

let htmlText = entities.decode(text);

Now that the HTML entities are processed, the client then processes the text for image links. Let us now look at how images and gifs are handled.

Query : random gif

Response: https://media1.giphy.com/media/AAKZ9onKpXog8/200.gif

Sometimes, the text contains links for images or gifs and the user would be expecting a media type like image or gif instead of text. So we need to replace those image links with actual images to ensure good UX. This is handled using regular expressions to match image type urls and correspondingly replace them with html img tags so that the response is a image and not URL text.

// Parse text for Image URLs

function imageParse(stringWithLinks){

  let replacePattern = new RegExp([
    '((?:https?:\\/\\/)(?:[a-zA-Z]{1}',
    '(?:[\\w-]+\\.)+(?:[\\w]{2,5}))',
    '(?::[\\d]{1,5})?\\/(?:[^\\s/]+\\/)',
    '*(?:[^\\s]+\\.(?:jpe?g|gif|png))',
    '(?:\\?\\w+=\\w+(?:&\\w+=\\w+)*)?)'
  ].join(''),'gim');

  let splits = stringWithLinks.split(replacePattern);

  let result = [];

  splits.forEach((item,key)=>{
    let checkmatch = item.match(replacePattern);

    if(checkmatch){
      result.push(
        <img key={key} src={checkmatch}
        style={{width:'95%',height:'auto'}} alt=''/>)
    }
    else{
      result.push(item);
    }
  });

  return result;
}

The text is split using the regular expression and every matched part is replaced with the corresponding image using the img tag with source as the URL contained in the text.

The client then parses URLs and Mail IDs.

Query: search internet

Response: Internet The global system of interconnected computer networks that use the Internet protocol suite to… https://duckduckgo.com/Internet

The link has been parsed from the response text and has been successfully hyperlinked. Clicking the links opens the respective url in a new window.

We used react-linkify module to parse links and email IDs. The module parses the text and hyperlinks all kinds of URLs and Mail IDs.

import Linkify from 'react-linkify';

export const parseAndReplace = (text) => {return <Linkify properties={{target:"_blank"}}>{text}</Linkify>;}

Finally, let us see, how emojis are parsed.

Query : dream textprocessing

Response: dreaming enabled for textprocessing

Query : susi, do you use emojis?

Response: Ofcourse ^__^ 😎 What about you!? 😉 😛

All the notations for emojis have been parsed and rendered as emojis instead of text!

We used react-emojine module to emojify the text.

import Emojify from 'react-emojione';

<Emojify>{text}</Emojify>;

This is how text is processed to support special characters, images, links and emojis, ensuring a rich user experience. You can find the complete code at SUSI WebChat.

Resources

Integration of SUSI AI to Alexa

An Alexa skill which can be used to ask susi for answers like: “Alexa, ask susi chat who are you” or “Alexa, ask susi chat what is the temperature in berlin”.

If at any point of time, you are unclear about the code in the blog post, you can check the code of the already made SUSI Alexa skill from the susi_alexa_skill repository.

Getting Started : Alexa Susi AI Skill

Follow the instructions below:

Visit the Amazon developer site and Login.

Click Alexa, on the top bar.

Click Alexa skills kit.

Click on add a new skill button on the top right of the page.

We will be at the skill information tab.

 

Write the name of the skill Write the invocation name of the skill i.e. the name that will be used to trigger your skill. Like in our case, if we need to ask anything(as we have ‘susi chat’ as the invocation name), we will ask with “Alexa, ask susi chat” as a prefix to our question sentence.

By clicking next, we will be redirected to the second tab i.e. Interaction model. We need to fill two fields here i.e. intent schema and sample utterances. For intent schema, we need to write all the available intents and the parameters for each of them. Like in our case:

{
 "intents": 
 [
   {
     "slots": [
       {
         "name": "query",
         "type": "AMAZON.LITERAL"
       }
     ],
     "intent": "callSusiApi"
   }
 ]
}

We have a single intent that is “callSusiApi” and the parameter it accepts is “query” of type “AMAZON.LITERAL” (in simple words, a string type). Parameters are termed as slots here. The different types of slots available, can be seen from here.

For sample utterances, we need to tell what utterances by the client will lead to what intent. In our case:

We have just one intent and the whole string uttered by the client should be fed to this intent as a “query” slot(parameter).

Let’s click next now.

We will be shifted to the configuration tab.

We will be making a lambda function, which will hold the code for our Susi skill, further we need to link that code to this skill. To do the linking we need to get the Amazon resource name i.e. ARN and fill it in the field named endpoint:


To get the amazon resource name, in a new tab, visit here. Visit “Lambda” followed by get started button. Click on “Create a lambda function”:

We need to select a blueprint for our lambda function. Select the “blank function” option for that.


Click next.

For configure triggers option, click this box and select “Alexa skills kit” option.


Click next.

In configure function tab, just write the name of the function and its description. Let’s code our lambda function:

// basic syntax that should be available in the lambda function
var https = require('http');
exports.handler = (event, context) => {
  try {
    if (event.session.new) {
      // New Session
      console.log("NEW SESSION")
    }
    switch (event.request.type) {
      case "LaunchRequest":
        // Launch Request
        console.log(`LAUNCH REQUEST`)
        context.succeed(
          generateResponse(
            buildSpeechletResponse("Welcome to a Susi Skill, this is an A.I. chatbot developed by Fossasia open source community. Ask anything to me like temperature at a place or rating of a movie or any other thing which you would like to ask?", false),
            {}
          )
        )
        break;
      case "IntentRequest":
        // Intent Request
        console.log(`INTENT REQUEST`)

        switch(event.request.intent.name) {
          case "callSusiApi":
            console.log(event.request.intent.slots.query.value)
            var endpoint = "http://api.susi.ai/susi/chat.json?q="+event.request.intent.slots.query.value; // ENDPOINT GOES HERE
            var body = ""
            https.get(endpoint, (response) => {
              response.on('data', (chunk) => { body += chunk })
              response.on('end', () => {
                var data = JSON.parse(body)
                // fetching answer from susi
                var viewCount = data.answers[0].actions[0].expression;
                if(viewCount.indexOf('I found this on the web') != -1)
                    viewCount = 'I have no idea about it, sorry.';
                context.succeed(
                  generateResponse(
                    buildSpeechletResponse(`${viewCount}`, false),
                    {}
                  )
                )
              })
            })
            break;

          default:
            throw "Invalid intent"
        }

        break;
      case "SessionEndedRequest":
        // Session Ended Request
        console.log(`SESSION ENDED REQUEST`)
        break;
      default:
        context.fail(`INVALID REQUEST TYPE: ${event.request.type}`)
    }
  } catch(error) { context.fail(`Exception: ${error}`) }
}

// Helpers
buildSpeechletResponse = (outputText, shouldEndSession) => {
  return {
    outputSpeech: {
      type: "PlainText",
      text: outputText
    },
    shouldEndSession: shouldEndSession
  }
}

generateResponse = (speechletResponse, sessionAttributes) => {
  return {
    version: "1.0",
    sessionAttributes: sessionAttributes,
    response: speechletResponse
  }
}

Paste this code into the space given below “lambda function code”. In lambda function handler and role, Click the field named role and select “create a custom role” from the dropdown shown.

You will be redirected to a new page. Select the IAM role as lambda_basic_execution:

Click allow button in the bottom right. We will be redirected back to our previous page. We don’t need to worry about other settings on this page.

Click next.

Again cross-check the details shown and click next.

Now we will have our ARN(Amazon resource name) on the top right of the page.


Copy that and paste it into the field “endpoint” on our previously open browser tab:


Click next.

Great that our SUSI AI skill is ready!

Now we can test it with a sample query, when we get redirected to the test tab:


Also we can test it using our voice through reverb app available on play store or echosim by logging into your amazon account.

Till now, the skill can just be invoked or tested from your own amazon id. To make this skill public , you need to fill the other 2 tabs left that are “publishing information” and “privacy and compliance”.

Some sample strings that we can speak to test it: “Alexa, ask susi chat where are you” “Alexa, ask susi chat tell me a joke” “Alexa, ask susi chat what is a table” (where ‘susi chat’ is the invocation name).

This was the video that helped a lot in making this skill for Alexa. It can be referred too.

Getting user Location in SUSI Android App and using it for various SUSI Skills

Using user location in skills is a very common phenomenon among various personal assistant like Google Assistant, Siri, Cortana etc. SUSI is no different. SUSI has various skills which uses user’s current location to implement skills. Though skills like “restaurant nearby” or “hotels nearby” are still under process but skills like “Where am I” works perfectly indicating SUSI has all basic requirements to create more advance skills in near future.

So let’s learn about how the SUSI Android App gets location of a user and sends it to SUSI Server where it is used to implement various location based skills.

Sources to find user location in an Android app

There are three sources from which android app gets users location :

  1. GPS
  2. Network
  3. Public IP Address

All three of these have various advantages and disadvantages. The SUSI Android app uses cleverly each of them to always get user location so that it can be used anytime and anywhere.

Some factors for comparison of these three sources :

Factors GPS Network IP Address
Source Satellites Wifi/Cell Tower Public IP address of user’s mobile
Accuracy Most Accurate (20ft) Moderately Accurate (200ft) Least Accurate (5000+ ft)
Requirements GPS in mobile Wifi or sim card Internet connection
Time taken to give location Takes long time to get location Fastest way to get location Fast enough (depends on internet speed)
Battery Consumption High Medium Low
Permission Required User permission required User permission required No permission required
Location Factor Works in outdoors. Does not work near tall buildings Works everywhere Works everywhere

Implementation of location finding feature in SUSI Android App

SUSI Android app very cleverly uses all the advantages of each location finding source to get most accurate location, consume less power and find location in any scenario.

The /susi/chat.json endpoint of SUSI API requires following 7 parameters :

Sno. Parameter Type Requirement
1 q String Compulsory
2 timezoneOffset int Optional
3 longitude double Optional
4 latitude double Optional
5 geosource String Optional
6 language Language  code Optional
7 access_token String Optional

In this blog we will be talking about latitude , longitude and geosource. So, we need these three things to pass as parameters for location related skills. Let’s see how we do that.

Finding location using IP Address: At the starting of app, user location is found by making an API call to ipinfo.io/json . This results in following JSON response having a field “loc” giving location of user (latitude and longitude.

{
  "ip": "YOUR_IP_ADDRESS",
  "city": "YOUR_CITY",
  "region": "YOUR_REGION",
  "country": "YOUR_COUNTRY_CODE",
  "loc": "YOUR_LATITUDE,YOUR_LONGITUDE",
  "org": "YOUR_ISP"
}

By this way we got latitude, longitude and geosource will be “ip” . We find location using IP address only once the app is started because there is no need of finding it again and again as making network calls takes time and drains battery.

So, now we have user’s location but this is not accurate. So, we will now proceed to check if we can find location using network is more accurate than location using IP address.

Finding location using Network Service Provider : To actually use the network provider and find out location requires ACCESS_COARSE_LOCATION permission from user which can be asked during the run time. Also, the location can only be found out using this if user has his location setting is enabled. So, we check following condition.

if (ActivityCompat.checkSelfPermission(mContext, Manifest.permission.ACCESS_COARSE_LOCATION) == PackageManager.PERMISSION_GRANTED) {
}

If permission is granted by user to find location using network provider, we use following code snippet to find location. It updates location of user after every 5 minutes or 10 meters (whichever is achieved first).

locationManager.requestLocationUpdates(LocationManager.NETWORK_PROVIDER, 5 * 60 * 1000, 10, this);
location = locationManager.getLastKnownLocation(LocationManager.NETWORK_PROVIDER);
if (location != null) {
   source = "network";
   canGetLocation = true;
   latitude = location.getLatitude();
   longitude = location.getLongitude();
}

So, whenever we are about to send query to SUSI Server, we take location from Network services, thus updating previous values of latitude, longitude and geosource (found using IP address) with the new values (found using Network Provider), provided the user has granted permission. So, we now have location is from Network Provider which is more accurate than location from IP address. Now we will check if we can find location from GPS or not.

Finding location using GPS Service Provider : Finding location from GPS Provider is almost same as Network Provider. To find location using GPS Provider user must give  ACCESS_FINE_LOCATION permission. We just check if GPS of user is enabled and user has given permission to use GPS and also if GPS can actually provide location. Sometimes, GPS can not provide location because user is indoor. In that cases we leave location from GPS.

So, now if we update previous values of latitude, longitude and geosource (found using Network Provider) with the new values (found using GPS Provider) and send query to SUSI Server.

Summary

To send location to server for location skills, we need latitude, longitude and geosource. We first find these 3 things using IP address (no that accurate). So, geosource will be “ip” for now. Then check if we can find values using network provider. If yes, we update those 3 values with the ones got from network Provider (more accurate). Geosource will change to “network”. Finally, we check if we can find values using GPS provider. If yes, we update those 3 values with the ones got from GPS Provider (most accurate). Geosource will change to “gps”. So, by this way we can find location of user in any circumstance possible. If you want to use location in your app too. Just follow the above steps and you are good to go.

Resources

 

Understanding the working of SUSI Hardware

Susi on Hardware is the latest addition to full suite of SUSI Apps. Being a hardware project, one might feel like it is too much complex, however it is not the case.

The solution is being primary built on a Raspberry Pi which, however small it may be, is a computer. Most things you expect to work on a normal computer, work on Raspberry Pi as well with a few advantages being its small size and General Purpose I/O access. But it comes with caveats of an ARM CPU, which may not support all applications which are mainly targeted for x86.
There are a few other development boards from Intel as well, which use x86/x64 architecture.

While working on the project, I did not wanted to make it too generic for a board or set of Hardware, thus all components used were targeted to be cross-platform.

Components that make Susi Hardware

SUSI Server

SUSI Server is the foremost important thing in any SUSI Project. SUSI Server handles all the queries by user which can be supplied using REST API and supplies answer in a nice format for clients. It also provides AAA: Authentication, Authorization and Accounting support for managing user accounts across platforms.

Github Repository: https://github.com/fossasia/susi_server

Susi Python Library

Susi Python Library was developed along with Susi Hardware project. It can work independent of Hardware Project and can be included in any Python Project for Susi Intelligence. It provides easy access to Susi Server REST API through easy python methods.
Github Repository: https://github.com/fossasia/susi_api_wrapper

Python Speech Recognition Library

The best advantage of using Python is that in most cases , you do not need to re-invent the wheel, some already has done the work for you. Python Speech Recognition library support for speech recognition through microphone and by a voice sample. It supports a number of Speech API providers like Google Speech API. Wit.AI, IBM Watson Speech-To-Text and a lot more.
This provides free to choose any of the speech recognition providers. For now, we are using Google Speech API and IBM Watson Speech API.


Pypi Package: https://pypi.python.org/pypi/SpeechRecognition/Github Repository: https://github.com/Uberi/speech_recognition

PocketSphinx for Hotword Detection

CMU PocketSphinx is an open-source offline speech recognition library. We have used PocketSphinx to enable hotword detection to Susi Hardware so you can interact with Susi handsfree.
More information on its working can be found in my other blog post.

Github Repository: https://github.com/cmusphinx/pocketsphinx

Flite Speech Synthesis System

CMU Flite (Festival-Lite) is a small sized , fast and open source speech synthesis engine developed by Carnegie Mellon University .
More information of integration and usage in Susi can be found in my other blog post

Project Website: http://www.festvox.org/flite/

The whole working of all these components together can be explained using the Diagram below.

Hotword Detection for SUSI Android with CMUsphinx

Being an AI for conversational bots, Hotword detection of SUSI is the top priority to the community. Another requirement was that there should be an option for an offline hotword detection. So, I was searching for an API that has all these capabilities. Sphinx by CMU was the obvious choice. It provides robust mechanism for hotword detection.

What is CMUsphinx?

CMUsphinx is open source and leading speech recognition toolkit. CMUsphinx has different modules for different tasks it needs to perform. Our requirement for SUSI is, that is needs to be lightweight, So we are using Pocketsphinx. Before going into integration let us discuss about basics of speech recognition.

Let us dive into coding and integrating Susi with pocketsphinx.

Building Pocketsphinx .AAR file

Git clone the sphinxbase, pocketsphinx and pocketsphinx-android and put them in the same folder. By following commands below.

git clone http://github.com/cmupshinx/sphinxbase
git clone http://github.com/cmupshinx/pocketsphinx
git clone http://github.com/cmupshinx/pocketsphinx-android

Then import pocketsphinx Android into Android studio. Run the project. .aar files pocketsphinx-android-5prealpha-debug.aar & pocketsphinx-android-5prealpha-release.aar  will be created in the build/outputs/aar.

Integrating Susi with Pocketsphinx

In Android Studio you need to the above generated AAR into your project. Just go to File > New > New module and choose Import .JAR/.AAR Package. After this, We need to change permissions of project. Add the following permissions in AndroidManifest.xml.

<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />

Import the following functions into your main activity.

import edu.cmu.pocketsphinx.Assets;
import edu.cmu.pocketsphinx.Hypothesis;
import edu.cmu.pocketsphinx.RecognitionListener;
import edu.cmu.pocketsphinx.SpeechRecognizer;
import edu.cmu.pocketsphinx.SpeechRecognizerSetup;

Next we need to sync the assets we get from .aar file in to our project. Edit app/build.gradle build file to run assets.xml. We do it by adding following code to build.gradle.

ant.importBuild 'assets.xml'
preBuild.dependsOn(list, checksum)
clean.dependsOn(clean_assets)

Now all the import and sync errors of gradle must disappear and you should be good to go. You can start your recognizer by adding this code to your activity.

recognizer = defaultSetup()
        .setAcousticModel(new File(assetsDir, "en-us-ptm"))
        .setDictionary(new File(assetsDir, 
"cmudict-en-us.dict"))
        .getRecognizer();
recognizer.addListener(this);

Decoder model is lengthy process that contains many operations, so it’s recommended to run in inside async task. These are commands for decoder to run. These commands essentially do acoustic and language modelling of speech.

// Create keyword-activation search.
recognizer.addKeyphraseSearch(KWS_SEARCH, KEYPHRASE);

// Create grammar-based searches.
File menuGrammar = new File(assetsDir, "menu.gram");
recognizer.addGrammarSearch(MENU_SEARCH, menuGrammar);

// Next search for digits
File digitsGrammar = new File(assetsDir, "digits.gram");
recognizer.addGrammarSearch(DIGITS_SEARCH, digitsGrammar);

// Create language model search.
File languageModel = new File(assetsDir, "weather.dmp");
recognizer.addNgramSearch(FORECAST_SEARCH, languageModel);

Speech recognition will end at onEndOfSpeech callback of the recognizer listener.  We can call recognizer.stop or recognizer.cancel(). Cancel will cancel the recognition, stop will cause the final result be passed you in onResult callback. During the recognition, you will get partial results in onPartialResult callback.

Now we have integrated Pocketsphinx with SUSI.AI in Android.