Connecting SUSI iOS App to SUSI Smart Speaker

SUSI Smart Speaker is an Open Source speaker with many exciting features. The user needs an Android or iOS device to set up the speaker. You can refer to this post for the initial connection to SUSI Smart Speaker. In this post, we will see how a user can connect SUSI Smart Speaker to iOS devices (iPhone/iPad).

Implementation –

The first step is to detect whether the iOS device is connected to the SUSI.AI hotspot. For this, we compare the SSID of the currently connected Wi-Fi network with the SUSI.AI hotspot SSID. If they match, we show the connected device in Device Activity so that the user can proceed with the setup.

Choosing Room –

The room name is the location of your SUSI Smart Speaker in the home. You may have multiple SUSI Smart Speakers in different rooms, so the purpose of adding the room is to differentiate between them.

When the user taps the displayed Wi-Fi cell, the initial setup starts. We use the didSelectRowAt method of UITableViewDelegate to find out which cell is selected. Tapping the displayed Wi-Fi cell opens a popup with a Room Location text field.

override func tableView(_ tableView: UITableView, didSelectRowAt indexPath: IndexPath) {
    if indexPath.row == 0, let speakerSSID = fetchSSIDInfo(), speakerSSID == ControllerConstants.DeviceActivity.susiSSID {
        // Open a popup to select Rooms
        presentRoomsPopup()
    }
}

When the user clicks the Next button, we send the speaker room location to the local server of the speaker by the following API endpoint with room name as a parameter:

http://10.0.0.1:5000/speaker_config/

Refer to this post for more detail about how choosing a room works and how it is implemented in SUSI iOS.

Sharing Wi-Fi Credentials –

On successfully choosing the room, we present a popup that asks the user to enter the credentials of the previously connected Wi-Fi network, so that we can connect the Smart Speaker to a Wi-Fi network that provides the internet connection needed to play music and run commands on the speaker.

We present a popup with a text field for entering the Wi-Fi password.

When the user clicks the Next button, we share the Wi-Fi credentials with the speaker through the following API endpoint:

http://10.0.0.1:5000/wifi_credentials/

With the following params-

  1. Wifissid – Connected Wi-Fi SSID
  2. Wifipassd – Connected Wi-Fi password

With this API endpoint, we share the Wi-Fi SSID and password with the Smart Speaker. If the speaker accepts the credentials, we present a popup for the user’s SUSI account password; otherwise, we present the Enter Wifi Credentials popup again.

Client.sharedInstance.sendWifiCredentials(params) { (success, message) in
    DispatchQueue.main.async {
        self.alertController.dismiss(animated: true, completion: nil)
        if success {
            self.presentUserPasswordPopup()
        } else {
            self.view.makeToast("", point: self.view.center, title: message, image: nil, completion: { didTap in
                UIApplication.shared.endIgnoringInteractionEvents()
                self.presentWifiCredentialsPopup()
            })
        }
    }
}

 

Sharing SUSI Account Credentials –

In the method above, we have seen that when SUSI Smart Speaker accepts the Wi-Fi credentials, we proceed with the SUSI account credentials. We open a popup to enter the user’s SUSI account password:

When the user clicks the Next button, we use the following API endpoint to share the user’s SUSI account credentials with SUSI Smart Speaker:

http://10.0.0.1:5000/auth/

With the following params-

  1. email
  2. password

The user’s email is already saved on the device, so the user doesn’t have to type it again. If the speaker accepts the user credentials, we proceed with the configuration process; otherwise, we open the Enter Password popup again.

Client.sharedInstance.sendAuthCredentials(params) { (success, message) in
    DispatchQueue.main.async {
        self.alertController.dismiss(animated: true, completion: nil)
        if success {
            self.setConfiguration()
        } else {
            self.view.makeToast("", point: self.view.center, title: message, image: nil, completion: { didTap in
                UIApplication.shared.endIgnoringInteractionEvents()
                self.presentUserPasswordPopup()
            })
        }
    }
}

 

Setting Configuration –

After successfully sharing the SUSI account credentials, the following API endpoint is used for setting the configuration.

http://10.0.0.1:5000/config/

With the following params-

  1. stt
  2. tts
  3. hotword
  4. wake

The success of this API call establishes the connection between the user’s iOS device and SUSI Smart Speaker.

Client.sharedInstance.setConfiguration(params) { (success, message) in
    DispatchQueue.main.async {
        if success {
            // Successfully Configured
            self.isSetupDone = true
            self.view.makeToast(ControllerConstants.DeviceActivity.doneSetupDetailText)
        } else {
            self.view.makeToast("", point: self.view.center, title: message, image: nil, completion: { didTap in
                UIApplication.shared.endIgnoringInteractionEvents()
            })
        }
    }
}

After successful connection-

 

Resources –

  1. Apple’s Documentation of tableView(_:didSelectRowAt:) API
  2. Initial Setups for Connecting SUSI Smart Speaker with iPhone/iPad
  3. SUSI Linux Link: https://github.com/fossasia/susi_linux
  4. Adding Option to Choose Room for SUSI Smart Speaker in iOS App

Skill Development using SUSI Skill CMS

There are a lot of personal assistants around like Google Assistant, Apple’s Siri, Windows’ Cortana, Amazon’s Alexa, etc. What is then special about SUSI.AI which makes it stand apart from all the different assistants in the world? SUSI is different as it gives users the ability to create their own skills in a Wiki-like system. You don’t need to be a developer to be able to enhance SUSI. And, SUSI is an Open Source personal assistant which can do a lot of incredible stuff for you, made by you.

Let’s say you want to create your own Skill and add it to the existing SUSI Skills. These are the steps you need to follow –

  1. The current SUSI Skill Development Environment is based on an Etherpad. An Etherpad is a web-based collaborative real-time editor. https://dream.susi.ai/ is one such Etherpad. Open https://dream.susi.ai/ and name your dream (in lowercase letters).
  2. Define your skill in the Etherpad. The general skill format is

::name <Skill_name>
::author <author_name>
::author_url <author_url>
::description <description> 
::dynamic_content <Yes/No>
::developer_privacy_policy <link>
::image <image_url>
::term_of_use <link>

#Intent
User query1|query2|query3....
Answer answer1|answer2|answer3...

 

Patterns in query can be learned easily via this tutorial.

  3. Open any SUSI Client and then write dream <your dream name> so that dreaming is enabled for SUSI. Once dreaming is enabled, you can test any skills which you’ve made in your Etherpad.
  4. Once you’ve tested your skill, write ‘stop dreaming’ to disable dreaming for SUSI.
  5. If the testing was successful and you want your skill to be added to SUSI Skills, send a Pull Request to the susi_skill_data repository providing your dream name.

How do you modify an existing skill?

SUSI Skill CMS is a web interface where you can modify the skills you’ve made. All the skills of SUSI are directly in sync with the susi_skill_data.

To edit any skill, you need to follow these steps –

  1. Login to SUSI Skill CMS website using your email and password (or Sign Up to the website if you haven’t already).
  2. Click on the skill which you want to edit and then click on the “edit” icon.
  3. You can edit all aspects of the skill in the next state. Below is a preview:

Make the changes and then click on “SAVE” button to save the skill.

What’s happening Behind The Scenes in the EDIT process?

  • SkillEditor.js is the file which is responsible for keeping a check over various validations in the Skill Editing process. There are certain validations that need to be made in the process. Those are as follows –
  • Check whether User has logged in or not

if (!cookies.get('loggedIn')) {
            notification.open({
                message: 'Not logged In',
                description: 'Please login and then try to create/edit a skill',
                icon: <Icon type='close-circle' style={{ color: '#f44336' }} />,
            });
            this.setState({
                loading: false
            });
            return 0;
        }

 

  • Check whether Commit Message has been entered by User or not

if (this.state.commitMessage === null) {
            notification.open({
                message: 'Please add a commit message',
                icon: <Icon type='close-circle' style={{ color: '#f44336' }} />,
            });

            this.setState({
                loading: false
            });
            return 0;
        }

 

  • Check to ensure that request is sent only if there are some differences in old values and new values

if (this.state.oldGroupValue === this.state.groupValue &&
          this.state.oldExpertValue === this.state.expertValue &&
          this.state.oldLanguageValue === this.state.languageValue &&
          !this.state.codeChanged && !this.state.image_name_changed) {
            notification.open({
                message: 'Please make some changes to save the Skill',
                icon: <Icon type='close-circle' style={{ color: '#f44336' }} />,
            });
            self.setState({
                loading: false
            });
            return 0;
        }

 

  • After doing the above validations, a request is sent to the Server and the User is shown a notification accordingly, whether the Skill has been uploaded to the Server or there has been some error.

$.ajax(settings)
            .done((response) => {
                // An arrow function keeps `this` bound to the component
                this.setState({
                    loading: false
                });
                let data = JSON.parse(response);
                if (data.accepted === true) {
                    notification.open({
                        message: 'Accepted',
                        description: 'Your Skill has been uploaded to the server',
                    });
                }
                else {
                    notification.open({
                        message: 'Error Processing your Request',
                        description: String(data.message),
                    });
                }
            });

 

  • If the User is notified with a Success notification, then to verify whether the Skill has been added, the User can go to the susi_skill_data repo and check whether there is a recent commit for it.
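The three checks listed above can be collected into one pure function, which makes them easy to test without rendering the component. This is only a sketch: validateSkillEdit is a hypothetical helper, not part of SkillEditor.js, and it assumes the login status is passed in as a boolean instead of being read from cookies.

```javascript
// Hypothetical helper: returns an error message for the first failing check,
// or null when the edit request may be sent to the server.
function validateSkillEdit(state, isLoggedIn) {
  // 1. The user must be logged in
  if (!isLoggedIn) {
    return 'Not logged In';
  }
  // 2. A commit message must have been entered
  if (state.commitMessage === null) {
    return 'Please add a commit message';
  }
  // 3. Something must differ between the old and the new values
  const unchanged =
    state.oldGroupValue === state.groupValue &&
    state.oldExpertValue === state.expertValue &&
    state.oldLanguageValue === state.languageValue &&
    !state.codeChanged &&
    !state.image_name_changed;
  if (unchanged) {
    return 'Please make some changes to save the Skill';
  }
  return null;
}
```

A caller could show the returned message in a notification and stop, or go on to the ajax request when the result is null.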


Implementing Version Control System for SUSI Skill CMS

SUSI Skill CMS now has a version control system where users can browse through all the previous revisions of a skill and roll back to a selected version. Users can modify existing skills and push the changes. So a skill could have been edited many times by the same or different users and so have many revisions. The version control functionalities help users to :

  • Browse through all the revisions of a selected skill
  • View the content of a selected revision
  • Compare any two selected revisions highlighting the changes
  • Edit and roll back to a selected revision

Let us visit SUSI Skill CMS and try it out.

  1. Select a skill
  2. Click on versions button
  3. A table populated with previous revisions is displayed

  4. Clicking on a single revision opens the content of that version
  5. Selecting 2 versions and clicking on compare selected versions loads the content of the 2 selected revisions and shows the differences between the two.
  6. Clicking on Undo loads the selected revision and the latest version of that skill, highlights the differences, and loads an editor with the code of the selected revision so that you can make changes and save to roll back.

How was this implemented?

Firstly, to get the previous revisions of a selected skill, we need the skill’s meta data, including model, group, language and skill name, which is used to make an ajax call to the server using the endpoint :

http://api.susi.ai/cms/getSkillHistory.json?model=MODEL&group=GROUP&language=LANGUAGE&skill=SKILL_NAME

We create a new component SkillVersion and pass the skill meta data in the pathname while accessing that component. The path where SkillVersion component is loaded is /:category/:skill/versions/:lang . We parse this data from the path and set our state with skill meta data. In componentDidMount we use this data to make the ajax call to the server to get all previous version data and update our state. A sample response from getSkillHistory endpoint looks like :

{
  "session": {
    "identity": {
      "type": "",
      "name": "",
      "anonymous":
    }
  },
  "commits": [
    {
      "commitRev": "",
      "author_mail": "AUTHOR_MAIL_ID",
      "author": "AUTHOR_NAME",
      "commitID": "COMMIT_ID",
      "commit_message": "COMMIT_MESSAGE",
      "commitName": "COMMIT_NAME",
      "commitDate": "COMMIT_DATE"
    },
  ],
  "accepted": TRUE/FALSE
}
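As a rough sketch of the step described above, the following shows how the skill meta data could be read from a path of the form /:category/:skill/versions/:lang and turned into a getSkillHistory URL. Both function names are hypothetical, and passing the model in separately is an assumption of this sketch, since the path shown above does not carry it.

```javascript
// Hypothetical helper: extracts skill meta data from a path shaped like
// /:category/:skill/versions/:lang. Returns null for any other shape.
function parseVersionsPath(pathname) {
  const [category, skill, marker, lang] = pathname.split('/').filter(Boolean);
  if (marker !== 'versions' || !lang) {
    return null;
  }
  return { group: category, skill: skill, language: lang };
}

// Hypothetical helper: builds the getSkillHistory URL from the meta data.
// The model is not part of the path, so it is passed in (an assumption).
function buildSkillHistoryUrl(meta, model) {
  const params = new URLSearchParams({
    model: model,
    group: meta.group,
    language: meta.language,
    skill: meta.skill,
  });
  return 'http://api.susi.ai/cms/getSkillHistory.json?' + params.toString();
}
```

URLSearchParams takes care of escaping, so skill names containing spaces or other special characters do not break the query string.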

We now populate the table with the obtained revision history. We used Material UI Table for tabulating the data. The first 2 columns of the table have radio buttons to select any 2 revisions. The left side radio buttons are for selecting the older versions and the right side radio buttons to select the more recent versions. We keep track of the selected versions through onCheck function of the radio buttons and updating state accordingly.

if(side === 'right'){
  if(!(index >= currLeft)){
    rightChecks.fill(false);
    rightChecks[index] = true;
    currRight = index;
  }
}
else if(side === 'left'){
  if(!(index <= currRight)){
    leftChecks.fill(false);
    leftChecks[index] = true;
    currLeft = index;
  }
}
this.setState({
  currLeftChecked: currLeft,
  currRightChecked: currRight,
  leftChecks: leftChecks,
  rightChecks: rightChecks,
});

Once 2 versions are selected and we click on the compare selected versions button, we get the currently selected versions from the getCheckedCommits function and are redirected to /:category/:skill/compare/:lang/:oldid/:recentid, where we pass the commitIDs of the 2 selected revisions in the URL.

{(this.state.commitsChecked.length === 2) &&
<Link to={{
  pathname: '/'+this.state.skillMeta.groupValue+
            '/'+this.state.skillMeta.skillName+
            '/compare/'+this.state.skillMeta.languageValue+
            '/'+checkedCommits[0].commitID+
            '/'+checkedCommits[1].commitID,
}}>
  <RaisedButton
    label='Compare Selected Versions'
    backgroundColor='#4285f4'
    labelColor='#fff'
    style={compareBtnStyle}
  />
</Link>
}

SkillHistory Component is now loaded and the commitIDs of the 2 selected revisions are parsed from the URL pathname. Once we have the commitIDs, we make ajax calls to the server to get the code for each commit. The skill meta data, which is required to make the server call to getFileAtCommitID, is also parsed from the URL path.

http://api.susi.ai/cms/getFileAtCommitID.json?model=MODEL&group=GROUP&language=LANGUAGE&skill=SKILL_NAME&commitID=COMMIT_ID

We make the ajax calls in componentDidMount and update the state with the received data. A sample response from getFileAtCommitID looks like :

{
  "accepted": TRUE/FALSE,
  "file": "CONTENT",
  "session": {
    "identity": {
       "type": "",
       "name": "",
       "anonymous":
    }
  }
}

We populate the code of each revision in an editor. We used react-ace as our editor component where we use the value prop to populate the content and display it in read-only mode.

<AceEditor
  mode='java'
  readOnly={true}
  theme={this.state.editorTheme}
  width='100%'
  fontSize={this.state.fontSizeCode}
  height= '400px'
  value={this.state.commitData[0].code}
  showPrintMargin={false}
  name='skill_code_editor'
  editorProps={{$blockScrolling: true}}
/>

We then show the differences between the 2 selected versions content. To compare and highlight the differences, we used react-diff package which takes in the content of both the commits as inputA and inputB props and we compare character by character using the type chars prop. Here input A is compared with input B. The component compares and returns the highlighted element which we display in a scrollable div preventing overflows.

{/* latest code should be inputB */}
<Diff
  inputA={this.state.commitData[0].code}
  inputB={this.state.commitData[1].code}
  type='chars'
/>

Clicking on Undo redirects to /:category/:skill/edit/:lang/:latestid/:revertid, where latestid is the commitID of the latest revision and revertid is the commitID of the older of the 2 commits selected initially. This loads the SkillRollBack component, where we again parse the skill meta data and the commit IDs from the URL pathname and call getFileAtCommitID to get the content of the latest and the reverting commits. We populate the content in editors using react-ace and show the differences using react-diff. Finally, we load the modify skill component with an editor preloaded with the content of the reverting commit, so that the user can edit that content and push the changes.

let baseUrl = this.getSkillAtCommitIDUrl() ;
let self = this;
var url1 = baseUrl + self.state.latestCommit;
$.ajax({
  url: url1,
  jsonpCallback: 'pc',
  dataType: 'jsonp',
  jsonp: 'callback',
  crossDomain: true,
  success: function (data1) {
    var url2 = baseUrl + self.state.revertingCommit;
    $.ajax({
      url: url2,
      jsonpCallback: 'pd',
      dataType: 'jsonp',
      jsonp: 'callback',
      crossDomain: true,
      success: function (data2) {
        self.updateData([{
        code:data1.file,
        commitID:self.state.latestCommit,
      },{
        code:data2.file,
        commitID:self.state.revertingCommit,
      }])
      }
    });
  }
});

Here, we nest the ajax calls so that they run in sequence, and we update the state only after we have received data from both calls. If we made the ajax calls in a loop instead, the second call would not wait for the first one to finish and would most likely fail.
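The same ordering can be written without nesting by using async/await. This is a sketch only, not the code used in SUSI Skill CMS: fetchFile is a hypothetical function that takes a commit ID and returns a Promise resolving to an object with a file field, standing in for the JSONP calls above.

```javascript
// Loads the latest and the reverting commit one after the other.
// fetchFile is a hypothetical, promise-returning stand-in for the ajax call.
async function loadCommitPair(fetchFile, latestCommit, revertingCommit) {
  // First request
  const latest = await fetchFile(latestCommit);
  // Second request starts only after the first one has resolved
  const reverting = await fetchFile(revertingCommit);
  return [
    { code: latest.file, commitID: latestCommit },
    { code: reverting.file, commitID: revertingCommit },
  ];
}
```

The returned array has the same shape as the one passed to updateData above, so a caller could hand the result straight to it.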

This is how the skill version system was implemented in SUSI Skill CMS. You can find the complete code at SUSI Skill CMS Repository. Feel free to contribute.


Fetching Images for RSS Responses in SUSI Web Chat

Initially, SUSI Web Chat rendered RSS action type responses like this:

The response from the server initially only contained

  • Title
  • Description
  • Link

We needed to improve the display of web search & RSS results and also add images for the results.

The web search & RSS results are now rendered as :

How was this implemented?

SUSI AI uses Yacy to fetch RSS feeds. First, the console process that the server uses to return the RSS feeds from Yacy needs to be configured to return images too.

"yacy":{
  "example":"http://127.0.0.1:4000/susi/console.json?q=%22SELECT%20title,%20link%20FROM%20yacy%20WHERE%20query=%27java%27;%22",
  "url":"http://yacy.searchlab.eu/solr/select?wt=yjson&q=",
  "test":"java",
  "parser":"json",
  "path":"$.channels[0].items",
  "license":""
}

In a console process, we provide the URL needed to fetch data from, the query parameter needed to be passed to the URL and the path to look for the answer in the API response.

  • url = <url>   – the URL to the remote JSON service which will be used to retrieve information. It must contain a $query$ string.
  • test = <parameter> – the parameter that will replace the $query$ string inside the given URL. It is required to test the service.

Here the URL used is :

http://yacy.searchlab.eu/solr/select?wt=yjson&q=QUERY
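To make the URL handling concrete, here is a small sketch of how a query could be combined with a console process URL. It is an illustration, not the server's code: buildConsoleUrl is a hypothetical helper, and it assumes that a $query$ placeholder is replaced when present and that the query is appended otherwise, as in the Yacy URL above.

```javascript
// Hypothetical helper: combines a console process URL with a query.
// Assumption: a $query$ placeholder is replaced if present; otherwise the
// query is appended to the end of the URL.
function buildConsoleUrl(url, query) {
  const encoded = encodeURIComponent(query);
  if (url.includes('$query$')) {
    // split/join avoids the special meaning of "$" in String.replace
    return url.split('$query$').join(encoded);
  }
  return url + encoded;
}
```

For example, calling it with the Yacy URL above and the test parameter java gives a URL ending in q=java.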

To include images in RSS action responses, we need to parse the images also from the Yacy response. For this, we need to add `image` in the selection rule while calling the console process

"process":[
  {
    "type":"console",
    "expression":"SELECT title,description,link,image FROM yacy WHERE query='$1$';"
  }
]

Now the response from the server for RSS action type will also include `image` along with title, description, and link. An example response for the query `Google` :

{
  "title": "Terms of Service | Google Analytics \u2013 Google",
  "description": "Read Google Analytics terms of service.",
  "link": "http://www.google.com/analytics/terms/",
  "image": "https://www.google.com/images/branding/googlelogo/1x/googlelogo_color_116x41dp.png"
}

However, the results sometimes do not contain images because there are none stored in the index. This may happen if the result comes from p2p transmission within Yacy, where no images are transmitted. In cases where images are not returned by the server, we use the link preview service to preview the link and fetch the image.

The endpoint for previewing the link is :

BASE_URL+'/susi/linkPreview.json?url=URL'

On the client side, we first search the response for data objects that already have images in the API actions. Then, amongst the remaining data objects in answers[0].data, we preview the link to fetch an image, keeping a check on the count. This needs to be performed when processing the history cognitions too. To preview the remaining links, we cannot make ajax calls directly in a loop. Instead, nested ajax calls are made using the function previewURLForImage(): on success we decrement the count and call previewURLForImage() on the next link, and on error we try previewURLForImage() on the next link without decrementing the count.

success: function (rssResponse) {
  if(rssResponse.accepted){
    respData.image = rssResponse.image;
    respData.descriptionShort = rssResponse.descriptionShort;
    receivedMessage.rssResults.push(respData);
  }
  if(receivedMessage.rssResults.length === count ||
    j === remainingDataIndices.length - 1){
    let message = ChatMessageUtils.getSUSIMessageData(receivedMessage, currentThreadID);
    ChatAppDispatcher.dispatch({
      type: ActionTypes.CREATE_SUSI_MESSAGE,
      message
    });
  }
  else{
    j+=1;
    previewURLForImage(receivedMessage,currentThreadID,
BASE_URL,data,count,remainingDataIndices,j);
  }
},
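The stopping rule of the nested calls can also be sketched with async/await. This is not the SUSI Web Chat implementation: collectPreviews and previewLink are hypothetical names, and previewLink is assumed to return a Promise that resolves to an object with accepted, image and descriptionShort fields, or rejects on error.

```javascript
// Previews links one at a time until `count` results have been collected
// or every link has been tried. previewLink is a hypothetical,
// promise-returning stand-in for the linkPreview ajax call.
async function collectPreviews(items, count, previewLink) {
  const results = [];
  for (const item of items) {
    if (results.length === count) {
      break; // enough results, stop previewing
    }
    try {
      const preview = await previewLink(item.link);
      if (preview.accepted) {
        results.push({
          ...item,
          image: preview.image,
          descriptionShort: preview.descriptionShort,
        });
      }
    } catch (err) {
      // A failed preview is skipped and the next link is tried instead
    }
  }
  return results;
}
```

Each request still waits for the previous one, as in the nested version, but a failure on one link does not stop the remaining links from being tried.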

We store the results as rssResults, which MessageListItems uses to fetch the data and render it. The nested calling of previewURLForImage() ends when we have the required count of results or we have finished trying all links for previewing images. We then dispatch the message to the message store. Next, we improve the UI. I used Material UI Cards to display the results and react-slick for the carousel-like display.

<Card className={cardClass} key={i} onClick={() => {
  window.open(tile.link,'_blank')
}}>
  {tile.image &&
    (
      <CardMedia>
        <img src={tile.image} alt="" className='card-img'/>
      </CardMedia>
    )
  }
  <CardTitle title={tile.title} titleStyle={titleStyle}/>
  <CardText>
    <div className='card-text'>{cardText}</div>
    <div className='card-url'>{urlDomain(tile.link)}</div>
  </CardText>
</Card>

We used the full width of the message section to display the results by not wrapping the result in message-list-item class. The entire card is hyperlinked to the link. Along with title and description, the URL info is also shown at the bottom right. To get the domain name from the link, urlDomain() function is used which makes use of the HTML anchor tag to get the domain info.

function urlDomain(data) {
  var a = document.createElement('a');
  a.href = data;
  return a.hostname;
}
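Outside a browser document, for example in a unit test, the same lookup can be sketched with the standard URL constructor. urlDomainFromString is a hypothetical name; unlike the anchor-based version, it does not resolve relative links and returns an empty string for them.

```javascript
// Variant of urlDomain() based on the standard URL constructor.
// Returns an empty string when the input is not an absolute URL.
function urlDomainFromString(link) {
  try {
    return new URL(link).hostname;
  } catch (err) {
    return '';
  }
}
```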

To prevent stretching of images we use `object-fit: contain;` to make the images fit the image container and align it to the middle.

We finally have our RSS results with images and an improved UI. The complete code can be found at SUSI WebChat Repo. Feel free to contribute.


Implementing Text To Speech Settings in SUSI WebChat

SUSI Web Chat has Text to Speech (TTS) Feature where it gives voice replies for user queries. The Text to Speech functionality was added using Speech Synthesis Feature of the Web Speech API. The Text to Speech Settings were added to customise the speech output by controlling features like :

  1. Language
  2. Rate
  3. Pitch

Let us visit SUSI Web Chat and try it out.

First, ensure that the settings have SpeechOutput or SpeechOutputAlways enabled. Then click on the Mic button and ask a query. SUSI responds to your query with a voice reply.

To control the Speech Output, visit Text To Speech Settings in the /settings route.

First, let us look at the language settings. The drop down list for Language is populated when the app is initialised. speechSynthesis.onvoiceschanged function is triggered when the app loads initially. There we call speechSynthesis.getVoices() to get the voice list of all the languages currently supported by that particular browser. We store this in MessageStore using ActionTypes.INIT_TTS_VOICES action type.

window.speechSynthesis.onvoiceschanged = function () {
  if (!MessageStore.getTTSInitStatus()) {
    var speechSynthesisVoices = speechSynthesis.getVoices();
    Actions.getTTSLangText(speechSynthesisVoices);
    Actions.initialiseTTSVoices(speechSynthesisVoices);
  }
};

We also get the translated text of `This is an example of speech synthesis` for every language present in the voice list, using the Google Translate API. This is called initially for all the languages, and the result is stored as the translatedText attribute of each element in the voice list. It is later used when the user wants to listen to an example of speech output for a selected language, rate and pitch.

https://translate.googleapis.com/translate_a/single?client=gtx&sl=en-US&tl=TARGET_LANGUAGE_CODE&dt=t&q=TEXT_TO_BE_TRANSLATED
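A minimal sketch of how this request URL could be assembled for one voice is shown below. buildTranslateUrl is a hypothetical helper, not a function from SUSI Web Chat; the parameter names are the ones in the URL above.

```javascript
// Hypothetical helper: builds the translate request URL for one language.
function buildTranslateUrl(targetLang, text) {
  const params = new URLSearchParams({
    client: 'gtx',
    sl: 'en-US',       // source language of the example sentence
    tl: targetLang,    // language code taken from the voice list
    dt: 't',
    q: text,
  });
  return 'https://translate.googleapis.com/translate_a/single?' + params.toString();
}
```

Building the query with URLSearchParams escapes the spaces in the example sentence, which a plain string concatenation would leave as they are.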

When the user visits the Text To Speech Settings, then the voice list stored in the MessageStore is retrieved and the drop down menu for Language is populated. The default language is fetched from UserPreferencesStore and the default language is accordingly highlighted in the dropdown. The list is parsed and populated as a drop down using populateVoiceList() function.

let voiceMenu = voices.map((voice,index) => {
  if(voice.translatedText === null){
    voice.translatedText = this.speechSynthesisExample;
  }
  langCodes.push(voice.lang);
  return(
    <MenuItem value={voice.lang}
              key={index}
              primaryText={voice.name+' ('+voice.lang+')'} />
  );
});

The language selected using this dropdown is only used as the language for the speech output when the server doesn’t specify the language in its response and the browser language is undefined. We then create sliders using Material UI for adjusting speech rate and pitch.

<h4 style={{'marginBottom':'0px'}}><Translate text="Speech Rate"/></h4>
<Slider
  min={0.5}
  max={2}
  value={this.state.rate}
  onChange={this.handleRate} />

The range for the sliders is :

  • Rate : 0.5 – 2
  • Pitch : 0 – 2

The default value for both rate and pitch is 1. We create a controlled slider saving the values in state and using onChange function to record change in values. The Reset buttons can be used to reset the rate and pitch values respectively to their default values. Once the language, rate and pitch values have been selected we can click on `Play a short demonstration of speech synthesis`  to listen to a voice reply with the chosen settings.

{ this.state.playExample &&
  (
    <VoicePlayer
       play={this.state.play}
       text={voiceOutput.voiceText}
       rate={this.state.rate}
       pitch={this.state.pitch}
       lang={this.state.ttsLanguage}
       onStart={this.onStart}
       onEnd={this.onEnd}
    />
  )
}

We use the VoicePlayer by passing the required props to get the speech output. onStart and onEnd functions are triggered at the beginning and ending of the speech synthesis and are used to control the state from the parent component. Chosen language, rate, pitch and translated text are passed as props to VoicePlayer which creates a new SpeechSynthesisUtterance() with the passed props and plays the speech output.

On saving these settings and then using the Mic button to get voice replies we see that the voice output is controlled according to the selected settings.

Finally, we have to store the selected settings on the server and ensure that these are pulled when the app is initialized. The format in which these settings are stored in the server is :

Speech Rate

- Used to control rate of speech output.
- SETTING_NAME :  `speechRate`
- SETTING_VALUE : `0.5 - 2`
- DEFAULT_VALUE : `1`
 
Speech Pitch

- Used to control pitch of speech output.
- SETTING_NAME :  `speechPitch`
- SETTING_VALUE : `0 - 2`
- DEFAULT_VALUE : `1`
 
TTS Language

- Used to set the language for Text-To-Speech when the response from the server doesn't specify a language and the browser language is also undefined.
- SETTING_NAME :  `ttsLanguage`
- SETTING_VALUE : `Language Code (string)`
- DEFAULT_VALUE : `en-US`
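When these settings are pulled from the server, it can help to bring them back into the documented ranges before using them. The following is a sketch under that assumption; normaliseTTSSettings and TTS_LIMITS are hypothetical names, and only the ranges and defaults come from the list above.

```javascript
// Ranges and defaults as documented above
const TTS_LIMITS = {
  speechRate: { min: 0.5, max: 2, fallback: 1 },
  speechPitch: { min: 0, max: 2, fallback: 1 },
};

// Clamps one numeric setting to its range, or returns the default value
// when the stored value is missing or not a number.
function clampSetting(name, value) {
  const { min, max, fallback } = TTS_LIMITS[name];
  if (value === undefined || value === null || value === '') {
    return fallback;
  }
  const number = Number(value);
  if (!Number.isFinite(number)) {
    return fallback;
  }
  return Math.min(max, Math.max(min, number));
}

// Hypothetical helper: returns settings that are safe to pass on.
function normaliseTTSSettings(stored) {
  const hasLanguage = typeof stored.ttsLanguage === 'string' && stored.ttsLanguage !== '';
  return {
    speechRate: clampSetting('speechRate', stored.speechRate),
    speechPitch: clampSetting('speechPitch', stored.speechPitch),
    ttsLanguage: hasLanguage ? stored.ttsLanguage : 'en-US',
  };
}
```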

This is how the Text To Speech Settings were implemented in SUSI Web Chat. The complete code can be found at SUSI Web Chat Repository.

PS: To test whether your browser supports Text To Speech, open your browser console and try the following :

  • var msg = new SpeechSynthesisUtterance('Hello World');
  • window.speechSynthesis.speak(msg)

If you get a speech output, then speech synthesis from the Web Speech API is supported by your browser and the Text To Speech features of SUSI Web Chat will work. The Web Speech API is supported by all recent Chrome browsers, as mentioned in the Web Speech API Mozilla docs. However, there are a few bugs in some Chromium versions; see this link for how to fix them locally.
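The same check can be done in code before trying to speak. This is a small sketch; supportsSpeechSynthesis is a hypothetical helper that takes the global object as an argument, so that it can also be exercised outside a browser.

```javascript
// Hypothetical helper: reports whether the given global object exposes
// the speech synthesis parts of the Web Speech API.
function supportsSpeechSynthesis(globalObject) {
  return typeof globalObject === 'object' &&
    globalObject !== null &&
    'speechSynthesis' in globalObject &&
    typeof globalObject.SpeechSynthesisUtterance === 'function';
}
```

In a browser it would be called as supportsSpeechSynthesis(window).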


Generating Map Action Responses in SUSI AI

SUSI AI responds to location related user queries with a Map action response. The different types of responses are referred to as actions which tell the client how to render the answer. One such action type is the Map action type. The map action contains latitude, longitude and zoom values telling the client to correspondingly render a map with the given location.

Let us visit SUSI Web Chat and try it out.

Query: Where is London

Response: (API Response)

The API Response actions contain text describing the specified location, an anchor with the text `Here is a map` linked to OpenStreetMap, and a map with the location coordinates.

Let us look at how this is implemented on server.

For location related queries, the key where is used as an identifier. Once the query is matched with this key, a regular expression `where is (?:(?:a )*)(.*)` is used to parse the location name.

"keys"   : ["where"],
"phrases": [
  {"type":"regex", "expression":"where is (?:(?:a )*)(.*)"},
]
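To see what the regular expression captures, it can be tried directly in JavaScript. This is only an illustration of the pattern: parseLocation is a hypothetical helper, and the server's own matching code is not shown here.

```javascript
// The expression from the skill definition above
const WHERE_IS = /where is (?:(?:a )*)(.*)/;

// Hypothetical helper: returns the captured location name, or null when
// the query does not match the pattern.
function parseLocation(query) {
  const match = WHERE_IS.exec(query);
  return match ? match[1] : null;
}
```

The optional group skips any leading "a ", so a query such as where is a park captures only park.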

The parsed location name is stored in $1$ and is used to make API calls to fetch information about the place and its location. Console process is used to fetch required data from an API.

"process": [
  {
    "type":"console",
    "expression":"SELECT location[0] AS lon, location[1] AS lat FROM locations WHERE query='$1$';"},
  {
    "type":"console",
    "expression":"SELECT object AS locationInfo FROM location-info WHERE query='$1$';"}
],

Here, we need to make two API calls :

  • For getting information about the place
  • For getting the location coordinates

First let us look at how a Console Process works. In a console process we provide the URL needed to fetch data from, the query parameter needed to be passed to the URL and the path to look for the answer in the API response.

  • url = <url>   – the url to the remote json service which will be used to retrieve information. It must contain a $query$ string.
  • test = <parameter> – the parameter that will replace the $query$ string inside the given url. It is required to test the service.

To get information about the place, we use the Wikipedia API. We name this console process location-info and add the attributes required to run it and fetch data from the API.

"location-info": {
  "example":"http://127.0.0.1:4000/susi/console.json?q=%22SELECT%20*%20FROM%20location-info%20WHERE%20query=%27london%27;%22",
  "url":"https://en.wikipedia.org/w/api.php?action=opensearch&limit=1&format=json&search=",
  "test":"london",
  "parser":"json",
  "path":"$.[2]",
  "license":"Copyright by Wikipedia, https://wikimediafoundation.org/wiki/Terms_of_Use/en"
}

The attributes used are :

  • url : The MediaWiki API endpoint
  • test : The Location name which will be appended to the url before making the API call.
  • parser : Specifies the response type for parsing the answer
  • path : Points to the location in the response where the required answer is present

The API endpoint called is of the following format :

https://en.wikipedia.org/w/api.php?action=opensearch&limit=1&format=json&search=LOCATION_NAME
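
As a rough illustration of how the location name ends up at the end of this URL, the sketch below appends a percent-encoded name to the endpoint. The function name `build_location_info_url` is hypothetical, and encoding the name with `urllib.parse.quote` is an assumption; the server may encode it differently.

```python
from urllib.parse import quote

# Endpoint taken from the "url" attribute of the location-info console process.
WIKIPEDIA_OPENSEARCH = (
    "https://en.wikipedia.org/w/api.php"
    "?action=opensearch&limit=1&format=json&search="
)


def build_location_info_url(location_name):
    """Append the percent-encoded location name to the opensearch endpoint."""
    return WIKIPEDIA_OPENSEARCH + quote(location_name)


print(build_location_info_url("london"))
print(build_location_info_url("new york"))
```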

For the query where is london, the API call made returns

[
  "london",
  ["London"],
  ["London  is the capital and most populous city of England and the United Kingdom."],
  ["https://en.wikipedia.org/wiki/London"]
]

The path $.[2] points to the third element of the array, i.e. “London is the capital and most populous city of England and the United Kingdom.”, which is stored in $locationInfo$.
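
The lookup can be sketched in plain Python as follows. The function name `extract_location_info` is hypothetical, and taking the first entry of the third element is an assumption about how the one-item array is flattened into a single string.

```python
import json

# Sample response copied from the example above.
SAMPLE_RESPONSE = """[
  "london",
  ["London"],
  ["London is the capital and most populous city of England and the United Kingdom."],
  ["https://en.wikipedia.org/wiki/London"]
]"""


def extract_location_info(raw_json):
    """Follow the path $.[2] and return the first description, if any."""
    data = json.loads(raw_json)
    descriptions = data[2]  # third element of the top-level array
    return descriptions[0] if descriptions else None


print(extract_location_info(SAMPLE_RESPONSE))
```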

Similarly, to get the location coordinates, another API call is made to the loklak API.

"locations": {
  "example":"http://127.0.0.1:4000/susi/console.json?q=%22SELECT%20*%20FROM%20locations%20WHERE%20query=%27rome%27;%22",
  "url":"http://api.loklak.org/api/console.json?q=SELECT%20*%20FROM%20locations%20WHERE%20location='$query$';",
  "test":"rome",
  "parser":"json",
  "path":"$.data",
  "license":"Copyright by GeoNames"
},

The location coordinates are found in $.data.location in the API response. The location coordinates are stored as latitude and longitude in $lat$ and $lon$ respectively.

Finally, we have a description of the location and its coordinates, so we create the actions to be put in the server response.

The first action is of type answer, and the text to be displayed is given by $locationInfo$, where the data from the Wikipedia API response is stored.

{
  "type":"answer",
  "select":"random",
  "phrases":["$locationInfo$"]
},

The second action is of type anchor. The text to be displayed is `Here is a map` and it must be hyperlinked to OpenStreetMap with the obtained $lat$ and $lon$.

{
  "type":"anchor",
  "link":"https://www.openstreetmap.org/#map=13/$lat$/$lon$",
  "text":"Here is a map"
},

The last action is of type map which is populated for latitude and longitude using $lat$ and $lon$ respectively and the zoom value is specified to be 13.

{
  "type":"map",
  "latitude":"$lat$",
  "longitude":"$lon$",
  "zoom":"13"
}
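
To make the placeholder mechanism concrete, here is a small Python sketch that fills $name$ placeholders in action templates like the ones above. It is not the server's implementation; the helper `fill_placeholders` is hypothetical and only illustrates the substitution idea.

```python
import re


def fill_placeholders(template, values):
    """Replace every $name$ placeholder in a (possibly nested) template."""
    if isinstance(template, str):
        # Unknown placeholders are left untouched.
        return re.sub(
            r"\$(\w+)\$",
            lambda m: str(values.get(m.group(1), m.group(0))),
            template,
        )
    if isinstance(template, dict):
        return {key: fill_placeholders(val, values) for key, val in template.items()}
    if isinstance(template, list):
        return [fill_placeholders(item, values) for item in template]
    return template


ANCHOR_ACTION = {
    "type": "anchor",
    "link": "https://www.openstreetmap.org/#map=13/$lat$/$lon$",
    "text": "Here is a map",
}
MAP_ACTION = {"type": "map", "latitude": "$lat$", "longitude": "$lon$", "zoom": "13"}

values = {"lat": "51.51279067225417", "lon": "-0.09184009399817228"}
filled_anchor = fill_placeholders(ANCHOR_ACTION, values)
filled_map = fill_placeholders(MAP_ACTION, values)
print(filled_anchor["link"])
print(filled_map)
```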

The final output from the server now contains the three actions, with the required data obtained from the respective API calls. For the sample query `where is london`, the actions look like:

"actions": [
  {
    "type": "answer",
    "language": "en",
    "expression": "London  is the capital and most populous city of England and the United Kingdom."
  },
  {
    "type": "anchor",
    "link":   "https://www.openstreetmap.org/#map=13/51.51279067225417/-0.09184009399817228",
    "text": "Here is a map",
    "language": "en"
  },
  {
    "type": "map",
    "latitude": "51.51279067225417",
    "longitude": "-0.09184009399817228",
    "zoom": "13",
    "language": "en"
  }
],

This is how the map action responses are generated for location related queries. The complete code can be found at SUSI AI Server Repository.


Adding Push Wake Button to SUSI on Raspberry PI

SUSI Linux for Raspberry Pi provides the ability to call SUSI with the hotword ‘Susi’. Calling via a hotword is a natural way of interacting, but it is even handier to invoke SUSI's listening mode with a push button. A button also makes it possible to call SUSI in a noisy environment where hotword detection is less accurate.

To enable a push wake button in SUSI, we need access to hardware pins. Devices like the Raspberry Pi provide GPIO (General Purpose Input Output) pins for interacting with hardware devices.

In this tutorial, we add support for a push wake button on the Raspberry Pi, though a similar procedure can be used to add a wake button to the Orange Pi, BeagleBone Black, and other devices. For adding the push wake button, we require:

Next, we wire the button to the Raspberry Pi. The button can be connected following the connection diagram.

After this, we need to install the Raspberry Pi GPIO Python Library. Install it using:

$ pip3 install RPi.GPIO

Now we can detect the button press in our code. We declare an abstract class for implementing a wake button, so that we can later extend our code to include wake buttons for more platforms.

import os
from abc import ABC, abstractmethod
from queue import Queue
from threading import Thread

from utils.susi_config import config


class WakeButton(ABC, Thread):
   def __init__(self, detection_callback, callback_queue: Queue):
       super().__init__()
       self.detection_callback = detection_callback
       self.callback_queue = callback_queue
       self.is_active = False

   # Subclasses must implement run() with the platform-specific detection loop
   @abstractmethod
   def run(self):
       pass

   def on_detected(self):
       if self.is_active:
           self.callback_queue.put(self.detection_callback)
           os.system('play {0} &'.format(config['detection_bell_sound']))
           self.is_active = False

We define the WakeButton class as a Thread. This ensures that listening for the wake button happens in a background thread and does not block the main thread. The callback to be executed on the main thread after a button press is detected is added to the callback queue. The main thread listens on the callback queue and executes any pending functions from other threads.

We also play an audio file when a button press is detected, to confirm the activation to the user.

Now, we define the Raspberry Pi wake button class. This class extends the abstract WakeButton declared above.
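
The callback-queue hand-off can be sketched independently of any hardware. In the minimal example below, a worker thread enqueues a function and the main thread drains the queue and runs it. The names `run_pending_callbacks` and `start_listening` are hypothetical and are not taken from the SUSI code base.

```python
from queue import Empty, Queue
from threading import Thread


def run_pending_callbacks(callback_queue):
    """Run every callback currently waiting in the queue; return how many ran."""
    executed = 0
    while True:
        try:
            callback = callback_queue.get_nowait()
        except Empty:
            return executed
        callback()  # executes on the caller's (main) thread
        executed += 1


results = []
callback_queue = Queue()


def start_listening():
    results.append("listening")


# The background thread only enqueues the callback; it never calls it itself.
worker = Thread(target=lambda: callback_queue.put(start_listening))
worker.start()
worker.join()

ran = run_pending_callbacks(callback_queue)
print(ran, results)
```

Keeping the call on the main thread means the callback does not need to be thread-safe with respect to the rest of the main loop.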

from queue import Queue

import RPi.GPIO as GPIO
import time
from .wake_button import WakeButton


class RaspberryPiWakeButton(WakeButton):
   def __init__(self, detection_callback, callback_queue: Queue):
       super().__init__(detection_callback, callback_queue)
       GPIO.setmode(GPIO.BCM)
       GPIO.setup(18, GPIO.IN, pull_up_down=GPIO.PUD_UP)

   def run(self):
       while True:
           input_state = GPIO.input(18)
           if not input_state:
               self.on_detected()
               self.is_active = False
               time.sleep(0.2)

This class defines the wake button for Raspberry Pi. We continuously poll the input value of GPIO pin 18 (BCM numbering), to which the button is connected. Because the pin is configured with a pull-up resistor, a low value (0) indicates that the button was pressed.

Now, we need to add an option in the configuration script to let users enable or disable the wake button. We first check whether the device is a Raspberry Pi, since the feature is available on Raspberry Pi only. To do this, we try to import the RPi.GPIO module. If the import fails, the device does not support Raspberry Pi GPIO. We set the configuration parameters accordingly.
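
The polling idea can be tried without a Raspberry Pi. The sketch below is a variation on the loop above rather than the project's code: it takes a hypothetical `read_pin` callable in place of `GPIO.input(18)` and fires only on the transition to low, so a button that is held down counts as a single press.

```python
import time


def poll_button(read_pin, on_press, max_polls, interval=0.0):
    """Poll read_pin() and call on_press() once per press; return the press count."""
    presses = 0
    was_pressed = False
    for _ in range(max_polls):
        pressed = not read_pin()  # pull-up wiring: a low reading means pressed
        if pressed and not was_pressed:
            on_press()
            presses += 1
        was_pressed = pressed
        time.sleep(interval)  # pause between polls
    return presses


# Simulated pin readings: released, released, pressed, held, released, pressed, released
samples = iter([1, 1, 0, 0, 1, 0, 1])
events = []
count = poll_button(lambda: next(samples), lambda: events.append("pressed"), max_polls=7)
print(count, events)
```

On real hardware a non-zero `interval` keeps the loop from occupying a CPU core while the button is idle.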

def setup_wake_button():
   try:
       import RPi.GPIO
       print("Device supports RPi.GPIO")
       choice = input("Do you wish to enable hardware wake button? (y/n)")
       if choice == 'y':
           config['WakeButton'] = 'enabled'
           config['Device'] = 'RaspberryPi'
       else:
           config['WakeButton'] = 'disabled'
   except ImportError:
       print("This device does not support RPi.GPIO")
       config['WakeButton'] = 'not available'

Now, we simply use the Raspberry Pi wake button detector in our code.

# Keys match the ones written by setup_wake_button() above
if config['WakeButton'] == 'enabled':
   if config['Device'] == 'RaspberryPi':
       from hardware_components import RaspberryPiWakeButton

       wake_button = RaspberryPiWakeButton(callback_queue=callback_queue, detection_callback=start_speech_recognition)
       wake_button.start()

Now, when you need to invoke SUSI's listening mode, you may press the push button instead of saying the hotword ‘SUSI’. Ask your query after hearing a short bell and get an instant reply from SUSI.



Implementing the Feedback Functionality in SUSI Web Chat

SUSI AI now has a feedback feature that collects the user's feedback for every response so that it can learn and improve. The first step towards guided learning is building a dataset through a feedback mechanism, which can then be used to improve the skill selection mechanism responsible for answering user queries.

The flow behind the feedback mechanism is:

  1. For every SUSI response show thumbs up and thumbs down buttons.
  2. For the older messages, the feedback thumbs are disabled and only display the feedback already given. The user cannot change the feedback already given.
  3. For the latest SUSI response the user can change his feedback by clicking on thumbs up if he likes the response, else on thumbs down, until he gives a new query.
  4. When the new query is given by the user, the feedback recorded for the previous response is sent to the server.

Let’s visit SUSI Web Chat and try this out.

We can find the feedback thumbs for the response messages. The user cannot change the feedback he has already given for previous messages. For the latest message the user can toggle feedback until he sends the next query.

How is this implemented?

We first design the UI for the feedback thumbs using Material UI SVG Icons. We need a separate component for the feedback UI because it has to store the state of the feedback as positive or negative: the user is allowed to change the feedback for the latest response until a new query is sent. Whenever the user clicks on a thumb, we update the state of the component to positive or negative accordingly.

import ThumbUp from 'material-ui/svg-icons/action/thumb-up';
import ThumbDown from 'material-ui/svg-icons/action/thumb-down';

feedbackButtons = (
  <span className='feedback' style={feedbackStyle}>
    <ThumbUp
      onClick={this.rateSkill.bind(this,'positive')}
      style={feedbackIndicator}
      color={positiveFeedbackColor}/>
    <ThumbDown
      onClick={this.rateSkill.bind(this,'negative')}
      style={feedbackIndicator}
      color={negativeFeedbackColor}/>
  </span>
);

The next step is to store the feedback in the Message Store using the saveFeedback Action. This lets us send the feedback to the server later by querying it from the Message Store. The Action calls the Dispatcher with the FEEDBACK_RECEIVED ActionType, which is handled in the MessageStore, where the feedback is updated.

let feedback = this.state.skill;

if(!(Object.keys(feedback).length === 0 && feedback.constructor === Object)){
  feedback.rating = rating;
  this.props.message.feedback.rating = rating;
  Actions.saveFeedback(feedback);
}

case ActionTypes.FEEDBACK_RECEIVED: {
  _feedback = action.feedback;
  MessageStore.emitChange();
  break;
}

The final step is to send the feedback to the server. The server endpoint that stores feedback for a skill requires other parameters, apart from the feedback itself, to identify the skill. The server response contains an attribute `skills` which gives the path of the skill used to answer that query. From that path we need to parse:

  • Model : Highest level of abstraction for categorising skills
  • Group : Different groups under a model
  • Language : Language of the skill
  • Skill : Name of the skill

For example, for the query `what is the capital of germany`, the skills attribute is

"skills": ["/susi_skill_data/models/general/smalltalk/en/English-Standalone-aiml2susi.txt"]

So, for this skill,

  • Model : general
  • Group : smalltalk
  • Language : en
  • Skill : English-Standalone-aiml2susi
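
The parsing step can be sketched as below. The function name `parse_skill_path` is hypothetical, and the sketch assumes every skill path follows the layout seen in the example, that is `models/MODEL/GROUP/LANGUAGE/SKILL.txt`.

```python
from pathlib import PurePosixPath


def parse_skill_path(skill_path):
    """Split a skill path into model, group, language and skill name."""
    parts = PurePosixPath(skill_path).parts
    # The four components of interest follow the "models" directory.
    start = parts.index("models") + 1
    model, group, language, filename = parts[start:start + 4]
    return {
        "model": model,
        "group": group,
        "language": language,
        "skill": PurePosixPath(filename).stem,  # drop the .txt extension
    }


parsed = parse_skill_path(
    "/susi_skill_data/models/general/smalltalk/en/English-Standalone-aiml2susi.txt"
)
print(parsed)
```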

The server endpoint to store feedback for a particular skill is :

BASE_URL+'/cms/rateSkill.json?model=MODEL&group=GROUP&language=LANGUAGE&skill=SKILL&rating=RATING'

Here Model, Group, Language and Skill are parsed from the `skills` attribute of the server response as discussed above, and the Rating is either positive or negative, collected from the user when he clicks on the feedback thumbs.
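
For illustration, the query string can be assembled from those five values as in the Python sketch below. `BASE_URL` is a placeholder here, not the real server address, and `build_rate_skill_url` is a hypothetical helper; `urlencode` is used so that values are percent-encoded where needed.

```python
from urllib.parse import urlencode

BASE_URL = "https://example.invalid"  # placeholder, not the real server address


def build_rate_skill_url(feedback, base_url=BASE_URL):
    """Build the rateSkill URL from the parsed skill fields and the rating."""
    keys = ("model", "group", "language", "skill", "rating")
    params = {key: feedback[key] for key in keys}
    return base_url + "/cms/rateSkill.json?" + urlencode(params)


rate_url = build_rate_skill_url({
    "model": "general",
    "group": "smalltalk",
    "language": "en",
    "skill": "English-Standalone-aiml2susi",
    "rating": "positive",
})
print(rate_url)
```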

When a new query is sent, the sendFeedback Action is triggered with the required attributes to make the server call to store feedback on server. The client then makes an Ajax call to the rateSkill endpoint to send the feedback to the server.

let url = BASE_URL+'/cms/rateSkill.json?'+
          'model='+feedback.model+
          '&group='+feedback.group+
          '&language='+feedback.language+
          '&skill='+feedback.skill+
          '&rating='+feedback.rating;

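$.ajax({
  url: url,
  dataType: 'jsonp',
  crossDomain: true,
  timeout: 3000,
  // async: false is omitted: jQuery does not support synchronous JSONP requests
  success: function (response) {
    console.log(response);
  },
  // jQuery passes (jqXHR, textStatus, errorThrown) to the error callback
  error: function (jqXHR, textStatus, errorThrown) {
    console.log(textStatus, errorThrown);
  }
});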

This is how the feedback mechanism works in SUSI Web Chat. The entire code can be found at SUSI Web Chat Repository.



Adding a Scroll To Bottom button in SUSI WebChat

SUSI Web Chat now has a scroll-to-bottom button that scrolls the app to the bottom of the scroll area on click. When the chat history is long, having to scroll down manually makes for a poor user experience. The basic requirements of this scroll-to-bottom button are:

  1. The button must only be displayed when the user has scrolled up the message section
  2. On clicking the scroll-to-bottom button, the scroll area must be automatically scrolled to bottom.

Let’s visit SUSI Web Chat and try this out.

The button is not visible until there are enough messages to enable scrolling and the user has scrolled up. On clicking the button, the app automatically scrolls to the bottom pointing to the most recent message.

How was this implemented?

We first design our scroll-to-bottom button using the Material UI Floating Action Button and SVG Icons.

import FloatingActionButton from 'material-ui/FloatingActionButton';
import NavigateDown from 'material-ui/svg-icons/navigation/expand-more';

The button needs to be styled so that it is displayed at a fixed position in the bottom right corner of the message section. It is positioned on top of the MessageSection, above the MessageComposer, and aligned with respect to the edges.

const scrollBottomStyle = {
  button : {
    float: 'right',
    marginRight: '5px',
    marginBottom: '10px',
    boxShadow:'none',
  },
  backgroundColor: '#fcfcfc',
  icon : {
    fill: UserPreferencesStore.getTheme()==='light' ? '#90a4ae' : '#7eaaaf'
  }
}

The button must only be displayed when the user has scrolled up. To implement this we need a state variable showScrollBottom which must be set to true or false accordingly based on the scroll offset.

{this.state.showScrollBottom &&
  <div className='scrollBottom'>
    <FloatingActionButton mini={true}
      style={scrollBottomStyle.button}
      backgroundColor={scrollBottomStyle.backgroundColor}
      iconStyle={scrollBottomStyle.icon}
      onTouchTap={this.forcedScrollToBottom}>
      <NavigateDown />
    </FloatingActionButton>
  </div>
}

Now we have to set our state variable showScrollBottom according to the scroll offset. It must be set to true if the user has scrolled up and false if the scrollbar is already at the bottom. To implement this we need to listen to the scrolling events. We use react-custom-scrollbars for the scroll area wrapping the message section, and we can listen to the scrolling events using the onScroll prop. We also tag the scroll area using refs to access it, instead of using findDOMNode, as that is being deprecated.

import { Scrollbars } from 'react-custom-scrollbars';

<Scrollbars
  ref={(ref) => { this.scrollarea = ref; }}
  onScroll={this.onScroll}
>
  {messageListItems}
</Scrollbars>

Now, whenever a scroll action is performed, the onScroll() function is triggered. We then have to know whether the scroll bar is at the bottom or not. We use the scroll area's getValues() function, which returns an object containing the different scroll offsets and scroll area dimensions. We are interested in values.top, which gives the scroll-top progress from 0 to 1, i.e. when the scroll bar is at the topmost point values.top is 0, and when it is at the bottommost point values.top is 1. So whenever values.top is 1, showScrollBottom is false, otherwise true.

onScroll = () => {
  let scrollarea = this.scrollarea;
  if(scrollarea){
    let scrollValues = scrollarea.getValues();
    if(scrollValues.top === 1){
      this.setState({
        showScrollBottom: false,
      });
    }
    else if(!this.state.showScrollBottom){
      this.setState({
        showScrollBottom: true,
      });
    }
  }
}
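
The decision made by the handler can be written as a small pure function, sketched here in Python so the cases are easy to check. The name `next_show_scroll_bottom` is hypothetical. Unlike the handler above, this sketch also skips the state update when the button is already hidden and the scroll bar is at the bottom.

```python
def next_show_scroll_bottom(top, currently_shown):
    """Return (show_button, needs_update) for a scroll progress value in [0, 1]."""
    at_bottom = top >= 1
    show_button = not at_bottom
    # Only report an update when the visibility actually changes.
    return show_button, show_button != currently_shown


print(next_show_scroll_bottom(0.4, False))  # scrolled up, button hidden
print(next_show_scroll_bottom(1, True))     # back at the bottom, button visible
```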

Finally, we need to scroll the chat app to the bottom on button click. Whenever showScrollBottom is updated, the state changes, so componentDidUpdate is triggered, which calls the _scrollToBottom() function. We must avoid scrolling to the bottom in that case, because the update happens while the user is intentionally scrolling. Instead, we use the function forcedScrollToBottom, triggered on clicking the scroll-to-bottom button, which sets the scrollTop value to the height of the scroll area, thus moving the scrollbar to the bottom.

forcedScrollToBottom = () => {
  let ul = this.scrollarea;
  if (ul) {
    ul.scrollTop(ul.getScrollHeight());
  }
}

We don’t have to worry about resetting showScrollBottom on forced scroll to bottom as the scrolling will trigger the onScroll function where the showScrollBottom state is handled accordingly.

This is how the scroll to bottom button has been implemented in SUSI Web Chat. The entire code can be found at SUSI Web Chat Repository.



Setup SUSI Assistant on Raspberry Pi in under 30 minutes

With our ever-growing list of platforms supported by SUSI AI, we now have a client that runs on Raspberry Pi, and you can access it hands-free! Here is a video that you can refer to for a demonstration.

It might have left you wondering how you can replicate such a setup yourself. It is fairly easy and does not take long. Just follow the instructions below.

You need to have following hardware in order to have your own SUSI Assistant running on Raspberry Pi.

  • A Raspberry Pi (preferably 2 or 3) with Raspbian Jessie OS.
  • A stable internet connection.  ( Recommended 4 Mbps )
  • A USB Microphone /  USB Webcam with Microphone. You may buy one like this.
  • A Speaker that connects through 3.5mm jack. You may buy one like this.

After you get all the above items in order, you need to get access to a terminal of your Raspberry Pi. You can have that by either connecting a monitor to Raspberry Pi temporarily or by connecting to Raspberry Pi over SSH.

Once this is done, the next step is to install the dependencies. The installation of SUSI on the Raspberry Pi is automated after the dependencies are installed. Run the following command in the Raspberry Pi terminal.

sudo apt install git swig3.0 portaudio19-dev pulseaudio libpulse-dev unzip sox libatlas-dev libatlas-base-dev libsox-fmt-all python3
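
As an optional sanity check, a few lines of Python can confirm that the command-line tools used in the following steps are on the PATH. This is only a sketch: the tool list is an assumption based on the commands used in this post (`rec` and `play` are installed by the sox package), and `find_missing_tools` is a hypothetical helper.

```python
import shutil

# Commands used later in this setup; adjust the list to your needs.
REQUIRED_TOOLS = ("git", "rec", "play", "python3")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = find_missing_tools()
print("Missing tools:", missing if missing else "none")
```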

After this, you may check whether your output and input devices are working properly. For this, run rec recording.wav. It will start recording audio and save it to a file named recording.wav. Play back the file using play recording.wav. If you hear your audio clearly, the setup is correct; otherwise you need to configure your audio devices. Most of the time the audio configuration works out of the box and devices are plug and play, so you should not encounter any errors. Once your devices are configured, install the extra dependencies for SUSI Hardware by running the automated install script. In your terminal run,

$ git clone https://github.com/fossasia/susi_hardware.git
$ cd susi_hardware
$ ./install.sh 

This will install all the remaining dependencies. After the above step is complete, you may run the configuration file generator script to choose the Text to Speech and Speech to Text services you prefer. To do so, run

$ python3 config_generator.py

Follow the instructions in the script. It will ask you to configure the default service for Text to Speech and Speech to Text and other options. After the configuration is complete, you can simply run the following command to start SUSI.

$ python3 main.py

This will start SUSI in a continuously listening mode. You may invoke SUSI anytime, just by saying SUSI followed by a query. The query will be answered by SUSI subsequently.

Since configurations for different hardware devices may vary, you may encounter some problems. In such a scenario, you may refer to the following resources to solve the issues.

Resources:
