Teaching Susi

In this blog post, I’ll explain how you can add a skill to SUSI.

Skills to be added in this tutorial:

  1. Ask SUSI for conversion of a decimal number into binary.
  2. Ask SUSI to tell a quote.

Skill development in SUSI is interesting and easy. A comprehensive guide for skill development can be found here.

We have a SUSI skill development environment based on an Etherpad. Not sure what an Etherpad is? It is a blank web page where you can simply type in your text and everyone can collaborate.

Here is what the Etherpad looks like and how to get started:

  • open http://dream.susi.ai
  • name a dream (just pick a name for your tests in lower case letters)
  • the etherpad is filled with a welcome message, just delete the content completely

Ask SUSI for conversion of decimal into Binary

Here, “*” represents any decimal number. Suppose we enter a decimal number and want SUSI to return its binary equivalent. To build our skill, we first need to form a general query.

Query: Convert * into binary or * in binary

After defining our query, we want SUSI to reply with a relevant answer.

To carry out the conversion from decimal to binary we can use JavaScript.

We define the JavaScript rule in the Etherpad as follows:

!javascript:Binary for $1$ = $!$
(+$1$).toString(2);
eol

Explanation :

“!javascript” lets us run JavaScript code and print its result, much like a JavaScript console. We use it here to do the mathematical calculation that converts our decimal number into binary.

“Binary for $1$ = $!$” is the output format, where $1$ holds the value matched by “*” in the user’s query and $!$ prints the result of the JavaScript code below.

“(+$1$).toString(2);” is a single line of JavaScript which converts the value in “$1$” to binary; its output is reflected in “$!$”.

“eol” represents end of line and signifies that the code for our skill is finished.

In the Etherpad our skill would look like this:

“#” is used for commenting out lines in a skill.

#The following code returns binary equivalent of a decimal number given by user

convert * into binary || binary of *
!javascript:Binary for $1$ = $!$
(+$1$).toString(2);
eol


Ask SUSI to tell a quote

We can use external APIs to provide answers to a user’s query. The external API used for telling the quote is the Quotes REST API created by They Said So. It offers a very good quotes platform, and quotes are not repeated when several requests are made in quick succession.

We need a query for our skill.

Query: Tell me a quote.

We use the JSON response from this API for fetching the quote data.

Our JSON object looks like:

{
  "success": {
    "total": 1
  },
  "contents": {
    "quotes": [
      {
        "quote": "Let our advance worrying become advance thinking and planning.",
        "length": "62",
        "author": "Winston Churchill",
        "tags": [
          "anxiety",
          "inspire",
          "planning",
          "time-management"
        ],
        "category": "inspire",
        "date": "2017-05-31",
        "permalink": "https://theysaidso.com/quote/o3aMOSUwOPqUnJv9YyfYHweF/winston-churchill-let-our-advance-worrying-become-advance-thinking-and-planning",
        "title": "Inspiring Quote of the day",
        "background": "https://theysaidso.com/img/bgs/man_on_the_mountain.jpg",
        "id": "o3aMOSUwOPqUnJv9YyfYHweF"
      }
    ],
    "copyright": "2017-19 theysaidso.com"
  }
}

We are interested in the value of the quotes key, which is reached with the path “$.contents.quotes”.

We write the output rule as follows:

!console:$quote$
     {
        "url" : "http://quotes.rest/qod.json",
        "path":"$.contents.quotes"
      }
eol

Explanation :

“!console” prints the output, and this is what SUSI.AI returns.

“$quote$” is the key whose value will be printed from the parsed JSON. It contains our quote (a random quote from the API), and this is the string with which SUSI.AI responds.

“url” is the external API endpoint that generates the JSON response.

“path” is the path inside the JSON object that we follow to get the correct response.
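To make the path concrete, here is a rough Java sketch (illustrative only, not SUSI server code) of what following $.contents.quotes on the response above amounts to, assuming the raw JSON has already been fetched into a string:

import org.json.JSONArray;
import org.json.JSONObject;

public class QuotePathExample {
    // body is the raw JSON string returned by http://quotes.rest/qod.json
    public static String firstQuote(String body) {
        JSONObject root = new JSONObject(body);
        // "$.contents.quotes" -> object "contents", then array "quotes"
        JSONArray quotes = root.getJSONObject("contents").getJSONArray("quotes");
        // SUSI exposes the fields of each matched object (quote, author, ...) as $quote$, $author$, ...
        return quotes.getJSONObject(0).getString("quote");
    }
}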

Our skill will look like:

tell me a quote
!console:$quote$
     {
        "url" : "http://quotes.rest/qod.json",
        "path":"$.contents.quotes"
      }
eol


Pie Chart Responses from SUSI Server

Giving out responses as charts and graphs is a very common feature of assistants, and SUSI has it too. We can show users outputs such as stocks, market data and various percentage breakdowns as pie charts.

A pie chart is a circular chart divided into segments, like slices of a pie. Each segment of a pie chart shows the share of one object or category.

The PieChartServlet in SUSI Server is a servlet that takes the JSON data of the pie chart components as input parameters and returns an image of the rendered pie chart.

public class PieChartServlet extends HttpServlet {

This is a simple HttpServlet. It does not require any authentication or base user role. So we extend the HttpServlet here.

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws  IOException {

doGet is a method which is triggered whenever the PieChartServlet receives a GET Query. This will contain all the code that we will need to render the final output.

{"ford":"17.272992","toyota":"27.272992","renault":"47.272992"}

This is a sample of the JSON that we send to the PieChartServlet. It contains the names of the pie chart components and their respective percentages. After we receive these parameters, we parse them and store them in local variables, which are then used to plot the pie chart.
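As a rough illustration (the actual servlet code may differ), parsing that JSON into a chart dataset could look like the following sketch, using org.json and JFreeChart’s DefaultPieDataset:

import org.json.JSONObject;
import org.jfree.data.general.DefaultPieDataset;

// data is the raw JSON string received in the "data" GET parameter,
// e.g. {"ford":"17.272992","toyota":"27.272992","renault":"47.272992"}
public static DefaultPieDataset toDataset(String data) {
    JSONObject json = new JSONObject(data);
    DefaultPieDataset dataset = new DefaultPieDataset();
    for (String key : json.keySet()) {
        // each key becomes a pie segment, its value the segment's share
        dataset.setValue(key, Double.parseDouble(json.getString(key)));
    }
    return dataset;
}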

To plot these values in a pie chart we use the JFreeChart library.

This is a free and well documented Java chart library. It supports PNG and JPEG output as well as vector graphics file formats.

JFreeChart chart = getChart(json, legendBit, tooltipBit);

From here we call the function getChart. This function accepts three parameters: the JSON which we sent as the GET parameter, legendBit and tooltipBit, which are also sent as GET parameters. In this example I will use legendBit as true and tooltipBit as false.

        JFreeChart chart = ChartFactory.createPieChart("SUSI Visualizes -   PieChart", dataset, legend, tooltips, urls);

        chart.setBorderPaint(Color.BLACK);
        chart.setBorderStroke(new BasicStroke(5.0f));
        chart.setBorderVisible(true);

        return chart;

This is the getChart function. It creates a chart using the ChartFactory method and sets the SUSI branding on it as “SUSI Visualizes – PieChart”. It accepts the dataset, legend and tooltip flags; the dataset is nothing but the JSON keys and their values.

After the ChartFactory returns the chart, we set the border of the chart and return the pie chart to the function where it was called.

ChartUtilities.writeChartAsPNG(outputStream, chart, width, height);

Finally we write the chart as a PNG image and send it to the user.
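Put together, the tail of doGet() might look roughly like the sketch below (parameter names follow the example URL further down; the exact SUSI server code may differ):

// Read the requested size, set the MIME type and stream the chart back as a PNG
int width = Integer.parseInt(request.getParameter("width"));    // e.g. 1000
int height = Integer.parseInt(request.getParameter("height"));  // e.g. 1000
response.setContentType("image/png");
OutputStream outputStream = response.getOutputStream();
ChartUtilities.writeChartAsPNG(outputStream, chart, width, height);
outputStream.close();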

Output

This can be tested at

https://api.susi.ai/vis/piechart.png?data={%22ford%22:%2217.272992%22,%22toyota%22:%2227.272992%22,%22renault%22:%2247.272992%22}&width=1000&height=1000&legend=true&tooltip=false


Markdown responses from SUSI Server

Most of the time SUSI sends a plain text reply, but for some replies we can set the type of the query to markdown and return the output formatted as an image of computer- or bot-style typed text. In this blog post I will explain how to get images with markdown instead of large blocks of text.

This servlet is a simple HttpServlet and does not require any kind of user authentication or base user role. So, instead of extending AbstractAPIHandler, we extend HttpServlet.

public class MarkdownServlet extends HttpServlet {

The doGet method is fired when we send a GET request to the server. It accepts the request parameters and passes them to the process(…) method.

One major precaution for an open service is to ensure that no one abuses it. As a first step, we check that a user is not accessing the server too frequently. If the server finds the request frequency too high, it returns a 503 error to the user.

if (post.isDoS_blackout()) {response.sendError(503, "your request frequency is too high"); return;} // DoS protection
process(request, response, post);
}

The process function is where all the processing is done. Here the text is extracted from the URL; all the parameters are sent in the GET request and the process(…) function parses the query. We then check all the parameters like color, padding, uppercase and text color, and read them into local variables.

http://api.susi.ai/vis/markdown.png?text=hello%20world%0Dhello%20universe&color_text=000000&color_background=ffffff&padding=3
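The parameter handling in process(…) might look roughly like the sketch below, written against the plain servlet API for illustration (the SUSI server uses its own query-parsing helper, so the real code differs):

// Read the drawing parameters from the GET request (names follow the example URL above)
String text = request.getParameter("text");                         // the text to render
String colorText = request.getParameter("color_text");              // e.g. "000000"
String colorBackground = request.getParameter("color_background");  // e.g. "ffffff"
int padding = Integer.parseInt(request.getParameter("padding"));    // e.g. 3
boolean uppercase = "true".equals(request.getParameter("uppercase"));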

Here we calculate the optimum image size. A good size has a 2:1 ratio so that it fits into the preview window, and the left or right border should not be cut away. We also resize the image here if necessary, since different clients can request different image sizes and we can compute the optimum size here.

int lineheight = 7;
int yoffset = 0;
int height = width / 2;
while (lineheight <= 12) {
height = linecount * lineheight + 2 * padding - 1;
if (width <= 2 * height) break;
yoffset = (width / 2 - height) / 2;
height = width / 2;
lineheight++;
}

Then we print our text onto the image, which is also done using the RasterPlotter. Using all the parameters that we parsed above, we create a new image and set the colors, line width, padding etc. Here we create a matrix and apply all the parameters calculated above to our image.

RasterPlotter matrix = new RasterPlotter(width, height, drawmode, color_background);
matrix.setColor(color_text);
if (c == '\n' || (c == ' ' && column + nextspace - pos >= 80)) {
x = padding - 1;
y += lineheight;
column = 0;
hashcount = 0;
if (!isFormatted) {
matrix.setColor(color_text);
}
isBold = false;
isItalic = false;
continue;
}
}

After we have our image, we print the SUSI branding: “MADE WITH HTTP://SUSI.AI” at the bottom right of the image.

PrintTool.print(matrix, matrix.getWidth() - 6, matrix.getHeight() - 6, 0, "MADE WITH HTTP://SUSI.AI", 1, false, 50);

At the end we write the image and set the cross-origin access headers. This header is very important when different domains are used by different clients; if it is not provided, the query may fail with a “Cross Origin Access blocked” error.

response.addHeader("Access-Control-Allow-Origin", "*");
RemoteAccess.writeImage(fileType, response, post, matrix);
post.finalize();
}
}

This servlet can be locally tested at:

http://localhost:4000/vis/markdown.png?text=hello%20world%0Dhello%20universe&color_text=000000&color_background=ffffff&padding=3

Or at SUSI.ai API Server http://api.susi.ai/vis/markdown.png?text=hello%20world%0Dhello%20universe&color_text=000000&color_background=ffffff&padding=


References

Oracle ImageIO Docs: https://docs.oracle.com/javase/7/docs/api/javax/imageio/ImageIO.html

Markdown Tutorial: https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet

Java 2D Graphics: http://docs.oracle.com/javase/tutorial/2d/


Implementing Text-to-Speech (TTS) in SUSI Android

Mobile assistants are designed to perform tasks that the user commands through a chat UI or speech. The Android OS already provides Text to Speech (TTS) and Speech to Text (STT) features, available from Android version 1.6 onward. In this blog post I will show how TTS is implemented in SUSI Android and how I fixed the issue of delay in the speech response.

The TextToSpeech class controls the TTS engine. To use it, import it in the activity where you want to use the text to speech feature.

import android.speech.tts.TextToSpeech;

After importing the TextToSpeech class, we need to initialize it:

TextToSpeech tts = new TextToSpeech(this,this);

Here the first parameter is the Context and the other one is the listener. The listener is used to inform our app that the engine is ready to use. In order to be notified we have to implement TextToSpeech.OnInitListener.

TextToSpeech.OnInitListener listener = new TextToSpeech.OnInitListener() {
    @Override
    public void onInit(int status) {
        if (status == TextToSpeech.SUCCESS)
            tts.setLanguage(Locale.UK); // set the default language
    }
};

If the status is SUCCESS, TTS was initialized successfully and we can use it; otherwise we can’t. The setLanguage method sets the language in which we want the reply. The engine can then be initialized as:

TextToSpeech tts = new TextToSpeech(getApplicationContext(), listener);

One thing to remember when using TTS is that it runs on the main thread, so it may cause delays in the text to speech conversion or block the UI for a while. It is better to wrap the initialization as in the code below.

new Handler().post(new Runnable() {
      @Override
      public void run() {
         tts = new TextToSpeech(getApplicationContext(), listener);
        }
    });

Now our engine is ready to speak; we simply need to pass it the string we want read out.

tts.speak(textToRead, TextToSpeech.QUEUE_FLUSH, null, null);

But before calling tts.speak, it is important to handle the audio focus change request, because only one audio source can have focus at a time. You can check this using the code below.

private AudioManager.OnAudioFocusChangeListener afChangeListener =
        new AudioManager.OnAudioFocusChangeListener() {
            public void onAudioFocusChange(int focusChange) {
                // check for focus
            }
        };

OnAudioFocusChangeListener is called when the audio focus of the system changes, and according to the value of focusChange we either stop TTS or keep using it.

AudioManager audiofocus = (AudioManager) getSystemService(Context.AUDIO_SERVICE);

audiofocus is an instance of the AudioManager class. We need it to call the requestAudioFocus method of the AudioManager class, which returns the status of the request for audio focus change. This method requires three parameters: an instance of AudioManager.OnAudioFocusChangeListener, the stream type and a duration hint. Only if the request is granted can we use tts.speak.

int result = audiofocus.requestAudioFocus(afChangeListener,AudioManager.STREAM_MUSIC, AudioManager.AUDIOFOCUS_GAIN);

if (result == AudioManager.AUDIOFOCUS_REQUEST_GRANTED) {

tts.speak(textToRead, TextToSpeech.QUEUE_FLUSH, null, null);

}

We were continuously facing the issue of delay in the speech response because the voiceReply method was implemented incorrectly: we were initializing TextToSpeech on each call of voiceReply, and since the onInit method runs on the main thread this caused a delay in the voice response. So I removed that, and instead of initializing TTS each time I used the instance that is already initialized when the activity is created.

String spoken = reply;

textToSpeech.speak(spoken, TextToSpeech.QUEUE_FLUSH, null, null);
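Putting it together, the reworked voiceReply might look roughly like the sketch below, where textToSpeech, audiofocus and afChangeListener are the instances created once when the activity is created (the method and variable names here are illustrative):

private void voiceReply(String reply) {
    // Reuse the TextToSpeech instance initialized in onCreate instead of creating a new one
    int result = audiofocus.requestAudioFocus(afChangeListener,
            AudioManager.STREAM_MUSIC, AudioManager.AUDIOFOCUS_GAIN);
    if (result == AudioManager.AUDIOFOCUS_REQUEST_GRANTED) {
        textToSpeech.speak(reply, TextToSpeech.QUEUE_FLUSH, null, null);
    }
}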

You can also control how the engine reads the text. For example, we can modify the pitch and the speech rate.

tts.setPitch((float)pitch);

tts.setSpeechRate((float)speed);


Managing States in SUSI MagicMirror Module

SUSI MagicMirror Module is a module for the MagicMirror project that lets you use SUSI directly on a MagicMirror. While developing the module, a problem I faced was managing the flow between the various stages of processing the user’s voice input and displaying SUSI’s output to the user. This was solved by introducing state management between the various states of the SUSI MagicMirror Module, namely:

  • Idle State: When SUSI MagicMirror Module is actively listening for a hotword.
  • Listening State: In this state, the user’s speech input from the microphone is recorded to a file.
  • Busy State: The user has finished speaking or timed out. Now, we need to transcribe the audio spoken by the user, send the response to SUSI server and speak out the SUSI response.

The flow between these states can be explained by the following diagram:

As is clear from the above diagram, a state cannot transition into every other state; only some transitions are allowed. Thus, we need a mechanism that guarantees only allowed transitions and ensures they trigger at the right time.

To achieve this, we first implement an abstract class State with the common properties of a state. We store the information about whether a state can transition into some other state in a map allowedStateTransitions, which maps the state names “idle”, “listening” and “busy” to their corresponding states. The transition method used to move from one state to another is implemented in the following way.

protected transition(state: State): void {
   if (!this.canTransition(state)) {
       console.error(`Invalid transition to state: ${state}`);
       return;
   }

   this.onExit();
   state.onEnter();
}

private canTransition(state: State): boolean {
   return this.allowedStateTransitions.has(state.name);
}

Here we first check if a transition is valid; then we exit the current state and enter the supplied state. We also define a state machine that initializes the default state of the mirror and defines the valid transitions for each state. Here is the constructor of the state machine.

constructor(components: IStateMachineComponents) {
        this.idleState = new IdleState(components);
        this.listeningState = new ListeningState(components);
        this.busyState = new BusyState(components);

        this.idleState.AllowedStateTransitions = new Map<StateName, State>([["listening", this.listeningState]]);
        this.listeningState.AllowedStateTransitions = new Map<StateName, State>([["busy", this.busyState], ["idle", this.idleState]]);
        this.busyState.AllowedStateTransitions = new Map<StateName, State>([["idle", this.idleState]]);

        this.currentState = this.idleState;
        this.currentState.onEnter();
}

Now the question arises: how do we detect when we need to transition from one state to another? For that, we subscribe to the Snowboy detector observable. We are using the Snowboy library for hotword detection. Snowboy detects whether an audio stream is silent, contains some sound, or contains the hotword. We bind all this information to an observable using the ReactiveX Observable pattern, which gives us a stream of events that we can subscribe to for the results. This can be understood from the following code snippet.

detector.on("silence", () => {
   this.subject.next(DETECTOR.Silence);
});

detector.on("sound", () => {});

detector.on("error", (error) => {
   console.error(error);
});

detector.on("hotword", (index, hotword) => {
   this.subject.next(DETECTOR.Hotword);
});
public get Observable(): Observable<DETECTOR> {
   return this.subject.asObservable();
}

Now, in the idle state, we subscribe to the values emitted by the observable of the detector to know when a hotword is detected to transition to the listening state. Here is the code snippet for the same.

this.detectorSubscription = this.components.detector.Observable.subscribe(
   (value) => {
   switch (value) {
       case DETECTOR.Hotword:
           this.transition(this.allowedStateTransitions.get("listening"));
           break;
   }
});

In the listening state, we subscribe to the states emitted by the detector observable to find when silence is detected so that we can stop recording the audio stream for processing and move to busy state.

this.detectorSubscription = this.components.detector.Observable.subscribe(
   (value) => {
   switch (value) {
       case DETECTOR.Silence:
           record.stop();
           this.transition(this.allowedStateTransitions.get("busy"));
           break;
   }
});

The task of speaking the audio and displaying results on the screen is done by a renderer. The communication to renderer is done via a RendererCommunicator object using a notification system. We also bind its events to an observable so that we know when SUSI has finished speaking the result. To transition from busy state to idle state, we subscribe to renderer observable in the following manner.

this.rendererSubscription = this.components.rendererCommunicator.Observable.subscribe((type) => {
   if (type === "finishedSpeaking") {
       this.transition(this.allowedStateTransitions.get("idle"));
   }
});

In this way, we transition between various states of MagicMirror Module for SUSI in an efficient manner.


Hotword Detection on SUSI MagicMirror with Snowboy

The Magic Mirror in the story “Snow White and the Seven Dwarfs” had one cool feature: the Queen could call the Mirror just by saying “Mirror” and then ask it questions. The MagicMirror project helps you develop a mirror quite close to the one in the fable, and how cool would it be to have the same feature? Hotword detection on the SUSI MagicMirror Module helps us achieve exactly that.

The hotword detection on SUSI MagicMirror Module was accomplished with the help of Snowboy Hotword Detection Library. Snowboy is a cross platform hotword detection library. We are using the same library for Android, iOS as well as in MagicMirror Module (nodejs).

Snowboy can be added to a Javascript/Typescript project with Node Package Manager (npm) by:

$ npm install --save snowboy

For detecting the hotword, we need to record audio continuously from the microphone. To accomplish the task of recording, we use another npm package, node-record-lpcm16, which uses the SoX binary to record audio. First we need to install SoX:

Linux (Debian based distributions)

$ sudo apt-get install sox libsox-fmt-all

Then, you can install the node-record-lpcm16 package using npm:

$ npm install node-record-lpcm16

Then, we import it in the file where it is needed:

import * as record from "node-record-lpcm16";

You may then create a new microphone stream using,

const mic = record.start({
   threshold: 0,
   sampleRate: 16000,
   verbose: true,
});

The mic constant here is a NodeJS Readable Stream. So, we can read the incoming data from the Microphone and process it.

We can now process this stream using the Detector class of Snowboy. We declare a child class extending the Snowboy Detector to suit our needs.

import { Detector, Models } from "snowboy";

export class HotwordDetector extends Detector {
  
   constructor(models: Models) {
       super({
           resource: `${process.env.CWD}/resources/common.res`,
           models: models,
           audioGain: 2.0,
       });
       this.setUp();
   }

   // other methods
}

First, we create a Snowboy detector by calling the parent constructor with the resource file common.res and a Snowboy model as arguments. A Snowboy model is a file which tells the detector which hotword to listen for. Currently, the module supports the hotword “Susi”, but it can be extended to support other hotwords like “Mirror” too. You can train the SUSI hotword for your own voice and get the latest model file at https://snowboy.kitt.ai/hotword/7915. You may then replace the susi.pmdl file in the resources folder with your own susi.pmdl file for a better experience.

Now, we need to delegate the callback methods of Detector class to know about the current state of detector and take an action on its basis. This is done in the setUp() method.

private setUp(): void {
   this.on("silence", () => {
      // handle silent state
   });

   this.on("sound", () => {
      // handle sound detected state
   });

   this.on("error", (error) => {
      // handle error
   });

   this.on("hotword", (index, hotword) => {
      // hotword detected 
   });
}

If you look into the implementation of Snowboy’s Detector class, it extends NodeJS.WritableStream. So we can pipe our microphone input stream into the Detector class and it handles all the states. This can be done using:

mic.pipe(detector as any);

So now all the input from the microphone will be processed by the Snowboy detector class, and we can know when the user has spoken the word “Susi”. We can then start speech recognition and make other changes in the user interface based on the different states.

After this, we can simply say “Susi” followed by our query to ask SUSI on the MagicMirror. A video implementation of the same can be seen here: 


Implementing Speech to Text for Chrome in SUSI Web Chat

SUSI Web Chat now replies to voice inputs. To achieve this, I made use of the Web Speech API. The voice input saves the user from the pain of typing, and it is a much needed feature for the Web Chat to maintain parity with the SUSI Android and SUSI iOS clients.

To test the feature out in SUSI Web Chat, click on the microphone icon beside the text area on chat.susi.ai.

Say the message once the dialog appears, and you will see the message being sent to the Chat List rendered in text.

Let’s achieve the same result following the steps below.

  1. First, initialize the class Voice Recognition with defaults for the Speech Recognition, for that we create a file VoiceRecognition.js
  1. We first initialize the Speech Recognition API with the window object.
  2. We warn the User with a console message if there is no Speech Recognition API available.
  3. If it’s available call the recognition function using the following line

this.recognition = this.createRecognition(SpeechRecognition)

// Initialise the Speech recognition API
const SpeechRecognition = window.SpeechRecognition
      || window.webkitSpeechRecognition
      || window.mozSpeechRecognition
      || window.msSpeechRecognition
      || window.oSpeechRecognition
    // Warn the user if not available otherwise call the createRecognition function
    if (SpeechRecognition != null) {
      this.recognition = this.createRecognition(SpeechRecognition)
    } else {
      console.warn('The current browser does not support the SpeechRecognition API.');
    }
  }
  1. Then we write the createRecognition function
  1. We set our defaults first: continuous – true, interimResults – false, and lang – ‘en-US’.
  2. We pass these options to the recognition object that we created in the above step and finally return the recognition object.
createRecognition = (SpeechRecognition) => {
    const defaults = {
      continuous: true,
      interimResults: false,
      lang: 'en-US'
    }

    const options = Object.assign({}, defaults, this.props)

    let recognition = new SpeechRecognition()

    recognition.continuous = options.continuous
    recognition.interimResults = options.interimResults
    recognition.lang = options.lang

    return recognition
  }
  1. Initialize all the helper functions to be passed as props.
  1. start – This method starts the recognition and invokes the Mic of the browser. It also checks if the browser has access to the user’s Mic.
  2. stop – The stop method closes the Mic and returns the audio captured so far.
  3. abort – The abort method stops the SpeechRecognition service.
  4. onspeechend – This method is called if there is inactivity and no voice input, and hence stops the recognition service.
  5. componentWillReceiveProps – This method waits for the stop prop and calls the stop method when it receives it.
  6. componentWillUnmount – This method is invoked just before the component unmounts, and therefore its function is to abort the Speech Recognition service.
  7. render – We return null as there is nothing to render in this component; all the converted text of the captured speech is sent to the parent element.
start = () => {
    this.recognition.start()
  }

  stop = () => {
    this.recognition.stop()
  }

  abort = () => {
    this.recognition.abort()
  }
  onspeechend = () => {
    console.log('no sound detected');
    this.recognition.stop()
  }

  componentWillReceiveProps ({ stop }) {
    if (stop) {
      this.stop()
    }
  }

  componentWillUnmount () {
    this.abort()
  }

  render () {
    return null
  }
  1. Add event listeners to start and stop functions inside componentDidMount() to ensure every action that we want to perform from the parent element is after the component has successfully mounted itself.
  1. start – The start method is set with an action start so that we can pass the required action name to the VoiceRecognition component that we created
  2. end – The end method similarly is set with an action end
  3. After setting up the actions we finally call the bindResult function with the result that we received.
componentDidMount () {
    const events = [
      { name: 'start', action: this.props.onStart },
      { name: 'end', action: this.props.onEnd },
      { name: 'onspeechend', action: this.props.onspeechend }
    ]

    events.forEach(event => {
      this.recognition.addEventListener(event.name, event.action)
    })

    this.recognition.addEventListener('result', this.bindResult)

    this.start()
  }
  1. Bind the result and send it as the props to the parent element.
  2. Combine all interim results of the recognition and send it to the onResult function as finalTranscript
  3. The function bindResult does all the binding of the interim results that we receive and outputs the final result as finalTranscript.
  4. Lastly, we add the prop validations to ensure the correct props are being passed to our VoiceRecognition component.
// bindResult function
 bindResult = (event) => {
    let interimTranscript = ''
    let finalTranscript = ''
   // Bind all the results to finalTranscript
    for (let i = event.resultIndex; i < event.results.length; ++i) {
      if (event.results[i].isFinal) {
        finalTranscript += event.results[i][0].transcript
      } else {
        interimTranscript += event.results[i][0].transcript
      }
    }

    this.props.onResult({ interimTranscript, finalTranscript })
  }
// Add Prop Validations
VoiceRecognition.propTypes = {
  onStart: PropTypes.func,
  onEnd: PropTypes.func,
  onResult: PropTypes.func,
  onspeechend: PropTypes.func,
  continuous: PropTypes.bool,
  lang: PropTypes.string,
  stop: PropTypes.bool
};
// Finally export the VoiceRecognition Component
export default VoiceRecognition
  1. Lastly, call the VoiceRecognition component and pass the props from the MessageComposer Section to it in the following way.
  1. Initialize the default state in the constructor inside this.state
this.state = {
      text: '',
      start: false, // Starting the VoiceRecognition
      stop: false, // Stop the VoiceRecognition
      open: false, // Maintain the modal state
      result:'' // Maintain the result state
    };
  1. onStart function to call the VoiceRecognition component only when the Mic Button is pressed.
  2. onEnd to end the Speech Recognition service.
  3. onResult to send the message through the Actions.createMessage() function
onResult = ({interimTranscript,finalTranscript }) => {
    let result = interimTranscript;
    let voiceResponse = false;
    this.setState({result:result});
    if(finalTranscript) {
      result = finalTranscript;
      this.setState({
      start: false,
      result:result,
      stop: false,
      open:false,
      animate:false
      });
      if(this.props.speechOutputAlways || this.props.speechOutput){
        voiceResponse = true;
      }
      Actions.createMessage(result, this.props.threadID, voiceResponse);
      setTimeout(()=>this.setState({result: ''}),400);
      this.Button = <Mic />
    }
  }
  1. Fire the component based on the value of start variable and pass the requisite props as given below in the code.
// Only when the start is ‘true’ call the VoiceRecognition component

    {this.state.start && (
          <VoiceRecognition
            onStart={this.onStart}
            onEnd={this.onEnd}
            onResult={this.onResult}
            continuous={true}
            lang="en-US"
            stop={this.state.stop}
          />
        )}

 

  1. Update the text in the “Speak Now” Dialog to show the user the Speech to Text conversion
  1. Update the text in the Modal when it is converted from Speech to Text, i.e. when we set the state of the result variable.
{this.state.result !=='' ? this.state.result :
          'Speak Now...'}

To get access to the full code, go to the repository https://github.com/fossasia/chat.susi.ai


Implementing Text to Speech on SUSI Web Chat

SUSI Web Chat now gives voice replies while chatting with it similar to SUSI Android and SUSI iOS clients. To test the Voice Output on Chrome,

  1. Visit chat.susi.ai
  2. Click on the Mic input button.
  3. Say something using the Mic when the Speak Now View appears

The simplest way to add text-to-speech to your website is by using the speech synthesis interface of the Web Speech API, which is currently well supported in the Chrome browser. The following steps help to achieve it in ReactJS.

  1. Initialize state defaults in a component named, VoicePlayer.
const defaults = {
      text: '',
      volume: 1,
      rate: 1,
      pitch: 1,
      lang: 'en-US'
    } 

There are two state values which need to be maintained throughout the component and passed on through our props. Thus our state is simply

this.state = {
      started: false,
      playing: false
    }
  1. Our next step is to make use of functions of the Web Speech API to carry out the relevant processes.
  1. speak() – window.speechSynthesis.speak(this.speech) – Calls the speak method of the Speech API
  2. cancel() – window.speechSynthesis.cancel() – Calls the cancel method of the Speech API
  1. We then use our component helper functions to assign eventListeners and listen to any changes occurring in the background. For this purpose we make use of the functions componentWillReceiveProps(), shouldComponentUpdate(), componentDidMount(), componentWillUnmount(), and render().
  1. componentWillReceiveProps() – receives the object parameter {pause} to listen to any paused action
  2. shouldComponentUpdate() simply returns false if no updates are to be made in the speech outputs.
  3. componentDidMount() is the master function which listens to the action start, adds the eventListener start and end, and calls the speak() function if the prop play is true.
  4. componentWillUnmount() destroys the speech object and ends the speech.

Here’s a code snippet for Function componentDidMount()

componentDidMount () {
  
    const events = [
      { name: 'start', action: this.props.onStart }
    ]
    // Adding event listeners
    events.forEach(e => {
      this.speech.addEventListener(e.name, e.action)
    })

    this.speech.addEventListener('end', () => {
      this.setState({ started: false })
      this.props.onEnd()
    })

    if (this.props.play) {
      this.speak()
    }
  }
  1. We then add proper props validation in the following way to our VoicePlayer component.
VoicePlayer.propTypes = {
  play: PropTypes.bool,
  text: PropTypes.string,
  onStart: PropTypes.func,
  onEnd: PropTypes.func
};
  1. The next step is to pass the props from a listener view to the VoicePlayer component. Hence the listener here is the component MessageListItem.js from where the voice player is initialized.
  1. First step is to initialise the state.
this.state = {
      play: false,
    }
  onStart = () => {
    this.setState({ play: true });
  }
  onEnd = () => {
    this.setState({ play: false });
  }
  1. Next, we set play to true when we want to pass the props and the text which is to be spoken, and append the VoicePlayer to the message list items which have voice set to `true`
 { this.props.message.voice &&
      (<VoicePlayer
             play
             text={voiceOutput}
             onStart={this.onStart}
             onEnd={this.onEnd}
 />)}

Finally, our messages with voice set to true will be heard on the speaker just as they were spoken into the microphone. To get access to the full code, go to the repository https://github.com/fossasia/chat.susi.ai or join our chat channel at gitter.im/fossasia/susi_webchat

Resources

  1. Speak-easy-synthesis repository http://mdn.github.io/web-speech-api/speak-easy-synthesis
  2. Web-speech-api repository https://github.com/mdn/web-speech-api/

Building Preference Screen in SUSI Android

SUSI provides various preferences in its settings to customize the app. This allows users to configure the application according to their own choice. There are different preferences available, such as selecting the theme or the language for text to speech. A preference settings activity is an important part of an Android application. Here we will see how we can implement it in an Android app, taking SUSI Android (https://github.com/fossasia/susi_android) as the example.

Firstly, we add the Gradle dependency for the settings preferences:

compile 'com.takisoft.fix:preference-v7:25.4.0.3'

Then, to create a custom style for our settings preference screen, we can set @style/PreferenceFixTheme as the base theme and apply various other modifications and colors over it. By default it provides the usual Day and Night themes with the NoActionBar extension.

Now, to create different preferences we can use different classes as shown below:

SwitchPreferenceCompat: This gives us a switch preference which we can use to toggle between two different modes in the settings.

EditTextPreference: This preference allows the user to enter a number or string of their own choice in the settings, which can then be used for different actions.

For more details on this you can refer to this link.
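For illustration, the values stored by such preferences can later be read back anywhere in the app through SharedPreferences. A minimal sketch (the key names "enter_send" and "server_url" are made up for this example):

SharedPreferences prefs = PreferenceManager.getDefaultSharedPreferences(context);
// value written by a SwitchPreferenceCompat (hypothetical key)
boolean sendOnEnter = prefs.getBoolean("enter_send", true);
// value written by an EditTextPreference (hypothetical key)
String serverUrl = prefs.getString("server_url", "https://api.susi.ai");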

Implementation in SUSI Android

In SUSI Android we have created a settings activity whose layout, activity_settings, holds the Preference Fragment for the settings.

<?xml version="1.0" encoding="utf-8"?>


<fragment

  xmlns:android="http://schemas.android.com/apk/res/android"

  xmlns:tools="http://schemas.android.com/tools"

  android:id="@+id/chat_pref"

  android:name="org.fossasia.susi.ai.activities.SettingsActivity$ChatSettingsFragment"

  android:layout_width="match_parent"

  android:layout_height="match_parent"

  tools:context="org.fossasia.susi.ai.activities.SettingsActivity"/>

The Preference Settings Fragment contains different preference categories that give the user various customization options while using the app. The pref_settings.xml is as follows:

<?xml version="1.0" encoding="utf-8"?>

<PreferenceScreen xmlns:android="http://schemas.android.com/apk/res/android"

  xmlns:app="http://schemas.android.com/apk/res-auto"

  android:title="@string/settings_title">


  <PreferenceCategory

      android:title="@string/server_settings_title">

      <PreferenceScreen

          android:title="@string/server_pref"

          android:key="Server_Select"

          android:summary="@string/server_select_summary">

      </PreferenceScreen>

  </PreferenceCategory>


  <PreferenceCategory

      android:title="@string/settings_title">

      <com.takisoft.fix.support.v7.preference.SwitchPreferenceCompat

          android:id="@+id/enter_key_pref"

          android:defaultValue="true"

          android:key="@string/settings_enterPreference_key"

          android:summary="@string/settings_enterPreference_summary"

          android:title="@string/settings_enterPreference_label" />

  </PreferenceCategory>

All the logic related to the preferences and their actions is written in the SettingsActivity Java class. It listens for any change in the preference options and takes action accordingly in the following way.

public class SettingsActivity extends AppCompatActivity {

   private static final String TAG = "SettingsActivity";
   private static SharedPreferences prefs;

   @Override
   protected void onCreate(Bundle savedInstanceState) {
       super.onCreate(savedInstanceState);
       prefs = getSharedPreferences(Constant.THEME, MODE_PRIVATE);
       Log.d(TAG, "onCreate: " + (prefs.getString(Constant.THEME, DARK)));
       if (prefs.getString(Constant.THEME, "Light").equals("Dark")) {
           setTheme(R.style.PreferencesThemeDark);
       } else {
           setTheme(R.style.PreferencesThemeLight);
       }
       setContentView(R.layout.activity_settings);
   }

The class contains a ChatSettingsFragment which extends PreferenceFragmentCompat and attaches click listeners such as onPreferenceClick to the preferences. The code below shows part of the implementation.

           // Opens the system Text-to-Speech settings when the corresponding preference is clicked
           public boolean onPreferenceClick(Preference preference) {
               Intent intent = new Intent();
               intent.setComponent(new ComponentName("com.android.settings",
                       "com.android.settings.Settings$TextToSpeechSettingsActivity"));
               startActivity(intent);
               return true;
           }
       });

       // Opens the app's Play Store page when the "rate the app" preference is clicked
       rate = (Preference) getPreferenceManager().findPreference(Constant.RATE);
       rate.setOnPreferenceClickListener(new Preference.OnPreferenceClickListener() {
           @Override
           public boolean onPreferenceClick(Preference preference) {
               startActivity(new Intent(Intent.ACTION_VIEW,
                       Uri.parse("http://play.google.com/store/apps/details?id=" + getContext().getPackageName())));
               return true;
           }
       });
}

To dive deeper into the code, you can refer to the GitHub repository of SUSI Android (https://github.com/fossasia/susi_android).
