Managing States in SUSI MagicMirror Module

SUSI MagicMirror Module is a module for the MagicMirror project that lets you use SUSI directly on a MagicMirror. While developing the module, one problem I faced was managing the flow between the various stages of processing the user's voice input and displaying SUSI's output. This was solved by managing the flow between the various states of the SUSI MagicMirror Module, namely:

  • Idle State: When SUSI MagicMirror Module is actively listening for a hotword.
  • Listening State: In this state, the user’s speech input from the microphone is recorded to a file.
  • Busy State: The user has finished speaking or has timed out. Now we need to transcribe the audio spoken by the user, send the transcription to the SUSI server, and speak out SUSI's response.

The flow between these states can be explained by the following diagram:

As the diagram above makes clear, a state cannot transition into every other state; only some transitions are allowed. Thus, we need a mechanism that guarantees only allowed transitions occur and ensures they trigger at the right time.

To achieve this, we first implement an abstract class State with the common properties of a state. We store the information about whether a state can transition into some other state in a map, allowedStateTransitions, which maps the state names "idle", "listening" and "busy" to their corresponding states. The transition method for moving from one state to another is implemented in the following way.

protected transition(state: State): void {
   // Ignore transitions that are not allowed from the current state.
   if (!this.canTransition(state)) {
       console.error(`Invalid transition to state: ${state.name}`);
       return;
   }

   // Leave the current state and enter the new one.
   this.onExit();
   state.onEnter();
}

private canTransition(state: State): boolean {
   return this.allowedStateTransitions.has(state.name);
}

Here we first check if a transition is valid; then we exit the current state and enter the supplied state. We also define a state machine that initializes the default state of the mirror and defines the valid transitions for each state. Here is the constructor for the state machine.

constructor(components: IStateMachineComponents) {
        this.idleState = new IdleState(components);
        this.listeningState = new ListeningState(components);
        this.busyState = new BusyState(components);

        this.idleState.AllowedStateTransitions = new Map<StateName, State>([["listening", this.listeningState]]);
        this.listeningState.AllowedStateTransitions = new Map<StateName, State>([["busy", this.busyState], ["idle", this.idleState]]);
        this.busyState.AllowedStateTransitions = new Map<StateName, State>([["idle", this.idleState]]);

        this.currentState = this.idleState;
        this.currentState.onEnter();
}
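
For context, here is a minimal sketch of what the abstract State class assumed by the snippets above could look like. The property and method names are inferred from the excerpts in this post; everything else is illustrative.

type StateName = "idle" | "listening" | "busy";

abstract class State {
    // Filled in by the state machine after construction (see the constructor above);
    // the transition and canTransition methods shown earlier also live on this class.
    public AllowedStateTransitions: Map<StateName, State> = new Map();

    constructor(public readonly name: StateName) {}

    protected get allowedStateTransitions(): Map<StateName, State> {
        return this.AllowedStateTransitions;
    }

    // Each concrete state (IdleState, ListeningState, BusyState) implements these.
    public abstract onEnter(): void;
    public abstract onExit(): void;
}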

Now the question arises: how do we detect when we need to transition from one state to another? For that, we subscribe to the Snowboy Detector Observable. We are using the Snowboy library for hotword detection. Snowboy detects whether an audio stream is silent, has some sound, or contains the hotword. We bind all this information to an observable using the ReactiveX Observable pattern. This gives us a stream of events to which we can subscribe and get the results. It can be understood in the following code snippet.

detector.on("silence", () => {
   this.subject.next(DETECTOR.Silence);
});

detector.on("sound", () => {});

detector.on("error", (error) => {
   console.error(error);
});

detector.on("hotword", (index, hotword) => {
   this.subject.next(DETECTOR.Hotword);
});
public get Observable(): Observable<DETECTOR> {
   return this.subject.asObservable();
}

Now, in the idle state, we subscribe to the values emitted by the detector's observable to know when a hotword is detected, so that we can transition to the listening state. Here is the code snippet for the same.

this.detectorSubscription = this.components.detector.Observable.subscribe(
   (value) => {
   switch (value) {
       case DETECTOR.Hotword:
           this.transition(this.allowedStateTransitions.get("listening"));
           break;
   }
});

In the listening state, we subscribe to the events emitted by the detector observable to find out when silence is detected, so that we can stop recording the audio stream for processing and move to the busy state.

this.detectorSubscription = this.components.detector.Observable.subscribe(
   (value) => {
   switch (value) {
       case DETECTOR.Silence:
           record.stop();
           this.transition(this.allowedStateTransitions.get("busy"));
           break;
   }
});

The task of speaking the audio and displaying results on the screen is done by a renderer. Communication with the renderer is done via a RendererCommunicator object using a notification system. We also bind its events to an observable so that we know when SUSI has finished speaking the result. To transition from the busy state to the idle state, we subscribe to the renderer observable in the following manner.

this.rendererSubscription = this.components.rendererCommunicator.Observable.subscribe((type) => {
   if (type === "finishedSpeaking") {
       this.transition(this.allowedStateTransitions.get("idle"));
   }
});

In this way, we transition between the various states of the SUSI MagicMirror Module in an efficient manner.


Implementation of Responsive SUSI Web Chat Search Bar

When we were building the first phase of the SUSI Web Chat application, we didn't treat responsiveness as a main goal; the main thing we needed was a working application. This changed at a later stage. In this post I'm going to explain how we implemented the responsive design and the problems we met while developing it.

When we were moving from static HTML and CSS to Material-UI we were able to make most of the parts responsive, for example the app bar of the application. We added the app bar as follows: we made a separate component for the app bar, and it includes the "searchfield" element. The Material-UI app bar handles responsiveness to some extent; we have to handle the responsiveness of the other sub-parts of the app bar manually.

In "TopBar.react.js" I returned the Material-UI <Toolbar> element like this.

<Toolbar>
    <ToolbarGroup>
    </ToolbarGroup>
    <ToolbarGroup lastChild={true}>
        {/* the other components of the top bar go inside this element */}
    </ToolbarGroup>
</Toolbar>

We have to add the search component inside this second ToolbarGroup; a sketch of how this could look follows.
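
The sketch below assumes the search component lives in SearchField.react.js (mentioned later in this post); the actual TopBar.react.js may differ.

import React from 'react';
import { Toolbar, ToolbarGroup } from 'material-ui/Toolbar';
import SearchField from './SearchField.react';

// Sketch only: the real TopBar component contains more than this.
const TopBar = () => (
    <Toolbar>
        <ToolbarGroup />
        <ToolbarGroup lastChild={true}>
            <SearchField />
        </ToolbarGroup>
    </Toolbar>
);

export default TopBar;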

This field has the ability to expand and collapse like this.

It looks good, but it appears differently on a mobile screen. This is how it appears on mobile devices.

So we wanted to hide the SUSI logo on small screens. For that we wrote media queries like this.

@media only screen and (max-width: 860px){
 .app-bar-search{
   background-image: none;
 }
 .search{
   width: 100px !important;
 }
}

Even on smaller screens it still appears like this.

To avoid that, we minimized the width of the search bar at different screen sizes using media queries.

@media only screen and (max-width: 480px){
 .search{
   width: 100px !important;
 }
}
@media only screen and (max-width: 360px){
 .search{
   width: 65px !important;
 }
}

But on even smaller screens it still appears the same way; the screen is simply not wide enough to show the search bar.
So we wrote another media query to hide all the elements of the search component on small screens except the close button, because when we collapse the screen in search mode it hides all the search components and the message composer. To take the user back to chat mode, we have to keep the close button enabled on smaller screens.

@media only screen and (max-width: 300px){
 .displayNone{
   display: none !important;
 }
 .displayCloseNone{
     position: relative;
     top:6px  !important;
 }
}

We then apply these two classes in the "SearchField.react.js" file.

<IconButton className='displayNone'>
    <SearchIcon />
</IconButton>
<TextField name='search'
    className='search displayNone'
    placeholder="Search..."/>
<IconButton className='displayNone'>
    <UpIcon />
</IconButton>
<IconButton className='displayNone'>
    <DownIcon />
</IconButton>
<IconButton className='displayCloseNone'>
    <ExitIcon />
</IconButton>

Since we have Codacy integrated into our GitHub repository, we had to change the Codacy rules, because we used "!important" in the media queries to override the inline styles that come from Material-UI.
To change the Codacy rules, simply log in to Codacy and select the "code pattern" button in the left-side column.

It shows the list of rules that Codacy checks, and you can see the "!important" rule under the CSSLint category like this.

Just uncheck it. Then Codacy will not flag "!important" in your source code.

Resources
Configuring Codacy: Use Your Own Conventions: https://blog.codacy.com/configuring-codacy-use-your-own-conventions-9272bee5dcdb
Media queries: https://www.w3schools.com/css/css_rwd_mediaqueries.asp


The Pocket Science Lab: Who Needs it, and Why

Science and technology share a symbiotic relationship. The degree of success of experimentation largely depends on the accuracy and flexibility of the instrumentation tools at the scientist's disposal, and the subsequent findings in the fundamental sciences drive innovation in technology itself. In addition, knowledge must be free as in freedom: all the information needed to construct such tools and to use them must be freely accessible to the next generation of citizen scientists. A common platform for sharing results can also be considered on the path to building a better open knowledge network.

But before we get to scientists, we need to consider the talent pool in the student community that gave rise to successful scientists, and the potential talent pool that lost out on the opportunity to better contribute to society because of an inadequate support system. And this brings us to the Pocket Science Lab.

How can PSLab help electronics engineers & students?

This device packs a variety of fundamental instruments into one handy package, with a bill of materials several orders of magnitude cheaper than a distributed set of traditional instruments.

It does not claim to be as good as a giga-samples-per-second oscilloscope or a 22-bit multimeter, but it has the potential to offer a greater learning experience. Here's how:

  • A fresh perspective to characterize the real world. The visualization tools that can be coded on an Android device or desktop (3D surface plots, waterfall charts, thermal distributions, etc.) are far more advanced than what one can expect from a reasonably priced oscilloscope. If the same needs to be achieved with an ordinary scope, a certain level of technical expertise is expected from the user, who must interface the oscilloscope with a computer and write their own acquisition and visualization app.
  • Reduce the entry barrier for advanced experiments. All the tools are tightly integrated in a cost-effective package, and even the average undergraduate student who has been instructed to walk on eggshells around a conventional scope can now perform elaborate data acquisition tasks, such as plotting the resonant frequency of a tuning fork as a function of relative humidity or temperature. The companion app is being designed to offer varying levels of flexibility as demanded by the target audience.
  • Is there a doctor in the house? With the feature set available in the PSLab, most common electronic components can be easily studied, saving hours while prototyping new designs. Components such as resistors, capacitors, diodes, transistors, op-amps, LEDs and buffers can be tested.

How can PSLab help science enthusiasts ?

Physicists, chemists and biologists in the applied fields are mostly dependent on instrument vendors for their measurement gear. The lack of an electronics background hinders their ability to improve the gear at their disposal, and this is why a gauss meter, which is basically a magnetometer coupled with a crude display in an oversized box with an unnecessarily huge transformer, can easily cost upwards of $150. The PSLab does not ask the user to be an electronics or robotics expert, but helps them get straight to the acquisition part. It takes care of the communication protocols and calibration requirements, and also handles visualization via attractive plots.

A physicist might not know what I2C is, but is more than qualified to interpret the data acquired from a physical sensor and characterize its accuracy.

  • The magnetometer (HMC5883L) can be used to demonstrate the dependence of the axial magnetic field on the distance from the center of a solenoid.
  • The pressure and temperature sensor (BMP280) can be used to verify the gas laws and to test thermodynamic phenomena against prevalent theories.

Similarly, a chemist can use an RGB sensor (TCS3200) to put the colour of a solution into numbers, and develop a colorimeter in the process. Colorimeters are quite handy for determining the molality of coloured solutions, and commercial ones are rather expensive. What a colorimeter also needs is a set of LEDs with known wavelengths, and most manufacturers offer proper characterisation information.

What does it mean for the hobbyist?

It is capable of greatly speeding up the troubleshooting process. It can also instantly characterize the expected data from various sensors so that the hobbyist can code accordingly. For example: beyond what tilt threshold and velocity should my humanoid robot swing its arms forward in order to prevent a broken nose? That's not a question that can be easily answered by a hobbyist who is still in the process of developing his or her own acquisition system.

How can we involve the community?

The PSLab features an experiment designer that speeds up acquisition by providing spreadsheets, analytical tools, and visualisation options all in one place. An option for users to upload their new experiments and utilities to the cloud, and subject them to a peer-review process, has been planned. These new experiments can then be pumped back into the ecosystem, which will find more uses for them, improve them, and so on.

For example, a user can combine the waveform generator with an analog multiplier IC and develop a spectrum analyzer.

The case for self-reliance

The average undergraduate laboratory currently employs dedicated instruments for each experiment as prescribed by the curriculum. These instruments often include only the measurement tools essential to the experiment, and students merely repeat the procedure verbatim. That's not experimentation; it's just verification. The PSLab offers a wide array of additional instruments that can be employed by the student to enhance the experiment with their own inputs.

For example, a commonly used diode I-V curve-tracer kit usually has a couple of power supplies, a voltmeter, and an ammeter. But if a student wishes to study the impact of temperature on the band gap, they will be hard pressed to find the additional tools and software needed to tie the acquisition together. With the PSLab, however, they can pick from a variety of temperature sensors (LM35, BMP180, Si7021, ...) depending on the requirement, and explore beyond the book. They are thus better prepared to enter research labs.

In conclusion, this project has immense potential to help create the next generation of scientists, engineers and creators.


Hotword Detection on SUSI MagicMirror with Snowboy

The Magic Mirror in the story "Snow White and the Seven Dwarfs" had one cool feature: the Queen could call the Mirror just by saying "Mirror" and then ask it questions. The MagicMirror project helps you develop a mirror quite close to the one in the fable, but how cool would it be to have the same feature? Hotword detection on the SUSI MagicMirror Module helps us achieve that.

The hotword detection on the SUSI MagicMirror Module was accomplished with the help of the Snowboy Hotword Detection Library. Snowboy is a cross-platform hotword detection library. We are using the same library for Android, iOS, as well as in the MagicMirror Module (Node.js).

Snowboy can be added to a JavaScript/TypeScript project with the Node Package Manager (npm) by:

$ npm install --save snowboy

For detecting the hotword, we need to record audio continuously from the microphone. To accomplish the task of recording, we have another npm package, node-record-lpcm16. It uses the SoX binary to record audio. First, we need to install SoX using:

Linux (Debian based distributions)

$ sudo apt-get install sox libsox-fmt-all

Then, you can install the node-record-lpcm16 package using npm:

$ npm install node-record-lpcm16

Then, we need to import it in the required file using:

import * as record from "node-record-lpcm16";

You may then create a new microphone stream using,

const mic = record.start({
   threshold: 0,
   sampleRate: 16000,
   verbose: true,
});

The mic constant here is a Node.js Readable stream, so we can read the incoming data from the microphone and process it.

We can now process this stream using the Detector class of Snowboy. We declare a child class extending the Snowboy Detector to suit our needs.

import { Detector, Models } from "snowboy";

export class HotwordDetector extends Detector {
  
   constructor(models: Models) {
       super({
           resource: `${process.env.CWD}/resources/common.res`,
           models: models,
           audioGain: 2.0,
       });
       this.setUp();
   }

   // other methods
}

First, we create a Snowboy Detector by calling the parent constructor with the resource file common.res and a Snowboy model as arguments. A Snowboy model is a file which tells the detector which hotword to listen for. Currently, the module supports the hotword "Susi", but it can be extended to support other hotwords like "Mirror" too. You can train the hotword for SUSI with your own voice and get the latest model file at https://snowboy.kitt.ai/hotword/7915. You may then replace the susi.pmdl file in the resources folder with your own susi.pmdl file for a better experience.

Now, we need to delegate the callback methods of the Detector class to know the current state of the detector and take action based on it. This is done in the setUp() method.

private setUp(): void {
   this.on("silence", () => {
      // handle silent state
   });

   this.on("sound", () => {
      // handle sound detected state
   });

   this.on("error", (error) => {
      // handle error
   });

   this.on("hotword", (index, hotword) => {
      // hotword detected 
   });
}

If you look at the implementation of Snowboy's Detector class, it extends NodeJS.WritableStream. So we can pipe our microphone input stream to the Detector class and it handles all the states. This can be done using:

mic.pipe(detector as any);

So now all the input from the microphone will be processed by the Snowboy detector class, and we can know when the user has spoken the word "Susi". We can then start speech recognition and make other changes in the user interface based on the different states.
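
Putting the pieces together, a minimal wiring sketch could look like the following; the model path, sensitivity and hotword string are assumptions rather than the module's exact values.

import * as record from "node-record-lpcm16";
import { Models } from "snowboy";
import { HotwordDetector } from "./HotwordDetector"; // the subclass shown above

// Register the hotword model (path and sensitivity are assumptions).
const models = new Models();
models.add({
    file: `${process.env.CWD}/resources/susi.pmdl`,
    sensitivity: "0.5",
    hotwords: "susi",
});

const detector = new HotwordDetector(models);

// Record from the microphone and pipe the audio into the detector.
const mic = record.start({
    threshold: 0,
    sampleRate: 16000,
    verbose: true,
});
mic.pipe(detector as any);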

After this, we can simply say “Susi” followed by our query to ask SUSI on the MagicMirror. A video implementation of the same can be seen here: 


Using API Blueprint with Yaydoc

As part of extending the capability of Yaydoc to document APIs, this week we integrated API Blueprint with Yaydoc. Now we can parse apib files and add the parsed content to the generated documentation. From the official homepage of API Blueprint:

API Blueprint is simple and accessible to everybody involved in the API lifecycle. Its syntax is concise yet expressive. With API Blueprint you can quickly design and prototype APIs to be created or document and test already deployed mission-critical APIs. It is a documentation-oriented web API description language. The API Blueprint is essentially a set of semantic assumptions laid on top of the Markdown syntax used to describe a web API.

To integrate API Blueprint with Yaydoc, we used a Sphinx extension named sphinxcontrib-apiblueprint. This extension can directly translate text in API Blueprint format into docutils nodes. The advantage of this approach, compared to using tools like aglio, is that the generated HTML fits in nicely with the existing theme, though we may in the future provide the ability to generate HTML using tools like aglio if the user prefers. Adding an extension to Sphinx is very easy. In the conf.py template, we added the extension to the already enabled list of extensions.

extensions += ['sphinxcontrib.apiblueprint']

The above extension provides a directive, apiblueprint, which can then be used to include apib files. The directive is very similar to the built-in include directive; the difference is just that it should only be used to include files in API Blueprint format. You can see an example below of how to use this directive.

.. apiblueprint:: <path to apib file>
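
For illustration, a minimal apib file that could be included this way might look like the following made-up example:

FORMAT: 1A

# Polls API
A simple API that lets consumers view polls.

## Questions Collection [/questions]

### List All Questions [GET]

+ Response 200 (application/json)

        [
            {
                "question": "Favourite programming language?",
                "choices": ["Swift", "Python", "Objective-C", "Ruby"]
            }
        ]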

Although this is enough for projects which use the reST markup format, it cannot be used with projects using Markdown as the primary markup format, since Markdown doesn't support the concept of directives. To solve this, we used the eval_rst block provided by recommonmark in Yaydoc. It allows users to embed valid reST within Markdown, and recommonmark will properly parse the embedded text as reST. A user can now use this to write directives within Markdown. You can see an example below.

```eval_rst
.. apiblueprint:: <path to apib file>
```

In order to implement this, we used the AutoStructify class provided by recommonmark. Here's a snippet from our conf.py template. Note that this has far-reaching effects: users can now use it to add constructs like toctree in Markdown, which wasn't possible before.

from recommonmark.transform import AutoStructify

def setup(app):
    app.add_config_value('recommonmark_config', {
        'enable_eval_rst': True,
    }, True)
    app.add_transform(AutoStructify)

Let's see all of this in action. Here's a preview of documentation generated with API Blueprint using Yaydoc.


Implementing Speech to Text for Chrome in SUSI Web Chat

SUSI Web Chat now replies to voice inputs. To achieve this, I made use of the Web Speech API. Voice input saves the user from the pain of typing, and it's a much-needed feature for the Web Chat to maintain parity with the SUSI Android and SUSI iOS clients.

To test the feature out in SUSI Web Chat, click on the microphone icon beside the text area on chat.susi.ai.

Say the message once the dialog appears, and you will see the message being sent to the Chat List rendered in text.

Let’s achieve the same result following the steps below.

  1. First, initialize the VoiceRecognition class with defaults for the Speech Recognition; for that, we create a file VoiceRecognition.js.
     • We first initialize the Speech Recognition API with the window object.
     • We warn the user with a console message if there is no Speech Recognition API available.
     • If it is available, we call the recognition function using the following line:

this.recognition = this.createRecognition(SpeechRecognition)

// Initialise the Speech recognition API
const SpeechRecognition = window.SpeechRecognition
      || window.webkitSpeechRecognition
      || window.mozSpeechRecognition
      || window.msSpeechRecognition
      || window.oSpeechRecognition
    // Warn the user if not available otherwise call the createRecognition function
    if (SpeechRecognition != null) {
      this.recognition = this.createRecognition(SpeechRecognition)
    } else {
      console.warn('The current browser does not support the SpeechRecognition API.');
    }
  }
  2. Then we write the createRecognition function.
     • We first set our defaults: continuous – true, interimResults – false, and lang – 'en-US'.
     • We pass these options to the recognition object that we created in the above step and finally return the recognition object.
createRecognition = (SpeechRecognition) => {
    const defaults = {
      continuous: true,
      interimResults: false,
      lang: 'en-US'
    }

    const options = Object.assign({}, defaults, this.props)

    let recognition = new SpeechRecognition()

    recognition.continuous = options.continuous
    recognition.interimResults = options.interimResults
    recognition.lang = options.lang

    return recognition
  }
  3. Initialize all the helper functions to be passed as props.
     • start – This method starts the recognition and invokes the microphone of the browser. It also checks whether the browser has access to the user's microphone.
     • stop – This method closes the microphone and returns the audio captured so far.
     • abort – This method stops the SpeechRecognition service.
     • onspeechend – This method is called on inactivity, when there is no voice input; it then stops the recognition service.
     • componentWillReceiveProps – This method waits for the stop prop and calls stop() when it receives it.
     • componentWillUnmount – This method is invoked just before the component unmounts, so its job is to abort the Speech Recognition service.
     • render – We return null, as there is nothing to render in this component; all the converted text of the captured speech is sent to the parent element.
start = () => {
    this.recognition.start()
  }

  stop = () => {
    this.recognition.stop()
  }

  abort = () => {
    this.recognition.abort()
  }
  onspeechend = () => {
    console.log('no sound detected');
    this.recognition.stop()
  }

  componentWillReceiveProps ({ stop }) {
    if (stop) {
      this.stop()
    }
  }

  componentWillUnmount () {
    this.abort()
  }

  render () {
    return null
  }
  4. Add event listeners for the start and stop functions inside componentDidMount(), to ensure every action we want to perform from the parent element happens after the component has successfully mounted.
     • start – The start method is wired to an action named start so that we can pass the required action name to the VoiceRecognition component we created.
     • end – The end method is similarly wired to an action named end.
     • After setting up the actions, we finally call the bindResult function with the result we received.
componentDidMount () {
    const events = [
      { name: 'start', action: this.props.onStart },
      { name: 'end', action: this.props.onEnd },
      { name: 'onspeechend', action: this.props.onspeechend }
    ]

    events.forEach(event => {
      this.recognition.addEventListener(event.name, event.action)
    })

    this.recognition.addEventListener('result', this.bindResult)

    this.start()
  }
  5. Bind the result and send it as props to the parent element.
     • Combine all interim results of the recognition and send them to the onResult function as finalTranscript.
     • The bindResult function does all the binding of the interim results we received and outputs a final result as finalTranscript.
     • Lastly, we add the prop validations to ensure the correct props are being passed to our VoiceRecognition component.
// bindResult function
 bindResult = (event) => {
    let interimTranscript = ''
    let finalTranscript = ''
   // Bind all the results to finalTranscript
    for (let i = event.resultIndex; i < event.results.length; ++i) {
      if (event.results[i].isFinal) {
        finalTranscript += event.results[i][0].transcript
      } else {
        interimTranscript += event.results[i][0].transcript
      }
    }

    this.props.onResult({ interimTranscript, finalTranscript })
  }
// Add Prop Validations
VoiceRecognition.propTypes = {
  onStart: PropTypes.func,
  onEnd : PropTypes.func,
  onResult: PropTypes.func,
  Onspeechend: PropTypes.func,
  continuous: PropTypes.bool,
  lang: PropTypes.string,
  stop: PropTypes.bool
};
// Finally export the VoiceRecognition Component
export default VoiceRecognition
  6. Lastly, call the VoiceRecognition component and pass the props to it from the MessageComposer section in the following way.
     • Initialize the default state in the constructor inside this.state:
this.state = {
      text: '',
      start: false, // Starting the VoiceRecognition
      stop: false, // Stop the VoiceRecognition
      open: false, // Maintain the modal state
      result:'' // Maintain the result state
    };
     • onStart is used so that the VoiceRecognition component runs only when the Mic button is pressed (a sketch of onStart and onEnd follows the onResult snippet below).
     • onEnd ends the Speech Recognition service.
     • onResult sends the message through the Actions.createMessage() function:
onResult = ({interimTranscript,finalTranscript }) => {
    let result = interimTranscript;
    let voiceResponse = false;
    this.setState({result:result});
    if(finalTranscript) {
      result = finalTranscript;
      this.setState({
      start: false,
      result:result,
      stop: false,
      open:false,
      animate:false
      });
      if(this.props.speechOutputAlways || this.props.speechOutput){
        voiceResponse = true;
      }
      Actions.createMessage(result, this.props.threadID, voiceResponse);
      setTimeout(()=>this.setState({result: ''}),400);
      this.Button = <Mic />
    }
  }
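
For completeness, a minimal sketch of the onStart and onEnd handlers mentioned above might look like this; the exact MessageComposer implementation may differ.

// Sketch only: start listening when the Mic button is pressed ...
onStart = () => {
    this.setState({ start: true, stop: false, open: true });
  }

// ... and reset the flags when the Speech Recognition service ends.
onEnd = () => {
    this.setState({ start: false, stop: false, open: false });
  }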
  7. Fire the component based on the value of the start variable and pass the requisite props as given below in the code.

// Only when start is 'true', render the VoiceRecognition component

    {this.state.start && (
          <VoiceRecognition
            onStart={this.onStart}
            onEnd={this.onEnd}
            onResult={this.onResult}
            continuous={true}
            lang="en-US"
            stop={this.state.stop}
          />
        )}

 

  8. Update the text in the "Speak Now" dialog to show the user the Speech to Text conversion.
     • Update the text in the modal when speech is converted to text, i.e. when we set the state of the result variable:
{this.state.result !=='' ? this.state.result :
          'Speak Now...'}

To get access to the full code, go to the repository https://github.com/fossasia/chat.susi.ai


Implementing Text to Speech on SUSI Web Chat

SUSI Web Chat now gives voice replies while chatting with it, similar to the SUSI Android and SUSI iOS clients. To test the voice output on Chrome,

  1. Visit chat.susi.ai
  2. Click on the Mic input button.
  3. Say something using the Mic when the Speak Now View appears

The simplest way to add text-to-speech to your website is by using the official Speech API currently available for Chrome Browser. The following steps help to achieve it in ReactJS.

  1. Initialize state defaults in a component named VoicePlayer.
const defaults = {
      text: '',
      volume: 1,
      rate: 1,
      pitch: 1,
      lang: 'en-US'
    } 

There are two pieces of state which need to be maintained throughout the component. Thus our state is simply:

this.state = {
      started: false,
      playing: false
    }
  2. Our next step is to make use of the Web Speech API functions to carry out the relevant processes (a minimal sketch of how these could be wired up follows this list).
     • speak() – window.speechSynthesis.speak(this.speech) – calls the speak method of the Speech API.
     • cancel() – window.speechSynthesis.cancel() – calls the cancel method of the Speech API.
  3. We then use the component lifecycle functions to assign event listeners and listen to any changes occurring in the background. For this purpose we make use of componentWillReceiveProps(), shouldComponentUpdate(), componentDidMount(), componentWillUnmount(), and render().
     • componentWillReceiveProps() – receives the object parameter {pause} to listen for any pause action.
     • shouldComponentUpdate() – simply returns false if no updates are to be made to the speech output.
     • componentDidMount() – the master function which listens for the start action, adds the start and end event listeners, and calls the speak() function if the play prop is true.
     • componentWillUnmount() – destroys the speech object and ends the speech.
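
Here is that minimal sketch (not the exact Web Chat code) of how the speech object and the speak() and cancel() helpers could be built from the defaults above using the Web Speech API:

// Sketch only: build a SpeechSynthesisUtterance from the defaults and props.
createSpeech = (text) => {
    const options = Object.assign({}, defaults, this.props)
    const speech = new SpeechSynthesisUtterance(text)
    speech.volume = options.volume
    speech.rate = options.rate
    speech.pitch = options.pitch
    speech.lang = options.lang
    return speech
  }

speak = () => {
    this.setState({ started: true, playing: true })
    window.speechSynthesis.speak(this.speech)
  }

cancel = () => {
    this.setState({ started: false, playing: false })
    window.speechSynthesis.cancel()
  }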

Here's a code snippet for the componentDidMount() function:

componentDidMount () {
  
    const events = [
      { name: 'start', action: this.props.onStart }
    ]
    // Adding event listeners
    events.forEach(e => {
      this.speech.addEventListener(e.name, e.action)
    })

    this.speech.addEventListener('end', () => {
      this.setState({ started: false })
      this.props.onEnd()
    })

    if (this.props.play) {
      this.speak()
    }
  }
  4. We then add proper props validation to our VoicePlayer component in the following way.
VoicePlayer.propTypes = {
  play: PropTypes.bool,
  text: PropTypes.string,
  onStart: PropTypes.func,
  onEnd: PropTypes.func
};
  5. The next step is to pass the props from a listener view to the VoicePlayer component. The listener here is the component MessageListItem.js, from where the voice player is initialized.
     • The first step is to initialise the state:
this.state = {
      play: false,
    }
  onStart = () => {
    this.setState({ play: true });
  }
  onEnd = () => {
    this.setState({ play: false });
  }
     • Next, we set play to true when we want to pass the props, along with the text to be spoken, and render the VoicePlayer for message list items that have voice set to true:
 { this.props.message.voice &&
      (<VoicePlayer
             play
             text={voiceOutput}
             onStart={this.onStart}
             onEnd={this.onEnd}
 />)}

Finally, messages with voice set to true will be heard on the speaker, just as they were spoken into the microphone. To get access to the full code, go to the repository https://github.com/fossasia/chat.susi.ai or visit our chat channel at gitter.im/fossasia/susi_webchat

Resources

  1. Speak-easy-synthesis repository http://mdn.github.io/web-speech-api/speak-easy-synthesis
  2. Web-speech-api repository https://github.com/mdn/web-speech-api/

Exporting Functions in PSLab Firmware

Computer programs consist of hundreds of lines of code. Smaller programs contain a few hundred lines, while larger programs like the PSLab firmware contain over 2000 lines. The major drawbacks of code with that many lines are the difficulty of debugging it and the hardship a new developer faces in understanding it. As a solution, modularization techniques can be used to break the long code down into small sets of files. In C programming this can be achieved using different .c files and the relevant .h (header) files.

The task is tricky, as the main code contains a lot of interrelated functions and variables. Lots of errors were encountered when the large code base in pslab-firmware was being modularized into application-specific files. In a scenario like this, global variables or static variables were much of a help.

Global Variables in C

A global variable in C is a variable whose scope spans the entire program. Updates to the variable will be reflected everywhere it is used.

extern TYPE VARIABLE;      /* declaration in a header file */
TYPE VARIABLE = VALUE;     /* definition, exactly once, in a .c file */

Static Variables in C

Variables of this type preserve their contents for the lifetime of the program; at file scope, a static variable is additionally visible only within the file (translation unit) in which it is declared.

static TYPE VARIABLE = VALUE;

Both kinds of variables preserve their values, but the memory usage differs depending on the implementation. This can be explained using a simple example.

Suppose a variable is required in several C files and it is defined in one of the header files. The header file is then included in several .c files. When the program is compiled, each of those files gets its own definition of the same variable, and the build fails with an error saying the variable is already defined. In the PSLab firmware, a variable or a method from one library has a high probability of being used in one or many other libraries.

This issue can be addressed in two different ways: by using static variables or by using global variables. Depending on the selection, the implementation is different.

The first implementation uses static variables. At compile time, this type of variable creates a separate copy of itself in every C file that includes it. Because the C language treats every C file as a separate translation unit, if we include a header file with static variables in two .c files, it will create two copies of the same variable. This avoids the error messages about variables being redeclared. Even though this fixes the issue in the firmware, the memory allocation is wasteful. This was the first implementation used in the PSLab firmware, and memory usage was very high because the duplicate variables took up more memory than they should. This led to the second implementation.

first_header.h

#ifndef FIRST_HEADER_H 
#define FIRST_HEADER_H 

static int var_1;

#endif /* FIRST_HEADER_H */

second_header.h

#ifndef SECOND_HEADER_H
#define SECOND_HEADER_H

static int var_1;

#endif /* SECOND_HEADER_H */

first_header.c

#include <stdio.h>
#include "first_header.h"
#include "second_header.h"

int main() {
    var_1 = 10;
    printf("%d", var_1);
}

The next implementation uses global variables. A variable of this type needs to be declared in only one header file and can be reused by including that header file in other .c files. The global variable must be declared in the header file with the keyword extern and defined exactly once in the relevant .c file. Then it will be available throughout the application, and no variable-redeclaration errors will occur while compiling. This became the final implementation in the PSLab firmware to fix the compilation issues while modularizing application-specific C and header files. A minimal sketch of this pattern is shown below (file and variable names are illustrative, not taken from the actual firmware).
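
In the sketch, a single definition in globals.c backs the extern declaration that every other file includes:

/* globals.h */
#ifndef GLOBALS_H
#define GLOBALS_H

/* Declaration only: visible to every file that includes this header. */
extern int var_1;

#endif /* GLOBALS_H */

/* globals.c */
#include "globals.h"

/* The single definition of the variable. */
int var_1 = 0;

/* main.c */
#include <stdio.h>
#include "globals.h"

int main() {
    var_1 = 10;            /* updates the one shared instance */
    printf("%d", var_1);
    return 0;
}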


Drawing Lines on Images in Phimpme

In the editing section of the Phimpme app, we want a feature which lets the user write something on an image in their own handwriting; the user can also pick from the different colour palettes available inside the app. We placed this feature in our editor section, as the Phimpme Image app gives the user a custom camera along with a large number of editing options (enhancement, transform, applying stickers, applying filters, and writing on images). In this post, I explain how I implemented the draw feature in the Phimpme Android app.

Let’s get started

Step 1: Create a bitmap of the image and a canvas to draw on

The first step is to create a bitmap of the image on which you want to draw lines. A canvas is needed to draw, and a canvas requires a bitmap to work on. So in this step, we create a bitmap and a new canvas to draw the line.

BitmapFactory.Options bmOptions = new BitmapFactory.Options();
Bitmap bitmap = BitmapFactory.decodeFile(ImagePath, bmOptions);
Canvas canvas = new Canvas(bitmap);

Once the canvas is initialized, we use the Paint class to draw on it.

Step 2: Create a Paint object to draw on the canvas

In this step, we create a new Paint object with some properties like colour and stroke width, along with the coordinates where we want to draw a line. It can be done using the following code.

Paint paint = new Paint();
paint.setColor(Color.rgb(255, 153, 51));
paint.setStrokeWidth(10);

int startx = 50;
int starty = 90;
int endx = 150;
int endy = 360;

canvas.drawLine(startx, starty, endx, endy, paint);

The above code will result in a straight line on the canvas, like the screenshot below.
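
To actually see the result on screen, the modified bitmap can be handed back to an ImageView; the view id below is a hypothetical one used only for illustration.

// Display the bitmap we just drew on; R.id.editorImageView is a hypothetical view id.
ImageView editorImageView = (ImageView) findViewById(R.id.editorImageView);
editorImageView.setImageBitmap(bitmap);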

Step 3: Add support to draw on touch dynamically

In step 2 we added the ability to draw a straight line, but what if the user wants to draw lines on the canvas with touch events? To achieve this, we apply an onTouch listener and feed the touch coordinates to the canvas dynamically, which can be done using the following code.

@Override
public boolean onTouchEvent(MotionEvent event) {
    boolean ret = super.onTouchEvent(event);
    float x = event.getX();
    float y = event.getY();

    switch (event.getAction()) {
        case MotionEvent.ACTION_DOWN:
            ret = true;
            last_x = x;
            last_y = y;
            break;
        case MotionEvent.ACTION_MOVE:
            ret = true;
            // Reuse the Paint object configured in step 2.
            canvas.drawLine(last_x, last_y, x, y, paint);
            last_x = x;
            last_y = y;
            this.postInvalidate();
            break;
        case MotionEvent.ACTION_CANCEL:
        case MotionEvent.ACTION_UP:
            ret = false;
            break;
    }
    return ret;
}

The above code lets you draw lines dynamically. But how? In the touch event, I fetch the coordinates of the touch and apply them dynamically to the canvas. Now our feature is ready to draw lines on the canvas with a finger touch.

Drawing line in Phimpme Editor

Step 4: Adding eraser functionality

So far we have added the draw-on-canvas functionality, but what if the user wants to erase a particular area? Now we have to implement the eraser functionality. It is very simple: we draw again, but we set the paint colour to transparent. Drawing the transparent colour over the existing line brings back the previous picture, so it looks like we are erasing. It can be done by adding one line.

paint.setColor(eraser ? Color.TRANSPARENT : color);

With that, we are done with drawing lines on an image and the erasing functionality.


Upload image on Imgur account with user authentication in Phimpme

In the Phimpme app we wanted to allow the user to upload an image to their personal Imgur account. To achieve this, I had to authenticate the user's Imgur account in the Phimpme Android app and upload the image to that account. In this post, I explain how to upload an image to Imgur from the Phimpme Android app using the user's authentication token.

Let’s get started

Step 1: Register your application on Imgur to receive your Client ID and Client Secret.

Register your application on Imgur from here and obtain the Client ID and Client Secret. The Client ID will be used to authenticate the user's account.

Step 2: Authenticate the user

Well, this is the tedious part, as Imgur doesn't provide any direct API for Java/Android (or any other language) to authenticate. So I achieved it via an Android WebView: in the Phimpme app we ask the user to log in inside the WebView, and then we intercept the network calls and parse the URL to get the access token once the user has logged in.

a.) Load the URL below in a WebView with your application's client ID.

https://api.imgur.com/oauth2/authorize?client_id=ClientId&response_type=token&state=APPLICATION_STATE

WebView imgurWebView = (WebView) findViewById(R.id.LoginWebView);
imgurWebView.setBackgroundColor(Color.TRANSPARENT);
imgurWebView.loadUrl(IMGUR_LOGIN_URL);
imgurWebView.getSettings().setJavaScriptEnabled(true);

imgurWebView.setWebViewClient(new WebViewClient() {
  @Override
  public boolean shouldOverrideUrlLoading(WebView view, String url) {

      if (url.contains(REDIRECT_URL)) {
          splitUrl(url, view);
      } else {
          view.loadUrl(url);
      }

      return true;
  }
});


b.) Now it will ask the user to enter their Imgur username and password. After that, it will redirect to the URL which we added as the callback URL while registering our app in the Imgur dashboard. In Phimpme I added https://org.fossaisa.phimpme as the callback URL.

c.) After logging in through the WebView, Imgur redirects to the callback URL with the access token, refresh token and account username appended in the URL fragment, roughly in the following form:

REDIRECT_URL#access_token=ACCESS_TOKEN&expires_in=EXPIRES_IN&token_type=bearer&refresh_token=REFRESH_TOKEN&account_username=USERNAME&account_id=ACCOUNT_ID

Step 3: Parse the redirect URL

To obtain the access token from the right URL, we have to intercept every URL being loaded in the WebView, detect the right one and parse it. For this, I have overridden shouldOverrideUrlLoading of the WebViewClient and passed the URL on to the splitUrl method.

private void splitUrl(String url, WebView view) {
  String[] outerSplit = url.split("\\#")[1].split("\\&");
  String username = null;
  String accessToken = null;

  int index = 0;

  for (String s : outerSplit) {
      String[] innerSplit = s.split("\\=");

      switch (index) {
          // Access Token
          case 0:
              accessToken = innerSplit[1];
              break;

          // Refresh Token
          case 3:
              refreshToken = innerSplit[1];
              break;

          // Username
          case 4:
              username = innerSplit[1];
              break;
          default:

      }

      index++;
  }
}

Now we have the user's access token, which lets us call the Imgur API and perform operations (read, write) on behalf of that particular user.

Note: The authentication token is valid for 28 days only; after 28 days you will have to get a new access token.

Step 4: Upload an image using the authentication token

Now we are ready to upload images to Imgur on behalf of the user's account. To upload, we need to send a request to the Imgur API upload URL with the authentication header and the image in the request body. The body will contain the image string, title and description.

When we post an image on Imgur anonymously, we add the header Authorization: Client-ID {client-id}, but to upload an image to a user's account we add the header Authorization: Bearer {access-token}, with the image data as a Base64 string in the request body, as we did in our previous post on uploading an image to Imgur. A minimal sketch of such a request is shown below.
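
The sketch uses HttpURLConnection; the endpoint is assumed to be Imgur's standard v3 image upload URL, and the access token, Base64 image string, title and description are supplied by the app.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class ImgurUploader {
    // Sketch only: not the exact Phimpme implementation.
    public static int upload(String accessToken, String imageBase64,
                             String title, String description) throws Exception {
        URL url = new URL("https://api.imgur.com/3/image");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        // Bearer token instead of Client-ID: upload goes to the user's account.
        connection.setRequestProperty("Authorization", "Bearer " + accessToken);
        connection.setDoOutput(true);

        String body = "image=" + URLEncoder.encode(imageBase64, "UTF-8")
                + "&title=" + URLEncoder.encode(title, "UTF-8")
                + "&description=" + URLEncoder.encode(description, "UTF-8");

        OutputStream out = connection.getOutputStream();
        out.write(body.getBytes("UTF-8"));
        out.close();

        return connection.getResponseCode(); // 200 indicates success
    }
}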

Imgur Account Image uploads

That’s it.

The problem I faced:

I noticed that a token is valid for only 28 days, so after 28 days you would need to ask the user to log in to Imgur again. To avoid this, shortly before the 28 days are up I send a request for a new access token, which is again valid for the next 28 days.
