Managing States in SUSI MagicMirror Module

SUSI MagicMirror Module is a module for the MagicMirror project that lets you use SUSI directly on MagicMirror. While developing the module, one problem I faced was managing the flow between the stages of processing the user’s voice input and displaying SUSI’s output. I solved this by defining a state management flow between the states of the SUSI MagicMirror Module, namely:

Idle State: The SUSI MagicMirror Module is actively listening for a hotword.
Listening State: The user’s speech input from the microphone is recorded to a file.
Busy State: The user has finished speaking or timed out. Now we need to transcribe the recorded audio, send the transcribed text to the SUSI server and speak out the SUSI response.

The flow between these states can be explained by the following diagram:

[Diagram: flow between the Idle, Listening and Busy states]

As is clear from the above diagram, a state cannot transition to every other state. Only some transitions are allowed. Thus, we need a mechanism that guarantees only allowed transitions happen and that they trigger at the right time.

To achieve this, we first implement an abstract class State with the common properties of a state. Whether a state can transition into some other state is stored in a map, allowedStateTransitions, which maps the state names “idle”, “listening” and “busy” to their corresponding states. The transition method, which moves from one state to another, is implemented in the following way.

    protected transition(state: State): void {
        if (!this.canTransition(state)) {
            // Log the name, since interpolating the object itself prints "[object Object]".
            console.error(`Invalid transition to state: ${state.name}`);
            return;
        }
        this.onExit();
        state.onEnter();
    }

    private canTransition(state: State): boolean {
        return this.allowedStateTransitions.has(state.name);
    }

Here we first check whether the transition is valid. Then we exit the current state and enter the supplied state.

We also define a state machine that initializes the default state of the Mirror and defines the valid transitions for each state.
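Before looking at the state machine, here is a minimal, self-contained sketch of what the abstract State class described above might look like. Only transition and canTransition follow the snippet above. The StateName type, the constructor, the AllowedStateTransitions setter body and the DemoState subclass are assumptions made for illustration, and transition is made public and returns a boolean here purely so the demo can observe the outcome.

```typescript
// Illustrative sketch only: everything except transition/canTransition is an assumption.
type StateName = "idle" | "listening" | "busy";

abstract class State {
    // Maps the names of the states we may move to onto the state objects themselves.
    protected allowedStateTransitions = new Map<StateName, State>();

    constructor(public readonly name: StateName) {}

    set AllowedStateTransitions(transitions: Map<StateName, State>) {
        this.allowedStateTransitions = transitions;
    }

    public abstract onEnter(): void;
    public abstract onExit(): void;

    // Returns true when the transition happened, false when it was rejected.
    public transition(state: State): boolean {
        if (!this.canTransition(state)) {
            console.error(`Invalid transition to state: ${state.name}`);
            return false;
        }
        this.onExit();
        state.onEnter();
        return true;
    }

    private canTransition(state: State): boolean {
        return this.allowedStateTransitions.has(state.name);
    }
}

// Hypothetical concrete state that just records its lifecycle calls.
class DemoState extends State {
    public readonly log: string[] = [];
    public onEnter(): void { this.log.push(`enter ${this.name}`); }
    public onExit(): void { this.log.push(`exit ${this.name}`); }
}

const idle = new DemoState("idle");
const listening = new DemoState("listening");
const busy = new DemoState("busy");
idle.AllowedStateTransitions = new Map<StateName, State>([["listening", listening]]);
listening.AllowedStateTransitions = new Map<StateName, State>([["busy", busy], ["idle", idle]]);
busy.AllowedStateTransitions = new Map<StateName, State>([["idle", idle]]);

const allowed = idle.transition(listening); // idle -> listening is in the map
const rejected = idle.transition(busy);     // idle -> busy is not, so it is refused
```

Keeping the map on each state means the check is a single lookup by name, and adding a new state only requires registering it in the maps of the states that may lead to it.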
Here is the constructor for the state machine.

    constructor(components: IStateMachineComponents) {
        this.idleState = new IdleState(components);
        this.listeningState = new ListeningState(components);
        this.busyState = new BusyState(components);

        // Register the transitions each state is allowed to make.
        this.idleState.AllowedStateTransitions = new Map<StateName, State>([["listening", this.listeningState]]);
        this.listeningState.AllowedStateTransitions = new Map<StateName, State>([
            ["busy", this.busyState],
            ["idle", this.idleState],
        ]);
        this.busyState.AllowedStateTransitions = new Map<StateName, State>([["idle", this.idleState]]);

        // The Mirror starts out idle.
        this.currentState = this.idleState;
        this.currentState.onEnter();
    }

Now the question arises: how do we detect when to transition from one state to another? For that, we subscribe to the Snowboy detector’s Observable. We use the Snowboy library for hotword detection. Snowboy detects whether an audio stream is silent, contains some sound, or contains the hotword. We bind all this information to an observable using the ReactiveX Observable pattern. This gives us a stream of events to which we can subscribe. This is shown in the following code snippet.

    // Forward the detector callbacks into the subject.
    detector.on("silence", () => {
        this.subject.next(DETECTOR.Silence);
    });

    detector.on("sound", () => {});

    detector.on("error", (error) => {
        console.error(error);
    });

    detector.on("hotword", (index, hotword) => {
        this.subject.next(DETECTOR.Hotword);
    });

    // Class member that exposes the stream of detector events to subscribers.
    public get Observable(): Observable<DETECTOR> {
        return this.subject.asObservable();
    }

Now, in the idle state, we subscribe to the values emitted by the observable of the detector to know when a hotword…
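The subscription idea can be sketched end to end as follows. This is not the module's code: TinySubject is a small stand-in for an RxJS Subject so that the example runs without dependencies, MiniStateMachine is a hypothetical reduction of the state machine to plain state names, and treating silence during listening as "the user has stopped speaking" is an assumption.

```typescript
// Illustrative sketch: TinySubject stands in for an RxJS Subject, and
// MiniStateMachine is a hypothetical, simplified state machine.
enum DETECTOR { Silence, Hotword }
type StateName = "idle" | "listening" | "busy";

class TinySubject<T> {
    private observers: Array<(value: T) => void> = [];

    // Registers an observer and returns a function that unsubscribes it.
    public subscribe(observer: (value: T) => void): () => void {
        this.observers.push(observer);
        return () => {
            this.observers = this.observers.filter((o) => o !== observer);
        };
    }

    public next(value: T): void {
        // Iterate over a copy so observers may unsubscribe while being notified.
        this.observers.slice().forEach((observer) => observer(value));
    }
}

class MiniStateMachine {
    public current: StateName = "idle";
    public readonly history: StateName[] = ["idle"];
    private unsubscribe: (() => void) | undefined;

    // Allowed transitions, mirroring the constructor shown earlier.
    private readonly allowed: Record<StateName, StateName[]> = {
        idle: ["listening"],
        listening: ["busy", "idle"],
        busy: ["idle"],
    };

    constructor(private readonly detector: TinySubject<DETECTOR>) {
        this.enter("idle");
    }

    public transition(next: StateName): boolean {
        if (this.allowed[this.current].indexOf(next) === -1) {
            return false;
        }
        this.exit();
        this.current = next;
        this.history.push(next);
        this.enter(next);
        return true;
    }

    private enter(state: StateName): void {
        if (state === "idle") {
            // Idle: wait for the hotword, then start listening.
            this.unsubscribe = this.detector.subscribe((event) => {
                if (event === DETECTOR.Hotword) { this.transition("listening"); }
            });
        } else if (state === "listening") {
            // Listening (assumption): silence means the user has stopped speaking.
            this.unsubscribe = this.detector.subscribe((event) => {
                if (event === DETECTOR.Silence) { this.transition("busy"); }
            });
        }
    }

    private exit(): void {
        // Drop the subscription of the state we are leaving, if it had one.
        if (this.unsubscribe) {
            this.unsubscribe();
            this.unsubscribe = undefined;
        }
    }
}

const subject = new TinySubject<DETECTOR>();
const machine = new MiniStateMachine(subject);
subject.next(DETECTOR.Silence); // ignored while idle
subject.next(DETECTOR.Hotword); // idle -> listening
subject.next(DETECTOR.Silence); // listening -> busy
machine.transition("idle");     // busy -> idle once the response has been spoken
```

Each state subscribes when it is entered and unsubscribes when it is left, so a detector event can only ever trigger the transition that belongs to the current state.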


Hotword Detection on SUSI MagicMirror with Snowboy

The Magic Mirror in the story “Snow White and the Seven Dwarfs” had one cool feature: the Queen could call the Mirror just by saying “Mirror” and then ask it questions. The MagicMirror project helps you build a mirror quite close to the one in the fable, but how cool would it be to have the same feature? Hotword detection on the SUSI MagicMirror Module helps us achieve that.

Hotword detection on the SUSI MagicMirror Module was accomplished with the help of the Snowboy Hotword Detection Library. Snowboy is a cross-platform hotword detection library. We are using the same library for Android, iOS and the MagicMirror Module (Node.js). Snowboy can be added to a JavaScript/TypeScript project with the Node Package Manager (npm):

    $ npm install --save snowboy

To detect a hotword, we need to record audio continuously from the microphone. For recording, we use another npm package, node-record-lpcm16. It uses the SoX binary to record audio. First, we need to install SoX. On Linux (Debian-based distributions):

    $ sudo apt-get install sox libsox-fmt-all

Then install the node-record-lpcm16 package using npm:

    $ npm install node-record-lpcm16

Then import it in the file where it is needed:

    import * as record from "node-record-lpcm16";

You may then create a new microphone stream:

    const mic = record.start({
        threshold: 0,
        sampleRate: 16000,
        verbose: true,
    });

The mic constant here is a Node.js Readable stream, so we can read the incoming data from the microphone and process it. We can now process this stream using the Detector class of Snowboy. We declare a child class extending the Snowboy Detector to suit our needs.
    import { Detector, Models } from "snowboy";

    export class HotwordDetector extends Detector {
        constructor(models: Models) {
            super({
                resource: `${process.env.CWD}/resources/common.res`,
                models: models,
                audioGain: 2.0,
            });
            this.setUp();
        }
        // other methods
    }

First, we create a Snowboy Detector by calling the parent constructor with the resource file common.res and a Snowboy model as arguments. A Snowboy model is a file that tells the detector which hotword to listen for. Currently, the module supports the hotword “Susi”, but it can be extended to support other hotwords such as “Mirror” too. You can train the SUSI hotword for your own voice and get the latest model file at https://snowboy.kitt.ai/hotword/7915 . You may then replace the susi.pmdl file in the resources folder with your own susi.pmdl file for a better experience.

Now, we need to register callbacks on the Detector class to learn the current state of the detector and act on it. This is done in the setUp() method.

    private setUp(): void {
        this.on("silence", () => {
            // handle silent state
        });

        this.on("sound", () => {
            // handle sound detected state
        });

        this.on("error", (error) => {
            // handle error
        });

        this.on("hotword", (index, hotword) => {
            // hotword detected
        });
    }

If you look at the implementation of Snowboy’s Detector class, it extends NodeJS.WritableStream. So, we can pipe our microphone input read stream to the Detector class and it handles all…
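The writable-stream pattern itself can be illustrated without Snowboy. The sketch below uses a hypothetical EnergyDetector that extends Node's Writable and emits "silence" or "sound" per chunk. It cannot detect hotwords, and the all-zero-bytes test for silence is only an assumption chosen to keep the example small.

```typescript
import { Writable } from "stream";

// Illustrative stand-in for a hotword detector: it is NOT Snowboy and cannot
// detect hotwords. It only labels each incoming chunk as "silence" or "sound".
class EnergyDetector extends Writable {
    public readonly events: string[] = [];

    constructor() {
        super();
        // Same callback style as the setUp() method above.
        this.on("silence", () => this.events.push("silence"));
        this.on("sound", () => this.events.push("sound"));
    }

    // Called by Node.js for every chunk that is written or piped into the stream.
    public _write(chunk: Buffer, _encoding: BufferEncoding, callback: (error?: Error | null) => void): void {
        const isSilent = chunk.every((byte) => byte === 0);
        this.emit(isSilent ? "silence" : "sound");
        callback();
    }
}

const detector = new EnergyDetector();

// With a real microphone stream this would be a single line: mic.pipe(detector);
// Writing chunks by hand feeds the detector the same way pipe() would.
detector.write(Buffer.from([0, 0, 0, 0]));
detector.write(Buffer.from([0, 12, 0, 7]));
```

Because the detector is just another writable stream, the recording code and the detection code stay decoupled: anything that produces audio chunks as a readable stream can be piped into it.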


Understanding the working of SUSI Hardware

Susi on Hardware is the latest addition to the full suite of SUSI apps. Being a hardware project, it may seem too complex, but that is not the case. The solution is primarily being built on a Raspberry Pi which, however small it may be, is a computer. Most things you expect to work on a normal computer work on a Raspberry Pi as well, with the added advantages of its small size and General Purpose I/O access. It does come with the caveat of an ARM CPU, which may not support all applications that mainly target x86. There are a few other development boards, from Intel as well, which use the x86/x64 architecture. While working on the project, I did not want to tie it to a particular board or set of hardware, so all components used were chosen to be cross-platform.

Components that make up Susi Hardware

SUSI Server

SUSI Server is the most important component of any SUSI project. It handles all user queries, which can be submitted through its REST API, and returns answers in a format that is convenient for clients. It also provides AAA (Authentication, Authorization and Accounting) support for managing user accounts across platforms.

GitHub Repository: https://github.com/fossasia/susi_server

Susi Python Library

The Susi Python Library was developed along with the Susi Hardware project. It can work independently of the hardware project and can be included in any Python project that needs Susi intelligence. It provides easy access to the Susi Server REST API through simple Python methods.

GitHub Repository: https://github.com/fossasia/susi_api_wrapper

Python Speech Recognition Library

The best advantage of using Python is that in most cases you do not need to reinvent the wheel; someone has already done the work for you. The Python Speech Recognition library supports speech recognition from a microphone and from a voice sample. It supports a number of speech API providers such as Google Speech API, Wit.AI, IBM Watson Speech-To-Text and a lot more.
This gives you the freedom to choose any of the speech recognition providers. For now, we are using Google Speech API and IBM Watson Speech API.

PyPI Package: https://pypi.python.org/pypi/SpeechRecognition/
GitHub Repository: https://github.com/Uberi/speech_recognition

PocketSphinx for Hotword Detection

CMU PocketSphinx is an open-source offline speech recognition library. We have used PocketSphinx to add hotword detection to Susi Hardware so you can interact with Susi hands-free. More information on how it works can be found in my other blog post.

GitHub Repository: https://github.com/cmusphinx/pocketsphinx

Flite Speech Synthesis System

CMU Flite (Festival-Lite) is a small, fast and open-source speech synthesis engine developed by Carnegie Mellon University. More information on its integration and usage in Susi can be found in my other blog post.

Project Website: http://www.festvox.org/flite/

How all these components work together can be explained using the diagram below.


Hotword Detection with Pocketsphinx for SUSI.AI

Susi has many apps across all the major platforms. The latest addition is Susi Hardware, which allows you to set up Susi on a hardware device like a Raspberry Pi. Susi Hardware could already be triggered with the push of a button, but it is cooler and more convenient to call your assistant at any time rather than pressing a button. Hotword detection helps achieve that. Hotword detection involves running a process in the background that continuously listens for voice. On noticing an utterance, we need to check whether it contains the desired word. Hotword detection and its integration with Susi AI can be explained using the diagram below:

What is PocketSphinx?

PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop. PocketSphinx is free and open source software. PocketSphinx has various applications, but we use it to detect a keyword (the hotword) in a spoken phrase.

Official GitHub Repository: https://github.com/cmusphinx/pocketsphinx

Installing PocketSphinx

We shall be using PocketSphinx with Python. The latest version can be installed with

    pip install pocketsphinx

If you are using a Raspberry Pi or another ARM-based board with Python 3.6, the above step will build it from source since the author doesn't provide a Python wheel. For that, you may additionally need to install swig.

    sudo apt install swig

How to detect a Hotword with PocketSphinx?

PocketSphinx can be used from various languages. For Susi Hardware, I am using Python 3.

Steps:

1. Import PyAudio and PocketSphinx (os is needed for building the model paths).

    import os
    import pocketsphinx
    import pyaudio

2. Create a decoder with a model; we are using the en-us model and the default US English dictionary. Specify a keyphrase for your application; for Susi AI, we are using “Susi” as the hotword.

    pocketsphinx_dir = os.path.dirname(pocketsphinx.__file__)
    model_dir = os.path.join(pocketsphinx_dir, 'model')

    config = pocketsphinx.Decoder.default_config()
    config.set_string('-hmm', os.path.join(model_dir, 'en-us'))
    config.set_string('-keyphrase', 'susi')
    # dict_name is the file name of the pronunciation dictionary inside model_dir
    config.set_string('-dict', os.path.join(model_dir, dict_name))
    # self.threshold is the detection threshold held by the enclosing class
    config.set_float('-kws_threshold', self.threshold)

    # Create the decoder from the configuration and begin the first utterance
    decoder = pocketsphinx.Decoder(config)
    decoder.start_utt()

3. Start a PyAudio stream from the microphone input.

    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=20480)
    stream.start_stream()

4. In a forever-running loop, read from the stream and process the buffer in chunks of 1024 frames if it is not empty.

    while True:
        buf = stream.read(1024)
        if buf:
            decoder.process_raw(buf, False, False)
        else:
            break

5. Inside the same loop, check for the hotword and start speech recognition if it is detected. After returning from the method, start detection again.

        if decoder.hyp() is not None:
            print("Hotword Detected")
            decoder.end_utt()
            start_speech_recognition()
            decoder.start_utt()

6. Run the app. If detection doesn’t seem to work well, adjust kws_threshold in step 2 to get optimal results.

In this way, hotword detection can be added to your Python project. You may also develop some cool hacks with our AI-powered assistant Susi using voice control. Check the repository for more info: https://github.com/fossasia/susi_hardware


Giving a Voice to Susi

Susi AI already has various apps and is available as a chatbot on various messaging platforms. We are going a step further to make an SDK available for Susi that can be integrated into any hardware device (say speakers, toys, your bicycle etc.; the possibilities are endless).

One of the problems I encountered while making a prototype for this was selecting an appropriate Text to Speech (TTS) engine. It was a challenge because on platforms like Android and iOS you can easily use the TTS engines bundled with the platform through a platform-specific API, and these are well optimized and perform well. The same was difficult on a hardware device that runs only Linux with no TTS provided by default. Thus, I explored some possibilities.

eSpeak TTS

eSpeak TTS (http://espeak.sourceforge.net/) was the first option considered for the task. eSpeak is a compact open source software speech synthesizer for English and other languages, for Linux and Windows. The major advantage of eSpeak is its small size (2 MB) and small memory footprint, which is advantageous on low-memory hardware like the Orange Pi Zero or Raspberry Pi Zero. Setting up eSpeak was easy, but along with its advantages there were some drawbacks too: the voice synthesis was quite robotic, and very few voices were available.

Festival TTS

Festival offers a general framework for building speech synthesis systems as well as including examples of various modules. As a whole it offers full text to speech through a number of APIs: from shell level, through a Scheme command interpreter, as a C++ library, from Java, and an Emacs interface. Festival is free software. Festival and the speech tools are distributed under an X11-type licence allowing unrestricted commercial and non-commercial use alike.

Installing Festival

On Arch Linux, it was pretty straightforward.

    sudo pacman -S festival

There is a full wiki page dedicated to it (https://wiki.archlinux.org/index.php/Festival).

Testing Festival

Festival has an interpreter to test it out.
It can be invoked using:

    festival

You may test out TTS output using:

    festival> (SayText "Hi!! I am Susi")

But the default voice in Festival is still robotic and male. You don’t want your personal assistant to scare you when you speak to her. Thus, I searched for the best female voices available for Festival. After looking at the discussion in the thread https://ubuntuforums.org/showthread.php?t=751169 , I found that CMU Arctic and HTS are some of the best voice sets for Festival.

On Arch Linux, additional voice packs are supplied in two additional packages, festival-us and festival-english. Installation is straightforward:

    sudo pacman -S festival-us festival-english

Now, in the Festival REPL, we can test out our new voices.

To see all available voices:

    festival> (voice.list)
    (rab_diphone kal_diphone cmu_us_rms_cg cmu_us_awb_cg cmu_us_slt_cg)

Testing out a voice:

    festival> (voice_cmu_us_awb_cg)
    cmu_us_awb_cg
    festival> (SayText "Hi!! I am Susi")

After testing out all the voices this way with many different phrases, cmu_us_slt_cg felt like an appropriate voice.

Setting Voice as Default

A voice may be set as the default by adding the following line to .festivalrc…


Deploying Susi Server on Google Cloud with Kubernetes

Susi (acronym for Scientific User Support Intelligence) is an advanced AI made by people at FOSSASIA. It is an AI made by the people and for the people. Susi is an open source project under the LGPL licence. SUSI.AI already has many skills, and anyone can add new skills through simple console rules.

If you want to participate in the development of the SUSI server, you can start by learning to deploy it on a cloud system like Google Cloud. This way, whenever you make a change to Susi Server, you can test it out on various Susi apps instantly. Google Cloud with Kubernetes provides this ability. Let’s look at what Google Cloud Platform and Kubernetes are.

What is Google Cloud Platform?

Google Cloud Platform lets you build and host applications and websites, store data, and analyze data on Google’s scalable infrastructure. Google Cloud Platform (at the time of writing this article) also provides free credits worth $300 for 1 year for testing out the platform and your applications.

What is Kubernetes?

Kubernetes is an open-source system for automatic deployment, management and scaling of containerized applications. It makes it easy to roll out updates to your application with simple commands from your development machine and to scale horizontally by adding more replicas or nodes as demand increases.

Deploying Susi Server on Kubernetes

Deploying Susi Server on Kubernetes is a fairly easy task. Follow the steps below to get it running.

Create a Google Cloud Account

Sign up for a Google Cloud account (https://cloud.google.com/free-trial/) and get $300 in credits for initial use.

Create a New Project

After signing up successfully, create a new project in the Google Cloud Console. Let’s name it Susi-Kubernetes. You will be given a Project ID. Note it down for later reference.

Install Google Cloud SDK and kubectl

Go to https://cloud.google.com/sdk/ and follow the instructions to set up the Google Cloud SDK on your OS.
After installing the Google Cloud SDK, run

    gcloud components install kubectl

This will install kubectl for interacting with Kubernetes.

Login and setup project

1. Log in to your Google Cloud account using

    $ gcloud auth login

2. Check the currently configured project using

    $ gcloud config list project
    [core]
    project = <PROJECT_ID>

3. Select your project

    $ gcloud config set project <PROJECT_ID>

4. Install JDK 8 for the susi_server setup and set it as the default.

5. Clone your fork of the Susi Server repository

    $ git clone https://github.com/<your_username>/susi_server.git
    $ cd susi_server/

6. Build the project and run Susi Server locally

    $ ./gradlew build
    $ bin/start.sh

Susi Server should now be running, with its web interface accessible at http://localhost:4000

Install Docker and build Docker image for Susi

Install Docker.

    Debian and derivatives: sudo apt install docker.io
    Arch Linux: sudo pacman -S docker

Build the Docker image for Susi

    $ docker build -t gcr.io/<PROJECT_ID>/susi:v1 .

Push the image to the Google Container Registry private to your project.

    $ gcloud docker -- push gcr.io/<PROJECT_ID>/susi:v1

Create Cluster and Deploy your Susi Server there

Create a cluster. You may specify a different zone, number of nodes and machine type depending on your requirements.

    $ gcloud container clusters create <Cluster-Name> --num-nodes…
