Hotword Detection with Pocketsphinx for SUSI.AI

Susi has many apps across all the major platforms. The latest addition is Susi Hardware, which lets you set up Susi on a hardware device such as a Raspberry Pi.

Susi Hardware could already be triggered at the push of a button, but it is cooler and more convenient to call your assistant by voice at any time than to press a button.

Hotword Detection helps achieve that. It involves running a process in the background that continuously listens for voice. When it notices an utterance, it checks whether the utterance contains the desired word. Hotword Detection and its integration with Susi AI can be explained using the diagram below:

 

What is PocketSphinx?

PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop.

PocketSphinx is free and open source software. It has many applications, but here we use it to detect a keyword (the hotword) in a spoken phrase.

Official GitHub Repository: https://github.com/cmusphinx/pocketsphinx

Installing PocketSphinx

We will use PocketSphinx with Python. The latest version can be installed with:

pip install pocketsphinx

If you are using a Raspberry Pi or another ARM-based board with Python 3.6, the step above builds the package from source, since the author doesn’t provide a Python wheel for that setup. For the build to succeed, you may also need to install swig:

sudo apt install swig
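
Before wiring up the microphone, it can help to confirm that the packages are importable. The snippet below is a minimal sketch of such a check; the helper name is_installed is made up for illustration, and PyAudio is included only because the later steps import it.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report which of the two packages used in this post are available.
for name in ("pocketsphinx", "pyaudio"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either package is reported as missing, rerun the install step for it before continuing.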

How to detect Hotword with PocketSphinx?

PocketSphinx can be used from various programming languages. For Susi Hardware, I am using Python 3.

Steps:

      • Import PyAudio and PocketSphinx
        import os
        import pocketsphinx
        import pyaudio
        

         

      • Create a decoder with a given model. We use the en-us acoustic model and the default US English dictionary. Specify a keyphrase for your application; for Susi AI, we use “Susi” as the hotword.
        pocketsphinx_dir = os.path.dirname(pocketsphinx.__file__)
        model_dir = os.path.join(pocketsphinx_dir, 'model')
        
        # Assumed values: the dictionary file name shipped with the package
        # and a starting threshold that you should tune for your setup.
        dict_name = 'cmudict-en-us.dict'
        threshold = 1e-20
        
        config = pocketsphinx.Decoder.default_config()
        config.set_string('-hmm', os.path.join(model_dir, 'en-us'))
        config.set_string('-keyphrase', 'susi')
        config.set_string('-dict', os.path.join(model_dir, dict_name))
        config.set_float('-kws_threshold', threshold)
        
        decoder = pocketsphinx.Decoder(config)
        decoder.start_utt()

         

      • Start a PyAudio Stream from Microphone Input
        p = pyaudio.PyAudio()
        
        stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=20480)
        stream.start_stream()

         

      • In a loop that runs forever, read from the stream in chunks of 1024 frames and pass each non-empty buffer to the decoder.
        while True:
            buf = stream.read(1024)
            if buf:
                # Arguments: data, no_search=False, full_utt=False
                decoder.process_raw(buf, False, False)
            else:
                break
        

         

      • Inside the same loop, check for the hotword and start speech recognition if it is detected. Here start_speech_recognition() stands for your own recognition routine. After it returns, start detection again.

        if decoder.hyp() is not None:
            print("Hotword Detected")
            decoder.end_utt()
            start_speech_recognition()
            decoder.start_utt()
        

         

      • Run the app. If detection doesn’t work well, adjust kws_threshold in step 2 until missed detections and false triggers are both acceptable.
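
The steps above can be put together roughly as follows. This is a sketch, not the Susi Hardware implementation: the function names build_decoder, open_microphone and listen_for_hotword are made up for illustration, the dictionary file name cmudict-en-us.dict and the default threshold of 1e-20 are assumptions, and the configuration calls follow the older pocketsphinx Python API used in this post. The loop takes the decoder and the audio source as arguments, so it can be tried with stand-in objects when no microphone is attached.

```python
import os


def build_decoder(keyphrase="susi", threshold=1e-20):
    """Create a keyphrase-spotting decoder (steps 1 and 2)."""
    import pocketsphinx  # imported lazily so the loop below works without it

    model_dir = os.path.join(os.path.dirname(pocketsphinx.__file__), "model")
    config = pocketsphinx.Decoder.default_config()
    config.set_string("-hmm", os.path.join(model_dir, "en-us"))
    # Assumed file name of the bundled US English dictionary.
    config.set_string("-dict", os.path.join(model_dir, "cmudict-en-us.dict"))
    config.set_string("-keyphrase", keyphrase)
    config.set_float("-kws_threshold", threshold)
    return pocketsphinx.Decoder(config)


def open_microphone(rate=16000, frames_per_buffer=20480):
    """Open a 16 kHz, 16-bit mono input stream (step 3)."""
    import pyaudio

    audio = pyaudio.PyAudio()
    stream = audio.open(format=pyaudio.paInt16, channels=1, rate=rate,
                        input=True, frames_per_buffer=frames_per_buffer)
    stream.start_stream()
    return stream


def listen_for_hotword(decoder, read_chunk, on_hotword, chunk_frames=1024):
    """Feed audio to the decoder until read_chunk returns no data (steps 4 and 5).

    decoder    -- object providing start_utt, process_raw, hyp and end_utt
    read_chunk -- callable that takes a frame count and returns raw bytes
    on_hotword -- callback run on each detection, e.g. speech recognition
    Returns the number of detections.
    """
    detections = 0
    decoder.start_utt()
    while True:
        buf = read_chunk(chunk_frames)
        if not buf:
            break
        decoder.process_raw(buf, False, False)
        if decoder.hyp() is not None:
            detections += 1
            # End the utterance so the same detection is not reported twice.
            decoder.end_utt()
            on_hotword()
            decoder.start_utt()
    decoder.end_utt()
    return detections
```

With a microphone attached, a call such as listen_for_hotword(build_decoder(), open_microphone().read, start_speech_recognition) would run the detector, where start_speech_recognition is your own routine. At 16000 frames per second, each 1024-frame chunk covers 64 ms of audio.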

In this way, Hotword Detection can be added to your Python project. You can also build some cool voice-controlled hacks with Susi, our AI-powered assistant.
Check the repository for more info: https://github.com/fossasia/susi_hardware

betterclever

GSoC Student Developer at FOSSASIA
