Hotword Detection in SUSI Android App using Snowboy

Hotword Detection is as cool as it sounds. What exactly is hotword detection? Hotword detection is a feature in which a device gets activated when it listens to a specific word or a phrase. You must have said “OK Google” or “Hey Cortana” or “Siri” or “Alexa” at least once in your lifetime. These all are hotwords which trigger the specific action attached to them. That specific action can be anything.

Implementing hotword detection from scratch in SUSI Android is not an easy task. You have to define language model, train the model and do various other processes before implementing it in Android. In short, not feasible to implement that along with the code of our Android app. There are many open source projects on hotword detection and speech recognition. They already have done what we need and we can make use of it. One such project is Snowboy. According to Snowboy GitHub repo “Snowboy is a DNN based hotword and wake word detection toolkit.”

Img src: https://snowboy.kitt.ai/

In SUSI Android App, we have used Snowboy for hotword detection with hotword as “susi” (pronounced as ‘suzi’). In this blog, I will tell you how Hotword detection is implemented in SUSI Android app. So, you can just follow the steps and you will be able to implement it in your application too or if you want to contribute in SUSI android app, it may help you a little in knowing the codebase better.

Pre Processing before Implementation

1. Generating Hotword Model

The start of implementation of hotword detection begins with creating a hotword model from snowboy website https://snowboy.kitt.ai/dashboard . Just log in and search for susi and then train it by saying “susi” thrice and download the susi.pmdl file.

There are two types of models:

  1. .pmdl : Personal Model
  2. .umdl : Universal Model

The personal model is specifically trained for you and is instantly available for you to download once you train the hotword by your voice. On the other hand, the Universal model is trained by minimum 500 hundred people and is only available once it is trained. So, we are going to use personal model for now since training of universal model is not yet completed.

Img src: https://snowboy.kitt.ai/

2. Adding some predefined native binary files in your app.

Once you have downloaded the susi.pmdl file and you need to copy some already written native binary file in your app.

    1. In your assets folder, make a directory named snowboy and add your downloaded susi.pmdl file along with this file in it.
    2. Copy this folder and add it in your  /app/src/main/java folder as it is. These are autogenerated swig files. So, don’t change it unless you know what you are doing.
    3. Also, create a new folder in your /app/src/main folder called jniLibs and add these files to it.

Implementation in SUSI Android App

Check out the implementation of Hotword detection in SUSI Android App here

You now have everything ready. Now you just need to implement some code in your MainActivity and you are good to go. Initiate Hotword detection in the app by using below written code snippet. The call initHotword() method from your onCreate method in MainActivity. Make sure you have given permission to RECORD_AUDIO and WRITE_EXTERNAL_STORAGE.

private void initHotword() {
   if (ActivityCompat.checkSelfPermission(this,
           Manifest.permission.WRITE_EXTERNAL_STORAGE) == PackageManager.PERMISSION_GRANTED &&
       ActivityCompat.checkSelfPermission(this,
           Manifest.permission.RECORD_AUDIO) == PackageManager.PERMISSION_GRANTED) {
       AppResCopy.copyResFromAssetsToSD(this);

       recordingThread = new RecordingThread(new Handler() {
           @Override
           public void handleMessage(Message msg) {
               MsgEnum message = MsgEnum.getMsgEnum(msg.what);
               switch(message) {
                   case MSG_ACTIVE:
                       //HOTWORD DETECTED. NOW DO WHATEVER YOU WANT TO DO HERE
                       break;
                   case MSG_INFO:
                       break;
                   case MSG_VAD_SPEECH:
                       break;
                   case MSG_VAD_NOSPEECH:
                       break;
                   case MSG_ERROR:
                       break;
                   default:
                       super.handleMessage(msg);
                       break;
               }
           }
       }, new AudioDataSaver());
   }
}

In the above code under case, MSG_ACTIVE add your code which needs to be executed once hotword is detected. Now, hotword detection is initiated and you just have to start recording and stop recording whenever you want.

To start recording : (can be written in onStart() method)

if(recordingThread !=null && !isDetectionOn) {
    recordingThread.startRecording();
    isDetectionOn = true;
}

To stop recording : (can be written in onStop() method)

if(recordingThread !=null && isDetectionOn){
   recordingThread.stopRecording();
   isDetectionOn = false;
}

So, this this is the simple process to add hotword detection feature in your app.

Though this is great and will word wonderfully for you but it has a little problem and a solution which is discussed below.

Problems for now and future steps

As mentioned above, the susi.pmdl model is a personal model and only trained for your voice. So, it will work wonderfully for you and for people with a similar voice like you but not for everyone. There are two ways to make this solve this problem:

  1. Replace susi.pmdl with susi.umdl: This is very easy way to fix this problem. But to get the susi.umdl file at least 500 people have to train susi hotword on snowboy website. Once that target is achieved, you can download susi.umdl file and replace it with susi.pmdl file and you are good to go.
  2. Train and obtain susi.pmdl file for each user: Right now you trained and used your own personal model for everyone but there is a way you can use every user’s own personal model in the app. Now hotword will be trained by the user himself in the app. Snowboy provides a training API http://docs.kitt.ai/snowboy/#api-v1-train to train the hotword and getting personal model. Quoting snowboy docs “You can define truly customized hotword for each of your end customer. Just ask them to say the hotword 3 times and a model will be trained on the fly!”

Resources

  1. Official docs of Snowboy hotword detection http://docs.kitt.ai/snowboy/#introduction
  2. Code for snowboy hotword detection. It contains readme files which give very nice description about the project. https://github.com/kitt-ai/snowboy
  3. This readme file of snowboy which guides on how to set up snowboy in an android app https://github.com/Kitt-AI/snowboy/blob/master/examples/Android/README.md