Hotword Detection in Susi Android

Hotword detection is one of the coolest features in Android. Voice control has become a popular way to interact with smartphones and wearable devices, because it lets the user work with an app intuitively, without even touching the device. At the same time, it is difficult to implement in an Android app and consumes a lot of resources. Let us dive deeper into this topic.

So, first of all, what is a hotword?

A hotword is a word or phrase that is used to initiate a particular task in the app. The user speaks the hotword, and when it is detected the detector fires a callback that the app can use to carry out that task.

A well-known example is “Ok Google”, which launches Google Assistant on Android phones.
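
To make the callback idea concrete, here is a minimal Java sketch. It assumes the speech recognizer already hands us text transcripts; the names HotwordCallback and SimpleHotwordDetector are hypothetical and not part of any Android API.

```java
import java.util.Locale;

/** Hypothetical callback, invoked once the hotword has been spotted. */
interface HotwordCallback {
    void onHotwordDetected(String hotword);
}

/** Hypothetical detector that scans text transcripts for a fixed hotword. */
class SimpleHotwordDetector {
    private final String hotword;
    private final HotwordCallback callback;

    SimpleHotwordDetector(String hotword, HotwordCallback callback) {
        // Store the hotword in lower case so matching is case-insensitive.
        this.hotword = hotword.toLowerCase(Locale.ROOT);
        this.callback = callback;
    }

    /** Feeds one recognized transcript; returns true if the callback fired. */
    boolean onTranscript(String transcript) {
        if (transcript == null) {
            return false;
        }
        if (transcript.toLowerCase(Locale.ROOT).contains(hotword)) {
            callback.onHotwordDetected(hotword);
            return true;
        }
        return false;
    }
}
```

An app would create the detector once, pass every transcript to onTranscript, and start its voice interaction inside the callback. A real detector works on audio rather than text, so this only illustrates the control flow.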

Why is it difficult to implement hotword detection?

Enabling hotword detection is a cumbersome task. Several problems are associated with it:

  1. The accent changes from speaker to speaker.
  2. The app has to recognize speech continuously and match it against the hotword.
  3. The feature needs to run as a service in Android, which drains the battery.
  4. It is also a memory-intensive task.

Let us look at the techniques currently available for hotword detection.

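  1. Android provides a class called AlwaysOnHotwordDetector, which can be used to detect keyphrases. Note that its constructor is marked @hide, so an app cannot call it directly; the detector is obtained through a VoiceInteractionService instead. For more details, see the Android developer documentation for this class.

The constructor in the framework source looks like this: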

/**
 * @param text The keyphrase text to get the detector for.
 * @param locale The java locale for the detector.
 * @param callback A non-null Callback for receiving the recognition events.
 * @param voiceInteractionService The current voice interaction service.
 * @param modelManagementService A service that allows management of sound models.
 *
 * @hide
 */
public AlwaysOnHotwordDetector(String text, Locale locale, Callback callback,
        KeyphraseEnrollmentInfo keyphraseEnrollmentInfo,
        IVoiceInteractionService voiceInteractionService,
        IVoiceInteractionManagerService modelManagementService) {
  mText = text;
  mLocale = locale;
  mKeyphraseEnrollmentInfo = keyphraseEnrollmentInfo;
  mKeyphraseMetadata = mKeyphraseEnrollmentInfo.getKeyphraseMetadata(text, locale);
  mExternalCallback = callback;
  mHandler = new MyHandler();
  mInternalCallback = new SoundTriggerListener(mHandler);
  mVoiceInteractionService = voiceInteractionService;
  mModelManagementService = modelManagementService;
  new RefreshAvailabiltyTask().execute();
}

  2. We can also detect a hotword using the Pocketsphinx library in Android. The library acts as a service in the Android app and is quite efficient when it comes to battery consumption. The implementation goes like this:

recognizer = defaultSetup()
   .setAcousticModel(new File(assetsDir, "en-us-ptm"))
   .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
   .getRecognizer();
recognizer.addListener(this);



// Create keyword-activation search.
recognizer.addKeyphraseSearch(KWS_SEARCH, KEYPHRASE);

// Create grammar-based searches.
File menuGrammar = new File(assetsDir, "menu.gram");
recognizer.addGrammarSearch(MENU_SEARCH, menuGrammar);

// Next search for digits
File digitsGrammar = new File(assetsDir, "digits.gram");
recognizer.addGrammarSearch(DIGITS_SEARCH, digitsGrammar);

// Create language model search.
File languageModel = new File(assetsDir, "weather.dmp");
recognizer.addNgramSearch(FORECAST_SEARCH, languageModel);


recognizer.startListening(searchName);

Now let us look at how we are implementing it in Susi Android:

In Susi Android, we use CMU’s Pocketsphinx library for hotword detection, with “Hi Susi” as the hotword.

private SpeechRecognizer recognizer;

/* Keyword we are looking for to activate menu */
private static final String KEYPHRASE = "hi susi";

/* Named searches allow to quickly reconfigure the decoder */
private static final String KWS_SEARCH = "hi susi";



The function setupRecognizer initializes the recognizer and registers the keyphrase search:



private void setupRecognizer(File assetsDir) throws IOException {
    // The recognizer can be configured to perform multiple searches
    // of different kind and switch between them
    recognizer = SpeechRecognizerSetup.defaultSetup()
            .setAcousticModel(new File(assetsDir, "en-us-ptm"))
            .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
            // Writes raw audio logs into assetsDir
            .setRawLogDir(assetsDir)
            .getRecognizer();

    // This class receives the recognition callbacks
    recognizer.addListener(this);

    // Register the keyword-activation search for "hi susi"
    recognizer.addKeyphraseSearch(KWS_SEARCH, KEYPHRASE);
}

This approach still needs improvement in the Susi app, as the success rate of keyword detection is not very high. We can improve it by training the model, or by accepting a set of strings similar to “Hi Susi”, such as “Hii Susi” or “Hi Sushi”. This should increase the rate of detection.
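
The idea of accepting similar strings can be sketched in plain Java as follows. This is only an illustration, assuming the recognizer returns its hypothesis as text; HotwordVariants is a hypothetical helper and is not part of Pocketsphinx or the Susi code base.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: accepts the hotword plus a few near-miss spellings. */
class HotwordVariants {
    // The variants named in this post; extend the set as needed.
    private static final Set<String> ACCEPTED = new HashSet<>(
            Arrays.asList("hi susi", "hii susi", "hi sushi"));

    /** Lower-cases the text and collapses runs of whitespace into one space. */
    static String normalize(String text) {
        return text.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    /** Returns true if the hypothesis is one of the accepted variants. */
    static boolean matches(String hypothesis) {
        return hypothesis != null && ACCEPTED.contains(normalize(hypothesis));
    }
}
```

With this helper, HotwordVariants.matches("Hi Sushi") is accepted just like "hi susi", while an unrelated phrase such as "hello susi" is rejected. Accepting more variants trades missed detections for a higher chance of false triggers, so the set should stay small.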

mayank408

Android Developer at Fossasia
