Implementing SUSI Linux App as a Finite State Machine

SUSI Linux app provides access to SUSI on Linux distributions on desktop as well as hardware devices like Raspberry Pi. It is a headless client that can be used to interact to SUSI via voice only. As more and more features like multiple hotword detection support and wake button support was added to SUSI Linux, the code became complex to understand and manage. A system was needed to model the app after. Finite State Machine is a perfect approach for such system.

The Wikipedia definition of State Machine is

It is an abstract machine that can be in exactly one of a finite number of states at any given time. The FSM can change from one state to another in response to some external inputs; the change from one state to another is called a transition.”

This means that if you can model your app into a finite number of states, you may consider using the State Machine implementation.

State Machine implementation has following advantages:

  • Better control over the working of the app.
  • Improved Error handling by making an Error State to handle errors.
  • States work independently which helps to modularize code in a better form.

To begin with, we declare an abstract State class. This class declares all the common properties of a state and transition method.

from abc import ABC, abstractclassmethod
import logging

class State(ABC):
   def __init__(self, components):
       self.components = components
       self.allowedStateTransitions = {}

   @abstractclassmethod
   def on_enter(self, payload=None):
       pass

   @abstractclassmethod
   def on_exit(self):
       pass

   def transition(self, state, payload=None):
       if not self.__can_transition(state):
           logging.warning("Invalid transition to State{0}".format(state))
           return

       self.on_exit()
       state.on_enter(payload)

   def __can_transition(self, state):
       return state in self.allowedStateTransitions.values()

We declared the on_enter() and on_exit() abstract method. These methods are executed on the entering and exiting a state respectively. The task designated for the state can be performed in the on_enter() method and it can free up resources or stop listening to callbacks in the on_exit() method. The transition method is to transition between one state to another. In a state machine, a state can transition to one of the allowed states only. Thus, we check if the transition is allowed or not before continuing it. The on_enter() and transition() methods additionally accepts a payload argument. This can be used to transfer some data to the state from the previous state.

We also added the components property to the State. Components store the shared components that can be used across all the State and are needed to be initialized only once. We create a component class declaring all the components that are needed to be used by states.

class Components:

   def __init__(self):
       recognizer = Recognizer()
       recognizer.dynamic_energy_threshold = False
       recognizer.energy_threshold = 1000
       self.recognizer = recognizer
       self.microphone = Microphone()
       self.susi = susi
       self.config = json_config.connect('config.json')

       if self.config['hotword_engine'] == 'Snowboy':
           from main.hotword_engine import SnowboyDetector
           self.hotword_detector = SnowboyDetector()
       else:
           from main.hotword_engine import PocketSphinxDetector
           self.hotword_detector = PocketSphinxDetector()

       if self.config['wake_button'] == 'enabled':
           if self.config['device'] == 'RaspberryPi':
               from ..hardware_components import RaspberryPiWakeButton
               self.wake_button = RaspberryPiWakeButton()
           else:
               self.wake_button = None
       else:
           self.wake_button = None

Now, we list out all the states that we need to implement in our app. This includes:

  • Idle State: App is listening for Hotword or Wake Button.
  • Recognizing State: App actively records audio from Microphone and performs Speech Recognition.
  • Busy State: SUSI API is called for the response of the query and the reply is spoken.
  • Error State: Upon any error in the above state, control transfers to Error State. This state needs to handle the speak the correct error message and then move the machine to Idle State.

Each state can be implemented by inheriting the base State class and implementing the on_enter() and on_exit() methods to implement the correct behavior.

We also declare a SUSI State Machine class to store the information about current state and declare the valid transitions for all the states.

class SusiStateMachine:
   def __init__(self):
       super().__init__()
       components = Components()
       self.__idle_state = IdleState(components)
       self.__recognizing_state = RecognizingState(components)
       self.__busy_state = BusyState(components)
       self.__error_state = ErrorState(components)
       self.current_state = self.__idle_state

       self.__idle_state.allowedStateTransitions = \
           {'recognizing': self.__recognizing_state, 'error': self.__error_state}
       self.__recognizing_state.allowedStateTransitions = \
           {'busy': self.__busy_state, 'error': self.__error_state}
       self.__busy_state.allowedStateTransitions = \
           {'idle': self.__idle_state, 'error': self.__error_state}
       self.__error_state.allowedStateTransitions = \
           {'idle': self.__idle_state}

       self.current_state.on_enter(payload=None)

We also set Idle State as the current State of the System. In this way, State Machine approach is implemented in SUSI Linux.

Resources:

 

Continue ReadingImplementing SUSI Linux App as a Finite State Machine

Managing States in SUSI MagicMirror Module

SUSI MagicMirror Module is a module for MagicMirror project by which you can use SUSI directly on MagicMirror. While developing the module, a problem I faced was that we need to manage the flow between the various stages of processing of voice input by the user and displaying SUSI output to the user. This was solved by making state management flow between various states of SUSI MagicMirror Module namely,

  • Idle State: When SUSI MagicMirror Module is actively listening for a hotword.
  • Listening State: In this state, the user’s speech input from the microphone is recorded to a file.
  • Busy State: The user has finished speaking or timed out. Now, we need to transcribe the audio spoken by the user, send the response to SUSI server and speak out the SUSI response.

The flow between these states can be explained by the following diagram:

As clear from the above diagram, transitions are not possible from a state to all other states. Only some transitions are allowed. Thus, we need a mechanism to guarantee only allowed transitions and ensure it triggers on the right time.

For achieving this, we first implement an abstract class State with common properties of a state. We store the information whether a state can transition into some other state in a map allowedTransitions which maps state names “idle”, “listening” and “busy” to their corresponding states. The transition method to transition from one state to another is implemented in the following way.

protected transition(state: State): void {
   if (!this.canTransition(state)) {
       console.error(`Invalid transition to state: ${state}`);
       return;
   }

   this.onExit();
   state.onEnter();
}

private canTransition(state: State): boolean {
   return this.allowedStateTransitions.has(state.name);
}

Here we first check if a transition is valid. Then we exit one state and enter into the supplied state.  We also define a state machine that initializes the default state of the Mirror and define valid transitions for each state. Here is the constructor for state machine.

constructor(components: IStateMachineComponents) {
        this.idleState = new IdleState(components);
        this.listeningState = new ListeningState(components);
        this.busyState = new BusyState(components);

        this.idleState.AllowedStateTransitions = new Map<StateName, State>([["listening", this.listeningState]]);
        this.listeningState.AllowedStateTransitions = new Map<StateName, State>([["busy", this.busyState], ["idle", this.idleState]]);
        this.busyState.AllowedStateTransitions = new Map<StateName, State>([["idle", this.idleState]]);

        this.currentState = this.idleState;
        this.currentState.onEnter();
}

Now, the question arises that how do we detect when we need to transition from one state to another. For that we subscribe on the Snowboy Detector Observable. We are using Snowboy library for Hotword Detection. Snowboy detects whether an audio stream is silent, has some sound or whether hotword was spoken. We bind all this information to an observable using the ReactiveX Observable pattern. This gives us a stream of events to which we can subscribe and get the results. It can be understood in the following code snippet.

detector.on("silence", () => {
   this.subject.next(DETECTOR.Silence);
});

detector.on("sound", () => {});

detector.on("error", (error) => {
   console.error(error);
});

detector.on("hotword", (index, hotword) => {
   this.subject.next(DETECTOR.Hotword);
});
public get Observable(): Observable<DETECTOR> {
   return this.subject.asObservable();
}

Now, in the idle state, we subscribe to the values emitted by the observable of the detector to know when a hotword is detected to transition to the listening state. Here is the code snippet for the same.

this.detectorSubscription = this.components.detector.Observable.subscribe(
   (value) => {
   switch (value) {
       case DETECTOR.Hotword:
           this.transition(this.allowedStateTransitions.get("listening"));
           break;
   }
});

In the listening state, we subscribe to the states emitted by the detector observable to find when silence is detected so that we can stop recording the audio stream for processing and move to busy state.

this.detectorSubscription = this.components.detector.Observable.subscribe(
   (value) => {
   switch (value) {
       case DETECTOR.Silence:
           record.stop();
           this.transition(this.allowedStateTransitions.get("busy"));
           break;
   }
});

The task of speaking the audio and displaying results on the screen is done by a renderer. The communication to renderer is done via a RendererCommunicator object using a notification system. We also bind its events to an observable so that we know when SUSI has finished speaking the result. To transition from busy state to idle state, we subscribe to renderer observable in the following manner.

this.rendererSubscription = this.components.rendererCommunicator.Observable.subscribe((type) => {
   if (type === "finishedSpeaking") {
       this.transition(this.allowedStateTransitions.get("idle"));
   }
});

In this way, we transition between various states of MagicMirror Module for SUSI in an efficient manner.

Resources

Continue ReadingManaging States in SUSI MagicMirror Module