voice-search – blog.fossasia.org

Adding Speech Component in Loklak Search

Post author:simsausaurabh
Post published:June 8, 2018
Post category:FOSSASIA GSoC loklak

A speech recognition service for voice search is already embedded in Loklak. The idea now is to use this service in a new, separate component for voice recognition with an interactive and user-friendly interface. This post covers each part of a redux-based Angular component, from writing the actions and reducers to using the finished component inside other components.

Creating Action and Reducer Function

The Action controls when the Speech Component is used in Loklak Search: the component is switched on and off based on this Action.

The first step is to create a speech.ts file in the actions folder with the following code:

import { Action } from '@ngrx/store';

export const ActionTypes = {
   MODE_CHANGE: '[Speech] Change',
};

export class SearchAction implements Action {
   type = ActionTypes.MODE_CHANGE;

   constructor(public payload: any) {}
}

export type Actions
   = SearchAction;

In the above segment, only one action (MODE_CHANGE) is defined. Its payload is a boolean that says whether the Speech Component is currently in use (true) or not (false). This is the basic format followed for Actions in Loklak Search. The next step is to create a speech.ts file in the reducers folder with the following code:

import { Action } from '@ngrx/store';
import * as speech from '../actions/speech';
export const MODE_CHANGE = 'MODE_CHANGE';

export interface State {
   speechStatus: boolean;
}
export const initialState: State = {
   speechStatus: false
};
export function reducer(state: State = initialState,
    action: speech.Actions): State {
    switch (action.type) {
        case speech.ActionTypes.MODE_CHANGE: {
            const response = action.payload;
            return Object.assign({}, state,
                {speechStatus: response});
        }
        default: {
            return state;
        }
    }
}
export const getspeechStatus = (state: State) =>
    state.speechStatus;

It follows the format of the reducer functions already used in Loklak Search. The key point is the shape of the state: State holds a single speechStatus of type boolean, and the initial state sets speechStatus to false (the Speech Component is not in use at first). For a MODE_CHANGE action, the reducer returns a new state whose speechStatus is set to the action's payload; for any other action it returns the input state unchanged. Finally, getspeechStatus is a small function that takes the state and returns its speechStatus value.
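To try the reducer logic outside Angular, here is a minimal standalone sketch. It does not import @ngrx/store, and the names SpeechAction, SpeechState and speechReducer are local to this example, not part of Loklak Search.

```typescript
// Standalone sketch: plain TypeScript, no @ngrx/store dependency.
const MODE_CHANGE = '[Speech] Change';

interface SpeechAction {
    type: string;
    payload: boolean;
}

interface SpeechState {
    speechStatus: boolean;
}

const initialSpeechState: SpeechState = { speechStatus: false };

function speechReducer(
    state: SpeechState = initialSpeechState,
    action: SpeechAction
): SpeechState {
    switch (action.type) {
        case MODE_CHANGE:
            // Copy the state and overwrite speechStatus with the payload.
            return { ...state, speechStatus: action.payload };
        default:
            // Unknown actions leave the state untouched.
            return state;
    }
}

const opened = speechReducer(initialSpeechState, { type: MODE_CHANGE, payload: true });
const closed = speechReducer(opened, { type: MODE_CHANGE, payload: false });
const ignored = speechReducer(opened, { type: '[Other] Noop', payload: false });
console.log(opened.speechStatus, closed.speechStatus, ignored === opened);
```

The reducer never mutates its input: each MODE_CHANGE produces a fresh object, while unrelated actions hand back the very same state reference.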

The third and last step in this section is to create a selector for the above reducer in the root reducer index file.

Import the speech reducer file, add its State to the general State in the root reducer file, and finally export the selector function for the speech reducer.

import * as fromSpeech from './speech';
export interface State {
   ...
   speech: fromSpeech.State;
}
export const getSpeechState = (state: State) =>
    state.speech;
export const getspeechStatus = createSelector(
    getSpeechState, fromSpeech.getspeechStatus);
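What the selector does can be sketched without any library. In the example below, composeSelectors is a hypothetical stand-in for createSelector that simply chains two functions (it does no memoization), and SpeechSlice and RootState are illustrative types.

```typescript
interface SpeechSlice { speechStatus: boolean; }
interface RootState { speech: SpeechSlice; }

// Feature-level selector: reads the flag from the speech slice.
const selectStatus = (slice: SpeechSlice): boolean => slice.speechStatus;
// Root-level selector: picks the speech slice out of the whole state.
const selectSpeechSlice = (state: RootState): SpeechSlice => state.speech;

// Hypothetical helper: plain function composition, no caching.
function composeSelectors<S, A, R>(
    first: (state: S) => A,
    second: (slice: A) => R
): (state: S) => R {
    return (state: S) => second(first(state));
}

const selectSpeechStatus = composeSelectors(selectSpeechSlice, selectStatus);
const currentStatus = selectSpeechStatus({ speech: { speechStatus: true } });
console.log(currentStatus);
```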

Creating Speech Component

Now comes the main part: creating the Speech Component and defining how it works. The basic Speech Component is generated with the following command:

ng generate component app/speech --module=app

It will automatically create the Speech Component and declare it in app.module.ts. The component is modelled on Google’s voice recognition feature for voice search. Rather than describing every line of code, the following portion covers the main code responsible for the functioning of the Speech Component.

Importing and defining speech service in constructor:

import {
    SpeechService
} from '../services/speech.service';
constructor(
       private speech: SpeechService,
       private store: Store<fromRoot.State>,
       private router: Router
   ) {
       this.resultspage = this.router.url
           .toString().includes('/search');
       if (this.resultspage) {
           this.shadowleft = '-103px';
           this.shadowtop = '-102px';
       }
       this.speechRecognition();
}
speechRecognition() {
    this.speech.record('en_US').subscribe(voice =>
        this.onquery(voice));
}

When the Speech Component is created, speechRecognition() starts recording speech (it uses the record() method of the speech service to record the user's voice).

To make the border height and color of the voice search icon fluctuate, a resettimer() method is created.

randomize(min, max) {
    let x;
    x = (Math.random() * (max - min) + min);
    return x;
}
resettimer(recheck: boolean = false) {
    this.subscription.unsubscribe();
    this.timer = Observable.timer(0, 100);
    this.subscription = this.timer.subscribe(t => {
        this.ticks = t;
        if (t % 10 === 0 && t <= 20) {
            this.buttoncolor = '#f44';
            this.miccolor = '#fff';
            this.borderheight =
                this.randomize(0.7, 1);
            if (this.resultspage) {
                this.borderheight =
                    this.randomize(0.35, 0.5);
            }
            if (!recheck) {
                this.resettimer(true);
            }
        }
        if (t === 20) {
            this.borderheight = 0;
        }
        if (t === 30) {
            this.subscription.unsubscribe();
            this.store.dispatch(new speechactions
            .SearchAction(false));
        }
    });
}

The randomize() method provides a random number between min and max value.
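As a quick check of the range, here is a typed version of randomize() run on the two ranges used for the border height. The variable names homeHeight and resultsHeight are only for this example.

```typescript
// Returns a floating-point number from min (inclusive) up to max
// (exclusive, apart from floating-point rounding at the upper edge).
function randomize(min: number, max: number): number {
    return Math.random() * (max - min) + min;
}

// Sample the two ranges used by resettimer().
const homeHeight = randomize(0.7, 1);
const resultsHeight = randomize(0.35, 0.5);
console.log(homeHeight, resultsHeight);
```

Because Math.random() never returns 1, the result stays below max in practice; the values are fractional, not integers.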

The following code in ngOnInit() sets the status message (for example, that the component is listening, or that the microphone should be checked) based on how much time has elapsed since the Speech Component was opened without speech being recorded.

ngOnInit() {
    this.timer = Observable.timer(1500, 2000);
    this.subscription = this.timer.subscribe(t => {
        this.ticks = t;
        if (t === 1) {
            this.message = 'Listening...';
        }
        if (t === 4) {
            this.message = 'Please check your ' +
                'microphone and volume levels.';
            this.miccolor = '#C2C2C2';
        }
        if (t === 6) {
            this.subscription.unsubscribe();
            this.store.dispatch(new speechactions
                .SearchAction(false));
        }
    });
}

The logic runs on timer ticks rather than seconds: Observable.timer(1500, 2000) emits tick 0 after 1.5 seconds and one more tick every 2 seconds. At tick 1 (about 3.5 seconds) the message says the component is listening to the speaker’s voice. At tick 4 (about 9.5 seconds) something is assumed to be wrong, and the user is asked to check the microphone and volume levels. Finally, at tick 6 (about 13.5 seconds), the Speech Component is closed by dispatching the Action defined above with false (meaning it is no longer in use).
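To see when each branch fires in wall-clock terms, the small sketch below converts a tick number into elapsed time. It assumes the usual RxJS timer(initialDelay, period) behaviour, where tick 0 is emitted after the initial delay and each later tick follows one period later; tickTimeMs is a helper made up for this example.

```typescript
// Time in milliseconds after subscription at which tick n is emitted
// by a timer created with the given initial delay and period.
function tickTimeMs(initialDelay: number, period: number, tick: number): number {
    return initialDelay + period * tick;
}

// Values used in ngOnInit(): Observable.timer(1500, 2000).
const listeningAt = tickTimeMs(1500, 2000, 1); // 3500 ms
const warningAt = tickTimeMs(1500, 2000, 4);   // 9500 ms
const closingAt = tickTimeMs(1500, 2000, 6);   // 13500 ms
console.log(listeningAt, warningAt, closingAt);
```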

Embed Speech Component in main App Component

The last part is to use the new component where it is needed. The code below embeds the Speech Component in the App Component.

Import SpeechService and the required modules.

import {
    SpeechService
} from './services/speech.service';
import { Observable } from 'rxjs/Observable';

hidespeech stores the current status of the Speech Component (whether it's in use or not), while completeQuery$ and searchData hold the recorded query as an Observable and as a String, respectively. completeQuery$ may not emit anything: if the Speech Component cannot pick up the speaker's voice for any reason, it contains no value and searchData stays empty.

hidespeech: Observable<any>;
completeQuery$: Observable<any>;
searchData: String;

Inject the speech service in the constructor, select the current speech status from the store and keep it in hidespeech. Based on the subscribed value of hidespeech, the speech service’s stoprecord() is called (to stop recording when speech recognition completes). After recording stops, the whole query is available through completeQuery$.

constructor (
    private speech: SpeechService,
    private store: Store<fromRoot.State>
) {
    this.hidespeech = store.select(
        fromRoot.getspeechStatus);
    this.hidespeech.subscribe(hidespeech => {
        if (!hidespeech) {
            this.speech.stoprecord();
        }
    });
    this.completeQuery$ = store.select(
        fromRoot.getQuery);
    this.completeQuery$.subscribe(data => {
        this.searchData = data;
    });
}

Add the Speech Component to app.component.html. Whether it is shown depends on the value emitted by the hidespeech observable: despite its name, it carries speechStatus, so the component is rendered when the value is true and removed when it is false.

<app-speech *ngIf="hidespeech|async"></app-speech>

Using Speech Component in Home and FeedHeader Component

Import the speech service and the speech Action created above, and create hidespeech to store the current status of the Speech Component.

import * as speechactions from '../../actions/speech';
import {
    SpeechService
} from '../../services/speech.service';
hidespeech: Observable<boolean>;

Inject SpeechService as speech and store the current status of the Speech Component in hidespeech. Dispatch speechactions.SearchAction with a payload of true to signal that the Speech Component is now in use.

constructor(
    private speech: SpeechService,
    private store: Store<fromRoot.State>
) {
    this.hidespeech = store
        .select(fromRoot.getspeechStatus);
}
speechRecognition() {
    this.store.dispatch(
        new speechactions.SearchAction(true));
}

How to use the Speech Component?

Go to Loklak and click on the Voice Input icon. A voice search screen will pop up.

Now, speak something to search for, e.g. Google. The screen will update to display the recognised words.

If something goes wrong (the microphone did not work, the volume was too low, or the voice was unrecognisable), the screen will ask you to check your microphone and volume levels.

On successful recognition of speech, the query will be set and the results will be shown.

A similar process is followed on the results page to make a search query using voice.

Resources

Angular docs. Angular.io: Tutorial
Rishabh (2017). Rishabh.io: Create Angular 4 Components

Adding ‘Voice Search’ Feature in Loklak Search

Post author:simsausaurabh
Post published:June 5, 2018
Post category:FOSSASIA GSoC loklak

A voice search feature is a useful addition to a search engine like loklak.org, which is built on a PWA (Progressive Web Application) architecture. It allows users to make a query by voice on a phone or computer. Voice search is integrated using the JavaScript Web Speech API (exposed in Chrome as webkitSpeechRecognition), which makes it possible to add speech recognition to any website or web application.

Integration Process

To use webkitSpeechRecognition, a separate TypeScript service is created along with its unit test:

speech.service.ts
speech.service.spec.ts

The main steps to implement voice search were:

Create an injectable speech service.
Write methods in it for starting and stopping voice recording.
Import the speech service and provide it in the root app module.
Use the speech service in HomeComponent and FeedHeaderComponent.

Structure of Speech Service

The speech service mainly consists of two methods.

The record() method accepts lang, a string specifying the language code for speech recording, and returns an Observable of strings.

   record(lang: string): Observable<string> {
       return Observable.create(observe => {
           const { webkitSpeechRecognition }: IWindow =
               <IWindow>window;
           this.recognition = new webkitSpeechRecognition();
           this.recognition.continuous = true;
           this.recognition.interimResults = true;
           // Emit the transcript of the latest result
           // inside the Angular zone.
           this.recognition.onresult = take =>
               this.zone.run(() =>
                   observe.next(take.results
                       .item(take.results.length - 1)
                       .item(0).transcript));
           this.recognition.onerror =
               err => observe.error(err);
           this.recognition.onend =
               () => observe.complete();
           this.recognition.lang = lang;
           this.recognition.start();
       });
   }

The onend handler calls observe.complete(), so the Observable completes when recognition ends, for example after the user stops speaking.
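The expression inside onresult picks the top alternative of the most recent result. The sketch below isolates that lookup so it can be tried outside a browser; the ...Like interfaces and the fakeResults helper are made up for this example and only mirror the parts of the browser's result list that the service uses.

```typescript
// Minimal shapes mirroring the parts of the recognition result list used here.
interface AlternativeLike { transcript: string; }
interface ResultLike { item(index: number): AlternativeLike; }
interface ResultListLike { length: number; item(index: number): ResultLike; }

// Picks the first (top) alternative of the most recent result.
function latestTranscript(results: ResultListLike): string {
    return results.item(results.length - 1).item(0).transcript;
}

// Hypothetical helper: builds a fake result list from plain strings.
function fakeResults(phrases: string[]): ResultListLike {
    return {
        length: phrases.length,
        item: (i: number) => ({ item: () => ({ transcript: phrases[i] }) })
    };
}

const heard = latestTranscript(fakeResults(['fossa', 'fossasia loklak']));
console.log(heard);
```

With interimResults enabled, this value is emitted repeatedly as the recognised phrase grows, so subscribers see the query being refined while the user speaks.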

stoprecord() method which is used to stop the current instance of recording.

 stoprecord() {
       if (this.recognition) {
           this.recognition.stop();
       }
   }

Here, stop() is the recognition object's own method; the guard avoids calling it before record() has created an instance.

Using Speech Service in required Component

In Loklak Search, the speech service has been included in two main components: HomeComponent and FeedHeaderComponent. The basic idea of using the speech service is the same in both.

In the TypeScript (not spec.ts) file of each component, first import SpeechService and inject it through the constructor.

constructor( private speech: SpeechService ) { }

Secondly, define a speechRecognition() method that uses the injected SpeechService instance to record speech and sets the recorded speech as the query. Here, the default language has been set to ‘en_US’, i.e. English.

speechRecognition() {
    this.speech.record('en_US').subscribe(voice =>
        this.onquery(voice));
}

After the user clicks on the speech icon, this method is called, the recorded speech is set as the query, and the router navigates to the /search page where the results are displayed.

Resources

George Ornbo (2014). Shapeshed: The HTML5 Speech Recognition API
Glen Shires, Hans Wennborg (2012). W3C: Web Speech API Specification

Implementation of Text-To-Speech Feature In Susper

Post author:harshit98
Post published:August 10, 2017
Post category:API FOSSASIA GSoC

Susper already has a voice search feature that gives the user a better search experience. We enhanced it by adding a Speech Synthesis (Text-To-Speech) feature. Speech synthesis should only work when a voice search is attempted.

The idea was to create speech synthesis similar to that of the market leader. Here is the link to a YouTube video showing a demo of the feature: Video link

The video demonstrates that:

If a manual search is used then the feature should not work.
If voice search is used then the feature should work.

To implement this feature, we used the Speech Synthesis API, which is available in Google Chrome 33 and above.

window.speechSynthesis.speak(new SpeechSynthesisUtterance('Hello world!')); can be run in the browser console to check whether the browser supports this feature: speak() expects a SpeechSynthesisUtterance object rather than a plain string.
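A support check can also be done in code without speaking anything. The sketch below is one possible approach, not Susper's implementation: supportsSpeechSynthesis is a hypothetical helper that only tests whether the two globals exist on a window-like object.

```typescript
// Returns true when the given window-like object exposes both
// speechSynthesis and the SpeechSynthesisUtterance constructor.
function supportsSpeechSynthesis(win: object): boolean {
    return 'speechSynthesis' in win && 'SpeechSynthesisUtterance' in win;
}

// Plain objects stand in for window so the check runs outside a browser.
const withApi = supportsSpeechSynthesis({
    speechSynthesis: {},
    SpeechSynthesisUtterance: function () { /* stub */ }
});
const withoutApi = supportsSpeechSynthesis({});
console.log(withApi, withoutApi);
```

In a browser, the same function would be called as supportsSpeechSynthesis(window).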

First, we created an interface:

interface IWindow extends Window {
  SpeechSynthesisUtterance: any;
  speechSynthesis: any;
}

Then under @Injectable we created a class for the SpeechSynthesisService.

export class SpeechSynthesisService {
  utterence: any;

  constructor(private zone: NgZone) {}

  speak(text: string): void {
    const { SpeechSynthesisUtterance }: IWindow = <IWindow>window;
    const { speechSynthesis }: IWindow = <IWindow>window;
    this.utterence = new SpeechSynthesisUtterance();
    this.utterence.text = text; // text to utter
    this.utterence.lang = 'en-US'; // default language
    this.utterence.volume = 1; // it can be set between 0 and 1
    this.utterence.rate = 1; // it can be set between 0.1 and 10
    this.utterence.pitch = 1; // it can be set between 0 and 2
    (window as any).speechSynthesis.speak(this.utterence);
  }

  // to pause the queue of utterances
  pause(): void {
    const { speechSynthesis }: IWindow = <IWindow>window;
    const { SpeechSynthesisUtterance }: IWindow = <IWindow>window;
    this.utterence = new SpeechSynthesisUtterance();
    (window as any).speechSynthesis.pause();
  }
}

The above code implements the Text-To-Speech feature.
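If volume, rate and pitch ever come from user settings rather than constants, it may help to keep them inside the ranges the Web Speech API defines (volume 0 to 1, rate 0.1 to 10, pitch 0 to 2). The sketch below is an illustration only; UtteranceSettings, clamp and normalizeSettings are hypothetical names that do not exist in Susper.

```typescript
interface UtteranceSettings {
    volume: number;
    rate: number;
    pitch: number;
}

// Restricts value to the closed range [min, max].
function clamp(value: number, min: number, max: number): number {
    return Math.min(max, Math.max(min, value));
}

// Clamps each setting to its documented range:
// volume 0..1, rate 0.1..10, pitch 0..2.
function normalizeSettings(s: UtteranceSettings): UtteranceSettings {
    return {
        volume: clamp(s.volume, 0, 1),
        rate: clamp(s.rate, 0.1, 10),
        pitch: clamp(s.pitch, 0, 2)
    };
}

const safeSettings = normalizeSettings({ volume: 1.5, rate: 0, pitch: 3 });
console.log(safeSettings); // { volume: 1, rate: 0.1, pitch: 2 }
```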

The source code for the implementation can be found here: https://github.com/fossasia/susper.com/blob/master/src/app/speech-synthesis.service.ts

We call speech synthesis only when voice search mode is activated. Here we used redux to check whether the mode is ‘speech’ or not. When the mode is ‘speech’ then it should utter the description inside the infobox.

We did the following changes in infobox.component.ts:

import { SpeechSynthesisService } from '../speech-synthesis.service';
speechMode: any;
constructor(private synthesis: SpeechSynthesisService) { }
this.query$ = store.select(fromRoot.getwholequery);

this.query$.subscribe(query => {
  this.keyword = query.query;
  this.speechMode = query.mode;
});

And we added a conditional statement to check whether mode is ‘speech’ or not.

// conditional statement
// to check if mode is 'speech' or not
if (this.speechMode === 'speech') {
  this.startSpeaking(this.results[0].description);
}

startSpeaking(description) {
  this.synthesis.speak(description);
  this.synthesis.pause();
}
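The condition above reads this.results[0] directly, which assumes at least one result with a description is present. A slightly more defensive variant might look like the sketch below; textToSpeak and ResultItem are hypothetical names used only for illustration.

```typescript
interface ResultItem {
    description?: string;
}

// Returns the text to utter, or null when nothing should be spoken:
// either the search was not made by voice, or there is no description.
function textToSpeak(mode: string, results: ResultItem[]): string | null {
    if (mode !== 'speech') {
        return null;
    }
    const first = results[0];
    if (!first || !first.description) {
        return null;
    }
    return first.description;
}

const spoken = textToSpeak('speech', [{ description: 'FOSSASIA is an open source community.' }]);
const silentManual = textToSpeak('text', [{ description: 'ignored' }]);
const silentEmpty = textToSpeak('speech', []);
console.log(spoken, silentManual, silentEmpty);
```

The component would then call this.synthesis.speak() only when the helper returns a string.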

The source code for the implementation can be found here: https://github.com/fossasia/susper.com/commit/3624d504c4687c227016b4fea229c680ad80a613


Implementing Voice Search In Susper (in Chrome only)

Post author:harshit98
Post published:June 29, 2017
Post category:API FOSSASIA GSoC

Last week @mariobehling opened an issue to implement voice search in Susper. Google Chrome provides an API to integrate speech recognition into any website. More about the API can be read here: https://shapeshed.com/html5-speech-recognition-api/

The explanation there is in JavaScript, but the code below follows the syntax of Angular 4 and TypeScript. I created a speech service consisting of these files:

speech-service.ts
speech-service.spec.ts

Code for speech-service.ts: This is the code which will control the working of voice search.

import { Injectable, NgZone } from '@angular/core';
import { Observable } from 'rxjs/Rx';

interface IWindow extends Window {
  webkitSpeechRecognition: any;
}

@Injectable()
export class SpeechService {
  constructor(private zone: NgZone) { }

  record(lang: string): Observable<string> {
    return Observable.create(observe => {
      const { webkitSpeechRecognition }: IWindow = <IWindow>window;
      const recognition = new webkitSpeechRecognition();
      recognition.continuous = true;
      recognition.interimResults = true;
      recognition.onresult = take => this.zone.run(() =>
        observe.next(take.results.item(take.results.length - 1).item(0).transcript)
      );
      recognition.onerror = err => observe.error(err);
      recognition.onend = () => observe.complete();
      recognition.lang = lang;
      recognition.start();
    });
  }
}

You can find more details about the API at the link given at the start of this post. The line recognition.onend = () => observe.complete() plays an important role here, and many developers forget it when working on a voice search feature. Whenever recognition ends, for example because the user has stopped speaking, the Observable completes, which signals that the voice action is finished and the search can be attempted. The component subscribes to it like this:

speechRecognition() {
  this.speech.record('en_US').subscribe(voice => this.onquery(voice));
}

Here we use the speechRecognition() function; onquery() is the function called when a query is entered in the search bar.

The default language has been set to ‘en_US’, i.e. English. We also created an interface (IWindow) to link the service with the API that Google Chrome provides for adding voice search to any website.

I have also used NgZone. What is NgZone? It is an injectable Angular service for executing work inside or outside of the Angular zone. I won’t go into much detail about it here; more can be found on the Angular docs website.

We have also implemented a microphone icon on the search bar, similar to Google’s. This is how Susper’s homepage looks now:

This feature only works in Google Chrome and not in Firefox. So there was no need to show the microphone icon in Firefox, since voice search does not work there. We simply used CSS like this:

@-moz-document url-prefix() {
  .microphone {
    display: none;
  }
}

@-moz-document url-prefix() is used to target elements in Firefox only. Using this rule, we hide the microphone icon in Firefox and show it in Chrome.
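An alternative to browser-specific CSS, not used in Susper, is feature detection: show the microphone only when a recognition constructor is actually present. The sketch below assumes the API is exposed either under the standard name SpeechRecognition or under Chrome's prefixed webkitSpeechRecognition; findSpeechRecognition is a hypothetical helper.

```typescript
// Returns the speech recognition constructor found on a window-like
// object, or null when neither the standard nor the prefixed name exists.
function findSpeechRecognition(win: { [key: string]: any }): any {
    return win['SpeechRecognition'] || win['webkitSpeechRecognition'] || null;
}

// Plain objects stand in for window so the check runs outside a browser.
const chromeLike = { webkitSpeechRecognition: function () { /* stub */ } };
const unsupported = {};

const showMicInChrome = findSpeechRecognition(chromeLike) !== null;
const showMicElsewhere = findSpeechRecognition(unsupported) !== null;
console.log(showMicInChrome, showMicElsewhere);
```

A component could store the result in a boolean field and bind the microphone icon to it with *ngIf, so the icon follows API support rather than the browser's name.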

For first-time users: to use voice search, click on the microphone icon, which triggers the speechRecognition() function and asks for permission to let your laptop/desktop microphone detect your voice. Once you allow it, we’re done! Now the user can easily use voice search on Susper to search for anything.