Change Text-to-Speech Voice Language of SUSI in SUSI iOS

The SUSI iOS app now enables the user to change the text-to-speech voice language within the app. The user can select any of the 37 available languages. To change the text-to-speech voice language, go to Settings > Change SUSI’s Voice and choose the language of your choice. Let’s see how this feature is implemented. Apple’s AVFoundation API is used to implement the text-to-speech feature in SUSI iOS. AVFoundation offers 37 voice languages that can be used as the text-to-speech voice accent. AVFoundation’s AVSpeechSynthesisVoice API can be used to select a voice appropriate to the language of the text to be spoken, or to select a voice exhibiting a particular local variant of that language (such as Australian or South African English). To print the list of all languages offered by AVFoundation:

    import AVFoundation
    print(AVSpeechSynthesisVoice.speechVoices())

Alternatively, the complete list of supported languages can be found at Languages Supported by VoiceOver. When the user taps Change SUSI’s Voice in settings, a screen is presented with the list of available languages along with their language codes. A dictionary holds the available languages with language name and language code and is used as the data source for the tableView:

    var voiceLanguagesList: [Dictionary<String, String>] = []

When the user chooses a language and taps Done, we store the chosen language in UserDefaults:

    UserDefaults.standard.set(voiceLanguagesList[selectedVoiceLanguage][ControllerConstants.ChooseLanguage.languageCode], forKey: ControllerConstants.UserDefaultsKeys.languageCode)
    UserDefaults.standard.set(voiceLanguagesList[selectedVoiceLanguage][ControllerConstants.ChooseLanguage.languageName], forKey: ControllerConstants.UserDefaultsKeys.languageName)

The language name and language code chosen by the user are displayed in settings, so the user knows which language is currently being used for the text-to-speech voice.
To select a voice for use in speech, we obtain an AVSpeechSynthesisVoice instance using one of the methods in Finding Voices and then set it as the value of the voice property on the AVSpeechUtterance instance containing the text to be spoken. The language code previously stored in the UserDefaults shared instance is used to set the text-to-speech language for AVSpeechSynthesisVoice:

    if let selectedLanguage = UserDefaults.standard.object(forKey: ControllerConstants.UserDefaultsKeys.languageCode) as? String {
        speechUtterance.voice = AVSpeechSynthesisVoice(language: selectedLanguage)
    }

AVSpeechUtterance encapsulates a chunk of text to be spoken, along with parameters that affect its speech.

Resources -
UserDefaults: https://developer.apple.com/documentation/foundation/userdefaults
AVSpeechSynthesisVoice: https://developer.apple.com/documentation/avfoundation/avspeechsynthesisvoice
AVFoundation: https://developer.apple.com/av-foundation/
SUSI iOS: https://github.com/fossasia/susi_iOS
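The store-on-Done, read-on-speak round trip above is a simple key-value persistence pattern. Here is a minimal Python sketch of the same flow (illustrative only; the helper names and the `prefs` dictionary are hypothetical stand-ins for the app's Swift code and UserDefaults):

```python
# Sketch of the voice-language selection round trip.
# voice_languages_list mirrors the app's data source of name/code pairs;
# `prefs` stands in for a persistent key-value store like UserDefaults.

voice_languages_list = [
    {"languageName": "English (Australia)", "languageCode": "en-AU"},
    {"languageName": "English (South Africa)", "languageCode": "en-ZA"},
    {"languageName": "French (France)", "languageCode": "fr-FR"},
]

prefs = {}  # hypothetical stand-in for UserDefaults

def save_selected_language(index):
    """Store the chosen language's code and name, as done on 'Done'."""
    choice = voice_languages_list[index]
    prefs["languageCode"] = choice["languageCode"]
    prefs["languageName"] = choice["languageName"]

def stored_language_code(default="en-US"):
    """Read the code back when configuring the speech voice."""
    return prefs.get("languageCode", default)

save_selected_language(2)
print(stored_language_code())  # -> fr-FR
```

Before any selection is made, `stored_language_code()` falls back to a default, which matches the behaviour of reading an unset UserDefaults key.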


STOP action implementation in SUSI iOS

You may have noticed that you can stop Google Home or Amazon Alexa during an ongoing task. The same feature is now available in SUSI. SUSI can respond to a ‘stop’ action and stop ongoing tasks (e.g. if SUSI is narrating a story and the user says STOP, it stops narrating the story). The ‘stop’ action was introduced to enable the user to make SUSI stop anything it’s doing. A video demonstration of how the stop action works in the SUSI iOS app can be found here. The stop action is implemented on SUSI iOS, Web Chat, and Android. Here we will see how it is implemented in SUSI iOS. When you ask SUSI to stop, you get the following actions object from the server side:

    "actions": [{"type": "stop"}]

The full JSON response can be found here. When SUSI responds with a ‘stop’ action, we create a new action type ‘stop’ and assign the `Message` object’s `actionType` to ‘stop’. Adding ‘stop’ to the action type:

    enum ActionType: String {
        ... // other action types
        case stop
    }

Assigning it to the message object:

    if type == ActionType.stop.rawValue {
        message.actionType = ActionType.stop.rawValue
        message.message = ControllerConstants.stopMessage
        message.answerData = AnswerAction(action: action)
    }

A new collectionView cell is created to respond to the user with “stopped” text. Registering the stopCell:

    collectionView?.register(StopCell.self, forCellWithReuseIdentifier: ControllerConstants.stopCell)

Adding the cell to the chat screen:

    if message.actionType == ActionType.stop.rawValue {
        if let cell = collectionView.dequeueReusableCell(withReuseIdentifier: ControllerConstants.stopCell, for: indexPath) as? StopCell {
            cell.message = message
            let message = ControllerConstants.stopMessage
            let estimatedFrame = self.estimatedFrame(message: message)
            cell.setupCell(estimatedFrame, view.frame)
            return cell
        }
    }

AVFoundation’s AVSpeechSynthesizer API is used to stop the speech:

    func stopSpeakAction() {
        speechSynthesizer.stopSpeaking(at: AVSpeechBoundary.immediate)
    }

This method immediately stops the speak action.
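On the client side, everything hinges on spotting that one actions entry in the server response. As a language-neutral illustration (a Python sketch, not the app's Swift code), detecting the ‘stop’ action in a trimmed SUSI-style JSON payload could look like:

```python
import json

# A trimmed SUSI-style response; the real chat.json payload carries
# many more fields alongside the actions array.
response_text = '{"answers": [{"actions": [{"type": "stop"}]}]}'

def contains_stop_action(raw):
    """Return True if any action in any answer has type 'stop'."""
    payload = json.loads(raw)
    for answer in payload.get("answers", []):
        for action in answer.get("actions", []):
            if action.get("type") == "stop":
                return True
    return False

print(contains_stop_action(response_text))  # -> True
```

When this predicate fires, the client would set the message's action type to ‘stop’ and halt any ongoing speech, as the Swift code above does.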
Resources -
About SUSI: https://chat.susi.ai/overview
JSON response for ‘stop’ action: https://api.susi.ai/susi/chat.json?timezoneOffset=-330&q=susi+stop
AVSpeechSynthesisVoice: https://developer.apple.com/documentation/avfoundation/avspeechsynthesisvoice
AVFoundation: https://developer.apple.com/av-foundation/
SUSI iOS: https://github.com/fossasia/susi_iOS
SUSI Android: https://github.com/fossasia/susi_android
SUSI Web Chat: https://chat.susi.ai/


Implementing Text To Speech Settings in SUSI WebChat

SUSI Web Chat has a Text to Speech (TTS) feature where it gives voice replies for user queries. The Text to Speech functionality was added using the Speech Synthesis feature of the Web Speech API. Text to Speech settings were added to customise the speech output by controlling features like:

- Language
- Rate
- Pitch

Let us visit SUSI Web Chat and try it out. First, ensure that the settings have Speech Output or Speech Output Always enabled. Then click on the mic button and ask a query. SUSI responds to your query with a voice reply. To control the speech output, visit the Text To Speech settings in the /settings route. First, let us look at the language settings. The drop-down list for Language is populated when the app is initialised. The speechSynthesis.onvoiceschanged function is triggered when the app first loads. There we call speechSynthesis.getVoices() to get the list of all the languages currently supported by that particular browser. We store this in the MessageStore using the ActionTypes.INIT_TTS_VOICES action type.

    window.speechSynthesis.onvoiceschanged = function () {
      if (!MessageStore.getTTSInitStatus()) {
        var speechSynthesisVoices = speechSynthesis.getVoices();
        Actions.getTTSLangText(speechSynthesisVoices);
        Actions.initialiseTTSVoices(speechSynthesisVoices);
      }
    };

We also get the translated text of `This is an example of speech synthesis` for every language present in the voice list, using the Google Translate API. This is called initially for all the languages, and the result is stored as the translatedText attribute on each element of the voice list. It is used later when the user wants to listen to an example of speech output for a selected language, rate, and pitch.

    https://translate.googleapis.com/translate_a/single?client=gtx&sl=en-US&tl=TARGET_LANGUAGE_CODE&dt=t&q=TEXT_TO_BE_TRANSLATED

When the user visits the Text To Speech settings, the voice list stored in the MessageStore is retrieved and the drop-down menu for Language is populated.
The default language is fetched from the UserPreferencesStore and is accordingly highlighted in the drop-down. The list is parsed and populated as a drop-down using the populateVoiceList() function.

    let voiceMenu = voices.map((voice, index) => {
      if (voice.translatedText === null) {
        voice.translatedText = this.speechSynthesisExample;
      }
      langCodes.push(voice.lang);
      return (
        <MenuItem value={voice.lang}
                  key={index}
                  primaryText={voice.name + ' (' + voice.lang + ')'} />
      );
    });

The language selected using this drop-down is only used as the language for the speech output when the server doesn’t specify the language in its response and the browser language is undefined. We then create sliders using Material UI for adjusting speech rate and pitch.

    <h4 style={{'marginBottom':'0px'}}><Translate text="Speech Rate"/></h4>
    <Slider
      min={0.5}
      max={2}
      value={this.state.rate}
      onChange={this.handleRate} />

The ranges for the sliders are:

- Rate: 0.5 - 2
- Pitch: 0 - 2

The default value for both rate and pitch is 1. We create a controlled slider, saving the values in state and using the onChange function to record changes in values. The Reset buttons can be used to reset the rate and pitch values to their defaults. Once the language, rate, and pitch values have been selected, we can click on `Play a short demonstration of speech synthesis` to listen to a voice reply with the chosen settings.

    { this.state.playExample && (
        <VoicePlayer
          play={this.state.play}
          text={voiceOutput.voiceText}
          rate={this.state.rate}
          pitch={this.state.pitch}
          lang={this.state.ttsLanguage}
          onStart={this.onStart}
          onEnd={this.onEnd} />
      )
    }…
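The Google Translate request shown above is just a templated GET URL. Here is a Python sketch of filling in the target language code and query text with proper URL encoding (illustrative only; the web chat builds this URL in JavaScript):

```python
from urllib.parse import urlencode

BASE = "https://translate.googleapis.com/translate_a/single"

def translate_url(target_lang, text, source_lang="en-US"):
    """Build the translate_a/single URL used to fetch the example
    sentence in a target language, URL-encoding the query text."""
    params = {
        "client": "gtx",
        "sl": source_lang,
        "tl": target_lang,
        "dt": "t",
        "q": text,
    }
    return BASE + "?" + urlencode(params)

url = translate_url("fr", "This is an example of speech synthesis")
print(url)
```

Encoding the `q` parameter matters because the example sentence contains spaces; urlencode handles that here, just as the browser's URL machinery does in the web chat.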


Adding EditText With Google Input Option While Sharing In Phimpme App

In the Phimpme Android app, a user can share images on multiple platforms. While sharing, we have also included a caption option to enter a description of the image. That caption can be entered using the keyboard as well as the Google Voice Input method. So in this post, I will be explaining how to add an EditText with the Google voice input option. Let’s get started.

Step 1: Add the EditText and mic button in the layout file.

    <ImageView
        android:id="@+id/button_mic"
        android:layout_width="20dp"
        android:layout_height="20dp"
        android:background="?android:attr/selectableItemBackground"
        android:src="@drawable/ic_mic_black"
        android:scaleType="fitCenter" />
    </RelativeLayout>

Caption option in the Share Activity in Phimpme: in Phimpme we have a material design dialog box, so right now I am using getTextInputDialogBox(). It prompts the material design dialog box to enter a caption before sharing the image on multiple platforms.

Step 2: Now we can get the caption from the EditText easily using the following code.

    if (!captionText.isEmpty()) {
        caption = captionText;
        text_caption.setText(caption);
        captionEditText.setSelection(caption.length());
    } else {
        caption = null;
        text_caption.setText(caption);
    }

Step 3: Add the Google voice input option. To use the Google voice input option, I have added a global function in the Utils class. To use that method, just call it with the proper arguments.
    public static void promptSpeechInput(Activity activity, int requestCode, View parentView, String promptMsg) {
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault());
        intent.putExtra(RecognizerIntent.EXTRA_PROMPT, promptMsg);
        try {
            activity.startActivityForResult(intent, requestCode);
        } catch (ActivityNotFoundException a) {
            SnackBarHandler.show(parentView, activity.getString(R.string.speech_not_supported));
        }
    }

Just pass the request code, which is used to receive the speech text in the onActivityResult() method, and the prompt message that will be visible to the user.

Step 4: Set the recognised text as the caption.

    if (requestCode == REQ_CODE_SPEECH_INPUT && data != null) {
        ArrayList<String> result = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
        String voiceInput = result.get(0);
        text_caption.setText(voiceInput);
        caption = voiceInput;
        return;
    }

Now we can set the text in the caption string. Right now I am adding the recognised text to the existing caption, i.e. if the user enters some text using the EditText and then clicks on the mic button, the extra text will be added after the previous text. So this is how I used the Google voice input method and made the function global.

Resources:
Stack Overflow example: https://stackoverflow.com/questions/18049157/how-to-programmatically-initiate-a-google-now-voice-search
Another post on Google voice input: https://www.androidhive.info/2014/07/android-speech-to-text-tutorial/
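The append-after-existing-text behaviour described above is plain string handling: take the top recognition result and join it onto whatever the user already typed. A Python sketch of that merging logic (the helper name is hypothetical; the app does this in Java inside onActivityResult):

```python
def append_voice_input(existing_caption, recognized_results):
    """Take the best speech-recognition hypothesis (first in the list,
    as with RecognizerIntent.EXTRA_RESULTS) and append it to the
    caption typed so far, separated by a single space."""
    if not recognized_results:
        return existing_caption  # nothing recognised, keep the caption
    voice_input = recognized_results[0]
    if existing_caption:
        return existing_caption + " " + voice_input
    return voice_input

print(append_voice_input("Sunset at the beach", ["no filter"]))
# -> Sunset at the beach no filter
```

Guarding against an empty results list mirrors the `data != null` check in the Java code, since the recogniser can return without any hypotheses.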


Adding IBM Watson TTS Support in Susi Assistant on Raspberry Pi

The SUSI Hardware project aims at creating a smart assistant for your home that you can run on your Raspberry Pi or similar development boards. I previously wrote a blog post on choosing a perfect text-to-speech engine for SUSI AI and had used Flite as the solution. While Flite is an open-source solution that can run locally on a client, it does not provide the same quality of voice and speed as cloud providers. We always crave a more natural voice for better interaction with our assistant, and it is always good to have more options. We therefore added the IBM Watson Text to Speech API to the SUSI Hardware project. IBM Watson TTS can be added to a Python project easily using the IBM Watson Developer SDK. To use the SDK for Text to Speech, first of all we need to sign up for Bluemix: https://console.bluemix.net/registration/ After that, we will get an empty dashboard without any services added. We need to create a Text to Speech service. To do so, click on the Create Watson Service button, select Watson on the left pane, and then select the Text to Speech service from the list. Select the standard plan from the options and then click on the Create button. You will get service credentials for your newly created Text to Speech service. Save them for future reference. After that, we need to add the Watson Developer Cloud Python package:

    sudo pip3 install watson-developer-cloud

On Ubuntu with Python 3.5, watson-developer-cloud has some extra dependencies. Install them using the following command:

    sudo apt install libssl-dev

Now we can add Text to Speech to our project. For that, we first need to import the TextToSpeechV1 class:

    from watson_developer_cloud import TextToSpeechV1

Now we need to create a new TextToSpeechV1 object using the service credentials we created earlier.
    text_to_speech = TextToSpeechV1(
        username='API_USERNAME',
        password='API_PASSWORD')

We can now synthesise a text input and write the incoming speech stream from the IBM Watson API to a file:

    with open('output.wav', 'wb') as audio_file:
        audio_file.write(
            text_to_speech.synthesize(text, accept='audio/wav',
                                      voice='en-US_AllisonVoice'))

In the above code snippet, we open an output file ‘output.wav’ for writing and write the binary audio data returned by the text_to_speech.synthesize method. IBM Watson provides many free voices; we supply an argument specifying which voice to use. Here we are using the English female voice ‘en-US_AllisonVoice’. You may test out more voices in the online demo here and select the voice you find best. We can play the ‘output.wav’ file using the play command from SoX. To do so, we need to install the SoX binary:

    sudo apt install sox libsox-fmt-all

We can now play the file easily using the following code:

    import os
    os.system('play output.wav')

The above code invokes the ‘play’ command from the SoX package to play the audio file. We could also use PyAudio to play the audio file, but it would require us to manage the audio thread separately; thus, SoX is the better solution.

Resources:…
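The synthesise-and-save step can be pulled into a small helper so the Watson client is easy to swap out or fake during development. A minimal sketch under the same SDK assumptions as above (the `synth` callable stands in for text_to_speech.synthesize; the fake synthesizer below is purely for illustration, not real audio):

```python
def save_speech(synth, text, path, voice="en-US_AllisonVoice"):
    """Fetch WAV bytes from a synthesize-style callable and write
    them to `path`, mirroring the snippet above."""
    audio = synth(text, accept="audio/wav", voice=voice)
    with open(path, "wb") as audio_file:
        audio_file.write(audio)
    return path

# With the real client this would be:
#   save_speech(text_to_speech.synthesize, "Hello!", "output.wav")
# For illustration, a fake synthesizer returning placeholder bytes:
def fake_synth(text, accept, voice):
    return b"RIFF....WAVE"  # placeholder bytes, not real audio

save_speech(fake_synth, "Hello!", "output.wav")
```

Injecting the synthesizer this way also makes it trivial to switch back to Flite or another engine later without touching the file-writing code.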


Implementing Text to Speech on SUSI Web Chat

SUSI Web Chat now gives voice replies while chatting, similar to the SUSI Android and SUSI iOS clients. To test the voice output on Chrome:

- Visit chat.susi.ai
- Click on the mic input button.
- Say something using the mic when the Speak Now view appears.

The simplest way to add text-to-speech to your website is by using the official Speech API, currently available in the Chrome browser. The following steps help to achieve it in ReactJS. Initialise the state defaults in a component named VoicePlayer:

    const defaults = {
      text: '',
      volume: 1,
      rate: 1,
      pitch: 1,
      lang: 'en-US'
    }

There are two states which need to be maintained throughout the component and passed as such in our props:

    this.state = {
      started: false,
      playing: false
    }

Our next step is to make use of the Web Speech API functions to carry out the relevant processes:

- speak() - window.speechSynthesis.speak(this.speech) - calls the speak method of the Speech API
- cancel() - window.speechSynthesis.cancel() - calls the cancel method of the Speech API

We then use our component lifecycle functions to assign event listeners and listen to any changes occurring in the background. For this purpose we make use of componentWillReceiveProps(), shouldComponentUpdate(), componentDidMount(), componentWillUnmount(), and render().

- componentWillReceiveProps() receives the object parameter {pause} to listen for any pause action.
- shouldComponentUpdate() simply returns false if no updates are to be made to the speech outputs.
- componentDidMount() is the master function which listens for the start action, adds the start and end event listeners, and calls the speak() function if the prop play is true.
- componentWillUnmount() destroys the speech object and ends the speech.
Here’s a code snippet for componentDidMount():

    componentDidMount () {
      const events = [
        { name: 'start', action: this.props.onStart }
      ]
      // Adding event listeners
      events.forEach(e => {
        this.speech.addEventListener(e.name, e.action)
      })
      this.speech.addEventListener('end', () => {
        this.setState({ started: false })
        this.props.onEnd()
      })
      if (this.props.play) {
        this.speak()
      }
    }

We then add proper props validation to our VoicePlayer component:

    VoicePlayer.propTypes = {
      play: PropTypes.bool,
      text: PropTypes.string,
      onStart: PropTypes.func,
      onEnd: PropTypes.func
    };

The next step is to pass the props from a listener view to the VoicePlayer component. The listener here is the component MessageListItem.js, from where the voice player is initialised. The first step is to initialise the state:

    this.state = {
      play: false,
    }

    onStart = () => {
      this.setState({ play: true });
    }

    onEnd = () => {
      this.setState({ play: false });
    }

Next, we set play to true when we want to pass the props and the text to be spoken, and append the player to the message list items which have voice set to `true`:

    { this.props.message.voice &&
      (<VoicePlayer
         play
         text={voiceOutput}
         onStart={this.onStart}
         onEnd={this.onEnd} />)}

Finally, our messages with voice set to true will be heard on the speaker as they were spoken into the microphone. To get access to the full code, go to the repository https://github.com/fossasia/chat.susi.ai or visit our chat channel at gitter.im/fossasia/susi_webchat

Resources
Speak-easy-synthesis repository: http://mdn.github.io/web-speech-api/speak-easy-synthesis
Web-speech-api repository: https://github.com/mdn/web-speech-api/
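The onStart/onEnd handshake between MessageListItem and VoicePlayer amounts to a tiny state machine: play flips to true when speech starts and back to false when the 'end' event fires. A Python sketch of that callback flow (illustrative only; the real implementation is the React component above):

```python
class VoicePlayerState:
    """Mimics the play flag toggled by the onStart/onEnd callbacks."""

    def __init__(self):
        self.play = False
        self.spoken = []

    def on_start(self):
        self.play = True  # 'start' event fired

    def on_end(self):
        self.play = False  # 'end' event fired

    def speak(self, text):
        # Corresponds to the Speech API firing 'start', producing
        # the audio, then firing 'end'.
        self.on_start()
        self.spoken.append(text)
        self.on_end()

player = VoicePlayerState()
player.speak("Hello from SUSI")
print(player.play)  # -> False (speech has finished)
```

Keeping the flag in one place like this is what lets the listener component re-render consistently whenever a reply starts or stops speaking.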
