IBM / Watson Speech API
Watson Speech to Text converts spoken audio into written text. The service combines knowledge of grammar and language structure with an analysis of the audio signal to produce the transcription, and it supports multiple languages. Transcription results are sent back to the client continuously with minimal delay and are corrected as more speech is heard. The service can also detect one or more keywords in the audio stream. It is accessed via a WebSocket connection or a REST API.
API reference : https://www.ibm.com/watson/developercloud/speech-to-text/api/v1/
Create a custom language model : https://www.ibm.com/watson/developercloud/doc/speech-to-text/custom.shtml
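The REST access and keyword detection described above can be sketched in Python. This is a minimal illustration, not Voximal's own code: the endpoint URL, the `en-US_NarrowbandModel` model name, the `keywords` and `keywords_threshold` query parameters, and the helper name `build_recognize_request` are assumptions to check against the API reference for your instance. The function only builds the request and does not send it.

```python
import base64
import urllib.parse
import urllib.request

# Assumed endpoint; verify it against the API reference for your instance.
BASE_URL = "https://stream.watsonplatform.net/speech-to-text/api/v1/recognize"


def build_recognize_request(audio, username, password,
                            model="en-US_NarrowbandModel",
                            keywords=None, keywords_threshold=0.5):
    """Build (but do not send) a POST request for one-shot recognition."""
    params = {"model": model}
    if keywords:
        # Assumed parameter names for keyword spotting.
        params["keywords"] = ",".join(keywords)
        params["keywords_threshold"] = str(keywords_threshold)
    url = BASE_URL + "?" + urllib.parse.urlencode(params)

    # HTTP Basic authentication with the service credentials.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    headers = {
        "Content-Type": "audio/wav",
        "Authorization": "Basic " + token,
    }
    return urllib.request.Request(url, data=audio, headers=headers, method="POST")


# Hypothetical credentials and a placeholder audio payload.
request = build_recognize_request(b"RIFF", "u", "p", keywords=["hello"])
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would send the audio; the response format is documented in the API reference.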
- Configuration :
voximal.conf
[recognizer]
api=watson
user=(your username)
password=(your password)
model=(model selected)
Note: only NarrowBand models are supported, so the available languages are limited to en/es/jp/br/ch.
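A quick way to sanity-check these settings before restarting is sketched below in Python. It assumes that voximal.conf uses an INI-like syntax that `configparser` can read and that NarrowBand model names contain the word "Narrowband" (for example `en-US_NarrowbandModel`). The sample values and the helper `load_recognizer` are hypothetical; this is not the validation Voximal itself performs.

```python
import configparser

# Hypothetical sample of the [recognizer] section with placeholder values.
SAMPLE = """
[recognizer]
api=watson
user=my-username
password=my-password
model=en-US_NarrowbandModel
"""

REQUIRED_KEYS = ("api", "user", "password", "model")


def load_recognizer(text):
    """Parse the [recognizer] section and check the documented constraints."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["recognizer"]

    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))

    # Assumption: NarrowBand models are recognizable by name.
    if "narrowband" not in section["model"].lower():
        raise ValueError("only NarrowBand models are supported")

    return dict(section)


settings = load_recognizer(SAMPLE)
print(settings["api"], settings["model"])
```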