Microsoft BingVoiceRecognizer

Converts spoken audio to text. The API can recognize audio captured from the microphone in real time, audio coming from another real-time source, or audio stored in a file. In all cases real-time streaming is available: partial recognition results are returned while the audio is still being sent to the server.
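To illustrate the streaming idea, here is a minimal sketch of reading audio in fixed-size blocks, which is the shape of input a streaming upload would typically consume. It makes no network call and does not use the Microsoft API; the helper name `iter_audio_chunks` and the 4096-byte chunk size are assumptions chosen for illustration only.

```python
import io
from typing import BinaryIO, Iterator


def iter_audio_chunks(stream: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield successive blocks of bytes from an audio stream until it is exhausted.

    Hypothetical helper: the name and default chunk size are illustrative,
    not part of any Microsoft or Voximal API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read means the source has no more data.
            break
        yield chunk


# Stand-in for a file or live source: 10,000 bytes of silence.
fake_audio = io.BytesIO(b"\x00" * 10000)
sizes = [len(chunk) for chunk in iter_audio_chunks(fake_audio)]
print(sizes)
```

Each yielded block could be handed to the sender as soon as it is read, so recognition can start before the whole recording is available.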

API reference : https://www.microsoft.com/cognitive-services/en-us/speech-api/documentation/api-reference-rest/bingvoicerecognition

Integration example : https://gist.github.com/lukehoban/0ee5c1bef438dc5bd7cb

  • Pricing options :
    • 5K calls per month : Free
    • 15 seconds per call : $4 per 1000 calls
  • Configuration :

voximal.conf
[recognizer]
 api=microsoft
 key=(your private key)
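
The configuration above can be sanity-checked before deployment. The following is a minimal sketch, assuming voximal.conf can be read as an INI-style file with Python's standard `configparser`; the function name `load_recognizer_settings` is hypothetical, and the check only confirms that the `api` and `key` options shown above are present in the `[recognizer]` section.

```python
import configparser

# Sample text mirroring the [recognizer] section shown above.
SAMPLE_CONF = """\
[recognizer]
 api=microsoft
 key=(your private key)
"""


def load_recognizer_settings(conf_text: str) -> dict:
    """Return the api and key options of the [recognizer] section.

    Hypothetical helper for illustration; it is not part of Voximal.
    """
    # Strip per-line indentation so indented options are never
    # interpreted as continuation lines of a previous value.
    cleaned = "\n".join(line.strip() for line in conf_text.splitlines())

    # Disable interpolation so a '%' inside a key is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cleaned)

    if not parser.has_section("recognizer"):
        raise ValueError("missing [recognizer] section")

    section = parser["recognizer"]
    for option in ("api", "key"):
        if not section.get(option):
            raise ValueError(f"missing '{option}' in [recognizer]")

    return {"api": section["api"], "key": section["key"]}


settings = load_recognizer_settings(SAMPLE_CONF)
print(settings["api"])
```

Since the key is a private credential, a check like this should avoid printing or logging its value.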