Amazon AWS / Polly TextToSpeech

About

Polly from Amazon

Definition

Amazon Polly is a service that turns text into lifelike speech. Polly lets you create applications that talk, enabling you to build entirely new categories of speech-enabled products. Polly is an Amazon AI service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice. Polly includes 47 lifelike voices spread across 24 languages, so you can select the ideal voice and build speech-enabled applications that work in many different countries.

Create AWS KeyID and Secret Key

In your AWS account, create dedicated credentials (an access key ID and a secret access key).
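
The AWS command-line client reads these credentials from the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, which are the same two variables used in the verification command further down this page. As a minimal sketch, a script could check that both are set before calling Polly. The function name require_aws_credentials is hypothetical, and the check only tests for non-empty values; it does not contact AWS or validate the keys.

```shell
#!/bin/sh
# require_aws_credentials is a hypothetical helper name (not part of the AWS CLI).
# It only checks that both variables are non-empty; it does not contact AWS.
require_aws_credentials() {
    if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
        echo "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must both be set" >&2
        return 1
    fi
    return 0
}
```

A caller would typically run it first and stop if it fails, for example: require_aws_credentials || exit 1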

AWS Client Installation

The AWS command-line client must be installed and available at /usr/bin/aws:

On Debian Jessie (8), you can try the following:

apt-get install python-pip python-yaml
pip install awscli --upgrade 
ln -s /usr/local/bin/aws /usr/bin/aws
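
To confirm that the client ended up where this page expects it, a small check along these lines may help. This is only a sketch: check_aws_cli is a hypothetical function name, and the default path is simply the /usr/bin/aws location mentioned above.

```shell
#!/bin/sh
# check_aws_cli is a hypothetical helper name (not part of the AWS CLI).
# Usage: check_aws_cli [PATH]   (PATH defaults to /usr/bin/aws)
check_aws_cli() {
    aws_bin="${1:-/usr/bin/aws}"
    if [ -x "$aws_bin" ]; then
        echo "found: $aws_bin"
        return 0
    fi
    echo "missing or not executable: $aws_bin" >&2
    return 1
}

# Print the client version only when the binary is actually there.
if check_aws_cli /usr/bin/aws; then
    /usr/bin/aws --version
fi
```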

Once the aws command is installed, you can check that it works by running the following command with your own credentials:

AWS_ACCESS_KEY_ID=XXXXXXXXXXXXXXXXXXXX \
AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
/usr/bin/aws polly synthesize-speech --output-format mp3 --voice-id Joanna --text 'Hello my name is Joanna!' --region 'eu-west-1' hello.mp3

The result looks like this (the RequestCharacters value depends on the length of the text sent):

{
    "ContentType": "audio/mpeg", 
    "RequestCharacters": "5"
}
# file hello.mp3 
hello.mp3: Audio file with ID3 version 2.4.0, contains: MPEG ADTS, layer III, v2,  48 kbps, 22.05 kHz, Monaural
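
The check above can be wrapped in a reusable function, sketched here under a few assumptions: polly_say and the AWS_BIN variable are hypothetical names introduced for illustration, the credentials are expected to be exported in the environment already, and the region falls back to the eu-west-1 value used above when AWS_DEFAULT_REGION is not set.

```shell
#!/bin/sh
# polly_say is a hypothetical wrapper around the synthesize-speech command above.
# Usage: polly_say TEXT OUTPUT_FILE [VOICE_ID]
# AWS_BIN is a hypothetical override for the client path.
polly_say() {
    text="$1"
    out="$2"
    voice="${3:-Joanna}"
    aws_bin="${AWS_BIN:-/usr/bin/aws}"
    region="${AWS_DEFAULT_REGION:-eu-west-1}"

    if [ -z "$text" ] || [ -z "$out" ]; then
        echo "usage: polly_say TEXT OUTPUT_FILE [VOICE_ID]" >&2
        return 2
    fi
    if [ ! -x "$aws_bin" ]; then
        echo "aws client not found at $aws_bin" >&2
        return 127
    fi

    # Same options as the manual test, with the values passed as arguments.
    "$aws_bin" polly synthesize-speech \
        --output-format mp3 \
        --voice-id "$voice" \
        --text "$text" \
        --region "$region" \
        "$out" || return 1

    # Succeed only if the output file exists and is not empty.
    [ -s "$out" ]
}
```

With credentials exported, a call such as polly_say 'Hello my name is Joanna!' hello.mp3 should produce the same file as the manual test.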

Configuration