Text To Speech | Natural Voices

Create natural-sounding dialogs, voiceovers, placeholders, and other spoken text. The voices are optimized with a neural engine.

Export Format

The Voices

Advanced Controls

SSML tags let you change the tone and emphasis of any word or sentence.
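As a minimal sketch of how such tags wrap a piece of text, the Python helpers below build SSML fragments as plain strings. The function names `emphasis`, `prosody`, and `pause` are hypothetical and are not part of this tool. Only the tag and attribute names (`emphasis`, `prosody`, `break`, `level`, `pitch`, `rate`, `time`) come from the SSML tags listed on this page.

```python
from xml.sax.saxutils import escape, quoteattr


def emphasis(text, level="moderate"):
    """Wrap text in an <emphasis> tag (hypothetical helper)."""
    return f"<emphasis level={quoteattr(level)}>{escape(text)}</emphasis>"


def prosody(text, pitch=None, rate=None):
    """Wrap text in a <prosody> tag with optional pitch and rate (hypothetical helper)."""
    attrs = ""
    if pitch is not None:
        attrs += f" pitch={quoteattr(pitch)}"
    if rate is not None:
        attrs += f" rate={quoteattr(rate)}"
    return f"<prosody{attrs}>{escape(text)}</prosody>"


def pause(milliseconds):
    """Return a <break> tag for a pause of the given length (hypothetical helper)."""
    return f'<break time="{int(milliseconds)}ms"/>'


# Emphasize one word, pause, then speak a sentence slowly.
fragment = emphasis("Welcome", "strong") + pause(500) + prosody("Please listen carefully.", rate="slow")
print(fragment)
```

Escaping the text with `escape` keeps characters such as `&` and `<` from breaking the markup. That is a general XML precaution rather than something this page requires.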

Group       SSML Tag
Break       <break time='500ms' />
            <break time='1s' />
            <break time='2s' />
            <break time='3s' />
            <break time='4s' />
Emphasis    <emphasis level='moderate'>Moderate</emphasis>
            <emphasis level='reduced'>Reduced</emphasis>
            <emphasis level='strong'>Strong</emphasis>
Pitch       <prosody pitch='x-high'>Higher</prosody>
            <prosody pitch='high'>High</prosody>
            <prosody pitch='medium'>Default - No Pitch</prosody>
            <prosody pitch='low'>Low</prosody>
            <prosody pitch='x-low'>Lower</prosody>
Speed       <prosody rate='x-slow'>Slower</prosody>
            <prosody rate='slow'>Slow</prosody>
            <prosody rate='medium'>Default - Medium</prosody>
            <prosody rate='fast'>Fast</prosody>
            <prosody rate='x-fast'>Faster</prosody>
Digits      <say-as interpret-as="digits">123456789</say-as>
Letters     <say-as interpret-as="letters">How are you?</say-as>
Date*       <say-as interpret-as="date" format="mdy">06/16/21</say-as>
            <say-as interpret-as="date" format="ymd">21/06/16</say-as>
            <say-as interpret-as="date" format="dmy">16/06/21</say-as>
            <say-as interpret-as="date" format="ydm">21/16/06</say-as>
            <say-as interpret-as="date" format="my">06/21</say-as>
            <say-as interpret-as="date" format="md">06/16</say-as>
            <say-as interpret-as="date" format="ym">21/06</say-as>
Currency (ISO 4217)
            <say-as interpret-as="vxml:currency">EUR10.50</say-as>
            <say-as interpret-as="vxml:currency">USD10.50</say-as>
            <say-as interpret-as="vxml:currency">GBP10.50</say-as>
            <say-as interpret-as="vxml:currency">CHF10.50</say-as>
Telephone*  <say-as interpret-as="number" format="telephone">012-345-6789</say-as>
            <say-as interpret-as="number" format="telephone" detail="punctuation">012-345-6789</say-as>
            <say-as interpret-as="number" format="telephone" detail="punctuation">+49 1234 56 789</say-as>

* These SSML expressions work only with English voices.
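The date and currency rows follow a regular pattern, so they can be generated from structured values. The sketch below is illustrative only: `say_as_date` and `say_as_currency` are hypothetical helper names, and the mapping from each `format` code to a `strftime` pattern is an assumption inferred from the sample values in the table (two-digit years, slash separators).

```python
from datetime import date

# Assumed mapping from the SSML date format codes to strftime patterns,
# inferred from the sample values in the table (e.g. mdy -> 06/16/21).
_DATE_PATTERNS = {
    "mdy": "%m/%d/%y",
    "ymd": "%y/%m/%d",
    "dmy": "%d/%m/%y",
    "ydm": "%y/%d/%m",
    "my": "%m/%y",
    "md": "%m/%d",
    "ym": "%y/%m",
}


def say_as_date(d, fmt="mdy"):
    """Render a date as a <say-as interpret-as="date"> tag (hypothetical helper)."""
    if fmt not in _DATE_PATTERNS:
        raise ValueError(f"unsupported date format: {fmt}")
    text = d.strftime(_DATE_PATTERNS[fmt])
    return f'<say-as interpret-as="date" format="{fmt}">{text}</say-as>'


def say_as_currency(code, amount):
    """Render an amount with a three-letter ISO 4217 code (hypothetical helper)."""
    if len(code) != 3 or not code.isalpha():
        raise ValueError(f"expected a three-letter currency code, got: {code}")
    return f'<say-as interpret-as="vxml:currency">{code.upper()}{amount:.2f}</say-as>'


print(say_as_date(date(2021, 6, 16), "mdy"))
print(say_as_currency("EUR", 10.5))
```

Validating the format code and currency code up front catches typos before the text reaches the voice engine, whose handling of malformed values is not documented here.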

Text Demos with SSML Tags

Combine SSML Tags
The upgrade costs <prosody rate='x-fast'><say-as interpret-as="vxml:currency">EUR10.50</say-as></prosody>

Long Text with SSML Tags
A film, also called a <emphasis level='strong'>movie</emphasis>, <emphasis level='strong'>motion picture</emphasis> or <emphasis level='strong'>moving picture</emphasis>, is a work of visual art used to simulate experiences that communicate ideas, stories, perceptions, feelings, beauty, or atmosphere through the use of moving images. These images are generally accompanied by <prosody pitch='x-high'>sound, and more rarely</prosody>, other sensory stimulations.<break time='500ms' />The word cinema, <prosody rate='x-fast'>short for cinematography</prosody>, is often used to refer to filmmaking and the film industry, and to the art form that is the result of it.
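When tags are nested, as in the combined demo, every opening tag needs a matching close in the right order. The following sketch checks that with Python's standard XML parser. The helper name `is_well_formed` is hypothetical, and wrapping the text in a `<speak>` root is an assumption made only so the parser has a single root element; this page does not say whether the tool itself expects one.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the SSML fragment parses as well-formed XML (hypothetical helper)."""
    try:
        # A temporary root element is added because XML requires exactly one root.
        ET.fromstring(f"<speak>{fragment}</speak>")
        return True
    except ET.ParseError:
        return False


combined = (
    "The upgrade costs <prosody rate='x-fast'>"
    '<say-as interpret-as="vxml:currency">EUR10.50</say-as></prosody>'
)
# Closing tags in the wrong order: </prosody> appears before </say-as>.
broken = (
    "The upgrade costs <prosody rate='x-fast'>"
    '<say-as interpret-as="vxml:currency">EUR10.50</prosody></say-as>'
)

print(is_well_formed(combined))  # True
print(is_well_formed(broken))    # False
```

This only checks the XML structure. It does not confirm that a given voice supports a particular tag or attribute value.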

