Text To Speech | Natural Voices

Create natural-sounding voices, dialogs, voiceovers, placeholders and other text layouts. The voices have been optimized with the help of a neural engine.




The Voices






Speed

Under "more options" you can change the reading speed. The examples below show how each speed setting sounds.

Speed | Demo
-30%
-15%
+0% (Standard)
+15%
+30%




Pitch

Under "more options" you can change the pitch of the voice. Here are some examples of what the pitch sounds like.

Pitch | Demo
-30%
-20%
-10%
+0% (Standard)
+10%
+20%
+30%
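
The speed and pitch settings above are signed percentages relative to the standard voice. As a rough sketch, the same kind of adjustment can be written as an SSML `prosody` tag. This assumes the engine accepts signed percentages for `rate` and `pitch`, which varies between engines, and it does not claim to show how this tool applies its "more options" settings internally. The function names are hypothetical.

```python
from xml.sax.saxutils import escape, quoteattr


def signed_percent(value: int) -> str:
    """Format an integer as a signed percentage, e.g. 15 -> '+15%'."""
    return f"{value:+d}%"


def with_speed_and_pitch(text: str, speed: int = 0, pitch: int = 0) -> str:
    """Wrap plain text in a <prosody> tag carrying both settings.

    Assumption: the target engine accepts signed percentages
    for the rate and pitch attributes.
    """
    rate_attr = quoteattr(signed_percent(speed))
    pitch_attr = quoteattr(signed_percent(pitch))
    # escape() protects characters such as & and < in the spoken text.
    return f"<prosody rate={rate_attr} pitch={pitch_attr}>{escape(text)}</prosody>"


print(with_speed_and_pitch("Hello world", speed=15, pitch=-10))
```

A value of 0 for both settings corresponds to the "+0% (Standard)" rows.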




Advanced Controls


With SSML tags, you can change the tone and emphasis of any word or sentence.

Group | SSML Tag | Demo
Break | <break time='500ms' />
Break | <break time='1s' />
Break | <break time='2s' />
Break | <break time='3s' />
Break | <break time='4s' />
Emphasis | <emphasis level='moderate'>Moderate</emphasis>
Emphasis | <emphasis level='reduced'>Reduced</emphasis>
Emphasis | <emphasis level='strong'>Strong</emphasis>
Pitch | <prosody pitch='x-high'>Higher</prosody>
Pitch | <prosody pitch='high'>High</prosody>
Pitch | <prosody pitch='medium'>Default - No Pitch</prosody>
Pitch | <prosody pitch='low'>Low</prosody>
Pitch | <prosody pitch='x-low'>Lower</prosody>
Speed | <prosody rate='x-slow'>Slower</prosody>
Speed | <prosody rate='slow'>Slow</prosody>
Speed | <prosody rate='medium'>Default - Medium</prosody>
Speed | <prosody rate='fast'>Fast</prosody>
Speed | <prosody rate='x-fast'>Faster</prosody>
Digits | <say-as interpret-as="digits">123456789</say-as>
Letters | <say-as interpret-as="letters">How are you?</say-as>
Date* | <say-as interpret-as="date" format="mdy">11/21/24</say-as>
Date* | <say-as interpret-as="date" format="ymd">24/11/21</say-as>
Date* | <say-as interpret-as="date" format="dmy">21/11/24</say-as>
Date* | <say-as interpret-as="date" format="ydm">24/21/11</say-as>
Date* | <say-as interpret-as="date" format="my">11/24</say-as>
Date* | <say-as interpret-as="date" format="md">11/21</say-as>
Date* | <say-as interpret-as="date" format="ym">24/11</say-as>
Currency* (ISO 4217) | <say-as interpret-as="vxml:currency">EUR10.50</say-as>
Currency* (ISO 4217) | <say-as interpret-as="vxml:currency">USD10.50</say-as>
Currency* (ISO 4217) | <say-as interpret-as="vxml:currency">GBP10.50</say-as>
Currency* (ISO 4217) | <say-as interpret-as="vxml:currency">CHF10.50</say-as>
Telephone* | <say-as interpret-as="number" format="telephone">012-345-6789</say-as>
Telephone* | <say-as interpret-as="number" format="telephone" detail="punctuation">012-345-6789</say-as>
Telephone* | <say-as interpret-as="number" format="telephone" detail="punctuation">+49 1234 56 789</say-as>

* Groups marked with an asterisk work only with English voices.
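
When many tags have to be written by hand, a few small helper functions can reduce quoting and nesting mistakes. The following is a minimal sketch of such helpers for the tags listed above. The function names are hypothetical, and wrapping the result in a `<speak>` root element is an assumption that may not apply to this tool's text field.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


def _tag(name: str, attrs: dict, inner: Optional[str] = None) -> str:
    """Render one element; inner=None produces a self-closing tag."""
    rendered = "".join(f" {key}={quoteattr(str(value))}" for key, value in attrs.items())
    if inner is None:
        return f"<{name}{rendered} />"
    return f"<{name}{rendered}>{inner}</{name}>"


def text(raw: str) -> str:
    """Escape plain text so characters like & and < stay valid XML."""
    return escape(raw)


def pause(time: str) -> str:
    return _tag("break", {"time": time})


def emphasis(inner: str, level: str = "moderate") -> str:
    return _tag("emphasis", {"level": level}, inner)


def prosody(inner: str, pitch: Optional[str] = None, rate: Optional[str] = None) -> str:
    # Only emit the attributes that were actually requested.
    attrs = {k: v for k, v in (("pitch", pitch), ("rate", rate)) if v is not None}
    return _tag("prosody", attrs, inner)


def say_as(raw: str, interpret_as: str, **extra: str) -> str:
    # extra carries optional attributes such as format="mdy" or detail="punctuation".
    return _tag("say-as", {"interpret-as": interpret_as, **extra}, escape(raw))


def speak(*parts: str) -> str:
    """Join fragments under a <speak> root (an assumption) and check well-formedness."""
    document = _tag("speak", {}, "".join(parts))
    ET.fromstring(document)  # raises ET.ParseError if tags are unbalanced
    return document


print(speak(text("Call "), say_as("012-345-6789", "number", format="telephone"),
            pause("500ms"), emphasis(text("today"), level="strong")))
```

`emphasis` and `prosody` take already marked-up content so that tags can be nested, while `say_as` and `text` escape their plain-text input.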





Text Demos with SSML Tags


Demo | Text | Audio
Combine SSML Tags | The upgrade costs <prosody rate='x-fast'><say-as interpret-as="vxml:currency">EUR10.50</say-as></prosody>
Long Text with SSML Tags | A film, also called a <emphasis level='strong'>movie</emphasis>, <emphasis level='strong'>motion picture</emphasis> or <emphasis level='strong'>moving picture</emphasis>, is a work of visual art used to simulate experiences that communicate ideas, stories, perceptions, feelings, beauty, or atmosphere through the use of moving images. These images are generally accompanied by <prosody pitch='x-high'>sound, and more rarely</prosody>, other sensory stimulations.<break time='500ms' />The word cinema, <prosody rate='x-fast'>short for cinematography</prosody>, is often used to refer to filmmaking and the film industry, and to the art form that is the result of it.
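
In longer texts with several nested tags, an unclosed or misnested tag is easy to miss. As a rough sketch, a text can be checked before submitting it by parsing it as XML with Python's standard library. The function name is hypothetical, and the temporary `<speak>` wrapper is only there to give the parser a single root element.

```python
import xml.etree.ElementTree as ET


def inspect_ssml(fragment: str):
    """Return (plain_text, tags_used) for an SSML fragment.

    Raises ET.ParseError if the tags are not balanced or not properly nested.
    """
    root = ET.fromstring(f"<speak>{fragment}</speak>")
    plain_text = "".join(root.itertext())  # the words without any markup
    tags_used = [el.tag for el in root.iter() if el is not root]
    return plain_text, tags_used


demo = ("The upgrade costs <prosody rate='x-fast'>"
        "<say-as interpret-as=\"vxml:currency\">EUR10.50</say-as></prosody>")
print(inspect_ssml(demo))

try:
    inspect_ssml("A film, also called a <emphasis level='strong'>movie")
except ET.ParseError as error:
    print("Not well-formed:", error)
```

The check only confirms that the markup is well-formed XML. It does not confirm that a given voice supports every tag or attribute value.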


