RUTH TTS or TTS+ (a Puretalk.ai product)

Text-to-speech (TTS), or speech synthesis, models are becoming increasingly difficult to distinguish from human speech.

In “A Survey on Neural Speech Synthesis,” Xu Tan, Tao Qin, Frank Soong, and Tie-Yan Liu describe the complexity of “key components such as text analysis, acoustic models and vocoders, and several advanced topics, including fast TTS, low-resource TTS, robust TTS, expressive TTS, and adaptive TTS, etc.”

Methods, libraries & software used to compare voice quality: Documentation

Comparison Below

Below, you’ll find voice data from a real human speaker as well as from various TTS models, including Puretalk.ai TTS+. This collection lets us compare our in-house model (TTS+) with our competitors’ models.

Here are some samples

“Sphinx of black quartz judge my vow, the July sun caused a fragment of black pine wax to ooze on the velvet quilt. While the vixen jumped quickly on her foe, barking with zeal.”

Voice Name Clip
iOS 16 Siri
Google C - US-Standard
Amazon Polly Joanna
Human Speaker
RUTH TTS+
RUTH TTS+ Male
WellSaid Labs Alana
WellSaid Labs Ramona
Microsoft Sara neural
Microsoft Aria
IBM Kevin
IBM Female
Google Wavenet
Microsoft Nancy Standard

Technical features

The average F0 (fundamental frequency) and intensity values below were measured with Praat on the clips above, in which each voice reads the first two sentences of the article (roughly 10 seconds per clip).
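To make the two measurements concrete, here is a minimal Python sketch of what "average F0" and "average intensity" mean. It is not Praat's algorithm and was not used to produce the numbers on this page: the simple autocorrelation search, the function names, the 75-500 Hz search range, the 0.5 voicing threshold, and the 2e-5 reference value are all assumptions chosen for illustration.

```python
import math


def intensity_db(samples, ref=2e-5):
    """RMS level of the samples in dB relative to `ref` (assumed reference)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms / ref)


def estimate_f0(frame, sample_rate, fmin=75.0, fmax=500.0, voicing=0.5):
    """Autocorrelation F0 estimate for one frame; None if judged unvoiced."""
    energy = sum(s * s for s in frame)
    if energy == 0.0:
        return None  # silent frame
    best_lag, best_r = None, voicing
    # Search lags that correspond to periods between 1/fmax and 1/fmin.
    for lag in range(int(sample_rate / fmax), int(sample_rate / fmin) + 1):
        if lag >= len(frame):
            break
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r
    return sample_rate / best_lag if best_lag is not None else None


def mean_f0(samples, sample_rate, frame_len=0.05):
    """Average the per-frame F0 estimates over voiced frames only."""
    n = round(frame_len * sample_rate)
    estimates = [estimate_f0(samples[i:i + n], sample_rate)
                 for i in range(0, len(samples) - n + 1, n)]
    voiced = [f for f in estimates if f is not None]
    return sum(voiced) / len(voiced) if voiced else None


if __name__ == "__main__":
    # Synthetic half-second 200 Hz tone standing in for a real clip.
    sr = 8000
    tone = [0.1 * math.sin(2 * math.pi * 200 * t / sr) for t in range(sr // 2)]
    print(mean_f0(tone, sr), intensity_db(tone))
```

Frames with no clear periodicity are skipped, so pauses and unvoiced consonants do not pull the average F0 toward zero. Absolute intensity values depend on recording level and the chosen reference, so they are only comparable between clips recorded and normalized the same way.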

| Voice Name | Average F0 (Hz) | Average Intensity (dB) | Synthesis model | Source |
|---|---|---|---|---|
| iOS 16 Siri | 116.8 | 67.1 | TBD | |
| Google C - US-Standard | 133.2 | 74.7 | WaveNet | https://cloud.google.com/text-to-speech/docs/wavenet |
| Human 1 | 126.9 | 68.1 | N/A | N/A |
| Human speaker | 185.7 | 72.9 | N/A | N/A |
| Ruth TTS+ | 184.6 | 67.4 | N/A | N/A |
| iOS | 166.3 | 77.5 | TBD | |
| Judy GL1 | 188.7 | 76.5 | Tacotron + Griffin Lim | https://github.com/mozilla/TTS/wiki/Mean-Opinion-Score-Results |
| Judy GL2 | 197.3 | 72.7 | Tacotron2 + Griffin Lim | https://github.com/mozilla/TTS/wiki/Mean-Opinion-Score-Results |
| Judy W1 | 187.3 | 76.9 | Tacotron + WaveRNN | https://github.com/mozilla/TTS/wiki/Mean-Opinion-Score-Results |
| Judy W2 | 195.5 | 78.0 | Tacotron2 + WaveRNN | https://github.com/mozilla/TTS/wiki/Mean-Opinion-Score-Results |
| LJ Speech | 215.4 | 73.4 | Tacotron + Griffin Lim | https://github.com/mozilla/TTS/wiki/Mean-Opinion-Score-Results |
| Mac Default | 113.6 | 65.6 | TBD | |
| Nancy 1 | 197.7 | 75.2 | Tacotron + Griffin Lim | https://github.com/mozilla/TTS/wiki/Mean-Opinion-Score-Results |
| Nancy 2 | 189.0 | 75.9 | Tacotron2 + WaveRNN | https://github.com/mozilla/TTS/wiki/Mean-Opinion-Score-Results |
| Polly Joanna | 155.3 | 72.6 | TBD | |
| Polly Matthew | 99.6 | 72.8 | TBD | |
| Polly Sally | 192.2 | 73.1 | TBD | |
| Voicery Nichole | 194.0 | 68.2 | TBD | |
| Windows Zira | 176.9 | 66.1 | TBD | |
| Windows David | 91.9 | 66.7 | TBD | |
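One way to read the table is to express each voice relative to a human reference. The sketch below uses a handful of rows copied from the table above and converts the F0 gap to semitones (12 times the base-2 logarithm of the frequency ratio), which is closer to how pitch differences are perceived than a raw difference in Hz. Picking "Human speaker" as the reference, the selection of rows, and the helper names are assumptions made for illustration only.

```python
import math

# (average F0 in Hz, average intensity in dB), copied from the table above.
VOICES = {
    "Human speaker": (185.7, 72.9),
    "Ruth TTS+": (184.6, 67.4),
    "Polly Joanna": (155.3, 72.6),
    "Windows Zira": (176.9, 66.1),
}


def semitones(f0_hz, ref_hz):
    """Pitch distance from ref_hz to f0_hz in semitones (12 per octave)."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def compare(name, reference="Human speaker"):
    """Return (F0 gap in semitones, intensity gap in dB) against the reference."""
    f0, level = VOICES[name]
    ref_f0, ref_level = VOICES[reference]
    return semitones(f0, ref_f0), level - ref_level


if __name__ == "__main__":
    for name in VOICES:
        st, db = compare(name)
        print(f"{name}: {st:+.2f} semitones, {db:+.1f} dB")
```

A small gap in average F0 or intensity only says that a voice sits in a similar pitch and loudness range as the reference; it says nothing about naturalness, which these two averages do not capture.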

Did we get something wrong? If you were involved in the development of any of these voices or notice an error, please let us know by filing an issue or submitting a pull request so we can correct it. We’d appreciate it!

Cite our work

BibTeX coming soon!