Warning: Undefined array key "mm" in /www/wwwroot/www.ai-bt.com/si.php on line 10 Deprecated: trim(): Passing null to parameter #1 ($string) of type string is deprecated in /www/wwwroot/www.ai-bt.com/si.php on line 10 Segmental intelligibility of four currently used text-to-speech synthesis methods.

Literature DB >> 12703720

Segmental intelligibility of four currently used text-to-speech synthesis methods.

Abstract

The study investigated the segmental intelligibility of four currently available text-to-speech (TTS) products under 0-dB and 5-dB signal-to-noise ratios. The products were IBM ViaVoice version 5.1, which uses formant coding, Festival version 1.4.2, a diphone-based LPC TTS product, AT&T Next-Gen, a half-phone-based TTS product that uses harmonic-plus-noise method for synthesis, and FlexVoice2, a hybrid TTS product that combines concatenative and formant coding techniques. Overall, concatenative techniques were more intelligible than formant or hybrid techniques, with formant coding slightly better at modeling vowels and concatenative techniques marginally better at synthesizing consonants. No TTS product was better at resisting noise interference than others, although all were more intelligible at 5 dB than at 0-dB SNR. The better TTS products in this study were, on the average, 22% less intelligible and had about 3 times more phoneme errors than human voice under comparable listening conditions. The hybrid TTS technology of FlexVoice had the lowest intelligibility and highest error rates. There were discernible patterns of errors for stops, fricatives, and nasals. Unrestricted TTS output--e-mail messages, news reports, and so on--under high noise conditions prevalent in automobiles, airports, etc. will likely challenge the listeners.

Entities: Chemical Species

Mesh：

Year: 2003 PMID： 12703720 DOI： 10.1121/1.1558356

Source DB: PubMed Journal: J Acoust Soc Am ISSN： 0001-4966 Impact factor: 1.840

Keyword Cloud
Cited

1 in total

1. Intelligibility of naturally produced and synthesized Mandarin speech by cochlear implant listeners.

Authors: Ying Shi; Jingyuan Chen; Yue Gong; Biao Chen; Yongxin Li; John J Galvin; Qian-Jie Fu
Journal: J Acoust Soc Am Date: 2018-05 Impact factor: 1.840

1 in total