Synthesis fidelity and time-varying spectral change in vowels.

Peter F Assmann, William F Katz.

Abstract

Recent studies have shown that synthesized versions of American English vowels are less accurately identified when the natural time-varying spectral changes are eliminated by holding the formant frequencies constant over the duration of the vowel. A limitation of these experiments has been that vowels produced by formant synthesis are generally less accurately identified than the natural vowels after which they are modeled. To overcome this limitation, a high-quality speech analysis-synthesis system (STRAIGHT) was used to synthesize versions of 12 American English vowels spoken by adults and children. Vowels synthesized with STRAIGHT were identified as accurately as the natural versions, in contrast with previous results from our laboratory showing identification rates 9%-12% lower for the same vowels synthesized using the cascade formant model. Consistent with earlier studies, identification accuracy was not reduced when the fundamental frequency was held constant across the vowel. However, elimination of time-varying changes in the spectral envelope using STRAIGHT led to a greater reduction in accuracy (23%) than was previously found with cascade formant synthesis (11%). A statistical pattern recognition model, applied to acoustic measurements of the natural and synthesized vowels, predicted both the higher identification accuracy for vowels synthesized using STRAIGHT compared to formant synthesis, and the greater effects of holding the formant frequencies constant over time with STRAIGHT synthesis. Taken together, the experiment and modeling results suggest that formant estimation errors and incorrect rendering of spectral and temporal cues by cascade formant synthesis contribute to lower identification accuracy and underestimation of the role of time-varying spectral change in vowels.

Year:  2005        PMID: 15759708     DOI: 10.1121/1.1852549

Source DB:  PubMed          Journal:  J Acoust Soc Am        ISSN: 0001-4966            Impact factor:   1.840

