
Evaluation of Glottal Inverse Filtering Algorithms Using a Physiologically Based Articulatory Speech Synthesizer.

Yu-Ren Chien, Daryush D Mehta, Jón Guðnason, Matías Zañartu, Thomas F Quatieri.

Abstract

Glottal inverse filtering aims to estimate the glottal airflow signal from a speech signal for applications such as speaker recognition and clinical voice assessment. Nonetheless, evaluation of inverse filtering algorithms has been challenging due to the practical difficulties of directly measuring glottal airflow. In addition, it is acknowledged that the performance of many methods degrades in voice conditions that are of great interest, such as breathiness, high pitch, soft voice, and running speech. This paper presents a comprehensive, objective, and comparative evaluation of state-of-the-art inverse filtering algorithms that takes advantage of speech and glottal airflow signals generated by a physiological speech synthesizer. The synthesizer provides a physics-based simulation of the voice production process and thus an adequate test bed for revealing the temporal and spectral performance characteristics of each algorithm. Included in the synthetic data are continuous speech utterances and sustained vowels, which are produced with multiple voice qualities (pressed, slightly pressed, modal, slightly breathy, and breathy), fundamental frequencies, and subglottal pressures to simulate the natural variations in real speech. In evaluating the accuracy of a glottal flow estimate, multiple error measures are used, including an error in the estimated signal that measures overall waveform deviation, as well as an error in each of several clinically relevant features extracted from the glottal flow estimate. Waveform errors calculated from glottal flow estimation experiments exhibited mean values of around 30% for sustained vowels, and around 40% for continuous speech, of the amplitude of the true glottal flow derivative. Closed-phase approaches showed remarkable stability across different voice qualities and subglottal pressures. The algorithms of choice, as suggested by significance tests, are closed-phase covariance analysis for the analysis of sustained vowels, and sparse linear prediction for the analysis of continuous speech. Results of data subset analysis suggest that analysis of close rounded vowels is an additional challenge in glottal flow estimation.
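
To illustrate what a waveform error expressed as a percentage of the amplitude of the true glottal flow derivative could look like, here is a minimal Python sketch. The abstract does not give the exact formula, so this definition (mean absolute deviation divided by the peak-to-peak amplitude of the true signal) is an assumption, and the function name `waveform_error_percent` is hypothetical rather than taken from the paper.

```python
import numpy as np


def waveform_error_percent(true_dflow, est_dflow):
    """Mean absolute deviation between an estimated and a true glottal flow
    derivative, as a percentage of the true signal's peak-to-peak amplitude.

    Assumed definition for illustration only; the paper's exact error
    measure is not stated in the abstract.
    """
    true_dflow = np.asarray(true_dflow, dtype=float)
    est_dflow = np.asarray(est_dflow, dtype=float)
    if true_dflow.shape != est_dflow.shape:
        raise ValueError("signals must have the same shape")

    # Peak-to-peak amplitude of the reference (true) signal.
    amplitude = true_dflow.max() - true_dflow.min()
    if amplitude == 0:
        raise ValueError("true signal has zero amplitude")

    # Average sample-wise deviation, normalized by the reference amplitude.
    return 100.0 * np.mean(np.abs(est_dflow - true_dflow)) / amplitude


# Toy example: the estimate has the right shape but half the amplitude.
true_signal = [0.0, 1.0, 0.0, -1.0]
estimate = [0.0, 0.5, 0.0, -0.5]
print(waveform_error_percent(true_signal, estimate))  # prints 12.5
```

In a synthesizer-based evaluation such as the one described, `true_dflow` would come from the simulated glottal flow and `est_dflow` from an inverse filtering algorithm, after the two signals have been time-aligned.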

Keywords:  Performance evaluation; glottal excitation; glottal flow estimation; inverse filtering; speech analysis; speech synthesis; voice production

Year: 2017    PMID: 34268444    PMCID: PMC8279087    DOI: 10.1109/taslp.2017.2714839

Source DB: PubMed    Journal: IEEE/ACM Trans Audio Speech Lang Process

