
Recognizing Whispered Speech Produced by an Individual with Surgically Reconstructed Larynx Using Articulatory Movement Data.

Beiming Cao, Myungjong Kim, Ted Mau, Jun Wang.

Abstract

Individuals with an impaired larynx (vocal folds) have problems controlling their glottal vibration and produce whispered speech with extreme hoarseness. Standard automatic speech recognition using only acoustic cues is typically ineffective for whispered speech because the corresponding spectral characteristics are distorted. Articulatory cues such as tongue and lip motion may help in recognizing whispered speech, since articulatory motion patterns are generally not affected. In this paper, we investigated whispered speech recognition for patients with a reconstructed larynx using articulatory movement data. A data set with both acoustic and articulatory motion data was collected from a patient with a surgically reconstructed larynx using an electromagnetic articulograph. Two speech recognition systems, Gaussian mixture model-hidden Markov model (GMM-HMM) and deep neural network-HMM (DNN-HMM), were used in the experiments. Experimental results showed that adding either tongue or lip motion data to acoustic features such as mel-frequency cepstral coefficients (MFCCs) significantly reduced the phone error rates of both speech recognition systems. Adding both tongue and lip data achieved the best performance.
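
As a rough illustration of two ideas in the abstract, the sketch below appends articulatory values to acoustic feature frames and scores a recognized phone sequence with a phone error rate. It is not the authors' implementation. The function names, the toy feature values, and the assumption that acoustic and articulatory streams are already aligned frame by frame are all hypothetical, and the error rate is computed in the common way as edit distance divided by reference length.

```python
def concat_features(acoustic_frames, articulatory_frames):
    """Append articulatory values to each acoustic frame.

    Assumes (hypothetically) that both streams have already been
    resampled to the same frame rate, so frame i in one stream
    corresponds to frame i in the other.
    """
    if len(acoustic_frames) != len(articulatory_frames):
        raise ValueError("streams must have the same number of frames")
    return [list(a) + list(b)
            for a, b in zip(acoustic_frames, articulatory_frames)]


def phone_error_rate(reference, hypothesis):
    """Edit distance between phone sequences divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j phones of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_phone in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_phone in enumerate(hypothesis, start=1):
            cost = 0 if ref_phone == hyp_phone else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)


# Toy usage: two MFCC-like values per frame plus one tongue-sensor value.
combined = concat_features([[1.0, 2.0], [3.0, 4.0]], [[0.5], [0.7]])
per = phone_error_rate(["k", "ae", "t"], ["k", "ae", "p"])
```

In this toy case each combined frame has three values, and one substituted phone out of three gives an error rate of one third.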


Keywords:  deep neural network; hidden Markov model; larynx reconstruction; speech articulation; whispered speech recognition

Year:  2016        PMID: 29423453      PMCID: PMC5800526          DOI: 10.21437/SLPAT.2016-14

Source DB:  PubMed          Journal:  Workshop Speech Lang Process Assist Technol        ISSN: 2411-9962


