
Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition.

Myungjong Kim, Younggwan Kim, Joohong Yoo, Jun Wang, Hoirin Kim.   

Abstract

This paper addresses the problem of recognizing speech uttered by patients with dysarthria, a motor speech disorder that impedes the physical production of speech. Patients with dysarthria have articulatory limitations and therefore often have trouble pronouncing certain sounds, resulting in undesirable phonetic variation. Modern automatic speech recognition systems designed for regular speakers are ineffective for speakers with dysarthria because of this phonetic variation. To capture the phonetic variation, a Kullback-Leibler divergence-based hidden Markov model (KL-HMM) is adopted, in which the emission probability of each state is parameterized by a categorical distribution using phoneme posterior probabilities obtained from a deep neural network-based acoustic model. To further reflect speaker-specific phonetic variation patterns, a speaker adaptation method is proposed that combines L2 regularization with confusion-reducing regularization, which can enhance discriminability between the categorical distributions of the KL-HMM states while preserving speaker-specific information. The proposed method was evaluated on a database of several hundred words for 30 speakers (12 mildly dysarthric, 8 moderately dysarthric, and 10 non-dysarthric control speakers). The proposed approach significantly outperformed the conventional deep neural network-based speaker-adapted system on dysarthric as well as non-dysarthric speech.
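As a rough illustration of the KL-HMM idea described in the abstract, the sketch below scores a frame by the Kullback-Leibler divergence between each state's categorical distribution and a phoneme posterior vector. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the direction of the divergence (state distribution against posterior), the three-class toy distributions, and the smoothing constant are all hypothetical, and the regularized speaker adaptation itself is not shown because the abstract does not give its formula.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two categorical distributions given as lists.

    eps is an assumed smoothing constant that avoids log(0).
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def best_state(state_dists, posterior):
    """Return (index, costs): the state whose categorical distribution
    has the lowest KL divergence to the posterior, and all the costs."""
    costs = [kl_divergence(y, posterior) for y in state_dists]
    idx = min(range(len(costs)), key=costs.__getitem__)
    return idx, costs


# Hypothetical toy data: 2 KL-HMM states over 3 phoneme classes.
state_dists = [
    [0.8, 0.1, 0.1],
    [0.1, 0.2, 0.7],
]
# Hypothetical phoneme posterior for one frame (e.g. a DNN output).
posterior = [0.7, 0.2, 0.1]

idx, costs = best_state(state_dists, posterior)
print(idx, costs)
```

In a full recognizer this per-frame cost would replace the usual emission log-likelihood during decoding; here it only picks the closest state for a single frame.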

Year:  2017        PMID: 28320669      PMCID: PMC5591083          DOI: 10.1109/TNSRE.2017.2681691

Source DB:  PubMed          Journal:  IEEE Trans Neural Syst Rehabil Eng        ISSN: 1534-4320            Impact factor:   3.802


  9 in total

1.  A fast learning algorithm for deep belief nets.

Authors:  Geoffrey E Hinton; Simon Osindero; Yee-Whye Teh
Journal:  Neural Comput       Date:  2006-07       Impact factor: 2.026

2.  Experiments with fast Fourier transform, linear predictive and cepstral coefficients in dysarthric speech recognition algorithms using hidden Markov Model.

Authors:  Prasad D Polur; Gerald E Miller
Journal:  IEEE Trans Neural Syst Rehabil Eng       Date:  2005-12       Impact factor: 3.802

3.  Effects of listeners' working memory and noise on speech intelligibility in dysarthria.

Authors:  Youngmee Lee; Jee Eu Sung; Hyunsub Sim
Journal:  Clin Linguist Phon       Date:  2014-04-08       Impact factor: 1.346

4.  Phonological disorders III: a procedure for assessing severity of involvement.

Authors:  L D Shriberg; J Kwiatkowski
Journal:  J Speech Hear Disord       Date:  1982-08

5.  Frequency of consonant articulation errors in dysarthric speech.

Authors:  Heejin Kim; Katie Martin; Mark Hasegawa-Johnson; Adrienne Perlman
Journal:  Clin Linguist Phon       Date:  2010-10       Impact factor: 1.346

6.  Investigation of an HMM/ANN hybrid structure in pattern recognition application using cepstral analysis of dysarthric (distorted) speech signals.

Authors:  Prasad D Polur; Gerald E Miller
Journal:  Med Eng Phys       Date:  2005-12-15       Impact factor: 2.242

7.  Representation Learning Based Speech Assistive System for Persons With Dysarthria.

Authors:  S Chandrakala; Natarajan Rajeswari
Journal:  IEEE Trans Neural Syst Rehabil Eng       Date:  2016-12-13       Impact factor: 3.802

8.  Difficulties in automatic speech recognition of dysarthric speakers and implications for speech-based applications used by the elderly: a literature review. (Review)

Authors:  Victoria Young; Alex Mihailidis
Journal:  Assist Technol       Date:  2010

9.  Experiments in dysarthric speech recognition using artificial neural networks.

Authors:  G Jayaram; K Abdelhamied
Journal:  J Rehabil Res Dev       Date:  1995-05
  1 in total

1.  Automatic prediction of intelligible speaking rate for individuals with ALS from speech acoustic and articulatory samples.

Authors:  Jun Wang; Prasanna V Kothalkar; Myungjong Kim; Andrea Bandini; Beiming Cao; Yana Yunusova; Thomas F Campbell; Daragh Heitzman; Jordan R Green
Journal:  Int J Speech Lang Pathol       Date:  2018-11-08       Impact factor: 2.484
