A probabilistic framework for landmark detection based on phonetic features for automatic speech recognition.

Amit Juneja, Carol Espy-Wilson.

Abstract

A probabilistic framework for a landmark-based approach to speech recognition is presented for obtaining multiple landmark sequences in continuous speech. The landmark detection module takes as input acoustic parameters (APs) that capture the acoustic correlates of some of the manner-based phonetic features. The landmarks include stop bursts, vowel onsets, syllabic peaks and dips, fricative onsets and offsets, and sonorant consonant onsets and offsets. Binary classifiers of the manner phonetic features (syllabic, sonorant, and continuant) are used for probabilistic detection of these landmarks. The probabilistic framework exploits two properties of the acoustic cues of phonetic features: (1) sufficiency of the acoustic cues of a phonetic feature for a probabilistic decision on that feature, and (2) invariance of the acoustic cues of a phonetic feature with respect to other phonetic features. Probabilistic landmark sequences are constrained using manner class pronunciation models for isolated word recognition with a known vocabulary. The performance of the system is compared with (1) the same probabilistic system but with mel-frequency cepstral coefficients (MFCCs), (2) a hidden Markov model (HMM) based system using APs, and (3) an HMM-based system using MFCCs.
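
As a rough illustration of how binary manner-feature classifiers could yield probabilistic manner classes, the sketch below combines three posteriors by the chain rule. It is not the authors' implementation: the hierarchy (sonorant first, then syllabic or continuant), the class names, the omission of silence, and the numeric values are all assumptions made for this example. The only idea taken from the abstract is that each feature's probability is treated as depending on its own acoustic cues alone.

```python
from dataclasses import dataclass


@dataclass
class FeaturePosteriors:
    """Hypothetical per-frame outputs of three binary classifiers."""
    sonorant: float    # P(+sonorant | APs for sonorant)
    syllabic: float    # P(+syllabic | +sonorant, APs for syllabic)
    continuant: float  # P(+continuant | -sonorant, APs for continuant)


def manner_class_probs(p: FeaturePosteriors) -> dict:
    """Combine binary feature posteriors into manner-class probabilities.

    Assumed hierarchy (illustrative only, silence omitted):
      +sonorant, +syllabic   -> vowel
      +sonorant, -syllabic   -> sonorant consonant
      -sonorant, +continuant -> fricative
      -sonorant, -continuant -> stop
    """
    return {
        "vowel": p.sonorant * p.syllabic,
        "sonorant_consonant": p.sonorant * (1.0 - p.syllabic),
        "fricative": (1.0 - p.sonorant) * p.continuant,
        "stop": (1.0 - p.sonorant) * (1.0 - p.continuant),
    }


# Made-up posteriors for one frame; not data from the paper.
frame = FeaturePosteriors(sonorant=0.9, syllabic=0.8, continuant=0.3)
probs = manner_class_probs(frame)
best = max(probs, key=probs.get)
```

Because each branch multiplies complementary probabilities, the four class probabilities sum to one for any valid input. A landmark would then correspond to a change in the most probable manner class between neighbouring frames, for example a transition into "vowel" marking a vowel onset.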

Year:  2008        PMID: 18247915     DOI: 10.1121/1.2823754

Source DB:  PubMed          Journal:  J Acoust Soc Am        ISSN: 0001-4966            Impact factor:   1.840


  8 in total

1.  Factors affecting masking release in cochlear-implant vocoded speech.

Authors:  Ning Li; Philipos C Loizou
Journal:  J Acoust Soc Am       Date:  2009-07       Impact factor: 1.840

2.  Virtual Landmarks.

Authors:  Yubing Tong; Jayaram K Udupa; Dewey Odhner; Peirui Bai; Drew A Torigian
Journal:  Proc SPIE Int Soc Opt Eng       Date:  2017-03-03

3.  A procedure for estimating gestural scores from speech acoustics.

Authors:  Hosung Nam; Vikramjit Mitra; Mark Tiede; Mark Hasegawa-Johnson; Carol Espy-Wilson; Elliot Saltzman; Louis Goldstein
Journal:  J Acoust Soc Am       Date:  2012-12       Impact factor: 1.840

4.  Relative cue encoding in the context of sophisticated models of categorization: Separating information from categorization. (Review)

Authors:  Keith S Apfelbaum; Bob McMurray
Journal:  Psychon Bull Rev       Date:  2015-08

5.  Contribution of consonant landmarks to speech recognition in simulated acoustic-electric hearing.

Authors:  Fei Chen; Philipos C Loizou
Journal:  Ear Hear       Date:  2010-04       Impact factor: 3.570

6.  The contribution of obstruent consonants and acoustic landmarks to speech recognition in noise.

Authors:  Ning Li; Philipos C Loizou
Journal:  J Acoust Soc Am       Date:  2008-12       Impact factor: 1.840

7.  Spectral and Temporal Envelope Cues for Human and Automatic Speech Recognition in Noise.

Authors:  Guangxin Hu; Sarah C Determan; Yue Dong; Alec T Beeve; Joshua E Collins; Yan Gai
Journal:  J Assoc Res Otolaryngol       Date:  2019-11-22

8.  A magnetic resonance imaging-based articulatory and acoustic study of "retroflex" and "bunched" American English /r/.

Authors:  Xinhui Zhou; Carol Y Espy-Wilson; Suzanne Boyce; Mark Tiede; Christy Holland; Ann Choe
Journal:  J Acoust Soc Am       Date:  2008-06       Impact factor: 2.482
