Detection of speech landmarks: use of temporal information.

Ariel Salomon, Carol Y Espy-Wilson, Om Deshmukh.

Abstract

Studies by Shannon et al. [Science, 270, 303-304 (1995)], Van Tasell et al. [J. Acoust. Soc. Am. 82, 1152-1161 (1987)], and others show that human listeners can understand important aspects of the speech signal when spectral shape has been significantly degraded. These experiments suggest that temporal information is particularly important in human speech perception when the speech signal is heavily degraded. In this study, a system is developed that extracts linguistically relevant temporal information that can be used in the front end of an automatic speech recognition system. The parameters targeted include energy onsets and offsets (computed using an adaptive algorithm) and measures of periodic and aperiodic content; together these are used to find abrupt acoustic events that signify landmarks. Overall detection rates for strongly robust events, robust events, and weak events in a portion of the TIMIT test database are 98.9%, 94.7%, and 52.1%, respectively. Error rates increase by less than 5% when the speech signals are spectrally impoverished. Using the four temporal parameters as the front end of a hidden Markov model (HMM)-based system for the automatic recognition of the manner classes "sonorant," "fricative," "stop," and "silence" yields a recognition accuracy of 70.1%, the same as that achieved with the standard 39 cepstral-based parameters. Combining the temporal parameters with the cepstral parameters results in an accuracy of 74.8%.
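As a rough illustration of what an energy onset/offset parameter might look like, the sketch below marks frames where short-time log energy jumps or drops sharply. It is not the adaptive algorithm the abstract refers to, whose details are not given here; a fixed frame-to-frame difference is swapped in, and the names `frame_log_energy` and `detect_energy_events`, the frame length, and the 9 dB threshold are all assumptions chosen for the example.

```python
import math


def frame_log_energy(signal, frame_len):
    """Log energy (dB) of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on silent frames.
        energies.append(10.0 * math.log10(power + 1e-10))
    return energies


def detect_energy_events(signal, frame_len=80, threshold_db=9.0):
    """Return (onsets, offsets) as frame indices.

    Assumption: a simple fixed-window first difference of log energy,
    not the adaptive procedure described in the paper.
    """
    energy = frame_log_energy(signal, frame_len)
    onsets, offsets = [], []
    for i in range(1, len(energy)):
        delta = energy[i] - energy[i - 1]
        if delta >= threshold_db:
            onsets.append(i)      # abrupt rise in energy
        elif delta <= -threshold_db:
            offsets.append(i)     # abrupt fall in energy
    return onsets, offsets


# Toy signal at an assumed 8 kHz rate: silence, a 100 Hz tone burst, silence.
fs = 8000
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(400)]
signal = [0.0] * 400 + tone + [0.0] * 400

onsets, offsets = detect_energy_events(signal)
print(onsets, offsets)  # [5] [10]
```

On the toy signal the burst starts at frame 5 and ends at frame 10, so one onset and one offset are reported. A real front end would also need the periodicity and aperiodicity measures mentioned above, which this sketch does not attempt.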

Year: 2004    PMID: 15058352    DOI: 10.1121/1.1646400

Source DB: PubMed    Journal: J Acoust Soc Am    ISSN: 0001-4966    Impact factor: 1.840


  2 in total

1.  Contribution of consonant landmarks to speech recognition in simulated acoustic-electric hearing.

Authors:  Fei Chen; Philipos C Loizou
Journal:  Ear Hear       Date:  2010-04       Impact factor: 3.570

2.  A speech envelope landmark for syllable encoding in human superior temporal gyrus.

Authors:  Yulia Oganian; Edward F Chang
Journal:  Sci Adv       Date:  2019-11-20       Impact factor: 14.136
