
Speech Intelligibility Prediction using Spectro-Temporal Modulation Analysis.

Amin Edraki, Wai-Yip Chan, Jesper Jensen, Daniel Fogerty.

Abstract

Spectro-temporal modulations are believed to mediate the analysis of speech sounds in the human primary auditory cortex. Inspired by humans' robustness in comprehending speech in challenging acoustic environments, we propose an intrusive speech intelligibility prediction (SIP) algorithm, wSTMI, for normal-hearing listeners based on spectro-temporal modulation analysis (STMA) of the clean and degraded speech signals. In the STMA, each of 55 modulation frequency channels contributes an intermediate intelligibility measure. A sparse linear model with parameters optimized using Lasso regression results in combining the intermediate measures of 8 of the most salient channels for SIP. In comparison with a suite of 10 SIP algorithms, wSTMI performs consistently well across 13 datasets, which together cover degradation conditions including modulated noise, noise reduction processing, reverberation, near-end listening enhancement, and speech interruption. We show that the optimized parameters of wSTMI may be interpreted in terms of modulation transfer functions of the human auditory system. Thus, the proposed approach offers evidence affirming previous studies of the perceptual characteristics underlying speech signal intelligibility.
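The channel-selection step described in the abstract (55 intermediate measures reduced to a sparse weighted sum by Lasso regression) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data are synthetic Gaussian values rather than real per-channel intelligibility measures, the names `soft_threshold` and `lasso_coordinate_descent` are hypothetical, the penalty `alpha` is arbitrary, and the solver is a generic cyclic coordinate descent rather than the authors' optimization procedure.

```python
import random


def soft_threshold(x, t):
    """Shrink x toward zero by t (proximal operator of the L1 penalty)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=50):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes j's own term.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Synthetic stand-in data (assumption): 55 "channels", 8 of which truly matter.
rng = random.Random(0)
N_CHANNELS, N_SAMPLES, N_ACTIVE = 55, 200, 8

true_support = sorted(rng.sample(range(N_CHANNELS), N_ACTIVE))
true_w = [0.0] * N_CHANNELS
for j in true_support:
    true_w[j] = rng.uniform(0.5, 1.5)

# Each row holds one condition's per-channel measures; y is the target score.
X = [[rng.gauss(0.0, 1.0) for _ in range(N_CHANNELS)] for _ in range(N_SAMPLES)]
y = [sum(x * wj for x, wj in zip(row, true_w)) + rng.gauss(0.0, 0.1) for row in X]

w = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, wj in enumerate(w) if wj != 0.0]

print("channels with non-zero weight:", selected)
print("channels used to generate y:  ", true_support)
```

The L1 penalty sets the weights of uninformative channels exactly to zero, which is how a sparse model ends up combining only a small subset of the available channels. How many channels survive depends on `alpha`. The abstract reports 8 salient channels for wSTMI, but this sketch makes no claim about which channels those are or how the penalty was tuned.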


Keywords:  spectro-temporal modulation; speech intelligibility; speech quality model

Year: 2020   PMID: 33748329   PMCID: PMC7978234   DOI: 10.1109/taslp.2020.3039929

Source DB: PubMed   Journal: IEEE/ACM Trans Audio Speech Lang Process


  2 in total

1.  Relating Suprathreshold Auditory Processing Abilities to Speech Understanding in Competition.

Authors:  Frederick J Gallun; Laura Coco; Tess K Koerner; E Sebastian Lelo de Larrea-Mancera; Michelle R Molis; David A Eddins; Aaron R Seitz
Journal:  Brain Sci       Date:  2022-05-27

2.  Glimpsing keywords across sentences in noise: A microstructural analysis of acoustic, lexical, and listener factors.

Authors:  Daniel Fogerty; Jayne B Ahlstrom; Judy R Dubno
Journal:  J Acoust Soc Am       Date:  2021-09       Impact factor: 2.482

