Predicting speech intelligibility based on a correlation metric in the envelope power spectrum domain.

Helia Relaño-Iborra, Tobias May, Johannes Zaar, Christoph Scheidiger, Torsten Dau.

Abstract

A speech intelligibility prediction model is proposed that combines the auditory processing front end of the multi-resolution speech-based envelope power spectrum model [mr-sEPSM; Jørgensen, Ewert, and Dau (2013). J. Acoust. Soc. Am. 134(1), 436-446] with a correlation back end inspired by the short-time objective intelligibility measure [STOI; Taal, Hendriks, Heusdens, and Jensen (2011). IEEE Trans. Audio Speech Lang. Process. 19(7), 2125-2136]. This "hybrid" model, named sEPSMcorr, is shown to account for the effects of stationary and fluctuating additive interferers as well as for the effects of non-linear distortions, such as spectral subtraction, phase jitter, and ideal time-frequency segregation (ITFS). The model shows a broader predictive range than both the original mr-sEPSM (which fails in the phase-jitter and ITFS conditions) and STOI (which fails to predict the influence of fluctuating interferers), albeit with lower accuracy than the source models in some individual conditions. Similar to other models that employ a short-term correlation-based back end, including STOI, the proposed model fails to account for the effects of room reverberation on speech intelligibility. Overall, the model might be valuable for evaluating the effects of a large range of interferers and distortions on speech intelligibility, including consequences of hearing impairment and hearing-instrument signal processing.
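The idea of a short-time correlation back end can be illustrated with a small toy sketch. This is not the sEPSMcorr implementation: the abstract gives no algorithmic details, and the sketch omits the auditory filterbank, the modulation filterbank, and the multi-resolution segmentation of the mr-sEPSM front end. The function names `crude_envelope` and `short_time_correlation`, the window lengths, and the synthetic test signal are all assumptions made for illustration only.

```python
import numpy as np


def crude_envelope(signal, smooth_len=64):
    """Rectify and smooth with a moving average.

    Hypothetical stand-in for a real auditory front end.
    """
    kernel = np.ones(smooth_len) / smooth_len
    return np.convolve(np.abs(signal), kernel, mode="same")


def short_time_correlation(clean_env, degraded_env, win_len=256):
    """Mean Pearson correlation of two envelopes over non-overlapping windows."""
    n_win = min(len(clean_env), len(degraded_env)) // win_len
    scores = []
    for i in range(n_win):
        seg = slice(i * win_len, (i + 1) * win_len)
        x = clean_env[seg] - np.mean(clean_env[seg])
        y = degraded_env[seg] - np.mean(degraded_env[seg])
        denom = np.linalg.norm(x) * np.linalg.norm(y)
        # Skip windows where either envelope is constant (correlation undefined).
        if denom > 0:
            scores.append(float(np.dot(x, y) / denom))
    return float(np.mean(scores)) if scores else 0.0


rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0  # one second at an assumed 8 kHz sampling rate
# Toy "speech": a noise carrier with a 4 Hz amplitude modulation.
clean = (1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)) * rng.standard_normal(t.size)
noise = rng.standard_normal(t.size)

clean_env = crude_envelope(clean)
score_mild = short_time_correlation(clean_env, crude_envelope(clean + 0.1 * noise))
score_heavy = short_time_correlation(clean_env, crude_envelope(clean + 3.0 * noise))
print(f"mild noise: {score_mild:.3f}, heavy noise: {score_heavy:.3f}")
```

In this toy setting the score drops as the additive noise grows, which is the qualitative behavior a correlation-based metric relies on. Mapping such a score to predicted intelligibility would require a fitted transformation that the abstract does not describe.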

Year: 2016 PMID: 27794330 DOI: 10.1121/1.4964505

Source DB: PubMed Journal: J Acoust Soc Am ISSN: 0001-4966 Impact factor: 1.840

Cited

8 in total

1. A harmonic-cancellation-based model to predict speech intelligibility against a harmonic masker.

Authors: Luna Prud'homme; Mathieu Lavandier; Virginia Best
Journal: J Acoust Soc Am Date: 2020-11 Impact factor: 1.840

2. Understanding degraded speech leads to perceptual gating of a brainstem reflex in human listeners.

Authors: Heivet Hernández-Pérez; Jason Mikiel-Hunter; David McAlpine; Sumitrajit Dhar; Sriram Boothalingam; Jessica J M Monaghan; Catherine M McMahon
Journal: PLoS Biol Date: 2021-10-20 Impact factor: 8.029

3. Temporal fine structure influences voicing confusions for consonant identification in multi-talker babble.

Authors: Vibha Viswanathan; Barbara G Shinn-Cunningham; Michael G Heinz
Journal: J Acoust Soc Am Date: 2021-10 Impact factor: 2.482

4. Modulation masking and fine structure shape neural envelope coding to predict speech intelligibility across diverse listening conditions.

Authors: Vibha Viswanathan; Hari M Bharadwaj; Barbara G Shinn-Cunningham; Michael G Heinz
Journal: J Acoust Soc Am Date: 2021-09 Impact factor: 2.482

8. Transient Noise Reduction Using a Deep Recurrent Neural Network: Effects on Subjective Speech Intelligibility and Listening Comfort.

Authors: Mahmoud Keshavarzi; Tobias Reichenbach; Brian C J Moore
Journal: Trends Hear Date: 2021 Jan-Dec Impact factor: 3.293