Predicting the intelligibility of vocoded speech.

Fei Chen, Philipos C Loizou.

Abstract

OBJECTIVES: The purpose of this study was to evaluate how well several speech intelligibility indices predict the intelligibility of vocoded speech.
DESIGN: Noise-corrupted sentences were vocoded in a total of 80 conditions, involving three signal-to-noise ratio levels (-5, 0, and 5 dB) and two types of maskers (steady-state noise and a two-talker masker). Tone-vocoder simulations and simulations of combined electric-acoustic stimulation (EAS) were used. The vocoded sentences were presented to normal-hearing listeners for identification, and the resulting intelligibility scores were correlated with the values of various speech intelligibility measures. These included measures designed to assess speech intelligibility, such as the speech transmission index (STI) and articulation-index-based measures, as well as measures designed to assess distortions in hearing aids (e.g., coherence-based measures). The measures relied primarily on either temporal-envelope or spectral-envelope information in the prediction model. The underlying hypothesis in the present study is that measures that assess temporal-envelope distortions, such as those based on the STI, should correlate highly with the intelligibility of vocoded speech. This is based on the fact that vocoder simulations preserve primarily envelope information, similar to the processing implemented in current cochlear implant speech processors. Similarly, it is hypothesized that measures that assess distortions in the spectral envelope, such as the coherence-based index, could also be used to model the intelligibility of vocoded speech.
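As an illustration of how sentences can be corrupted at the stated signal-to-noise ratios, the following is a minimal sketch, not the authors' procedure. It scales a masker so that the speech-to-masker power ratio equals a target value in dB. The function names `power` and `mix_at_snr`, the sine-wave stand-in for speech, and the Gaussian stand-in for the masker are all assumptions made for the example.

```python
import math
import random

def power(x):
    """Mean-square power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def mix_at_snr(speech, masker, snr_db):
    """Scale the masker to the target SNR (dB) and add it to the speech.

    Assumes both sequences have the same length and nonzero power.
    Returns the mixture and the scaled masker.
    """
    # Solve 10*log10(P_speech / (gain^2 * P_masker)) = snr_db for gain.
    gain = math.sqrt(power(speech) / (power(masker) * 10 ** (snr_db / 10)))
    scaled = [gain * m for m in masker]
    mixture = [s + m for s, m in zip(speech, scaled)]
    return mixture, scaled

random.seed(0)
# Hypothetical stand-ins: a 440 Hz tone for speech, white noise for the masker.
speech = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
masker = [random.gauss(0, 1) for _ in range(1600)]

for snr_db in (-5, 0, 5):  # the three SNR levels named in the abstract
    mixture, scaled = mix_at_snr(speech, masker, snr_db)
    achieved = 10 * math.log10(power(speech) / power(scaled))
    print(snr_db, round(achieved, 6))
```

The mixture would then be passed to the vocoder; the vocoding step itself is omitted here because the abstract does not give its parameters beyond the channel count.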
RESULTS: Of all the intelligibility measures considered, the coherence-based and the STI-based measures performed the best. High correlations (r = 0.9 to 0.96) were maintained with the coherence-based measures in all noisy conditions. The highest correlation obtained with the STI-based measure was 0.92, and that was obtained when high modulation rates (100 Hz) were used. The performance of these measures remained high in both steady-noise and fluctuating-masker conditions. Correlations for conditions involving tone-vocoded speech were slightly higher than those for conditions involving EAS-vocoded speech.
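The correlations reported above relate per-condition listener scores to per-condition values of an objective measure. A minimal sketch of that computation, assuming Pearson's correlation coefficient, is shown below. The function name `pearson_r` and the data are hypothetical and are not taken from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant sequence; 0 is a convention here
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values, one entry per test condition (not data from the study).
measure_values = [0.21, 0.35, 0.48, 0.60, 0.74]   # objective index per condition
percent_correct = [18.0, 33.0, 52.0, 61.0, 80.0]  # mean listener score per condition

print(round(pearson_r(measure_values, percent_correct), 3))
```

An r close to 1 means the objective measure ranks and spaces the conditions much as the listeners' scores do, which is the sense in which a measure "predicts" intelligibility here.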
CONCLUSIONS: The present study demonstrated that some of the speech intelligibility indices that have been found previously to correlate highly with the intelligibility of wideband speech can also be used to predict the intelligibility of vocoded speech. Both the coherence-based and STI-based measures were found to be good measures for modeling the intelligibility of vocoded speech. The highest correlation (r = 0.96) was obtained with a derived coherence measure that placed more emphasis on information contained in vowel/consonant spectral transitions and less emphasis on information contained in steady sonorant segments. High (100 Hz) modulation rates were found to be necessary in the implementation of the STI-based measures for better modeling of the intelligibility of vocoded speech. We believe that the difference in modulation rates needed for modeling the intelligibility of wideband versus vocoded speech can be attributed to the increased importance of higher modulation rates in situations where the amount of spectral information available to the listeners is limited (eight channels in our study). Unlike the traditional STI method, which has been found to perform poorly in predicting the intelligibility of processed speech when nonlinear operations are involved, the STI-based measure used in the present study was found to perform quite well. In summary, the present study took the first step in modeling the intelligibility of vocoded speech. Access to such intelligibility measures is of high significance, as they can be used to guide the development of new speech coding algorithms for cochlear implants.
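The abstract does not give the formula for its STI-based measure. As a rough illustration of how an envelope-based index of this family can be computed, the sketch below uses the normalized-covariance form known from the STI literature. That choice is an assumption, not something the abstract confirms. In each band, the correlation between the clean and processed envelopes is converted to an apparent SNR, clipped to the range -15 to 15 dB, mapped to a transmission index between 0 and 1, and averaged with band weights. Band filtering and envelope extraction (including the low-pass cutoff that sets the modulation rate, 100 Hz in the study) are omitted, and the names `ncm_like_index`, `clean_envs`, `proc_envs`, and `weights` are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either sequence is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def ncm_like_index(clean_envs, proc_envs, weights):
    """Weighted mean of per-band transmission indices (assumed normalized-covariance form).

    clean_envs, proc_envs: lists of per-band envelope sequences (already extracted).
    weights: one nonnegative weight per band (hypothetical band-importance values).
    """
    total = 0.0
    for clean, proc, w in zip(clean_envs, proc_envs, weights):
        r2 = pearson_r(clean, proc) ** 2
        r2 = min(r2, 1.0 - 1e-12)  # keep the ratio below finite when r is (nearly) 1
        if r2 <= 0.0:
            snr_db = -15.0  # no envelope correlation: floor of the clipping range
        else:
            snr_db = 10.0 * math.log10(r2 / (1.0 - r2))
        snr_db = max(-15.0, min(15.0, snr_db))  # clip apparent SNR to [-15, 15] dB
        total += w * (snr_db + 15.0) / 30.0     # map to a transmission index in [0, 1]
    return total / sum(weights)

# Hypothetical two-band example: band 1 envelope preserved, band 2 partly distorted.
clean = [[0.1, 0.8, 0.3, 0.9, 0.2], [0.5, 0.4, 0.9, 0.1, 0.6]]
proc = [[0.1, 0.8, 0.3, 0.9, 0.2], [0.4, 0.5, 0.7, 0.3, 0.2]]
print(ncm_like_index(clean, proc, weights=[1.0, 1.0]))
```

In this form the index depends only on how well the processed envelopes track the clean ones, which is why the envelope cutoff frequency matters: a higher cutoff lets faster envelope fluctuations contribute to the correlation.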

Year: 2011; PMID: 21206363; PMCID: PMC3085937; DOI: 10.1097/AUD.0b013e3181ff3515

Source DB: PubMed; Journal: Ear Hear; ISSN: 0196-0202; Impact factor: 3.570


