An ideal quantized mask to increase intelligibility and quality of speech in noise.

Eric W Healy, Jordan L Vasko.

Abstract

Time-frequency (T-F) masks represent powerful tools to increase the intelligibility of speech in background noise. Translational relevance is provided by their accurate estimation based only on the signal-plus-noise mixture, using deep learning or other machine-learning techniques. In the current study, a technique is designed to capture the benefits of existing techniques. In the ideal quantized mask (IQM), speech and noise are partitioned into T-F units, and each unit receives one of N attenuations according to its signal-to-noise ratio. It was found that as few as four to eight attenuation steps (IQM4, IQM8) improved intelligibility over the ideal binary mask (IBM, having two attenuation steps), and equaled the intelligibility resulting from the ideal ratio mask (IRM, having a theoretically infinite number of steps). Sound-quality ratings and rankings of noisy speech processed by the IQM4 and IQM8 were also superior to that processed by the IBM and equaled or exceeded that processed by the IRM. It is concluded that the intelligibility and sound-quality advantages of infinite attenuation resolution can be captured by an IQM having only a very small number of steps. Further, the classification-based nature of the IQM might provide algorithmic advantages over the regression-based IRM during machine estimation.
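The three mask types named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the N attenuation values or their SNR boundaries are chosen, so the uniform quantization of the ratio-mask gain used here is an assumption, as are the function names, the square-root form of the ratio mask, and the 0 dB local criterion for the binary mask.

```python
import math


def irm_gain(speech_energy, noise_energy):
    """Ratio-mask gain in [0, 1], assumed here to be sqrt(S / (S + N))."""
    total = speech_energy + noise_energy
    return math.sqrt(speech_energy / total) if total > 0 else 0.0


def ibm_gain(speech_energy, noise_energy, lc_db=0.0):
    """Binary mask: 1.0 if the unit's local SNR exceeds lc_db, else 0.0.

    lc_db (local criterion) is an assumed parameter, not from the abstract.
    """
    if speech_energy <= 0:
        return 0.0
    if noise_energy <= 0:
        return 1.0
    snr_db = 10.0 * math.log10(speech_energy / noise_energy)
    return 1.0 if snr_db > lc_db else 0.0


def quantized_gain(speech_energy, noise_energy, n_steps):
    """Quantized mask: one of n_steps evenly spaced gains in [0, 1].

    Assumption: the ratio-mask gain is split into n_steps equal-width bins
    and each bin maps to a fixed gain. The paper's actual step placement
    may differ.
    """
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    gain = irm_gain(speech_energy, noise_energy)
    index = min(int(gain * n_steps), n_steps - 1)  # bin index 0..n_steps-1
    return index / (n_steps - 1)


def quantized_mask(speech_units, noise_units, n_steps):
    """Apply quantized_gain to every T-F unit of two equally shaped 2-D lists."""
    return [
        [quantized_gain(s, n, n_steps) for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_units, noise_units)
    ]
```

Under these assumptions, `quantized_mask(speech, noise, 4)` would correspond to an IQM4-style mask and `n_steps=8` to an IQM8-style mask. Because each unit receives a discrete label rather than a continuous value, estimating such a mask is a classification problem, which is the algorithmic point the abstract's final sentence raises.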

Year:  2018        PMID: 30424638      PMCID: PMC6136922          DOI: 10.1121/1.5053115

Source DB:  PubMed          Journal:  J Acoust Soc Am        ISSN: 0001-4966            Impact factor:   1.840


