
Deep learning based speaker separation and dereverberation can generalize across different languages to improve intelligibility.

Eric W Healy, Eric M Johnson, Masood Delfarah, Divya S Krishnagiri, Victoria A Sevich, Hassan Taherian, DeLiang Wang.

Abstract

The practical efficacy of deep learning-based speaker separation and/or dereverberation hinges on its ability to generalize to conditions not employed during neural network training. The current study was designed to assess the ability to generalize across extremely different training versus test environments. Training and testing were performed using different languages having no known common ancestry and correspondingly large linguistic differences: English for training and Mandarin for testing. Additional generalizations included untrained speech corpus/recording channel, target-to-interferer energy ratios, reverberation room impulse responses, and test talkers. A deep computational auditory scene analysis algorithm, employing complex time-frequency masking to estimate both magnitude and phase, was used to segregate two concurrent talkers and simultaneously remove large amounts of room reverberation to increase the intelligibility of a target talker. Significant intelligibility improvements were observed for the normal-hearing listeners in every condition. Benefit averaged 43.5 percentage points across conditions and was comparable to that obtained when training and testing were both performed in English. Benefit is projected to be considerably larger for individuals with hearing impairment. It is concluded that a properly designed and trained deep speaker separation/dereverberation network can be capable of generalization across vastly different acoustic environments that include different languages.
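
To illustrate why a complex time-frequency mask can recover both magnitude and phase, the following minimal sketch compares an oracle complex ratio mask with an oracle magnitude-only mask on a synthetic mixture. It is not the network used in the study: both masks are computed directly from a known clean target, whereas the study trains a deep network to estimate the mask. The function names, frame length, hop size, and test signals are assumptions chosen only for illustration.

```python
import numpy as np

def frames_to_spectrum(signal, frame_len=256, hop=128):
    """Hann-windowed short-time spectrum: rows are frames, columns are frequency bins."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def ideal_complex_ratio_mask(target_spec, mixture_spec, eps=1e-8):
    """Oracle complex mask M with M * mixture ≈ target (requires the clean target)."""
    return target_spec * np.conj(mixture_spec) / (np.abs(mixture_spec) ** 2 + eps)

def ideal_magnitude_ratio_mask(target_spec, mixture_spec, eps=1e-8):
    """Oracle real-valued mask: corrects magnitude only, leaving the mixture phase."""
    return np.abs(target_spec) / (np.abs(mixture_spec) + eps)

# Hypothetical test signals: a 440 Hz tone as the "target", white noise as the "interferer".
rng = np.random.default_rng(0)
t = np.arange(4096) / 16000.0
target = np.sin(2 * np.pi * 440.0 * t)
interferer = 0.8 * rng.standard_normal(t.size)
mixture = target + interferer

S = frames_to_spectrum(target)
Y = frames_to_spectrum(mixture)

complex_est = ideal_complex_ratio_mask(S, Y) * Y      # magnitude and phase corrected
magnitude_est = ideal_magnitude_ratio_mask(S, Y) * Y  # magnitude corrected, mixture phase kept

complex_err = np.mean(np.abs(complex_est - S) ** 2)
magnitude_err = np.mean(np.abs(magnitude_est - S) ** 2)
print(f"complex-mask error:   {complex_err:.3e}")
print(f"magnitude-mask error: {magnitude_err:.3e}")
```

In this oracle setting the complex mask reproduces the target spectrum almost exactly, while the magnitude-only mask retains the phase error of the mixture. A trained network would only approximate either mask, so the gap observed in practice depends on estimation accuracy.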


Year:  2021        PMID: 34717521      PMCID: PMC8637753          DOI: 10.1121/10.0006565

Source DB:  PubMed          Journal:  J Acoust Soc Am        ISSN: 0001-4966            Impact factor:   2.482


