Deep learning based speaker separation and dereverberation can generalize across different languages to improve intelligibility.

Eric W Healy¹, Eric M Johnson¹, Masood Delfarah², Divya S Krishnagiri¹, Victoria A Sevich¹, Hassan Taherian², DeLiang Wang².

Abstract

The practical efficacy of deep learning based speaker separation and/or dereverberation hinges on its ability to generalize to conditions not employed during neural network training. The current study was designed to assess the ability to generalize across extremely different training versus test environments. Training and testing were performed using different languages having no known common ancestry and correspondingly large linguistic differences: English for training and Mandarin for testing. Additional generalizations included an untrained speech corpus/recording channel, target-to-interferer energy ratios, reverberation room impulse responses, and test talkers. A deep computational auditory scene analysis algorithm, employing complex time-frequency masking to estimate both magnitude and phase, was used to segregate two concurrent talkers and simultaneously remove large amounts of room reverberation to increase the intelligibility of a target talker. Significant intelligibility improvements were observed for the normal-hearing listeners in every condition. Benefit averaged 43.5 percentage points across conditions and was comparable to that obtained when training and testing were both performed in English. Benefit is projected to be considerably larger for individuals with hearing impairment. It is concluded that a properly designed and trained deep speaker separation/dereverberation network can be capable of generalization across vastly different acoustic environments that include different languages.
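To illustrate what "complex time-frequency masking to estimate both magnitude and phase" can mean, here is a minimal sketch and not the study's algorithm. It computes an oracle (ideal) complex ratio mask from a known target, whereas the network in the study estimates its mask from the mixture alone. The function names and the toy time-frequency values are hypothetical, and reverberation is not modeled.

```python
import cmath

def ideal_complex_ratio_mask(target_tf, mixture_tf, eps=1e-12):
    """Oracle complex mask M such that M * Y is approximately S in each unit."""
    # s * conj(y) / (|y|^2 + eps) equals s / y when eps is negligible
    return [s * y.conjugate() / (abs(y) ** 2 + eps)
            for s, y in zip(target_tf, mixture_tf)]

def ideal_magnitude_ratio_mask(target_tf, mixture_tf, eps=1e-12):
    """Oracle real-valued mask |S| / |Y|; it leaves the mixture phase untouched."""
    return [abs(s) / (abs(y) + eps) for s, y in zip(target_tf, mixture_tf)]

def apply_mask(mask, mixture_tf):
    """Multiply each time-frequency unit of the mixture by its mask value."""
    return [m * y for m, y in zip(mask, mixture_tf)]

def total_error(estimate_tf, target_tf):
    """Sum of absolute complex errors across all units."""
    return sum(abs(e - s) for e, s in zip(estimate_tf, target_tf))

# Toy time-frequency units (magnitude, phase in radians) for two talkers.
target = [cmath.rect(1.0, 0.3), cmath.rect(0.5, -1.2), cmath.rect(2.0, 2.0)]
interferer = [cmath.rect(0.8, 1.5), cmath.rect(0.7, 0.4), cmath.rect(0.2, -0.9)]
mixture = [s + n for s, n in zip(target, interferer)]

complex_est = apply_mask(ideal_complex_ratio_mask(target, mixture), mixture)
magnitude_est = apply_mask(ideal_magnitude_ratio_mask(target, mixture), mixture)

complex_err = total_error(complex_est, target)
magnitude_err = total_error(magnitude_est, target)
print(f"complex-mask error:   {complex_err:.3e}")
print(f"magnitude-mask error: {magnitude_err:.3e}")
```

In this toy case the complex mask recovers the target units almost exactly, while the magnitude-only mask restores the target magnitudes but keeps the mixture phase, so a residual error remains.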

Year: 2021 PMID: 34717521 PMCID: PMC8637753 DOI: 10.1121/10.0006565

Source DB: PubMed Journal: J Acoust Soc Am ISSN: 0001-4966 Impact factor: 2.482
