
Long short-term memory for speaker generalization in supervised speech separation.

Jitong Chen, DeLiang Wang

Abstract

Speech separation can be formulated as learning to estimate a time-frequency mask from acoustic features extracted from noisy speech. For supervised speech separation, generalization to unseen noises and unseen speakers is a critical issue. Although deep neural networks (DNNs) have been successful in noise-independent speech separation, DNNs are limited in modeling a large number of speakers. To improve speaker generalization, a separation model based on long short-term memory (LSTM) is proposed, which naturally accounts for the temporal dynamics of speech. Systematic evaluation shows that the proposed model substantially outperforms a DNN-based model on unseen speakers and unseen noises in terms of objective speech intelligibility. Analyzing LSTM internal representations reveals that LSTM captures long-term speech contexts. The LSTM model is also found to be more advantageous for low-latency speech separation: even without future frames, it performs better than the DNN model that uses future frames. The proposed model represents an effective approach for speaker- and noise-independent speech separation.
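The formulation in the abstract, estimating a time-frequency mask from per-frame acoustic features with a recurrent model, can be illustrated with a minimal sketch. This is not the authors' code; the feature, hidden, and frequency dimensions, the random weights, and the single-layer LSTM cell are all illustrative assumptions. It shows a causal LSTM (past frames only, as in the low-latency setting discussed above) whose sigmoid output layer yields a mask value in [0, 1] per time-frequency unit.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative sizes, not the paper's configuration.
n_feat, n_hidden, n_freq, n_frames = 64, 32, 161, 5

# Stacked gate weights for the input, forget, cell, and output gates,
# applied to the concatenation of the current features and previous state.
W = rng.standard_normal((4 * n_hidden, n_feat + n_hidden)) * 0.1
b = np.zeros(4 * n_hidden)
# Output layer maps the hidden state to a per-frequency mask estimate.
W_out = rng.standard_normal((n_freq, n_hidden)) * 0.1
b_out = np.zeros(n_freq)

def lstm_mask(features):
    """features: (n_frames, n_feat) -> mask: (n_frames, n_freq) in [0, 1]."""
    h = np.zeros(n_hidden)
    c = np.zeros(n_hidden)
    masks = []
    for x in features:                   # causal: each frame sees only the past
        z = W @ np.concatenate([x, h]) + b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # cell-state update
        h = sigmoid(o) * np.tanh(c)                    # hidden state
        masks.append(sigmoid(W_out @ h + b_out))       # mask in [0, 1]
    return np.stack(masks)

mask = lstm_mask(rng.standard_normal((n_frames, n_feat)))
```

In training, such a model would be fit to an ideal-mask target computed from premixed clean speech and noise; the cell state is what lets the network carry the long-term speech context that the abstract's analysis highlights.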

Year:  2017        PMID: 28679261      PMCID: PMC5482750          DOI: 10.1121/1.4986931

Source DB:  PubMed          Journal:  J Acoust Soc Am        ISSN: 0001-4966            Impact factor:   1.840


References: 10 in total

1.  Learning to forget: continual prediction with LSTM.

Authors:  F A Gers; J Schmidhuber; F Cummins
Journal:  Neural Comput       Date:  2000-10       Impact factor: 2.026

2.  Learning long-term dependencies with gradient descent is difficult.

Authors:  Y Bengio; P Simard; P Frasconi
Journal:  IEEE Trans Neural Netw       Date:  1994

3.  An algorithm to improve speech recognition in noise for hearing-impaired listeners.

Authors:  Eric W Healy; Sarah E Yoho; Yuxuan Wang; DeLiang Wang
Journal:  J Acoust Soc Am       Date:  2013-10       Impact factor: 1.840

4.  An algorithm that improves speech intelligibility in noise for normal-hearing listeners.

Authors:  Gibak Kim; Yang Lu; Yi Hu; Philipos C Loizou
Journal:  J Acoust Soc Am       Date:  2009-09       Impact factor: 1.840

5.  Noise Perturbation for Supervised Speech Separation.

Authors:  Jitong Chen; Yuxuan Wang; DeLiang Wang
Journal:  Speech Commun       Date:  2016-04-01       Impact factor: 2.017

6.  Long short-term memory.

Authors:  S Hochreiter; J Schmidhuber
Journal:  Neural Comput       Date:  1997-11-15       Impact factor: 2.026

7.  Large-scale training to increase speech intelligibility for hearing-impaired listeners in novel noises.

Authors:  Jitong Chen; Yuxuan Wang; Sarah E Yoho; DeLiang Wang; Eric W Healy
Journal:  J Acoust Soc Am       Date:  2016-05       Impact factor: 1.840

8.  Long short-term memory for speaker generalization in supervised speech separation.

Authors:  Jitong Chen; DeLiang Wang
Journal:  J Acoust Soc Am       Date:  2017-06       Impact factor: 1.840

9.  An algorithm to increase speech intelligibility for hearing-impaired listeners in novel segments of the same noise type.

Authors:  Eric W Healy; Sarah E Yoho; Jitong Chen; Yuxuan Wang; DeLiang Wang
Journal:  J Acoust Soc Am       Date:  2015-09       Impact factor: 1.840

10.  On Training Targets for Supervised Speech Separation.

Authors:  Yuxuan Wang; Arun Narayanan; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2014-12
Cited by: 17 in total

1.  An algorithm to increase intelligibility for hearing-impaired listeners in the presence of a competing talker.

Authors:  Eric W Healy; Masood Delfarah; Jordan L Vasko; Brittney L Carter; DeLiang Wang
Journal:  J Acoust Soc Am       Date:  2017-06       Impact factor: 1.840

2.  A talker-independent deep learning algorithm to increase intelligibility for hearing-impaired listeners in reverberant competing talker conditions.

Authors:  Eric W Healy; Eric M Johnson; Masood Delfarah; DeLiang Wang
Journal:  J Acoust Soc Am       Date:  2020-06       Impact factor: 1.840

3.  A deep learning based segregation algorithm to increase speech intelligibility for hearing-impaired listeners in reverberant-noisy conditions.

Authors:  Yan Zhao; DeLiang Wang; Eric M Johnson; Eric W Healy
Journal:  J Acoust Soc Am       Date:  2018-09       Impact factor: 1.840

4.  Long short-term memory for speaker generalization in supervised speech separation.

Authors:  Jitong Chen; DeLiang Wang
Journal:  J Acoust Soc Am       Date:  2017-06       Impact factor: 1.840

5.  Learning Complex Spectral Mapping with Gated Convolutional Recurrent Networks for Monaural Speech Enhancement.

Authors:  Ke Tan; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2019-11-22

6.  On Cross-Corpus Generalization of Deep Learning Based Speech Enhancement.

Authors:  Ashutosh Pandey; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2020-08-14

7.  Self-attending RNN for Speech Enhancement to Improve Cross-corpus Generalization.

Authors:  Ashutosh Pandey; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2022-03-22

8.  Neural Cascade Architecture with Triple-domain Loss for Speech Enhancement.

Authors:  Heming Wang; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2021-12-28

9.  Dense CNN with Self-Attention for Time-Domain Speech Enhancement.

Authors:  Ashutosh Pandey; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2021-03-08

10.  Using recurrent neural networks to improve the perception of speech in non-stationary noise by people with cochlear implants.

Authors:  Tobias Goehring; Mahmoud Keshavarzi; Robert P Carlyon; Brian C J Moore
Journal:  J Acoust Soc Am       Date:  2019-07       Impact factor: 1.840
