Deep Learning for Talker-dependent Reverberant Speaker Separation: An Empirical Study.

Masood Delfarah; DeLiang Wang

Abstract

Speaker separation refers to the problem of separating speech signals in a mixture of simultaneous speakers. Previous studies have largely addressed speaker separation in anechoic conditions. This paper addresses talker-dependent speaker separation in reverberant conditions, which are characteristic of real-world environments. We employ recurrent neural networks with bidirectional long short-term memory (BLSTM) to separate and dereverberate the target speech signal, and propose a two-stage network to deal effectively with both speaker separation and speech dereverberation. In the two-stage model, the first stage separates and dereverberates the two-talker mixture, and the second stage further enhances the separated target signal. We have extensively evaluated the two-stage architecture, and our empirical results demonstrate large improvements over unprocessed mixtures and a clear performance gain over single-stage networks across a wide range of target-to-interferer ratios and reverberation times, in simulated as well as recorded rooms. Moreover, we show that time-frequency masking yields better performance than spectral mapping for reverberant speaker separation.
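The abstract's finding that time-frequency masking outperforms spectral mapping concerns the network's training target: instead of predicting the clean spectrum directly (mapping), the network predicts a per-T-F-unit gain applied to the mixture. One standard masking target is the ideal ratio mask (IRM), discussed in the cited "On Training Targets for Supervised Speech Separation" paper. The sketch below is illustrative only and assumes magnitude spectra are already available; it is not code from the paper itself.

```python
import math

def ideal_ratio_mask(target_mag, interference_mag):
    """Ideal ratio mask (IRM) for one time-frequency unit.

    IRM = sqrt(S^2 / (S^2 + N^2)), where S and N are the target and
    interference magnitudes in that unit. A separation network is
    trained to estimate this mask, which is then applied to the
    mixture magnitude to attenuate interferer-dominated units.
    """
    s2 = target_mag ** 2
    n2 = interference_mag ** 2
    denom = s2 + n2
    return math.sqrt(s2 / denom) if denom > 0.0 else 0.0

# Unit where the interfering talker is stronger than the target:
mask = ideal_ratio_mask(3.0, 4.0)
# mask == sqrt(9 / 25) == 0.6, so the mixture is scaled down here.
```

The mask lies in [0, 1]: it approaches 1 where the target dominates and 0 where the interferer dominates, which is what makes it an easier regression target than the unbounded clean spectrum used in spectral mapping.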

Keywords:  Cochannel speech separation; deep neural networks; speech dereverberation; two-stage network

Year:  2019        PMID: 33748321      PMCID: PMC7970708          DOI: 10.1109/taslp.2019.2934319

Source DB:  PubMed          Journal:  IEEE/ACM Trans Audio Speech Lang Process


References (12 in total)

1.  Supervised Speech Separation Based on Deep Learning: An Overview.

Authors:  DeLiang Wang; Jitong Chen
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2018-05-30

2.  Reverberation challenges the temporal representation of the pitch of complex sounds.

Authors:  Mark Sayles; Ian M Winter
Journal:  Neuron       Date:  2008-06-12       Impact factor: 17.173

3.  An algorithm to improve speech recognition in noise for hearing-impaired listeners.

Authors:  Eric W Healy; Sarah E Yoho; Yuxuan Wang; DeLiang Wang
Journal:  J Acoust Soc Am       Date:  2013-10       Impact factor: 1.840

4.  An algorithm to increase intelligibility for hearing-impaired listeners in the presence of a competing talker.

Authors:  Eric W Healy; Masood Delfarah; Jordan L Vasko; Brittney L Carter; DeLiang Wang
Journal:  J Acoust Soc Am       Date:  2017-06       Impact factor: 1.840

5.  Long short-term memory.

Authors:  S Hochreiter; J Schmidhuber
Journal:  Neural Comput       Date:  1997-11-15       Impact factor: 2.026

6.  Two-stage Deep Learning for Noisy-reverberant Speech Enhancement.

Authors:  Yan Zhao; Zhong-Qiu Wang; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2018-09-17

7.  Hearing loss, aging, and speech perception in reverberation and noise.

Authors:  K S Helfer; L A Wilber
Journal:  J Speech Hear Res       Date:  1990-03

8.  On Training Targets for Supervised Speech Separation.

Authors:  Yuxuan Wang; Arun Narayanan; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2014-12

9.  A Deep Ensemble Learning Method for Monaural Speech Separation.

Authors:  Xiao-Lei Zhang; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2016-03-01

10.  Speech enhancement based on neural networks improves speech intelligibility in noise for cochlear implant users.

Authors:  Tobias Goehring; Federico Bolner; Jessica J M Monaghan; Bas van Dijk; Andrzej Zarowski; Stefan Bleeck
Journal:  Hear Res       Date:  2016-11-30       Impact factor: 3.208

