
Deep Learning Based Binaural Speech Separation in Reverberant Environments.

Xueliang Zhang, DeLiang Wang.

Abstract

Speech signals are usually degraded by room reverberation and additive noise in real environments. This paper focuses on separating the target speech signal from binaural inputs in reverberant conditions. Binaural separation is formulated as a supervised learning problem, and we employ deep learning to map from both spatial and spectral features to a training target. With binaural inputs, we first apply a fixed beamformer and then extract several spectral features. A new spatial feature is proposed and extracted to complement the spectral features. The training target is the recently suggested ideal ratio mask. Systematic evaluations and comparisons show that the proposed system achieves very good separation performance and substantially outperforms related algorithms in challenging multi-source and reverberant environments.
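To make the training target concrete, the following is a minimal sketch of how an ideal ratio mask (IRM) is commonly computed from premixed target and interference energies in each time-frequency unit. It is not the authors' implementation: the function names, the exponent `beta = 0.5`, and the small constant `eps` are assumptions chosen for illustration, and in a reverberant setting what counts as target energy (for example, direct sound only) is a design choice that the abstract does not specify.

```python
def ideal_ratio_mask(speech_energy, noise_energy, beta=0.5, eps=1e-12):
    """Return the IRM for two equally shaped time-frequency energy grids.

    speech_energy[t][f] and noise_energy[t][f] hold the non-negative energies
    of the target and of the interference in frame t and frequency channel f.
    beta and eps are illustrative assumptions, not values from the paper.
    """
    mask = []
    for s_frame, n_frame in zip(speech_energy, noise_energy):
        mask.append([
            # Ratio of target energy to total energy, compressed by beta.
            (s / (s + n + eps)) ** beta
            for s, n in zip(s_frame, n_frame)
        ])
    return mask


def apply_mask(mixture_magnitude, mask):
    """Scale each time-frequency unit of the mixture by its mask value."""
    return [
        [m * g for m, g in zip(m_frame, g_frame)]
        for m_frame, g_frame in zip(mixture_magnitude, mask)
    ]


# Toy example: 2 frames x 2 frequency channels.
speech = [[1.0, 0.0], [3.0, 4.0]]
noise = [[1.0, 2.0], [1.0, 0.0]]
irm = ideal_ratio_mask(speech, noise)
separated = apply_mask([[2.0, 2.0], [2.0, 2.0]], irm)
```

In a supervised system of the kind described, a network would be trained to estimate this mask from the extracted features, since the premixed energies are only available at training time; the estimated mask is then applied to the mixture as in `apply_mask`.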


Keywords:  Beamforming; Binaural speech separation; computational auditory scene analysis (CASA); deep neural network (DNN); room reverberation

Year:  2017        PMID: 29057291      PMCID: PMC5646682          DOI: 10.1109/TASLP.2017.2687104

Source DB:  PubMed          Journal:  IEEE/ACM Trans Audio Speech Lang Process


