Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation.

Zhong-Qiu Wang, Peidong Wang, DeLiang Wang.

Abstract

We propose multi-microphone complex spectral mapping, a simple way of applying deep learning for time-varying non-linear beamforming, for speaker separation in reverberant conditions. We aim at both speaker separation and dereverberation. Our study first investigates offline utterance-wise speaker separation and then extends to block-online continuous speech separation (CSS). Assuming a fixed array geometry between training and testing, we train deep neural networks (DNN) to predict the real and imaginary (RI) components of target speech at a reference microphone from the RI components of multiple microphones. We then integrate multi-microphone complex spectral mapping with minimum variance distortionless response (MVDR) beamforming and post-filtering to further improve separation, and combine it with frame-level speaker counting for block-online CSS. Although our system is trained on simulated room impulse responses (RIR) based on a fixed number of microphones arranged in a given geometry, it generalizes well to a real array with the same geometry. State-of-the-art separation performance is obtained on the simulated two-talker SMS-WSJ corpus and the real-recorded LibriCSS dataset.
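The MVDR step named in the abstract can be illustrated for a single frequency bin. The following is a minimal sketch, not the authors' implementation: it assumes the spatial covariance matrices are estimated from multi-channel STFT frames (for example, from a network's target estimate and the residual), and it uses the common reference-microphone formulation w = (Φn⁻¹ Φs) u / trace(Φn⁻¹ Φs). The function names `spatial_covariance`, `mvdr_weights`, and `apply_beamformer` are hypothetical.

```python
import numpy as np


def spatial_covariance(X):
    """Average outer product of multi-channel STFT frames.

    X: complex array of shape (M, T), M microphones and T frames at one frequency.
    Returns an (M, M) Hermitian matrix.
    """
    return X @ X.conj().T / X.shape[1]


def mvdr_weights(phi_s, phi_n, ref=0):
    """MVDR weights for one frequency bin, referenced to microphone `ref`.

    phi_s: (M, M) target covariance; phi_n: (M, M) interference covariance.
    Assumption: phi_n is Hermitian and invertible (add diagonal loading otherwise).
    """
    ratio = np.linalg.solve(phi_n, phi_s)  # phi_n^{-1} phi_s without an explicit inverse
    return ratio[:, ref] / np.trace(ratio)


def apply_beamformer(w, Y):
    """Compute w^H y for every frame. Y has shape (M, T); the result has shape (T,)."""
    return w.conj() @ Y
```

When the target covariance is rank one (a single steering vector), these weights pass the target unchanged as it appears at the reference microphone, which matches the abstract's goal of predicting target speech at a reference microphone.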

Keywords:  Complex spectral mapping; deep learning; microphone array processing; speaker separation

Year:  2021        PMID: 34212067      PMCID: PMC8240467          DOI: 10.1109/taslp.2021.3083405

Source DB:  PubMed          Journal:  IEEE/ACM Trans Audio Speech Lang Process


  6 in total

1.  Supervised Speech Separation Based on Deep Learning: An Overview.

Authors:  DeLiang Wang; Jitong Chen
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2018-05-30

2.  Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation.

Authors:  Yi Luo; Nima Mesgarani
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2019-05-06

3.  Complex Spectral Mapping for Single- and Multi-Channel Speech Enhancement and Robust ASR.

Authors:  Zhong-Qiu Wang; Peidong Wang; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2020-05-28

4.  Deep Learning Based Target Cancellation for Speech Dereverberation.

Authors:  Zhong-Qiu Wang; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2020-02-28

5.  Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation.

Authors:  Yuzhou Liu; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2019-09-12

6.  Complex Ratio Masking for Monaural Speech Separation.

Authors:  Donald S Williamson; Yuxuan Wang; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2015-12-23
