
Complex Spectral Mapping for Single- and Multi-Channel Speech Enhancement and Robust ASR.

Zhong-Qiu Wang, Peidong Wang, DeLiang Wang.

Abstract

This study proposes a complex spectral mapping approach for single- and multi-channel speech enhancement, where deep neural networks (DNNs) are used to predict the real and imaginary (RI) components of the direct-path signal from those of the noisy and reverberant mixture. The proposed system contains two DNNs. The first one performs single-channel complex spectral mapping. The estimated complex spectra are used to compute a minimum variance distortionless response (MVDR) beamformer. The RI components of the beamforming results, which encode spatial information, are then combined with the RI components of the mixture to train the second DNN for multi-channel complex spectral mapping. With the estimated complex spectra, we also propose a novel method of time-varying beamforming. State-of-the-art performance is obtained on the speech enhancement and recognition tasks of the CHiME-4 corpus. More specifically, our system obtains word error rates (WER) of 6.82%, 3.19%, and 2.00% on the single-, two-, and six-microphone tasks of CHiME-4, respectively, significantly surpassing the current best results of 9.15%, 3.91%, and 2.24% WER.
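The middle stage of the pipeline described above (estimated complex spectra, then an MVDR beamformer, then stacked RI features for the second DNN) can be illustrated with a minimal NumPy sketch. The abstract does not state how the beamformer statistics are computed, so everything below is an assumption made for illustration: the noise estimate is taken as mixture minus estimated speech, the steering vector is taken as the principal eigenvector of the estimated speech covariance, and the function names `mvdr_weights`, `apply_beamformer`, and `stack_ri` are hypothetical. The DNNs themselves are not modeled; `est` stands in for the output of the first network.

```python
import numpy as np


def mvdr_weights(mix, est, ref=0, eps=1e-6):
    """Time-invariant MVDR weights from a mixture STFT and an estimated speech STFT.

    mix, est: complex arrays of shape (channels, frames, freqs).
    Returns (w, d), each of shape (freqs, channels).
    Assumption: noise is approximated as mix - est.
    """
    noise = mix - est
    n_ch, n_frames, _ = mix.shape
    # Spatial covariance per frequency: phi[f] = mean_t x[t, f] x[t, f]^H
    phi_s = np.einsum("ctf,dtf->fcd", est, est.conj()) / n_frames
    phi_n = np.einsum("ctf,dtf->fcd", noise, noise.conj()) / n_frames
    phi_n = phi_n + eps * np.eye(n_ch)  # diagonal loading for invertibility
    # Assumed steering vector: principal eigenvector of the speech covariance
    # (eigh returns eigenvalues in ascending order, so take the last column).
    _, vecs = np.linalg.eigh(phi_s)
    d = vecs[..., -1]
    d = d / d[:, ref:ref + 1]  # unit response at the reference microphone
    # w = phi_n^{-1} d / (d^H phi_n^{-1} d)
    num = np.linalg.solve(phi_n, d[..., None])[..., 0]
    den = np.einsum("fc,fc->f", d.conj(), num)
    w = num / den[:, None]
    return w, d


def apply_beamformer(w, mix):
    """Beamformed STFT of shape (frames, freqs): y[t, f] = w[f]^H mix[:, t, f]."""
    return np.einsum("fc,ctf->tf", w.conj(), mix)


def stack_ri(mix, bf):
    """Stack RI parts of all mixture channels and of the beamformer output.

    Returns a real array of shape (2 * channels + 2, frames, freqs), a
    hypothetical input layout for a second-stage network.
    """
    return np.concatenate(
        [mix.real, mix.imag, bf.real[None], bf.imag[None]], axis=0
    )
```

With a six-channel mixture, `stack_ri` would yield 14 real-valued feature maps per utterance under this layout. The distortionless constraint can be checked numerically: for every frequency, `w[f]` applied to `d[f]` should give a response of 1.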


Keywords:  Complex spectral mapping; beamforming; deep learning; microphone array processing; phase estimation; speech enhancement

Year:  2020        PMID: 33748326      PMCID: PMC7971156          DOI: 10.1109/taslp.2020.2998279

Source DB:  PubMed          Journal:  IEEE/ACM Trans Audio Speech Lang Process


  3 in total

1.  Supervised Speech Separation Based on Deep Learning: An Overview.

Authors:  DeLiang Wang; Jitong Chen
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2018-05-30

2.  Complex Ratio Masking for Monaural Speech Separation.

Authors:  Donald S Williamson; Yuxuan Wang; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2015-12-23

3.  Deep Learning Based Binaural Speech Separation in Reverberant Environments.

Authors:  Xueliang Zhang; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2017-03-24
  3 in total

1.  Neural Cascade Architecture with Triple-domain Loss for Speech Enhancement.

Authors:  Heming Wang; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2021-12-28

2.  Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation.

Authors:  Zhong-Qiu Wang; Peidong Wang; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2021-05-26

3.  Two-Step Joint Optimization with Auxiliary Loss Function for Noise-Robust Speech Recognition.

Authors:  Geon Woo Lee; Hong Kook Kim
Journal:  Sensors (Basel)       Date:  2022-07-19       Impact factor: 3.847

