Time-Frequency Masking in the Complex Domain for Speech Dereverberation and Denoising.

Abstract

In real-world situations, speech is masked by both background noise and reverberation, which negatively affect perceptual quality and intelligibility. In this paper, we address monaural speech separation in reverberant and noisy environments. We perform dereverberation and denoising using supervised learning with a deep neural network. Specifically, we enhance the magnitude and phase by performing separation with an estimate of the complex ideal ratio mask. We define the complex ideal ratio mask so that direct speech results after the mask is applied to reverberant and noisy speech. Our approach is evaluated using simulated and real room impulse responses, and with background noises. The proposed approach improves objective speech quality and intelligibility significantly. Evaluations and comparisons show that it outperforms related methods in many reverberant and noisy environments.
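The abstract defines the complex ideal ratio mask only in words: applying it to the reverberant and noisy speech should yield the direct speech. A minimal sketch of that definition follows, assuming the mask is applied by complex multiplication in the short-time Fourier transform (STFT) domain, so that the ideal mask is the complex ratio of the direct-speech STFT to the mixture STFT. The function names, the `eps` stabilizer, and the random arrays standing in for real spectrograms are illustrative assumptions, not taken from the paper. The paper trains a deep neural network to estimate the mask; this sketch only computes the ideal (oracle) mask from a known target.

```python
import numpy as np


def complex_ideal_ratio_mask(direct_stft, mixture_stft, eps=1e-12):
    """Return a complex mask M such that M * mixture_stft ~= direct_stft.

    Both inputs are complex arrays of the same shape (frequency x time).
    eps is a hypothetical stabilizer that avoids division by zero.
    """
    y_r, y_i = mixture_stft.real, mixture_stft.imag
    s_r, s_i = direct_stft.real, direct_stft.imag
    denom = y_r**2 + y_i**2 + eps
    # Real and imaginary parts of S / Y, written out from S * conj(Y) / |Y|^2.
    m_r = (y_r * s_r + y_i * s_i) / denom
    m_i = (y_r * s_i - y_i * s_r) / denom
    return m_r + 1j * m_i


def apply_mask(mask, mixture_stft):
    """Apply the mask by element-wise complex multiplication."""
    return mask * mixture_stft


# Random complex arrays stand in for STFTs of real recordings (assumption).
rng = np.random.default_rng(0)
shape = (257, 100)  # hypothetical: 257 frequency bins, 100 frames
direct = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
# Lumps reverberation and background noise into one additive term.
interference = 0.5 * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
mixture = direct + interference

mask = complex_ideal_ratio_mask(direct, mixture)
estimate = apply_mask(mask, mixture)
max_error = np.max(np.abs(estimate - direct))
```

Because the mask has both a real and an imaginary component, multiplying by it changes the magnitude and the phase of each time-frequency unit, which is the sense in which the abstract says the approach enhances both.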

Keywords: Complex ideal ratio mask; deep neural networks; dereverberation; speech quality; speech separation

Year: 2017 PMID: 30112422 PMCID: PMC6089240 DOI: 10.1109/TASLP.2017.2696307

Source DB: PubMed Journal: IEEE/ACM Trans Audio Speech Lang Process

9 in total

1. On the importance of early reflections for speech in rooms.

Authors: J S Bradley; H Sato; M Picard
Journal: J Acoust Soc Am Date: 2003-06 Impact factor: 1.840

2. Pitch-based monaural segregation of reverberant speech.

Authors: Nicoleta Roman; DeLiang Wang
Journal: J Acoust Soc Am Date: 2006-07 Impact factor: 1.840

3. Objective measures for predicting speech intelligibility in noisy conditions based on new band-importance functions.

Authors: Jianfen Ma; Yi Hu; Philipos C Loizou
Journal: J Acoust Soc Am Date: 2009-05 Impact factor: 1.840

4. Speech intelligibility in reverberation with ideal binary masking: effects of early reflections and signal-to-noise ratio threshold.

Authors: Nicoleta Roman; John Woodruff
Journal: J Acoust Soc Am Date: 2013-03 Impact factor: 1.840

5. Perceptual linear predictive (PLP) analysis of speech.

Authors: H Hermansky
Journal: J Acoust Soc Am Date: 1990-04 Impact factor: 1.840

6. Monaural and binaural speech perception through hearing aids under noise and reverberation with normal and hearing-impaired listeners.

Authors: A K Nabelek; J M Pickett
Journal: J Speech Hear Res Date: 1974-12

7. Speech enhancement based on physiological and psychoacoustical models of modulation perception and binaural interaction.

Authors: B Kollmeier; R Koch
Journal: J Acoust Soc Am Date: 1994-03 Impact factor: 1.840

8. On Training Targets for Supervised Speech Separation.

Authors: Yuxuan Wang; Arun Narayanan; DeLiang Wang
Journal: IEEE/ACM Trans Audio Speech Lang Process Date: 2014-12

9. Complex Ratio Masking for Monaural Speech Separation.

Authors: Donald S Williamson; Yuxuan Wang; DeLiang Wang
Journal: IEEE/ACM Trans Audio Speech Lang Process Date: 2015-12-23

2 in total

1. Deep Learning Based Target Cancellation for Speech Dereverberation.

Authors: Zhong-Qiu Wang; DeLiang Wang
Journal: IEEE/ACM Trans Audio Speech Lang Process Date: 2020-02-28

2. Speech Enhancement by Multiple Propagation through the Same Neural Network.

Authors: Tomasz Grzywalski; Szymon Drgas
Journal: Sensors (Basel) Date: 2022-03-22 Impact factor: 3.576
