Neural Cascade Architecture with Triple-domain Loss for Speech Enhancement.

Abstract

This paper proposes a neural cascade architecture to address the monaural speech enhancement problem. The cascade architecture is composed of three modules, which in turn optimize the enhanced speech with respect to the magnitude spectrogram, the time-domain signal, and the complex spectrogram. Each module takes as input the noisy speech and the output of the previous module, and generates a prediction of its respective target. The model is trained end to end using a triple-domain loss function that accounts for all three domains of signal representation. Experimental results on the WSJ0 SI-84 corpus show that the proposed model outperforms other strong speech enhancement baselines in terms of objective speech quality and intelligibility.
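
The cascade and the triple-domain loss described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the loss terms, their weights, or the STFT settings, so the L1 distances, the equal weights, the 320-sample Hann window with a 160-sample hop, and all names (`stft`, `cascade_forward`, `triple_domain_loss`, and the three module callables) are assumptions chosen for illustration.

```python
import numpy as np


def stft(signal, frame_len=320, hop=160):
    """Hann-windowed STFT; returns a complex array of shape (frames, bins).

    Frame length and hop are illustrative assumptions, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def cascade_forward(noisy, mag_module, time_module, complex_module):
    """Run three hypothetical modules in sequence.

    Each later module receives the noisy speech plus the previous module's output.
    """
    est_mag = mag_module(noisy)                 # magnitude-spectrogram estimate
    est_wave = time_module(noisy, est_mag)      # time-domain estimate
    est_spec = complex_module(noisy, est_wave)  # complex-spectrogram estimate
    return est_mag, est_wave, est_spec


def triple_domain_loss(est_mag, est_wave, est_spec, clean_wave,
                       weights=(1.0, 1.0, 1.0)):
    """Weighted sum of L1 losses in three domains (assumed form)."""
    clean_spec = stft(clean_wave)
    mag_loss = np.mean(np.abs(est_mag - np.abs(clean_spec)))
    time_loss = np.mean(np.abs(est_wave - clean_wave))
    complex_loss = np.mean(
        np.abs(est_spec.real - clean_spec.real)
        + np.abs(est_spec.imag - clean_spec.imag)
    )
    w_mag, w_time, w_cplx = weights
    return w_mag * mag_loss + w_time * time_loss + w_cplx * complex_loss
```

In an end-to-end setup, all three outputs of `cascade_forward` would feed a single loss of this kind, so that the gradient from each domain reaches every module upstream of it. A real system would use trainable networks and an automatic-differentiation framework in place of the plain callables used here.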

Keywords: Monaural speech enhancement; cascade architecture; complex domain; deep learning; time domain

Year: 2021; PMID: 36161036; PMCID: PMC9491518; DOI: 10.1109/taslp.2021.3138716

Source DB: PubMed; Journal: IEEE/ACM Trans Audio Speech Lang Process
