
Monaural Speech Dereverberation Using Temporal Convolutional Networks with Self Attention.

Yan Zhao, DeLiang Wang, Buye Xu, Tao Zhang.

Abstract

In daily listening environments, human speech is often degraded by room reverberation, especially under highly reverberant conditions. Such degradation poses a challenge for many speech processing systems, where the performance becomes much worse than in anechoic environments. To combat the effect of reverberation, we propose a monaural (single-channel) speech dereverberation algorithm using temporal convolutional networks with self attention. Specifically, the proposed system includes a self-attention module to produce dynamic representations given input features, a temporal convolutional network to learn a nonlinear mapping from such representations to the magnitude spectrum of anechoic speech, and a one-dimensional (1-D) convolution module to smooth the enhanced magnitude among adjacent frames. Systematic evaluations demonstrate that the proposed algorithm improves objective metrics of speech quality in a wide range of reverberant conditions. In addition, it generalizes well to untrained reverberation times, room sizes, measured room impulse responses, real-world recorded noisy-reverberant speech, and different speakers.
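The three-stage pipeline described in the abstract (self attention, then a temporal convolutional network, then 1-D convolutional smoothing across frames) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the identity attention projections, the single shared 3-tap kernel per layer, the dilations (1, 2, 4), the fixed smoothing kernel, and every function name below are assumptions chosen only to keep the example small and runnable.

```python
import math
import random

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(frames):
    """Scaled dot-product self attention over time frames.
    Assumption: queries, keys and values are the input frames themselves
    (no learned projections)."""
    n, d = len(frames), len(frames[0])
    out = []
    for q in frames:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in frames]
        w = softmax(scores)
        out.append([sum(w[t] * frames[t][j] for t in range(n)) for j in range(d)])
    return out

def conv_time(frames, kernel, dilation=1):
    """1-D convolution along the time axis with zero padding.
    Assumption: the same kernel is applied to every feature channel."""
    n, d = len(frames), len(frames[0])
    c = len(kernel) // 2
    out = [[0.0] * d for _ in range(n)]
    for t in range(n):
        for k, w in enumerate(kernel):
            s = t + (k - c) * dilation
            if 0 <= s < n:
                for j in range(d):
                    out[t][j] += w * frames[s][j]
    return out

def tcn(frames, kernels, dilations):
    """Stack of dilated convolutions, each followed by ReLU and a residual add."""
    x = frames
    for kernel, dil in zip(kernels, dilations):
        y = conv_time(x, kernel, dil)
        x = [[xv + max(0.0, yv) for xv, yv in zip(xr, yr)] for xr, yr in zip(x, y)]
    return x

def dereverb_sketch(frames, kernels, dilations=(1, 2, 4)):
    """Attention -> TCN -> non-negative magnitude -> smoothing across frames."""
    attended = self_attention(frames)
    mapped = tcn(attended, kernels, dilations)
    magnitude = [[max(0.0, v) for v in row] for row in mapped]
    # Hypothetical fixed smoothing kernel; in a trained system it would be learned.
    return conv_time(magnitude, [0.25, 0.5, 0.25])

random.seed(0)
# Toy input: 6 frames of 4 hypothetical magnitude features each.
frames = [[random.random() for _ in range(4)] for _ in range(6)]
kernels = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]
enhanced = dereverb_sketch(frames, kernels)
print(len(enhanced), len(enhanced[0]))
```

The output keeps the input's frame and feature dimensions, which matches the idea of mapping reverberant features to an enhanced magnitude spectrum frame by frame. A trained system would replace the random kernels with learned weights and add learned projections in the attention step.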


Keywords:  Dereverberation; room impulse response; self attention; temporal convolutional networks

Year:  2020        PMID: 33748325      PMCID: PMC7971181          DOI: 10.1109/taslp.2020.2995273

Source DB:  PubMed          Journal:  IEEE/ACM Trans Audio Speech Lang Process


  4 in total

1.  Supervised Speech Separation Based on Deep Learning: An Overview.

Authors:  DeLiang Wang; Jitong Chen
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2018-05-30

2.  Objective measures for predicting speech intelligibility in noisy conditions based on new band-importance functions.

Authors:  Jianfen Ma; Yi Hu; Philipos C Loizou
Journal:  J Acoust Soc Am       Date:  2009-05       Impact factor: 1.840

3.  Hearing loss, aging, and speech perception in reverberation and noise.

Authors:  K S Helfer; L A Wilber
Journal:  J Speech Hear Res       Date:  1990-03

4.  Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation.

Authors:  Yi Luo; Nima Mesgarani
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2019-05-06
  3 in total

1.  A causal and talker-independent speaker separation/dereverberation deep learning algorithm: Cost associated with conversion to real-time capable operation.

Authors:  Eric W Healy; Hassan Taherian; Eric M Johnson; DeLiang Wang
Journal:  J Acoust Soc Am       Date:  2021-11       Impact factor: 1.840

2.  Self-attending RNN for Speech Enhancement to Improve Cross-corpus Generalization.

Authors:  Ashutosh Pandey; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2022-03-22

3.  Dense CNN with Self-Attention for Time-Domain Speech Enhancement.

Authors:  Ashutosh Pandey; DeLiang Wang
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2021-03-08
