Supervised Speech Separation Based on Deep Learning: An Overview.

Abstract

Speech separation is the task of separating target speech from background interference. Traditionally, speech separation is studied as a signal processing problem. A more recent approach formulates speech separation as a supervised learning problem, where the discriminative patterns of speech, speakers, and background noise are learned from training data. Over the past decade, many supervised separation algorithms have been put forward. In particular, the recent introduction of deep learning to supervised speech separation has dramatically accelerated progress and boosted separation performance. This paper provides a comprehensive overview of the research on deep learning based supervised speech separation in the last several years. We first introduce the background of speech separation and the formulation of supervised separation. Then, we discuss three main components of supervised separation: learning machines, training targets, and acoustic features. Much of the overview is on separation algorithms where we review monaural methods, including speech enhancement (speech-nonspeech separation), speaker separation (multitalker separation), and speech dereverberation, as well as multimicrophone techniques. The important issue of generalization, unique to supervised learning, is discussed. This overview provides a historical perspective on how advances are made. In addition, we discuss a number of conceptual issues, including what constitutes the target source.

Keywords: Speech separation; array separation; beamforming; deep learning; deep neural networks; speaker separation; speech dereverberation; speech enhancement; supervised speech separation; time-frequency masking

Year: 2018 PMID: 31223631 PMCID: PMC6586438 DOI: 10.1109/TASLP.2018.2842159

Source DB: PubMed Journal: IEEE/ACM Trans Audio Speech Lang Process

Cited

25 in total

1. A two-stage deep learning algorithm for talker-independent speaker separation in reverberant conditions.

2. Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation.

3. Deep Learning for Talker-dependent Reverberant Speaker Separation: An Empirical Study.

4. Complex Spectral Mapping for Single- and Multi-Channel Speech Enhancement and Robust ASR.

5. Monaural Speech Dereverberation Using Temporal Convolutional Networks with Self Attention.

6. Deep Learning Based Target Cancellation for Speech Dereverberation.

7. On Cross-Corpus Generalization of Deep Learning Based Speech Enhancement.

8. SSGD: Sparsity-Promoting Stochastic Gradient Descent Algorithm for Unbiased DNN Pruning.

9. Towards Model Compression for Deep Learning Based Speech Enhancement.

10. Deep Learning Based Real-time Speech Enhancement for Dual-microphone Mobile Phones.