Noise Perturbation for Supervised Speech Separation.

Jitong Chen, Yuxuan Wang, DeLiang Wang.

Abstract

Speech separation can be treated as a mask estimation problem, in which interference-dominant portions of a time-frequency representation of noisy speech are masked. In supervised speech separation, a classifier is typically trained on a set of speech-noise mixtures. Because training data are limited, it is important to use them efficiently so that the classifier generalizes well. When target speech is severely corrupted by a nonstationary noise, a classifier tends to mistake noise patterns for speech patterns. Expanding a noise through proper perturbation during training exposes the classifier to a broader variety of noisy conditions, and hence may lead to better separation performance. This study examines three noise perturbations for supervised speech separation at low signal-to-noise ratios (SNRs): noise rate perturbation, vocal tract length perturbation, and frequency perturbation. Separation performance is evaluated in terms of classification accuracy, hit minus false-alarm rate (HIT−FA), and short-time objective intelligibility (STOI). The experimental results show that frequency perturbation yields the best separation performance of the three. In particular, frequency perturbation is effective in reducing the error of misclassifying a noise pattern as a speech pattern.
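
The mask formulation and the hit minus false-alarm measure mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes the commonly used ideal binary mask definition, in which a time-frequency unit is labeled 1 when its local SNR exceeds a threshold, and the usual reading of the measure as the hit rate on speech-dominant units minus the false-alarm rate on noise-dominant units. The function names, the `lc_db` threshold parameter, and the toy energy values are all hypothetical.

```python
import math


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Label each time-frequency unit 1 (speech-dominant) or 0 (noise-dominant).

    Inputs are 2-D lists of per-unit energies. The lc_db threshold is an
    assumed parameter for this sketch, not a value taken from the paper.
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0.0:
                row.append(0)          # no speech energy: noise-dominant
            elif n <= 0.0:
                row.append(1)          # speech but no noise: speech-dominant
            else:
                snr_db = 10.0 * math.log10(s / n)
                row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def hit_minus_fa(estimated, ideal):
    """Return (HIT - FA, HIT, FA) for an estimated mask against an ideal mask."""
    hits = misses = false_alarms = correct_rejections = 0
    for e_row, i_row in zip(estimated, ideal):
        for e, i in zip(e_row, i_row):
            if i == 1:
                if e == 1:
                    hits += 1
                else:
                    misses += 1
            else:
                if e == 1:
                    false_alarms += 1   # noise pattern mistaken for speech
                else:
                    correct_rejections += 1
    speech_units = hits + misses
    noise_units = false_alarms + correct_rejections
    hit_rate = hits / speech_units if speech_units else 0.0
    fa_rate = false_alarms / noise_units if noise_units else 0.0
    return hit_rate - fa_rate, hit_rate, fa_rate


# Toy example: 2 frames x 3 frequency channels of made-up energies.
speech = [[4.0, 1.0, 0.5], [9.0, 0.2, 3.0]]
noise = [[1.0, 2.0, 1.0], [1.0, 1.0, 1.0]]
ideal = ideal_binary_mask(speech, noise)

# A hypothetical classifier output with one false alarm and one miss.
estimated = [[1, 1, 0], [1, 0, 0]]
score, hit_rate, fa_rate = hit_minus_fa(estimated, ideal)
print(ideal, round(score, 3))
```

In this toy case the false alarm is exactly the kind of error the abstract highlights: a noise-dominant unit labeled as speech. Reporting the hit and false-alarm rates separately, rather than only overall accuracy, makes that error type visible.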

Keywords: Speech separation; noise perturbation; supervised learning

Year: 2016 PMID: 26900194 PMCID: PMC4754974 DOI: 10.1016/j.specom.2015.12.006

Source DB: PubMed Journal: Speech Commun ISSN: 0167-6393 Impact factor: 2.017

10 in total

1. Multi-column deep neural network for traffic sign classification.

Authors: Dan Cireşan; Ueli Meier; Jonathan Masci; Jürgen Schmidhuber
Journal: Neural Netw Date: 2012-02-14

2. Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation.

Authors: Douglas S Brungart; Peter S Chang; Brian D Simpson; DeLiang Wang
Journal: J Acoust Soc Am Date: 2006-12 Impact factor: 1.840

3. Factors influencing intelligibility of ideal binary-masked speech: implications for noise reduction.

Authors: Ning Li; Philipos C Loizou
Journal: J Acoust Soc Am Date: 2008-03 Impact factor: 1.840

4. An algorithm to improve speech recognition in noise for hearing-impaired listeners.

Authors: Eric W Healy; Sarah E Yoho; Yuxuan Wang; DeLiang Wang
Journal: J Acoust Soc Am Date: 2013-10 Impact factor: 1.840

5. An algorithm that improves speech intelligibility in noise for normal-hearing listeners.

Authors: Gibak Kim; Yang Lu; Yi Hu; Philipos C Loizou
Journal: J Acoust Soc Am Date: 2009-09 Impact factor: 1.840

6. Perceptual learning for speech in noise after application of binary time-frequency masks.

Authors: Mahnaz Ahmadi; Vauna L Gross; Donal G Sinex
Journal: J Acoust Soc Am Date: 2013-03 Impact factor: 1.840

7. Evaluation of the importance of time-frequency contributions to speech intelligibility in noise.

Authors: Chengzhu Yu; Kamil K Wójcicki; Philipos C Loizou; John H L Hansen; Michael T Johnson
Journal: J Acoust Soc Am Date: 2014-05 Impact factor: 1.840

8. On Training Targets for Supervised Speech Separation.

Authors: Yuxuan Wang; Arun Narayanan; DeLiang Wang
Journal: IEEE/ACM Trans Audio Speech Lang Process Date: 2014-12

9. Environment-specific noise suppression for improved speech intelligibility by cochlear implant users.

Authors: Yi Hu; Philipos C Loizou
Journal: J Acoust Soc Am Date: 2010-06 Impact factor: 1.840

10. Speech intelligibility in background noise with ideal binary time-frequency masking.

Authors: DeLiang Wang; Ulrik Kjems; Michael S Pedersen; Jesper B Boldt; Thomas Lunner
Journal: J Acoust Soc Am Date: 2009-04 Impact factor: 1.840