Estimating nonnegative matrix model activations with deep neural networks to increase perceptual speech quality.

Donald S Williamson, Yuxuan Wang, DeLiang Wang.

Abstract

As a means of speech separation, time-frequency masking applies a gain function to the time-frequency representation of noisy speech. In contrast, nonnegative matrix factorization (NMF) addresses separation by linearly combining basis vectors from speech and noise models to approximate noisy speech. This paper presents an approach for improving the perceptual quality of speech separated from background noise at low signal-to-noise ratios. An ideal ratio mask is estimated, which separates speech from noise with reasonable sound quality. A deep neural network then approximates clean speech by estimating activation weights from the ratio-masked speech, where the weights linearly combine elements from an NMF speech model. Systematic comparisons using objective metrics, including the perceptual evaluation of speech quality, show that the proposed algorithm achieves higher speech quality than related masking and NMF methods. In addition, a listening test was performed and its results show that the output of the proposed algorithm is preferred over the comparison systems in terms of speech quality.
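
The two building blocks named in the abstract, ratio masking and a linear combination of NMF speech basis vectors, can be illustrated with a minimal NumPy sketch. It is not the authors' system: the paper estimates the activation weights with a deep neural network, whereas this sketch substitutes standard multiplicative updates (Lee and Seung) with the basis held fixed. The mask exponent `beta=0.5`, the random "spectrograms", the assumption that speech and noise magnitudes add, and all function names are illustrative assumptions, not details taken from the record.

```python
import numpy as np


def ideal_ratio_mask(speech_mag, noise_mag, beta=0.5, eps=1e-12):
    """Gain in [0, 1] per time-frequency unit, from known speech and noise.

    The exponent beta=0.5 is an assumed, commonly used choice.
    """
    speech_pow = speech_mag ** 2
    noise_pow = noise_mag ** 2
    return (speech_pow / (speech_pow + noise_pow + eps)) ** beta


def nmf_activations(target_mag, speech_basis, n_iter=200, eps=1e-12):
    """Fit nonnegative activations H so that speech_basis @ H ~ target_mag.

    Multiplicative updates for the Euclidean cost, basis held fixed.
    This stands in for the DNN-based estimator described in the abstract.
    """
    rng = np.random.default_rng(0)
    n_basis = speech_basis.shape[1]
    n_frames = target_mag.shape[1]
    H = rng.random((n_basis, n_frames)) + eps
    WtV = speech_basis.T @ target_mag
    WtW = speech_basis.T @ speech_basis
    for _ in range(n_iter):
        # Elementwise update keeps H nonnegative by construction.
        H *= WtV / (WtW @ H + eps)
    return H


def demo():
    """Toy example on synthetic magnitudes (hypothetical data)."""
    rng = np.random.default_rng(1)
    n_freq, n_frames, n_basis = 64, 40, 8
    W = rng.random((n_freq, n_basis))          # assumed speech basis
    H_true = rng.random((n_basis, n_frames))   # assumed true activations
    speech = W @ H_true
    noise = 2.0 * rng.random((n_freq, n_frames))
    noisy = speech + noise                     # simplification: magnitudes add
    mask = ideal_ratio_mask(speech, noise)
    masked = mask * noisy                      # time-frequency masking step
    H = nmf_activations(masked, W)             # activations from masked speech
    recon = W @ H                              # linear combination of basis vectors
    return mask, masked, recon, speech
```

Calling `demo()` returns the mask, the ratio-masked magnitudes, the NMF reconstruction, and the clean reference, so the masked and reconstructed estimates can be compared against the clean speech with any error measure of choice.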

Year: 2015    PMID: 26428778    PMCID: PMC5392055    DOI: 10.1121/1.4928612

Source DB: PubMed    Journal: J Acoust Soc Am    ISSN: 0001-4966    Impact factor: 1.840


