Voice Feature Selection to Improve Performance of Machine Learning Models for Voice Production Inversion.

Abstract

OBJECTIVE: Estimation of physiological control parameters of the vocal system from the produced voice outcome has important applications in clinical management of voice disorders. Previously we developed a simulation-based neural network for estimation of vocal fold geometry, mechanical properties, and subglottal pressure from voice outcome features that characterize the acoustics of the produced voice. The goals of this study are to (1) explore the possibility of improving the estimation accuracy of physiological control parameters by including voice outcome features characterizing vocal fold vibration; and (2) identify voice feature sets that optimize both estimation accuracy and robustness to measurement noise.
METHODS: Feedforward neural networks are trained to solve the inversion problem of estimating the physiological control parameters of a three-dimensional body-cover vocal fold model from different sets of voice outcome features that characterize the simulated voice acoustics, glottal flow, and vocal fold vibration. A sensitivity analysis is then performed to evaluate the contribution of individual voice features to the overall performance of the neural networks in estimating the physiologic control parameters.
RESULTS AND CONCLUSIONS: While including voice outcome features characterizing vocal fold vibration increases estimation accuracy, it also reduces the network's robustness to measurement noise, due to high sensitivity of network performance to voice outcome features measuring the absolute amplitudes of the glottal flow and area waveforms, which are also difficult to measure accurately in practical applications. By excluding such glottal flow-based features and replacing glottal area-based features by their normalized counterparts, we are able to significantly improve both estimation accuracy and robustness to noise. We further show that similar estimation accuracy and robustness can be achieved with an even smaller set of voice outcome features by excluding features of small sensitivity.
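
The sensitivity analysis mentioned in the abstract can be illustrated with a minimal perturbation-based sketch. This is not the authors' procedure, which the abstract does not specify: it assumes sensitivity is measured as the increase in root-mean-square error when Gaussian noise is added to one input feature at a time. The names `feature_sensitivity`, `toy_inverse_model`, and `noise_frac` are hypothetical, and the fixed linear map stands in for a trained feedforward network.

```python
import math
import random


def rmse(pred, target):
    """Root-mean-square error over all samples and outputs."""
    sq = [(p - t) ** 2 for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    return math.sqrt(sum(sq) / len(sq))


def feature_sensitivity(predict, X, Y, noise_frac=0.05, seed=0):
    """Increase in RMSE when noise is added to one feature at a time.

    The noise is Gaussian with a standard deviation equal to noise_frac
    times that feature's own standard deviation across the data set
    (an assumed noise model, chosen for illustration).
    """
    rng = random.Random(seed)
    base = rmse([predict(x) for x in X], Y)
    scores = []
    for j in range(len(X[0])):
        col = [x[j] for x in X]
        mean = sum(col) / len(col)
        std = math.sqrt(sum((c - mean) ** 2 for c in col) / len(col))
        noisy = []
        for x in X:
            x2 = list(x)
            x2[j] += rng.gauss(0.0, noise_frac * std)  # perturb feature j only
            noisy.append(x2)
        scores.append(rmse([predict(x) for x in noisy], Y) - base)
    return scores


def toy_inverse_model(x):
    """Hypothetical stand-in for a trained network: one output that
    depends strongly on feature 0 and weakly on feature 1."""
    return [2.0 * x[0] + 0.1 * x[1]]


data_rng = random.Random(1)
X = [[data_rng.uniform(0, 1), data_rng.uniform(0, 1)] for _ in range(200)]
Y = [toy_inverse_model(x) for x in X]  # noise-free targets for the toy case
scores = feature_sensitivity(toy_inverse_model, X, Y)
```

In this toy setup the score for feature 0 comes out larger than the score for feature 1, because the stand-in model weights feature 0 more heavily. Under the same assumption, a feature with a high score is one whose measurement noise degrades the estimate most, and a feature with a low score is a candidate for exclusion.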

Keywords: Voice inversion; Vocal fold geometry; Vocal fold stiffness; Machine learning

Year: 2021 PMID: 33849760 PMCID: PMC8502179 DOI: 10.1016/j.jvoice.2021.03.004

Source DB: PubMed Journal: J Voice ISSN: 0892-1997 Impact factor: 2.300

19 in total

1. Vibration parameter extraction from endoscopic image series of the vocal folds.

Authors: Michael Döllinger; Ulrich Hoppe; Frank Hettlich; Jörg Lohscheller; Stefan Schuberth; Ulrich Eysholdt
Journal: IEEE Trans Biomed Eng Date: 2002-08 Impact factor: 4.538

2. Extracting physiologically relevant parameters of vocal folds from high-speed video image series.

Authors: Chao Tao; Yu Zhang; Jack J Jiang
Journal: IEEE Trans Biomed Eng Date: 2007-05 Impact factor: 4.538

3. Cause-effect relationship between vocal fold physiology and voice production in a three-dimensional phonation model.

Authors: Zhaoyan Zhang
Journal: J Acoust Soc Am Date: 2016-04 Impact factor: 1.840

4. Experimental validation of a three-dimensional reduced-order continuum model of phonation.

Authors: Mehrdad H Farahani; Zhaoyan Zhang
Journal: J Acoust Soc Am Date: 2016-08 Impact factor: 1.840

5. Estimation of vocal fold physiology from voice acoustics using machine learning.

Authors: Zhaoyan Zhang
Journal: J Acoust Soc Am Date: 2020-03 Impact factor: 1.840

6. Physical parameter estimation from porcine ex vivo vocal fold dynamics in an inverse problem framework.

Authors: Pablo Gómez; Anne Schützenberger; Stefan Kniesburges; Christopher Bohr; Michael Döllinger
Journal: Biomech Model Mechanobiol Date: 2017-12-11

7. The effect of high-speed videoendoscopy configuration on reduced-order model parameter estimates by Bayesian inference.

Authors: Jonathan J Deng; Paul J Hadwin; Sean D Peterson
Journal: J Acoust Soc Am Date: 2019-08 Impact factor: 1.840

8. Retrieving Tract Variables From Acoustics: A Comparison of Different Machine Learning Strategies.

9. Laryngeal Pressure Estimation With a Recurrent Neural Network.

10. Bayesian Inference of Vocal Fold Material Properties from Glottal Area Waveforms Using a 2D Finite Element Model.

1. Estimating subglottal pressure and vocal fold adduction from the produced voice in a single-subject study (L).