Retrieving Tract Variables From Acoustics: A Comparison of Different Machine Learning Strategies.

Vikramjit Mitra, Hosung Nam, Carol Y Espy-Wilson, Elliot Saltzman, Louis Goldstein.

Abstract

Many studies have claimed that articulatory information can improve the performance of automatic speech recognition systems. Unfortunately, such information is not readily available in typical speaker-listener situations, so it has to be estimated from the acoustic signal, a process usually termed "speech inversion." This study proposes and compares several machine learning strategies for speech inversion: trajectory mixture density networks (TMDNs), feedforward artificial neural networks (FF-ANN), support vector regression (SVR), autoregressive artificial neural networks (AR-ANN), and distal supervised learning (DSL). Further, using a database generated by the Haskins Laboratories speech production model, we test the claim that information regarding constrictions produced by the distinct organs of the vocal tract (vocal tract variables) is superior to flesh-point information (articulatory pellet trajectories) for the inversion process.

Year: 2010 PMID: 23326297 PMCID: PMC3544523 DOI: 10.1109/JSTSP.2010.2076013

Source DB: PubMed Journal: IEEE J Sel Top Signal Process ISSN: 1932-4553 Impact factor: 6.856

17 in total

1. An overlapping-feature-based phonological model incorporating linguistic constraints: applications to speech recognition.

Authors: Jiping Sun; Li Deng
Journal: J Acoust Soc Am Date: 2002-02 Impact factor: 1.840

2. Toward a model for lexical access based on acoustic landmarks and distinctive features.

Authors: Kenneth N Stevens
Journal: J Acoust Soc Am Date: 2002-04 Impact factor: 1.840

3. Inferring articulation and recognizing gestures from acoustics with a neural network trained on x-ray microbeam data.

Authors: G Papcun; J Hochberg; T R Thomas; F Laroche; J Zacks; S Levy
Journal: J Acoust Soc Am Date: 1992-08 Impact factor: 1.840

4. Articulatory phonology: an overview. (Review)

Authors: C P Browman; L Goldstein
Journal: Phonetica Date: 1992 Impact factor: 1.759

5. Accurate recovery of articulator positions from acoustics: new conclusions based on human data.

6. Coordination and coarticulation in speech production. (Review)

7. Timing constraints and coarticulation: alveolo-palatals and sequences of alveolar + [j] in Catalan.

8. Inversion of articulatory-to-acoustic transformation in the vocal tract by a computer-sorting technique.

9. Generating vocal tract shapes from formant frequencies.

10. Coarticulation in VCV utterances: spectrographic measurements.

1. A procedure for estimating gestural scores from speech acoustics.

2. Vocal generalization depends on gesture identity and sequence.

3. Statistical Methods for Estimation of Direct and Differential Kinematics of the Vocal Tract.

4. Voice Feature Selection to Improve Performance of Machine Learning Models for Voice Production Inversion.