Inferring articulation and recognizing gestures from acoustics with a neural network trained on x-ray microbeam data.

G Papcun, J Hochberg, T R Thomas, F Laroche, J Zacks, S Levy.

Abstract

This paper describes a method for inferring articulatory parameters from acoustics with a neural network trained on paired acoustic and articulatory data. An x-ray microbeam recorded the vertical movements of the lower lip, tongue tip, and tongue dorsum of three speakers saying the English stop consonants in repeated Ce syllables. A neural network was then trained to map from simultaneously recorded acoustic data to the articulatory data. To evaluate learning, acoustics from the training set were passed through the neural network. To evaluate generalization, acoustics from speakers or consonants excluded from the training set were passed through the network. The articulatory trajectories thus inferred were a good fit to the actual movements in both the learning and generalization conditions, as judged by root-mean-square error and correlation. Inferred trajectories were also matched to templates of lower lip, tongue tip, and tongue dorsum release gestures extracted from the original data. This technique correctly recognized from 94.4% to 98.9% of all gestures in the learning and cross-speaker generalization conditions, and 75% of gestures underlying consonants excluded from the training set. In addition, greater regularity was observed for movements of articulators that were critical in the formation of each consonant.
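The evaluation described in the abstract (root-mean-square error and correlation between inferred and actual trajectories, followed by matching against gesture templates) can be illustrated with a minimal sketch. The abstract does not specify how the template matching was done, so the sliding-window correlation below is an assumption, and the function names `rmse`, `correlation`, and `best_template` are hypothetical. The trajectories are synthetic, not data from the paper.

```python
import numpy as np


def rmse(actual, inferred):
    """Root-mean-square error between two equal-length trajectories."""
    actual = np.asarray(actual, dtype=float)
    inferred = np.asarray(inferred, dtype=float)
    return float(np.sqrt(np.mean((actual - inferred) ** 2)))


def correlation(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    return 0.0 if denom == 0 else float(np.sum(a * b) / denom)


def best_template(trajectory, templates):
    """Label of the template whose best sliding-window correlation is highest.

    Assumption: matching by peak correlation over all window positions.
    This is an illustrative choice, not the method confirmed by the paper.
    """
    trajectory = np.asarray(trajectory, dtype=float)
    scores = {}
    for label, template in templates.items():
        template = np.asarray(template, dtype=float)
        m = len(template)
        window_scores = [
            correlation(trajectory[i:i + m], template)
            for i in range(len(trajectory) - m + 1)
        ]
        scores[label] = max(window_scores)
    return max(scores, key=scores.get)


# Synthetic vertical-position trajectory: rest, upward movement, hold.
trajectory = np.concatenate([np.zeros(5), np.linspace(0, 1, 10), np.ones(5)])

# Hypothetical gesture templates (made up for this sketch).
templates = {
    "rise": np.linspace(0, 1, 10),
    "fall": np.linspace(1, 0, 10),
}

t = np.linspace(0, 1, 50)
actual = np.sin(2 * np.pi * t)
inferred = actual + 0.1  # constant offset: nonzero RMSE, perfect correlation

print(rmse(actual, inferred))
print(correlation(actual, inferred))
print(best_template(trajectory, templates))
```

The constant-offset example shows why both measures are reported together: correlation captures agreement in the shape of the movement, while root-mean-square error also penalizes differences in absolute position.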

Year:  1992        PMID: 1506525     DOI: 10.1121/1.403994

Source DB:  PubMed          Journal:  J Acoust Soc Am        ISSN: 0001-4966            Impact factor:   1.840


  5 in total

1.  A study of acoustic-to-articulatory inversion of speech by analysis-by-synthesis using chain matrices and the Maeda articulatory model.

Authors:  Sankaran Panchapagesan; Abeer Alwan
Journal:  J Acoust Soc Am       Date:  2011-04       Impact factor: 1.840

2.  Statistical Methods for Estimation of Direct and Differential Kinematics of the Vocal Tract.

Authors:  Adam Lammert; Louis Goldstein; Shrikanth Narayanan; Khalil Iskarous
Journal:  Speech Commun       Date:  2013-01       Impact factor: 2.017

3.  A computational theory for movement pattern recognition based on optimal movement pattern generation.

Authors:  Y Wada; Y Koike; E Vatikiotis-Bateson; M Kawato
Journal:  Biol Cybern       Date:  1995-06       Impact factor: 2.086

4.  Retrieving Tract Variables From Acoustics: A Comparison of Different Machine Learning Strategies.

Authors:  Vikramjit Mitra; Hosung Nam; Carol Y Espy-Wilson; Elliot Saltzman; Louis Goldstein
Journal:  IEEE J Sel Top Signal Process       Date:  2010-09-13       Impact factor: 6.856

5.  The use of phonetic motor invariants can improve automatic phoneme discrimination.

Authors:  Claudio Castellini; Leonardo Badino; Giorgio Metta; Giulio Sandini; Michele Tavella; Mirko Grimaldi; Luciano Fadiga
Journal:  PLoS One       Date:  2011-09-01       Impact factor: 3.240
