Speaker verification based on the fusion of speech acoustics and inverted articulatory signals.

Ming Li, Jangwon Kim, Adam Lammert, Prasanta Kumar Ghosh, Vikram Ramanarayanan, Shrikanth Narayanan.

Abstract

We propose a practical fusion approach, at both the feature level and the score level, that combines acoustic and estimated articulatory information for text-independent and text-dependent speaker verification. From a practical point of view, we study how to improve speaker verification performance by combining dynamic articulatory information with conventional acoustic features. For text-independent speaker verification, we find that concatenating articulatory features obtained from measured speech production data with conventional Mel-frequency cepstral coefficients (MFCCs) improves performance dramatically. However, since directly measuring articulatory data is not feasible in many real-world applications, we also experiment with estimated articulatory features obtained through acoustic-to-articulatory inversion. We explore both feature-level and score-level fusion methods and find that overall system performance is significantly enhanced even with estimated articulatory features. This performance boost could be due to the inter-speaker variation information embedded in the estimated articulatory features. Since the dynamics of articulation contain important information, we also include inverted articulatory trajectories in text-dependent speaker verification. We demonstrate that the articulatory constraints introduced by inverted articulatory features help to reject wrong-password trials and improve performance after score-level fusion. We evaluate the proposed methods on the X-ray Microbeam database for the text-independent task and on the RSR 2015 database for the text-dependent task. Experimental results show a relative equal error rate (EER) reduction of more than 15% for both speaker verification tasks.
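The two fusion strategies and the reported metric can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the equal default fusion weight, the threshold-sweep EER approximation, and the toy scores are all assumptions made for illustration. Feature-level fusion is shown as per-frame concatenation of two feature streams, and score-level fusion as a weighted sum of two systems' trial scores.

```python
def fuse_features(mfcc_frames, artic_frames):
    """Feature-level fusion: concatenate the two feature vectors of each frame."""
    if len(mfcc_frames) != len(artic_frames):
        raise ValueError("both streams must have the same number of frames")
    return [list(m) + list(a) for m, a in zip(mfcc_frames, artic_frames)]


def fuse_scores(acoustic_scores, artic_scores, weight=0.5):
    """Score-level fusion: weighted sum of per-trial scores (weight is hypothetical)."""
    return [weight * a + (1.0 - weight) * b
            for a, b in zip(acoustic_scores, artic_scores)]


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: target trials scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: non-target trials scoring at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


def relative_reduction(baseline_eer, fused_eer):
    """Relative EER reduction of a fused system with respect to a baseline."""
    return (baseline_eer - fused_eer) / baseline_eer


# Toy usage with made-up numbers (not data from the paper).
fused_frames = fuse_features([[1.0, 2.0]], [[3.0]])
fused_trials = fuse_scores([1.0, 2.0], [3.0, 4.0])
toy_eer = equal_error_rate([2.0, 3.0, 4.0], [0.0, 1.0, 1.5])
```

Under this definition, a drop from a baseline EER of 10% to a fused EER of 8.5% corresponds to a 15% relative reduction, which is the scale of improvement the abstract reports.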

Keywords: Acoustic-to-articulatory inversion; Articulatory features; Speech production; Text dependent speaker verification; Text independent speaker verification

Year: 2015 PMID: 28496292 PMCID: PMC5423730 DOI: 10.1016/j.csl.2015.05.003

Source DB: PubMed Journal: Comput Speech Lang ISSN: 0885-2308 Impact factor: 1.899
