Speaker verification based on the fusion of speech acoustics and inverted articulatory signals.

Ming Li, Jangwon Kim, Adam Lammert, Prasanta Kumar Ghosh, Vikram Ramanarayanan, Shrikanth Narayanan.

Abstract

We propose a practical approach to speaker verification that fuses acoustic and estimated articulatory information at both the feature level and the score level, for text-independent and text-dependent tasks. Specifically, we study how combining dynamic articulatory information with conventional acoustic features improves verification performance. For text-independent speaker verification, we find that concatenating articulatory features obtained from measured speech production data with conventional Mel-frequency cepstral coefficients (MFCCs) improves performance dramatically. However, since directly measuring articulatory data is not feasible in many real-world applications, we also experiment with estimated articulatory features obtained through acoustic-to-articulatory inversion. We explore both feature-level and score-level fusion methods and find that overall system performance is significantly enhanced even with estimated articulatory features. This performance boost could be due to the inter-speaker variation information embedded in the estimated articulatory features. Since the dynamics of articulation contain important information, we also include inverted articulatory trajectories in text-dependent speaker verification. We demonstrate that the articulatory constraints introduced by inverted articulatory features help to reject wrong-password trials and improve performance after score-level fusion. We evaluate the proposed methods on the X-ray Microbeam database (text-independent task) and the RSR 2015 database (text-dependent task). Experimental results show a relative equal error rate reduction of more than 15% for both speaker verification tasks.
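
As an illustration of the terms used in the abstract, the sketch below shows feature-level fusion as per-frame concatenation, score-level fusion as a weighted sum of per-trial scores, and how a relative equal error rate (EER) reduction could be computed. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the score fusion rule, so the weighted sum, the `weight` value, all function names, and the threshold-sweep EER approximation are hypothetical choices made for illustration.

```python
from typing import List, Sequence


def fuse_features(mfcc: Sequence[Sequence[float]],
                  artic: Sequence[Sequence[float]]) -> List[List[float]]:
    """Feature-level fusion: concatenate acoustic and articulatory vectors frame by frame."""
    if len(mfcc) != len(artic):
        raise ValueError("both streams must have the same number of frames")
    return [list(a) + list(b) for a, b in zip(mfcc, artic)]


def fuse_scores(acoustic: Sequence[float],
                articulatory: Sequence[float],
                weight: float = 0.5) -> List[float]:
    """Score-level fusion as a weighted sum (assumed rule; weight is hypothetical)."""
    if len(acoustic) != len(articulatory):
        raise ValueError("both systems must score the same trials")
    return [weight * a + (1.0 - weight) * b
            for a, b in zip(acoustic, articulatory)]


def equal_error_rate(target: Sequence[float],
                     nontarget: Sequence[float]) -> float:
    """Approximate EER by sweeping thresholds over the observed scores.

    Returns the mean of the false-accept and false-reject rates at the
    threshold where the two rates are closest.
    """
    best_gap, eer = float("inf"), 1.0
    for thr in sorted(set(target) | set(nontarget)):
        frr = sum(s < thr for s in target) / len(target)         # targets rejected
        far = sum(s >= thr for s in nontarget) / len(nontarget)  # impostors accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer


def relative_reduction(baseline_eer: float, fused_eer: float) -> float:
    """Relative EER reduction, e.g. 0.15 corresponds to a 15% relative reduction."""
    return (baseline_eer - fused_eer) / baseline_eer
```

Under these assumptions, a baseline EER of 10% that drops to 8.5% after fusion gives `relative_reduction(0.10, 0.085)` of about 0.15, which is the kind of figure the "more than 15% relative" claim refers to. In practice the fusion weight would be tuned on held-out data rather than fixed.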

Keywords:  Acoustic-to-articulatory inversion; Articulatory features; Speech production; Text dependent speaker verification; Text independent speaker verification

Year: 2015; PMID: 28496292; PMCID: PMC5423730; DOI: 10.1016/j.csl.2015.05.003

Source DB: PubMed; Journal: Comput Speech Lang; ISSN: 0885-2308; Impact factor: 1.899

