Performance enhancement for audio-visual speaker identification using dynamic facial muscle model.

Vahid Asadpour, Farzad Towhidkhah, Mohammad Mehdi Homayounpour.

Abstract

The science of human identification using physiological characteristics, or biometry, has been of great concern in security systems. However, robust multimodal identification systems based on audio-visual information have not yet been thoroughly investigated. The aim of this work is therefore to propose a model-based feature extraction method that employs the physiological characteristics of the facial muscles producing lip movements. This approach adopts intrinsic properties of muscles, such as viscosity, elasticity, and mass, which are extracted from a dynamic lip model. These parameters depend exclusively on the neuromuscular properties of the speaker; consequently, imitation of valid speakers could be reduced to a large extent. The parameters are applied to a hidden Markov model (HMM) audio-visual identification system. Audio and video features are combined by adopting a multistream pseudo-synchronized HMM training method. Noise-robust audio features such as Mel-frequency cepstral coefficients (MFCC), spectral subtraction (SS), and relative spectra perceptual linear prediction (J-RASTA-PLP) were used to evaluate how the multimodal system performs when efficient audio feature extraction methods are employed. The superior performance of the proposed system is demonstrated on a large multispeaker database of continuously spoken digits, along with a phonetically rich sentence. To evaluate the robustness of the algorithms, some experiments were performed on genetically identical twins. Furthermore, changes in speaker voice were simulated with drug inhalation tests. At a 3 dB signal-to-noise ratio (SNR), the dynamic muscle model improved the identification rate of the audio-visual system from 91 to 98%. Results on identical twins showed a clear improvement in the performance of the dynamic muscle model-based system: the identification rate of the audio-visual system rose from 87 to 96%.
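
The abstract does not give the equations of the dynamic lip model, so the following is only an illustrative sketch of how mass, viscosity, and elasticity could in principle be recovered from a movement trajectory. It assumes a generic second-order model, m*x'' + b*x' + k*x = f(t), with a known driving force and synthetic data; the function names, the parameter values, and the least-squares fit are assumptions made for this example and are not taken from the paper.

```python
import math


def solve3(a, y):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [a[r][:] + [y[r]] for r in range(3)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    sol = [0.0] * 3
    for r in (2, 1, 0):
        s = m[r][3] - sum(m[r][c] * sol[c] for c in range(r + 1, 3))
        sol[r] = s / m[r][r]
    return sol


def fit_mass_damper_spring(x, f, dt):
    """Least-squares estimate of (m, b, k) in m*x'' + b*x' + k*x = f.

    Assumed model, not the paper's: derivatives are approximated with
    central differences and the normal equations are solved directly.
    """
    ata = [[0.0] * 3 for _ in range(3)]
    atf = [0.0] * 3
    for i in range(1, len(x) - 1):
        acc = (x[i + 1] - 2.0 * x[i] + x[i - 1]) / dt ** 2
        vel = (x[i + 1] - x[i - 1]) / (2.0 * dt)
        row = (acc, vel, x[i])
        for r in range(3):
            atf[r] += row[r] * f[i]
            for c in range(3):
                ata[r][c] += row[r] * row[c]
    return solve3(ata, atf)


def synthetic_trajectory(m, b, k, dt, n):
    """Hypothetical test data: a two-tone displacement and the force it implies."""
    w1, w2 = 2.0 * math.pi * 2.0, 2.0 * math.pi * 5.0
    x, f = [], []
    for i in range(n):
        t = i * dt
        pos = math.sin(w1 * t) + 0.5 * math.sin(w2 * t)
        vel = w1 * math.cos(w1 * t) + 0.5 * w2 * math.cos(w2 * t)
        acc = -w1 ** 2 * math.sin(w1 * t) - 0.5 * w2 ** 2 * math.sin(w2 * t)
        x.append(pos)
        f.append(m * acc + b * vel + k * pos)
    return x, f


# Arbitrary illustrative parameter values (mass, viscosity, elasticity).
TRUE_PARAMS = (0.02, 0.5, 40.0)
DT = 0.001
x, f = synthetic_trajectory(*TRUE_PARAMS, dt=DT, n=2000)
m_hat, b_hat, k_hat = fit_mass_damper_spring(x, f, DT)
print(m_hat, b_hat, k_hat)
```

The displacement uses two frequencies because a single sinusoid makes acceleration proportional to position, which leaves mass and elasticity indistinguishable. In a real setting the driving force is usually not observed, so a different identification scheme would be needed.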

Year:  2006        PMID: 17031716     DOI: 10.1007/s11517-006-0106-5

Source DB:  PubMed          Journal:  Med Biol Eng Comput        ISSN: 0140-0118            Impact factor:   2.602

