Analysis of human scream and its impact on text-independent speaker verification.

John H L Hansen, Mahesh Kumar Nandwana, Navid Shokouhi.

Abstract

A scream is defined as a sustained, high-energy vocalization that lacks phonological structure. This lack of phonological structure is what distinguishes a scream from other forms of loud vocalization, such as a "yell." This study investigates the acoustic aspects of screams and addresses those that are known to prevent standard speaker identification systems from recognizing the identity of screaming speakers. It is well established that speaker variability due to changes in vocal effort and the Lombard effect contributes to degraded performance in automatic speech systems (e.g., speech recognition, speaker identification, and diarization). However, previous research in the general area of speaker variability has concentrated on human speech production, whereas less is known about non-speech vocalizations. The UT-NonSpeech corpus is developed here to investigate speaker verification from scream samples. This study presents a detailed analysis in terms of fundamental frequency, spectral peak shift, frame energy distribution, and spectral tilt. It is shown that traditional speaker recognition based on the Gaussian mixture model-universal background model (GMM-UBM) framework is unreliable when evaluated with screams.
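As a rough illustration of two of the acoustic measures named in the abstract (frame energy and fundamental frequency), the following is a minimal sketch in Python with NumPy. It is not the authors' analysis pipeline: the function names, the 40 ms frame and 10 ms hop, the autocorrelation-based F0 estimator, and the 75-2000 Hz search range are all assumptions chosen for illustration, and a synthetic 400 Hz tone stands in for real audio.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (the incomplete tail is dropped)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def frame_energy_db(frames, eps=1e-12):
    """Mean-square energy of each frame, in dB (eps avoids log of zero)."""
    return 10.0 * np.log10(np.mean(frames ** 2, axis=1) + eps)

def estimate_f0(frame, sr, fmin=75.0, fmax=2000.0):
    """Crude F0 estimate: lag of the autocorrelation peak within [1/fmax, 1/fmin].

    The search range is an assumption, widened upward because screams
    can have a much higher pitch than neutral speech.
    """
    frame = frame - np.mean(frame)
    # Keep only non-negative lags of the full autocorrelation.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / fmax)
    lag_max = min(int(sr / fmin), len(ac) - 1)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sr / lag

# Synthetic stand-in for a recording: one second of a 400 Hz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 400.0 * t)

frames = frame_signal(tone, frame_len=640, hop=160)  # 40 ms frames, 10 ms hop
energies = frame_energy_db(frames)
f0 = estimate_f0(frames[0], sr)
```

With real recordings, comparing the per-frame energy and F0 distributions of neutral speech against those of screams would give a first view of the kind of mismatch the abstract describes.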

Year:  2017        PMID: 28464689     DOI: 10.1121/1.4979337

Source DB:  PubMed          Journal:  J Acoust Soc Am        ISSN: 0001-4966            Impact factor:   1.840


  4 in total

1.  Toward a Consensus Description of Vocal Effort, Vocal Load, Vocal Loading, and Vocal Fatigue.

Authors:  Eric J Hunter; Lady Catherine Cantor-Cutiva; Eva van Leer; Miriam van Mersbergen; Chaya Devie Nanjundeswaran; Pasquale Bottalico; Mary J Sandage; Susanna Whitling
Journal:  J Speech Lang Hear Res       Date:  2020-02-19       Impact factor: 2.297

2.  Analysis and Calibration of Lombard Effect and Whisper for Speaker Recognition.

Authors:  Finnian Kelly; John H L Hansen
Journal:  IEEE/ACM Trans Audio Speech Lang Process       Date:  2021-01-21

3.  A Moan of Pleasure Should Be Breathy: The Effect of Voice Quality on the Meaning of Human Nonverbal Vocalizations.

Authors:  Andrey Anikin
Journal:  Phonetica       Date:  2020-01-21       Impact factor: 1.759

4.  Chimpanzee vowel-like sounds and voice quality suggest formant space expansion through the hominoid lineage.

Authors:  Sven Grawunder; Natalie Uomini; Liran Samuni; Tatiana Bortolato; Cédric Girard-Buttoz; Roman M Wittig; Catherine Crockford
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2021-11-15       Impact factor: 6.237
