
Speech variability: A cross-language study on acoustic variations of speaking versus untrained singing.

John H L Hansen, Marigona Bokshi, Soheil Khorram.

Abstract

Speech production variability introduces significant challenges for existing speech technologies such as speaker identification (SID), speaker diarization, speech recognition, and language identification (ID). There has been limited research analyzing changes in acoustic characteristics for speech produced by untrained singing versus speaking. To better understand changes in speech production of the untrained singing voice, this study presents the first cross-language comparison between normal speaking and untrained karaoke singing of the same text content. Previous studies comparing professional singing versus speaking have shown deviations in both prosodic and spectral features. Some investigations also considered assigning the intrinsic activity of the singing. Motivated by these studies, a series of experiments is considered to investigate both prosodic and spectral variations of untrained karaoke singers for three languages: American English, Hindi, and Farsi. A comprehensive comparison of common prosodic features, including phoneme duration, mean fundamental frequency (F0), and formant center frequencies of vowels, was performed. Collective changes in the corresponding overall acoustic spaces were analyzed using the Kullback-Leibler distance between Gaussian probability distribution models trained on spectral features. Finally, these models were used in a Gaussian mixture model-universal background model (GMM-UBM) SID evaluation to quantify speaker changes between speaking and singing when the audio text content is the same. The experiments showed that many acoustic characteristics of untrained singing are considerably different from speaking when the text content is the same. It is suggested that these results would help advance automatic speech production normalization/compensation to improve performance of speech processing applications (e.g., speaker ID, speech recognition, and language ID).
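The abstract does not say how the Kullback-Leibler distance was computed. As a rough illustration only, the sketch below compares two Gaussian models fitted to spectral feature frames. It assumes diagonal covariances and the symmetric form KL(p||q) + KL(q||p). Both choices are assumptions, not details from the study, as are the function names and the toy feature values.

```python
import math
from statistics import fmean, pvariance


def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature vectors."""
    dims = list(zip(*frames))  # transpose: one tuple per feature dimension
    means = [fmean(d) for d in dims]
    variances = [pvariance(d, mu) for d, mu in zip(dims, means)]
    return means, variances


def kl_diag(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for diagonal-covariance Gaussians, summed over dimensions."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def symmetric_kl(p, q):
    """Symmetrized KL distance between two (means, variances) models."""
    return kl_diag(*p, *q) + kl_diag(*q, *p)


# Hypothetical 2-dimensional feature frames (placeholder values, not study data).
speaking_frames = [[0.0, 1.0], [2.0, 3.0]]
singing_frames = [[1.0, 1.0], [3.0, 3.0]]

speaking_model = fit_diag_gaussian(speaking_frames)
singing_model = fit_diag_gaussian(singing_frames)
print(symmetric_kl(speaking_model, singing_model))
```

A larger value indicates a greater shift between the speaking and singing acoustic spaces. In practice the frames would be spectral features such as cepstral coefficients, and every variance must be strictly positive for the logarithm to be defined.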

Year:  2020        PMID: 32873043      PMCID: PMC7438159          DOI: 10.1121/10.0001526

Source DB:  PubMed          Journal:  J Acoust Soc Am        ISSN: 0001-4966            Impact factor:   1.840


