Johan Sundberg (1), Gláucia Laís Salomão (2), Klaus R. Scherer (3)
1. Department of Speech, Music and Hearing, School of Electrical Engineering and Computer Science, KTH, Stockholm, Sweden; University College of Music Education Stockholm, Stockholm, Sweden. Electronic address: jsu@kth.se.
2. Department of Linguistics, Stockholm University, Stockholm, Sweden.
3. Department of Psychology, University of Geneva, Geneva, Switzerland; Department of Psychology, University of Munich, Munich, Germany.
Abstract
BACKGROUND: Acoustic aspects of emotional expressivity in speech have been analyzed extensively during recent decades. Emotional coloring is an important, if not the most important, property of sung performance and is therefore strictly controlled. Hence, the study of emotional expressivity in singing may promote deeper insight into vocal signaling of emotions. Furthermore, physiological voice source parameters can be assumed to facilitate the understanding of acoustical characteristics.
METHOD: Three highly experienced professional male singers sang scales on the vowel /ae/ or /a/ in 10 emotional colors (Neutral, Sadness, Tender, Calm, Joy, Contempt, Fear, Pride, Love, Arousal, and Anger). Sixteen voice experts classified the scales in a forced-choice listening test, and the results were compared with long-term-average spectrum (LTAS) parameters and with voice source parameters derived from flow glottograms (FLOGG) obtained by inverse filtering the audio signal.
RESULTS: On the basis of component analysis, the emotions could be grouped into four "families": Anger-Contempt, Joy-Love-Pride, Calm-Tender-Neutral, and Sad-Fear. Listeners recognized the intended emotion families with accuracy far beyond chance level. Vocal loudness had a paramount influence on all LTAS and FLOGG parameters. Even after partialing out this factor, some significant correlations were found between FLOGG and LTAS parameters. These parameters could be sorted into groups that were associated with the emotion families.
CONCLUSIONS: (i) Both LTAS and FLOGG parameters varied significantly with the enactment intentions of the singers. (ii) Some aspects of the voice source are reflected in LTAS parameters. (iii) LTAS parameters affect listener judgment of the enacted emotions and the accuracy of the intended emotional coloring.
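The step of partialing out vocal loudness can be illustrated with a minimal sketch. It assumes a linear first-order partial correlation computed by regressing loudness out of both variables and correlating the residuals; the abstract does not state which procedure the authors used. The function name `partial_corr`, the variable names, and all numbers are hypothetical and synthetic, not data from the study.

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation between x and y after linearly removing z from both.

    Assumed procedure (not taken from the paper): fit x ~ z and y ~ z by
    least squares, then correlate the two residual series.
    """
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    design = np.column_stack([np.ones_like(z), z])  # intercept + covariate
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


# Synthetic illustration only: these values are NOT measurements from the study.
rng = np.random.default_rng(0)
loudness = rng.normal(70.0, 5.0, size=200)            # hypothetical level in dB
ltas_param = 2.0 * loudness + rng.normal(size=200)    # hypothetical LTAS measure
flogg_param = -1.0 * loudness + rng.normal(size=200)  # hypothetical FLOGG measure

raw_r = float(np.corrcoef(ltas_param, flogg_param)[0, 1])
partial_r = partial_corr(ltas_param, flogg_param, loudness)
print(f"raw r = {raw_r:.2f}, partial r = {partial_r:.2f}")
```

In this synthetic case both measures depend on loudness and on nothing else they share, so the raw correlation is strong while the partial correlation is close to zero. A correlation that survives this step, as reported for some FLOGG and LTAS parameter pairs, points to a relation that loudness alone does not explain.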