
Detecting paralinguistic events in audio stream using context in features and probabilistic decisions.

Rahul Gupta, Kartik Audhkhasi, Sungbok Lee, Shrikanth Narayanan

Abstract

Non-verbal communication involves encoding, transmission and decoding of non-lexical cues and is realized using vocal (e.g. prosody) or visual (e.g. gaze, body language) channels during conversation. These cues maintain conversational flow, express emotions, and mark personality and interpersonal attitude. In particular, non-verbal cues in speech such as paralanguage and non-verbal vocal events (e.g. laughter, sighs, cries) are used to nuance meaning and convey emotions, mood and attitude. For instance, laughter is associated with affective expressions while fillers (e.g. um, ah) are used to hold the floor during a conversation. In this paper we present an automatic non-verbal vocal event detection system focusing on the detection of laughter and fillers. We extend our system presented at the Interspeech 2013 Social Signals Sub-challenge (the winning entry in the challenge) for frame-wise event detection and test several schemes for incorporating local context during detection. Specifically, we incorporate context at two separate levels in our system: (i) the raw frame-wise features and (ii) the output decisions. Furthermore, our system processes the output probabilities based on a few heuristic rules in order to reduce erroneous frame-based predictions. Our overall system achieves an Area Under the Receiver Operating Characteristics curve of 95.3% for detecting laughter and 90.4% for fillers on the test set drawn from the data specifications of the Interspeech 2013 Social Signals Sub-challenge. We perform further analysis to understand the interrelation between the features and the obtained results. Specifically, we conduct a feature sensitivity analysis and correlate it with each feature's standalone performance. The observations suggest that the trained system is more sensitive to features carrying higher discriminability, which has implications for better system design.
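The two levels of context named in the abstract, together with the post-processing of output probabilities and the AUC metric, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give window sizes, the smoothing filter, or the heuristic rules, so the moving average, the short-run masking rule, and all function and parameter names (`stack_context`, `smooth_probabilities`, `mask_short_events`, `roc_auc`, `half_width`, `min_frames`) are hypothetical and are not the authors' implementation.

```python
from typing import List, Sequence


def stack_context(frames: Sequence[Sequence[float]], half_width: int) -> List[List[float]]:
    """Feature-level context: concatenate each frame with its neighbours.

    Assumption: edges are handled by repeating the first/last frame.
    """
    n = len(frames)
    stacked = []
    for t in range(n):
        row: List[float] = []
        for offset in range(-half_width, half_width + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp index to a valid frame
            row.extend(frames[idx])
        stacked.append(row)
    return stacked


def smooth_probabilities(probs: Sequence[float], half_width: int) -> List[float]:
    """Decision-level context: moving average over frame-wise probabilities.

    Assumption: a plain moving average stands in for whatever smoothing
    the paper actually uses.
    """
    n = len(probs)
    smoothed = []
    for t in range(n):
        lo = max(t - half_width, 0)
        hi = min(t + half_width, n - 1)
        window = probs[lo:hi + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def mask_short_events(probs: Sequence[float], threshold: float, min_frames: int) -> List[float]:
    """Hypothetical heuristic rule: zero out runs of frames at or above
    `threshold` that last fewer than `min_frames` frames."""
    out = list(probs)
    n = len(out)
    t = 0
    while t < n:
        if out[t] >= threshold:
            start = t
            while t < n and out[t] >= threshold:
                t += 1
            if t - start < min_frames:
                for i in range(start, t):
                    out[i] = 0.0
        else:
            t += 1
    return out


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive frame outscores a negative
    frame, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative frame")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

In such a pipeline, a frame-wise classifier would be trained on the output of `stack_context`, and its per-frame probabilities would pass through `smooth_probabilities` and `mask_short_events` before being scored with `roc_auc` against frame-level labels. The pairwise AUC computation is quadratic in the number of frames and is meant only to make the metric concrete.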


Keywords:  Filler; Laughter; Paralinguistic event; Probability masking; Probability smoothing

Year: 2015; PMID: 28713197; PMCID: PMC5507373; DOI: 10.1016/j.csl.2015.08.003

Source DB: PubMed; Journal: Comput Speech Lang; ISSN: 0885-2308; Impact factor: 1.899


