Michalis Papakostas, Giorgos Siantikos, Theodoros Giannakopoulos, Evaggelos Spyrou, Dimitris Sgouropoulos.
Abstract
Emotion recognition plays an important role in several applications, such as human-computer interaction and understanding the affective state of users in certain tasks, e.g., within a learning process, monitoring of the elderly, interactive entertainment, etc. It may be based on several modalities, e.g., analyzing facial expressions and/or speech, or using electroencephalograms, electrocardiograms, etc. In certain applications the only available modality is the user's (speaker's) voice. In this paper we aim to analyze speakers' emotions based solely on paralinguistic information, i.e., not depending on the linguistic aspect of speech. We compare two machine learning approaches, namely a Convolutional Neural Network and a Support Vector Machine. The former is trained using raw speech information, while the latter is trained on a set of extracted low-level features. Aiming to provide a multilingual approach, the training and testing datasets contain speech from different languages.
Keywords: Convolutional neural networks; Emotion recognition; Speech information; Support vector machines; Transfer learning
Year: 2017 PMID: 28971424 DOI: 10.1007/978-3-319-57348-9_13
Source DB: PubMed Journal: Adv Exp Med Biol ISSN: 0065-2598 Impact factor: 2.622