Abstract
In recent years, intelligent machines such as smart wristbands, sweeping robots, and intelligent learning machines have become increasingly common in daily life, although most of them can only complete a single, simple task. We want robots to be as emotional as humans, so that human-computer interaction becomes more natural, smooth, and intelligent. Emotion research has therefore become a hot topic among researchers. In this paper, we propose a new dance emotion recognition method based on the fusion of global and local features. Extracting a single audio feature cannot reflect the global information of a dance, and the resulting feature dimension is very high. We therefore use an improved long short-term memory (LSTM) method to extract global dance information and linear prediction coefficients (LPC) to extract local information. Considering the complementarity of the different features, we propose a global and local feature fusion method based on discriminant multi-canonical correlation analysis. Experimental results on public data sets show that the proposed method identifies dance emotion effectively compared with other state-of-the-art emotion recognition methods.
Keywords: LSTM; dance emotion recognition; feature fusion; linear prediction coefficient; robot
Year: 2022 PMID: 36091417 PMCID: PMC9449463 DOI: 10.3389/fnbot.2022.998568
Source DB: PubMed Journal: Front Neurorobot ISSN: 1662-5218 Impact factor: 3.493
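The abstract names linear prediction coefficients (LPC) as the local feature. As a rough illustration only, the sketch below computes LPC for one audio frame with the autocorrelation method and the Levinson-Durbin recursion. The function names, the frame, and the prediction order are assumptions made for this example. The paper's own extraction pipeline (Figure 2), including any framing, windowing, or pre-emphasis, is not given in this record and is not reproduced here.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (hypothetical helper)."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Return (coeffs, error) with coeffs = [a1, ..., a_order] such that
    x[n] is approximated by sum_k a_k * x[n - k].

    Uses the Levinson-Durbin recursion on the frame's autocorrelation.
    The order is an assumption; the paper's setting is not stated here.
    """
    r = autocorrelation(frame, order)
    if r[0] == 0:
        # Silent frame: no energy, so return all-zero coefficients.
        return [0.0] * order, 0.0

    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]               # prediction error energy at order 0
    for i in range(1, order + 1):
        # Reflection coefficient for the step from order i-1 to order i.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


if __name__ == "__main__":
    # Toy frame: a decaying exponential, for which a1 should be close to 0.5.
    frame = [0.5 ** n for n in range(50)]
    coeffs, err = lpc(frame, 1)
    print(coeffs, err)
```

In a full pipeline, a vector like `coeffs` would be computed per frame and passed on as the local feature; how the paper combines it with the LSTM features is described only at the level of the abstract.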
Figure 1. Flow chart of the proposed dance emotion recognition method.
Figure 2. LPC coefficient extraction process.
Figure 3. LPMFCC parameter feature extraction process.
Figure 4. Waveform prediction model based on LSTM.
Figure 5. Relation between recognition rate and dimension.
Figure 6. STMWLD dance video experiment result.
Figure 8. LSTM dance video experiment result.
Figure 9. LSTM + LPC dance video experiment result.
Expression recognition rate of each feature extraction algorithm (%).
| Algorithm | | | |
|---|---|---|---|
| STMWLD | 32.74 | 55.43 | 68.34 |
| LPC | 33.65 | 63.34 | 72.51 |
| LPC + STMWLDCNN | 34.56 | 62.51 | 73.34 |
| LSTM + LPC | | | |
The bold values indicate the best values obtained by the proposed method.
Comparison between different methods and the feature fusion method in this paper.
| Method | | |
|---|---|---|
| HOG | 35.81 | 4.6 |
| CNN | 63.21 | 2.5 |
| Att-Net (Kwon, | 75.13 | 2.1 |
| CTNet (Lian et al., | 76.89 | 1.7 |
| CCML (Zehra et al., | 73.54 | 1.3 |
| Proposed | | |
The bold values indicate the best values obtained by the proposed method.