Kana Miyamoto, Hiroki Tanaka, Satoshi Nakamura.
Abstract
Music is often used for emotion induction. Since the emotions felt when listening to it vary from person to person, customized music is required. Our previous work designed a music generation system that created personalized music based on participants' emotions predicted from EEG data. Although our system effectively induced emotions, it suffered from two problems. The first is that a long EEG recording is required to train emotion prediction models. In this paper, we trained models with a small amount of EEG data: we proposed emotion prediction with meta-learning and compared its performance with two other training methods. The second problem is that the generated music failed to consider the participants' emotions before they listened to it. We addressed this by constructing a system based on the iso principle, which gradually changes the music from a state close to the participants' current emotions toward the target emotion. Our results showed that emotion prediction with meta-learning had the lowest RMSE among the three methods (p < 0.016). Both the music generation system based on the iso principle and our conventional music generation system induced emotions more effectively than music generation that ignored the participants' emotions (p < 0.016).
Keywords: electroencephalogram (EEG); emotion induction; emotion prediction; iso principle; meta-learning; music generation
Year: 2022 PMID: 35733939 PMCID: PMC9207201 DOI: 10.3389/fdgth.2022.873822
Source DB: PubMed Journal: Front Digit Health ISSN: 2673-253X
Figure 1. Model structure: the upper part is a CNN that predicts emotion from EEG data; the right part is a neural network that uses the emotions predicted by the CNN and the music generator's inputs for emotion prediction.
Structures of the CNN and neural network: Conv is a convolutional layer, BN is a batch normalization layer, FC is a fully connected layer, and Drop is a drop-out layer.
| Model | Layer | Kernel size | Filters/units | Stride | Dropout rate |
|---|---|---|---|---|---|
| CNN | Conv+BN+ReLU | 2×2 | 8 | 1 | – |
| | Conv+BN+ReLU | 2×2 | 8 | 1 | – |
| | Conv+BN+ReLU | 2×2 | 8 | 1 | – |
| | FC | – | 2 | – | – |
| Neural network | FC+ReLU+Drop | – | 8 | – | 0.2 |
| | FC | – | 2 | – | – |
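For concreteness, a minimal PyTorch sketch of the layer structure in the table is given below. The EEG input shape, the CNN's single input channel, and the two-dimensional (valence, arousal) music-generator inputs are assumptions; this record does not give the exact tensor sizes.

```python
import torch
import torch.nn as nn

class EmotionCNN(nn.Module):
    """CNN from the table: three Conv+BN+ReLU blocks, then FC -> (val, aro)."""
    def __init__(self, in_channels: int = 1):  # assumed single-channel EEG map
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=2, stride=1),  # 2x2, 8 filters
            nn.BatchNorm2d(8), nn.ReLU(),
            nn.Conv2d(8, 8, kernel_size=2, stride=1),
            nn.BatchNorm2d(8), nn.ReLU(),
            nn.Conv2d(8, 8, kernel_size=2, stride=1),
            nn.BatchNorm2d(8), nn.ReLU(),
        )
        self.head = nn.LazyLinear(2)  # FC with 2 outputs: (valence, arousal)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).flatten(1))

class FusionNet(nn.Module):
    """Neural network from the table: refines the CNN prediction using the
    music generator's inputs (assumed 2-dimensional val/aro values)."""
    def __init__(self, music_dim: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 + music_dim, 8), nn.ReLU(), nn.Dropout(0.2),  # FC+ReLU+Drop
            nn.Linear(8, 2),  # FC -> (valence, arousal)
        )

    def forward(self, cnn_pred: torch.Tensor, music_inputs: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([cnn_pred, music_inputs], dim=-1))
```

The FusionNet mirrors Figure 1: the CNN's two-dimensional prediction is concatenated with the generator's inputs and refined through an 8-unit hidden layer with 0.2 dropout.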
MAML for emotion prediction using EEG data.
| 1: Randomly initialize θ |
| 2: Sample training tasks T_i |
| 3: while not converged do |
| 4: for each training task T_i do |
| 5: Select data of 20 pieces of music |
| 6: Evaluate ∇_θ L_{T_i}(f_θ) on the selected data |
| 7: Update parameters: θ′_i = θ − α∇_θ L_{T_i}(f_θ) |
| 8: Select data of about 21 pieces of music |
| 9: end for |
| 10: Update θ ← θ − β∇_θ Σ_{T_i} L_{T_i}(f_{θ′_i}) using the data from step 8 |
| 11: end while |
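The loop above follows the standard MAML procedure. The sketch below uses the first-order approximation for brevity (full MAML also backpropagates through the inner update); the learning rates, the MSE loss, and the task iterator are assumptions not taken from this record.

```python
import copy
import torch
import torch.nn.functional as F

def maml_train(model, tasks, inner_lr=0.01, outer_lr=0.001, meta_steps=1000):
    """tasks yields (support, query) pairs per source participant, where each
    is an (eeg, felt_emotion) tensor batch (20 vs. about 21 music pieces)."""
    meta_opt = torch.optim.Adam(model.parameters(), lr=outer_lr)   # 1: init theta
    for _ in range(meta_steps):                                    # 3: while not converged
        meta_opt.zero_grad(set_to_none=True)
        for (x_s, y_s), (x_q, y_q) in tasks:                       # 4: for each task
            fast = copy.deepcopy(model)                            # task-specific copy of theta
            loss_s = F.mse_loss(fast(x_s), y_s)                    # 6: inner loss on support set
            grads = torch.autograd.grad(loss_s, list(fast.parameters()))
            with torch.no_grad():                                  # 7: theta' = theta - alpha * grad
                for p, g in zip(fast.parameters(), grads):
                    p -= inner_lr * g
                    p.grad = None
            loss_q = F.mse_loss(fast(x_q), y_q)                    # query loss at adapted params
            loss_q.backward()
            with torch.no_grad():                                  # accumulate meta-gradient
                for p, fp in zip(model.parameters(), fast.parameters()):
                    p.grad = fp.grad.clone() if p.grad is None else p.grad + fp.grad
        meta_opt.step()                                            # 10: meta-update of theta
    return model
```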
Figure 2. Three emotion prediction methods using a small amount of EEG data from a single target participant.
Participants' mean and standard deviation (in parentheses) of RMSEs between felt and predicted emotions using EEG data: Bold indicates an RMSE of the proposed method with a significant difference from the baseline methods.
| Model | Val | Aro | Val | Aro | Val | Aro | Val | Aro |
|---|---|---|---|---|---|---|---|---|
| A | 0.298 (0.121) | 0.298 (0.071) | 0.275 (0.101) | 0.290 (0.071) | 0.262 (0.096) | 0.285 (0.066) | 0.256 (0.098) | 0.274 (0.058) |
| B | 0.347 (0.122) | 0.328 (0.082) | 0.325 (0.103) | 0.323 (0.080) | 0.318 (0.099) | 0.320 (0.077) | 0.312 (0.101) | 0.308 (0.067) |
| C | 0.378 (0.080) | 0.391 (0.079) | 0.355 (0.084) | 0.366 (0.068) | 0.338 (0.071) | 0.354 (0.070) | 0.331 (0.070) | 0.344 (0.069) |
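As an illustration of the metric reported in the table above, a per-participant RMSE between felt and predicted valence/arousal can be computed as below; the paper's exact aggregation across music pieces is not specified in this record.

```python
import numpy as np

def rmse(felt: np.ndarray, predicted: np.ndarray) -> np.ndarray:
    """felt, predicted: arrays of shape (n_samples, 2) for (valence, arousal).
    Returns the per-dimension RMSE as array([rmse_val, rmse_aro])."""
    return np.sqrt(np.mean((felt - predicted) ** 2, axis=0))

felt = np.array([[0.6, 0.4], [0.8, 0.2], [0.5, 0.5]])  # hypothetical ratings
pred = np.array([[0.5, 0.5], [0.7, 0.3], [0.6, 0.4]])  # hypothetical predictions
print(rmse(felt, pred))  # -> [0.1 0.1]
```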
Figure 3. Box plots of 20 participants' RMSEs between felt and predicted emotions using EEG data.
Participants' mean and standard deviation of RMSEs between felt and predicted emotions using EEG data and the music generator's inputs: Music gen. indicates emotion prediction using the music generator's inputs.
Bold indicates an RMSE with a significant difference from both the prediction using model A and the prediction using the music generator's inputs.
Figure 4. Box plots of 20 participants' RMSEs between felt and predicted emotions using EEG data and the music generator's inputs: Music gen. indicates emotion prediction using the music generator's inputs.
Figure 5. Emotion induction system using meta-learning: red text indicates the methods newly implemented in this paper.
Figure 6. Experimental protocol.
Update music generator's inputs.
| 1: Record a 1 s EEG during the silent state |
| 2: Predict the emotion before listening to music using the EEG |
| 3: if using the iso principle then |
| 4: Set the music generator's inputs to the participant's emotion before listening to music |
| 5: else |
| 6: Set the music generator's inputs to the target emotion |
| 7: end if |
| 8: while the music continues do |
| 9: Start generating music using the music generator's inputs |
| 10: Record a 1 s EEG |
| 11: Predict the current emotion using the EEG |
| 12: if using the first update method then |
| 13: Update the music generator's inputs using formulas (2) and (4) |
| 14: else if using the second update method then |
| 15: Update the music generator's inputs using formulas (5) and (6) |
| 16: else if using the third update method then |
| 17: Update the music generator's inputs using formulas (7) and (8) |
| 18: end if |
| 19: end while |
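A sketch of how the per-second loop above could look in code follows. Formulas (2) through (8) are not reproduced in this record, so `update_inputs` below is an illustrative iso-principle-style step, and `record_eeg_1s`, `predict_emotion`, and `generate_music` are stubs standing in for the real EEG and music interfaces.

```python
import numpy as np

rng = np.random.default_rng(0)

def record_eeg_1s():
    """Stub for a 1 s EEG recording (hypothetical 32 channels x 256 samples)."""
    return rng.normal(size=(32, 256))

def predict_emotion(eeg):
    """Stub for the CNN + neural-network prediction of (valence, arousal)."""
    return np.clip(rng.normal(0.5, 0.1, size=2), 0.0, 1.0)

def generate_music(inputs):
    """Stub for one second of music generation from (valence, arousal) inputs."""
    pass

def update_inputs(inputs, current_emotion, target, rate=0.1):
    """Illustrative iso-principle-style step: nudge the generator's inputs
    from the listener's current emotion toward the target emotion."""
    return inputs + rate * (np.asarray(target) - np.asarray(current_emotion))

def induction_loop(target, use_iso=True, seconds=180):
    initial = predict_emotion(record_eeg_1s())           # 1-2: emotion in silence
    inputs = np.asarray(initial if use_iso else target)  # 3-7: initial inputs
    for _ in range(seconds):                             # 8-19: per-second loop
        generate_music(inputs)                           # 9: play with current inputs
        current = predict_emotion(record_eeg_1s())       # 10-11: current emotion
        inputs = update_inputs(inputs, current, target)  # 12-18: update rule
    return inputs

induction_loop(target=(0.875, 0.125))  # target emotion from Figure 8
```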
Figure 7. Conceptual scheme of the control of the music generator using the three methods: Target denotes the target emotion for emotion induction; Initial denotes the participant's emotion before listening to music.
RMSE between felt and predicted emotions before or after listening to music in the current system: Bold indicates the performance of the CNN and neural network used by the system to generate music.
| Participant | Val | Aro | Val | Aro | Val | Aro |
|---|---|---|---|---|---|---|
| 1 | 0.183 | 0.117 | 0.159 | 0.162 | 0.144 | 0.193 |
| 2 | 0.170 | 0.317 | 0.259 | 0.198 | 0.126 | 0.155 |
| 3 | 0.117 | 0.203 | 0.297 | 0.329 | 0.265 | 0.311 |
| 4 | 0.163 | 0.196 | 0.221 | 0.239 | 0.168 | 0.164 |
| 5 | 0.301 | 0.199 | 0.458 | 0.379 | 0.321 | 0.191 |
| 6 | 0.200 | 0.183 | 0.233 | 0.163 | 0.132 | 0.163 |
| 7 | 0.251 | 0.264 | 0.490 | 0.358 | 0.200 | 0.225 |
| 8 | 0.317 | 0.167 | 0.318 | 0.322 | 0.253 | 0.326 |
| 9 | 0.190 | 0.248 | 0.192 | 0.210 | 0.152 | 0.155 |
| 10 | 0.242 | 0.203 | 0.332 | 0.356 | 0.196 | 0.200 |
| Mean | 0.213 | 0.210 | 0.296 | 0.272 | 0.196 | 0.208 |
| SD | 0.063 | 0.055 | 0.109 | 0.086 | 0.065 | 0.062 |
Distance between the target and induced emotions: Bold indicates a distance with a significant difference from the baseline method.
| Participant | | | |
|---|---|---|---|
| 1 | 0.483 | 0.496 | 0.450 |
| 2 | 0.399 | 0.371 | 0.401 |
| 3 | 0.387 | 0.445 | 0.348 |
| 4 | 0.318 | 0.345 | 0.418 |
| 5 | 0.385 | 0.353 | 0.378 |
| 6 | 0.299 | 0.296 | 0.393 |
| 7 | 0.234 | 0.190 | 0.362 |
| 8 | 0.267 | 0.284 | 0.318 |
| 9 | 0.309 | 0.287 | 0.340 |
| 10 | 0.224 | 0.303 | 0.338 |
| Mean | 0.331 | 0.337 | 0.375 |
| SD | 0.082 | 0.087 | 0.041 |
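The table's per-participant values can be read as distances in the valence-arousal plane. A minimal sketch, assuming a Euclidean distance (the paper's exact metric is not reproduced in this record):

```python
import numpy as np

target = np.array([0.875, 0.125])        # target {val, aro} from Figure 8
induced = np.array([0.60, 0.30])         # hypothetical induced emotion
print(np.linalg.norm(target - induced))  # ~0.326
```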
Figure 8. Plots of the music generator's inputs and the emotions predicted by the CNN and neural network for participant 8: the target emotion is {val, aro} = {0.875, 0.125}.