Yue Zhao, Jiancheng Xu.
Abstract
Human beings are particularly inclined to express real emotions through micro-expressions, which have subtle amplitude and short duration. Although people regularly recognize many distinct emotions, most research has been limited to six basic categories: happiness, surprise, sadness, anger, fear, and disgust. As with normal expressions (i.e., macro-expressions), most current research into micro-expression recognition focuses on these six basic emotions. This paper describes an important group of micro-expressions, which we call compound emotion categories. Compound micro-expressions are constructed by combining two basic micro-expressions, but they reflect more complex mental states and a richer range of human facial emotions. In this study, we first synthesized a Compound Micro-expression Database (CMED) based on existing spontaneous micro-expression datasets. The subtlety of micro-expressions makes it difficult to observe their motion tracks and characteristics; consequently, synthesizing compound micro-expression images poses many challenges and limitations. The proposed method first applied Eulerian Video Magnification (EVM) to enhance the facial motion features of basic micro-expressions for generating compound images. The consistent and differential facial muscle articulations (typically referred to as action units) associated with each emotion category were labeled to form the foundation for generating compound micro-expressions. Second, we extracted the apex frames of CMED using a 3D Fast Fourier Transform (3D-FFT). The proposed method then calculated the optical flow between the onset frame and the apex frame to produce an optical flow feature map. Finally, we designed a shallow network to extract high-level features from these optical flow maps.
We used four existing databases of spontaneous micro-expressions (CASME I, CASME II, CAS(ME)2, SAMM) to generate the CMED and to test the validity of our network. The results indicate that the network framework designed in this study can recognize the emotional information of both basic and compound micro-expressions.
Keywords: 3D-FFT; CNN; EVM; FACS; TV-L1 optical flow; compound micro-expressions
Year: 2019 PMID: 31888182 PMCID: PMC6960609 DOI: 10.3390/s19245553
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
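The motion-magnification idea named in the abstract can be illustrated on a single pixel's intensity over time. This is only a minimal sketch of the linear Eulerian principle (output = input + α × temporally band-passed input); it leaves out the spatial pyramid decomposition that full EVM uses. The function name `magnify_pixel_series`, the filter rates `r_fast` and `r_slow`, and the sample values are assumptions chosen for illustration, not settings taken from the paper.

```python
def magnify_pixel_series(series, alpha, r_fast=0.4, r_slow=0.05):
    """Amplify small temporal changes in one pixel's intensity series.

    The temporal band-pass is the difference of two first-order low-pass
    (exponential moving average) filters. Rates are illustrative assumptions.
    """
    if not series:
        return []
    # Start both filters at the first sample so a constant signal yields no band-pass output.
    fast = slow = float(series[0])
    magnified = []
    for value in series:
        fast = r_fast * value + (1.0 - r_fast) * fast
        slow = r_slow * value + (1.0 - r_slow) * slow
        band = fast - slow  # temporal band-passed component
        magnified.append(value + alpha * band)
    return magnified


# Hypothetical intensity trace: a brief, subtle rise and fall.
trace = [100, 100, 101, 102, 101, 100, 100]
for factor in (0, 8, 16):
    print(factor, [round(v, 2) for v in magnify_pixel_series(trace, factor)])
```

With a magnification factor of 0 the trace is returned unchanged, and larger factors exaggerate the transient, which is the trade-off examined when comparing magnification factors.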
Figure 1. The Compound Facial Expressions of Emotion (CFEE).
Figure 2. The framework of the proposed method.
Figure 3. Compound facial expressions in real environments (left: “disgustedly surprised”, right: “fearfully surprised”).
Prototypical AUs of 18 emotions (6 basic emotions and 12 compound emotions) described by Martinez [10].
| Emotion | Prototypical AUs |
|---|---|
| Happiness | 12, 25, 6 |
| Sadness | 4, 15 [1 (60%), 6 (50%), 11 (26%), 17 (67%)] |
| Fear | 1, 4, 20, 25 [2 (57%), 5 (63%), 26 (33%)] |
| Anger | 4, 7, 24 [10 (26%), 17 (52%), 23 (29%)] |
| Surprise | 1, 2, 25, 26 [5 (66%)] |
| Disgust | 9, 10, 17 [4 (31%), 24 (26%)] |
| Happily surprised | 1, 2, 12, 25 [5 (64%), 26 (67%)] |
| Happily disgusted | 10, 12, 25 [4 (32%), 6 (61%), 9 (59%)] |
| Sadly fearful | 1, 4, 20, 25 [2 (46%), 5 (24%), 6 (34%), 15 (30%)] |
| Sadly angry | 4, 15 [6 (26%), 7 (48%), 11 (20%), 17 (50%)] |
| Sadly surprised | 1, 4, 25, 26 [2 (27%), 6 (31%)] |
| Sadly disgusted | 4, 10, 25 [1 (49%), 6 (61%), 9 (20%), 11 (35%), 15 (54%), 17 (47%)] |
| Fearfully angry | 4, 20, 25 [5 (40%), 7 (39%), 10 (30%)] |
| Fearfully surprised | 1, 2, 5, 20, 25 [4 (47%), 26 (51%)] |
| Fearfully disgusted | 1, 4, 10, 20, 25 [2 (64%), 5 (50%), 9 (28%), 15 (33%)] |
| Angrily surprised | 4, 25, 26 [5 (35%), 7 (50%), 10 (34%)] |
| Angrily disgusted | 4, 10, 17 [7 (60%), 9 (57%), 24 (36%)] |
| Disgustedly surprised | 1, 2, 5, 10 [4 (45%), 9 (37%), 17 (66%), 24 (33%)] |
Prototypical AUs of 6 basic MEs.
| Emotion | Prototypical AUs |
|---|---|
| Happiness | 6, 12 |
| Sadness | 1, 4, 15 |
| Fear | 1, 4, 20 |
| Anger | 4, 7, 43 |
| Disgust | 4, 7, 9, 25, 26 |
| Surprise | 1, 2, 5 |
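The notion of consistent and differential action units can be made concrete with set operations on the basic ME table above. The sketch below is an illustration only: reading "consistent" as the intersection and "differential" as the symmetric difference of two AU sets is an assumption, the helper `compare_aus` is hypothetical, and the plain union is not claimed to be the labeling rule used to build CMED.

```python
# Prototypical AUs of the six basic MEs, copied from the table above.
BASIC_ME_AUS = {
    "Happiness": {6, 12},
    "Sadness": {1, 4, 15},
    "Fear": {1, 4, 20},
    "Anger": {4, 7, 43},
    "Disgust": {4, 7, 9, 25, 26},
    "Surprise": {1, 2, 5},
}


def compare_aus(first, second):
    """Return shared, differing, and combined AUs for two basic MEs."""
    a, b = BASIC_ME_AUS[first], BASIC_ME_AUS[second]
    return {
        "consistent": sorted(a & b),    # AUs present in both emotions
        "differential": sorted(a ^ b),  # AUs present in exactly one of them
        "union": sorted(a | b),         # all AUs from either emotion
    }


print(compare_aus("Sadness", "Fear"))
print(compare_aus("Happiness", "Surprise"))
```

Pairs such as sadness and fear share most of their AUs, while happiness and surprise share none, which hints at why some compound categories are harder to separate than others.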
Figure 4. The generation process of CMED: (a) Description of Positively Surprised; (b) Description of Positively Negative; (c) Description of Negatively Surprised; (d) Description of Negatively Negative.
Figure 5. The compound micro-expression database.
Figure 6. Comparison of ME sequences at different magnification factors.
Figure 7. Optical flow maps of six MEs in the CASME II database.
Figure 8. Overall framework of the proposed network.
Basic information of databases used in experiment.
| | | CASME I | CASME II | CAS(ME)2 | SMIC-HS | SAMM |
|---|---|---|---|---|---|---|
| Year | | 2013 | 2014 | 2016 | 2013 | 2018 |
| Participants | | 19 | 24 | 22 | 16 | 28 |
| Frame rate (fps) | | 60 | 200 | 30 | 100 | 200 |
| FACS coded | | Yes | Yes | Yes | No | Yes |
| Face resolution | | 150 × 190 | 280 × 340 | 190 × 230 | 130 × 160 | 960 × 650 |
| Emotion classes | | 7 | 5 | 4 | 3 | 7 |
| Expression | Negative | 52 | 88 | 28 | 70 | 91 |
| | Positive | 9 | 32 | 16 | 51 | 26 |
| | Surprise | 20 | 25 | 10 | 43 | 15 |
| | Total | 81 | 145 | 54 | 164 | 132 |
| Ground-truth (index) | Onset | Yes | Yes | Yes | Yes | Yes |
| | Apex | Yes | Yes | Yes | No | Yes |
| | Offset | Yes | Yes | Yes | Yes | Yes |
Compound ME database.
| | | CASME I | CASME II | CAS(ME)2 | SAMM | CMED |
|---|---|---|---|---|---|---|
| Pos | Happiness | 9 | 32 | 15 | 26 | 82 |
| Neg | Disgust | 44 | 64 | 16 | 9 | 233 |
| | Fear | 2 | 2 | 4 | 8 | |
| | Anger | - | - | 7 | 57 | |
| | Sadness | 6 | 7 | 1 | 6 | |
| Sur | Surprise | 20 | 25 | 10 | 15 | 70 |
| PS | Happily surprised | 16 | 18 | 20 | 20 | 74 |
| NS | Sadly surprised | 5 | 19 | - | 7 | 236 |
| | Fearfully surprised | - | - | 8 | 8 | |
| | Angrily surprised | - | - | 12 | 26 | |
| | Disgustedly surprised | 62 | 73 | 16 | - | |
| PN | Happily disgusted | 6 | 143 | 13 | 35 | 197 |
| NN | Sadly fearful | 2 | - | - | 7 | 158 |
| | Sadly angry | - | - | - | 18 | |
| | Sadly disgusted | 28 | 52 | - | 1 | |
| | Fearfully angry | - | - | 2 | 2 | |
| | Fearfully disgusted | - | - | 5 | - | |
| | Angrily disgusted | - | - | 10 | 31 | |
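The CMED column appears to give one total per group (Pos, Neg, Sur, PS, NS, PN, NN) rather than one per emotion. A short arithmetic check of that reading is sketched below. The counts are copied from the table; treating "-" as zero samples and the helper name `group_total` are assumptions made for this check.

```python
# Per-emotion counts in the order CASME I, CASME II, CAS(ME)2, SAMM,
# followed by the CMED figure reported for the group. "-" is read as 0 (assumption).
CMED_GROUPS = {
    "Pos": ([[9, 32, 15, 26]], 82),
    "Neg": ([[44, 64, 16, 9], [2, 2, 4, 8], [0, 0, 7, 57], [6, 7, 1, 6]], 233),
    "Sur": ([[20, 25, 10, 15]], 70),
    "PS": ([[16, 18, 20, 20]], 74),
    "NS": ([[5, 19, 0, 7], [0, 0, 8, 8], [0, 0, 12, 26], [62, 73, 16, 0]], 236),
    "PN": ([[6, 143, 13, 35]], 197),
    "NN": ([[2, 0, 0, 7], [0, 0, 0, 18], [28, 52, 0, 1],
            [0, 0, 2, 2], [0, 0, 5, 0], [0, 0, 10, 31]], 158),
}


def group_total(rows):
    """Sum every per-database count of every emotion in one group."""
    return sum(sum(row) for row in rows)


for name, (rows, reported) in CMED_GROUPS.items():
    print(name, group_total(rows), reported, group_total(rows) == reported)
```

Under these assumptions every group sum matches the reported CMED figure, so the column can be read as a per-group sample count.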
Figure 9. Recognition performance using different magnification factors.
Figure 10. Comparison of different magnification methods.
Figure 11. Optical flow feature maps with different λ and .
Network structure.
| Layer | Filter Size | Stride | Output Size | Dropout |
|---|---|---|---|---|
| Input | | | | |
| Conv-1 | | 1 | | |
| Conv-2 | | 1 | | |
| Pool-1 | | 2 | | |
| Conv-3 | | 1 | | |
| Pool-2 | | 2 | | |
| Conv-3 | | 1 | | |
| Pool-3 | | 2 | | |
| FC-1 | | | | 70% |
| FC-2 | | | | 70% |
| Output | | | | |
Recognition accuracy and F1-measure evaluated on basic/compound ME databases.
| Epoch | Basic ME Accuracy (%) | Basic ME F1-Measure | CMED Accuracy (%) | CMED F1-Measure |
|---|---|---|---|---|
| 100 | 76.19 | 0.7304 | 66.06 | 0.6353 |
| 300 | 78.52 | 0.7518 | 66.93 | 0.6384 |
| 500 | 80.64 | 0.7724 | 67.15 | 0.6418 |
| 1000 | 77.03 | 0.7401 | 65.14 | 0.6267 |
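Both metrics in the table above can be derived from a confusion matrix. The sketch below shows one common way to do so. The source does not say how the F1-measure was averaged across classes, so the macro average used here is an assumption, and both the function name `accuracy_and_macro_f1` and the example matrix are hypothetical.

```python
def accuracy_and_macro_f1(confusion):
    """Compute accuracy and macro-averaged F1 from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    correct = sum(confusion[i][i] for i in range(n))
    f1_scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(confusion[k]) - tp                       # true k, predicted class differs
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return correct / total, sum(f1_scores) / n


# Hypothetical two-class example (not data from the paper).
acc, f1 = accuracy_and_macro_f1([[8, 2], [1, 9]])
print(f"accuracy = {acc * 100:.2f}%, macro F1 = {f1:.4f}")
```

The function returns accuracy as a fraction, so it is multiplied by 100 to match the percentage scale used in the table.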
Figure 12. Recognition performance using different input graphs on CMED.
Recognition performance: Accuracy (%) on the CASME II, SMIC, and SAMM databases for state-of-the-art methods and the proposed method.
| Method | CASME II (5 Classes) | SMIC (3 Classes) | SAMM (7 Classes) |
|---|---|---|---|
| LBP-TOP | 39.68 | 43.73 | 35.56 |
| LBP-SIP | 43.32 | 54.88 | - |
| STLBP-IP | 59.51 | 57.93 | - |
| STCLQP | 58.39 | 64.02 | - |
| OSF | - | 31.98 | - |
| OSW | 41.7 | 53.05 | - |
| MDMO | 44.25 | - | - |
| Bi-WOOF | 57.89 | 61.59 | 51.39 |
| AlexNet | 83.12 | 63.73 | 66.42 |
| GoogLeNet | 64.14 | 55.11 | 59.92 |
| VGG 16 | 82.02 | 59.64 | 47.93 |
| OFF-ApexNet | 86.81 | 66.95 | 53.92 |
| STSTNet | 86.86 | 70.13 | 68.1 |
| Proposed method | 87.01 | 69.79 | 70.18 |
Properties of the neural networks.
| Network | Depth | Image Input Size | Execution Time (s) |
|---|---|---|---|
| AlexNet | 8 | | 12.9007 |
| GoogLeNet | 22 | | 29.3002 |
| VGG 16 | 16 | | 95.4436 |
| OFF-ApexNet | 5 | | 5.5632 |
| STSTNet | 2 | | 5.7366 |
| Proposed method | 11 | | 10.9402 |
Figure 13. The confusion matrices: (a) the basic ME database; (b) the CMED.