Learning Facial Action Units with Spatiotemporal Cues and Multi-label Sampling.

Wen-Sheng Chu¹, Fernando De la Torre¹, Jeffrey F Cohn².

Abstract

Facial action units (AUs) may be represented spatially, temporally, and in terms of their correlation. Previous research has focused on one or another of these aspects or addressed them disjointly. We propose a hybrid network architecture that jointly models spatial and temporal representations and their correlation. In particular, we use a Convolutional Neural Network (CNN) to learn spatial representations and a Long Short-Term Memory (LSTM) network to model temporal dependencies among them. The outputs of the CNNs and LSTMs are aggregated in a fusion network to produce per-frame predictions of multiple AUs. The hybrid network was compared to previous state-of-the-art approaches on two large FACS-coded video databases, GFT and BP4D, with over 400,000 AU-coded frames of spontaneous facial behavior in varied social contexts. Relative to standard multi-label CNN and feature-based state-of-the-art approaches, the hybrid system reduced person-specific biases and obtained increased accuracy for AU detection. To address class imbalance within and between batches while training the network, we introduce multi-label sampling strategies that further increase accuracy when AUs are relatively sparse. Finally, we provide visualizations of the learned AU models, which, to the best of our knowledge, reveal for the first time how machines see AUs.
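
The abstract mentions sampling strategies for class imbalance but does not spell them out. As a rough illustration of the general idea only (not the authors' procedure), the sketch below draws a training batch by cycling through AUs so that sparse AUs contribute as many positive frames as frequent ones. The function name sample_balanced_batch, the set-per-frame label format, and the round-robin rule are all assumptions made for this example.

```python
import random
from collections import defaultdict


def sample_balanced_batch(labels, batch_size, rng=None):
    """Pick frame indices so each AU is the sampling target equally often.

    labels: list of sets; labels[i] holds the AU ids active in frame i
            (hypothetical format chosen for this sketch).
    Returns a list of frame indices, sampled with replacement.
    """
    rng = rng or random.Random()

    # Group frame indices by every AU that is active in them.
    frames_by_au = defaultdict(list)
    for idx, active in enumerate(labels):
        for au in active:
            frames_by_au[au].append(idx)

    aus = sorted(frames_by_au)
    if not aus:
        raise ValueError("no positive labels to sample from")

    batch = []
    for k in range(batch_size):
        au = aus[k % len(aus)]                      # round-robin over AUs
        batch.append(rng.choice(frames_by_au[au]))  # one positive frame for it
    return batch


# Toy data: AU 12 is active in 99 of 100 frames, AU 4 in only 2.
labels = [{12}] * 98 + [{4}, {4, 12}]
batch = sample_balanced_batch(labels, 8, random.Random(0))
au4_count = sum(1 for i in batch if 4 in labels[i])
```

Because frames carry several labels at once, the balance is only approximate: a frame drawn for one AU also brings its co-occurring AUs into the batch. This sketch also never selects frames with no active AU, which a real training pipeline would need to handle separately.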

Keywords: Multi-label learning; deep learning; facial action unit detection; multi-label sampling; spatio-temporal learning; video analysis

Year: 2018 PMID: 30524157 PMCID: PMC6277040 DOI: 10.1016/j.imavis.2018.10.002

Source DB: PubMed Journal: Image Vis Comput ISSN: 0262-8856 Impact factor: 2.818

14 in total

1. Meta-Analysis of the First Facial Expression Recognition Challenge.

2. Automatic Analysis of Facial Affect: A Survey of Registration, Representation, and Recognition.

3. Entropy and distance of random graphs with application to structural pattern recognition.

4. Facial Action Unit Event Detection by Cascade of Tasks.

5. A Model of the Perception of Facial Expressions of Emotion by Humans: Research Overview and Perspectives.

6. Context-Sensitive Dynamic Ordinal Regression for Intensity Estimation of Facial Action Units.

7. Dynamic Cascades with Bidirectional Bootstrapping for Action Unit Detection in Spontaneous Facial Behavior.

8. Facial action unit recognition by exploiting their dynamic and semantic relationships.

9. Joint Patch and Multi-label Learning for Facial Action Unit Detection.

10. Compound facial expressions of emotion: from basic research to clinical applications. (Review)

1. Unmasking the Devil in the Details: What Works for Deep Facial Action Coding?