| Literature DB >> 32240406 |
Issam Dagher, Elio Dahdah, Morshed Al Shakik.
Abstract
Herein, a three-stage support vector machine (SVM) for facial expression recognition is proposed. The first stage comprises 21 SVMs, one for each binary combination of the seven expressions. If one expression is dominant, the first stage suffices; if two are dominant, the second stage is used; and if three are dominant, the third stage is used. These multilevel stages reduce the chance of misclassification as much as possible. Several image-preprocessing steps ensure that the features extracted from the detected face contribute meaningfully to the classification stage. Facial expressions arise from muscle movements on the face; these subtle movements are captured by the histogram of oriented gradients (HOG) feature, which is sensitive to object shape. The extracted features are then used to train the three-stage SVM. Two validation methods were used: the leave-one-out and K-fold tests. Experimental results on three databases (Japanese Female Facial Expression, Extended Cohn-Kanade Dataset, and Radboud Faces Database) show that the proposed system is competitive and performs better than comparable works.
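A minimal, stdlib-only sketch of the pairwise-voting scheme the abstract describes: 21 binary classifiers, one per pair of the seven expressions, with two fallback stages when the vote is not decisive. `binary_predict` is a hypothetical stand-in for a trained binary SVM, and the tie-breaking details are assumptions, not the paper's exact rules:

```python
from itertools import combinations
from collections import Counter

EXPRESSIONS = ["anger", "disgust", "fear", "happiness",
               "neutral", "sadness", "surprise"]
PAIRS = list(combinations(EXPRESSIONS, 2))  # C(7,2) = 21 binary classifiers

def classify(feature_vec, binary_predict):
    """binary_predict(pair, x) -> winning label of that pair (SVM stub)."""
    votes = Counter(binary_predict(pair, feature_vec) for pair in PAIRS)
    ranked = votes.most_common()
    # Stage 1: a single dominant expression wins outright.
    if len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
        return ranked[0][0]
    top = [label for label, n in ranked if n == ranked[0][1]]
    # Stage 2: two expressions tie -> rerun their dedicated binary SVM.
    if len(top) == 2:
        return binary_predict(tuple(sorted(top)), feature_vec)
    # Stage 3: three-way (or larger) tie -> pairwise runoff among them.
    runoff = Counter(binary_predict(p, feature_vec)
                     for p in combinations(sorted(top), 2))
    return runoff.most_common(1)[0][0]
```

In a real system each pair would have its own SVM trained on HOG features from only those two expression classes.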
Keywords: Facial expression recognition; Histogram of oriented gradients; Support vector machine; Validation; Viola–Jones
Year: 2019 PMID: 32240406 PMCID: PMC7099535 DOI: 10.1186/s42492-019-0034-5
Source DB: PubMed Journal: Vis Comput Ind Biomed Art ISSN: 2524-4442
Fig. 1 General skeletal structure: dividing the dataset into training and testing sets
Fig. 2 Three-stage model: first stage (21 SVMs); second and third stages combine the results of stage 1
Fig. 3 Sample of the seven expressions in each dataset: JAFFE (top), CK+ (middle), and RaFD (bottom)
Fig. 4 Image preprocessing: (1) grayscaling and resizing, (2) Viola–Jones face detection, (3) border adjustment, (4) cropping, and (5) additional resizing
Fig. 5 Grayscaling and resizing to 256 × 256
Fig. 6 Viola–Jones face detection with the mouth missing (left): border 3 must cover only the whole mouth, whereas borders 1 and 2 must cover the face region (right)
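The preprocessing pipeline of Figs. 4–6 can be sketched in pure Python. A real implementation would use OpenCV's Viola–Jones cascade (`cv2.CascadeClassifier`) for the face-detection step, which is stubbed out here; the `expand_frac` border-expansion value is an assumption for illustration, not a value from the paper:

```python
# Sketch of steps (1), (3)-(5) of Fig. 4: grayscale, border adjustment,
# crop, resize. Step (2), Viola-Jones detection, is assumed to have
# produced the face box (x, y, w, h).

def to_gray(rgb):
    """RGB image (rows of (r, g, b) tuples) -> luma grayscale."""
    return [[int(0.299*r + 0.587*g + 0.114*b) for r, g, b in row]
            for row in rgb]

def resize_nn(img, out_h, out_w):
    """Nearest-neighbour resize, e.g. to the 256 x 256 of Fig. 5."""
    in_h, in_w = len(img), len(img[0])
    return [[img[i*in_h//out_h][j*in_w//out_w] for j in range(out_w)]
            for i in range(out_h)]

def adjust_and_crop(img, box, expand_frac=0.15):
    """Extend the detected face box downward so the mouth is covered
    (cf. Fig. 6), then crop. expand_frac is an assumed value."""
    x, y, w, h = box
    h = min(len(img) - y, int(h * (1 + expand_frac)))
    return [row[x:x+w] for row in img[y:y+h]]
```

The HOG features would then be computed on the cropped, resized grayscale face (e.g. with `skimage.feature.hog` in a scikit-image-based pipeline).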
Accuracy increase of the proposed method using image-preprocessing techniques
| Accuracy | JAFFE | CK+ |
|---|---|---|
| Before preprocessing | 94.84% | 88.66% |
| After preprocessing | 96.71% | 93.29% |
Accuracy of the proposed method using leave-one-out on the three datasets
| Dataset | Leave-one-out accuracy |
|---|---|
| JAFFE | 96.71% |
| CK+ | 93.29% |
| RaFD | 99.72% |
Accuracy of the proposed method using the K-fold method
| Dataset | 10 folds | 5 folds | 2 folds |
|---|---|---|---|
| JAFFE | 98.10% | 97.62% | 90.10% |
| CK+ | 94.42% | 93.49% | 90.00% |
| RaFD | 95.14% | 95.10% | 94.88% |
Accuracy of the proposed method using different validation methods
| Dataset | Leave-one-out | 10 folds | 5 folds | 2 folds |
|---|---|---|---|---|
| JAFFE | 96.71% | 98.10% | 97.62% | 90.10% |
| CK+ | 93.29% | 94.42% | 93.49% | 90.00% |
| RaFD | 99.72% | 95.14% | 95.10% | 94.88% |
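The two validation schemes in the tables above differ only in how the data are partitioned. A stdlib sketch of K-fold index splitting follows (setting K equal to the sample count gives leave-one-out); actual experiments would more typically use scikit-learn's `KFold`/`LeaveOneOut`:

```python
def kfold_indices(n, k):
    """Partition indices 0..n-1 into k folds; each fold serves once as
    the test set. k == n reduces to leave-one-out validation."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for fold in folds[:i] + folds[i+1:] for j in fold]
        yield train, test
```

Each of the k train/test splits retrains the classifier from scratch; the reported accuracy is the mean over the k test folds.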
Accuracy of the proposed method compared with other techniques
| Dataset | Method | Classification rate | Proposed method | Validation test |
|---|---|---|---|---|
| JAFFE | Patch-based Gabor [ | 92.30% | 96.71% | Leave-one-out |
| | HOG + SVM [ | 94.30% | | |
| | LDA + SVM [ | 95.71% | | |
| | Boosted LBP + SVM [ | 79.80% | 98.10% | 10 folds |
| | LBP pyramid + SVM [ | 91.36% | | |
| | LBP + Gabor filter [ | 92.38% | | |
| | LBP + HOG + PCA + SVM [ | 87.60% | 97.62% | 5 folds |
| CK+ | HOG + SVM [ | 88.70% | 93.29% | Leave-one-out |
| | DNN [ | 93.20% | 93.49% | 5 folds |
| RaFD | HOG + SVM [ | 98.50% | 99.72% | 10 folds |