Dimin Zhu, Yuxi Fu, Xinjie Zhao, Xin Wang, Hanxi Yi.
Abstract
This study explores facial emotion recognition in order to analyze the psychological characteristics of juveniles involved in crimes and to promote the application of deep learning to psychological feature extraction. First, the relationship between facial emotion recognition and psychological characteristics is discussed. On this basis, a facial emotion recognition model is constructed by increasing the number of layers of the convolutional neural network (CNN) and integrating the CNN with several network architectures, such as VGGNet, AlexNet, and LeNet-5. Second, based on feature fusion, an optimized Central Local Binary Pattern (CLBP) algorithm is introduced into the CNN to construct a CNN-CLBP algorithm for facial emotion recognition. Finally, the validity of the algorithm is analyzed after the face images are preprocessed and the relevant parameters are optimized. Compared with other methods, the CNN-CLBP algorithm achieves higher accuracy in facial expression recognition, with an average recognition rate of 88.16%. Image preprocessing and parameter optimization further improve its recognition accuracy, and no poor fitting is observed. The CNN-CLBP algorithm recognizes 97% of happy and surprised expressions, but its misidentification rate for sad expressions is 22.54%. The results provide reference data and direction for analyzing the psychological characteristics of juveniles involved in crimes.
Year: 2022 PMID: 36188698 PMCID: PMC9522492 DOI: 10.1155/2022/2249417
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1. Structure of CNN.
Parameter settings of the facial emotion recognition neural network based on psychological feature analysis.
| Network structure | Convolution layer | Pooling layer | Fully connected layer | Output layer |
|---|---|---|---|---|
| Output size | 1: 128∗128∗1 | 1: 28∗28∗6 | 1: 1∗1∗1 | 1∗1∗6 |
| | 2: 64∗64∗1 | 2: 10∗10∗16 | 2: 1∗1∗6 | |
| | 3: 32∗32∗1 | | | |
| | 4: 14∗14∗6 | | | |
Figure 2. Structure of the hybrid CNN-CLBP model.
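To make the feature-fusion idea concrete, the following is a minimal sketch of the basic 8-neighbour Local Binary Pattern operator and of a simple fusion step. It is not the optimized CLBP described in the abstract, whose details are not given here. The function names, the clockwise neighbour ordering, the `>=` threshold, and fusion by concatenation are all assumptions made for illustration.

```python
def lbp_code(image, row, col):
    """Basic 8-neighbour LBP code of the pixel at (row, col).

    Assumption: neighbours are visited clockwise from the top-left,
    and a neighbour contributes a 1 bit when it is >= the center.
    """
    center = image[row][col]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Normalized 256-bin histogram of LBP codes over interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def fuse_features(cnn_features, lbp_features):
    """Hypothetical fusion step: plain concatenation of both vectors."""
    return list(cnn_features) + list(lbp_features)


# Toy 3x3 grayscale patch (made-up values, not from any data set).
patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 3, 8]]
print(lbp_code(patch, 1, 1))  # prints 26
```

In a hybrid model of this kind, the texture histogram would typically be combined with the CNN feature vector before the classifier; concatenation is only one possible choice.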
Image composition of the Fer2013 and CK+ data sets.
| Datasets | JAFFE | CK+ |
|---|---|---|
| Image composition | Number of images: 35886 | Number of images: 593 |
| | Size: 48∗48 pixels | Size: 640∗480 pixels |
| | Participants: 10 | Participants: 123 |
| | Tags: happy, fear, sad, surprised, angry, disgusted, neutral | Tags: happy, fear, sad, surprised, contempt, anger, disgust, neutral |
Figure 3. Human face images of the JAFFE data set (data source: https://blog.csdn.net/akadiao/article/details/79956952).
Figure 4. Human face images of the CK+ data set (data source: https://blog.csdn.net/yinghua2016/article/details/77323537).
Parameter settings of LeNet-5 and VGGNet models.
| Model | Network layer | Input size | Convolution kernel size | Output size |
|---|---|---|---|---|
| LeNet-5 | Convolution layer 1 | 32∗32∗1 | 5∗5∗1 | 28∗28∗4 |
| | Lower sampling layer 1 | 28∗28∗4 | 2∗2 | 14∗14∗4 |
| | Convolution layer 2 | 14∗14∗4 | 5∗5∗6 | 10∗10∗14 |
| | Lower sampling layer 2 | 10∗10∗14 | 2∗2 | 5∗5∗14 |
| | Convolution layer 3 | 5∗5∗14 | 5∗5∗14 | 1∗1∗120 |
| | Full connection layer | 1∗1∗120 | 120∗82 | 1∗1∗82 |
| | Output layer | 1∗1∗82 | 82∗10 | 1∗1∗10 |
| VGGNet | Convolution layer 1 | 224∗224∗3 | 11∗11∗3 | 55∗55∗46 |
| | Lower sampling layer 1 | 55∗55∗46 | 3∗3 | 27∗27∗46 |
| | Convolution layer 2 | 27∗27∗46 | 5∗5∗46 | 27∗27∗128 |
| | Lower sampling layer 2 | 27∗27∗126 | 3∗3 | 13∗13∗128 |
| | Convolution layer 3 | 13∗13∗126 | 3∗3∗256 | 13∗13∗192 |
| | Convolution layer 4 | 13∗13∗192 | 3∗3∗192 | 13∗13∗192 |
| | Convolution layer 5 | 13∗13∗192 | 3∗3∗192 | 13∗13∗128 |
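The spatial output sizes in the LeNet-5 rows above follow the standard relation for a convolution or pooling window, output = floor((input + 2 × padding − kernel) / stride) + 1. The sketch below traces that relation through the LeNet-5 layers. The table does not list stride or padding, so the values used here (stride 1 and no padding for convolutions, stride 2 for lower sampling) are assumptions, and the function and variable names are illustrative.

```python
def conv_output_size(in_size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (in_size + 2 * padding - kernel) // stride + 1


# (layer name, kernel size, assumed stride) for the LeNet-5 rows.
LENET5_LAYERS = [
    ("Convolution layer 1", 5, 1),
    ("Lower sampling layer 1", 2, 2),
    ("Convolution layer 2", 5, 1),
    ("Lower sampling layer 2", 2, 2),
    ("Convolution layer 3", 5, 1),
]


def trace_sizes(in_size, layers):
    """Apply each layer in turn and record the resulting spatial size."""
    sizes = []
    size = in_size
    for name, kernel, stride in layers:
        size = conv_output_size(size, kernel, stride)
        sizes.append((name, size))
    return sizes


lenet5_sizes = trace_sizes(32, LENET5_LAYERS)
for name, size in lenet5_sizes:
    print(f"{name}: {size} x {size}")
```

With a 32∗32 input this yields 28, 14, 10, 5, and 1, matching the spatial sizes listed for LeNet-5. Under an assumed stride of 4 and padding of 2, the same relation gives 55 for the first VGGNet row, and an assumed stride of 2 for the 3∗3 lower sampling gives 27.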
Figure 5. Comparison of recognition results of several facial emotion recognition methods: (a) recognition rate; (b) recognition results and time consumption.
Figure 6. Quantitative analysis results of the CNN-CLBP model based on feature fusion.
Figure 7. Facial emotion recognition results of the hybrid CNN-CLBP model.
Figure 8. Facial emotion recognition effects of the hybrid CNN-CLBP model (image material from a public face recognition data set on the web).
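Per-expression recognition and misidentification rates of the kind reported in the abstract are usually read off a confusion matrix. The sketch below shows one common way to compute them. The counts are toy values invented for illustration and are not results from this study, and treating the average recognition rate as the unweighted mean over classes is an assumption.

```python
LABELS = ["happy", "sad", "surprised"]

# Toy confusion matrix (made-up counts): rows are true labels,
# columns are predicted labels, in the order given by LABELS.
CONFUSION = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]


def per_class_rates(confusion):
    """Return (recognition rate, misidentification rate) per true class."""
    rates = []
    for i, row in enumerate(confusion):
        total = sum(row)
        recognized = row[i] / total if total else 0.0
        missed = 1.0 - recognized if total else 0.0
        rates.append((recognized, missed))
    return rates


def average_recognition_rate(confusion):
    """Unweighted mean of the per-class recognition rates (assumed definition)."""
    rates = per_class_rates(confusion)
    return sum(r for r, _ in rates) / len(rates)


for label, (ok, miss) in zip(LABELS, per_class_rates(CONFUSION)):
    print(f"{label}: recognized {ok:.2%}, misidentified {miss:.2%}")
print(f"average recognition rate: {average_recognition_rate(CONFUSION):.2%}")
```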
Comparison of research based on algorithm recognition.
| Author | Primary research contents |
|---|---|
| Jain et al. [ | They constructed a facial expression recognition system based on a single deep CNN containing a convolution layer and a deep residual layer. After training on labeled face images from the CK+ and JAFFE data sets, they found that the model with the deep convolution layer achieved better recognition performance and accuracy than traditional emotion recognition methods. |
| Ma and Celik [ | They proposed a densely connected CNN structure for facial expression recognition, in which the outputs and inputs of adjacent convolution layers are connected, and verified its effectiveness in facial expression recognition. |
| Liu et al. [ | They applied a fused CNN and CLBP model to intelligent mining. Taking the unique visual features into account, they found that the fused model improved image recognition accuracy by 2% to 3% over traditional methods. |
| Shao and Qian [ | They proposed a two-branch CNN model that extracts traditional LBP features and deep learning features, and found that fusing LBP features with the CNN gave excellent performance and applicability in facial expression recognition. |
| Takalkar et al. [ | They combined LBP with a CNN for microexpression recognition. Evaluating on seven widely used microexpression databases, they found that the proposed method significantly improved recognition accuracy and that training and testing could be carried out with a small amount of data. |
| Liu and Zhang [ | They analyzed the accuracy of deep neural networks for image recognition on the CK+ data set and found that the recognition accuracy of the CNN model could reach 78.9%. |
| Shahid et al. [ | They proposed a multiclass SVM and topic-related k-fold crossover method for facial expression recognition. They found that the recognition rate of this method on the CK+ data set could exceed 90%, with improvements in both accuracy and computation time. |
| Liao et al. [ | They introduced a conditional random forest structure to build a deep multi-instance learning model. They found that, for automatic facial expression recognition, the recognition rate of the model on the public CK+ data set could exceed 86%. |
| Miyoshi et al. [ | They proposed an enhanced convolutional long short-term memory algorithm for automatic facial expression recognition. The test results showed that the recognition accuracy of this algorithm on the CK+ data set was above 85%. |
| Hybrid CNN-CLBP algorithm reported here | In conclusion, although the recognition rate of the hybrid CNN-CLBP algorithm is lower than that proposed in [ |