| Literature DB >> 36241674 |
Prabal Datta Barua1,2, Nursena Baygin3, Sengul Dogan4, Mehmet Baygin5, N Arunkumar6, Hamido Fujita7,8,9, Turker Tuncer4, Ru-San Tan10,11, Elizabeth Palmer12,13, Muhammad Mokhzaini Bin Azizan14, Nahrizul Adib Kadri15, U Rajendra Acharya16,17,18.
Abstract
Pain intensity classification from facial images is a challenging problem in computer vision research. This work proposed a patch- and transfer learning-based model to classify various pain intensities from facial images. The input facial images were segmented into dynamic-sized horizontal patches, or "shutter blinds". A lightweight deep network, DarkNet19, pre-trained on ImageNet1K, was used to generate deep features from the shutter blinds and from the undivided, resized segmented input facial image. The most discriminative features were selected from these deep features using iterative neighborhood component analysis and then fed to a standard shallow fine k-nearest neighbor classifier for classification with tenfold cross-validation. The proposed shutter blinds-based model was trained and tested on datasets derived from two public databases, the University of Northern British Columbia-McMaster (UNBC-McMaster) Shoulder Pain Expression Archive Database and the Denver Intensity of Spontaneous Facial Action (DISFA) Database, both of which comprised four pain intensity classes labeled by human experts using validated facial action coding system methodology. Our shutter blinds-based classification model attained overall accuracy rates above 95% on both datasets. This excellent performance suggests that the automated pain intensity classification model can be deployed to assist doctors in the non-verbal detection of pain from facial images in various situations (e.g., non-communicative patients or during surgery), facilitating timely detection and management of pain.
Year: 2022 PMID: 36241674 PMCID: PMC9568538 DOI: 10.1038/s41598-022-21380-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
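The "shutter blinds" patching step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the number of blinds and the equal-height split are assumptions (the abstract says the patches are dynamic-sized, without giving the sizing rule here), and the DarkNet19 feature-extraction stage is indicated only as a comment.

```python
import numpy as np

def shutter_blinds(image: np.ndarray, num_blinds: int = 4) -> list:
    """Split a face image into horizontal strips ("shutter blinds").

    Equal-height strips are assumed here; the paper uses
    dynamic-sized horizontal patches.
    """
    height = image.shape[0]
    bounds = np.linspace(0, height, num_blinds + 1, dtype=int)
    return [image[bounds[i]:bounds[i + 1]] for i in range(num_blinds)]

# Each blind, plus the whole resized face, would then be passed through a
# pretrained CNN (DarkNet19 in the paper) to extract deep features; the
# concatenated features are reduced with iterative neighborhood component
# analysis and classified with a fine kNN classifier.
face = np.zeros((224, 224, 3), dtype=np.uint8)
blinds = shutter_blinds(face, num_blinds=4)
print([b.shape for b in blinds])  # four strips of 56 rows each
```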
Summary of the state-of-the-art for automated pain level detection using facial images.
| Study | Method | Classifier | Number of subjects | Number of facial images/frames | Accuracy (%) | Limitations |
|---|---|---|---|---|---|---|
| Brahnam et al. | Principal component analysis, linear discriminant analysis | Support vector machine | 26 (13 male, 13 female) | 204 | 88.00 | Single and small dataset, low accuracy |
| Brahnam et al. | Principal component analysis, linear discriminant analysis, frequency domain methods | Neural network simultaneous optimization algorithm | 26 (13 male, 13 female) | 204 | 100.0 | Single and small dataset |
| Brahnam et al. | Principal component analysis, linear discriminant analysis | Neural network simultaneous optimization algorithm | 26 (13 male, 13 female) | 204 | 90.20 | Single and small dataset |
| Kristian et al. | Active shape model, local binary pattern | Support vector machine | 23 | 132 | 88.70 | Single and small dataset |
| Othman et al. | MobileNetV2 | Softmax | (1) 87; (2) 134 | (1) 3480; (2) 7763 | (1) 65.50; (2) 71.40 | Low accuracy |
| Weitz et al. | Convolutional neural network | Softmax | 324 | 14,322 | 67.00 | Single dataset, low accuracy |
| Yang et al. | Local binary pattern, local phase quantization, statistical features | Support vector machine | (1) 129; (2) 90 | (1) 48,398; (2) 8700 | (1) 83.42; (2) 71.00 | Low accuracy |
| Kharghanian et al. | Convolutional deep belief network model | Support vector machine | 25 | 48,398 | 87.20 | Single dataset, low accuracy |
| Zafar and Khan | Geometric features | k-nearest neighbor | Unspecified | 21,500 | 84.02 | Single dataset, low accuracy |
Confusion matrix obtained for proposed shutter blinds model using UNBC-McMaster Shoulder Pain Expression Archive Database with tenfold cross-validation.
| Real outputs \ Predicted outputs | PSPI = 0 | PSPI = 1 | 2 ≤ PSPI ≤ 3 | PSPI > 3 |
|---|---|---|---|---|
| PSPI = 0 | 2342 | 67 | 59 | 15 |
| PSPI = 1 | 11 | 2798 | 99 | 1 |
| 2 ≤ PSPI ≤ 3 | 9 | 94 | 3598 | 62 |
| PSPI > 3 | 8 | 4 | 52 | 1633 |
| Recall (%) | 94.32 | 96.18 | 95.62 | 96.23 |
| Precision (%) | 98.82 | 94.43 | 94.49 | 95.44 |
| F1-score (%) | 96.52 | 95.30 | 95.05 | 95.83 |
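The recall, precision, and F1-score rows follow directly from the matrix counts; a minimal NumPy check recomputing them for the UNBC-McMaster matrix above:

```python
import numpy as np

# Rows: real class; columns: predicted class (UNBC-McMaster matrix above).
cm = np.array([
    [2342,   67,   59,   15],  # PSPI = 0
    [  11, 2798,   99,    1],  # PSPI = 1
    [   9,   94, 3598,   62],  # 2 <= PSPI <= 3
    [   8,    4,   52, 1633],  # PSPI > 3
])

recall = np.diag(cm) / cm.sum(axis=1)     # per-class sensitivity
precision = np.diag(cm) / cm.sum(axis=0)  # per-class positive predictive value
f1 = 2 * precision * recall / (precision + recall)

print(np.round(100 * recall, 2))     # matches the table: 94.32, 96.18, 95.62, 96.23
print(np.round(100 * precision, 2))  # matches the table: 98.82, 94.43, 94.49, 95.44
print(np.round(100 * f1, 2))         # matches the table: 96.52, 95.30, 95.05, 95.83
```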
Confusion matrix obtained for proposed shutter blinds model using DISFA Database with tenfold cross-validation.
| Real outputs \ Predicted outputs | PSPI = 0 | PSPI = 1 | 2 ≤ PSPI ≤ 3 | PSPI > 3 |
|---|---|---|---|---|
| PSPI = 0 | 8528 | 405 | 80 | 12 |
| PSPI = 1 | 132 | 9606 | 227 | 8 |
| 2 ≤ PSPI ≤ 3 | 19 | 338 | 9765 | 187 |
| PSPI > 3 | 4 | 11 | 121 | 9739 |
| Recall (%) | 94.49 | 96.32 | 94.72 | 98.62 |
| Precision (%) | 98.21 | 92.72 | 95.80 | 97.92 |
| F1-score (%) | 96.32 | 94.49 | 95.26 | 98.27 |
Figure 1. Overall performance comparisons. Acc, accuracy; UAR, unweighted average recall; UAP, unweighted average precision; F1, F1-score; MCC, Matthew's correlation coefficient; CK, Cohen's kappa; GM, geometric mean.
Figure 2. Generated feature vector samples for the different classes of the UNBC-McMaster dataset: (a) PSPI = 0, (b) PSPI = 1, (c) 2 ≤ PSPI ≤ 3, (d) PSPI > 3.
Figure 3. Classification accuracy of seven standard shallow classifiers, attained using tenfold cross-validation on the UNBC-McMaster dataset. FT, fine tree; KNB, kernel naïve Bayes; LD, linear discriminant; CSVM, cubic support vector machine; FkNN, fine k-nearest neighbor; BT, bagged tree; WNN, wide neural network.
Comparison of our work with state-of-the-art methods developed for automated pain intensity classification using facial images.
| Study | Method | Classifier | Dataset | Results |
|---|---|---|---|---|
| Bargshady | Temporal convolutional network, LSTM, principal component analysis | Temporal convolutional network | UNBC-McMaster (10,783 frames) | MSE: 1.186; MAE: 0.234; Acc: 94.14%; AUC: 91.30% |
| Bargshady | Ensemble neural network | Ensemble CNN-recurrent neural network | UNBC-McMaster (10,783 frames) | AUC: 90.50%; Acc: 86.00%; MSE: 0.081 |
| Semwal | Ensemble of compact CNNs | Ensemble | UNBC-McMaster (16,000 frames) | Pre: 91.97%; Rec: 91.01%; F1: 91.42%; Acc: 93.87% |
| Rudovic | CNN | Softmax | UNBC-McMaster (48,106 frames) | Acc: 76.00%; PR-AUC: 59.00%; F1: 47.00% |
| Karamitsos | CNN | Softmax | UNBC-McMaster (48,398 frames) | Acc: 92.50% |
| Semwal | CNN | Softmax | UNBC-McMaster (16,000 frames) | Acc: 92.00%; MAE: 0.20; MSE: 0.17 |
| Bargshady | CNN, bidirectional LSTM | Enhanced joint hybrid CNN-bidirectional LSTM | UNBC-McMaster (10,783 frames) | Acc: 91.20%; AUC: 98.40% |
| El Morabit and Rivenq | Vision transformer, feed-forward network | Softmax | UNBC-McMaster (48,398 frames) | Acc: 84.15% |
| Our model | Transfer learning, novel shutter blinds-based deep feature extraction | kNN | UNBC-McMaster (10,852 frames) | Acc: 95.57%; UAR: 95.59%; UAP: 95.79%; Average F1: 95.67%; MCC: 94.14%; CK: 93.93%; GM: 95.58% |
| | | | DISFA (39,182 frames) | Acc: 96.06%; UAR: 96.04%; UAP: 96.16%; Average F1: 96.08%; MCC: 94.78%; CK: 94.74%; GM: 96.03% |
Acc, accuracy; AUC, area under curve; CK, Cohen's kappa; CNN, convolutional neural network; F1, F1-score; GM, geometric mean; LSTM, long short-term memory; MAE, mean absolute error; MCC, Matthew's correlation coefficient; MSE, mean squared error; PR-AUC, precision-recall area under the curve; Pre, precision; Rec, recall; UAP, unweighted average precision; UAR, unweighted average recall.
Frequency distribution of video frames in the UNBC-McMaster and DISFA databases by PSPI scores and PSPI groups.
| PSPI score | UNBC-McMaster: frequency by PSPI score | UNBC-McMaster: frequency by PSPI group | DISFA: frequency by PSPI score | DISFA: frequency by PSPI group |
|---|---|---|---|---|
| 0 | 40,029 | 2483* | 90,309 | 9025* |
| 1 | 2909 | 2909 | 9973 | 9973 |
| 2 | 2351 | 3763 (scores 2-3) | 12,194 | 10,309* (scores 2-3) |
| 3 | 1412 | | 8447 | |
| 4 | 802 | 1697 (scores ≥ 4) | 4127 | 9875 (scores ≥ 4) |
| 5 | 242 | | 1661 | |
| 6 | 270 | | 1526 | |
| 7 | 53 | | 996 | |
| 8 | 79 | | 435 | |
| 9 | 32 | | 503 | |
| 10 | 67 | | 317 | |
| 11 | 76 | | 169 | |
| 12 | 48 | | 108 | |
| 13 | 22 | | 33 | |
| 14 | 1 | | 0 | |
| 15 | 5 | | 0 | |
| Total | 48,398 | 10,852 | 130,798 | 39,182 |
*Random under-sampling of over-represented PSPI groups was performed to create more balanced study datasets with smaller total numbers of video frames for training and testing the pain intensity classification model.
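The mapping of raw PSPI scores to the four study classes, together with the random under-sampling noted above, can be sketched as follows. This is a hypothetical illustration: the seed and the sampling routine are assumptions, since the source specifies only that over-represented groups were randomly under-sampled to the target counts.

```python
import random

def pspi_to_group(score: int) -> str:
    """Map a frame's PSPI score to one of the four study classes."""
    if score == 0:
        return "PSPI = 0"
    if score == 1:
        return "PSPI = 1"
    if score <= 3:
        return "2 <= PSPI <= 3"
    return "PSPI > 3"

def undersample(frames: list, targets: dict, seed: int = 0) -> list:
    """Randomly down-sample over-represented groups to a target count."""
    rng = random.Random(seed)
    by_group = {}
    for frame in frames:
        by_group.setdefault(pspi_to_group(frame["pspi"]), []).append(frame)
    kept = []
    for group, members in by_group.items():
        n = targets.get(group, len(members))  # groups without a target are kept whole
        kept.extend(rng.sample(members, min(n, len(members))))
    return kept

# E.g., for UNBC-McMaster, the 40,029 PSPI = 0 frames were reduced to 2483.
frames = [{"pspi": 0}] * 40029 + [{"pspi": 1}] * 2909
subset = undersample(frames, {"PSPI = 0": 2483})
```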
Figure 4. Schema of the proposed model based on shutter blinds-based deep feature extraction.