Yassin Kortli, Maher Jridi, Ayman Al Falou, Mohamed Atri.
Abstract
Over the past few decades, interest in theories and algorithms for face recognition has been growing rapidly. Video surveillance, criminal identification, building access control, and unmanned and autonomous vehicles are just a few examples of concrete applications that are gaining traction in industry. Various techniques are being developed, including local, holistic, and hybrid approaches, which describe a face image using either a few selected facial features or the whole face. The main contribution of this survey is to review some well-known techniques for each approach and to give a taxonomy of their categories. The paper presents a detailed comparison of these techniques, listing the advantages and disadvantages of their schemes in terms of robustness, accuracy, complexity, and discrimination. The paper also covers the databases used for face recognition: an overview of the most commonly used databases, including those for supervised and unsupervised learning, is given. Numerical results of the most interesting techniques are reported along with the experimental context and the challenges these techniques handle. Finally, the paper discusses future directions in terms of the techniques to be used for face recognition.
Keywords: biometric systems; face recognition systems; person identification; survey
Year: 2020 PMID: 31936089 PMCID: PMC7013584 DOI: 10.3390/s20020342
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Face recognition structure [3,23].
Figure 2. Face recognition methods. SIFT, scale-invariant feature transform; SURF, speeded-up robust features; BRIEF, binary robust independent elementary features; LBP, local binary pattern; HOG, histogram of oriented gradients; LPQ, local phase quantization; PCA, principal component analysis; LDA, linear discriminant analysis; KPCA, kernel PCA; CNN, convolutional neural network; SVM, support vector machine.
Figure 3. The local binary pattern (LBP) descriptor [19].
Figure 4. All “” optical configuration [37].
Figure 5. Flowchart of the VanderLugt correlator (VLC) technique [4]. FFT, fast Fourier transform; POF, phase-only filter.
Figure 6. (a) Creation of the 3D face of a person, (b) results of the detection of 29 landmarks of a face using the active shape model, (c) results of the detection of 26 landmarks of a face [3].
Figure 7. Face recognition based on the speeded-up robust features (SURF) descriptor [58]: recognition using fast library for approximate nearest neighbors (FLANN) distance.
Figure 8. Fast retina keypoint (FREAK) descriptor using 43 sampling patterns [19].
Summary of local approaches. SIFT, scale-invariant feature transform; SURF, speeded-up robust features; BRIEF, binary robust independent elementary features; LBP, local binary pattern; HOG, histogram of oriented gradients; LPQ, local phase quantization; PCA, principal component analysis; LDA, linear discriminant analysis; KPCA, kernel PCA; CNN, convolutional neural network; SVM, support vector machine; PLBP, pyramid of LBP; KNN, k-nearest neighbor; MLBP, multiscale LBP; LTP, local ternary pattern; PHOG, pyramid HOG; VLC, VanderLugt correlator; LFW, Labeled Faces in the Wild; FERET, Face Recognition Technology; PHPID, Pointing Head Pose Image Database; PCE, peak to correlation energy; POF, phase-only filter; PSR, peak-to-sidelobe ratio.
| Author | Technique Used | Database | Matching | Limitation | Advantage | Result |
|---|---|---|---|---|---|---|
| Local Appearance-Based Techniques | | | | | | |
| Khoi et al. [ | LBP | TDF | MAP | Skewness in face image | Robust feature in frontal face | 5% |
| | | CF1999 | | | | 13.03% |
| | | LFW | | | | 90.95% |
| Xi et al. [ | LBPNet | FERET | Cosine similarity | Complexities of CNN | High recognition accuracy | 97.80% |
| | | LFW | | | | 94.04% |
| Khoi et al. [ | PLBP | TDF | MAP | Skewness in face image | Robust feature in frontal face | 5.50% |
| | | CF | | | | 9.70% |
| | | LFW | | | | 91.97% |
| Laure et al. [ | LBP and KNN | LFW | KNN | Illumination conditions | Robust | 85.71% |
| | | CMU-PIE | | | | 99.26% |
| Bonnen et al. [ | MRF and MLBP | AR (Scream) | Cosine similarity | Landmark extraction fails or is not ideal | Robust to changes in facial expression | 86.10% |
| | | FERET (Wearing sunglasses) | | | | 95% |
| Ren et al. [ | Relaxed LTP | CMU-PIE | Chi-square distance | Noise level | Superior performance compared with LBP, LTP | 95.75% |
| | | Yale B | | | | 98.71% |
| Hussain et al. [ | LPQ | FERET | Cosine similarity | Lot of discriminative information | Robust to illumination variations | 99.20% |
| | | LFW | | | | 75.30% |
| Karaaba et al. [ | HOG and MMD | FERET | MMD/MLPD | Low recognition accuracy | Aligning difficulties | 68.59% |
| | | LFW | | | | 23.49% |
| Arigbabu et al. [ | PHOG and SVM | LFW | SVM | Complexity and time of computation | Head pose variation | 88.50% |
| Leonard et al. [ | VLC correlator | PHPID | ASPOF | Low number of reference images used | Robustness to noise | 92% |
| Napoléon et al. [ | LBP and VLC | YaleB | POF | Illumination | Rotation + Translation | 98.40% |
| | | YaleB Extended | | | | 95.80% |
| Heflin et al. [ | Correlation filter | LFW/PHPID | PSR | Some pre-processing steps | More effort on the eye localization stage | 39.48% |
| Zhu et al. [ | PCA–FCF | CMU-PIE | Correlation filter | Use only linear method | Occlusion-insensitive | 96.60% |
| | | FRGC2.0 | | | | 91.92% |
| Seo et al. [ | LARK + PCA | LFW | Cosine similarity | Face detection | Reducing computational complexity | 78.90% |
| Ghorbel et al. [ | VLC + DoG | FERET | PCE | Low recognition rate | Robustness | 81.51% |
| Ghorbel et al. [ | uLBP + DoG | FERET | Chi-square distance | Robustness | Processing time | 93.39% |
| Ouerhani et al. [ | VLC | PHPID | PCE | Power | Processing time | 77% |
| | | | | | | |
| Lenc et al. [ | SIFT | FERET | A posteriori probability | Still far from perfect | Sufficiently robust on lower quality real data | 97.30% |
| | | AR | | | | 95.80% |
| | | LFW | | | | 98.04% |
| Du et al. [ | SURF | LFW | FLANN distance | Processing time | Robustness and distinctiveness | 95.60% |
| Vinay et al. [ | SURF + SIFT | LFW | FLANN distance | Processing time | Robust in unconstrained scenarios | 78.86% |
| | | Face94 | | | | 96.67% |
| Calonder et al. [ | BRIEF | _ | KNN | Low recognition rate | Low processing time | 48% |
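
Several rows of the table pair an LBP histogram with a chi-square distance for matching. As a minimal sketch of that pairing (not the implementation of any cited work), the basic 3×3 LBP operator and the distance can be written in plain Python. The function names, the clockwise bit ordering, the `>=` threshold convention, and the 256-bin whole-image histogram are illustrative assumptions; published variants differ in these details and often add a factor of 1/2 to the distance.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code of a 3x3 patch given as a list of 3 rows."""
    center = patch[1][1]
    # Neighbours read clockwise from the top-left pixel (assumed ordering).
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:  # neighbour at least as bright as the center -> bit set
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    total = sum(hist) or 1  # avoid division by zero on tiny images
    return [h / total for h in hist]


def chi_square(h1, h2):
    """Chi-square distance between two histograms (0 means identical)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)
```

In a recognition setting, a probe histogram would be compared against each gallery histogram and the identity with the smallest distance returned. Practical systems usually compute one histogram per image block and concatenate them rather than using a single global histogram.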
Figure 9. Example of dimensional reduction when applying principal component analysis (PCA) [62].
Figure 10. The first five Eigenfaces built from the ORL database [63].
Figure 11. The first five Fisherfaces obtained from the ORL database [63].
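
The dimensionality reduction that PCA performs can be illustrated with a small sketch: center the data, build the covariance matrix, find its dominant eigenvector, and project onto it. This is a toy version under stated assumptions, not the pipeline used to build the Eigenfaces in the figures: it extracts only the first component, uses power iteration instead of a full eigendecomposition or SVD, and all function names are hypothetical.

```python
import math


def mean_center(data):
    """Subtract the per-feature mean; returns (centered rows, mean vector)."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    return [[row[j] - mean[j] for j in range(d)] for row in data], mean


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(d)] for i in range(d)]


def first_component(cov, iterations=200):
    """Dominant eigenvector of cov by power iteration (unit length)."""
    d = len(cov)
    # Deterministic, non-uniform start vector (an assumption); a start that is
    # exactly orthogonal to the dominant eigenvector would not converge.
    v = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # zero covariance: no direction of variance to find
        v = [x / norm for x in w]
    return v


def project(centered, component):
    """Coordinates of each centered row along the given component."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]
```

For face images, each row of `data` would be a flattened image, and the leading components (reshaped to image size) are what the Eigenface figures display. Real implementations typically rely on an SVD routine from a numerical library because the covariance matrix of full-size images is too large to form explicitly.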
Subspace approaches. ICA, independent component analysis; DWT, discrete wavelet transform; FFT, fast Fourier transform; DCT, discrete cosine transform.
| Author | Techniques Used | Databases | Matching | Limitation | Advantage | Result |
|---|---|---|---|---|---|---|
| Linear Techniques | | | | | | |
| Seo et al. [ | LARK and PCA | LFW | L2 distance | Detection accuracy | Reducing computational complexity | 85.10% |
| Annalakshmi et al. [ | ICA and LDA | LFW | Bayesian classifier | Sensitivity | Good accuracy | 88% |
| Annalakshmi et al. [ | PCA and LDA | LFW | Bayesian classifier | Sensitivity | Specificity | 59% |
| Hussain et al. [ | LQP and Gabor | FERET | Cosine similarity | Lot of discriminative information | Robust to illumination variations | 99.2% |
| | | LFW | | | | |
| Gowda et al. [ | LPQ and LDA | MEPCO | SVM | Computation time | Good accuracy | 99.13% |
| Z. Cui et al. [ | BoW | AR | ASM | Occlusions | Robust | 99.43% |
| | | ORL | | | | 99.50% |
| | | FERET | | | | 82.30% |
| Khan et al. [ | PSO and DWT | CK | Euclidean distance | Noise | Robust to illumination | 98.60% |
| | | MMI | | | | 95.50% |
| | | JAFFE | | | | 98.80% |
| Huang et al. [ | 2D-DWT | FERET | KNN | Pose | Frontal or near-frontal facial images | 90.63% |
| | | LFW | | | | |
| Perlibakas and Vytautas [ | PCA and Gabor filter | FERET | Cosine metric | Precision | Pose | 87.77% |
| Hafez et al. [ | Gabor filter and LDA | ORL | 2DNCC | Pose | Good recognition performance | 98.33% |
| | | C. YaleB | | | | 99.33% |
| Sufyanu et al. [ | DCT | ORL | NCC | High memory | Controlled and uncontrolled databases | 93.40% |
| | | Yale | | | | |
| Shanbhag et al. [ | DWT and BPSO | _ _ | _ _ | Rotation | Significant reduction in the number of features | 88.44% |
| Ghorbel et al. [ | Eigenfaces and DoG filter | FERET | Chi-square distance | Processing time | Reduce the representation | 84.26% |
| Zhang et al. [ | PCA and FFT | YALE | SVM | Complexity | Discrimination | 93.42% |
| Zhang et al. [ | PCA | YALE | SVM | Recognition rate | Reduce the dimensionality | 84.21% |
| | | | | | | |
| Fan et al. [ | RKPCA | MNIST, ORL | RBF kernel | Complexity | Robust to sparse noises | _ |
| Vinay et al. [ | ORB and KPCA | ORL | FLANN matching | Processing time | Robust | 87.30% |
| Vinay et al. [ | SURF and KPCA | ORL | FLANN matching | Processing time | Reduce the dimensionality | 80.34% |
| Vinay et al. [ | SIFT and KPCA | ORL | FLANN matching | Low recognition rate | Complexity | 69.20% |
| Lu et al. [ | KPCA and GDA | UMIST face | SVM | High error rate | Excellent performance | 48% |
| Yang et al. [ | PCA and MSR | HELEN face | ESR | Complexity | Utilizes color, gradient, and regional information | 98.00% |
| Yang et al. [ | LDA and MSR | FRGC | ESR | Low performances | Utilizes color, gradient, and regional information | 90.75% |
| Ouanan et al. [ | FDDL | AR | CNN | Occlusion | Orientations, expressions | 98.00% |
| Vankayalapati and Kyamakya [ | CNN | ORL | _ _ | Poses | High recognition rate | 95% |
| Devi et al. [ | 2FNN | ORL | _ _ | Complexity | Low error rate | 98.5% |
Figure 12. Flowchart of the proposed multimodal deep face representation (MM-DFR) technique [95]. CNN, convolutional neural network.
Figure 13. The proposed CNN–LSTM–ELM [103].
Hybrid approaches. GW, Gabor wavelet; OCLBP, over-complete LBP; WCCN, within class covariance normalization; WLBP, Walsh LBP; ICP, iterative closest point; LGBPHS, local Gabor binary pattern histogram sequence; FLD, Fisher linear discriminant; SAE, stacked auto-encoder.
| Author | Technique Used | Database | Matching | Limitation | Advantage | Result |
|---|---|---|---|---|---|---|
| Fathima et al. [ | GW-LDA | AT&T | k-NN | High processing time | Illumination invariance and dimensionality reduction | 88% |
| | | FACES94 | | | | 94.02% |
| | | MITINDIA | | | | 88.12% |
| Barkan et al. [ | OCLBP, LDA, and WCCN | LFW | WCCN | _ | Reduce the dimensionality | 87.85% |
| Juefei et al. [ | ACF and WLBP | LFW | | Complexity | Pose conditions | 89.69% |
| Simonyan et al. [ | Fisher + SIFT | LFW | Mahalanobis matrix | Single feature type | Robust | 87.47% |
| Sharma et al. [ | PCA–ANFIS | ORL | ANFIS | Sensitivity-specificity | | 96.66% |
| | ICA–ANFIS | | ANFIS | | Pose conditions | 71.30% |
| | LDA–ANFIS | | ANFIS | | | 68% |
| Ojala et al. [ | DCT–PCA | ORL | Euclidean distance | Complexity | Reduce the dimensionality | 92.62% |
| | | UMIST | | | | 99.40% |
| | | YALE | | | | 95.50% |
| Mian et al. [ | Hotelling transform, SIFT, and ICP | FRGC | ICP | Processing time | Facial expressions | 99.74% |
| Cho et al. [ | PCA–LGBPHS | Extended Yale Face | Bhattacharyya distance | Illumination condition | Complexity | 95% |
| | PCA–GABOR Wavelets | | | | | |
| Sing et al. [ | PCA–FLD | CMU | SVM | Robustness | Pose, illumination, and expression | 71.98% |
| | | FERET | | | | 94.73% |
| | | AR | | | | 68.65% |
| Kamencay et al. [ | SPCA-KNN | ESSEX | KNN | Processing time | Expression variation | 96.80% |
| Sun et al. [ | CNN–LSTM–ELM | OPPORTUNITY | LSTM/ELM | High processing time | Automatically learn feature representations | 90.60% |
| Ding et al. [ | CNNs and SAE | LFW | _ _ | Complexity | High recognition rate | 99% |
Figure 14. Optimal hyperplane, support vectors, and maximum margin.
Figure 15. Artificial neural network.
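
The relationship between the hyperplane, the support vectors, and the margin named in the SVM figure caption can be made concrete with a small sketch. It assumes a linear classifier already given in canonical form, with weights `w` and bias `b` scaled so that the closest training points satisfy y(w·x + b) = 1; under that assumption the margin width is 2/‖w‖. The function names and the toy data in the note below are hypothetical, and no training is performed here.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b; its sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def margin_width(w):
    """Distance between the two margin hyperplanes of a canonical SVM."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


def support_vectors(w, b, points, labels, tol=1e-9):
    """Points lying exactly on the margin, i.e. with y * (w.x + b) == 1."""
    return [x for x, y in zip(points, labels)
            if abs(y * decision_value(w, b, x) - 1.0) <= tol]
```

For example, with `w = [0.5, 0.5]`, `b = -1.0`, points `[0, 0]` and `[-1, -1]` labelled -1, and points `[2, 2]` and `[3, 3]` labelled +1, the support vectors are `[0, 0]` and `[2, 2]`, and the margin width equals the distance between them.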
General performance of face recognition approaches.
| Approaches | Databases Used | Advantages | Disadvantages | Performances | Challenges Handled |
|---|---|---|---|---|---|
| | TDF, CF1999, | Easy to implement, allowing an analysis of images in a difficult environment in real time [ Invariant to size, orientation, and lighting [ | Lack discrimination ability. It is difficult to automatically detect features with this approach. | High performance in terms of processing time and recognition rate [ | Pose variations [ |
| | | Does not require prior knowledge of the images [ Different illumination conditions, scaling, aging effects, facial expressions, face occlusions, and noisy images [ | More affected by orientation changes or the expression of the face [ | High processing time [ Low recognition rate [ | Different illumination conditions, facial expressions, aging effects, scaling, face occlusions, and noisy images [ |
| | LFW, FERET, MEPCO, AR, ORL, CK, MMI, JAFFE, | When frontal views of faces are used, these techniques provide good performance [ Recognition is effective and simple. Dimensionality reduction, represent global information [ | Sensitive to the rotation and the translation of the face images. Can only classify a face that is “known” to the database. Low speed in the face recognition caused by a long feature vector [ | Processed with larger size features. High processing time [ High performance in terms of recognition rate [ | Different illumination conditions [ |
| | | Dimensionality reduction [ They are because of supervised classification problems. Features are detected automatically in this approach (CNN and RNN) [ | The recognition performance depends on the chosen kernel [ More difficult to implement than the local technique. Unsatisfactory recognition rate [ | Complexity [ Computationally expensive and require a high degree of correlation between the test and training images (SVM, CNN) [ | Different illumination [ |
| | AT&T, FACES94, | Provides faster systems and efficient recognition [ | More difficult to implement. Complex and computationally costly [ | High recognition rate [ High computational complexity [ | Pose, illumination conditions, and facial expressions [ |