| Literature DB >> 25474200 |
Recognizing Age-Separated Face Images: Humans and Machines
Daksha Yadav, Richa Singh, Mayank Vatsa, Afzel Noore.
Abstract
Humans utilize facial appearance, gender, expression, aging pattern, and other ancillary information to recognize individuals. It is interesting to observe how humans perceive facial age. Analyzing these properties can help in understanding the phenomenon of facial aging, and incorporating the findings can help in designing effective algorithms. Such a study has two components: facial age estimation and age-separated face recognition. Age estimation involves predicting the age of an individual given his/her facial image. On the other hand, age-separated face recognition consists of recognizing an individual given his/her age-separated images. In this research, we investigate which facial cues are utilized by humans for estimating the age of people belonging to various age groups along with analyzing the effect of one's gender, age, and ethnicity on age estimation skills. We also analyze how various facial regions such as binocular and mouth regions influence age estimation and recognition capabilities. Finally, we propose an age-invariant face recognition algorithm that incorporates the knowledge learned from these observations. Key observations of our research are: (1) the age group of newborns and toddlers is easiest to estimate, (2) gender and ethnicity do not affect the judgment of age group estimation, (3) the face, as a global feature, is essential to achieve good performance in age-separated face recognition, and (4) the proposed algorithm yields improved recognition performance compared to existing algorithms and also outperforms a commercial system in the young-image-as-probe scenario.
Year: 2014 PMID: 25474200 PMCID: PMC4256302 DOI: 10.1371/journal.pone.0112234
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1Face images of an individual illustrating variations due to aging across different years.
Figure 2Sample facial regions presented to participants for age group estimation.
Figure 3Sample images presented to the participants for recognizing age-separated images of individuals.
Confusion matrix showing the actual and predicted age groups in the task of age estimation by human participants.
| Stimuli Age Group | Predicted Age Group |||||||||
| | | | | | | | | | | > |
| | 32 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | |
| | 8 | 43 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | |
| | 0 | 1 | 104 | 17 | 2 | 1 | 0 | 1 | 0 | |
| | 0 | 2 | 77 | 48 | 5 | 0 | 0 | 0 | 0 | |
| | 0 | 1 | 0 | 8 | 59 | 12 | 1 | 0 | 0 | |
| | 0 | 0 | 0 | 0 | 17 | 51 | 35 | 8 | 0 | |
| | 0 | 0 | 0 | 1 | 6 | 47 | 71 | 7 | 0 | |
| | 0 | 0 | 0 | 0 | 0 | 4 | 31 | 5 | 0 | |
| | 0 | 0 | 1 | 2 | 3 | 15 | 56 | 80 | 15 | |
| > | 0 | 0 | 0 | 0 | 0 | 6 | 12 | 12 | 3 | |
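The per-group sensitivity and specificity reported in the next table follow directly from such a confusion matrix: a group's sensitivity is the fraction of its stimuli assigned to the correct predicted group, and its specificity is the fraction of all other stimuli not assigned to it. A minimal sketch of that computation (toy counts for illustration, not the study's data):

```python
from typing import List, Tuple

def sensitivity_specificity(cm: List[List[int]]) -> List[Tuple[float, float]]:
    """Per-class sensitivity (recall) and specificity from a square
    confusion matrix cm[actual][predicted]."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    out = []
    for k in range(n):
        tp = cm[k][k]                              # class k labelled correctly
        fn = sum(cm[k]) - tp                       # class k stimuli missed
        fp = sum(cm[i][k] for i in range(n)) - tp  # other stimuli labelled as k
        tn = total - tp - fn - fp
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        out.append((sens, spec))
    return out

# toy 3-class matrix (illustrative counts only)
toy = [[8, 2, 0],
       [1, 7, 2],
       [0, 3, 7]]
print(sensitivity_specificity(toy)[0])  # class 0: sensitivity 0.8, specificity 0.95
```

Applying the same computation to each age group of the full matrix yields per-group figures in the style of the table below.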
Multiple quantitative measures for analyzing the performance of human participants in estimating age group of visual face stimuli.
| Analysis of Human Perception in Discriminating Age Groups | Automatic Age Group Estimation |
| Age Group of Face Stimuli | Sensitivity or Accuracy (in %) | Specificity (in %) | Stimulus Entropy (H(S)) (in bits) | | Noise (H(S\|R)) (in bits) | Information Entropy (I(S\|R)) (in bits) | Face++ Accuracy (in %) |
| | 86.12 | 99.52 | 2.7500 | 0.3782 | 0.0612 | 0.3171 | 100 |
| | 78.46 | 97.86 | 2.3971 | 0.3790 | 0.1072 | 0.2718 | 100 |
| | 48.99 | 92.80 | 1.4453 | 0.3798 | 0.1735 | 0.2063 | 60 |
| | 45.46 | 93.00 | 1.3422 | 0.3758 | 0.1978 | 0.1780 | 20 |
| | 56.92 | 94.77 | 1.6776 | 0.3275 | 0.1595 | 0.1680 | 80 |
| | 36.21 | 92.13 | 1.0658 | 0.3132 | 0.2127 | 0.1005 | 20 |
| | 44.30 | 90.36 | 1.3085 | 0.3717 | 0.2208 | 0.1509 | 20 |
| | 47.37 | 89.20 | 1.3981 | 0.1839 | 0.1226 | 0.0613 | 20 |
| | 23.21 | 98.59 | 0.6292 | 0.3608 | 0.2030 | 0.1578 | 0 |
| > | 32.65 | 99.20 | 0.9543 | 0.1347 | 0.0856 | 0.0491 | 0 |
The results show that the age estimation accuracy is highest when the stimuli faces belong to the two youngest age groups.
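The entropy columns come from the standard information-transmission analysis of a stimulus-response confusion matrix: with S the stimulus age group and R the participant's response, the transmitted information is the stimulus entropy minus the noise (conditional) entropy, and in each row of the table the information column is exactly the difference of the two preceding entropy columns (e.g., 0.3790 - 0.1072 = 0.2718). A generic sketch of that decomposition, not the authors' implementation:

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

def transmitted_information(cm):
    """H(S), H(S|R), and I(S;R) = H(S) - H(S|R) from a
    stimulus-response count matrix cm[s][r]."""
    total = sum(sum(row) for row in cm)
    n_s, n_r = len(cm), len(cm[0])
    p_s = [sum(row) / total for row in cm]
    p_r = [sum(cm[s][r] for s in range(n_s)) / total for r in range(n_r)]
    h_s = entropy(p_s)
    h_s_given_r = 0.0
    for r in range(n_r):
        if p_r[r] == 0:
            continue
        cond = [cm[s][r] / (p_r[r] * total) for s in range(n_s)]  # P(S=s | R=r)
        h_s_given_r += p_r[r] * entropy(cond)
    return h_s, h_s_given_r, h_s - h_s_given_r

# a perfectly consistent responder transmits all stimulus information
h_s, noise, info = transmitted_information([[5, 0], [0, 5]])
```

Feeding the human confusion matrix above through this routine reproduces the style of per-group entropy figures reported in the table.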
Effect of participant ethnicity and ethnicity of the presented face stimuli on age estimation accuracy.
| Participant Ethnicity (# of participants) | Face Stimuli Ethnicity (Accuracy in %) ||
| | Indian | Caucasian |
| Indian (366) | 30.88 | |
| Caucasian (81) | 24.69 | |
The results show that irrespective of the participant's ethnicity, the age estimation accuracy is higher for Caucasian face stimuli.
Age group estimation accuracy based on participant's gender and face stimuli's gender.
| Participant Gender (# of participants) | Face Stimuli Gender (Accuracy in %) ||
| | Male | Female |
| Male (201) | 45.00 | |
| Female (281) | 47.06 | |
The results show that, irrespective of the gender of the participants, the accuracy for female stimuli is higher.
Age estimation accuracy based on participant's age and age of the face stimuli across various age groups.
| Participant Age Group (# of participants) | Face Stimuli Age (Accuracy in %) ||||
| | | | | > |
| 0–20 (14) | | 33.33 | 50.00 | 34.78 |
| 21–40 (385) | | 52.85 | 38.48 | 36.00 |
| 41–60 (66) | | 46.48 | 44.44 | 36.95 |
| >60 (17) | | 41.17 | 26.67 | 30.43 |
The results show that participants from all age groups perform best on stimuli faces belonging to the 0–20 age group.
Analyzing the effect of facial regions in predicting the age group of the visual stimuli.
| Stimuli Age Groups | Facial Region (Accuracy in %) |||||
| | T Region | T-Region Masked | Binocular | Eyes Masked | Chin |
| | 50.00 | 50.00 | 95.92 | 76.00 | |
| | 76.00 | 85.42 | 26.00 | 77.08 | 68.75 |
| | 56.09 | 32.00 | 59.09 | 63.26 | 23.68 |
| | 51.02 | 22.22 | 54.00 | 35.42 | 42.00 |
| | 32.00 | 40.81 | 45.83 | 50.00 | 46.81 |
| | 39.13 | 55.55 | 22.22 | 39.13 | 39.02 |
| | 48.84 | 24.49 | 24.49 | 33.33 | 41.67 |
| | 46.81 | 56.01 | 36.36 | 57.14 | 54.00 |
| | 12.50 | 53.48 | 38.09 | 26.83 | 14.58 |
| > | 11.90 | 19.04 | 20.83 | 20.93 | 20.93 |
The results show that the chin region of children belonging to the youngest age groups is the most discriminating for age prediction.
Face recognition accuracy achieved with respect to stimuli age group and type of facial region shown.
| Stimuli Age Groups | Facial Region (Accuracy ± Standard Deviation in %) |||||
| | Full Face | Binocular | T Region | T-Region Masked | Chin |
| | 60.41±1.13 | | 59.37±2.10 | 33.33±0.14 | 50.55±0.13 |
| | | 69.47±2.11 | 76.59±0.91 | 69.38±0.04 | 66.67±2.61 |
| | | 68.89±3.00 | 67.34±1.07 | 65.21±0.02 | 43.75±0.30 |
| | | 54.08±0.45 | 63.33±1.71 | 57.14±1.01 | 59.13±1.26 |
| | 70.83±0.98 | 55.10±0.42 | 72.00±0.36 | 66.30±1.22 | |
The values in bold show which region is most discriminating for recognizing stimuli belonging to a given age group. In general, the whole face yields the highest accuracy, whereas for children and elderly people the binocular and chin regions, respectively, provide the most discriminating features.
Figure 4Proposed human perception based fusion scheme (HPFS) for age prediction and face recognition.
Comparing the face recognition accuracy of the proposed (HPFS) algorithm and existing algorithms using IIIT-Delhi, FG-Net Aging, and MORPH databases.
| Algorithm | Facial Region(s) | Oldest Image as Probe | Youngest Image as Probe | ||||
| IIIT-Delhi | FG-Net | MORPH | IIIT-Delhi | FG-Net | MORPH | ||
| (1) Unimodal | Face | 26.1% | 10.0% | 4.2% | 25.0% | 3.4% | 14.2% |
| (2) Sum Rule | Face, Mouth | 21.1% | 10.3% | 15.2% | 13.9% | 3.4% | 12.3% |
| (3) Sum Rule | Face, Binocular | 25.7% | 17.6% | 11.3% | 19.6% | 6.5% | 12.0% |
| (4) Sum Rule | Periocular | 26.1% | 19.8% | 9.3% | 19.6% | 4.3% | 7.6% |
| (5) Sum Rule | Binocular, Periocular | 26.4% | 20.2% | 9.3% | 20.0% | 5.6% | 7.2% |
| (6) Sum Rule | Face, Periocular | 30.4% | 19.8% | 11.0% | 21.1% | 7.7% | 12.3% |
| (7) Sum Rule | Face, Binocular, Mouth | 26.8% | 14.6% | 13.7% | 18.9% | 5.6% | 15.6% |
| (8) Sum Rule | Mouth, Periocular | 29.0% | 19.8% | 13.5% | 21.8% | 6.5% | 12.8% |
| (9) Sum Rule | Face, Binocular, Periocular | 29.6% | 20.2% | 14.0% | 22.5% | 8.6% | 12.8% |
| (10) Sum Rule | Face, Mouth, Binocular, Periocular | 31.8% | 18.5% | 16.3% | 21.8% | 7.3% | 15.8% |
| (11) Weighted Sum | Face, Mouth, Binocular, Periocular | 32.8% | 20.7% | 22.3% | 22.1% | 9.5% | 17.4% |
| (12) SVM Fusion | Face, Mouth, Binocular, Periocular | 8.6% | 4.2% | 3.6% | 10.0% | 2.3% | 6.5% |
| (13) DM | Overlapping Patches | 52.8% | 62.7% | 21.3% | 30.4% | 28.0% | 14.9% |
| (14) Face++ | Face | 48.2% | 55.7% | 17.2% | 29.4% | 21.6% | 18.5% |
| (15) VeriLook (COTS) | Face | 52.7% | 51.6% | 15.2% | 27.8% | 8.9% | 12.1% |
| (16) Proposed Human Perception based Fusion Scheme (HPFS) with | Face, Binocular, T-Region, Not T-Region, Chin | 45.3% | 51.0% | 26.2% | 39.6% | 31.4% | 29.9% |
| (17) | | | | | | | |
The results of the proposed human perception based estimation and recognition algorithm are shown in bold.
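Rows (2)–(11) are classical match-score fusion schemes: the per-region match scores are normalized and then combined by a plain or weighted sum before ranking gallery identities. A generic sketch of min-max normalization with weighted-sum fusion, using hypothetical scores; the region weights that the paper's HPFS derives from the human study are not reproduced here:

```python
def min_max_norm(scores):
    """Scale a list of match scores to [0, 1]; constant lists collapse to 0."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]

def weighted_sum_fusion(score_lists, weights=None):
    """Fuse per-region score lists (one list per facial region, one entry
    per gallery identity). Equal weights reduce to the sum rule."""
    if weights is None:
        weights = [1.0 / len(score_lists)] * len(score_lists)
    normed = [min_max_norm(s) for s in score_lists]
    return [sum(w * ns[i] for w, ns in zip(weights, normed))
            for i in range(len(score_lists[0]))]

# hypothetical match scores of one probe against 3 gallery identities
face_scores = [0.9, 0.4, 0.1]
binocular_scores = [0.7, 0.8, 0.2]
fused = weighted_sum_fusion([face_scores, binocular_scores])
best_match = max(range(len(fused)), key=lambda i: fused[i])
```

With unequal weights favoring the regions that the human study found most discriminating for a given age group, the same routine sketches the flavor of the weighted-sum baseline in row (11).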