Beichen Zhang, Yue Bao.
Abstract
Age estimation from human faces is an important yet challenging task in computer vision because of the large differences between physical age and apparent age. Because faces differ by race, gender, and other factors, the performance of a learning method for this task depends strongly on the training data. Although many inspiring works have focused on estimating age from a single face image through deep learning, existing methods still perform worse on faces in videos, where differences in head pose between frames can lead to greatly different results. In this paper, a combined system of age estimation and head pose estimation is proposed to improve the performance of age estimation from faces in videos. We use deep regression forests (DRFs) to estimate the age of facial images, and a multiloss convolutional neural network to estimate the head pose. Accordingly, we estimate the age of faces only for head poses within a set degree threshold to enable value refinement. First, we divided the images in the Cross-Age Celebrity Dataset (CACD) and the Asian Face Age Dataset (AFAD) according to the estimated head pose degrees and generated separate age estimates for images with different poses. The experimental results showed that age estimation was more accurate for frontal facial images than for faces at other angles, thus demonstrating the effect of head pose on age estimation. Further experiments were conducted on several videos to estimate the age of the same person with his or her face at different angles, and the results show that the proposed combined system provides more precise and reliable age estimates than a system without head pose estimation.
Keywords: CNN; age estimation; deep learning; head pose estimation
Year: 2022 PMID: 35684792 PMCID: PMC9185429 DOI: 10.3390/s22114171
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. Differences between different people of the same age.
Figure 2. Changes in facial appearance from childhood to adulthood.
Figure 3. Examples from the Cross-Age Celebrity Dataset [15] and the Asian Face Age Dataset [12]. The number below each image is the ground truth age of the subject.
Figure 4. Illustration of deep regression forests.
Figure 5. CNN with combined mean squared error and cross-entropy losses.
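The combined loss named in the Figure 5 caption can be illustrated with a minimal sketch for a single angle (for example, yaw). The record does not give the network's details, so everything specific here is an assumption: the function name `multiloss`, the bin width of 3 degrees, the angle range starting at -99 degrees, and the weight `alpha` are hypothetical choices for illustration only. The idea shown is that the angle is classified into bins (cross-entropy term), while the probability-weighted expected angle is also regressed against the true angle (squared-error term).

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def multiloss(logits, true_angle, bin_width=3.0, min_angle=-99.0, alpha=1.0):
    """Cross-entropy on the binned angle plus alpha * squared error on the
    expected angle. Bin layout and alpha are illustrative assumptions."""
    probs = softmax(logits)
    n_bins = len(logits)
    # Index of the bin containing the true angle, clamped to the valid range.
    true_bin = int((true_angle - min_angle) // bin_width)
    true_bin = min(max(true_bin, 0), n_bins - 1)
    cross_entropy = -math.log(probs[true_bin])
    # Expected angle: probability-weighted sum of the bin centres.
    centres = [min_angle + (i + 0.5) * bin_width for i in range(n_bins)]
    expected = sum(p * c for p, c in zip(probs, centres))
    squared_error = (expected - true_angle) ** 2
    return cross_entropy + alpha * squared_error, expected

# Uninformative prediction over 66 bins versus a prediction peaked on bin 33.
uniform_loss, uniform_angle = multiloss([0.0] * 66, true_angle=0.0)
peaked_logits = [0.0] * 66
peaked_logits[33] = 20.0
peaked_loss, peaked_angle = multiloss(peaked_logits, true_angle=1.5)
```

In a real network the same two terms would be computed per batch for yaw, pitch, and roll; this sketch only shows how the classification and regression parts combine for one sample.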
Mean average error of Euler angles across different methods on the AFLW2000 dataset.
| Methods | Yaw | Pitch | Roll | Average |
|---|---|---|---|---|
| Dlib | 23.153 | 13.633 | 10.545 | 15.777 |
| Fan | 6.358 | 12.277 | 8.714 | 9.116 |
| CPAM | 1.479 | 1.804 | 1.869 | 1.697 |
| Ground truth landmarks | 5.924 | 11.756 | 8.271 | 8.651 |
Figure 6. Examples of nonfrontal facial images.
Performance (MAE) comparison on the frontal and nonfrontal subsets of AFAD [12] and CACD.
| Subset | AFAD | CACD |
|---|---|---|
| Frontal | 3.73 | 4.59 |
| Nonfrontal | 4.97 | 5.65 |
Performance (MAE) and number of images on AFAD [12] and CACD for different head pose thresholds.
| Threshold (degrees) | AFAD MAE | AFAD Number | CACD MAE | CACD Number |
|---|---|---|---|---|
| 50 | 3.97 | 59,173 | 4.87 | 18,023 |
| 40 | 3.84 | 57,232 | 4.70 | 16,842 |
| 30 | 3.73 | 53,983 | 4.59 | 15,145 |
| 20 | 3.72 | 36,748 | 4.58 | 10,398 |
| 10 | 3.71 | 18,753 | 4.58 | 7,569 |
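The threshold table above reports, for each head pose threshold, the MAE of the age estimates and the number of images that pass the threshold. A sweep of that kind could be computed roughly as follows. This is a sketch under assumptions: the helper names `pose_sum` and `mae_by_threshold` are hypothetical, the pose criterion is taken to be the sum of the absolute yaw, pitch, and roll angles, and the toy samples are invented for illustration and are not data from the paper.

```python
def pose_sum(yaw, pitch, roll):
    """Sum of absolute rotation angles in degrees (assumed pose criterion)."""
    return abs(yaw) + abs(pitch) + abs(roll)

def mae_by_threshold(samples, thresholds):
    """samples: list of (yaw, pitch, roll, true_age, predicted_age).
    Returns {threshold: (mae, image_count)} using only images whose
    pose sum is below the threshold."""
    results = {}
    for t in thresholds:
        errors = [
            abs(pred - true)
            for yaw, pitch, roll, true, pred in samples
            if pose_sum(yaw, pitch, roll) < t
        ]
        mae = sum(errors) / len(errors) if errors else None
        results[t] = (mae, len(errors))
    return results

# Toy, made-up samples: pose sums are 3, 18, and 45 degrees.
toy_samples = [
    (2, 1, 0, 30, 31),
    (10, 5, 3, 40, 43),
    (25, 10, 10, 25, 31),
]
sweep = mae_by_threshold(toy_samples, thresholds=[10, 30, 50])
```

A stricter threshold keeps fewer images, which is the trade-off the table shows: below 30 degrees the MAE changes little while the number of usable images drops sharply.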
Figure 7. Examples from the facial video datasets with age and head pose estimates. The numbers represent the predicted age. Green and red indicate that the sum of the head pose rotational angles is less than or greater than 30 degrees, respectively.
Accuracy (MAE) and variance results for comparison with state-of-the-art methods on the Asian and European facial video datasets.
| Method | Asian MAE | Asian Variance | European MAE | European Variance |
|---|---|---|---|---|
| AlexNet | 6.19 | 6.92 | 6.93 | 7.15 |
| DEX | 6.72 | 8.65 | 7.17 | 8.22 |
| DRF | 5.96 | 4.12 | 6.39 | 5.84 |
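To make the two video metrics concrete, the sketch below computes MAE and variance over the per-frame age predictions of one video, with and without gating on head pose. Several points are assumptions not stated in this record: that "variance" means the variance of the per-frame predicted ages, that the gate is the sum of absolute yaw, pitch, and roll below 30 degrees, and the function name `video_age_stats`. The frame values are made up for illustration.

```python
from statistics import mean, pvariance

def video_age_stats(frames, true_age, threshold=30.0):
    """frames: list of (yaw, pitch, roll, predicted_age) for one video.
    Keeps only frames whose summed absolute pose angles are below
    `threshold`, then returns the MAE and population variance of the
    kept predictions, or None if no frame passes the gate."""
    kept = [
        age
        for yaw, pitch, roll, age in frames
        if abs(yaw) + abs(pitch) + abs(roll) < threshold
    ]
    if not kept:
        return None
    errors = [abs(age - true_age) for age in kept]
    return {"frames": len(kept), "mae": mean(errors), "variance": pvariance(kept)}

# Made-up frames: two near-frontal, one strongly rotated with an outlying estimate.
toy_frames = [(0, 0, 0, 30), (5, 5, 5, 32), (40, 10, 5, 40)]
gated = video_age_stats(toy_frames, true_age=31)
ungated = video_age_stats(toy_frames, true_age=31, threshold=float("inf"))
```

In this toy case, dropping the rotated frame lowers both the error and the spread of the estimates, which is the behaviour the combined system is reported to achieve.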