Shuang Li, Xiaoli Dong, Yuan Shi, Baoli Lu, Linjun Sun, Wenfa Li.
Abstract
Head pose classification is an important preprocessing step in face recognition, and it can also be used on its own in applications that involve multiple viewing angles. However, because of the COVID-19 pandemic, more and more people wear masks to protect themselves, and the masks cover most of the face. This greatly degrades the performance of head pose classification. This article therefore proposes a method for classifying the head pose of people wearing masks. The method focuses on the information that is helpful for head pose classification. First, the H-channel image of the HSV color space is extracted through color space conversion. Then a line portrait is used to extract the contour lines of the face, and convolutional neural networks are trained to extract features in combination with the grayscale image. Finally, stacked generalization is used to fuse the outputs of the three classifiers to obtain the final classification result. On the MAFA dataset, the method reaches an accuracy of 94.14% on the front class, 86.58% on the more side class, and 90.93% on the side class, which is better than the current advanced algorithms it is compared with.
Keywords: color space conversion; head pose classification; line portrait; stacked generalization
Year: 2021 PMID: 34230817 PMCID: PMC8250277 DOI: 10.1002/cpe.6331
Source DB: PubMed Journal: Concurr Comput ISSN: 1532-0626 Impact factor: 1.831
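The color space conversion mentioned in the abstract can be illustrated with a minimal sketch. It uses only Python's standard `colorsys` module. The function name `extract_h_channel`, the nested-list image layout, and the choice to express hue in degrees are assumptions made for illustration, not details taken from the article (image libraries such as OpenCV store 8-bit hue on a 0 to 179 scale instead).

```python
import colorsys


def extract_h_channel(rgb_image):
    """Return the hue (H) channel of an RGB image, in degrees [0, 360).

    `rgb_image` is a list of rows; each row is a list of (r, g, b)
    tuples with 8-bit components (0-255). This layout is a hypothetical
    choice for the example, not something specified by the article.
    """
    h_image = []
    for row in rgb_image:
        h_row = []
        for r, g, b in row:
            # colorsys expects components in [0, 1] and returns hue in [0, 1).
            h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            h_row.append(round(h * 360.0) % 360)
        h_image.append(h_row)
    return h_image


# A 2x2 toy image: red, green, blue, and a neutral gray pixel.
toy_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (128, 128, 128)],
]
h_channel = extract_h_channel(toy_image)
```

Hue is undefined for gray pixels (zero saturation); `colorsys` reports it as 0, so the gray pixel and the red pixel share the same H value in this toy image.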
FIGURE 1 Head pose estimation can be regarded as a rigid transformation. The rotation of the head can be represented by three Euler angles: (A) yaw; (B) pitch; (C) roll
FIGURE 2 RGB three‐dimensional space model
FIGURE 3 HSV inverted cone model
FIGURE 4 (A) RGB face images in different poses; (B) grayscale images; (C) B channel of RGB color space images; (D) H channel of HSV color space images
FIGURE 5 (A) RGB face image; (B) H channel image; (C) processed image
FIGURE 6 RGB image and line portrait
FIGURE 7 (A) RGB face image; (B) H channel image and line portrait pixel fusion; (C) input image
FIGURE 8 Stacked generalization flowchart
FIGURE 9 Algorithm flowchart
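The fusion step, in which stacked generalization combines the outputs of three classifiers, can be sketched roughly as follows. In stacking, a meta-learner is trained on the base classifiers' predictions for held-out samples and then maps new base predictions to a final label. The article record does not state which meta-learner is used, so this sketch substitutes a simple nearest-centroid rule as a stand-in. All function names and all probability values below are made up for illustration.

```python
def stack_features(base_outputs):
    """Concatenate each sample's probability vectors across base classifiers.

    `base_outputs` holds one entry per base classifier; each entry is a
    list with one class-probability vector per sample.
    """
    return [
        [p for vector in per_sample for p in vector]
        for per_sample in zip(*base_outputs)
    ]


def fit_centroids(features, labels):
    """Meta-learner training: mean stacked feature vector per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Return the class whose centroid is closest to `x` (squared distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(x, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Hypothetical held-out predictions of three base classifiers over the
# classes (front, more side, side); the numbers are invented.
clf_a = [[0.8, 0.1, 0.1], [0.2, 0.6, 0.2], [0.1, 0.2, 0.7]]
clf_b = [[0.7, 0.2, 0.1], [0.3, 0.5, 0.2], [0.1, 0.3, 0.6]]
clf_c = [[0.9, 0.05, 0.05], [0.1, 0.7, 0.2], [0.2, 0.2, 0.6]]
labels = ["front", "more side", "side"]

centroids = fit_centroids(stack_features([clf_a, clf_b, clf_c]), labels)

# Base-classifier outputs for one new sample, fused into a final label.
new_sample = stack_features(
    [[[0.4, 0.5, 0.1]], [[0.2, 0.6, 0.2]], [[0.3, 0.5, 0.2]]]
)[0]
fused_label = predict(centroids, new_sample)
```

Training the meta-learner on held-out rather than training-set predictions matters in practice, because base classifiers tend to be overconfident on samples they were fitted on.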
Comparison of different algorithms on the MAFA dataset
| Method | Front accuracy | More side accuracy | Side accuracy |
|---|---|---|---|
| Line | 92.67% | 83.40% | 88.53% |
| FSA‐Net | 74.97% | 76.64% | 71.20% |
| EPnP‐LAB | 90.04% | 68.92% | 50.13% |
| Hope‐Net | 72.16% | 76.53% | 73.87% |
| Ours | 94.14% | 86.58% | 90.93% |
Note: The performance of each algorithm is shown by its accuracy on the three classes.
Ablation study of different aggregation methods: front, more side, and side accuracy on the MAFA dataset
| Method | Front accuracy | More side accuracy | Side accuracy |
|---|---|---|---|
| Optimized AlexNet | 93.39% | 87.00% | 90.40% |
| Not use H‐channel | 91.04% | 84.36% | 88.53% |
| Not use line portraits | 90.87% | 86.79% | 89.33% |
| Line | 92.67% | 83.40% | 88.53% |
| Ours | 94.14% | 86.58% | 90.93% |