Qingqing Xu, Zhiyu Zhu, Huilin Ge, Zheqing Zhang, Xu Zang.
Abstract
The application of face detection and recognition technology in security monitoring systems has made a huge contribution to public security. Face detection is an essential first step in many face analysis systems. In complex scenes, the accuracy of face detection is limited by missed and false detections of small faces, which arise from image quality, face scale, lighting, and other factors. In this paper, a two-level face detection model called SR-YOLOv5 is proposed to address the problem of dense small faces in real scenarios. The research first optimized the backbone and loss function of YOLOv5, with the aim of achieving better performance in terms of mean average precision (mAP) and speed. Then, to improve face detection in blurred scenes or low-resolution situations, we integrated image super-resolution technology on the detection head. In addition, some representative deep-learning-based face detection algorithms are discussed by grouping them into a few major categories, and the popular face detection benchmarks are enumerated in detail. Finally, the WIDER FACE dataset is used to train and test the SR-YOLOv5 model. Compared with the multitask convolutional neural network (MTCNN), Contextual Multi-Scale Region-based CNN (CMS-RCNN), Finding Tiny Faces (HR), Single Shot Scale-invariant Face Detector (S3FD), and TinaFace algorithms, the proposed model is verified to have higher detection precision: 0.7%, 0.6%, and 2.9% higher than TinaFace on the easy, medium, and hard subsets, respectively. SR-YOLOv5 can effectively use face information to accurately detect hard-to-detect face targets in complex scenes.
Year: 2021 PMID: 34824599 PMCID: PMC8610656 DOI: 10.1155/2021/7748350
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
Figure 1: SRGAN network model.
Figure 2: Structures of YOLOv5s.
Results of anchor boxes of the training set.
| Feature map | Size | Anchor |
|---|---|---|
| Predict one | 13 × 13 | (43, 59) (76, 89) (178, 234) |
| Predict two | 26 × 26 | (30, 36) (22, 26) (15, 18) |
| Predict three | 52 × 52 | (10, 12) (7, 9) (5, 6) |
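
To illustrate how the anchors in the table relate to face size, the following is a minimal sketch that picks the anchor whose shape best overlaps a given face box. It assumes a shape-only IoU (both boxes aligned at a common corner) as the matching criterion; this is an illustrative choice, not necessarily the rule used by the authors or by YOLOv5 itself. The names `ANCHORS`, `wh_iou`, and `best_anchor` are hypothetical.

```python
# Anchors (width, height) in pixels per feature map, copied from the table above.
ANCHORS = {
    "13x13": [(43, 59), (76, 89), (178, 234)],
    "26x26": [(30, 36), (22, 26), (15, 18)],
    "52x52": [(10, 12), (7, 9), (5, 6)],
}


def wh_iou(box, anchor):
    """IoU of two boxes compared by width and height only (shared corner)."""
    bw, bh = box
    aw, ah = anchor
    inter = min(bw, aw) * min(bh, ah)
    union = bw * bh + aw * ah - inter
    return inter / union


def best_anchor(box):
    """Return (feature_map, anchor) with the highest shape IoU for `box`."""
    candidates = (
        (fmap, anchor)
        for fmap, anchors in ANCHORS.items()
        for anchor in anchors
    )
    return max(candidates, key=lambda item: wh_iou(box, item[1]))


if __name__ == "__main__":
    for face in [(8, 10), (25, 30), (80, 90)]:
        print(face, "->", best_anchor(face))
```

Under this assumed criterion, a small 8 × 10 face falls to the 52 × 52 map and an 80 × 90 face to the 13 × 13 map, which matches the usual pattern of finer feature maps handling smaller targets.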
Figure 3: The architecture of improved SR-YOLOv5.
Available datasets.
| Datasets | Pictures | Faces |
|---|---|---|
| WIDER FACE | 32203 | 393703 |
| AFW | 205 | 473 |
| FDDB | 2845 | 5171 |
| PASCAL Face | 851 | 1341 |
| IJB-A | 24327 | 49759 |
| MALF | 5250 | 11931 |
Experimental environment configuration.
| Experimental environment | Configuration |
|---|---|
| Operating system | Linux 64 |
| GPU | TITAN Xp |
| CPU | Intel(R) Core i7-3770 CPU @ |
| Deep learning framework | PyTorch |
Figure 4: Precision-recall (PR) curves of our SR-YOLOv5 detector.
Figure 5: Part of the test results.
Performance comparison using different models.
| Model | Backbone | AP50 | Time (ms) |
|---|---|---|---|
| HR | ResNet-101 | 57.5% | 198 |
| YOLOv3 | Darknet-53 | 57.9% | 51 |
| Ours | YOLOv5s-SRGAN | 59.8% | 75 |
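
The timing column can be read as throughput. The sketch below converts the reported times to frames per second, assuming each value is a per-image inference latency in milliseconds (the table does not state this explicitly). The names `LATENCY_MS` and `fps` are hypothetical.

```python
# Reported time per model in milliseconds, copied from the table above.
LATENCY_MS = {"HR": 198, "YOLOv3": 51, "Ours": 75}


def fps(latency_ms):
    """Frames per second for a given per-image latency in milliseconds."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    for model, ms in LATENCY_MS.items():
        print(f"{model}: {ms} ms -> {fps(ms):.1f} FPS")
```

On this reading, the proposed model trades some speed relative to YOLOv3 for a higher AP50 while remaining well ahead of HR.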
Comparison of mAP using different face detection algorithms.
| Face detection algorithms | Easy | Medium | Hard |
|---|---|---|---|
| MTCNN | 85.1% | 82.0% | 62.9% |
| CMS-RCNN | 90.2% | 87.4% | 64.3% |
| HR | 92.5% | 91.0% | 81.9% |
| S3FD | 93.7% | 92.5% | 85.9% |
| TinaFace | 95.6% | 94.3% | 85.3% |
| Ours | 96.3% | 94.9% | 88.2% |
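
As a quick check on the comparison, the sketch below computes, for each subset, the margin of the proposed model over the best-scoring baseline in the table. The numbers are copied from the table; the names `RESULTS` and `margin_over_best_baseline` are hypothetical.

```python
# mAP (%) per subset, copied from the table above.
RESULTS = {
    "MTCNN": {"easy": 85.1, "medium": 82.0, "hard": 62.9},
    "CMS-RCNN": {"easy": 90.2, "medium": 87.4, "hard": 64.3},
    "HR": {"easy": 92.5, "medium": 91.0, "hard": 81.9},
    "S3FD": {"easy": 93.7, "medium": 92.5, "hard": 85.9},
    "TinaFace": {"easy": 95.6, "medium": 94.3, "hard": 85.3},
    "Ours": {"easy": 96.3, "medium": 94.9, "hard": 88.2},
}


def margin_over_best_baseline(results, ours="Ours"):
    """Map each subset to (best baseline name, margin in percentage points)."""
    margins = {}
    for subset in results[ours]:
        name, score = max(
            ((m, s[subset]) for m, s in results.items() if m != ours),
            key=lambda pair: pair[1],
        )
        margins[subset] = (name, round(results[ours][subset] - score, 1))
    return margins


if __name__ == "__main__":
    for subset, (name, gain) in margin_over_best_baseline(RESULTS).items():
        print(f"{subset}: +{gain} over {name}")
```

Computed this way, the best baseline on the hard subset is S3FD rather than TinaFace, so the margin there is 2.3 points over S3FD and 2.9 points over TinaFace.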