Saurav Kumar, Drishti Yadav, Himanshu Gupta, Mohit Kumar, Om Prakash Verma.
Abstract
The COVID-19 pandemic has led to widespread use of face masks in communal settings, and mask-wearing has been made mandatory in public areas to prevent transmission of the virus. Because masks are now worn in communities and across workplaces, effective surveillance appears essential: several security analyses indicate that face masks may be used to hide a person's identity. This work therefore proposes a framework for a smart surveillance system, developed in the aftermath of COVID-19, that recognizes individuals behind a face mask. A transfer learning approach is employed to train the YOLOv3 algorithm on a custom dataset in the Darknet neural network framework. To demonstrate the competence of YOLOv3, a comparative analysis with YOLOv3-tiny is presented. The simulated results verify the robustness of YOLOv3 in recognizing individuals behind a face mask. YOLOv3 achieves a mAP of 98.73% on the custom dataset, outperforming YOLOv3-tiny (61.04% mAP) by approximately 62% in relative terms. YOLOv3 also provides adequate speed and accuracy on small faces.
Keywords: COVID-19; Convolutional neural network; Deep neural networks; Facemask detection; Object detection; Surveillance system; YOLOv3 algorithm
Year: 2022 PMID: 35968407 PMCID: PMC9362536 DOI: 10.1007/s11042-021-11560-1
Source DB: PubMed Journal: Multimed Tools Appl ISSN: 1380-7501 Impact factor: 2.577
Novel approaches for face and face mask detection
| Methods | Reported work | Performance | Highlights |
|---|---|---|---|
| CNN based | Ge et al | Average precision: 74.6% | Face occlusion detection techniques using LLE-CNN |
| | Bu et al | Accuracy: 86.6%, Recall: 87.8% | Cascade framework of CNN for classifying masked face |
| | Qin et al | Accuracy: 98.70% | Combination of image super-resolution network and classification network to identify facemask wearing conditions |
| | Inamdar et al | Accuracy: 98.60% | CNN (with 8 layers) is used to detect faces, extract ROI and detect facemask |
| | Chavda et al | Precision: 98.28%, Recall: 100%, F1-score: 99.13% | Two-stage detection algorithm: first face detection by RetinaFace, then facemask detection by NASNetMobile |
| | Yadav | Precision: 91.7% | CNNs, FPN, and a context attention module for facemask detection |
| | Militante et al | Accuracy: 96% | VGG16 is used to detect facemask |
| | Khandelwal et al | Accuracy: 97.6% | Detect social distancing and mask using MobileNetV2 |
| | Loey et al | Average precision: 81% | Facemask classification algorithm using YOLOv2 and ResNet-50 |
| Hybrid | Nieto-Rodríguez et al | Recall: > 95%, False positive rate: < 5% | Uses a colour filter for classifying face and facemask using skin texture in HSV colour space |
| | Vinitha et al | Not specified | Real-time facemask detection using MobileNetV2 and PyTorch |
Fig. 1 Speed and accuracy tradeoff on the mAP at 0.5 IoU metric [47]
Fig. 2 Samples from the considered dataset
Fig. 3 Some sample images showing augmentation techniques
Fig. 4 Schematic of YOLOv3 algorithm
Fig. 5 YOLOv3 architecture
Fig. 6 Bounding box prediction
Fig. 7 Bounding box for location prediction
Fig. 8 Illustration of anchor boxes
Fig. 9 Non-max suppression for filtering multiple detections
Fig. 10 The architecture of YOLOv3-tiny
Fig. 11 The training and detection phases
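The captions above refer to the IoU metric and to non-max suppression for filtering multiple detections. As a rough illustration of those two ideas, the following is a minimal pure-Python sketch of intersection over union and greedy non-max suppression. It is not the Darknet implementation used in the paper: the corner-format boxes `(x1, y1, x2, y2)`, the function names, and the `iou_threshold=0.5` default are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # assumed value, not taken from the paper
) -> List[int]:
    """Greedy NMS: return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Hypothetical detections: two overlapping boxes and one separate box.
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    print(non_max_suppression(demo_boxes, demo_scores))
```

In the hypothetical demo, the second box overlaps the first by more than the threshold and is discarded, while the third box does not overlap either and is kept.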
Comparative study of training results of YOLOv3 and YOLOv3-tiny algorithms
| Iterations | YOLO version | mAP (%) | Average IoU (%) | Precision | Recall | F1 score |
|---|---|---|---|---|---|---|
| 1000 | v3 | 75.04 | 55.10 | 0.76 | 0.29 | 0.42 |
| | v3-tiny | 12.67 | 17.27 | 0.27 | 0.19 | 0.22 |
| 2000 | v3 | 97.03 | 74.70 | 0.95 | 0.80 | 0.87 |
| | v3-tiny | 31.54 | 27.00 | 0.40 | 0.38 | 0.39 |
| 3000 | v3 | 88.58 | 61.07 | 0.85 | 0.90 | 0.87 |
| | v3-tiny | 34.35 | 27.40 | 0.40 | 0.50 | 0.45 |
| 4000 | v3 | 96.92 | 78.07 | 0.97 | 0.88 | 0.92 |
| | v3-tiny | 53.02 | 55.05 | 0.75 | 0.36 | 0.49 |
| 5000 | v3 | 94.35 | 75.80 | 0.96 | 0.75 | 0.85 |
| | v3-tiny | 60.37 | 57.27 | 0.77 | 0.45 | 0.57 |
| 6000 | v3 | 97.74 | 71.53 | 0.94 | 0.97 | 0.95 |
| | v3-tiny | 60.12 | 58.81 | 0.78 | 0.43 | 0.56 |
| 7000 | v3 | 97.77 | 78.84 | 0.97 | 0.94 | 0.95 |
| | v3-tiny | 58.64 | 50.54 | 0.69 | 0.53 | 0.60 |
| 8000 | v3 | 98.73 | 72.84 | 0.95 | 0.97 | 0.96 |
| | v3-tiny | 61.04 | 54.95 | 0.74 | 0.51 | 0.61 |
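The F1 score column in this table is the harmonic mean of precision and recall, and the abstract's "approximately 62%" figure appears consistent with a relative gain in mAP at 8000 iterations. The short sketch below recomputes both from the tabulated values. The function names are hypothetical, and because the tabulated precision and recall are already rounded to two decimals, a recomputed F1 can differ from the reported one in the last digit for some rows.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, as a fraction."""
    return (new - baseline) / baseline


if __name__ == "__main__":
    # (iterations, model, precision, recall, reported F1) copied from the table.
    rows = [
        (1000, "v3", 0.76, 0.29, 0.42),
        (1000, "v3-tiny", 0.27, 0.19, 0.22),
        (8000, "v3", 0.95, 0.97, 0.96),
    ]
    for iterations, model, p, r, reported in rows:
        print(iterations, model, round(f1_score(p, r), 2), "reported:", reported)

    # mAP at 8000 iterations: 98.73% (YOLOv3) versus 61.04% (YOLOv3-tiny).
    print(f"relative mAP gain: {relative_gain(98.73, 61.04):.1%}")
```

Note that the absolute gap between the two mAP values is about 37.7 percentage points; the relative reading is an interpretation of the abstract's wording, not something the source states explicitly.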
Fig. 12 Trends of loss function and mAP in the training for (a) YOLOv3 (b) YOLOv3-tiny
Fig. 13 mAP and IoU trends over the training phase for YOLOv3 algorithm
Fig. 14 Snapshots taken during testing phase
Fig. 15 Experimental results on some sample images: (a) original image, (b) detection results by YOLOv3, (c) detection results by YOLOv3-tiny
Quantitative comparison of YOLOv3 and YOLOv3-tiny algorithms
| Test image | Condition | Name | YOLOv3 detection accuracy (%) | YOLOv3-tiny detection accuracy (%) |
|---|---|---|---|---|
| 1 | Without mask | Saurav (M.Tech) | 100 | 66 |
| | With mask | Saurav (M.Tech) | 99 | 83 |
| 2 | Without mask | Drishti (M.Tech) | 100 | 98 |
| | With mask | Drishti (M.Tech) | 99 | No detection |
| 3 | Without mask | Dr. O.P. Verma (Faculty) | 100 | 96 |
| | With mask | Dr. O.P. Verma (Faculty) | 92 | 53 |
| 4 | Without mask | Himanshu (PhD) | 99 | Wrong detection |
| | With mask | Himanshu (PhD) | 98 | Wrong detection |
| 5 | Without mask | Anshika (M.Tech) | 97 | 89 |
| | | Saurav (M.Tech) | 98 | 85 |
| | With mask | Anshika (M.Tech) | 99 | Wrong detection |
| 6 | Without mask | Prerna (M.Tech) | 94 | 85 |
| | With mask | Prerna (M.Tech) | 88 | 32 |
Quantitative comparison of detection time
| Test image | Condition | Image size (pixels) | Platform 1 (a): YOLOv3 (ms) | Platform 1 (a): YOLOv3-tiny (ms) | Platform 2 (b): YOLOv3 (ms) | Platform 2 (b): YOLOv3-tiny (ms) |
|---|---|---|---|---|---|---|
| 1 | Without mask | 744 | 224.14 | 51.51 | 4381.05 | 366.90 |
| | With mask | 2316 | 261.36 | 51.84 | 3890.63 | 323.59 |
| 2 | Without mask | 3017 | 232.29 | 51.86 | 3988.60 | 311.72 |
| | With mask | 960 | 234.75 | 51.85 | 4732.87 | 312.70 |
| 3 | Without mask | 4623 | 261.25 | 52.31 | 4554.85 | 373.05 |
| | With mask | 960 | 246.07 | 51.79 | 4055.01 | 368.66 |
| 4 | Without mask | 4032 | 240.91 | 51.89 | 4082.82 | 389.19 |
| | With mask | 960 | 236.36 | 51.80 | 4435.33 | 307.05 |
| 5 | Without mask | 3017 | 257.73 | 51.89 | 4359.05 | 323.80 |
| | With mask | 960 | 229.57 | 51.75 | 4407.87 | 368.54 |
| 6 | Without mask | 3017 | 228.22 | 51.90 | 3903.71 | 320.87 |
| | With mask | 967 | 228.88 | 51.41 | 4170.70 | 317.90 |

(a) Platform 1: the experimental platform specified in Sect. 4
(b) Platform 2: Intel(R) Core(TM) i5-4200U CPU @ 1.60 GHz 2.30 GHz
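To summarize the per-image prediction times, one option is to average each column and compare the two models on each platform. The sketch below does this with the values copied from the table. The variable and function names are hypothetical, and the Test image 4 with-mask YOLOv3 entry on Platform 1 is taken as 236.36 ms, which is an assumption about the intended value.

```python
from statistics import mean

# Prediction times in milliseconds, copied from the table in row order.
# Assumption: the Test image 4 "with mask" YOLOv3 entry on Platform 1 is 236.36.
times_ms = {
    ("platform1", "yolov3"): [224.14, 261.36, 232.29, 234.75, 261.25, 246.07,
                              240.91, 236.36, 257.73, 229.57, 228.22, 228.88],
    ("platform1", "yolov3-tiny"): [51.51, 51.84, 51.86, 51.85, 52.31, 51.79,
                                   51.89, 51.80, 51.89, 51.75, 51.90, 51.41],
    ("platform2", "yolov3"): [4381.05, 3890.63, 3988.60, 4732.87, 4554.85, 4055.01,
                              4082.82, 4435.33, 4359.05, 4407.87, 3903.71, 4170.70],
    ("platform2", "yolov3-tiny"): [366.90, 323.59, 311.72, 312.70, 373.05, 368.66,
                                   389.19, 307.05, 323.80, 368.54, 320.87, 317.90],
}


def tiny_speedup(platform: str) -> float:
    """Ratio of mean YOLOv3 time to mean YOLOv3-tiny time on one platform."""
    full = mean(times_ms[(platform, "yolov3")])
    tiny = mean(times_ms[(platform, "yolov3-tiny")])
    return full / tiny


if __name__ == "__main__":
    for platform in ("platform1", "platform2"):
        full = mean(times_ms[(platform, "yolov3")])
        tiny = mean(times_ms[(platform, "yolov3-tiny")])
        print(f"{platform}: YOLOv3 {full:.2f} ms, YOLOv3-tiny {tiny:.2f} ms, "
              f"tiny is {tiny_speedup(platform):.1f}x faster")
```

Read together with the detection accuracy comparison, these averages illustrate the speed and accuracy tradeoff between the two models: YOLOv3-tiny predicts faster on both platforms but with lower detection accuracy.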