Akhil Kumar, Arvind Kalia, Aayushi Kalia.
Abstract
Over the last two years, researchers have proposed several deep learning-based methods for face mask detection. However, most of these methods struggle because face masks are often too small to detect reliably, and they consequently achieve low detection accuracy. To address these issues, we propose ETL-YOLO v4, which modifies and improves the feature extraction and prediction network of tiny YOLO v4 and surpasses its predecessors and other related work in the literature. To develop ETL-YOLO v4, we improved the backbone architecture of tiny YOLO v4 by adding a modified dense SPP network and two additional detection layers with modified and optimized CNN layers that aid accurate prediction, used Mish as the activation function, and used modified anchor boxes. Furthermore, to obtain detection results in images of varied viewpoints, we added Mosaic and CutMix data augmentation at training time. Compared with its original baseline variant, the proposed ETL-YOLO v4 achieved 9.93 percentage points higher mAP, 5.75 percentage points higher average precision (AP) for faces with masks, and 16.6 percentage points higher AP for the face mask region.
Keywords: COVID-19; Face mask detection; Mish activation; SPP network; Tiny YOLO v4
Year: 2022 PMID: 35411120 PMCID: PMC8986544 DOI: 10.1016/j.ijleo.2022.169051
Source DB: PubMed Journal: Optik (Stuttg) ISSN: 0030-4026 Impact factor: 2.840
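
The abstract names Mish as the activation function. As a minimal sketch of the standard definition, Mish(x) = x · tanh(softplus(x)), the scalar version below may help; the helper names `softplus` and `mish` are illustrative and are not taken from the authors' code.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    # Mish is smooth, close to the identity for large positive inputs,
    # and slightly negative (but bounded) for negative inputs.
    for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"mish({value:+.1f}) = {mish(value):+.4f}")
```

Unlike ReLU, this function lets small negative values pass through, which is the property usually cited as the reason for choosing it.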
Network architecture of tiny YOLO v4 algorithm.
| Type | Filters | Size/Stride | Output |
|---|---|---|---|
| Convolutional | 32 | 3 × 3/2 | 208 × 208 × 32 |
| Convolutional | 64 | 3 × 3/2 | 104 × 104 × 64 |
| Convolutional | 64 | 3 × 3 | 104 × 104 × 64 |
| Route | | | 104 × 104 × 32 |
| Convolutional | 32 | 3 × 3 | 104 × 104 × 32 |
| Convolutional | 32 | 3 × 3 | 104 × 104 × 32 |
| Route | | | 104 × 104 × 64 |
| Convolutional | 64 | 1 × 1 | 104 × 104 × 64 |
| Route | | | 104 × 104 × 128 |
| Maxpool | | 2 × 2/2 | 52 × 52 × 128 |
| Convolutional | 128 | 3 × 3 | 52 × 52 × 128 |
| Route | | | 52 × 52 × 64 |
| Convolutional | 64 | 3 × 3 | 52 × 52 × 64 |
| Convolutional | 64 | 3 × 3 | 52 × 52 × 64 |
| Route | | | 52 × 52 × 128 |
| Convolutional | 128 | 1 × 1 | 52 × 52 × 128 |
| Route | | | 52 × 52 × 256 |
| Maxpool | | 2 × 2/2 | 26 × 26 × 256 |
| Convolutional | 256 | 3 × 3 | 26 × 26 × 256 |
| Route | | | 26 × 26 × 128 |
| Convolutional | 128 | 3 × 3 | 26 × 26 × 128 |
| Convolutional | 128 | 3 × 3 | 26 × 26 × 128 |
| Route | | | 26 × 26 × 256 |
| Convolutional | 256 | 1 × 1 | 26 × 26 × 256 |
| Route | | | 26 × 26 × 512 |
| Maxpool | | 2 × 2/2 | 13 × 13 × 512 |
| Convolutional | 512 | 3 × 3 | 13 × 13 × 512 |
| Convolutional | 256 | 1 × 1 | 13 × 13 × 256 |
| Convolutional | 512 | 3 × 3 | 13 × 13 × 512 |
| Convolutional | 27 | 1 × 1 | 13 × 13 × 27 |
| YOLO | | | |
| Route | | | 13 × 13 × 256 |
| Convolutional | 128 | 1 × 1 | 13 × 13 × 128 |
| Upsample | | 2 | 26 × 26 × 128 |
| Route | | | 26 × 26 × 384 |
| Convolutional | 256 | 3 × 3 | 26 × 26 × 256 |
| Convolutional | 27 | 1 × 1 | 26 × 26 × 27 |
| YOLO | | | |
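
The shapes in the tiny YOLO v4 table can be checked with a little arithmetic. The sketch below makes three assumptions that the record does not state explicitly: a 416 × 416 input (inferred from the 208 × 208 output of the first stride-2 layer), three anchor boxes per detection scale (the usual Darknet convention), and four classes (the classes listed in the performance comparison). The function and variable names are illustrative.

```python
def head_filters(num_classes: int, anchors_per_scale: int = 3) -> int:
    """Filters in the 1 x 1 convolution feeding a YOLO layer.

    Each anchor predicts 4 box offsets, 1 objectness score and one
    score per class.
    """
    return anchors_per_scale * (num_classes + 5)


def spatial_sizes(input_size: int, num_stride2_stages: int) -> list:
    """Feature-map side length after each stride-2 stage."""
    sizes = [input_size]
    for _ in range(num_stride2_stages):
        sizes.append(sizes[-1] // 2)
    return sizes


INPUT_SIZE = 416   # assumption: inferred from the 208 x 208 first output
NUM_CLASSES = 4    # assumption: with mask, without mask, mask incorrectly, mask area

# Two stride-2 convolutions followed by three 2 x 2/2 max-pool layers.
sizes = spatial_sizes(INPUT_SIZE, 5)

# The second head concatenates the upsampled 26 x 26 x 128 map with a
# 26 x 26 x 256 backbone map, giving the 384-channel Route output.
route_channels = 128 + 256

if __name__ == "__main__":
    print(sizes)                      # [416, 208, 104, 52, 26, 13]
    print(head_filters(NUM_CLASSES))  # 27
    print(route_channels)             # 384
```

Under these assumptions the 27 filters before each YOLO layer and the 13 × 13 and 26 × 26 detection grids in the table are reproduced.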
Fig. 1. Working module of proposed ETL-YOLO v4.
Network architecture of proposed ETL-YOLO v4.
| Type | Filters | Size/Stride | Output |
|---|---|---|---|
| Convolutional | 32 | 3 × 3/2 | 208 × 208 × 32 |
| Convolutional | 64 | 3 × 3/2 | 104 × 104 × 64 |
| Convolutional | 64 | 3 × 3 | 104 × 104 × 64 |
| Route | | | 104 × 104 × 32 |
| Convolutional | 32 | 3 × 3 | 104 × 104 × 32 |
| Convolutional | 32 | 3 × 3 | 104 × 104 × 32 |
| Route | | | 104 × 104 × 64 |
| Convolutional | 64 | 1 × 1 | 104 × 104 × 64 |
| Route | | | 104 × 104 × 128 |
| Maxpool | | 2 × 2/2 | 52 × 52 × 128 |
| Convolutional | 128 | 3 × 3 | 52 × 52 × 128 |
| Route | | | 52 × 52 × 64 |
| Convolutional | 64 | 3 × 3 | 52 × 52 × 64 |
| Convolutional | 64 | 3 × 3 | 52 × 52 × 64 |
| Route | | | 52 × 52 × 128 |
| Convolutional | 128 | 1 × 1 | 52 × 52 × 128 |
| Route | | | 52 × 52 × 256 |
| Maxpool | | 2 × 2/2 | 26 × 26 × 256 |
| Convolutional | 256 | 3 × 3 | 26 × 26 × 256 |
| Route | | | 26 × 26 × 128 |
| Convolutional | 128 | 3 × 3 | 26 × 26 × 128 |
| Convolutional | 128 | 3 × 3 | 26 × 26 × 128 |
| Route | | | 26 × 26 × 256 |
| Convolutional | 256 | 1 × 1 | 26 × 26 × 256 |
| Route | | | 26 × 26 × 512 |
| Maxpool | | 2 × 2/2 | 13 × 13 × 512 |
| Convolutional | 512 | 3 × 3 | 13 × 13 × 512 |
| Convolutional | 256 | 1 × 1 | 13 × 13 × 256 |
| Convolutional | 512 | 3 × 3 | 13 × 13 × 512 |
| Convolutional | 27 | 1 × 1 | 13 × 13 × 27 |
| YOLO | | | |
Fig. 2. Dense SPP network.
Fig. 3. Mosaic and CutMix augmented images utilized at training time. (a) Sample images for Mosaic augmentation. (b) Sample images for CutMix augmentation.
Fig. 4. Dataset images with persons wearing and not wearing face masks.
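
The CutMix augmentation illustrated in Fig. 3 can be sketched as follows. This is a minimal, pure-Python illustration of the commonly described recipe (paste a random rectangle from one image into another, with the rectangle's side ratio set to sqrt(1 − λ)); it is not the authors' training code. Images are plain nested lists, the function name `cutmix` is hypothetical, and the handling of bounding-box labels that a detector would also need is omitted.

```python
import random


def cutmix(img_a, img_b, rng, lam=None):
    """Return a copy of img_a with a random rectangle taken from img_b.

    img_a and img_b are equally sized 2-D lists (rows of pixel values).
    lam is the fraction of img_a to keep; if omitted it is drawn
    uniformly from [0, 1). The pasted box (x0, y0, x1, y1) is returned
    so that labels can be adjusted by the caller.
    """
    height, width = len(img_a), len(img_a[0])
    if lam is None:
        lam = rng.random()
    ratio = (1.0 - lam) ** 0.5
    cut_h, cut_w = int(height * ratio), int(width * ratio)

    # Random box centre, with the box clipped to the image borders.
    cy, cx = rng.randrange(height), rng.randrange(width)
    y0, y1 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    x0, x1 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)

    mixed = [row[:] for row in img_a]  # leave img_a untouched
    for y in range(y0, y1):
        mixed[y][x0:x1] = img_b[y][x0:x1]
    return mixed, (x0, y0, x1, y1)


if __name__ == "__main__":
    rng = random.Random(0)
    zeros = [[0] * 8 for _ in range(8)]
    ones = [[1] * 8 for _ in range(8)]
    mixed, box = cutmix(zeros, ones, rng, lam=0.5)
    for row in mixed:
        print("".join(str(v) for v in row))
    print("pasted box:", box)
```

In a detection setting, boxes from the second image that fall inside the pasted rectangle would also have to be clipped and carried over; how the authors handle this is not described in this record.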
Performance comparison of tiny YOLO v4 and ETL-YOLO v4.
| Algorithm | Class | AP | Precision | Recall | F-1 Score | mAP |
|---|---|---|---|---|---|---|
| tiny YOLO v4 | with mask | 83.94% | 79% | 65% | 72% | 57.71% |
| | without mask | 53.16% | | | | |
| | mask incorrectly | 23.99% | | | | |
| | mask area | 70.37% | | | | |
| ETL-YOLO v4 | with mask | 89.69% | 85% | 72% | 78% | 67.64% |
| | without mask | 64.66% | | | | |
| | mask incorrectly | 29.25% | | | | |
| | mask area | 86.97% | | | | |
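
The gains quoted in the abstract follow directly from this table as differences in percentage points. The short sketch below recomputes them and also shows the usual F1 formula, 2PR / (P + R). All numbers are copied from the table; the variable names are illustrative. Because the reported precision and recall are already rounded, an F1 value recomputed from them can differ from the reported one by about a point.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-class AP (%) copied from the comparison table.
tiny_ap = {"with mask": 83.94, "without mask": 53.16,
           "mask incorrectly": 23.99, "mask area": 70.37}
etl_ap = {"with mask": 89.69, "without mask": 64.66,
          "mask incorrectly": 29.25, "mask area": 86.97}

# Absolute gains in percentage points, ETL-YOLO v4 minus tiny YOLO v4.
ap_gain = {cls: round(etl_ap[cls] - tiny_ap[cls], 2) for cls in tiny_ap}
map_gain = round(67.64 - 57.71, 2)

if __name__ == "__main__":
    for cls, gain in ap_gain.items():
        print(f"{cls}: +{gain:.2f} points AP")
    print(f"mAP: +{map_gain:.2f} points")
    print(f"ETL-YOLO v4 F1 from P=0.85, R=0.72: {f1_score(0.85, 0.72):.2f}")
    print(f"tiny YOLO v4 F1 from P=0.79, R=0.65: {f1_score(0.79, 0.65):.2f}")
```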
Fig. 5. Detection results with proposed ETL-YOLO v4. (a) Face with mask and mask area. (b) Face without mask. (c) Face mask detection results for a public area. (d) Face mask detection results in COVID-19 pandemic. (e) Face mask detection results for Mosaic augmented images. (f) Face mask detection results for CutMix augmented images.
Fig. 6. ETL-YOLO v4 comparison with other algorithms based on average precision.
Performance comparison of ETL-YOLO v4 with other algorithms based on mAP.
| Algorithm | mAP |
|---|---|
| YOLO v1 | 52.40% |
| YOLO v2 | 55.34% |
| YOLO v3 | 65.84% |
| EfficientNet-YOLO v4 | 40.04% |
| tiny YOLO v4 | 57.71% |
| Proposed ETL-YOLO v4 | 67.64% |
Performance comparison of ETL-YOLO v4 on MOXA dataset.
| Work | Algorithm | Dataset | mAP |
|---|---|---|---|
| Roy et al. | tiny YOLO v3 | MOXA | 56.27% |
| | SSD 300 MobileNet v2 | | 46.52% |
| | F-RCNN 300 Inception v2 | | 60.50% |
| | YOLO v3 | | 63.99% |
| This work | Proposed ETL-YOLO v4 | | 65.14% |