Lu Wang1,2, Liangbin Xie1, Peiyu Yang1, Qingxu Deng1, Shuo Du3, Lisheng Xu2,3.
Abstract
Construction sites are dangerous because workers interact in complex ways with equipment, building materials, vehicles, and the like. Hardhats are protective gear that is crucial for the safety of people on construction sites, so administrators need to identify people who are not wearing hardhats and send alarms to them. Because manual inspection is labor-intensive and expensive, a real-time automatic detector is better suited to this task. In this paper, we present an end-to-end convolutional neural network that detects whether workers are wearing hardhats. The proposed method localizes each person's head and decides whether that person is wearing a hardhat. The MobileNet model is employed as the backbone network, which allows the detector to run in real time. A top-down module is leveraged to enhance the feature-extraction process. Finally, heads with and without hardhats are detected on multi-scale features using a residual-block-based prediction module. Experimental results on a dataset that we established show that the proposed method achieves an average precision of 87.4%/89.4% at 62 frames per second for detecting people without/with a hardhat.
Keywords: convolutional neural network; hardhat-wearing detection; real-time detection
Year: 2020 PMID: 32230961 PMCID: PMC7180748 DOI: 10.3390/s20071868
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Architecture of the proposed network for detecting and classifying no_hardhat and hardhat instances.
Performance comparison of different network architectures (bold indicates either best performance or minimum cost).
| Architecture | Model Size (M) | Run Time (ms) | AP no_hardhat (%) | AP hardhat (%) |
|---|---|---|---|---|
| Baseline | | | 82.8 | 86.0 |
| Baseline+top-down | 14.7 | 14 | 85.1 | 88.3 |
| Proposed | 18.9 | 16 | 87.4 | 89.4 |
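The run times in the table above are latencies in milliseconds, so they convert directly to throughput. The following is a minimal sketch, assuming each run time is the time to process a single image; the function name `ms_to_fps` is hypothetical. For the 16 ms run time of the proposed network it gives 62.5, in line with the 62 frames per second quoted in the abstract.

```python
def ms_to_fps(run_time_ms: float) -> float:
    """Convert a per-image run time in milliseconds to frames per second."""
    if run_time_ms <= 0:
        raise ValueError("run time must be positive")
    return 1000.0 / run_time_ms


# Run times (ms) taken from the architecture comparison table.
run_times_ms = {"Baseline+top-down": 14, "Proposed": 16}

fps = {name: ms_to_fps(ms) for name, ms in run_times_ms.items()}
for name, value in fps.items():
    print(f"{name}: {value:.1f} fps")
```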
Figure 2. Three variants of the prediction module: (a) a 3 × 3 convolutional block; (b) a residual block with two 3 × 3 convolutional blocks; (c) a residual block with the bottleneck structure.
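To get a feel for why such variants can differ in model size, the weight count of each can be estimated from its convolution shapes. This is a rough sketch only: the channel width of 256, the bottleneck reduction factor of 4, the 1 × 1 / 3 × 3 / 1 × 1 bottleneck layout, the weight-free identity skip, and the omission of biases and normalization layers are all assumptions, not values taken from the paper.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k convolution, ignoring bias and normalization."""
    return c_in * c_out * k * k


def plain_block(c: int) -> int:
    # (a) a single 3 x 3 convolutional block
    return conv_params(c, c, 3)


def residual_block(c: int) -> int:
    # (b) two 3 x 3 convolutions; the identity skip is assumed to add no weights
    return 2 * conv_params(c, c, 3)


def bottleneck_block(c: int, reduction: int = 4) -> int:
    # (c) assumed layout: 1 x 1 reduce, 3 x 3 at reduced width, 1 x 1 expand
    mid = c // reduction
    return (
        conv_params(c, mid, 1)
        + conv_params(mid, mid, 3)
        + conv_params(mid, c, 1)
    )


C = 256  # assumed channel width, for illustration only
counts = {
    "a": plain_block(C),
    "b": residual_block(C),
    "c": bottleneck_block(C),
}
for variant, n in counts.items():
    print(f"variant ({variant}): {n:,} weights")
```

Under these assumptions the bottleneck variant has the fewest weights per block. Actual model sizes also depend on the backbone and on how many feature scales carry a prediction module.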
Performance comparison for different prediction modules (bold indicates either best performance or minimum cost).
| Prediction Module | Model Size (M) | Run Time (ms) | AP no_hardhat (%) | AP hardhat (%) |
|---|---|---|---|---|
| | 22.7 | | 87.0 | 88.8 |
| | 29.7 | 17 | 86.7 | 89.1 |
| | | 16 | 87.2 | 89.2 |
| Proposed | 18.9 | 16 | 87.4 | 89.4 |
Results comparison with other object detection methods (bold indicates either best performance or minimum cost).
| Detector | Input Size (pixels) | Model Size (M) | Run Time (ms) | AP no_hardhat (%) | AP hardhat (%) |
|---|---|---|---|---|---|
| Faster R-CNN | | 607.2 | 98 | 85.7 | 88.6 |
| SSD | | 105.1 | 21 | 86.3 | 88.7 |
| Proposed | | | | | |
Figure 3. Examples of the detection results using different network architectures. Left: baseline; middle: baseline + top-down; right: proposed. Persons indicated by red arrows correspond to incorrect detections.
Figure 4. Examples of satisfactory detection results.
Figure 5. Examples of failure cases.
Results comparison of different network architectures under the occlusion condition (bold indicates either best performance or minimum cost).
| Detector | (%) | (%) | (%) | (%) | (%) | (%) |
|---|---|---|---|---|---|---|
| Baseline | 82.8 | 65.1 | 57.3 | 79.8 | 58.4 | 50.9 |
| Baseline+top-down | | 64.1 | 59.4 | | 66.4 | 62.1 |
| Proposed | 89.0 | | | 86.1 | | |
Results comparison of different network architectures under the low-contrast condition (bold indicates either best performance or minimum cost).
| Detector | (%) | (%) | (%) | (%) | (%) | (%) |
|---|---|---|---|---|---|---|
| Baseline | 86.9 | 57.0 | 52.5 | | 45.6 | 42.8 |
| Baseline+top-down | 92.4 | 57.0 | 54.5 | 85.1 | 58.5 | 53.3 |
| Proposed | | | | 86.4 | | |
People-counting results of different network architectures on the test set. ER: error rate; MSE: mean square error (bold indicates either best performance or minimum cost).
| Architecture | #no_hardhat | #hardhat | #total | ER (%) | MSE |
|---|---|---|---|---|---|
| Baseline | 1698 | 4344 | 6042 | 9.36 | 1.94 |
| Baseline+top-down | 1605 | 4528 | 6133 | 8.00 | 1.40 |
| Proposed | 1752 | 4766 | 6518 | | |
| Ground truth | 1803 | 4863 | 6666 | – | – |
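The error-rate values in the table can be reproduced from the counts if ER is taken to be the absolute difference between the predicted and ground-truth totals, divided by the ground-truth total. That definition is an assumption inferred from the two surviving values (9.36 and 8.00), not one stated in this excerpt, and the helper name `error_rate` is hypothetical.

```python
GROUND_TRUTH_TOTAL = 6666  # "#total" of the ground-truth row


def error_rate(predicted_total: int, true_total: int = GROUND_TRUTH_TOTAL) -> float:
    """Percentage error of a predicted people count against the ground truth.

    Assumed definition: |predicted - true| / true * 100.
    """
    return abs(predicted_total - true_total) / true_total * 100.0


# "#total" values for each architecture, copied from the table.
totals = {"Baseline": 6042, "Baseline+top-down": 6133, "Proposed": 6518}

rates = {name: round(error_rate(n), 2) for name, n in totals.items()}
for name, rate in rates.items():
    print(f"{name}: ER = {rate:.2f}%")
```

Under this assumed definition, the baseline and baseline+top-down totals give 9.36% and 8.00%, matching the table, and the proposed network's total of 6518 would correspond to roughly 2.22%.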