Mingjie Liu¹, Xianhao Wang¹, Anjian Zhou², Xiuyuan Fu³, Yiwei Ma¹, Changhao Piao¹.
Abstract
Object detection, as a fundamental task in computer vision, has developed enormously but remains challenging, especially from the Unmanned Aerial Vehicle (UAV) perspective, due to the small scale of targets. In this study, the authors develop a detection method specialized for small objects in the UAV perspective. Based on YOLOv3, the Resblock in darknet is first optimized by concatenating two ResNet units that have the same width and height. Then, the entire darknet structure is improved by adding convolution operations at an early layer to enrich spatial information. Both optimizations enlarge the receptive field. Furthermore, a UAV-viewed dataset is collected for UAV-perspective and small object detection, and an optimized training method is proposed based on this collected dataset. The experimental results on a public dataset and our collected UAV-viewed dataset show a distinct performance improvement on small object detection while keeping the same level of performance on the normal dataset, which means the proposed method adapts to different kinds of conditions.
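The abstract's key architectural change is replacing the additive merge of darknet's Resblock with a channel-wise concatenation of two ResNet units that share the same spatial size. A minimal NumPy sketch of that idea, under our reading of the abstract (the toy `res_unit` with 1x1 channel mixes is an illustrative assumption, not the authors' layer configuration):

```python
import numpy as np

def res_unit(x, w1, w2):
    # Toy residual unit: two 1x1 "convolutions" (channel mixes) plus a skip.
    y = np.maximum(0.0, np.einsum('oc,chw->ohw', w1, x))  # conv + ReLU
    y = np.einsum('oc,chw->ohw', w2, y)                   # conv
    return x + y                                          # residual addition

def concat_resblock(x, params_a, params_b):
    # UAV-YOLO-style block (our reading): run two ResNet units with
    # identical H and W, then concatenate along the channel axis
    # instead of merging by element-wise addition.
    a = res_unit(x, *params_a)
    b = res_unit(x, *params_b)
    return np.concatenate([a, b], axis=0)

x = np.random.randn(4, 8, 8)                  # (channels, H, W)
w = lambda: np.random.randn(4, 4) * 0.1
out = concat_resblock(x, (w(), w()), (w(), w()))
print(out.shape)  # channels doubled, spatial size preserved
```

The concatenation keeps both units' feature maps intact, which widens the block's output and, per the abstract, contributes to the enlarged receptive field.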
Keywords: convolutional neural network; object detection; unmanned aerial vehicle
Year: 2020 PMID: 32326573 PMCID: PMC7218847 DOI: 10.3390/s20082238
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. The structure of our proposed UAV-YOLO.
Figure 2. The backbone structures of YOLOv3 and UAV-YOLO. (a) Original YOLOv3 structure. (b) The proposed UAV-YOLO structure.
Figure 3. The proposed training method and steps to enhance YOLOv3 performance on UAV-viewed human detection.
Mean average precision (mAP) and intersection over union (IOU) performance of UAV-YOLO using different optimized training methods.
| Optimized Method | UAV-Viewed mAP/% | UAV-Viewed IOU/% | Normal mAP/% | Normal IOU/% | Games mAP/% | Games IOU/% | Far mAP/% | Far IOU/% |
|---|---|---|---|---|---|---|---|---|
| Original data | 51.41 | 66.17 | 90.81 | 85.75 | 94.32 | 91.44 | 15.58 | 29.12 |
| Classified data | 90.88 | 78.20 | 90.83 | 80.40 | 90.76 | 74.11 | 56.44 | 55.02 |
| Anchor3 | 90.84 | 79.07 | 90.86 | 80.20 | 90.91 | 72.62 | 59.60 | 55.83 |
| Anchor6 | 90.88 | 78.77 | 90.85 | 82.13 | 87.35 | 71.05 | 57.15 | 56.43 |
| Anchor9 | 90.91 | 80.59 | 90.91 | 84.16 | 90.91 | 74.40 | 60.67 | 57.43 |
| Mining | 90.89 | 80.29 | 90.90 | 83.85 | 90.48 | 74.11 | 61.72 | 61.84 |
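The IOU columns above report intersection over union (as a percentage) between predicted and ground-truth boxes. For reference, a minimal implementation of the standard metric, assuming corner-format boxes:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(round(100 * iou((0, 0, 2, 2), (1, 1, 3, 3)), 2))  # 14.29
```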
Figure 4. Clustering results for different numbers of anchor boxes.
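The Anchor3/Anchor6/Anchor9 rows and Figure 4 are consistent with the dimension-cluster approach popularized by YOLOv2: k-means over box (width, height) pairs with distance d = 1 − IoU, computed between boxes aligned at a common corner. A minimal sketch of that technique (function names are ours; whether the authors used exactly this procedure is an assumption):

```python
import random

def kmeans_anchors(wh_pairs, k, iters=100, seed=0):
    # YOLO-style anchor clustering: k-means on (width, height) pairs,
    # assigning each box to the center it overlaps most (max IoU),
    # i.e. minimum distance d = 1 - IoU.
    def iou_wh(a, b):
        # IoU of two boxes aligned at the origin.
        inter = min(a[0], b[0]) * min(a[1], b[1])
        return inter / (a[0] * a[1] + b[0] * b[1] - inter)

    rng = random.Random(seed)
    centers = rng.sample(wh_pairs, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for wh in wh_pairs:
            j = max(range(k), key=lambda i: iou_wh(wh, centers[i]))
            clusters[j].append(wh)
        centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return sorted(centers)

# Two obvious size clusters -> two well-separated anchors.
print(kmeans_anchors([(10, 10), (12, 12), (100, 100), (96, 96)], 2))
```

Running the clustering with k = 3, 6, and 9 yields the anchor sets compared in the Anchor3/Anchor6/Anchor9 rows of the table above.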
mAP and IOU performance comparison of different detectors on our collected UAV-viewed dataset.
| Detector | UAV-Viewed mAP/% | UAV-Viewed IOU/% | Normal mAP/% | Normal IOU/% | Games mAP/% | Games IOU/% | Far mAP/% | Far IOU/% |
|---|---|---|---|---|---|---|---|---|
| UAV-YOLO | 90.86 | 80.42 | 90.90 | 84.11 | 90.62 | 76.54 | 64.42 | 68.02 |
| YOLOv3 | 90.89 | 80.29 | 90.90 | 83.85 | 90.48 | 74.11 | 61.72 | 61.84 |
| SSD300 | 89.87 | 72.34 | 90.68 | 76.45 | 89.19 | 68.21 | 56.01 | 52.98 |
| SSD512 | 90.92 | 74.23 | 90.89 | 78.86 | 90.71 | 70.08 | 61.09 | 56.84 |
Comparison results with state-of-the-art one-stage detection methods on selected human samples from VOC/COCO.
| Methods | mAP/% | IOU/% | Time/fps |
|---|---|---|---|
| UAV-YOLO | 72.54 | 70.05 | 20 |
| YOLOv3 | 72.21 | 68.43 | 20 |
| SSD300 | 62.94 | 60.72 | 23 |
| SSD512 | 72.54 | 60.72 | 23 |
Figure 5. Detection results of different methods on the UAV-viewed dataset. (a) Detection results of the original YOLOv3; (b) detection results of YOLOv3 using the optimized training method; (c) detection results of UAV-YOLO.