Abstract
Compared with traditional object detection algorithms, deep learning-based object detection algorithms are more robust in complex scenarios and are a focus of current research. According to their processing pipeline, deep learning-based detectors are divided into two-stage and single-stage algorithms; this review focuses on the problems solved by several classical algorithms and on their advantages and disadvantages. For object detection, and small object detection in particular, the commonly used data sets and performance evaluation metrics are summarized; the characteristics, advantages, and detection difficulties of common data sets are compared; the challenges faced by common object detection methods and by small object detection are systematically summarized; recent deep learning-based work on small object detection is surveyed; and small object detection methods based on multiscale features and on super-resolution are introduced. In addition, lightweight strategies for object detection and the performance of some lightweight models are introduced; the characteristics, advantages, and limitations of the various methods are summarized; and future directions for deep learning-based small object detection are discussed.
Year: 2022 PMID: 35875760 PMCID: PMC9303118 DOI: 10.1155/2022/3843155
Source DB: PubMed Journal: Comput Intell Neurosci
Figure 1: Algorithm process framework.
Brief introduction and comparison of lightweight networks.
| Model name | Parameters (×10^6) | Basic introduction | ImageNet classification accuracy (%) |
|---|---|---|---|
| MobileNet v1 | 4.20 | Lightweight network model for mobile devices that uses depthwise separable convolution instead of standard convolution; the number of parameters and the amount of computation are greatly reduced, but the plain straight-through structure limits feature learning | 70.6 |
| MobileNet v2 | 3.40 | Introduces the inverted residual structure to enhance gradient propagation and reduce computation, and removes the ReLU function from the last layer to preserve feature diversity | 72.0 |
| MobileNet v3 | 5.40 | Improves the model structure with a neural architecture search algorithm, introduces the SE module to strengthen learning ability through a channel attention mechanism, and proposes the h-swish activation function to improve accuracy | 75.2 |
| ShuffleNet v1 | 1.90 | Uses pointwise group convolution to reduce the computational complexity of 1 × 1 convolution and proposes channel shuffle so that information flows between channel groups, but the gap between the numbers of input and output channels is still large enough to hurt efficiency | 67.8 |
| ShuffleNet v2 | 2.30 | Abandons group convolution and introduces a channel split operation to reduce the number of network branches, giving faster inference while maintaining accuracy | 69.4 |
| SqueezeNet | 1.25 | Replaces 3 × 3 convolutions with 1 × 1 convolutions, reduces the number of convolution channels, and postpones downsampling to significantly reduce the number of parameters and the amount of computation, trading accuracy for speed | 57.5 |
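The parameter reduction that the table attributes to replacing standard convolution with depthwise separable convolution can be checked with simple arithmetic. The sketch below is illustrative only: the layer shape (3 × 3 kernel, 256 input and 256 output channels) is a hypothetical example, not a layer taken from any of the listed models, and biases are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape chosen for illustration only.
std = standard_conv_params(3, 256, 256)
sep = depthwise_separable_params(3, 256, 256)
ratio = sep / std  # equals 1 / c_out + 1 / k**2

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.3f}")
```

For this shape the separable layer needs roughly 11.5% of the weights of the standard layer, which is the kind of saving the MobileNet family relies on.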
Target detection methods combined with lightweighting strategies.
| Method name | Year | Basic introduction | Performance |
|---|---|---|---|
| CSPNet | 2019 | From the perspective of network architecture, uses cross-stage feature fusion to optimize the duplicated gradient information in the network and achieve lightweighting | In the same environment, computation is reduced by nearly 30% and accuracy is improved by 2% compared with YOLOv3 |
| YOLO Nano | 2019 | Designs the PEP macro-architecture through a human-machine collaborative design strategy and combines it with a fully connected attention module to significantly reduce computation, but it targets embedded environments only | Achieves a mean average precision (mAP) of 69.1% on the VOC2007 data set with a model size of only 4 MB |
| ThunderNet | 2019 | Based on ShuffleNet v2; compresses the RPN module, proposes a context enhancement module that uses 1 × 1 convolutions to compress channels and fuse features at reduced computational cost, and introduces a spatial attention mechanism to optimize the feature distribution | Obtains 19.1% AP on the COCO data set, similar to the accuracy of SSD with MobileNet, but nearly 5 times faster and significantly cheaper to compute |
| RefineDetLite | 2020 | Proposes a new backbone network, Res2NetLite, for the detection task, keeps the numbers of input and output channels equal, and focuses on optimizing the loss function and training strategy | Achieves 26.8% AP on the COCO data set, rising to 29.6% with its proposed training strategy, reported as the best lightweight network at the time |
Other deep learning-based methods for small target detection.
| No. | Title | Main content | Year |
|---|---|---|---|
| 1 | Small object detection with random decision forests | A small target detection algorithm based on random decision forests is proposed for the small UAV target detection task, achieving 98.8% and 98.7% on the single-day and multiday subsets of the UIUC data set | 2017 |
| 2 | Small object detection using context and attention | Multiscale features are concatenated and additional features from different layers are used as context; a target detection algorithm incorporating an attention mechanism is also proposed, and both methods detect small targets better than SSD | 2019 |
| 3 | Beyond context: Exploring semantic similarity for small object detection in crowded scenes | Introduces a pairwise constraint based on semantic similarity and uses the contextual information of candidate objects to improve the detection performance for tiny objects | 2019 |
| 4 | Lightweight small target detection algorithm based on improved SSD | Adds a transposed convolution structure to the SSD algorithm to improve small target detection | 2018 |
| 5 | An Atrous filter design to enhance the detection capability of small targets in SSD | Based on the SSD algorithm, feature fusion of three or four convolutional layers is performed, followed by atrous (dilated) convolution, to improve the accuracy and robustness of detection | 2019 |
| 6 | Real-time small target detection method based on improved PVANet | Improves the candidate anchor selection method of PVANet to better locate small targets in real-time small target detection | 2020 |
| 7 | Multiscale Faster RCNN detection algorithm for small targets | Improves the Faster RCNN network structure by using both the high- and low-level features of the network, and uses web-crawled data to augment the training data set | 2019 |
| 8 | A small object detection solution by using super-resolution recovery | Segmentation of original aerial images and semi-super resolution for enhancement using GAN | 2019 |
| 9 | TBC-Net: a real-time detector for infrared small target detection using semantic constraint | Proposes a lightweight convolutional network, TBC-Net, for infrared small target detection and adds high-level semantic constraint information of images during training to address the imbalance of small target samples | 2019 |
| 10 | Multiscale convolutional feature fusion for SSD target detection algorithm | Extracts the model's low-level features with regional amplification and fuses them with high-level features to improve small target detection | 2019 |
| 11 | Improvement of YOLO3 in aerial target detection | Removes some of the convolution operations and introduces skip connections on the basis of the original model to ensure real-time performance | 2020 |
| 12 | Deep learning image target detection combined with attention mechanism | Introduces an attention mechanism module into the two-stage detection model, associated with subregion features and aspect ratio properties | 2019 |
| 13 | Visual small object detection in SSD based on feature union | Fuses feature information from deep and shallow layers and adjusts the prior box size to suit small targets for better small target detection | 2020 |
Confusion matrix of prediction results.
| Actual class / predicted class | Positive sample | Negative sample |
|---|---|---|
| Positive sample | TP (true positive) | FN (false negative) |
| Negative sample | FP (false positive) | TN (true negative) |
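The four counts in this table are the inputs to the usual evaluation indicators. As a minimal sketch, the function below derives precision, recall, F1, and false-positive rate from them; the function name and the example counts are hypothetical and chosen only for illustration.

```python
def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive common indicators from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0  # correct share of predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0     # found share of actual positives
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)           # harmonic mean of the two
    fpr = fp / (fp + tn) if fp + tn else 0.0        # false-positive rate (ROC x-axis)
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


# Hypothetical counts, not taken from the paper.
m = detection_metrics(tp=80, fp=20, fn=10, tn=90)
print({name: round(value, 4) for name, value in m.items()})
```

Recall is also the true-positive rate, so the same counts feed both the PR curve and the ROC curve.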
Figure 2: PR curve diagram.
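A PR curve of this kind is typically traced by sorting detections by confidence and recomputing precision and recall after each one. The sketch below assumes each detection has already been marked as a true positive (1) or false positive (0), that every ground-truth positive appears among the detections, and that AP is the area under the monotone precision envelope (all-point interpolation); the source does not state which AP variant it uses, and the scores and labels are made up.

```python
def pr_curve(scores, labels):
    """Return (recall, precision) points, visiting detections by descending score."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    points = []
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


def average_precision(points):
    """Area under the PR curve after making precision non-increasing in recall."""
    recalls = [0.0] + [r for r, _ in points]
    precisions = [0.0] + [p for _, p in points]
    # Replace each precision by the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    return sum((recalls[i] - recalls[i - 1]) * precisions[i]
               for i in range(1, len(recalls)))


# Made-up detections: confidence scores and whether each one is correct.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 1, 0]
points = pr_curve(scores, labels)
ap = average_precision(points)
print(points, round(ap, 4))
```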
Figure 3: ROC curve diagram.
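An ROC curve plots the true-positive rate against the false-positive rate as the decision threshold is lowered. The following is a minimal sketch, assuming binary labels, no tied scores, and trapezoidal integration for the area under the curve (AUC); the data are made up for illustration.

```python
def roc_curve(scores, labels):
    """Return (fpr, tpr) points, starting at (0, 0), by descending score."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both positive and negative labels are required")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Trapezoidal area under a curve given as (x, y) points with increasing x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up scores and ground-truth labels.
roc_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
roc_labels = [1, 0, 1, 1, 0]
roc_points = roc_curve(roc_scores, roc_labels)
roc_auc = auc(roc_points)
print(roc_points, round(roc_auc, 4))
```

The AUC here equals the fraction of positive-negative pairs in which the positive sample receives the higher score (4 of 6 pairs for this data).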
Figure 4: Learning rate versus target detection accuracy.
Target detection accuracy (%) for different batch sizes.
| Class / batch size | 30 | 25 | 20 | Class / batch size | 30 | 25 | 20 |
|---|---|---|---|---|---|---|---|
| Aeroplane | 83.6 | 82.1 | 80.1 | Dog | 87 | 82.7 | 83.1 |
| Bicycle | 85.1 | 82.7 | 83.1 | Horse | 83.1 | 84.1 | 84.9 |
| Bird | 73.5 | 72.5 | 75.9 | Motorbike | 81.3 | 79 | 82.6 |
| Boat | 60.6 | 66.5 | 64.3 | Person | 80.5 | 76 | 78.5 |
| Bottle | 46.2 | 47.2 | 51.3 | Potted plant | 45.9 | 45.8 | 48.2 |
| Bus | 82.4 | 81.8 | 81.2 | Sheep | 75.2 | 74.7 | 73.4 |
| Car | 81.5 | 83.8 | 81.9 | Sofa | 78.5 | 75.9 | 75.3 |
| Cat | 89.2 | 84.3 | 86.4 | Train | 83.9 | 85.2 | 82.5 |
| Chair | 55 | 58.4 | 54.4 | TV monitor | 72.5 | 73.7 | 74.8 |
| Cow | 79 | 79.4 | 80.3 | mAP | 74.89 | 74.6 | 74.8 |
| Dining table | 73.8 | 76.1 | 73.8 | — | — | — | — |
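The mAP row is the arithmetic mean of the 20 per-class values in each column, which can be confirmed directly from the table. The sketch below copies the table values in row order (left half first, then right half) and reads the three column headings as batch sizes, which is an interpretation of the table rather than something the source states explicitly.

```python
from statistics import mean

# Per-class accuracy (%) from the table: (column "30", column "25", column "20").
per_class = [
    (83.6, 82.1, 80.1), (85.1, 82.7, 83.1), (73.5, 72.5, 75.9),
    (60.6, 66.5, 64.3), (46.2, 47.2, 51.3), (82.4, 81.8, 81.2),
    (81.5, 83.8, 81.9), (89.2, 84.3, 86.4), (55.0, 58.4, 54.4),
    (79.0, 79.4, 80.3), (73.8, 76.1, 73.8),  # left half of the table
    (87.0, 82.7, 83.1), (83.1, 84.1, 84.9), (81.3, 79.0, 82.6),
    (80.5, 76.0, 78.5), (45.9, 45.8, 48.2), (75.2, 74.7, 73.4),
    (78.5, 75.9, 75.3), (83.9, 85.2, 82.5), (72.5, 73.7, 74.8),  # right half
]
columns = (30, 25, 20)

# mAP for each column is the unweighted mean over the 20 classes.
map_by_column = {
    col: mean(row[i] for row in per_class) for i, col in enumerate(columns)
}

for col, value in map_by_column.items():
    print(f"column {col}: mAP = {value:.3f}")
```

The computed means agree with the reported 74.89, 74.6, and 74.8 after rounding, so the three settings differ by less than 0.3 percentage points overall even though individual classes vary by several points.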
Figure 5: Convergence result of the iterations.