Lele Wang, Yingjie Zhao, Shengbo Liu, Yuanhong Li, Shengde Chen, Yubin Lan.
Abstract
The precise detection of dense small targets in orchards is critical for the visual perception of agricultural picking robots. At present, visual detection algorithms for plums still recognize them poorly because plums are small and grow densely. Thus, this paper proposed a lightweight model based on an improved You Only Look Once version 4 (YOLOv4) to detect dense plums in orchards. First, we employed a data augmentation method based on category balance to alleviate the imbalance between the numbers of plums at different maturity levels and the insufficient quantity of data. Second, we abandoned Cross Stage Partial Darknet53 (CSPDarknet53) in favor of the lighter MobileNetV3 as the backbone feature extraction network. In the feature fusion stage, we used depthwise separable convolution (DSC) instead of standard convolution to reduce the number of model parameters. To address the insufficient feature extraction of dense targets, the model achieved fine-grained detection by introducing a 152 × 152 feature layer. Focal loss and complete intersection over union (CIOU) loss were combined to balance the contributions of hard-to-classify and easy-to-classify samples to the total loss. Then, the improved model was trained in stages through transfer learning. Finally, several groups of detection experiments were designed to evaluate the performance of the improved model. The results showed that the improved YOLOv4 model achieved better mean average precision (mAP) than YOLOv4, YOLOv4-tiny, and MobileNet-Single Shot Multibox Detector (MobileNet-SSD). Compared with the original YOLOv4 model, the model size of the improved model is compressed by 77.85%, its parameters are only 17.92% of the original, and its detection speed is 112% faster.
In addition, this paper discussed the influence of the automatic data balance algorithm on model accuracy, and the detection performance of the improved model under different illumination angles, light intensity levels, and types of occlusion. The results indicate that the improved detection model is robust and highly accurate in a real natural environment, and can provide a data reference for subsequent orchard yield estimation and for engineering applications of robotic picking.
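The abstract's claim that depthwise separable convolution (DSC) reduces model parameters can be illustrated with a short parameter-count sketch. The layer shape used below (256 input channels, 512 output channels, 3 × 3 kernel) is a hypothetical example for illustration, not a layer taken from the paper.

```python
# Parameter-count comparison: standard convolution vs. depthwise
# separable convolution (DSC). Bias terms are omitted for simplicity.
# The channel/kernel sizes are illustrative, not the paper's layers.

def standard_conv_params(c_in, c_out, k):
    """k x k standard convolution: every output channel sees every input channel."""
    return k * k * c_in * c_out

def dsc_params(c_in, c_out, k):
    """Depthwise k x k conv (one filter per input channel) + pointwise 1 x 1 conv."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise

c_in, c_out, k = 256, 512, 3
std = standard_conv_params(c_in, c_out, k)  # 1,179,648
dsc = dsc_params(c_in, c_out, k)            # 133,376
print(f"standard: {std}, DSC: {dsc}, ratio: {dsc / std:.3f}")
```

For this shape the DSC layer needs roughly 11% of the standard layer's parameters, which is consistent in spirit with the overall 17.92% parameter figure reported for the whole model.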
Keywords: MobileNetV3; YOLOv4; data balance; object detection; plum
Year: 2022 PMID: 35360334 PMCID: PMC8963500 DOI: 10.3389/fpls.2022.839269
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
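As a rough illustration of the Focal loss term mentioned in the abstract: it down-weights easy, well-classified samples so that hard samples dominate the total loss. The sketch below uses the commonly cited defaults γ = 2 and α = 0.25; the paper's exact hyperparameters are not given in this record.

```python
import math

# Focal loss for the predicted probability p_t of the true class:
#   FL(p_t) = -alpha * (1 - p_t)**gamma * log(p_t)
# gamma = 2, alpha = 0.25 are common defaults (assumed here).

def focal_loss(p_t, alpha=0.25, gamma=2.0):
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

# An easy sample (p_t = 0.9) contributes far less loss than a hard
# sample (p_t = 0.1), so training focuses on hard-to-classify plums.
print(focal_loss(0.9))
print(focal_loss(0.1))
```

With γ = 0 and α = 1 the expression reduces to ordinary cross-entropy, which makes the down-weighting role of the (1 − p_t)^γ factor easy to see.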
FIGURE 1 Location of the image acquisition site.
FIGURE 2 Data annotation example: the blue box represents mature plums, and the purple box represents immature plums.
The number of images and labels before and after augmentation.
| Collection date | Dataset | Processing method | Number of pictures | Mature labels | Immature labels |
| April 24, 2021 | Sub-dataset 1 | Before augmentation | 368 | 1,353 | 3,287 |
| | | After augmentation | 4,416 | 16,236 | 39,444 |
| May 2, 2021 | Sub-dataset 2 | Before augmentation | 400 | 2,347 | 258 |
| | | After augmentation | 2,400 | 9,388 | 1,548 |
| May 3, 2021 | Sub-dataset 3 | Before augmentation | 744 | 4,634 | 317 |
| | | After augmentation | 4,464 | 27,804 | 1,902 |
| Total | | Before augmentation | 1,512 | 8,334 | 3,862 |
| | | After augmentation | 11,280 | 53,428 | 42,894 |
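The table's totals can be sanity-checked with a few lines of arithmetic: the mature/immature label ratio falls from roughly 2.16:1 before augmentation to about 1.25:1 afterwards, which is the category-balancing effect the abstract describes. The counts below are copied from the table's Total rows.

```python
# Label totals copied from the "Total" rows of the augmentation table.
before = {"mature": 8334, "immature": 3862}
after = {"mature": 53428, "immature": 42894}

def imbalance_ratio(counts):
    """Ratio of the majority class count to the minority class count."""
    return max(counts.values()) / min(counts.values())

print(round(imbalance_ratio(before), 2))  # 2.16
print(round(imbalance_ratio(after), 2))   # 1.25
```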
FIGURE 3 Structure diagram of Bneck.
FIGURE 4 Structure diagram of the improved YOLOv4.
FIGURE 5 Loss curve during the training process.
Comparison of the recognition effect of the improved model before and after data balance.
| Dataset type | Dataset name | Plum AP | Raw_plum AP | mAP |
| Unbalanced data | A dataset | 91.77% | 80.23% | 86.00% |
| Balanced data | B dataset | 91.10% | 86.34% | 88.72% |
FIGURE 6 Comparison of the detection effect on plum images before and after data balance.
Comparison of detection results of different architectures.
| Architecture | Plum AP | Raw_plum AP | mAP | Model size | Parameters | FPS |
| YOLOv4 | 88.99% | 83.95% | 86.47% | 244 MB | 61.38 M | 20.03 |
| YOLOv4-tiny | 87.51% | 81.71% | 84.61% | 22.4 MB | 5.77 M | 112 |
| MobileNet-SSD | 87.12% | 79.23% | 83.18% | 24.7 MB | 5.98 M | 82.84 |
| Improved YOLOv4 | 90.58% | 86.54% | 88.56% | 54.05 MB | 11.00 M | 42.55 |
Evaluation results on the plum test set under different light conditions.
| Light conditions | Classes | P | R | F1 | mAP |
| Natural light | plum | 90.32% | 88.19% | 0.89 | 94.53% |
| | raw_plum | 89.41% | 91.69% | 0.91 | |
| | mean value | 89.87% | 89.94% | 0.90 | |
| Side light | plum | 88.29% | 89.09% | 0.89 | 94.86% |
| | raw_plum | 93.07% | 92.61% | 0.93 | |
| | mean value | 90.68% | 90.85% | 0.91 | |
| Back light | plum | 90.14% | 80.33% | 0.85 | 86.75% |
| | raw_plum | 92.36% | 81.46% | 0.87 | |
| | mean value | 91.25% | 80.90% | 0.86 | |
FIGURE 7 Plum detection effect pictures under different light conditions.
Detection results at different plum densities for the four architectures.
| Evaluation indicator | YOLOv4 | YOLOv4-tiny | MobileNet-SSD | Improved YOLOv4 |
| Moderately dense mAP value | 89.19% | 87.12% | 87.28% | 89.30% |
| Highly dense mAP value | 83.01% | 80.03% | 77.16% | 84.75% |
FIGURE 8 Plum detection effect pictures under different density conditions.
FIGURE 9 The detection effect on unmanned aerial vehicle (UAV) images.
FIGURE 10 Comparison of the plum detection effect under different occlusions.