Ming Sun1,2, Yanan Li1,2, Yang Qi1,2, Huabing Zhou1,2, LongXing Tian1.
Abstract
Cotton is an important source of fiber. The precise and intelligent management of cotton fields is the top priority of cotton production. Many intelligent management methods of cotton fields are inseparable from cotton boll localization, such as automated cotton picking, sustainable boll pest control, boll maturity analysis, and yield estimation. At present, object detection methods are widely used for crop localization. However, object detection methods require relatively expensive bounding box annotations for supervised learning, and some non-object regions are inevitably included in the annotated bounding boxes. The features of these non-object regions may cause misjudgment by the network model. Unlike bounding box annotations, point annotations are less expensive to label and the annotated points are only likely to belong to the object. Considering these advantages of point annotation, a point annotation-based multi-scale cotton boll localization method is proposed, called MCBLNet. It is mainly composed of scene encoding for feature extraction, location decoding for localization prediction and localization map fusion for multi-scale information association. To evaluate the robustness and accuracy of MCBLNet, we conduct experiments on our constructed cotton boll localization (CBL) dataset (300 in-field cotton boll images). Experimental results demonstrate that MCBLNet method improves by 49.4% average precision on CBL dataset compared with typically point-based localization state-of-the-arts. Additionally, MCBLNet method outperforms or at least comparable with common object detection methods.Entities:
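The localization map fusion step described in the abstract can be illustrated with a minimal NumPy sketch. The fusion rule below (nearest-neighbour upsampling of each per-scale map followed by a pixel-wise mean) is an illustrative assumption, not the paper's actual operator:

```python
import numpy as np

def upsample_nearest(m, factor):
    # Nearest-neighbour upsampling of a 2-D localization map.
    return np.kron(m, np.ones((factor, factor)))

def fuse_localization_maps(maps, base_shape):
    """Fuse per-scale localization maps into one full-resolution map.

    `maps` is a list of 2-D arrays whose sizes evenly divide
    `base_shape`; the pixel-wise mean used here is an assumed
    fusion rule for illustration only.
    """
    fused = np.zeros(base_shape)
    for m in maps:
        factor = base_shape[0] // m.shape[0]
        fused += upsample_nearest(m, factor)
    return fused / len(maps)

# Toy example: localization maps at three scales of an 8x8 image.
maps = [np.random.rand(8, 8), np.random.rand(4, 4), np.random.rand(2, 2)]
fused = fuse_localization_maps(maps, (8, 8))
print(fused.shape)  # (8, 8)
```

Peaks in the fused map would then be read off as predicted boll locations.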
Keywords: cotton boll; deep learning; localization; multi-scale; point annotation
Year: 2022 PMID: 36061777 PMCID: PMC9433923 DOI: 10.3389/fpls.2022.960592
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 6.627
Figure 1. Location of field boll image acquisition sites.
Figure 2. Example from the CBL dataset. (A) is the original image; (B) is the corresponding ground truth.
Figure 3. The pipeline of the MCBLNet network architecture.
Figure 4. Detailed structure of the scene encoder and location decoder.
Figure 5. Down module structures. (A) is the Down module of UNet, (B) is the Down module of MCBLNet-lite, and (C) is the Down module of MCBLNet. “+” indicates matrix addition.
Figure 6. Up module structures. (A) is the Up module of UNet and (B) is the Up module of MCBLNet. “C” indicates concatenation.
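As the Figure 5 and Figure 6 captions note, the two skip-connection operators combine feature maps differently: the Down modules add them element-wise, while the Up module concatenates them along the channel axis. A minimal NumPy sketch (the shapes and channel counts are illustrative assumptions, not the network's actual dimensions):

```python
import numpy as np

# Two feature maps of identical shape (channels, height, width).
a = np.random.rand(16, 32, 32)
b = np.random.rand(16, 32, 32)

# "+" as in the Down modules: element-wise (matrix) addition;
# the channel count is unchanged.
added = a + b                                   # shape (16, 32, 32)

# "C" as in the Up module: concatenation along the channel axis;
# the channel count doubles.
concatenated = np.concatenate([a, b], axis=0)   # shape (32, 32, 32)

print(added.shape, concatenated.shape)
```

Addition keeps the subsequent convolution's input width fixed, whereas concatenation preserves both feature maps at the cost of more channels; this is the standard trade-off between residual-style and UNet-style skip connections.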
Experimental results for each method on the CBL dataset.

| Method | Annotation | AP (%) | | |
|---|---|---|---|---|
| SSD | Box | 8.23 | 13.18 | 95 |
| Faster RCNN | Box | 38.5 | 9.67 | 165.7 |
| YOLOv3-tiny | Box | 51.1 | 17.4 | |
| YOLOv3-spp | Box | 64.4 | 28.21 | 125.6 |
| YOLOv5m | Box | 60.8 | 28.92 | 42.2 |
| YOLOv5s | Box | 57.2 | 31.42 | |
| P2PNet | Point | 8.3 | 23.38 | 86.4 |
| MSPSNet | Point | 34.5 | 7.16 | 263.3 |
| MCBLNet-lite | Point | 78.3 | 22.5 | 37.8 |
| MCBLNet | Point | 83.9 | 20.86 | 50.3 |
The optimal values in each column are bold-faced.
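Computing average precision for point-based localization, as reported in the table above, requires matching predicted points to ground-truth points. A sketch of one common matching scheme is shown below; the greedy nearest-neighbour rule and the 8-pixel distance threshold are illustrative assumptions, not the paper's exact evaluation protocol:

```python
import numpy as np

def match_points(pred, gt, max_dist=8.0):
    """Greedy one-to-one matching of predicted (x, y) points to
    ground-truth points within `max_dist` pixels.

    Returns (true positives, false positives, false negatives),
    from which precision and recall follow directly.
    """
    gt_free = list(range(len(gt)))  # indices of unmatched GT points
    tp = 0
    for p in pred:
        if not gt_free:
            break
        # Distance from this prediction to every unmatched GT point.
        d = [np.hypot(p[0] - gt[g][0], p[1] - gt[g][1]) for g in gt_free]
        i = int(np.argmin(d))
        if d[i] <= max_dist:
            gt_free.pop(i)  # consume the matched GT point
            tp += 1
    fp = len(pred) - tp
    fn = len(gt) - tp
    return tp, fp, fn

# Toy example: three predictions, two ground-truth bolls.
pred = [(10, 10), (50, 52), (90, 90)]
gt = [(12, 9), (50, 50)]
tp, fp, fn = match_points(pred, gt)
print(tp, fp, fn)  # 2 1 0
```

Sweeping a confidence threshold over the predictions and accumulating precision/recall pairs from such matches yields the AP values compared in the table.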
Localization results of three methods under different density distributions.
| Method | | | | | | |
|---|---|---|---|---|---|---|
| MSPSNet | 74.7 | 18.2 | 14.3 | 78 | 16.9 | 34.3 |
| YOLOv3-spp | 74.5 | 68.4 | 67.6 | 75.7 | 59.3 | 63.7 |
| MCBLNet | 69 | 56.8 | 61.7 | 82.9 | 58.6 | 83.9 |
Figure 7. Localization results of three methods on images of different boll density. (A) is the original image and (B) is the ground truth. (C–E) are the localization results of MSPSNet, YOLOv3-spp, and MCBLNet, respectively.
Ablation experiments on the CBL dataset.
| Method | | | AP (%) | | |
|---|---|---|---|---|---|
| MCBLNet-lite_base | | | 71.8 | 22.5 | 37.8 |
| MCBLNet-lite | ✓ | | 78.3 | 22.5 | 37.8 |
| MCBLNet_base | | ✓ | 82.4 | 20.86 | 50.3 |
| MCBLNet | ✓ | ✓ | 83.9 | 20.86 | 50.3 |
“✓” indicates that the corresponding module is included.