Hecang Zang, Yanjing Wang, Linyuan Ru, Meng Zhou, Dandan Chen, Qing Zhao, Jie Zhang, Guoqiang Li, Guoqing Zheng.
Abstract
In wheat breeding, spike number is a key indicator for evaluating wheat yield, and the timely and accurate acquisition of wheat spike number is of great practical significance for yield prediction. In actual production, counting wheat spikes by manual field survey is time-consuming and labor-intensive. Therefore, this paper proposes a method based on YOLOv5s with an improved attention mechanism, which can accurately detect the number of small-scale wheat spikes and better handles occlusion and cross-overlapping of wheat spikes. The method introduces an efficient channel attention (ECA) module into the C3 module of the backbone of the YOLOv5s network and, at the same time, inserts a global attention mechanism (GAM) module between the neck and the head; these attention mechanisms extract feature information more effectively and suppress irrelevant information. The results show that the accuracy of the improved YOLOv5s model reached 71.61% on the wheat spike counting task, 4.95 percentage points higher than that of the standard YOLOv5s model, with correspondingly higher counting accuracy. The improved YOLOv5s has a parameter count similar to that of YOLOv5m, while its RMSE and MAE are lower by 7.62 and 6.47, respectively, and it also outperforms YOLOv5l. The improved YOLOv5s method therefore improves applicability in complex field environments and provides a technical reference for the automatic identification of wheat spike numbers and yield estimation. Labeled images, source code, and trained models are available at: https://github.com/228384274/improved-yolov5
Keywords: YOLOv5s; attention mechanism; deep learning; spike number detection; wheat
Year: 2022 PMID: 36247573 PMCID: PMC9554473 DOI: 10.3389/fpls.2022.993244
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 6.627
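The ECA module referenced in the abstract re-weights feature channels with a 1-D convolution over globally pooled channel descriptors, avoiding the dimensionality reduction used in SE-style attention. Below is a minimal PyTorch sketch following the published ECA-Net design; the adaptive kernel-size rule and its gamma/b constants are assumptions, since the paper's exact settings are not reproduced in this record:

```python
import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention: a 1-D convolution over globally pooled
    channel descriptors, with no dimensionality reduction."""
    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        # Adaptive kernel size from the ECA-Net paper, forced to be odd (assumed defaults)
        t = int(abs(math.log2(channels) / gamma + b / gamma))
        k = t if t % 2 else t + 1
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) -> per-channel descriptor (B, C, 1, 1) via global average pooling
        y = x.mean(dim=(2, 3), keepdim=True)
        # Slide the 1-D kernel across the channel dimension: (B, 1, C)
        y = self.conv(y.squeeze(-1).transpose(1, 2)).transpose(1, 2).unsqueeze(-1)
        # Re-weight the input channels with the learned attention
        return x * torch.sigmoid(y)
```

Because the kernel only spans a few neighboring channels, the module adds very few parameters, consistent with the small per-layer parameter counts of the ECA-C3 entries in the architecture table below.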
FIGURE 1. Geographical location of the study area.
FIGURE 2. Data enhancement.
FIGURE 3. Network structure of the YOLOv5s algorithm.
FIGURE 4. Structure of the efficient channel attention (ECA) module.
FIGURE 5. Structure of the improved C3 module.
FIGURE 6. Structure of the global attention mechanism (GAM) module.
FIGURE 7. Network structure of the improved YOLOv5s algorithm.
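The GAM module of Figure 6 applies channel attention through a per-pixel MLP over permuted features, followed by 7×7 convolutional spatial attention. Below is a minimal PyTorch sketch following the published GAM design; the reduction rate of 4 is an assumption, as the paper's exact configuration is not given in this record:

```python
import torch
import torch.nn as nn

class GAM(nn.Module):
    """Global Attention Mechanism: MLP-based channel attention followed by
    7x7 convolutional spatial attention."""
    def __init__(self, channels: int, rate: int = 4):
        super().__init__()
        hidden = channels // rate
        # Channel submodule: a two-layer MLP applied with channels as the last axis
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
        )
        # Spatial submodule: two 7x7 convolutions that squeeze and restore channels
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, hidden, kernel_size=7, padding=3),
            nn.BatchNorm2d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=7, padding=3),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel attention: permute (B, C, H, W) -> (B, H, W, C) so the MLP mixes channels
        att = self.channel_mlp(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        x = x * torch.sigmoid(att)
        # Spatial attention over the channel-refined features
        return x * torch.sigmoid(self.spatial(x))
```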
Algorithm structure of the improved YOLOv5s.
| Layer | From | Parameter quantity | Module name |
| --- | --- | --- | --- |
| 0 | −1 | 3520 | Focus |
| 1 | −1 | 18560 | Conv |
| 2 | −1 | 18819 | ECA-C3 |
| 3 | −1 | 73984 | Conv |
| 4 | −1 | 115715 | ECA-C3 |
| 5 | −1 | 295424 | Conv |
| 6 | −1 | 625155 | ECA-C3 |
| 7 | −1 | 1180672 | Conv |
| 8 | −1 | 656896 | SPP |
| 9 | −1 | 1182723 | ECA-C3 |
| 10 | −1 | 131584 | Conv |
| 11 | −1 | 0 | Upsample |
| 12 | [−1,6] | 0 | Concat |
| 13 | −1 | 361984 | C3 |
| 14 | −1 | 33024 | Conv |
| 15 | −1 | 0 | Upsample |
| 16 | [−1,4] | 0 | Concat |
| 17 | −1 | 90880 | C3 |
| 18 | −1 | 147712 | Conv |
| 19 | [−1,14] | 0 | Concat |
| 20 | −1 | 296448 | C3 |
| 21 | −1 | 590336 | Conv |
| 22 | [−1,10] | 0 | Concat |
| 23 | −1 | 1182720 | C3 |
| 24 | [17,20,23] | 8622262 | Detect |
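The ECA-C3 entries above are C3 blocks augmented with channel attention. Below is a minimal, self-contained sketch of one plausible wiring, with a compact ECA layer applied after the block's fusing 1×1 convolution; the exact insertion point follows Figure 5 of the paper and is assumed here:

```python
import math
import torch
import torch.nn as nn

def conv_bn_silu(c_in: int, c_out: int, k: int = 1) -> nn.Sequential:
    """YOLOv5-style Conv block: convolution + batch norm + SiLU."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, 1, k // 2, bias=False),
        nn.BatchNorm2d(c_out),
        nn.SiLU(inplace=True),
    )

class ECA(nn.Module):
    """Compact efficient channel attention (same idea as the earlier sketch)."""
    def __init__(self, c: int, gamma: int = 2, b: int = 1):
        super().__init__()
        t = int(abs(math.log2(c) / gamma + b / gamma))
        k = t if t % 2 else t + 1
        self.conv = nn.Conv1d(1, 1, k, padding=k // 2, bias=False)

    def forward(self, x):
        y = x.mean((2, 3), keepdim=True)  # (B, C, 1, 1)
        y = self.conv(y.squeeze(-1).transpose(1, 2)).transpose(1, 2).unsqueeze(-1)
        return x * torch.sigmoid(y)

class Bottleneck(nn.Module):
    def __init__(self, c: int):
        super().__init__()
        self.cv1 = conv_bn_silu(c, c, 1)
        self.cv2 = conv_bn_silu(c, c, 3)

    def forward(self, x):
        return x + self.cv2(self.cv1(x))  # residual shortcut

class ECAC3(nn.Module):
    """C3 block with ECA appended after the fusing 1x1 convolution (assumed placement)."""
    def __init__(self, c1: int, c2: int, n: int = 1):
        super().__init__()
        c_ = c2 // 2
        self.cv1 = conv_bn_silu(c1, c_, 1)
        self.cv2 = conv_bn_silu(c1, c_, 1)
        self.cv3 = conv_bn_silu(2 * c_, c2, 1)
        self.m = nn.Sequential(*(Bottleneck(c_) for _ in range(n)))
        self.eca = ECA(c2)

    def forward(self, x):
        # Two parallel 1x1 branches, bottleneck stack on one, concat, fuse, then attend
        return self.eca(self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1)))
```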
Test performance comparison of different models.
| Methods | RMSE | MAE | Recall | mAP@0.5 | mAP@0.5:0.95 |
| --- | --- | --- | --- | --- | --- |
| YOLOv5s | 53.23 | 41.24 | 0.887 | 0.949 | 0.526 |
| YOLOv5m | 51.56 | 40.83 | 0.894 | 0.949 | 0.522 |
| YOLOv5l | 49.71 | 38.87 | 0.888 | 0.947 | 0.525 |
| YOLOv5x | 44.51 | 33.62 | 0.913 | 0.950 | 0.541 |
| Improved YOLOv5s | 43.94 | 34.36 | 0.911 | 0.951 | 0.545 |
| Faster R-CNN | 94.57 | 87.10 | 0.819 | 0.862 | 0.355 |
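RMSE and MAE in the table above are per-image counting errors between predicted and ground-truth spike counts. Below is a minimal NumPy sketch of the standard definitions; the example counts are hypothetical:

```python
import numpy as np

def counting_errors(predicted, ground_truth):
    """RMSE and MAE between per-image predicted and ground-truth spike counts."""
    p = np.asarray(predicted, dtype=float)
    g = np.asarray(ground_truth, dtype=float)
    rmse = float(np.sqrt(np.mean((p - g) ** 2)))
    mae = float(np.mean(np.abs(p - g)))
    return rmse, mae

# Hypothetical spike counts for three test images
rmse, mae = counting_errors([112, 98, 140], [120, 95, 133])
print(f"RMSE={rmse:.2f}, MAE={mae:.2f}")
```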
Mean counting error and mean accuracy of the different models.
| Methods | Mean error (%) | Mean accuracy (%) |
| --- | --- | --- |
| YOLOv5s | 33.34 | 66.66 |
| YOLOv5m | 33.29 | 67.29 |
| YOLOv5l | 30.89 | 69.11 |
| YOLOv5x | 27.52 | 72.48 |
| Improved YOLOv5s | 28.39 | 71.61 |
| Faster R-CNN | 54.07 | 45.93 |
Comparison of parameter quantity, GFLOPs, training time, inference speed, and GPU resource occupancy of different models.
| Methods | Parameter quantity (M) | GFLOPs | Training time (min) | Inference speed (ms) | GPU resource occupancy (GB) |
| --- | --- | --- | --- | --- | --- |
| YOLOv5s | 13.38 | 15.8 | 370.5 | 7.5 | 1.70 |
| YOLOv5m | 39.77 | 47.9 | 396.2 | 11.6 | 1.80 |
| YOLOv5l | 87.90 | 107.6 | 415.6 | 17.3 | 2.10 |
| YOLOv5x | 164.36 | 204.0 | 479.9 | 29.0 | 2.40 |
| Improved YOLOv5s | 28.81 | 31.6 | 372.5 | 14.7 | 2.42 |
| Faster R-CNN | 41.30 | 278.2 | 755.3 | 227.7 | 7.87 |
Comparison of mean accuracy and training time between the CIoU and EIoU losses for the improved YOLOv5s.
| Methods | Mean accuracy (%) | Training time (min) |
| --- | --- | --- |
| Improved YOLOv5s with CIoU | 71.61 | 372.5 |
| Improved YOLOv5s with EIoU | 72.82 | 405.6 |
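EIoU replaces CIoU's aspect-ratio consistency term with direct penalties on the width and height differences between predicted and ground-truth boxes, which the table above links to a roughly 1.2-point gain in mean accuracy at the cost of longer training. Below is a minimal PyTorch sketch of the generic EIoU loss for boxes in (x1, y1, x2, y2) format, not necessarily the paper's exact implementation:

```python
import torch

def eiou_loss(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Generic EIoU loss: 1 - IoU + center-distance, width, and height penalties,
    each normalized by the smallest enclosing box."""
    # Intersection and union
    ix1 = torch.max(pred[..., 0], target[..., 0])
    iy1 = torch.max(pred[..., 1], target[..., 1])
    ix2 = torch.min(pred[..., 2], target[..., 2])
    iy2 = torch.min(pred[..., 3], target[..., 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou = inter / (area_p + area_t - inter + eps)
    # Smallest enclosing box
    cw = torch.max(pred[..., 2], target[..., 2]) - torch.min(pred[..., 0], target[..., 0])
    ch = torch.max(pred[..., 3], target[..., 3]) - torch.min(pred[..., 1], target[..., 1])
    diag2 = cw ** 2 + ch ** 2 + eps
    # Center-distance penalty
    dcx = (pred[..., 0] + pred[..., 2] - target[..., 0] - target[..., 2]) / 2
    dcy = (pred[..., 1] + pred[..., 3] - target[..., 1] - target[..., 3]) / 2
    # Width/height penalties: EIoU's replacement for CIoU's aspect-ratio term
    dw = (pred[..., 2] - pred[..., 0]) - (target[..., 2] - target[..., 0])
    dh = (pred[..., 3] - pred[..., 1]) - (target[..., 3] - target[..., 1])
    return 1 - iou + (dcx ** 2 + dcy ** 2) / diag2 \
        + dw ** 2 / (cw ** 2 + eps) + dh ** 2 / (ch ** 2 + eps)
```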
FIGURE 8. Qualitative analysis of the experimental results of the YOLOv5 algorithms. (A–E) denote five different images.
FIGURE 9. Experimental effects of the improved YOLOv5s under different densities and backgrounds. (A–F) denote six different images randomly selected from the Global Wheat Challenge 2021 (International Conference on Computer Vision 2021) dataset.