Yue Teng, Rujing Wang, Jianming Du, Ziliang Huang, Qiong Zhou, Lin Jiao.
Abstract
It is well recognized that aphid infestation severely reduces crop yield and leads to significant economic loss, so accurate and efficient aphid detection is of vital importance in pest management. However, most existing detection methods perform unsatisfactorily because they do not fully consider the characteristics of aphids: tiny size, dense distribution, and multi-viewpoint variation in image quality. In addition, existing methods for detecting clustered tiny pests improve accuracy at the cost of speed and do not meet real-time requirements. To address these issues, we propose a robust aphid detection method with two customized core designs: a Transformer feature pyramid network (T-FPN) and a multi-resolution training method (MTM). Specifically, the T-FPN improves feature extraction through a feature-wise Transformer module (FTM) and a channel-wise feature recalibration module (CFRM), while the MTM improves accuracy and efficiency simultaneously with a coarse-to-fine training pattern. To demonstrate the validity of our method, extensive experiments were conducted on a densely clustered tiny-pest dataset. Our method achieves an average recall of 46.1% and an average precision of 74.2%, outperforming other state-of-the-art methods, including ATSS, Cascade R-CNN, FCOS, FoveaBox, and CRA-Net. The efficiency comparison shows that our method achieves the fastest training speed and a testing time of 0.045 s per image, meeting real-time detection requirements. Overall, our TD-Det can accurately and efficiently detect in-field aphids and lays a solid foundation for automated aphid detection and ranking.
Keywords: aphid detection; convolutional neural network; dense distribution; multi-resolution training; multi-viewpoint detection; tiny size; transformer
Year: 2022 PMID: 35735838 PMCID: PMC9224525 DOI: 10.3390/insects13060501
Source DB: PubMed Journal: Insects ISSN: 2075-4450 Impact factor: 3.139
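The abstract describes the channel-wise feature recalibration module (CFRM) only at a high level. The sketch below illustrates the general squeeze-and-excitation-style recalibration pattern in NumPy as an assumption about the design family, not the authors' exact module; the function and weight names are hypothetical.

```python
import numpy as np

def channel_recalibrate(feat, w1, w2):
    """SE-style channel recalibration: global-average-pool each channel,
    pass the channel descriptor through a two-layer FC bottleneck, and
    rescale the channels by the resulting sigmoid weights."""
    squeeze = feat.mean(axis=(1, 2))               # (C,) per-channel descriptor
    hidden = np.maximum(squeeze @ w1, 0.0)         # FC + ReLU (bottleneck)
    scale = 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # FC + sigmoid, values in (0, 1)
    return feat * scale[:, None, None]             # reweight channels

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 16, 16))                # C=8 channels, 16x16 feature map
w1 = rng.normal(size=(8, 4)) * 0.5                 # bottleneck to C/2
w2 = rng.normal(size=(4, 8)) * 0.5
out = channel_recalibrate(feat, w1, w2)
print(out.shape)  # (8, 16, 16): spatial layout preserved, channels rescaled
```

Because each channel is multiplied by a weight in (0, 1), informative channels are kept while less useful ones are suppressed without changing the feature-map shape.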
Figure 1. Comparison of the APHID-4K dataset and other pest datasets.
The constitution of the APHID-4K dataset.
| Species | Training Images | Test Images | Training Aphids | Test Aphids |
|---|---|---|---|---|
| Macrosiphum avenae | 2125 | 546 | 20,043 | 5203 |
| Rhopalosiphum padi | 2093 | 507 | 23,074 | 5525 |
Figure 2. The network architecture of TD-Det with T-FPN, where LN is layer normalization, MLP is multi-layer perceptron, FC is fully connected, ReLU is the rectified linear activation function, and C0–C4 are feature maps.
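The Figure 2 caption names the generic building blocks of the feature-wise Transformer module (LN, MLP, FC, ReLU). A minimal pre-norm Transformer block over flattened feature-map tokens can be sketched as follows; this is an illustrative NumPy sketch of those standard components, not the authors' exact FTM, and the single-head attention uses identity projections for brevity.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each token's features to zero mean, unit variance (LN).
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def self_attention(x):
    # Single-head scaled dot-product attention; Q = K = V = x here
    # (identity projections) to keep the sketch short.
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores) @ x

def transformer_block(x, w1, w2):
    # Pre-norm block: LN -> attention -> residual,
    # then LN -> MLP (ReLU) -> residual, as named in the caption.
    x = x + self_attention(layer_norm(x))
    h = np.maximum(layer_norm(x) @ w1, 0.0)   # MLP hidden layer with ReLU
    return x + h @ w2

rng = np.random.default_rng(0)
tokens = rng.normal(size=(49, 32))            # a 7x7 feature map flattened to tokens
w1 = rng.normal(size=(32, 64)) * 0.1
w2 = rng.normal(size=(64, 32)) * 0.1
out = transformer_block(tokens, w1, w2)
print(out.shape)  # (49, 32): token count and channel width preserved
```

Because the block preserves the token/channel shape, it can be dropped into a feature pyramid level without changing downstream layers.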
Figure 3. The architecture of the multi-resolution training method (MTM), where R is the image resolution, Lr is the learning rate, and the red bounding boxes are detected aphids.
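The coarse-to-fine pattern in Figure 3 (train first on reduced-resolution images, then restore full resolution with a lower learning rate) can be sketched as a simple per-epoch schedule. The resolutions, phase split, and learning rates below are illustrative assumptions, not the paper's actual settings.

```python
def mtm_schedule(epoch, total_epochs, coarse_res=512, full_res=1024,
                 base_lr=0.01, fine_lr_factor=0.1):
    """Return (resolution R, learning rate Lr) for a given epoch:
    first half of training is coarse (small R, higher Lr), second
    half is fine (full R, reduced Lr)."""
    if epoch < total_epochs // 2:
        return coarse_res, base_lr                 # coarse phase: fast iterations
    return full_res, base_lr * fine_lr_factor      # fine phase: accurate refinement

# Build the schedule for a hypothetical 12-epoch run.
plan = [mtm_schedule(e, 12) for e in range(12)]
```

The coarse phase cuts per-iteration cost (smaller images mean fewer pixels to process), which is consistent with the reduced training times reported for MTM in the efficiency experiments.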
Overall performance comparison.
| Method | AP | AP50 | AP75 | AP (M. avenae) | AP (R. padi) | AR |
|---|---|---|---|---|---|---|
| Faster R-CNN w/ FPN | 26.1 | 68.0 | 13.1 | 21.9 | 30.1 | 36.7 |
| Libra Faster R-CNN | 25.5 | 64.9 | 13.2 | 21.1 | 29.9 | 30.8 |
| ATSS | 26.9 | 69.8 | 13.4 | 22.4 | 31.4 | 33.3 |
| Cascade R-CNN | 27.3 | 69.3 | 14.1 | 23.4 | 31.0 | 38.3 |
| FCOS | 24.9 | 66.2 | 11.3 | 19.9 | 29.3 | 32.3 |
| RetinaNet | 21.7 | 60.0 | 9.4 | 15.4 | 26.7 | 37.1 |
| FoveaBox | 23.1 | 63.4 | 10.1 | 18.2 | 27.7 | 36.2 |
| CRA-Net | 26.1 | 68.1 | 13.0 | 21.8 | 30.1 | 31.5 |
| DCTDet w/ CCG | 27.1 | 68.5 | 13.7 | 22.0 | 30.4 | 32.8 |
| TD-Det (RV) | 27.2 | 71.6 | 13.4 | 22.8 | 31.4 | 34.6 |
| TD-Det (PV) | 29.2 | 74.2 | 15.4 | 25.7 | 32.7 | 46.1 |
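In COCO-style evaluation, the overall AP is the mean of the per-class APs, which is consistent with the two middle columns of the table being per-species AP (an inference from the numbers, not a label stated in the record). A quick check against the TD-Det (PV) row:

```python
# Assumed per-species AP values from the TD-Det (PV) row.
ap_ma, ap_rp = 25.7, 32.7           # M. avenae and R. padi
overall_ap = round((ap_ma + ap_rp) / 2, 1)
print(overall_ap)  # 29.2, matching the reported overall AP
```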
The efficiency comparison.
| Method | Training Time (s/iter) | Testing Time (s/img) | AP50 | AR | Training Efficiency | Testing Efficiency | Parameters |
|---|---|---|---|---|---|---|---|
| FPN Faster R-CNN | 0.111 | 0.048 | 68.0 | 36.7 | 6.13 | 14.17 | 41,353,306 |
| Libra R-CNN | 0.118 | 0.050 | 64.9 | 37.4 | 5.50 | 12.98 | 41,616,474 |
| ATSS | 0.106 | 0.048 | 69.8 | 40.3 | 6.59 | 14.54 | 32,115,532 |
| Cascade R-CNN | 0.133 | 0.058 | 69.3 | 38.3 | 5.21 | 11.95 | 69,154,916 |
| FCOS | 0.093 | 0.041 | 66.2 | 37.6 | 7.12 | 16.15 | 32,113,484 |
| RetinaNet | 0.102 | 0.048 | 60.0 | 37.1 | 5.88 | 12.50 | 36,350,582 |
| FoveaBox | 0.103 | 0.042 | 63.4 | 36.2 | 6.16 | 15.10 | 36,239,942 |
| CRA-Net | 0.114 | 0.050 | 68.1 | 31.5 | 5.97 | 13.62 | 41,361,498 |
| DCTDet | 0.280 | 0.213 | 68.5 | 32.8 | 2.45 | 3.22 | 84,706,732 |
| TD-Det (RV) | 0.075 | 0.045 | 71.6 | 41.9 | 9.55 | 15.91 | 33,032,012 |
| TD-Det (PV) | 0.116 | 0.100 | 74.2 | 46.1 | 6.40 | 7.42 | 33,097,804 |
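The two unlabeled score columns in the efficiency table appear to equal AP50 divided by 100 times the corresponding time, i.e. an accuracy-per-time ratio (an inference from the numbers in this record, not a definition stated here). Checking against two rows:

```python
def efficiency(ap50, seconds):
    """Hypothesized efficiency score: AP50 / (100 * time)."""
    return round(ap50 / (100 * seconds), 2)

print(efficiency(68.0, 0.111))  # 6.13  (FPN Faster R-CNN, training column)
print(efficiency(68.0, 0.048))  # 14.17 (FPN Faster R-CNN, testing column)
print(efficiency(74.2, 0.116))  # 6.4   (TD-Det (PV), training column)
print(efficiency(74.2, 0.100))  # 7.42  (TD-Det (PV), testing column)
```

Under this reading, higher is better: a method scores well by being accurate, fast, or both, which explains why TD-Det (RV) tops the training-efficiency column.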
The performance of various detection methods with or without T-FPN.
| Method | T-FPN | AP | AP50 | AR |
|---|---|---|---|---|
| Faster R-CNN | | 26.1 | 68.0 | 36.7 |
| Faster R-CNN | √ | 26.6 | 68.4 | 37.2 |
| Libra R-CNN | | 25.5 | 64.9 | 37.4 |
| Libra R-CNN | √ | 25.9 | 65.4 | 37.7 |
| Cascade R-CNN | | 27.3 | 69.3 | 38.3 |
| Cascade R-CNN | √ | 27.4 | 69.7 | 38.2 |
| FCOS | | 24.9 | 66.2 | 37.6 |
| FCOS | √ | 25.0 | 67.1 | 37.4 |
| RetinaNet | | 21.7 | 60.0 | 37.1 |
| RetinaNet | √ | 22.0 | 60.9 | 37.0 |
| FoveaBox | | 23.1 | 63.4 | 36.2 |
| FoveaBox | √ | 23.5 | 64.5 | 36.3 |
The performance of various detection methods with or without MTM.
| Method | MTM | AP50 | AR | Training Time (s/iter) | Test Time (s/img) |
|---|---|---|---|---|---|
| Faster R-CNN | | 68.0 | 36.7 | 0.111 | 0.048 |
| Faster R-CNN | √ | 68.5 | 37.2 | 0.079 | 0.047 |
| Libra R-CNN | | 64.9 | 37.4 | 0.118 | 0.050 |
| Libra R-CNN | √ | 66.2 | 38.3 | 0.084 | 0.050 |
| Cascade R-CNN | | 69.3 | 38.3 | 0.133 | 0.058 |
| Cascade R-CNN | √ | 69.4 | 38.5 | 0.102 | 0.058 |
| FCOS | | 66.2 | 37.6 | 0.093 | 0.041 |
| FCOS | √ | 69.3 | 38.9 | 0.062 | 0.040 |
| RetinaNet | | 60.0 | 37.1 | 0.102 | 0.048 |
| RetinaNet | √ | 62.8 | 38.2 | 0.072 | 0.048 |
| FoveaBox | | 63.4 | 36.2 | 0.103 | 0.042 |
| FoveaBox | √ | 66.5 | 37.5 | 0.071 | 0.042 |
The performance comparison of TD-Det with various backbones.
| Metric | ResNet50 | ResNet101 | ResNeXt50 | ResNeXt101 |
|---|---|---|---|---|
| AP50 | 74.2 | 74.0 | 73.2 | 72.9 |
| AP75 | 15.4 | 14.5 | 14.4 | 13.9 |
| AP | 29.2 | 29.1 | 28.6 | 28.3 |
Figure 4. PR curves with IoU = 0.5. (a) PR curve of Macrosiphum avenae. (b) PR curve of Rhopalosiphum padi.
Figure 5. Performance comparison.
Figure 6. Comparison and visualization of detection results with other methods.