Qingxin Xiao, Weilu Li, Yuanzhong Kai, Peng Chen, Jun Zhang, Bing Wang.
Abstract
BACKGROUND: The occurrence of cotton pests and diseases has long been an important factor affecting total cotton production. Cotton growth depends heavily on environmental factors and is especially sensitive to climate change. In recent years, machine learning, and deep learning methods in particular, have been widely used in many fields and have achieved good results.
Keywords: Association rules analysis; Long short term memory; Recurrent neural network; The occurrence of pests and diseases; Weather factors
Year: 2019 PMID: 31874611 PMCID: PMC6929544 DOI: 10.1186/s12859-019-3262-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Overview of analysis and prediction of cotton pests and diseases
Fig. 2 Classification and statistics of cotton pests and diseases in India. (a) Cotton pests and diseases in different regions of India. (b) The occurrence of different types of cotton diseases and insect pests in India
Fig. 3 Numbers of association rules directly related to the occurrence of pests and diseases under different K values
Coefficient range for different weather factors
| Weather features | Coefficient range of each feature | ||
|---|---|---|---|
| A- | A1 (0, 27.79] | A3 (35.63, 46.60] | |
| B- | B1 (0, 13.68] | B2 (13.68, 21.05] | |
| C- | C1 (0, 57.50] | C2 (57.50, 78.58] | |
| D- | D1 (0, 36.37] | D2 (36.37, 57.89] | |
| E- | E2 (29.12,103.52] | E3 (103.52, 602.00] | |
| F- | F2 (5.73, 25.52] | F3 (25.52, 71.40] | |
| G- | G1 (0, 4.97] | G3 (35.63, 12.70] | |
| H- | H2 (10.16, 23.09] | H3 (23.09, 72.00] | |
Bold means that the weather condition is more likely to cause cotton pests and diseases to occur according to the mined association rules
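The table above discretizes each weather feature into labelled intervals (for example, A1 = (0, 27.79]) before rule mining. A minimal sketch of that mapping follows, assuming right-closed intervals as the notation suggests. The `bin_label` helper is hypothetical, and the A2 range (27.79, 35.63] is an assumption inferred from the neighbouring bounds, since it is not listed in the table.

```python
from bisect import bisect_left

# Upper bounds for feature A. A1 and A3 come from the table; the A2 range
# (27.79, 35.63] is an assumption inferred from its neighbours.
A_EDGES = [27.79, 35.63, 46.60]


def bin_label(value, edges, prefix):
    """Return the interval label (e.g. 'A1') for a value, or None if out of range."""
    if value <= 0 or value > edges[-1]:
        return None
    # bisect_left finds the first edge >= value, which matches (low, high] bins.
    return f"{prefix}{bisect_left(edges, value) + 1}"
```

A value that sits exactly on an upper bound stays in the lower bin, so `bin_label(27.79, A_EDGES, "A")` falls in A1 rather than A2.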
Fig. 4 Structure of LSTM cells
Partial association rules between pest occurrence and weather factors in five different regions (25 rules)
| Locations | Numbers | Association rules between pest occurrence and weather factors |
|---|---|---|
| Akola | 241 | |
| Support= 0.07643, confidence=0.790816 | ||
| Lam | 94 | |
| Nagpur | 80 | |
| Pharbhani | 44 | |
| Sirsa | 121 | |
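Each mined rule is reported with a support and a confidence value. As a minimal sketch of the standard definitions (support is the fraction of records containing both sides of the rule; confidence is that count divided by the number of records containing the antecedent), the following uses a hypothetical `support_confidence` helper and made-up records, not data from the paper.

```python
def support_confidence(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    both = antecedent | consequent
    n_antecedent = sum(1 for t in transactions if antecedent <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical records: binned weather items plus a pest-occurrence item.
records = [
    {"A1", "C2", "pest"},
    {"A1", "C2", "pest"},
    {"A1", "C2"},
    {"A3", "C1"},
]
support, confidence = support_confidence(records, {"A1", "C2"}, {"pest"})
```

In this toy example the rule {A1, C2} -> {pest} holds in two of four records and in two of the three records that contain the antecedent.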
Partial association rules between pest occurrence and weather factors in five different types of cotton pests and diseases (15 rules)
| Pests and Diseases | Numbers | Association rules between pest occurrence and weather factors |
|---|---|---|
| Aphid | 153 | |
| Jassid | 199 | |
| Leaf Diseases | 142 | |
| Thrips | 109 | |
| Whitefly | 52 | |
Fig. 5 The probability of occurrence of each item in the association rules. (a) Five different regions; (b) five different types of cotton pests and diseases.
The size of five groups of cotton bollworm records
| | P1 | P2 | P3 | P4 | P5 |
|---|---|---|---|---|---|
| Pests and diseases | 335 | 316 | 167 | 197 | 70 |
| No pest and disease | 861 | 724 | 457 | 286 | 190 |
| Total | 1196 | 1040 | 624 | 483 | 260 |
Predictions on five datasets in terms of units_r
| Units_r | Metrics | P1 | P2 | P3 | P4 | P5 |
|---|---|---|---|---|---|---|
| 4 | ACC | 0.9241 | 0.8973 | 0.9111 | 0.9017 | 0.8742 |
| | AUC | 0.9712 | 0.9532 | 0.9687 | 0.9578 | 0.9465 |
| | F1-score | 0.8857 | 0.8258 | 0.8316 | 0.8737 | 0.7804 |
| 5 | ACC | 0.9176 | 0.8903 | | | |
| | AUC | 0.9663 | | | | |
| | F1-score | 0.8580 | 0.7903 | | | |
| 6 | ACC | 0.9281 | 0.9063 | 0.9098 | 0.8949 | 0.8968 |
| | AUC | 0.9737 | 0.9643 | 0.9529 | 0.9628 | 0.9649 |
| | F1-score | 0.8896 | 0.8450 | 0.8420 | 0.8680 | |
| 7 | ACC | 0.9276 | 0.9013 | 0.9000 | | |
| | AUC | 0.9710 | 0.9557 | 0.9551 | 0.9636 | |
| | F1-score | 0.8870 | 0.8205 | 0.8763 | 0.8104 | |
The entry in boldface represents the best performance on each dataset with respect to Units_r
Prediction results on five datasets in terms of lr
| lr | Metrics | P1 | P2 | P3 | P4 | P5 |
|---|---|---|---|---|---|---|
| 1 | ACC | | | | | |
| | AUC | 0.9604 | | | | |
| | F1-score | 0.8918 | 0.8290 | | | |
| 2 | ACC | 0.9331 | 0.8831 | 0.9190 | 0.8949 | 0.8936 |
| | AUC | 0.9727 | 0.9402 | 0.9656 | 0.9567 | 0.9657 |
| | F1-score | 0.7844 | 0.8374 | 0.8710 | 0.7972 | |
| 3 | ACC | 0.9245 | 0.8784 | 0.8847 | 0.9129 | |
| | AUC | 0.9732 | 0.9377 | 0.9598 | 0.9421 | |
| | F1-score | 0.8769 | 0.7806 | 0.8559 | 0.8564 | |
The entry in boldface represents the best performance on each dataset with respect to lr
Prediction results on five datasets in terms of lfc
| lfc | Metrics | P1 | P2 | P3 | P4 | P5 |
|---|---|---|---|---|---|---|
| 1[2] | ACC | 0.9300 | 0.8858 | 0.9189 | 0.8780 | 0.8936 |
| | AUC | 0.9515 | 0.9668 | 0.9565 | 0.9510 | |
| | F1-score | 0.8819 | 0.8080 | 0.8545 | 0.8378 | 0.8256 |
| 2[5,1] | ACC | 0.9206 | | | | |
| | AUC | 0.9694 | | | | |
| | F1-score | 0.8662 | | | | |
| 3[5,5,1] | ACC | 0.9292 | 0.8959 | 0.9124 | 0.8729 | 0.8903 |
| | AUC | 0.9512 | 0.9506 | 0.9381 | 0.9555 | |
| | F1-score | 0.8859 | 0.8264 | 0.8466 | 0.8550 | 0.8237 |
| 3[10,5,1] | ACC | 0.9284 | 0.8946 | 0.9020 | 0.8763 | 0.9129 |
| | AUC | 0.9529 | 0.9443 | 0.9526 | 0.9625 | |
| | F1-score | 0.8288 | 0.8327 | 0.8480 | 0.8363 | |
The entry in boldface represents the best performance on each dataset with respect to lfc
The list of parameters for LSTM network and other compared methods
| Methods | Parameters |
|---|---|
| LSTM | |
| SVM | |
| KNN | |
| Random Forest | |
The sizes of datasets for the four kinds of pests and diseases
| | Bollworm | Whitefly | Jassid | Leaf Blight |
|---|---|---|---|---|
| Pests and diseases | 1776 | 450 | 730 | 523 |
| No pests and diseases | 5307 | 1059 | 1244 | 1401 |
| Total | 7083 | 1509 | 1974 | 1924 |
Predictions on different kinds of pests and diseases with LSTM network
| Metrics | Bollworm | Whitefly | Jassid | Leaf Blight |
|---|---|---|---|---|
| ACC | 0.9207 | 0.9244 | 0.9354 | 0.9557 |
| AUC | 0.9659 | 0.9687 | 0.9776 | 0.9868 |
| F1-score | 0.8749 | 0.9243 | 0.9161 | 0.9204 |
Fig. 6 Confusion matrices on the four datasets with the LSTM network. Subfigures a, b, c and d show the confusion matrices for the occurrence of bollworm, whitefly, jassid and leaf blight, respectively. The green bars represent the number of samples the model predicted correctly.
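The ACC and F1-score values reported in the tables can be derived from confusion-matrix counts such as those in Fig. 6. A minimal sketch using the standard definitions follows; the function name `accuracy_and_f1` and the example counts are hypothetical and are not taken from the paper.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Compute accuracy and F1-score from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


# Hypothetical counts: 8 true positives, 2 false positives,
# 2 false negatives and 88 true negatives.
accuracy, f1 = accuracy_and_f1(tp=8, fp=2, fn=2, tn=88)
```

With imbalanced classes such as these, accuracy (0.96 here) can sit well above F1 (about 0.80), which is why the two are usually reported together.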
Fig. 7 ROC curves on the four datasets with the LSTM network. Subfigures a, b, c and d show the ROC curves for the occurrence of bollworm, whitefly, jassid and leaf blight, respectively. Here "area" means the area under each ROC curve.
Fig. 8 Performance comparison on dataset "p1" with different methods
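The area under a ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. A minimal sketch of that pairwise computation is shown below; the `roc_auc` helper and the example scores are hypothetical, and the quadratic loop is meant for illustration rather than for large datasets.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    # Count a win when the positive outranks the negative, half a win on ties.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = pest occurrence) and predicted scores.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

In this toy example three of the four positive-negative pairs are ranked correctly, giving an AUC of 0.75.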
Performance of different models without adding historical pest values on dataset p1
| Metrics | LSTM | KNN | Random Forest | SVM |
|---|---|---|---|---|
| ACC | 0.8393 | 0.8135 | 0.8423 | 0.7485 |
| AUC | 0.8994 | 0.7515 | 0.7845 | 0.5453 |
| F1-score | 0.6920 | 0.6338 | 0.6861 | 0.2009 |
Fig. 9 The AUC scores of each model without adding historical data on dataset "p1"