F Shahoveisi, M Riahi Manesh, L E Del Río Mendoza.
Abstract
Diseases caused by the fungus Sclerotinia sclerotiorum are managed mainly through fungicide applications in canola and dry bean. Accurate estimation of the risk of disease development on these crops could help farmers make spraying decisions. Five machine learning (ML) models were evaluated in classification and regression modes for predicting disease establishment under different air temperatures and leaf wetness duration conditions. Model algorithms were trained and tested using 20-fold cross validation. Correspondence between predicted and observed values was measured using Cohen's Kappa (classification) and Lin's concordance coefficients (regression). The artificial neural network (ANN) algorithms had average accuracies ≥ 89% (classification) and R² ≥ 88% (regression) on canola and dry bean and their correspondence agreements were ≥ 0.83, which is considered substantial to almost perfect. In contrast, logistic regression algorithms had accuracies of 88% for dry bean and 78% for canola; other models were similarly inconsistent. Implementation of ANN models in disease warning systems could help farmers with spraying decisions. At the same time, these models provide insights on temperature and leaf wetness requirements for development of S. sclerotiorum diseases in these crops. Results of this study show the potential of ML models as tools for epidemiological studies on other pathosystems.
Year: 2022 PMID: 35039560 PMCID: PMC8764076 DOI: 10.1038/s41598-021-04743-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Analysis of variance of the effects of incubation temperature and interrupted leaf wetness period (canola) and of interrupted leaf wetness and dry period (dry bean) on disease incidence caused by Sclerotinia sclerotiorum ascosporic infections.
| Crop | Source of variation | Numerator df | Denominator df | F-value | P-value |
|---|---|---|---|---|---|
| Canola | Temperature | 4 | 95 | 38.16 | < 0.0001 |
| | Leaf wetness | 3 | 95 | 11.01 | < 0.0001 |
| | Temperature × leaf wetness | 12 | 95 | 1.41 | 0.1768 |
| Dry bean | Leaf wetness | 2 | 128 | 13.64 | < 0.0001 |
| | Dry period | 2 | 128 | 13.07 | < 0.0001 |
| | Leaf wetness × dry period | 4 | 128 | 1.30 | 0.2738 |
Analysis was conducted using the GLIMMIX procedure of SAS (version 9.4). The studies were conducted for 10 and 8 days, respectively.
Main effects of discontinuous leaf wetness duration and incubation temperatures on incidence (%) of foliar lesions caused by Sclerotinia sclerotiorum ascosporic infection on canola and dry bean plants.
| Factor | Canola: level | Canola: incidence (%) | Dry bean: level | Dry bean: incidence (%) |
|---|---|---|---|---|
| Incubation temperature (°C) | 10 | 28 d | – | – |
| | 15 | 73 b | – | – |
| | 20 | 66 b | – | – |
| | 25 | 88 a | – | – |
| | 30 | 50 c | – | – |
| Leaf wetness (hours/cycle) | 6 | 51 c | 8 | 56 b |
| | 10 | 53 bc | 12 | 54 b |
| | 14 | 65 ab | 16 | 78 a |
| | 18 | 75 a | | |
| Leaf dryness (hours/cycle) | – | – | 12 | 74 a |
| | – | – | 18 | 67 a |
| | – | – | 24 | 46 b |
On canola, a successive wet and dry period adds up to a cycle of 24 h; in dry bean, the cycle does not necessarily add up to 24 h. Incidence values are least-squares means that represent 24 and 34 observations on canola and dry bean plants, respectively. Incidence was measured after 8 and 10 days of incubation of canola and dry bean plants, respectively.
Incidence means followed by same letters in a factor are not statistically different (α = 0.05) from each other according to the Tukey–Kramer test.
A “–” indicates levels of the factor were not tested.
Evaluation of fitness of artificial neural networks (ANN), support-vector machine (SVM), random forest (RF), decision trees (DT), and logistic regression (LGR) machine-learning models used in classification analyses of canola and dry bean data sets that associated incubation temperature and duration of leaf wetness conditions with incidence of Sclerotinia stem rot disease.
| Study | Model | Accuracy (%) | Precision (%) | Recall (%) | F-score (%) | AUC (%) |
|---|---|---|---|---|---|---|
| Canola | ANN | 89 | 91 | 92 | 91 | 93 |
| | SVM | 88 | 90 | 92 | 91 | 91 |
| | RF | 86 | 88 | 91 | 89 | 89 |
| | DT | 78 | 83 | 84 | 83 | 72 |
| | LGR | 78 | 79 | 91 | 85 | 86 |
| Dry bean | ANN | 92 | 90 | 93 | 91 | 95 |
| | SVM | 90 | 87 | 93 | 90 | 96 |
| | RF | 85 | 85 | 82 | 83 | 94 |
| | DT | 83 | 82 | 82 | 82 | 82 |
| | LGR | 88 | 86 | 89 | 87 | 95 |
AUC represents the area under the receiver operating characteristic curve.
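As an illustration of how the accuracy, precision, recall, and F-score columns relate to one another, the following minimal sketch derives them from the four cells of a binary confusion matrix. The counts are hypothetical and are not taken from the study; AUC is left out because it needs predicted probabilities rather than counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F-score as fractions.

    tp, fp, fn, tn are the counts of true positives, false positives,
    false negatives and true negatives in a binary confusion matrix.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f_score = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Hypothetical counts, for illustration only (not data from the study).
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

Multiplying each fraction by 100 gives values on the same percentage scale as the table.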
Figure 1. Prediction probabilities of Sclerotinia stem rot development on canola using classification artificial neural network (ANN). Temperature, leaf wetness duration, and total time from the inoculation were used as predictors of the model. The figure shows the probabilities estimated nine days after inoculation.
Concordance coefficients for classification (Kappa) and regression (Lin’s ccc) models for correspondence between observed and predicted outcomes of artificial neural networks (ANN), support-vector machine (SVM), random forest (RF), decision trees (DT), logistic regression (LGR), and linear regression (LNR) machine-learning models used to characterize the effect of leaf wetness and incubation temperature on incidence of Sclerotinia stem rot of canola and dry bean.
| Study | Model | Kappa | Lin's ccc |
|---|---|---|---|
| Canola | ANN | 0.75 | 0.94 |
| | RF | 0.68 | 0.87 |
| | DT | 0.51 | 0.86 |
| | LGR | 0.50 | – |
| | LNR | – | 0.53 |
| | SVM | 0.73 | 0.49 |
| Dry bean | ANN | 0.83 | 0.98 |
| | RF | 0.70 | 0.95 |
| | DT | 0.67 | 0.94 |
| | LGR | 0.83 | – |
| | LNR | – | 0.86 |
| | SVM | 0.80 | 0.80 |
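For readers unfamiliar with the two agreement statistics in this table, the sketch below implements their textbook definitions: Cohen's Kappa, κ = (p_o − p_e) / (1 − p_e), for class labels, and Lin's concordance correlation coefficient, 2·s_xy / (s_x² + s_y² + (x̄ − ȳ)²), for continuous values. The record does not describe the software routines the authors used, so this is an assumption-laden illustration with made-up inputs, not their implementation.

```python
from collections import Counter
from statistics import fmean


def cohen_kappa(observed, predicted):
    """Cohen's Kappa: agreement between two label sequences beyond chance."""
    n = len(observed)
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n  # observed agreement
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    # Agreement expected by chance from the marginal class frequencies.
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / n**2
    return (p_o - p_e) / (1 - p_e)


def lin_ccc(observed, predicted):
    """Lin's concordance correlation coefficient (population moments, 1/n)."""
    n = len(observed)
    mx, my = fmean(observed), fmean(predicted)
    sxx = sum((x - mx) ** 2 for x in observed) / n
    syy = sum((y - my) ** 2 for y in predicted) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(observed, predicted)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical inputs, for illustration only (not data from the study).
kappa = cohen_kappa([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])
ccc = lin_ccc([10, 20, 30, 40], [12, 18, 33, 41])
print(f"kappa = {kappa:.2f}, ccc = {ccc:.3f}")
```

Both statistics equal 1 for perfect agreement. Unlike a plain correlation, Lin's coefficient also drops when predictions are systematically shifted or scaled relative to the observations.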
Statistical fitness metrics of artificial neural networks (ANN), support-vector machine (SVM), random forest (RF), decision trees (DT), and linear regression (LNR) machine-learning models used in regression analyses of canola and dry bean data sets that associated incubation temperatures and duration of leaf wetness conditions to incidence of Sclerotinia stem rot disease.
| Study | Model | R² (%) | Root mean square error | Mean absolute error |
|---|---|---|---|---|
| Canola | ANN | 88 | 7.84 | 6.09 |
| | RF | 77 | 10.91 | 8.20 |
| | DT | 73 | 11.91 | 8.19 |
| | LNR | 35 | 18.43 | 14.52 |
| | SVM | 31 | 18.97 | 13.91 |
| Dry bean | ANN | 95 | 5.82 | 4.36 |
| | RF | 90 | 8.46 | 5.52 |
| | DT | 88 | 9.54 | 6.80 |
| | LNR | 74 | 13.70 | 11.48 |
| | SVM | 70 | 14.90 | 12.34 |
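The three regression metrics in this table can be computed from paired observed and predicted incidence values as sketched below. The sketch assumes R² is the usual coefficient of determination, 1 − SS_res / SS_tot; the record does not state which definition was used, and the example values are hypothetical.

```python
from math import sqrt
from statistics import fmean


def regression_metrics(observed, predicted):
    """Return (R^2, RMSE, MAE) for paired observed and predicted values."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_obs = fmean(observed)
    ss_res = sum(r**2 for r in residuals)                 # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    r2 = 1 - ss_res / ss_tot
    rmse = sqrt(ss_res / len(observed))
    mae = fmean(abs(r) for r in residuals)
    return r2, rmse, mae


# Hypothetical incidence values (%), for illustration only.
r2, rmse, mae = regression_metrics([10, 20, 30, 40], [12, 18, 33, 41])
print(f"R2 = {100 * r2:.1f}%, RMSE = {rmse:.2f}, MAE = {mae:.2f}")
```

RMSE and MAE are in the units of the response (percentage points of incidence here), and RMSE penalizes large errors more heavily than MAE.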
Parameter estimates of artificial neural networks (ANN), decision trees (DT), random forest (RF), support-vector machine (SVM), logistic regression (LGR), and linear regression (LNR) machine-learning models used in classification and regression analyses.
| Study/analysis | ANN | DT | RF | SVM | LGR/LNR |
|---|---|---|---|---|---|
| Canola/classification | Hidden layers = 1; neurons = 10; activation function = tanh; α (learning rate) = 0.5; max iteration = 100 | Pruning = none; node splitting = 95%; tree depth = unlimited | Number of trees = 5; replicable training = yes; tree depth = unlimited; max number of considered features = unlimited | Loss function = 110.0, ε = 1.0; kernel = RBF, exp(− auto\|x–y\|²); numerical tolerance = 0.0001; iteration = unlimited | Regularization = ridge (L2); cost strength = 5 |
| Dry bean/classification | Hidden layers = 1; neurons = 5; activation function = tanh; α (learning rate) = 0.7; max iteration = 100 | Pruning = none; node splitting = 95%; tree depth = unlimited | Number of trees = 16; replicable training = yes; tree depth = unlimited; max number of considered features = unlimited | Loss function = 0.8, ε = 0.9; kernel = RBF, exp(− auto\|x–y\|²); numerical tolerance = 0.0001; iteration = unlimited | Regularization = ridge (L2); cost strength = 50 |
| Canola/regression | Hidden layers = 1; neurons = 200; activation function = tanh; α (learning rate) = 0.7; max iteration = 2000 | Pruning = none; node splitting = 95%; tree depth = unlimited | Number of trees = 10; replicable training = yes; tree depth = unlimited; max number of considered features = unlimited | Loss function = 1.0, ε = 0.8; kernel = linear; numerical tolerance = 0.0001; iteration = unlimited | α (regularization parameter) = 1 |
| Dry bean/regression | Hidden layers = 2; neurons = 20; activation function = logistic; α (learning rate) = 1; max iteration = 2000 | Pruning = none; node splitting = 95%; tree depth = unlimited | Number of trees = 10; replicable training = yes; tree depth = unlimited; max number of considered features = unlimited | Loss function = 1.0, ε = 0.8; kernel = linear; numerical tolerance = 0.0001; iteration = unlimited | α (regularization parameter) = 1 |
LGR was used in classification and LNR in regression analyses.
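The abstract states that every model was trained and tested with 20-fold cross validation. The sketch below shows one common way to form such folds: shuffle the sample indices once, deal them into 20 groups, and hold each group out in turn as the test set. The sample size, the seed, and the function name `k_fold_indices` are hypothetical; the record does not say whether the authors shuffled or stratified their folds.

```python
import random


def k_fold_indices(n_samples, n_folds=20, seed=0):
    """Return a list of (train_indices, test_indices) pairs for k-fold CV."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded shuffle for repeatable folds
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Hypothetical data set of 120 observations (not the study's sample size).
splits = k_fold_indices(120, n_folds=20)
for train, test in splits[:2]:
    print(len(train), "training rows,", len(test), "test rows")
```

Each observation is used for testing exactly once, and the reported metric is typically the average over the 20 held-out folds.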