Anna O Conrad1, Wei Li1, Da-Young Lee1, Guo-Liang Wang1, Luis Rodriguez-Saona2, Pierluigi Bonello1.
Abstract
Early detection of plant diseases, prior to symptom development, can allow for targeted and more proactive disease management. The objective of this study was to evaluate the use of near-infrared (NIR) spectroscopy combined with machine learning for early detection of rice sheath blight (ShB), caused by the fungus Rhizoctonia solani. We collected NIR spectra from leaves of the ShB-susceptible rice (Oryza sativa L.) cultivar Lemont, growing in a growth chamber, one day following inoculation with R. solani and prior to the development of any disease symptoms. Support vector machine (SVM) and random forest, two machine learning algorithms, were used to build and evaluate the accuracy of supervised classification-based disease predictive models. Sparse partial least squares discriminant analysis was used to confirm the results. The most accurate model comparing mock-inoculated and inoculated plants was SVM-based and had an overall testing accuracy of 86.1% (N = 72); when control, mock-inoculated, and inoculated plants were compared, the most accurate SVM model had an overall testing accuracy of 73.3% (N = 105). These results suggest that machine learning models could be developed into tools to diagnose infected but asymptomatic plants based on spectral profiles at the early stages of disease development. While testing and validation in field trials are still needed, this technique holds promise for application in the field for disease diagnosis and management.
Year: 2020 PMID: 33313566 PMCID: PMC7706329 DOI: 10.34133/2020/8954085
Source DB: PubMed Journal: Plant Phenomics ISSN: 2643-6515
Sample sizes. Data were randomly split into training (70% of data) and testing (30% of data) sets for model development and validation for the experiment containing control, mock-inoculated, and inoculated seedlings.
| Comparison | Data set | Control (total) | Mock-inoculated (total) | Inoculated (total) |
|---|---|---|---|---|
| All groups | Training | 80 | 84 | 89 |
| All groups | Testing | 33 | 35 | 37 |
| Mock vs. inoculated | Training | — | 84 | 89 |
| Mock vs. inoculated | Testing | — | 35 | 37 |
| Control vs. inoculated | Training | 80 | — | 89 |
| Control vs. inoculated | Testing | 33 | — | 37 |
SVM model parameters. Support vector machine (SVM) optimal parameters for the experiment containing three treatments (control, mock-inoculated, and inoculated seedlings).
| Comparison | Model | Kernel | Cost | Gamma |
|---|---|---|---|---|
| All groups | Second derivative | Linear | 100 | — |
| All groups | VSURF | Radial | 100 | 0.05 |
| All groups | Resampled | Radial | 100 | 0.05 |
| Mock vs. inoculated | Second derivative | Radial | 100 | 0.05 |
| Mock vs. inoculated | VSURF | Radial | 100 | 0.05 |
| Mock vs. inoculated | Resampled | Radial | 100 | 0.05 |
| Control vs. inoculated | Second derivative | Linear | 100 | — |
| Control vs. inoculated | VSURF | Linear | 0.1 | — |
| Control vs. inoculated | Resampled | Linear | 0.1 | — |
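Gamma is listed only for the radial kernel because the linear kernel has no such parameter. As a hedged illustration, the sketch below assumes the common radial basis function parameterization K(x, y) = exp(-gamma * ||x - y||^2); the record does not state which SVM software or kernel definition was used, and the input vectors are made-up values, not spectra from the study.

```python
import math


def linear_kernel(x, y):
    """Linear kernel: the plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.05):
    """Radial kernel under the assumed form exp(-gamma * squared distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical feature vectors for illustration only.
x = [0.10, 0.20, 0.30]
y = [0.10, 0.25, 0.20]

k_same = rbf_kernel(x, x)   # identical inputs give the maximum similarity
k_xy = rbf_kernel(x, y)     # similarity decays with squared distance
k_lin = linear_kernel([1.0, 2.0], [3.0, 4.0])
```

Under this form, a larger gamma makes similarity fall off faster with distance, while the cost parameter controls how heavily misclassified training samples are penalized.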
Figure 1. ShB symptomatic rice plants. Representative examples of control, mock-inoculated, and inoculated Lemont rice seedlings at seven days post-inoculation.
Figure 2. Rice NIR spectra. Average (a) raw and (b) second derivative transformed near-infrared spectra from 2551–1348 nm for control (grey), mock-inoculated (green), and inoculated (blue) Lemont rice seedlings at one day post-inoculation.
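A second derivative transform removes constant offsets and linear baseline slopes from a spectrum, leaving only curvature. This record does not say how the derivative was computed, so the sketch below uses a plain central finite difference as a stand-in; the input values are synthetic, not measured spectra.

```python
def second_derivative(values, step=1.0):
    """Central-difference second derivative at the interior points.

    This is an illustrative stand-in, not the method reported by the authors.
    """
    return [
        (values[i - 1] - 2.0 * values[i] + values[i + 1]) / step ** 2
        for i in range(1, len(values) - 1)
    ]


# Synthetic signals: a sloped baseline, and the same baseline plus curvature.
baseline = [0.5 + 0.1 * i for i in range(6)]
curved = [0.5 + 0.1 * i + 0.02 * i ** 2 for i in range(6)]

d2_baseline = second_derivative(baseline)  # offset and slope vanish
d2_curved = second_derivative(curved)      # only the curvature term remains
```

Smoothing derivative filters are often preferred for noisy spectra, because a raw finite difference amplifies noise.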
VSURF-selected bands. Bands selected by variable selection using random forests (VSURF) at the prediction and interpretation steps for the experiment containing three treatments (control, mock-inoculated, and inoculated seedlings). Prediction step variables were used for support vector machine (SVM) classification models.
| Comparison | Selected bands (nm), prediction step | Selected bands (nm), interpretation step |
|---|---|---|
| All groups | 2442, 2054, 2427, 2153, 2356, 2165, 2225, 1982 | 2442, 2054, 2427, 2153, 2043, 2356, 2165, 2370, 2328, 2033, 2384, 2141, 2176, 2342, 2315, 2119, 2212, 2412, 2275, 2130, 2188, 2225, 2200, 2398, 2302, 2086, 2064, 2075, 2288, 2108, 1992, 2022, 1982 |
| Mock vs. inoculated | 2130, 2442, 2275, 2141, 2153, 2328, 2097, 2427, 2200 | 2130, 2442, 2275, 2141, 2153, 2176, 2328, 2165, 2108, 2288, 2119, 2097, 2427, 2200 |
| Control vs. inoculated | 2119, 2370, 2033, 2442 | 2119, 2370, 2033, 2442 |
SVM classification performance. Support vector machine (SVM) classification performance for the experiment containing three treatments (control, mock-inoculated, and inoculated seedlings).
| Comparison | Model | Data set | Accuracy | 10-fold CV accuracy | Proportion correct: control | Proportion correct: mock-inoculated | Proportion correct: inoculated |
|---|---|---|---|---|---|---|---|
| All groups | Second derivative | Training | 0.822 | 0.708 | 0.825 | 0.786 | 0.854 |
| All groups | Second derivative | Testing | 0.733 | — | 0.788 | 0.600 | 0.811 |
| All groups | VSURF | Training | 0.830 | 0.664 | 0.775 | 0.810 | 0.899 |
| All groups | VSURF | Testing | 0.714 | — | 0.576 | 0.600 | 0.946 |
| All groups | Resampled | Training | 0.866 | 0.644 | 0.813 | 0.905 | 0.876 |
| All groups | Resampled | Testing | 0.657 | — | 0.606 | 0.600 | 0.757 |
| Mock vs. inoculated | Second derivative | Training | 1.000 | 0.757 | — | 1.000 | 1.000 |
| Mock vs. inoculated | Second derivative | Testing | 0.806 | — | — | 0.829 | 0.784 |
| Mock vs. inoculated | VSURF | Training | 0.890 | 0.832 | — | 0.810 | 0.966 |
| Mock vs. inoculated | VSURF | Testing | 0.861 | — | — | 0.829 | 0.892 |
| Mock vs. inoculated | Resampled | Training | 0.936 | 0.809 | — | 0.881 | 0.989 |
| Mock vs. inoculated | Resampled | Testing | 0.847 | — | — | 0.800 | 0.892 |
| Control vs. inoculated | Second derivative | Training | 0.911 | 0.811 | 0.850 | — | 0.966 |
| Control vs. inoculated | Second derivative | Testing | 0.886 | — | 0.848 | — | 0.919 |
| Control vs. inoculated | VSURF | Training | 0.763 | 0.746 | 0.688 | — | 0.831 |
| Control vs. inoculated | VSURF | Testing | 0.643 | — | 0.485 | — | 0.784 |
| Control vs. inoculated | Resampled | Training | 0.710 | 0.704 | 0.650 | — | 0.764 |
| Control vs. inoculated | Resampled | Testing | 0.643 | — | 0.545 | — | 0.730 |
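The overall accuracy in each row follows from the per-class proportions and the group sizes in the sample-size table. The sketch below reconstructs two testing-set accuracies that way; the function name is illustrative, and rounding each product to a whole number of samples is an assumption made to undo the three-decimal rounding of the published proportions.

```python
def overall_accuracy(per_class):
    """Overall accuracy from (proportion correct, group size) pairs.

    Each product is rounded to the nearest integer because the published
    proportions are rounded to three decimals.
    """
    n_correct = sum(round(p * n) for p, n in per_class)
    n_total = sum(n for _, n in per_class)
    return n_correct / n_total


# Mock vs. inoculated, VSURF model, testing set (group sizes 35 and 37).
acc_mock_inoc = overall_accuracy([(0.829, 35), (0.892, 37)])

# All groups, second derivative model, testing set (group sizes 33, 35, 37).
acc_all = overall_accuracy([(0.788, 33), (0.600, 35), (0.811, 37)])

print(round(acc_mock_inoc, 3), round(acc_all, 3))
```

Both values agree with the table and with the 86.1% and 73.3% figures in the abstract.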
VSURF classification performance. Variable selection using random forests (VSURF) classification performance based on bands selected at prediction and interpretation steps (Table 3) for the experiment containing three treatments (control, mock-inoculated, and inoculated seedlings).
| Comparison | Model | Data set | Accuracy | Proportion correct: control | Proportion correct: mock-inoculated | Proportion correct: inoculated |
|---|---|---|---|---|---|---|
| All groups | Prediction | Training | 1.000 | 1.000 | 1.000 | 1.000 |
| All groups | Prediction | Testing | 0.562 | 0.515 | 0.457 | 0.703 |
| All groups | Interpretation | Training | 1.000 | 1.000 | 1.000 | 1.000 |
| All groups | Interpretation | Testing | 0.600 | 0.485 | 0.457 | 0.838 |
| Mock vs. inoculated | Prediction | Training | 1.000 | — | 1.000 | 1.000 |
| Mock vs. inoculated | Prediction | Testing | 0.792 | — | 0.743 | 0.838 |
| Mock vs. inoculated | Interpretation | Training | 1.000 | — | 1.000 | 1.000 |
| Mock vs. inoculated | Interpretation | Testing | 0.806 | — | 0.743 | 0.865 |
| Control vs. inoculated | Prediction | Training | 1.000 | 1.000 | — | 1.000 |
| Control vs. inoculated | Prediction | Testing | 0.657 | 0.485 | — | 0.811 |
| Control vs. inoculated | Interpretation | Training | 1.000 | 1.000 | — | 1.000 |
| Control vs. inoculated | Interpretation | Testing | 0.657 | 0.485 | — | 0.811 |
Figure 3. ROC curves. Receiver operating characteristic (ROC) curves for training (left) and testing (right) sets for the SVM classification model for mock-inoculated and inoculated seedlings based on (a) second derivative transformed spectra, (b) spectral bands selected by VSURF, and (c) resampled spectra for the experiment containing control, mock-inoculated, and inoculated seedlings.
Figure 4. ROC curves. Receiver operating characteristic (ROC) curves for training (left) and testing (right) sets for the SVM classification model for control and inoculated seedlings based on (a) second derivative transformed spectra, (b) spectral bands selected by VSURF, and (c) resampled spectra for the experiment containing control, mock-inoculated, and inoculated seedlings.
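An ROC curve plots the true positive rate against the false positive rate as the decision threshold is lowered. The sketch below shows one simple way to trace such a curve and compute the area under it; the labels and scores are made-up values for illustration (1 standing for inoculated, 0 for the comparison group), and the code assumes no tied scores.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs.

    Samples are visited from highest to lowest score, which is equivalent
    to lowering the decision threshold one sample at a time.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical classifier outputs, not data from the study.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
auc = area_under_curve(points)
```

An area of 1.0 corresponds to perfect separation of the two classes, and 0.5 to ranking no better than chance.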
Prediction performance of sPLS-DA for control vs. inoculated samples. Sparse partial least squares discriminant analysis (sPLS-DA) prediction performance on the testing set. Bands (Table S6) were selected during model calibration using the training set for experiment one (Exp. 1; experiment with two treatments) and experiment two (Exp. 2; experiment with three treatments). BER: balanced error rate.
| Experiment | No. components | BER | Proportion correct: control | Proportion correct: inoculated | Proportion correct: total |
|---|---|---|---|---|---|
| Exp. 1 (2 treatments) | 3 | 0.359 | 0.643 | 0.640 | 0.642 |
| Exp. 2 (3 treatments) | 4 | 0.362 | 0.545 | 0.730 | 0.643 |
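The BER column can be related to the per-class columns. Assuming the usual definition of the balanced error rate as the mean of the per-class error rates (the record does not spell out the formula), a minimal sketch is:

```python
def balanced_error_rate(proportions_correct):
    """Mean per-class error rate, i.e. 1 minus the mean per-class accuracy."""
    return 1.0 - sum(proportions_correct) / len(proportions_correct)


# Per-class proportions (control, inoculated) copied from the table.
ber_exp1 = balanced_error_rate([0.643, 0.640])
ber_exp2 = balanced_error_rate([0.545, 0.730])

print(round(ber_exp1, 4), round(ber_exp2, 4))
```

The results agree with the tabulated 0.359 and 0.362 to within the rounding of the published proportions. Unlike overall accuracy, this measure weights each class equally regardless of group size.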