Bar Cohen, Yael Edan, Asher Levi, Victor Alchanatis.
Abstract
The agricultural industry faces a serious threat from plant diseases that cause production and economic losses. Early information on disease development can improve disease control through suitable management strategies. This study sought to detect downy mildew (Peronospora) on grapevine (Vitis vinifera) leaves at early stages of development using thermal imaging technology and to determine the best time of day for image acquisition. In controlled experiments, 1587 thermal images of grapevines grown in a greenhouse were acquired around midday, before inoculation and 1, 2, 4, 5, 6, and 7 days after inoculation. In addition, images of healthy and infected leaves were acquired at seven different times during the day between 7:00 a.m. and 4:30 p.m. Leaves were segmented using the active contour algorithm. Twelve features were derived from the leaf mask and from meteorological measurements. Stepwise logistic regression revealed five significant features, which were used in five classification models. Performance was evaluated using k-fold cross-validation. The support vector machine model produced the best classification accuracy of 81.6%, with an F1 score of 77.5% and an area under the curve (AUC) of 0.874. Acquiring images in the morning between 10:40 a.m. and 11:30 a.m. resulted in 80.7% accuracy, 80.5% F1 score, and 0.895 AUC.
Keywords: biotic stress; classification; disease detection; fungal infection; pre-symptomatic diagnosis; precision agriculture; viticulture
Year: 2022 PMID: 35591275 PMCID: PMC9104212 DOI: 10.3390/s22093585
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
The experimental schedule.
| Campaign | Date | Days after Inoculation | Number of Infected Leaf Samples | Number of Healthy Leaf Samples |
|---|---|---|---|---|
| 1 | 30 December 2019 | 1 | 71 | 17 |
| 1 | 31 December 2019 | 2 | 74 | - |
| 2 | 16 January 2020 | 4 | 60 | - |
| 3 | 26 January 2020 | 7 | 52 | - |
| 4 | 3 March 2020 | 2 | 94 | 85 |
| 4 | 5 March 2020 | 4 | 86 | - |
| 4 | 8 March 2020 | 7 | 86 | - |
| 5 | 26 March 2020 | 0 | - | 323 |
| 5 | 2 April 2020 | 4 | 101 | - |
| 6 | 25 October 2020 | 4 | 45 | 45 |
| 6 | 26 October 2020 | 5 | 45 | 45 |
| 6 | 27 October 2020 | 6 | 45 | 43 |
| 6 | 28 October 2020 | 7 | 45 | 41 |
Diurnal acquisition: Time of image acquisition and number of samples.
| Round Number | Acquisition Time | Number of Samples |
|---|---|---|
| 1 | 7:15–8:25 | 88 |
| 2 | 9:00–9:45 | 88 |
| 3 | 10:40–11:30 | 88 |
| 4 | 12:25–13:15 | 87 |
| 5 | 14:15–15:05 | 86 |
| 6 | 15:20–16:00 | 86 |
| 7 | 16:00–16:30 | 52 |
Features description.
| Variable Name | Description | Range | Symbol | Calculation |
|---|---|---|---|---|
| Minimum temperature | The minimum temperature in the leaf, minus the air temperature measured at the same time | (−6.3)–10.7 | Tmin | Tmin − Tair |
| Maximum temperature | The maximum temperature in the leaf, minus the air temperature measured at the same time | (−4.2)–14.6 | Tmax | Tmax − Tair |
| Average temperature | The average of the leaf temperature values, minus the air temperature measured at the same time | (−5.11)–12.98 | Tavg | Tavg − Tair |
| Median temperature | The median of the leaf temperature values, minus the air temperature measured at the same time | (−5.06)–13.11 | median | median − Tair |
| Maximum temperature difference | The difference between the maximum and minimum temperature in the leaf | 0.5–7.1 | MTD | Tmax − Tmin |
| Standard deviation | The standard deviation of the leaf temperature values | 0.1–1.73 | STD | std |
| Interquartile range | A measure of statistical dispersion, equal to the difference between the 75th and 25th percentiles | 0.17–3.28 | IQR | T0.75 − T0.25 |
| Mean absolute deviation | A robust measure of variability, defined as the mean of the absolute deviations from the mean of the data | 0.1–1.53 | MAD | |
| Coefficient of variation | Also called relative standard deviation; a standardized measure of the dispersion of a probability distribution or frequency distribution | 0.004–0.061 | CV | |
| Percentile 10 | The value at or below which the given percentage of leaf temperature values fall, minus the air temperature measured at the same time | (−5.9)–11.9 | perc10 | T0.1 − Tair |
| Percentile 90 | The value at or below which the given percentage of leaf temperature values fall, minus the air temperature measured at the same time | (−4.8)–13.9 | perc90 | T0.9 − Tair |
| Crop water stress index | A means of irrigation scheduling and crop water stress quantification based on leaf temperature measurements and prevailing meteorological conditions | 0.37–1.53 | CWSI | |
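
The feature definitions in the table can be illustrated with a short sketch. It assumes the segmented leaf is available as a flat list of per-pixel temperatures and that a single air temperature reading accompanies each image. The function name `leaf_features`, the use of the population standard deviation, the "inclusive" percentile method, and the computation of CV relative to the absolute mean leaf temperature are all assumptions made for illustration; the table does not specify them. CWSI is left out because its formula is not given here.

```python
import statistics


def leaf_features(leaf_temps, t_air):
    """Compute temperature features for one segmented leaf (illustrative sketch).

    leaf_temps: per-pixel temperatures inside the leaf mask (hypothetical input).
    t_air: air temperature measured at acquisition time.
    """
    t_min, t_max = min(leaf_temps), max(leaf_temps)
    t_avg = statistics.fmean(leaf_temps)
    # 99 cut points; index k-1 holds the k-th percentile.
    # Assumption: the "inclusive" method, treating the pixels as the full population.
    pct = statistics.quantiles(leaf_temps, n=100, method="inclusive")
    p10, p25, p75, p90 = pct[9], pct[24], pct[74], pct[89]
    # Assumption: population standard deviation over the leaf pixels.
    std = statistics.pstdev(leaf_temps)
    return {
        "Tmin": t_min - t_air,
        "Tmax": t_max - t_air,
        "Tavg": t_avg - t_air,
        "median": statistics.median(leaf_temps) - t_air,
        "MTD": t_max - t_min,
        "STD": std,
        "IQR": p75 - p25,
        # Mean of absolute deviations from the mean, as described in the table.
        "MAD": statistics.fmean(abs(t - t_avg) for t in leaf_temps),
        # Assumption: CV is taken relative to the absolute mean leaf temperature.
        "CV": std / t_avg,
        "perc10": p10 - t_air,
        "perc90": p90 - t_air,
    }


if __name__ == "__main__":
    features = leaf_features([20.0, 21.0, 22.0, 23.0, 24.0], t_air=25.0)
    for name, value in features.items():
        print(f"{name}: {value:.3f}")
```

Subtracting the air temperature, as the table does for the level-type features, makes values comparable across acquisitions taken under different ambient conditions, while the dispersion features (MTD, STD, IQR, MAD) are unaffected by such an offset.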
Figure 1. The output of the active contour algorithm. (A) includes image, initial, and final mask; (B) image with final mask.
Figure 2. Examples of RGB images (top row) and thermal images (bottom row) of different leaves on different days after infection.
Stepwise regression-estimated coefficients, standard errors, and p-value.
| Variable | Estimated Coefficient | Standard Error | p-Value |
|---|---|---|---|
| MTD | 0.6543 | 0.1782 | 0.00024 |
| STD | −2.4373 | 1.1323 | 0.03136 |
| CV | 79.2226 | 24.7641 | 0.00138 |
| Percentile 90 | 0.2709 | 0.0388 | 3.06 × 10^−12 |
| CWSI | 1.6405 | 0.4575 | 0.00034 |
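
As a reading aid for the table, the sketch below recomputes a two-sided p-value from each coefficient and its standard error using a Wald z-test under a standard normal approximation. That this is the test behind the reported values is an assumption; the results land close to the table, with small differences expected from the rounding of the published coefficients. The names `TERMS` and `wald_p_value` are illustrative.

```python
import math

# Coefficients and standard errors copied from the table.
TERMS = {
    "MTD": (0.6543, 0.1782),
    "STD": (-2.4373, 1.1323),
    "CV": (79.2226, 24.7641),
    "Percentile 90": (0.2709, 0.0388),
    "CWSI": (1.6405, 0.4575),
}


def wald_p_value(coef, se):
    """Two-sided p-value for H0: coefficient = 0, normal approximation."""
    z = coef / se
    # P(|Z| > |z|) for a standard normal Z equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    for name, (coef, se) in TERMS.items():
        print(f"{name}: z = {coef / se:.3f}, p = {wald_p_value(coef, se):.3g}")
```

The sign of each coefficient gives the direction of the association with the infected class in the logistic model; for example, the negative STD coefficient means that, with the other features held fixed, a larger standard deviation lowers the predicted log-odds of infection.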
Hyperparameters, search range, and selected optimal value.
| Model | Hyperparameter | Range | Optimal |
|---|---|---|---|
| Decision Tree | Maximum number of splits | [1, 1011] | 17 |
| Decision Tree | Split criterion | Gini’s diversity index, Twoing rule, and Maximum deviance reduction | Maximum deviance reduction |
| Naive Bayes | Distribution names | Gaussian and Kernel | Kernel |
| Naive Bayes | Kernel type | Gaussian, Box, Epanechnikov, and Triangle | Box |
| SVM | Kernel function | Gaussian, Linear, Quadratic, and Cubic | Cubic |
| SVM | Box constraint level | [0.001, 1000] | 1 |
| Ensemble | Ensemble method | AdaBoost, RUSBoost, LogitBoost, GentleBoost, and Bag | GentleBoost |
| Ensemble | Maximum number of splits | [1, 1011] | 960 |
| Ensemble | Number of learners | [10, 500] | 498 |
| Ensemble | Learning rate | [0.001, 1] | 0.057385 |
All results from all models based on all data.
| Measure | Decision Tree | Logistic Regression | Naive Bayes (NB) | SVM | Ensemble |
|---|---|---|---|---|---|
| F1 score | 60.5% | 64.9% | 66.9% | 77.5% | 66.7% |
| Precision | 70.5% | 70.8% | 70.4% | 83.1% | 69.3% |
| Recall | 53.1% | 59.9% | 64.2% | 71.6% | 64.4% |
| AUC | 0.728 | 0.762 | 0.782 | 0.874 | 0.782 |
| Accuracy | 69.9% | 71.7% | 72.6% | 81.6% | 72% |
Figure 3. The ROC curve of each model.
Results by day after infection.
| Days after Inoculation | Number of Samples | Number of Misses | Accuracy |
|---|---|---|---|
| 0 | 571 | 65 | 88.6% |
| 1 | 19 | 2 | 89.5% |
| 2 | 61 | 3 | 95.1% |
| 4 | 180 | 55 | 69.4% |
| 5 | 39 | 4 | 89.7% |
| 6 | 44 | 17 | 61.4% |
| 7 | 98 | 40 | 59.2% |
| Model | - | - | 81.6% |
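
The accuracy column follows from the sample and miss counts as 1 − misses / samples. The short sketch below reproduces the per-day values and, assuming the "Model" row is the pooled accuracy over all samples, the overall figure as well. The names `ROWS` and `accuracy` are illustrative.

```python
# (days after inoculation, number of samples, number of misses), from the table.
ROWS = [
    (0, 571, 65),
    (1, 19, 2),
    (2, 61, 3),
    (4, 180, 55),
    (5, 39, 4),
    (6, 44, 17),
    (7, 98, 40),
]


def accuracy(n_samples, n_misses):
    """Fraction of correctly classified samples."""
    return 1 - n_misses / n_samples


# Per-day accuracy in percent, rounded to one decimal as in the table.
per_day = {day: round(100 * accuracy(n, m), 1) for day, n, m in ROWS}

# Pooled accuracy: total misses over total samples.
total_samples = sum(n for _, n, _ in ROWS)
total_misses = sum(m for _, _, m in ROWS)
overall = round(100 * accuracy(total_samples, total_misses), 1)

if __name__ == "__main__":
    for day, value in per_day.items():
        print(f"day {day}: {value}%")
    print(f"overall: {overall}% ({total_misses} misses in {total_samples} samples)")
```

Because day 0 (healthy) contributes more than half of the samples, its relatively high accuracy pulls the pooled figure above the accuracies seen for days 4, 6, and 7.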
Figure 4. Diagram representing the logical flow of the different analyses.
Results after balancing healthy and infected samples (Exp1a).
| Days after Inoculation | Number of Samples | Number of Misses | Accuracy |
|---|---|---|---|
| 0 | 441 | 68 | 84.6% |
| 1 | 19 | 1 | 94.7% |
| 2 | 61 | 5 | 91.8% |
| 4 | 180 | 55 | 69.4% |
| 5 | 39 | 5 | 87.2% |
| 6 | 44 | 14 | 68.2% |
| 7 | 98 | 36 | 63.3% |
| Model | - | - | 79.1% |
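
In this table the healthy class (day 0) has 441 samples, which equals the total number of infected samples across days 1 to 7. One way to obtain such a balance is random undersampling of the larger class, sketched below. The record does not state how the balancing was actually done, so the method, the function name `undersample_majority`, and the fixed seed are assumptions for illustration only.

```python
import random


def undersample_majority(samples, labels, seed=0):
    """Randomly drop samples so every class has as many items as the smallest one.

    Assumption: balancing by random undersampling; the source does not
    specify the procedure that was used.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    n_min = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        # Keep the smallest class whole; subsample the others without replacement.
        kept = items if len(items) == n_min else rng.sample(items, n_min)
        balanced.extend((sample, label) for sample in kept)
    rng.shuffle(balanced)
    return balanced


if __name__ == "__main__":
    # Class sizes taken from the all-data results: 571 healthy, 441 infected.
    labels = ["healthy"] * 571 + ["infected"] * 441
    samples = list(range(len(labels)))  # placeholder sample identifiers
    balanced = undersample_majority(samples, labels)
    counts = {}
    for _, label in balanced:
        counts[label] = counts.get(label, 0) + 1
    print(counts)
```

Undersampling discards data from the larger class, which may explain part of the small drop in overall accuracy relative to the unbalanced run, although the record itself gives no such analysis.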
Results of a dataset with days 0, 1, 2 after inoculation (Exp3a).
| Days after Inoculation | Number of Samples | Number of Misses | Accuracy |
|---|---|---|---|
| 0 | 99 | 10 | 89.9% |
| 1 | 21 | 1 | 95.2% |
| 2 | 78 | 5 | 93.6% |
| Model | - | - | 91.9% |
Results of a dataset with days 0, 4, 5, 6, 7 after inoculation (Exp3b).
| Days after Inoculation | Number of Samples | Number of Misses | Accuracy |
|---|---|---|---|
| 0 | 399 | 72 | 81.9% |
| 4 | 197 | 54 | 72.6% |
| 5 | 45 | 5 | 88.9% |
| 6 | 45 | 9 | 80% |
| 7 | 112 | 27 | 75.9% |
| Model | - | - | 79.1% |
Figure 5. Confusion matrix of ordinal regression.
A summary of all approaches and their results.
| Approach | F1 Score | AUC | Accuracy |
|---|---|---|---|
| SVM—all data | 77.5% | 0.874 | 81.6% |
| Balance between healthy and infected (Exp1a) | 77.9% | 0.86 | 79.1% |
| Balance between the infected days (Exp1b) | 71% | 0.756 | 73.8% |
| Each imaging day’s data—80% training set and 20% test set (Exp2a) | 70.8% | 0.827 | 76.5% |
| Experiment 10446 as test (Exp2b) | 48.9% | 0.593 | 57.8% |
| Days 0,1,2 (Exp3a) | 92.1% | 0.961 | 91.9% |
| Days 0,4,5,6,7 (Exp3b) | 78.4% | 0.856 | 79.1% |
| As ordinal instead of binary (Exp4) | - | - | 74.9% |
Figure 6. Thermal images from infected and healthy leaves acquired at different times of the day.
Performance for each round of the diurnal measurements.
| Round No./Time | Accuracy | F1 Score | AUC |
|---|---|---|---|
| (1) 7:15–8:25 | 75% | 75.6% | 0.774 |
| (2) 9:00–9:45 | 72.7% | 72.7% | 0.794 |
| (3) 10:40–11:30 | 80.7% | 80.5% | 0.895 |
| (4) 12:25–13:15 | 59.8% | 61.5% | 0.676 |
| (5) 14:15–15:05 | 65.1% | 67.4% | 0.691 |
| (6) 15:20–16:00 | 58.1% | 61.7% | 0.644 |
| (7) 16:00–16:30 | 57.7% | 59.3% | 0.557 |
Results of a dataset from 10:40 a.m. to 11:30 a.m. and after a new feature selection.
| Model | Number of Samples | F1 Score | AUC | Accuracy |
|---|---|---|---|---|
| Hours 10:40–11:30 | 239 | 72.7% | 0.764 | 67.4% |
| New features | 239 | 80.8% | 0.826 | 76.6% |