Kenichi Tatsumi, Noa Igarashi, Xiao Mengxue.
Abstract
BACKGROUND: This study has two objectives. The first is to identify the variables that are important for predicting tomato yield from plant height (PH) and vegetation index (VI) maps derived from images taken by unmanned aerial vehicles (UAVs). The second is to examine how accurately multiple machine learning algorithms predict tomato fresh shoot mass (SM), fruit weight (FW), and number of fruits (FN) from the selected variable sets. To this end, ultra-high-resolution RGB and multispectral images were collected by a UAV on ten days during the 2020 tomato growing season. From these images, 756 variables in total were extracted for each plant, including first-order statistics (e.g., average, standard deviation, skewness, range, and maximum) and second-order statistics (e.g., gray-level co-occurrence matrix features and growth rates of PH and VIs). Several selection algorithms (i.e., Boruta, DALEX, genetic algorithm, least absolute shrinkage and selection operator, and recursive feature elimination) were used to select the variable sets useful for predicting SM, FW, and FN. Random forests, ridge regressions, and support vector machines were used to predict the yield using the top five selected variable sets.
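The first-order statistics named in the abstract can be illustrated with a minimal sketch. The pixel values, the population form of the standard deviation and skewness, and the per-day growth-rate definition are all assumptions made for illustration; the abstract does not specify them.

```python
import statistics


def first_order_stats(values):
    """Return AVE, SD, SKEW, RANGE, and MAX for one plant's pixel values."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD (assumed convention)
    # Skewness as the third central moment divided by SD cubed (assumed form).
    m3 = sum((v - mean) ** 3 for v in values) / n
    skew = m3 / sd ** 3 if sd > 0 else 0.0
    return {
        "AVE": mean,
        "SD": sd,
        "SKEW": skew,
        "RANGE": max(values) - min(values),
        "MAX": max(values),
    }


def growth_rate(value_early, value_late, days_between):
    """Change per day between two flight dates (assumed definition)."""
    return (value_late - value_early) / days_between


# Hypothetical plant-height pixels (m) inside one plant's footprint.
heights = [0.50, 0.60, 0.70, 0.80, 0.90]
stats = first_order_stats(heights)
rate = growth_rate(0.50, 0.80, 6)  # e.g., two flights six days apart
```

Applying the same function to each map (PH and each VI) on each flight date yields one column per map, statistic, and date, which is how a variable count in the hundreds can arise from a small set of statistics.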
Keywords: Gray-level co-occurrence matrix; Machine learning; Plant-level; Tomato yield prediction; Unmanned aerial vehicle
Year: 2021 PMID: 34266447 PMCID: PMC8281694 DOI: 10.1186/s13007-021-00761-2
Source DB: PubMed Journal: Plant Methods ISSN: 1746-4811 Impact factor: 4.993
Fig. 1 Experimental field used in this study. The field is located in Tokyo, Japan. This orthomosaic image was created using unmanned aerial vehicle images taken on June 18
Fig. 2 The UAV and sensor system utilized
Selected gray-level co-occurrence matrix (GLCM) texture measures and their abbreviations and equations
| GLCM feature | Abbreviation | Formula |
|---|---|---|
| Sum average | SA | |
| Entropy | Ent | |
| Difference entropy | DE | |
| Sum entropy | SE | |
| Variance | Var | |
| Difference variance | DV | |
| Sum variance | SV | |
| Angular second moment (uniformity) | ASM | |
| Inverse difference moment | IDM | |
| Contrast | Con | |
| Correlation | Cor | |
| Information measure of correlation-1 | MOC-1 | |
| Information measure of correlation-2 | MOC-2 | |
N is the number of gray levels, P is the normalized symmetric GLCM of dimension N × N, and P(i, j) is the GLCM value at element (i, j). Other variables were calculated as shown in Additional file 1
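Because the Formula column did not survive, the following sketch shows how a normalized symmetric GLCM and four of the listed measures (ASM, Con, IDM, Ent) are commonly computed. It uses the standard Haralick definitions, which may differ in detail from the paper's equations; the 4-level patch, the horizontal one-pixel offset, and the function names are hypothetical.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized symmetric GLCM for the pixel offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # count the pair
                counts[j][i] += 1  # and its mirror, which makes the matrix symmetric
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """ASM, contrast, IDM, and entropy from a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "ASM": sum(v * v for _, _, v in cells),
        "Con": sum((i - j) ** 2 * v for i, j, v in cells),
        "IDM": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "Ent": -sum(v * math.log(v) for _, _, v in cells if v > 0),
    }


# Hypothetical image patch quantized to 4 gray levels (0..3).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(patch, levels=4)
features = texture_features(p)
```

In practice the PH and VI maps would first be quantized to a fixed number of gray levels, and features are often averaged over several offsets; both choices are left out of this sketch.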
Fig. 3 Spatial multitemporal plant height (m). a May 24, b May 30, c June 5, d June 11, e June 18, f June 26, g July 2, h July 12, i July 16, j July 24, 2020
Fig. 4 Spatial multitemporal green normalized difference vegetation index (GNDVI) (−). a May 24, b May 30, c June 6, d June 11, e June 18, f June 26, g July 2, h July 12, i July 16, j July 24, 2020
Fig. 5 Temporal change of (a) plant height and (b) three vegetation indices of tomato plants during the growing period. The unmanned aerial vehicle images collected on May 14 and May 18 are not used in the analysis
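The three vegetation indices that appear in the figures and tables (NDVI, GNDVI, and WDVI) can be sketched from band reflectances using their commonly cited definitions. The reflectance values and the soil-line slope below are hypothetical, and the paper's exact band choices are not given in this section.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def gndvi(nir, green):
    """Green normalized difference vegetation index."""
    return (nir - green) / (nir + green)


def wdvi(nir, red, soil_slope):
    """Weighted difference vegetation index.

    soil_slope is the slope of the soil line (NIR over red reflectance of
    bare soil); its value here is an assumption, not taken from the paper.
    """
    return nir - soil_slope * red


# Hypothetical canopy reflectances for a single pixel.
nir, red, green = 0.50, 0.10, 0.15
pixel_ndvi = ndvi(nir, red)
pixel_gndvi = gndvi(nir, green)
pixel_wdvi = wdvi(nir, red, soil_slope=1.5)
```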
Top five variables selected from first-order statistics and all variables by Boruta, DALEX, genetic algorithm (GA), least absolute shrinkage and selection operator (LASSO), and recursive feature elimination (RFE) for shoot mass (SM)
| Rank | Map (first-order) | Variable (first-order) | Date (MMDD, first-order) | Map (first- and second-order) | Variable (first- and second-order) | Date (MMDD, first- and second-order) |
|---|---|---|---|---|---|---|
| Boruta | ||||||
| 1 | Plant height | AVE | 0626 | Plant height | MOC-1 | 0712 |
| 2 | GNDVI | RANGE | 0716 | NDVI | SV | 0712 |
| 3 | Plant height | MAX | 0720 | NDVI | DV | 0712 |
| 4 | GNDVI | AVE | 0712 | Plant height | AVE | 0702 |
| 5 | NDVI | SD | 0712 | GNDVI | DV | 0724 |
| DALEX | ||||||
| 1 | Plant height | AVE | 0626 | NDVI | SV | 0712 |
| 2 | Plant height | MAX | 0702 | Plant height | SE | 0712 |
| 3 | Plant height | AVE | 0702 | NDVI | SE | 0712 |
| 4 | GNDVI | MAX | 0530 | NDVI | DV | 0712 |
| 5 | GNDVI | RANGE | 0530 | Plant height | Ent | 0626 |
| GA | ||||||
| 1 | Plant height | RANGE | 0618 | Plant height | RANGE | 0712 |
| 2 | Plant height | AVE | 0626 | NDVI | SV | 0712 |
| 3 | GNDVI | RANGE | 0716 | NDVI | IDM | 0702 |
| 4 | WDVI | MAX | 0724 | NDVI | DV | 0712 |
| 5 | NDVI | SD | 0712 | GNDVI | MOC-2 | 0724 |
| LASSO | ||||||
| 1 | Plant height | AVE | 0626 | Plant height | AVE | 0626 |
| 2 | GNDVI | RANGE | 0716 | GNDVI | DV | 0724 |
| 3 | NDVI | MAX | 0716 | NDVI | SV | 0716 |
| 4 | Plant height | MAX | 0702 | GNDVI | Con | 0618 |
| 5 | Plant height | SKEW | 0605 | Plant height | MAX | 0702 |
| RFE | ||||||
| 1 | Plant height | AVE | 0626 | NDVI | SV | 0712 |
| 2 | NDVI | MAX | 0716 | Plant height | MOC-1 | 0712 |
| 3 | Plant height | MAX | 0702 | NDVI | DV | 0712 |
| 4 | Plant height | AVE | 0702 | NDVI | MAX | 0716 |
| 5 | NDVI | SD | 0712 | GNDVI | DV | 0724 |
Top five variables selected from first-order statistics and all variables by Boruta, DALEX, genetic algorithm (GA), least absolute shrinkage and selection operator (LASSO), and recursive feature elimination (RFE) for fruit weight (FW)
| Rank | Map (first-order) | Variable (first-order) | Date (MMDD, first-order) | Map (first- and second-order) | Variable (first- and second-order) | Date (MMDD, first- and second-order) |
|---|---|---|---|---|---|---|
| Boruta | ||||||
| 1 | WDVI | RANGE | 0618 | WDVI | RANGE | 0618 |
| 2 | NDVI | AVE | 0618 | NDVI | AVE | 0618 |
| 3 | WDVI | AVE | 0618 | WDVI | AVE | 0618 |
| 4 | NDVI | AVE | 0626 | WDVI | SA | 0618 |
| 5 | GNDVI | AVE | 0626 | NDVI | AVE | 0626 |
| DALEX | ||||||
| 1 | WDVI | AVE | 0618 | WDVI | SA | 0618 |
| 2 | NDVI | AVE | 0724 | NDVI | AVE | 0626 |
| 3 | WDVI | RANGE | 0618 | NDVI | AVE | 0618 |
| 4 | Plant height | RANGE | 0618 | WDVI | RANGE | 0618 |
| 5 | NDVI | AVE | 0618 | GNDVI | IDM | 0712 |
| GA | ||||||
| 1 | WDVI | RANGE | 0618 | NDVI | IDM | 0716 |
| 2 | NDVI | MAX | 0606 | WDVI | RANGE | 0618 |
| 3 | NDVI | AVE | 0618 | GNDVI | SE | 0724 |
| 4 | NDVI | SD | 0716 | Plant height | Growth rate | 0530–0605 |
| 5 | NDVI | SD | 0524 | WDVI | MAX | 0606 |
| LASSO | ||||||
| 1 | NDVI | AVE | 0618 | GNDVI | Con | 0618 |
| 2 | Plant height | MAX | 0724 | Plant height | MAX | 0724 |
| 3 | NDVI | RANGE | 0724 | WDVI | SA | 0626 |
| 4 | NDVI | RANGE | 0524 | NDVI | AVE | 0626 |
| 5 | Plant height | SKEW | 0712 | NDVI | Cor | 0712 |
| RFE | ||||||
| 1 | NDVI | AVE | 0618 | NDVI | AVE | 0618 |
| 2 | WDVI | RANGE | 0618 | WDVI | RANGE | 0618 |
| 3 | WDVI | AVE | 0618 | WDVI | AVE | 0618 |
| 4 | – | – | – | NDVI | AVE | 0626 |
| 5 | – | – | – | WDVI | SA | 0618 |
Top five variables selected from first-order statistics and all variables by Boruta, DALEX, genetic algorithm (GA), least absolute shrinkage and selection operator (LASSO), and recursive feature elimination (RFE) for number of fruit (FN)
| Rank | Map (first-order) | Variable (first-order) | Date (MMDD, first-order) | Map (first- and second-order) | Variable (first- and second-order) | Date (MMDD, first- and second-order) |
|---|---|---|---|---|---|---|
| Boruta | ||||||
| 1 | NDVI | AVE | 0626 | WDVI | RANGE | 0618 |
| 2 | GNDVI | AVE | 0626 | NDVI | AVE | 0618 |
| 3 | NDVI | MAX | 0618 | WDVI | AVE | 0618 |
| 4 | GNDVI | MAX | 0611 | WDVI | SA | 0618 |
| 5 | GNDVI | SD | 0626 | NDVI | AVE | 0626 |
| DALEX | ||||||
| 1 | NDVI | RANGE | 0606 | WDVI | IDM | 0618 |
| 2 | NDVI | AVE | 0626 | NDVI | RANGE | 0626 |
| 3 | NDVI | AVE | 0524 | NDVI | AVE | 0618 |
| 4 | NDVI | MAX | 0618 | WDVI | AVE | 0618 |
| 5 | NDVI | MAX | 0618 | GNDVI | SA | 0712 |
| GA | ||||||
| 1 | WDVI | SD | 0712 | NDVI | IDM | 0716 |
| 2 | NDVI | SD | 0524 | WDVI | RANGE | 0618 |
| 3 | WDVI | MAX | 0606 | GNDVI | AVE | 0724 |
| 4 | GNDVI | AVE | 0626 | Plant height | Growth rate | 0530–0605 |
| 5 | GNDVI | RANGE | 0524 | WDVI | MAX | 0606 |
| LASSO | ||||||
| 1 | GNDVI | MAX | 0611 | GNDVI | Con | 0618 |
| 2 | GNDVI | AVE | 0626 | Plant height | MAX | 0724 |
| 3 | WDVI | SD | 0606 | WDVI | SA | 0626 |
| 4 | GNDVI | AVE | 0712 | NDVI | AVE | 0626 |
| 5 | NDVI | MAX | 0606 | NDVI | Cor | 0712 |
| RFE | ||||||
| 1 | GNDVI | AVE | 0626 | NDVI | AVE | 0618 |
| 2 | NDVI | AVE | 0626 | WDVI | RANGE | 0618 |
| 3 | NDVI | MAX | 0618 | WDVI | AVE | 0618 |
| 4 | NDVI | MAX | 0606 | NDVI | AVE | 0626 |
| 5 | NDVI | SD | 0626 | WDVI | SA | 0618 |
Fig. 6 Correlations between observed and simulated plant weight. a Random forest (RF) with selected variables from first-order statistics. b RF with selected variables from first- and second-order statistics. c Ridge regression (RI) with selected variables from first-order statistics. d RI with selected variables from first- and second-order statistics. e Support vector machine (SVM) with selected variables from first-order statistics. f SVM with selected variables from first- and second-order statistics
Fig. 7 Correlations between observed and simulated fruit weight. a Random forest (RF) with selected variables from first-order statistics. b RF with selected variables from first- and second-order statistics. c Ridge regression (RI) with selected variables from first-order statistics. d RI with selected variables from first- and second-order statistics. e Support vector machine (SVM) with selected variables from first-order statistics. f SVM with selected variables from first- and second-order statistics
Fig. 8 Correlations between observed and simulated number of fruits. a Random forest (RF) with selected variables from first-order statistics. b RF with selected variables from first- and second-order statistics. c Ridge regression (RI) with selected variables from first-order statistics. d RI with selected variables from first- and second-order statistics. e Support vector machine (SVM) with selected variables from first-order statistics. f SVM with selected variables from first- and second-order statistics
Relative root mean square error (rRMSE) values for tomato shoot mass (SM), fruit weight (FW), and number of fruits (FN) using random forest (RF), ridge regression (RI), and support vector machine (SVM) models with variable sets selected from first-order statistics and from all variables
| Model (variables from first-order statistics) | Boruta | DALEX | GA | LASSO | RFE |
|---|---|---|---|---|---|
| SM [kg plant−1] | |||||
| RF | 17.8 | 22.2 | 16.9 | 22.5 | 22.9 |
| RI | 26.4 | 26.7 | 24.9 | 30.6 | 26.7 |
| SVM | 17.6 | 18.9 | 16.7 | 21.4 | 21.8 |
| FW [kg plant−1] | |||||
| RF | 14.0 | 13.9 | 13.2 | 15.7 | 24.1 |
| RI | 49.6 | 48.5 | 48.5 | 50.1 | 48.0 |
| SVM | 14.3 | 14.5 | 14.6 | 18.7 | 15.9 |
| FN [piece plant−1] | |||||
| RF | 12.6 | 14.2 | 10.0 | 12.4 | 14.2 |
| RI | 30.4 | 25.5 | 30.3 | 18.1 | 21.2 |
| SVM | 13.1 | 14.2 | 13.5 | 13.0 | 13.6 |
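As a reading aid for the table above, rRMSE is commonly defined as the root mean square error divided by the mean of the observations and expressed as a percentage. The sketch below assumes that definition, which this section does not spell out, and uses made-up observed and predicted values.

```python
import math


def rrmse(observed, predicted):
    """Relative RMSE in percent: 100 * RMSE / mean(observed) (assumed definition)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)


# Hypothetical per-plant fruit weights (kg) and model predictions.
observed = [2.0, 4.0, 6.0]
predicted = [3.0, 4.0, 5.0]
score = rrmse(observed, predicted)
```

Dividing by the observed mean makes the error comparable across targets with different units, which is why SM, FW, and FN can share one table.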