Shuaipeng Fei, Muhammad Adeel Hassan, Yonggui Xiao, Xin Su, Zhen Chen, Qian Cheng, Fuyi Duan, Riqiang Chen, Yuntao Ma.
Abstract
Early prediction of grain yield helps scientists make better breeding decisions for wheat. Machine learning (ML) methods for the fusion of unmanned aerial vehicle (UAV)-based multi-sensor data can improve the prediction accuracy of crop yield. For this, five ML algorithms, Cubist, support vector machine (SVM), deep neural network (DNN), ridge regression (RR) and random forest (RF), were used for multi-sensor data fusion and ensemble learning for grain yield prediction in wheat. A set of thirty wheat cultivars and breeding lines was grown under three irrigation treatments (light, moderate and high) to evaluate the yield prediction capabilities of a low-cost multi-sensor (RGB, multi-spectral and thermal infrared) UAV platform. Multi-sensor data fusion yielded higher prediction accuracy than individual-sensor data in each ML model. The coefficient of determination (R²) values of the Cubist, SVM, DNN and RR models for grain yield prediction ranged from 0.527 to 0.670. Ensemble learning that integrated the above models further increased accuracy, with R² values up to 0.692, higher than those of the individual ML models across the multi-sensor data. The corresponding root mean square error (RMSE), residual prediction deviation (RPD) and ratio of prediction performance to inter-quartile range (RPIQ) were 0.916 t ha−1, 1.771 and 2.602, respectively. These results show that low-altitude UAV-based multi-sensor data, combined with data fusion and an ensemble learning framework, can predict grain yield early and with high accuracy. This high-throughput phenotyping approach is valuable for improving the efficiency of selection in large breeding activities. Supplementary Information: The online version contains supplementary material available at 10.1007/s11119-022-09938-8.
Keywords: Data fusion; Machine learning; Phenotyping; Unmanned aerial vehicle; Wheat
Year: 2022 PMID: 35967193 PMCID: PMC9362526 DOI: 10.1007/s11119-022-09938-8
Source DB: PubMed Journal: Precis Agric ISSN: 1385-2256 Impact factor: 5.767
Fig. 1Experimental design
Fig. 2The profile of meteorological variables during the wheat growing season in 2019–2020 and the volume of sprinkler irrigation. a Sunshine duration, b max temperature, c precipitation, and d irrigation volume and time point for the three irrigation treatments. Meteorological data was gathered from local weather stations
Fig. 3Violin diagram and t-test of measured grain yield under three irrigation treatments. ***Indicates significant at the 0.001 level
Fig. 4UAV systems and integrated sensors. a DJI M210 platform, b cameras integrated in the UAV platform, c and d are the wheat growth status during multi-sensor data acquisition
Detailed parameters for the sensors installed on the UAV
| Camera name | Sensor type | Band | Wavelength | Image resolution |
|---|---|---|---|---|
| Red-Edge MX | Multi-spectral | Blue | 475 nm | 1280 × 960 |
| | | Green | 560 nm | 1280 × 960 |
| | | Red | 668 nm | 1280 × 960 |
| | | Red-edge | 717 nm | 1280 × 960 |
| | | Near infrared | 842 nm | 1280 × 960 |
| Zenmuse XT2 | Thermal | Thermal infrared | 7.5–13.5 μm | 640 × 512 |
| Zenmuse XT2 | RGB | R G B | – | 4000 × 3000 |
Fig. 5A workflow diagram of data acquisition, data processing, and feature extraction. MS multi-spectral images, TIR thermal infrared images, DSM digital surface model
Fig. 6a Scatter plot of estimated and measured temperature of calibration board, b violin diagram and t-test of estimated canopy temperature, c scatter plot of estimated and measured crop height, and d violin diagram and t-test of estimated crop height. ***Indicates significant at the 0.001 level. LI light irrigation, MI moderate irrigation, HI high irrigation
Definitions of the features derived from various sensors
| Sensor | Feature | Formulation | References |
|---|---|---|---|
| RGB | Color intensity | INT = (R + G + B)/3 | Ahmad and Reid |
| | Kawashima index | IKAW = (R − B)/(R + B) | Kawashima and Nakatani |
| | Principal component analysis index | IPCA = 0.994 abs(R − B) + 0.961 abs(G − B) + 0.914 abs(G − R) | Saberioon et al. |
| | Excess red index | ExR = 1.4R − G | Meyer and Neto |
| | Excess green index | ExG = 2G − R − B | Woebbecke et al. |
| | Excess green minus excess red index | ExGR = ExG − ExR | Meyer and Neto |
| | Modified Green Red Vegetation Index | MGRVI = (G² − R²)/(G² + R²) | Bendig et al. |
| | Red Green Blue Vegetation Index | RGBVI = (G² − B·R)/(G² + B·R) | Bendig et al. |
| | Crop height | DSM − DEM | / |
| | Gray-level co-occurrence matrix | ME, VA, HO, CO, DI, EN, SE, COR | Haralick and Shanmugam |
| MS | Normalized Difference Vegetation Index | NDVI = (NIR − R)/(NIR + R) | Rouse et al. |
| | Green-NDVI | GNDVI = (NIR − G)/(NIR + G) | Gitelson et al. |
| | Ratio Vegetation Index | RVI = NIR/R | Tucker |
| | Normalized difference red-edge index | NDREI = (NIR − RE)/(NIR + RE) | Barnes et al. |
| | Enhanced Vegetation Index | EVI = 2.5(NIR − R)/(NIR + 6R − 7.5B + 1) | Huete et al. |
| | Optimized Soil-Adjusted Vegetation Index | OSAVI = 1.16(NIR − R)/(NIR + R + 0.16) | Rondeaux et al. |
| | Modified chlorophyll absorption in reflectance index | MCARI = [(RE − R) − 0.2(RE − G)](RE/R) | Daughtry et al. |
| | Transformed chlorophyll absorption in reflectance index | TCARI = 3[(RE − R) − 0.2(RE − G)(RE/R)] | Haboudane et al. |
| | Nitrogen Reflectance Index | NRI = (G − R)/(G + R) | Schleicher et al. |
| | Transformational Vegetation Index | TVI = 0.5[120(NIR − G) − 200(R − G)] | Broge and Leblanc |
| | Modified Simple Ratio Index | MSR = (NIR/R − 1)/(√(NIR/R) + 1) | Chen |
| | Structure Insensitive Pigment Index | SIPI = (NIR − B)/(NIR − R) | Penuelas et al. |
| | Plant Senescence Reflectance Index | PSRI = (R − B)/RE | Merzlyak et al. |
| | Chlorophyll Index Red-Edge | CIRE = (NIR/RE) − 1 | Gitelson et al. |
| | MCARI/OSAVI | MCARI/OSAVI | Daughtry et al. |
| | TCARI/OSAVI | TCARI/OSAVI | Haboudane et al. |
| | Gray-level co-occurrence matrix | ME, VA, HO, CO, DI, EN, SE, COR | Haralick and Shanmugam |
| TIR | Canopy temperature | / | |
| | Gray-level co-occurrence matrix | ME, VA, HO, CO, DI, EN, SE, COR | Haralick and Shanmugam |
Multi-spectral vegetation indices were calculated from the reflectance of each band, and the RGB vegetation indices were calculated from the DN value of each band; R, G, B, RE and NIR denote the red, green, blue, red-edge and near-infrared bands
MS multi-spectral, TIR thermal infrared, DSM digital surface model, DEM digital elevation model, ME mean, VA variance, HO homogeneity, CO contrast, DI dissimilarity, EN entropy, SE second moment, COR correlation
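For concreteness, a few of the table's spectral indices can be computed from per-plot band reflectance as below. This is a minimal sketch, not code from the paper; the function names and toy reflectance values are illustrative.

```python
def ndvi(nir, red):
    # Normalized Difference Vegetation Index
    return (nir - red) / (nir + red)

def gndvi(nir, green):
    # Green-NDVI: the green band replaces red
    return (nir - green) / (nir + green)

def ndrei(nir, red_edge):
    # Normalized difference red-edge index
    return (nir - red_edge) / (nir + red_edge)

def osavi(nir, red):
    # Optimized Soil-Adjusted Vegetation Index with the usual 0.16 soil factor
    return 1.16 * (nir - red) / (nir + red + 0.16)

# Toy per-plot band reflectances (healthy canopy: high NIR, low red)
nir, red, green, red_edge = 0.45, 0.05, 0.10, 0.30
print(round(ndvi(nir, red), 3))  # 0.8 for these toy values
```

In practice each index is computed pixel-wise over the orthomosaic and then averaged per plot before modeling.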
Hyperparameters of machine learning methods
| Machine learning method | Hyperparameters |
|---|---|
| Cubist | Committees: 1 to 25; neighbors: 1 to 9 |
| SVM | Kernel function |
| DNN | Units: from 10 to 100 with increments of 10; epochs: from 2 to 120 with increments of 2; hidden layers: 1, 2, 3 and 4; regularization method: dropout; activation function: rectified linear activation unit function |
| RR | |
| RF |
SVM support vector machine, DNN deep neural network, RR ridge regression, RF random forest
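The grids above are searched in the inner cross-validation loop shown in Fig. 7. A minimal sketch of that inner-CV search, using ridge regression's λ as the tuned hyperparameter (the table does not list the RR grid, so the grid, data and names here are illustrative):

```python
import numpy as np

def ridge_fit(X, y, lam):
    # Closed-form ridge solution: w = (X'X + lam*I)^(-1) X'y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def inner_cv_score(X, y, lam, k=5):
    # Mean squared error over k inner folds for one candidate hyperparameter
    errs = []
    for fold in np.array_split(np.arange(len(y)), k):
        train = np.ones(len(y), dtype=bool)
        train[fold] = False
        w = ridge_fit(X[train], y[train], lam)
        errs.append(np.mean((X[fold] @ w - y[fold]) ** 2))
    return float(np.mean(errs))

# Synthetic training fold: 90 plots, 6 fused features
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 6))
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=90)

grid = [0.01, 0.1, 1.0, 10.0]  # illustrative lambda grid
best = min(grid, key=lambda lam: inner_cv_score(X, y, lam))
```

The same pattern applies to the listed Cubist, SVM, DNN and RF grids: score each candidate on the inner folds and keep the best before refitting on the full outer training fold.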
Fig. 7A workflow of a multi-sensor data fusion and ensemble learning, and b outer and inner cross-validation. MS multi-spectral features, TIR thermal infrared features, SVM support vector machine, DNN deep neural network, RR ridge regression, RF random forest, CV cross-validation, P and p model predictions at different modeling stages
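The stacking stage of Fig. 7 can be sketched as follows: base learners generate out-of-fold predictions on the training set, and the secondary ridge learner (StRR) is fit on those stacked predictions. This is an illustrative sketch, not the paper's code; two ridge models with different regularization stand in for the five base learners, and all data are synthetic.

```python
import numpy as np

def ridge_fit(X, y, lam):
    # Closed-form ridge regression: w = (X'X + lam*I)^(-1) X'y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def oof_predictions(X, y, lam, k=5):
    # Out-of-fold predictions: every sample is predicted by a model that
    # never saw it, so the secondary learner's inputs are not overfit.
    preds = np.empty(len(y))
    for fold in np.array_split(np.arange(len(y)), k):
        train = np.ones(len(y), dtype=bool)
        train[fold] = False
        preds[fold] = X[fold] @ ridge_fit(X[train], y[train], lam)
    return preds

# Synthetic "fused feature" data: 120 plots, 8 features
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 8))
y = X @ rng.normal(size=8) + 0.2 * rng.normal(size=120)

# Level 1: column-stack the base learners' out-of-fold predictions
base_lams = [0.1, 10.0]  # two stand-in base learners
P = np.column_stack([oof_predictions(X, y, lam) for lam in base_lams])

# Level 2 (StRR): ridge regression fit on the stacked predictions
meta_w = ridge_fit(P, y, lam=1.0)
ensemble = P @ meta_w
```

Fitting the secondary learner on out-of-fold rather than in-fold predictions is what keeps the ensemble's reported accuracy honest.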
Test accuracy statistics of different models for grain yield prediction (the accuracy parameters in this table are the average of 400 test results)
| Sensor | Metric | Cubist | SVM | DNN | RR | RF | StRR |
|---|---|---|---|---|---|---|---|
| RGB | R² | 0.514 | 0.597 | 0.606 | 0.556 | 0.605 | 0.624 |
| | RMSE (t ha−1) | 1.149 | 1.046 | 1.045 | 1.103 | 1.034 | 1.016 |
| | RPD | 1.427 | 1.559 | 1.562 | 1.481 | 1.580 | 1.606 |
| | RPIQ | 2.081 | 2.275 | 2.279 | 2.160 | 2.306 | 2.345 |
| MS | R² | 0.498 | 0.502 | 0.489 | 0.477 | 0.509 | 0.532 |
| | RMSE (t ha−1) | 1.164 | 1.149 | 1.165 | 1.188 | 1.140 | 1.120 |
| | RPD | 1.400 | 1.413 | 1.393 | 1.368 | 1.424 | 1.449 |
| | RPIQ | 2.043 | 2.061 | 2.033 | 1.998 | 2.078 | 2.117 |
| TIR | R² | 0.529 | 0.553 | 0.578 | 0.563 | 0.599 | 0.617 |
| | RMSE (t ha−1) | 1.133 | 1.102 | 1.079 | 1.146 | 1.038 | 1.026 |
| | RPD | 1.449 | 1.486 | 1.517 | 1.440 | 1.579 | 1.594 |
| | RPIQ | 2.112 | 2.168 | 2.213 | 2.100 | 2.303 | 2.325 |
| RGB + MS | R² | 0.541 | 0.638 | 0.648 | 0.605 | 0.622 | 0.662 |
| | RMSE (t ha−1) | 1.114 | 0.982 | 0.990 | 1.043 | 1.009 | 0.960 |
| | RPD | 1.464 | 1.652 | 1.637 | 1.556 | 1.608 | 1.690 |
| | RPIQ | 2.145 | 2.422 | 2.401 | 2.284 | 2.360 | 2.479 |
| RGB + TIR | R² | 0.527 | 0.615 | 0.631 | 0.603 | 0.670 | 0.671 |
| | RMSE (t ha−1) | 1.143 | 1.028 | 1.015 | 1.045 | 0.955 | 0.951 |
| | RPD | 1.435 | 1.587 | 1.608 | 1.563 | 1.711 | 1.718 |
| | RPIQ | 2.093 | 2.315 | 2.346 | 2.280 | 2.498 | 2.508 |
| MS + TIR | R² | 0.543 | 0.591 | 0.596 | 0.592 | 0.629 | 0.640 |
| | RMSE (t ha−1) | 1.116 | 1.048 | 1.052 | 1.067 | 0.998 | 0.991 |
| | RPD | 1.463 | 1.553 | 1.546 | 1.524 | 1.633 | 1.643 |
| | RPIQ | 2.137 | 2.269 | 2.259 | 2.228 | 2.388 | 2.402 |
| RGB + MS + TIR | R² | 0.563 | 0.666 | 0.670 | 0.630 | 0.665 | 0.692 |
| | RMSE (t ha−1) | 1.092 | 0.949 | 0.964 | 1.006 | 0.956 | 0.916 |
| | RPD | 1.494 | 1.709 | 1.681 | 1.612 | 1.698 | 1.771 |
| | RPIQ | 2.193 | 2.511 | 2.470 | 2.369 | 2.496 | 2.602 |
Cubist, SVM, DNN, RR and RF served as base learners; StRR is the secondary learner
MS multi-spectral features, TIR thermal infrared features, SVM support vector machine, DNN deep neural network, RR ridge regression, RF random forest, StRR stacking regression using ridge regression as a secondary learner
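The table's accuracy statistics follow standard definitions: RPD is the standard deviation of the observations divided by RMSE, and RPIQ is their inter-quartile range divided by RMSE. A minimal sketch with illustrative (not the paper's) yield values:

```python
import numpy as np

def yield_metrics(obs, pred):
    # R2, RMSE, RPD (= SD of observations / RMSE) and
    # RPIQ (= inter-quartile range of observations / RMSE)
    rmse = float(np.sqrt(np.mean((obs - pred) ** 2)))
    rpd = float(np.std(obs, ddof=1)) / rmse
    q1, q3 = np.percentile(obs, [25, 75])
    rpiq = (q3 - q1) / rmse
    r2 = 1 - np.sum((obs - pred) ** 2) / np.sum((obs - np.mean(obs)) ** 2)
    return {"R2": float(r2), "RMSE": rmse, "RPD": rpd, "RPIQ": rpiq}

# Illustrative plot-level yields in t/ha
obs = np.array([5.1, 6.3, 4.8, 7.0, 5.9, 6.5])
pred = np.array([5.4, 6.0, 5.0, 6.6, 6.1, 6.2])
m = yield_metrics(obs, pred)
```

Because RPD and RPIQ normalize RMSE by the spread of the observed yields, they allow comparison across trials with different yield variability.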
Fig. 8The statistical distribution of the prediction accuracy of individual machine learning and ensemble learning for grain yield prediction using individual sensor data in the modeling test phase. MS multi-spectral features, TIR thermal infrared features, SVM support vector machine, DNN deep neural network, RR ridge regression, RF random forest, StRR stacking regression using ridge regression as a secondary learner
Fig. 9The statistical distributions of the prediction accuracy of individual machine learning and ensemble learning for grain yield prediction using multi-sensor data fusion in the modeling test phase. MS multi-spectral features, TIR thermal infrared features, SVM support vector machine, DNN deep neural network, RR ridge regression, RF random forest, StRR stacking regression using ridge regression as a secondary learner
Fig. 10The distribution of coefficients within the level-2 models (ridge regression). MS multi-spectral features, TIR thermal infrared features, SVM support vector machine, DNN deep neural network, RR ridge regression, RF random forest, StRR stacking regression using ridge regression as a secondary learner
Fig. 11Comparison of accuracy improvement (R2) of ensemble learning and data fusion. MS multi-spectral features, TIR thermal infrared features, DNN deep neural network, RF random forest, StRR stacking regression using ridge regression as a secondary learner
Fig. 12Variance inflation factor (VIF) for the output predictions of a each individual sensor and b each machine learning model. MS multi-spectral features, TIR thermal infrared features, SVM support vector machine, DNN deep neural network, RR ridge regression, RF random forest, StRR stacking regression using ridge regression as a secondary learner
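The VIF in Fig. 12 follows the usual definition VIF_j = 1/(1 − R_j²), where R_j² comes from regressing prediction j on the remaining predictions. A minimal numpy sketch on synthetic data (not the paper's outputs):

```python
import numpy as np

def vif(X):
    # Variance inflation factor per column: regress column j on the other
    # columns (plus an intercept); VIF_j = 1 / (1 - R_j^2).
    out = []
    for j in range(X.shape[1]):
        A = np.column_stack([np.delete(X, j, axis=1), np.ones(len(X))])
        coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        ss_res = np.sum((X[:, j] - A @ coef) ** 2)
        ss_tot = np.sum((X[:, j] - X[:, j].mean()) ** 2)
        out.append(ss_tot / ss_res)  # equals 1 / (1 - R^2)
    return np.array(out)

# Toy predictions: columns 0 and 2 are nearly identical, column 1 independent
rng = np.random.default_rng(2)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.05 * rng.normal(size=200)
v = vif(np.column_stack([a, b, c]))
```

VIF values near 1 indicate that a sensor's or model's predictions carry information not already present in the others, which is why low VIF supports combining them in the ensemble.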
Fig. 13Spatial distribution of predicted grain yield (t ha−1) at the plot scale using multi-sensor fusion and ensemble learning. ***Indicates significant at the 0.001 level. MS multi-spectral features, TIR thermal infrared features