Nuzhat Khan, Mohamad Anuar Kamaruddin, Usman Ullah Sheikh, Mohd Hafiz Zawawi, Yusri Yusup, Muhammed Paend Bakht, Norazian Mohamed Noor.
Abstract
Recent developments in precision agriculture have underscored the role of machine learning in crop yield prediction. Machine learning algorithms can learn both linear and nonlinear patterns in complex agro-meteorological data. However, machine learning methods are rarely applied to predictive analysis in the oil palm industry. This work evaluated a supervised machine learning approach to develop an explainable and reusable oil palm yield prediction workflow. The input data included 12 weather and three soil moisture parameters, along with 420 months of actual yield records from the study site. Multisource data and conventional machine learning techniques were coupled with an automated model selection process. The performance of the two top regression models, Extra Tree and AdaBoost, was evaluated using six statistical evaluation metrics. Prediction was preceded by data preprocessing and feature selection. The selected regression models were compared with Random Forest, Gradient Boosting, Decision Tree, and other non-tree algorithms to demonstrate the superiority of tree-based ensemble models in terms of R². In addition, the learning process of the models was examined using model-based feature importance, learning curves, validation curves, residual analysis, and prediction error. The results indicated that rainfall frequency, root-zone soil moisture, and temperature could have a significant impact on oil palm yield. The most influential features in the prediction process were rainfall, cloud amount, number of rain days, wind speed, and root-zone soil wetness. It is concluded that machine learning has great potential for predicting oil palm yield from weather and soil moisture data.
Keywords: crop yield; machine learning; oil palm; precision agriculture; prediction; sustainability
Year: 2022 PMID: 35807648 PMCID: PMC9268852 DOI: 10.3390/plants11131697
Source DB: PubMed Journal: Plants (Basel) ISSN: 2223-7747
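The model comparison described in the abstract can be outlined with a minimal sketch. It assumes scikit-learn is available and uses synthetic data in place of the study's records; the 420 rows and 15 predictors only mirror the counts given in the abstract. The estimator settings, the 5-fold split, and all variable names are illustrative assumptions, not the authors' configuration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor, ExtraTreesRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the real data (assumption): 420 monthly rows,
# 15 predictors (12 weather + 3 soil moisture in the abstract).
X, y = make_regression(
    n_samples=420, n_features=15, n_informative=5, noise=10.0, random_state=0
)

# Hypothetical hyperparameters, chosen only for illustration.
models = {
    "Extra Trees": ExtraTreesRegressor(n_estimators=100, random_state=0),
    "AdaBoost": AdaBoostRegressor(n_estimators=50, random_state=0),
}

# Mean cross-validated R^2 per model.
mean_r2 = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    mean_r2[name] = float(scores.mean())

# Model-based feature importance: indices of the five most important columns.
fitted = ExtraTreesRegressor(n_estimators=100, random_state=0).fit(X, y)
importances = fitted.feature_importances_
top5 = sorted(range(X.shape[1]), key=lambda i: importances[i], reverse=True)[:5]

print(mean_r2)
print(top5)
```

With real monthly records, a time-ordered split may be more appropriate than plain k-fold; the source does not state which scheme was used.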
Figure 1. Study area.
A detailed summary of input data for yield modeling.
| Category | Variable | Spatial Resolution | Temporal Resolution | Time Coverage | Source |
|---|---|---|---|---|---|
| Crop data | Yield (t/ha) | NA | 1 Month | 1986–2020 | MPOB |
| Soil moisture data | Surface soil wetness | 10 m | 1 Month | 1986–2020 | NASA |
| Soil moisture data | Profile soil wetness | 10 m | 1 Month | 1986–2020 | NASA |
| Soil moisture data | Root zone soil wetness (%) | 10 m | 1 Month | 1986–2020 | NASA |
| Meteorological data | Cloud amount | NA | 1 Month | 1986–2020 | NASA |
| Meteorological data | Rain days/month | NA | 1 Month | 1986–2020 | MET |
| Meteorological data | Wind speed (m/s) | 10 m | 1 Month | 1986–2020 | NASA |
| Meteorological data | Rainfall (mm) | 10 m | 1 Month | 1986–2020 | MET |
| Meteorological data | Radiative flux (kW/h) | 2 m | 1 Month | 1986–2020 | NASA/MET |
| Meteorological data | Min temp (°C) | 2 m | 1 Month | 1986–2020 | NASA/MET |
| Meteorological data | Max temp (°C) | 2 m | 1 Month | 1986–2020 | NASA/MET |
| Meteorological data | Earth skin temp (°C) | 2 m | 1 Month | 1986–2020 | NASA/MET |
| Meteorological data | Temperature range (°C) | 2 m | 1 Month | 1986–2020 | NASA/MET |
| Meteorological data | Surface pressure (kPa) | 2 m | 1 Month | 1986–2020 | NASA/MET |
| Meteorological data | Relative humidity (%) | 2 m | 1 Month | 1986–2020 | NASA/MET |
| Meteorological data | Specific humidity (%) | 2 m | 1 Month | 1986–2020 | NASA/MET |
| Meteorological data | Precipitation (mm) | 2 m | 1 Month | 1986–2020 | NASA/MET |
Figure 2. Proposed workflow.
Figure 3. Schematic diagram of the Extra Tree Regressor.
Figure 4. Schematic diagram of the AdaBoost Regressor.
Figure 5. Feature importance plot of Extra Tree.
Figure 6. Feature importance plot of AdaBoost.
Figure 7. The residual plot of Extra Tree.
Figure 8. The residual plot of AdaBoost.
Figure 9. Prediction error of (a) Extra Tree; (b) AdaBoost.
Figure 10. Learning curve of (a) Extra Tree; (b) AdaBoost.
Figure 11. Cross validation of (a) Extra Tree; (b) AdaBoost.
Figure 12. Prediction of oil palm yield by Extra Tree.
Figure 13. Prediction of oil palm yield by AdaBoost.
Performance comparison of tree-based models.
| Model | MAE | MSE | RMSE | R² | RMSLE | MAPE |
|---|---|---|---|---|---|---|
| Extra Tree | 0.1562 | 0.0405 | 0.2013 | 0.6057 | 0.0788 | 0.106 |
| AdaBoost | 0.1602 | 0.038 | 0.1951 | 0.63 | 0.0779 | 0.1073 |
| Random Forest | 0.1815 | 0.0534 | 0.2279 | 0.3894 | 0.0922 | 0.1289 |
| Decision Tree | 0.2505 | 0.1018 | 0.3161 | −0.2015 | 0.1273 | 0.1750 |
| Gradient Boosting | 0.1836 | 0.0545 | 0.2309 | 0.3748 | 0.0931 | 0.1301 |
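The six metrics reported in the table can be computed from observed and predicted yields as in the following minimal sketch. The function name `regression_metrics` and the sample values are hypothetical. The formulas are the standard definitions; RMSLE is assumed to use log(1 + y), and MAPE is assumed to be expressed as a fraction, since the source does not state which variants were used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, R2, RMSLE, and MAPE for two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when an observed value is zero")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # R2 compares the model's squared error with that of predicting the mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot

    # RMSLE on log(1 + y); assumes every value is greater than -1.
    log_sq = [(math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)]
    rmsle = math.sqrt(sum(log_sq) / n)

    # MAPE as a fraction (multiply by 100 for percent).
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2, "RMSLE": rmsle, "MAPE": mape}


# Illustrative values only, not data from the study.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(metrics)
```

Under this definition, a negative R², such as the −0.2015 reported for Decision Tree, means the model's squared error is larger than that of always predicting the mean observed yield.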
Figure 14. KPI-based performance comparison of different models.