Ali Mokhtar1,2,3, Wessam El-Ssawy1,4, Hongming He2,3, Nadhir Al-Ansari5, Saad Sh Sammen6, Yeboah Gyasi-Agyei7, Mohamed Abuarab1.
Abstract
Prediction of crop yield is an essential task for maximizing the global food supply, particularly in developing countries. This study investigated lettuce yield (fresh weight) prediction using four machine learning (ML) models, namely, support vector regressor (SVR), extreme gradient boosting (XGB), random forest (RF), and deep neural network (DNN). Lettuce was cultivated in three hydroponic systems (a suspended nutrient film technique system, a pyramidal aeroponic system, and a tower aeroponic system), each combined with three different magnetic unit strengths, under a controlled greenhouse environment during the 2018 and 2019 growing seasons. Three scenarios consisting of combinations of the input variables (leaf number, water consumption, dry weight, stem length, and stem diameter) were assessed. The XGB model with scenario 3 (all input variables) yielded the lowest root mean square error (RMSE) of 8.88 g, followed by SVR with the same scenario at 9.55 g, while the highest RMSE of 12.89 g was produced by RF with scenario 1 (leaf number and water consumption). All model scenarios had Scatter Index (SI) values (RMSE divided by the mean observed yield) below 0.1 and were therefore classified as excellent at predicting fresh lettuce yield. Based on all of the performance statistics, the two best models were SVR with scenario 3 and DNN with scenario 2 (leaf number, water consumption, and dry weight). However, DNN with scenario 2 is preferred because it requires fewer input variables. The potential of the DNN model to predict fresh lettuce yield is promising, and it can be applied on a large scale as a rapid decision-support tool for managing crop yield.
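The Scatter Index used to grade the models is defined in the abstract as the RMSE divided by the mean of the observed yields. A minimal sketch of that computation follows; the sample values are hypothetical illustrations, not the study's data:

```python
import math

def rmse(observed, predicted):
    """Root mean square error, in the same units as the data (grams here)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def scatter_index(observed, predicted):
    """Scatter Index: RMSE divided by the mean of the observed values."""
    return rmse(observed, predicted) / (sum(observed) / len(observed))

# Hypothetical fresh head weights (g) -- illustrative only.
obs = [320.0, 340.0, 310.0, 355.0]
pred = [315.0, 348.0, 305.0, 350.0]

print(f"RMSE = {rmse(obs, pred):.2f} g, SI = {scatter_index(obs, pred):.3f}")
```

By the study's criterion, an SI below 0.1 classifies a model scenario as excellent.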
Keywords: DNN; deep learning; food safety; machine learning; yield prediction
Year: 2022 PMID: 35310645 PMCID: PMC8928436 DOI: 10.3389/fpls.2022.706042
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
FIGURE 1. Components of the experimental setup. (A) Photograph. (B) Computer graphics.
FIGURE 2. Flowchart of the treatments implemented and models applied.
TABLE 1. Descriptive statistical analysis of the collected data.
| Variable | Mean | Max | Min | SD | Q1 | Q3 |
| Stem diameter | 22.05 | 28.20 | 17.00 | 2.84 | 19.98 | 23.98 |
| Leaf number | 26.88 | 37.00 | 21.00 | 3.51 | 24.00 | 29.00 |
| Stem length | 41.15 | 52.00 | 32.00 | 4.28 | 38.00 | 43.00 |
| Dry weight | 18.20 | 27.90 | 13.10 | 3.17 | 16.25 | 19.05 |
| Water/area | 0.32 | 0.42 | 0.25 | 0.05 | 0.26 | 0.34 |
| Fresh head weight | 329.81 | 416.20 | 275.20 | 36.48 | 301.73 | 346.10 |
TABLE 2. Summary of the combinations of input variables for the applied models.
| Scenario | Models | Input variable combination |
| 1 | SVR1, XGB1, RF1, DNN1 | Leaf number, water consumption |
| 2 | SVR2, XGB2, RF2, DNN2 | Leaf number, water consumption, dry weight |
| 3 | SVR3, XGB3, RF3, DNN3 | Leaf number, water consumption, dry weight, stem length, stem diameter |
The numeric suffix on each model name denotes the scenario: SVR1, XGB1, RF1, and DNN1 were trained under the first scenario, a suffix of 2 denotes the second scenario, and 3 denotes the third.
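The three scenarios are nested subsets of the five input variables, so the same record can feed any model variant. A small sketch of that feature selection; the dictionary keys and the sample record are hypothetical names chosen to mirror the table above:

```python
# Hypothetical mapping of scenario number to input-variable list,
# mirroring the table above (scenario 1 ⊂ scenario 2 ⊂ scenario 3).
SCENARIOS = {
    1: ["leaf_number", "water_consumption"],
    2: ["leaf_number", "water_consumption", "dry_weight"],
    3: ["leaf_number", "water_consumption", "dry_weight",
        "stem_length", "stem_diameter"],
}

def select_features(sample: dict, scenario: int) -> list:
    """Build the feature vector a model of the given scenario would receive."""
    return [sample[name] for name in SCENARIOS[scenario]]

# One hypothetical plant record (magnitudes loosely follow Table 1).
plant = {"leaf_number": 27, "water_consumption": 0.32, "dry_weight": 18.2,
         "stem_length": 41.0, "stem_diameter": 22.0}

print(select_features(plant, 2))  # the three inputs of SVR2/XGB2/RF2/DNN2
```

Because the scenarios nest, scenario 2 models see exactly the scenario 1 inputs plus dry weight, which is why the abstract can trade one extra variable against accuracy.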
FIGURE 3. The performance statistics values for the different model scenarios.
FIGURE 4. Taylor diagram displaying a statistical comparison of the applied models used for predicting fresh head weight (yield).
FIGURE 5. Boxplots showing the distribution of the estimation errors in the test stage for the support vector regressor (SVR), extreme gradient boosting (XGB), deep neural network (DNN), and random forest (RF) models. Q25, lower quartile of errors; Q75, upper quartile of errors; IQR, interquartile range for each model.
TABLE 3. The performance statistics of the support vector regressor (SVR), extreme gradient boosting (XGB), deep neural network (DNN), and random forest (RF) models for lettuce.
| Model | Scenario | SI | Tstat | U95 | MBE |
| SVR | 1 | 0.035 | 0.647 | 31.90 | 1.59 |
| SVR | 2 | 0.032 | 0.015 | 29.35 | 0.034 |
| SVR | 3 | 0.029 | 1.600 | 26.10 | 3.10 |
| XGB | 1 | 0.051 | 0.780 | 46.80 | 2.84 |
| XGB | 2 | 0.031 | 0.110 | 28.70 | –0.25 |
| XGB | 3 | 0.027 | 0.540 | 24.80 | 1.04 |
| DNN | 1 | 0.037 | 0.630 | 34.50 | 1.70 |
| DNN | 2 | 0.033 | 1.650 | 30.30 | 3.80 |
| DNN | 3 | 0.035 | 1.630 | 31.90 | 3.95 |
| RF | 1 | 0.039 | 0.160 | 36.20 | –0.45 |
| RF | 2 | 0.035 | 0.135 | 32.10 | –0.34 |
| RF | 3 | 0.033 | 0.087 | 30.30 | –0.21 |
SI, Scatter Index; Tstat, T-statistic test; U95, Uncertainty with a 95% confidence level; MBE, mean bias error.
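Of the tabulated statistics, MBE has a standard definition (mean of predicted minus observed, so positive values indicate over-prediction). For U95, one common definition in the model-evaluation literature is U95 = 1.96 · sqrt(SD² + RMSE²), with SD the standard deviation of the errors; this formula is an assumption here, as the record does not state which variant the authors used. A sketch under that assumption:

```python
import math

def mbe(observed, predicted):
    """Mean bias error: positive values indicate systematic over-prediction."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)

def u95(observed, predicted):
    """Expanded uncertainty at the 95% confidence level.
    Assumed form: U95 = 1.96 * sqrt(SD^2 + RMSE^2), where SD is the
    standard deviation of the prediction errors; the paper may use a variant."""
    n = len(observed)
    errs = [p - o for o, p in zip(observed, predicted)]
    mean_e = sum(errs) / n
    sd = math.sqrt(sum((e - mean_e) ** 2 for e in errs) / n)
    rmse = math.sqrt(sum(e ** 2 for e in errs) / n)
    return 1.96 * math.sqrt(sd ** 2 + rmse ** 2)
```

With this convention an unbiased model (MBE = 0) can still carry a large U95 if its errors are widely scattered, which matches the table: the RF rows combine near-zero MBE with mid-range U95 values.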