Lei-Ming Yuan, Lifan You, Xiaofeng Yang, Xiaojing Chen, Guangzao Huang, Xi Chen, Wen Shi, Yiye Sun.
Abstract
To reduce the run-to-run uncertainty of the genetic algorithm (GA) when optimizing a near-infrared spectral calibration model, and to avoid losing the spectral information carried by the unselected variables, a strategy of fusing consensus models is proposed to measure the soluble solids content (SSC) of peaches. A total of 266 peach samples were collected across four arrivals. Their interactance spectra were scanned with an integrated analyzer prototype, and the SSC was then measured destructively by the standard refractometry method. The near-infrared spectra were pre-processed with mean centering, and variables were then selected with the GA to construct the consensus model, which integrated two member models with optimized weightings. One was the conventional partial least squares (PLS) model built on the GA-selected variables (PLSGA), and the other was a derived PLS model built on the residual variables left after GA selection (PLSRV). The PLSRV models showed that the residual variables still carry useful spectral information related to the SSC of peaches, and some of these models performed close to the full-spectrum PLS model. Over 10 runs, the consensus models obtained a lower root mean squared error of prediction (RMSEP), with an average of 1.106% and a standard deviation (SD) of 0.0068, than the optimized PLSGA models, which achieved an average RMSEP of 1.116% with an SD of 0.0097. It can be concluded that the fusion strategy can reduce the fluctuation of a model optimized by a genetic algorithm, make fuller use of the available spectral information, and enable rapid detection of the internal quality of peaches.
Keywords: consensus fusion; genetic algorithm; near-infrared spectroscopy; partial least squares; peach
Year: 2022 PMID: 35454682 PMCID: PMC9030883 DOI: 10.3390/foods11081095
Source DB: PubMed Journal: Foods ISSN: 2304-8158
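The abstract describes the consensus model as a fusion of two member models (PLSGA and PLSRV) with optimized weightings, but this record does not state how the weights were optimized. The following is a minimal sketch under one plausible assumption: a single weight `w` in [0, 1] is chosen to minimize the squared error of `w * pred_ga + (1 - w) * pred_rv` on the calibration samples. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def rmse(y_ref, y_pred):
    """Root mean squared error between reference and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_ref, y_pred)) / len(y_ref))


def optimal_weight(y_ref, pred_ga, pred_rv):
    """Weight w in [0, 1] minimizing the RMSE of w*pred_ga + (1-w)*pred_rv.

    ASSUMPTION: a least-squares weight clipped to [0, 1]; the paper's actual
    weighting scheme is not given in this record.
    """
    diff = [a - b for a, b in zip(pred_ga, pred_rv)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        return 0.5  # identical member predictions: any weight gives the same result
    num = sum((t - b) * d for t, b, d in zip(y_ref, pred_rv, diff))
    return min(1.0, max(0.0, num / denom))


def fuse(pred_ga, pred_rv, w):
    """Consensus prediction as a weighted average of the two member models."""
    return [w * a + (1.0 - w) * b for a, b in zip(pred_ga, pred_rv)]


# Toy calibration data (hypothetical SSC values in %, not from the paper).
y_cal = [10.0, 12.0, 9.0, 11.0]
ga_cal = [10.4, 11.5, 9.3, 11.2]  # predictions of a PLSGA-like member model
rv_cal = [9.5, 12.6, 8.6, 10.7]   # predictions of a PLSRV-like member model

w = optimal_weight(y_cal, ga_cal, rv_cal)
fused_cal = fuse(ga_cal, rv_cal, w)
print(f"w = {w:.3f}")
print(f"RMSE GA = {rmse(y_cal, ga_cal):.3f}, RV = {rmse(y_cal, rv_cal):.3f}, "
      f"fused = {rmse(y_cal, fused_cal):.3f}")
```

Because `w = 0` and `w = 1` are both allowed, the fused calibration error under this scheme can never exceed that of the better member model; whether the gain carries over to the prediction set has to be checked separately.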
Figure 1. Schematic diagram of peach spectrum acquisition.
Figure 2. Spectrum of soluble matter content in peach samples.
Distribution of soluble solids content in peach samples.
| | Number | Max | Min | Mean | Standard Deviation |
|---|---|---|---|---|---|
| Calibration set | 177 | 15.0 | 6.4 | 10.61 | 1.63 |
| Prediction set | 89 | 15.5 | 6.4 | 10.86 | 1.83 |
Figure 3. Original near-infrared reflectance spectra of peaches. Each curve represents one sample; the colors serve only to make individual spectra easier to follow.
Comparison of pretreatments on the predictive performance of the developed PLS model.
| Pretreatments | LVs | Calibration Set RMSEC | Calibration Set Rc | Calibration Set Bias | Prediction Set RMSEP | Prediction Set Rp | Prediction Set Bias |
|---|---|---|---|---|---|---|---|
| S-G D1st | 9 | 1.066 | 0.755 | −0.030 | 1.297 | 0.733 | −0.015 |
| MSC | 9 | 1.029 | 0.779 | −0.005 | 1.139 | 0.765 | −0.006 |
| SNV | 9 | 1.029 | 0.778 | −0.008 | 1.141 | 0.764 | −0.010 |
| MC | 11 | 1.017 | 0.792 | −0.002 | 1.129 | 0.771 | −0.008 |
| None | 10 | 1.030 | 0.773 | −0.003 | 1.142 | 0.762 | −0.007 |
Note: MSC: multiplicative scatter correction; SNV: standard normal variate; MC: mean centering; S-G D1st: first derivative with Savitzky-Golay (S-G) smoothing; PLS: partial least squares; LVs: latent variables.
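As a rough illustration of the mean-centering pretreatment and of error statistics of the kind reported for the calibration and prediction sets, the sketch below uses plain Python. It assumes that the column means are estimated on the calibration spectra only and then applied to new spectra, and that bias is defined as the mean of (predicted minus reference); neither convention is stated in this record, and all names and numbers are hypothetical.

```python
import math


def mean_center(cal_spectra, new_spectra):
    """Subtract the calibration-set mean spectrum from both sets of spectra.

    ASSUMPTION: means come from the calibration set only, so the prediction
    set is treated as unseen data.
    """
    n = len(cal_spectra)
    means = [sum(col) / n for col in zip(*cal_spectra)]

    def center(rows):
        return [[x - m for x, m in zip(row, means)] for row in rows]

    return center(cal_spectra), center(new_spectra)


def rmse(y_ref, y_pred):
    """Root mean squared error (RMSEC or RMSEP, depending on the sample set)."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_ref, y_pred)) / len(y_ref))


def bias(y_ref, y_pred):
    """Mean signed error; ASSUMPTION: predicted minus reference."""
    return sum(p - t for t, p in zip(y_ref, y_pred)) / len(y_ref)


# Toy spectra with two wavelength variables (hypothetical values).
cal = [[1.0, 2.0], [3.0, 6.0]]
new = [[2.0, 5.0]]
cal_c, new_c = mean_center(cal, new)
print(cal_c, new_c)
print(rmse([10.0, 12.0], [11.0, 11.0]), bias([10.0, 12.0], [11.0, 13.0]))
```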
Predictive performance of PLS models built with the genetic algorithm method.
| Member Model | Selected/Residual Variables | Calibration Subset RMSEC | Calibration Subset Rc | Prediction Subset RMSEP | Prediction Subset Rp |
|---|---|---|---|---|---|
| PLSGA, run 1 | 62 a (6) b | 0.942 | 0.817 | 1.124 | 0.786 |
| PLSRV, run 1 | 165 (8) | 1.079 | 0.735 | 1.154 | 0.723 |
| PLSGA, run 2 | 55 (5) | 0.926 | 0.823 | 1.117 | 0.787 |
| PLSRV, run 2 | 172 (6) | 1.0931 | 0.715 | 1.157 | 0.725 |
| PLSGA, run 3 | 35 (5) | 0.928 | 0.821 | 1.131 | 0.780 |
| PLSRV, run 3 | 192 (7) | 1.082 | 0.735 | 1.177 | 0.721 |
| PLSGA, run 4 | 33 (4) | 0.936 | 0.819 | 1.113 | 0.784 |
| PLSRV, run 4 | 194 (8) | 1.068 | 0.736 | 1.171 | 0.725 |
| PLSGA, run 5 | 52 (5) | 0.954 | 0.811 | 1.122 | 0.783 |
| PLSRV, run 5 | 175 (7) | 1.055 | 0.738 | 1.157 | 0.716 |
| PLSGA, run 6 | 43 (4) | 0.924 | 0.823 | 1.104 | 0.791 |
| PLSRV, run 6 | 184 (7) | 1.075 | 0.735 | 1.177 | 0.723 |
| PLSGA, run 7 | 21 (4) | 0.908 | 0.829 | 1.103 | 0.771 |
| PLSRV, run 7 | 206 (8) | 1.083 | 0.741 | 1.178 | 0.729 |
| PLSGA, run 8 | 34 (4) | 0.936 | 0.818 | 1.127 | 0.778 |
| PLSRV, run 8 | 193 (8) | 1.063 | 0.738 | 1.168 | 0.725 |
| PLSGA, run 9 | 64 (6) | 0.917 | 0.825 | 1.114 | 0.788 |
| PLSRV, run 9 | 163 (8) | 1.078 | 0.733 | 1.179 | 0.715 |
| PLSGA, run 10 | 37 (4) | 0.900 | 0.832 | 1.107 | 0.792 |
| PLSRV, run 10 | 190 (8) | 1.096 | 0.732 | 1.175 | 0.713 |
Note: PLSGA: model developed with the variables selected by the GA method; PLSRV: model developed with the residual variables; run i: the i-th run of the GA method. Superscript a marks the number of spectral variables used for modeling, and superscript b (in parentheses) marks the number of latent variables in the PLS model.
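The tabulated values can be cross-checked against the summary statistics quoted in the abstract. The short sketch below assumes that the rows alternate between PLSGA and PLSRV within each of the 10 runs (the rows with the smaller variable counts being the GA-selected models) and that the reported SD is the sample standard deviation; both are inferences from the numbers, not statements from the source.

```python
import statistics

# Prediction-subset RMSEP per run, copied from the table above.
rmsep_plsga = [1.124, 1.117, 1.131, 1.113, 1.122, 1.104, 1.103, 1.127, 1.114, 1.107]
rmsep_plsrv = [1.154, 1.157, 1.177, 1.171, 1.157, 1.177, 1.178, 1.168, 1.179, 1.175]

# Number of GA-selected and residual variables per run.
n_selected = [62, 55, 35, 33, 52, 43, 21, 34, 64, 37]
n_residual = [165, 172, 192, 194, 175, 184, 206, 193, 163, 190]

# Selected and residual variables should partition the same full spectrum.
totals = {s + r for s, r in zip(n_selected, n_residual)}
print("total variables per run:", totals)

mean_ga = statistics.mean(rmsep_plsga)
sd_ga = statistics.stdev(rmsep_plsga)  # sample SD (n - 1 in the denominator)
mean_rv = statistics.mean(rmsep_plsrv)

print(f"PLSGA RMSEP: mean = {mean_ga:.3f}, SD = {sd_ga:.4f}")
print(f"PLSRV RMSEP: mean = {mean_rv:.3f}")
```

Under these assumptions the PLSGA figures agree with the abstract (average RMSEP of 1.116 with an SD of 0.0097), and every run splits the same 227 spectral variables between the two member models.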
Figure 4. Root mean square error of models built with the GA-selected variables, the residual variables, and consensus fusion: (a) calibration set; (b) prediction set.