Hongzhe Jiang, Yilei Hu, Xuesong Jiang, Hongping Zhou.
Abstract
The maturity of Camellia oleifera fruit is one of the most important indicators for choosing the optimal harvest day, which in turn leads to a high yield and good quality of the produced Camellia oil. A hyperspectral imaging (HSI) system covering the visible and near-infrared range (400-1000 nm) was employed to assess the maturity stages of Camellia oleifera fruit. Hyperspectral images were acquired for 1000 samples collected at five different maturity stages. The spectrum of each sample was extracted from the identified region of interest (ROI) in its hyperspectral image. Principal component analysis (PCA) of the spectra revealed that the first three PCs showed potential for discriminating samples at different maturity stages. Two classification models, partial least-squares discriminant analysis (PLS-DA) and principal component analysis discriminant analysis (PCA-DA), were developed on the raw or pre-processed full spectra, and their performances were compared. The PLS-DA model based on second-order (2nd) derivative pre-processed spectra achieved the highest correct classification rates (CCRs): 99.2%, 98.4%, and 97.6% in the calibration, cross-validation, and prediction sets, respectively. Key wavelengths selected by PC loadings, two-dimensional correlation spectroscopy (2D-COS), and uninformative variable elimination combined with the successive projections algorithm (UVE+SPA) were then used as inputs to PLS-DA models; among these simplified models, UVE-SPA-PLS-DA was optimal, with the highest prediction-set CCR of 81.2%. The confusion matrix of this optimal simplified model showed satisfactory sensitivity, specificity, and precision. Misclassification was most likely to occur among samples at maturity stages two, three, and four.
Overall, HSI with effectively selected variables, coupled with PLS-DA, could provide an accurate method, and a reference for a simple system, for rapidly discriminating the maturity stages of Camellia oleifera fruit samples.
Keywords: Camellia oleifera fruit; chemometrics; hyperspectral imaging; maturity; near-infrared
Year: 2022 PMID: 36234855 PMCID: PMC9572681 DOI: 10.3390/molecules27196318
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.927
Figure 1. The internal structure of Camellia oleifera fruit.
Figure 2. Pseudo-color images showing the five maturity stages of typical Camellia oleifera fruit samples.
Figure 3. The main steps involved in the segmentation of hyperspectral images. (a) Selecting 416 nm and 862 nm images; (b) band math; (c) binarization; (d) applying the mask; (e) extracting the spectra of a Camellia oleifera fruit sample.
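The segmentation steps listed in the Figure 3 caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record does not give the band-math formula or the binarization threshold, so a plain difference of the two band images and a fixed threshold are assumed here, and the function name `segment_and_extract` and its parameters are hypothetical.

```python
import numpy as np

def segment_and_extract(cube, wavelengths, band_a=416.0, band_b=862.0, threshold=0.5):
    """Return (mask, mean_spectrum) for a cube of shape (rows, cols, bands)."""
    cube = np.asarray(cube, dtype=float)
    wavelengths = np.asarray(wavelengths, dtype=float)

    # (a) Pick the band images closest to the two wavelengths named in the caption.
    img_a = cube[:, :, np.argmin(np.abs(wavelengths - band_a))]
    img_b = cube[:, :, np.argmin(np.abs(wavelengths - band_b))]

    # (b) Band math. ASSUMPTION: a simple difference; the source gives no formula.
    contrast = img_b - img_a

    # (c) Binarization. ASSUMPTION: a fixed, hand-chosen threshold.
    mask = contrast > threshold
    if not mask.any():
        raise ValueError("no pixels passed the threshold")

    # (d) Apply the mask and (e) average the spectra of the retained pixels.
    mean_spectrum = cube[mask].mean(axis=0)
    return mask, mean_spectrum
```

Indexing the cube with the 2D boolean mask yields an array of shape (n_pixels, bands), so the mean over axis 0 is the ROI spectrum of one sample.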
Statistical results of the mean values and standard deviation (SD) for the physicochemical properties (n = 10).
| Maturity Stages | Height (mm) | Diameter (mm) | Fruit Mass (g) | Seeds Mass (g) | Seeds Yield (%) | Oil Content (%) | Pericarp Moisture (%) |
|---|---|---|---|---|---|---|---|
| S1 | 40.32 ± 0.35 | 40.08 ± 0.15 | 28.40 ± 1.32 | 10.27 ± 0.43 | 36.16 ± 3.13 | 22.31 ± 0.93 | 70.19 ± 4.56 |
| S2 | 40.45 ± 0.32 | 40.21 ± 0.14 | 27.32 ± 1.56 | 10.36 ± 0.36 | 37.92 ± 4.77 | 24.03 ± 0.73 | 70.23 ± 5.69 |
| S3 | 40.53 ± 0.33 | 40.39 ± 0.15 | 29.60 ± 2.03 | 11.30 ± 0.35 | 38.18 ± 4.31 | 27.46 ± 1.02 | 68.97 ± 5.35 |
| S4 | 41.12 ± 0.36 | 40.96 ± 0.18 | 30.21 ± 2.13 | 12.98 ± 0.42 | 42.97 ± 5.10 | 32.27 ± 1.12 | 69.12 ± 6.21 |
| S5 | 41.24 ± 0.35 | 41.03 ± 0.25 | 30.64 ± 1.89 | 11.85 ± 0.41 | 38.67 ± 3.95 | 35.54 ± 1.13 | 68.65 ± 5.36 |
| Control | 41.16 ± 0.42 | 41.65 ± 0.21 | 30.55 ± 2.77 | 12.64 ± 0.48 | 41.37 ± 4.32 | 35.06 ± 0.84 | 66.39 ± 3.13 |
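The Seeds Yield column appears to equal seeds mass divided by fruit mass, expressed as a percentage. That relationship is an inference from the tabulated means, not a definition stated in this record; the short check below recomputes the column from the mass columns (the name `seeds_yield_percent` is hypothetical).

```python
# (fruit mass g, seeds mass g, reported seeds yield %) copied from the table means.
TABLE_MEANS = {
    "S1": (28.40, 10.27, 36.16),
    "S2": (27.32, 10.36, 37.92),
    "S3": (29.60, 11.30, 38.18),
    "S4": (30.21, 12.98, 42.97),
    "S5": (30.64, 11.85, 38.67),
    "Control": (30.55, 12.64, 41.37),
}

def seeds_yield_percent(fruit_mass_g, seeds_mass_g):
    """ASSUMED definition: seeds mass as a percentage of whole-fruit mass."""
    return 100.0 * seeds_mass_g / fruit_mass_g

recomputed = {
    stage: round(seeds_yield_percent(fruit, seeds), 2)
    for stage, (fruit, seeds, _reported) in TABLE_MEANS.items()
}

for stage, (_fruit, _seeds, reported) in TABLE_MEANS.items():
    print(f"{stage}: recomputed {recomputed[stage]:.2f}%, reported {reported:.2f}%")
```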
Figure 4. Average spectra of Camellia oleifera fruits at each maturity stage.
Figure 5. PCA results. (a) Score plot; (b) PC loading lines. PCA, principal component analysis; PC, principal component.
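For readers unfamiliar with how PC scores and loading lines relate to the spectra, a minimal PCA sketch via singular value decomposition is given here. It assumes a matrix with one mean spectrum per sample (rows) and one column per wavelength; the software and settings used by the authors are not stated in this record, and `pca_scores_loadings` is a hypothetical name.

```python
import numpy as np

def pca_scores_loadings(spectra, n_components=3):
    """PCA of a (samples x wavelengths) matrix via SVD of the mean-centered data."""
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each wavelength
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates (score plot)
    loadings = Vt[:n_components]                      # one loading line per PC
    explained = (s ** 2) / np.sum(s ** 2)             # fraction of variance per PC
    return scores, loadings, explained[:n_components]
```

Wavelengths where a loading line has a large absolute value contribute most to that PC, which is the usual rationale for picking key wavelengths from PC loadings.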
Figure 6. The first three PC-score images of the samples at five maturity stages. PC, principal component.
Results of PLS-DA and PCA-DA models based on the full spectra using various pre-processing techniques.
| Modeling Method | Pre-Processing | CCR, Calibration Set | CCR, Cross-Validation Set | CCR, Prediction Set | Parameters |
|---|---|---|---|---|---|
| PLS-DA | None | 93.9% | 92.9% | 82.8% | LV = 18 |
| PLS-DA | SNV | 97.9% | 96.5% | 95.6% | LV = 19 |
| PLS-DA | Normalization | 98.7% | 96.7% | 95.6% | LV = 19 |
| PLS-DA | 1st derivative | 95.2% | 93.6% | 88.0% | LV = 19 |
| PLS-DA | 2nd derivative | 99.2% | 98.4% | 97.6% | LV = 16 |
| PCA-DA | None | 90.3% | 88.5% | 80.8% | PC = 20 |
| PCA-DA | SNV | 89.1% | 87.1% | 83.2% | PC = 20 |
| PCA-DA | Normalization | 95.7% | 94.7% | 91.2% | PC = 20 |
| PCA-DA | 1st derivative | 86.4% | 84.0% | 79.6% | PC = 20 |
| PCA-DA | 2nd derivative | 94.9% | 93.9% | 91.6% | PC = 18 |
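Two of the pre-processing techniques named in the table, SNV (standard normal variate) and the 2nd derivative, can be sketched as follows. This is an illustration under assumptions: SNV is written in its usual per-spectrum form, and the derivative uses plain finite differences via `numpy.gradient`, whereas this record does not say which derivative algorithm or smoothing window the authors used. The "Normalization" variant is not specified in the record and is left out.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) individually."""
    X = np.asarray(spectra, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std

def second_derivative(spectra, wavelengths):
    """Second derivative along the wavelength axis.

    ASSUMPTION: simple finite differences without smoothing; values near the two
    ends of each spectrum are less accurate than interior values.
    """
    X = np.asarray(spectra, dtype=float)
    w = np.asarray(wavelengths, dtype=float)
    first = np.gradient(X, w, axis=1)
    return np.gradient(first, w, axis=1)
```

Derivatives amplify noise, so in practice a smoothing step is often combined with them; whether that was done here is not stated.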
Figure 7. Wavelength selection steps. (a) The synchronous 2D correlation spectra; (b) spectrum along the diagonal of the synchronous 2D correlation spectra; (c) the stability of wavelengths and random variables in UVE; (d) the change in RMSE as the number of wavelengths selected by SPA increases. 2D, two-dimensional; RMSE, root mean square error; SPA, successive projections algorithm; UVE, uninformative variable elimination.
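Panels (a) and (b) refer to the synchronous 2D correlation spectrum and its diagonal. A minimal sketch of the standard generalized 2D-COS computation is given here, assuming a small set of spectra ordered along a perturbation (for instance one average spectrum per maturity stage, which is an assumption about the input) and the mean spectrum as reference. How the authors picked wavelengths from the diagonal is not described in this record.

```python
import numpy as np

def synchronous_2dcos(spectra):
    """Synchronous 2D correlation spectrum of an (m perturbation steps x wavelengths) matrix.

    Returns the (wavelengths x wavelengths) synchronous map and its diagonal,
    the autopower spectrum.
    """
    Y = np.asarray(spectra, dtype=float)
    m = Y.shape[0]
    if m < 2:
        raise ValueError("need at least two spectra")
    dynamic = Y - Y.mean(axis=0)            # dynamic spectra relative to the mean reference
    sync = dynamic.T @ dynamic / (m - 1)    # covariance-like synchronous map
    autopower = np.diag(sync).copy()        # intensity along the diagonal
    return sync, autopower
```

Peaks in the autopower spectrum mark wavelengths whose intensity changes most across the perturbation, which is why the diagonal is a natural place to look for candidate key wavelengths.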
Wavelength selection by different methods.
| Methods | Numbers | Selected Wavelengths (nm) |
|---|---|---|
| PC loadings | 8 | 552, 572, 652, 682, 687, 718, 753, 926 |
| 2DCOS | 10 | 417, 494, 528, 557, 632, 672, 692, 728, 931, 958 |
| UVE+SPA | 10 | 572, 622, 652, 753, 774, 821, 862, 873, 894, 963 |
Performance of multi-spectral PLS-DA models using selected wavelengths.
| Model | LVs | CCR, Calibration Set (%) | CCR, Cross-Validation Set (%) | CCR, Prediction Set (%) |
|---|---|---|---|---|
| PC-PLS-DA | 7 | 57.9 | 56.1 | 55.6 |
| 2DCOS-PLS-DA | 9 | 68.8 | 66.9 | 54.0 |
| UVE-SPA-PLS-DA | 9 | 83.6 | 82.1 | 81.2 |
Confusion matrix of prediction set for UVE-SPA-PLS-DA model.
| Actual Stage | Predicted S1 | Predicted S2 | Predicted S3 | Predicted S4 | Predicted S5 | CCR | Sensitivity | Specificity | Precision |
|---|---|---|---|---|---|---|---|---|---|
| S1 | 49 | 1 | 0 | 0 | 0 | 98.0% | 0.98 | 0.95 | 0.83 |
| S2 | 8 | 37 | 1 | 1 | 3 | 74.0% | 0.74 | 0.95 | 0.80 |
| S3 | 2 | 2 | 40 | 4 | 2 | 80.0% | 0.80 | 0.94 | 0.78 |
| S4 | 0 | 4 | 7 | 33 | 6 | 66.0% | 0.66 | 0.97 | 0.85 |
| S5 | 0 | 2 | 3 | 1 | 44 | 88.0% | 0.88 | 0.94 | 0.80 |
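The per-stage metrics can be recomputed from the counts in the confusion matrix using the usual one-vs-rest definitions, as in the sketch below (the function name `classification_metrics` is hypothetical). The 250 prediction samples give an overall CCR of 203/250 = 81.2%, matching the reported value. Specificity for S2, S3, and S5 computes to 0.955, 0.945, and 0.945, which the table lists as 0.95, 0.94, and 0.94.

```python
STAGES = ["S1", "S2", "S3", "S4", "S5"]

# Rows are actual stages, columns are predicted stages (counts from the table).
CONFUSION = [
    [49, 1, 0, 0, 0],
    [8, 37, 1, 1, 3],
    [2, 2, 40, 4, 2],
    [0, 4, 7, 33, 6],
    [0, 2, 3, 1, 44],
]

def classification_metrics(cm):
    """Return (overall CCR, list of per-class metric dicts) for a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    per_class = []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                          # actual i, predicted elsewhere
        fp = sum(cm[r][i] for r in range(n)) - tp     # predicted i, actually elsewhere
        tn = total - tp - fn - fp
        per_class.append({
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "precision": tp / (tp + fp),
        })
    return correct / total, per_class

overall_ccr, metrics = classification_metrics(CONFUSION)
for stage, m in zip(STAGES, metrics):
    print(stage, {name: round(value, 3) for name, value in m.items()})
print("overall CCR:", overall_ccr)
```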