María Isabel Sánchez-Rodríguez¹, Elena Sánchez-López², Alberto Marinas², José María Caridad¹, Francisco José Urbano²
Abstract
The high market price of extra virgin olive oil (EVOO) calls for cost-effective and sustainable procedures that facilitate its authentication and help prevent fraud in the sector. Unlike classical techniques such as chromatography, near-infrared (NIR) spectroscopy does not require derivatization of the sample with proper integration of separated peaks, and it is more reliable, rapid, and cost-effective. In this work, principal component analysis (PCA) and then redundancy analysis (RDA), which can be seen as a constrained version of PCA, are used to summarize the high-dimensional NIR spectral information. The PCA and RDA factors are then used as explanatory variables in models that authenticate oils through qualitative or quantitative analysis: specifically, in predicting the percentage of EVOO in blended oils and in classifying EVOO or other vegetable oils (sunflower, hazelnut, corn, or linseed oil) with several machine learning algorithms. The results highlight the potential of RDA factors for prediction and classification, because they appreciably improve on the results obtained from PCA factors in both calibration and validation.
Year: 2022 PMID: 36130074 PMCID: PMC9554901 DOI: 10.1021/acs.jcim.2c00964
Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 6.162
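The abstract describes RDA as a constrained version of PCA. A minimal NumPy sketch of that idea follows, using the textbook formulation of RDA (multivariate regression of a response matrix on the constraints, then PCA of the fitted values). Everything concrete here is an assumption for illustration: the synthetic "spectra", the choice of spectra as response and EVOO fraction as constraint, and the helper names `pca`, `rda`, and `r2`. The excerpt does not specify the authors' exact design or software.

```python
import numpy as np

def pca(Y, k):
    """First k PCA scores and axes of Y (rows are samples)."""
    Yc = Y - Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    axes = Vt[:k]
    return Yc @ axes.T, axes

def rda(Y, X, k):
    """Redundancy analysis: regress Y on X, then PCA of the fitted values."""
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    B, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)   # multivariate least squares
    Y_fit = Xc @ B                                # part of Y explained by X
    _, _, Vt = np.linalg.svd(Y_fit, full_matrices=False)
    axes = Vt[:k]                                 # canonical (constrained) axes
    return Yc @ axes.T, axes                      # sample scores on those axes

def r2(x, y):
    """Squared Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)

# Synthetic data (hypothetical): 100 samples, 200 wavelengths.
rng = np.random.default_rng(0)
n, p = 100, 200
evoo = rng.uniform(0.0, 1.0, n)          # EVOO fraction in each blend
nuisance = rng.normal(0.0, 0.5, n)       # unrelated source of spectral variance

signal = np.zeros(p)
signal[40:60] = 1 / np.sqrt(20)          # band that responds to EVOO content
drift = np.zeros(p)
drift[120:160] = 1 / np.sqrt(40)         # band driven by the nuisance factor

spectra = (np.outer(evoo, signal)
           + np.outer(nuisance, drift)
           + rng.normal(0.0, 0.01, (n, p)))

pca_scores, pca_axes = pca(spectra, 1)
rda_scores, rda_axes = rda(spectra, evoo[:, None], 1)

r2_pca = r2(pca_scores[:, 0], evoo)
r2_rda = r2(rda_scores[:, 0], evoo)
print(f"R2 of EVOO fraction on first PCA factor: {r2_pca:.3f}")
print(f"R2 of EVOO fraction on first RDA factor: {r2_rda:.3f}")
```

In this toy setting the first PCA factor follows the larger nuisance variance, while the first RDA factor is steered toward the band that varies with EVOO content. This is the kind of gain the abstract attributes to RDA factors, although the sketch says nothing about the magnitude observed with real NIR data.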
Figure 1. NIR spectra of EVOO and other vegetable oils.
Figure 2. RDA representation from PCA factors.
Figure 3. Cross-validated R² and DRMSEP for EVOO percentage prediction from PCA and RDA factors.
Figure 4. Accuracy and Kappa for calibration and validation in classification of pure oils from PCA and RDA factors and different algorithms.
Accuracy for Calibration in Classification of Blended Oils from PCA and RDA Factors and Different Algorithmsᵃ
Correct classification (%)
| Algorithm | Sunflower PCA | Sunflower RDA | Sunflower Dif | Hazelnut PCA | Hazelnut RDA | Hazelnut Dif | Corn PCA | Corn RDA | Corn Dif | Linseed PCA | Linseed RDA | Linseed Dif |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LDA | 81.601 | 96.258 | ↑ | 76.823 | 80.788 | ↑ | 90.050 | 85.828 | ↓ | 74.151 | 95.364 | ↑ |
| CART | 59.222 | 75.359 | ↑ | 56.040 | 58.318 | ↑ | 47.681 | 76.192 | ↑ | 53.500 | 84.080 | ↑ |
| KNN | 82.677 | 94.591 | ↑ | 85.848 | 85.767 | ↓ | 91.348 | 86.828 | ↓ | 77.060 | 86.419 | ↑ |
| SVM | 69.505 | 95.348 | ↑ | 75.752 | 89.510 | ↑ | 90.070 | 94.313 | ↑ | 63.515 | 96.182 | ↑ |
| RF | 78.859 | 98.258 | ↑ | 85.843 | 92.328 | ↑ | 86.864 | 97.182 | ↑ | 70.333 | 98.182 | ↑ |
ᵃ Dif indicates ↑ or ↓ if the sign of RDA – PCA is positive or negative, respectively.
Kappa for Validation in Classification of Blended Oils from PCA and RDA Factors and Different Algorithmsᵃ
Correct classification (%)
| Algorithm | Sunflower PCA | Sunflower RDA | Sunflower Dif | Hazelnut PCA | Hazelnut RDA | Hazelnut Dif | Corn PCA | Corn RDA | Corn Dif | Linseed PCA | Linseed RDA | Linseed Dif |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LDA | 79.904 | 96.651 | ↑ | 76.560 | 70.000 | ↓ | 86.577 | 86.667 | ↑ | 76.644 | 100.00 | ↑ |
| CART | 34.272 | 44.651 | ↑ | 44.392 | 44.651 | ↑ | 28.638 | 44.651 | ↑ | 28.704 | 60.747 | ↑ |
| KNN | 76.510 | 86.551 | ↑ | 79.923 | 83.333 | ↑ | 83.205 | 83.253 | ↑ | 70.000 | 86.564 | ↑ |
| SVM | 76.510 | 93.288 | ↑ | 59.801 | 86.641 | ↑ | 76.532 | 89.942 | ↑ | 69.914 | 96.647 | ↑ |
| RF | 69.856 | 96.650 | ↑ | 80.000 | 86.641 | ↑ | 86.577 | 93.320 | ↑ | 66.571 | 100.00 | ↑ |
ᵃ Dif indicates ↑ or ↓ if the sign of RDA – PCA is positive or negative, respectively.
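The tables report Accuracy and Kappa side by side. As a reminder of how the two relate, here is a small sketch that computes both from a confusion matrix, assuming that Kappa means Cohen's kappa (the excerpt does not say so explicitly) and that it is reported on a percentage scale as in the tables. The confusion matrix and the function name `accuracy_and_kappa` are hypothetical and are not taken from the paper.

```python
def accuracy_and_kappa(cm):
    """Accuracy and Cohen's kappa (both in %) from a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(k)) / n            # agreement rate
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return 100 * observed, 100 * kappa

# Hypothetical two-class confusion matrix, for illustration only.
cm = [[10, 2],
      [3, 15]]
acc, kappa = accuracy_and_kappa(cm)
print(f"accuracy = {acc:.3f} %, kappa = {kappa:.3f} %")
```

Because kappa subtracts the agreement expected by chance, it is lower than accuracy whenever the classifier is imperfect. That matches the pattern in the tables, where each Kappa entry sits at or below the corresponding accuracy.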
Kappa for Calibration in Classification of Blended Oils from PCA and RDA Factors and Different Algorithmsᵃ
Correct classification (%)
| Algorithm | Sunflower PCA | Sunflower RDA | Sunflower Dif | Hazelnut PCA | Hazelnut RDA | Hazelnut Dif | Corn PCA | Corn RDA | Corn Dif | Linseed PCA | Linseed RDA | Linseed Dif |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LDA | 78.248 | 95.532 | ↑ | 72.465 | 77.122 | ↑ | 88.214 | 83.241 | ↓ | 69.426 | 94.496 | ↑ |
| CART | 52.380 | 71.674 | ↑ | 49.132 | 51.321 | ↑ | 38.322 | 72.515 | ↑ | 46.350 | 81.602 | ↑ |
| KNN | 79.626 | 93.579 | ↑ | 83.259 | 83.055 | ↓ | 89.778 | 84.452 | ↓ | 72.912 | 83.759 | ↑ |
| SVM | 64.244 | 94.479 | ↑ | 71.317 | 87.625 | ↑ | 88.250 | 93.280 | ↑ | 56.965 | 95.458 | ↑ |
| RF | 74.972 | 97.927 | ↑ | 83.285 | 90.890 | ↑ | 84.426 | 96.645 | ↑ | 65.174 | 97.843 | ↑ |
ᵃ Dif indicates ↑ or ↓ if the sign of RDA – PCA is positive or negative, respectively.
Accuracy for Validation in Classification of Blended Oils from PCA and RDA Factors and Different Algorithmsᵃ
Correct classification (%)
| Algorithm | Sunflower PCA | Sunflower RDA | Sunflower Dif | Hazelnut PCA | Hazelnut RDA | Hazelnut Dif | Corn PCA | Corn RDA | Corn Dif | Linseed PCA | Linseed RDA | Linseed Dif |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LDA | 82.857 | 97.143 | ↑ | 80.000 | 74.286 | ↓ | 88.571 | 88.571 | ↔ | 80.000 | 100.00 | ↑ |
| CART | 42.857 | 51.429 | ↑ | 51.429 | 51.429 | ↔ | 37.143 | 51.429 | ↑ | 37.143 | 65.714 | ↑ |
| KNN | 80.000 | 88.571 | ↑ | 82.857 | 85.714 | ↑ | 85.714 | 85.714 | ↔ | 74.285 | 88.571 | ↑ |
| SVM | 80.000 | 94.286 | ↑ | 65.714 | 88.571 | ↑ | 80.000 | 91.429 | ↑ | 74.285 | 97.142 | ↑ |
| RF | 74.286 | 97.143 | ↑ | 82.857 | 88.571 | ↑ | 88.571 | 94.285 | ↑ | 71.428 | 100.00 | ↑ |
ᵃ Dif indicates ↑ or ↓ if the sign of RDA – PCA is positive or negative, respectively, and ↔ if RDA – PCA is zero.
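The footnote's rule for the Dif column can be written as a short function. This is only a sketch of the stated rule. The function name `dif_symbol` is hypothetical, and the example values are taken from the LDA and KNN rows of the validation accuracy table above.

```python
def dif_symbol(pca_value, rda_value):
    """Return the Dif marker for one PCA/RDA pair, following the footnote."""
    diff = rda_value - pca_value
    if diff > 0:
        return "↑"   # RDA factors score higher than PCA factors
    if diff < 0:
        return "↓"   # RDA factors score lower than PCA factors
    return "↔"       # no difference

# Validation accuracy (%), as (PCA, RDA) pairs from the table above.
examples = {
    "LDA, sunflower": (82.857, 97.143),
    "LDA, hazelnut": (80.000, 74.286),
    "KNN, corn": (85.714, 85.714),
}
symbols = {name: dif_symbol(pca, rda) for name, (pca, rda) in examples.items()}
for name, symbol in symbols.items():
    print(name, symbol)
```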