Alessandra Biancolillo, Martina Foschi, Angelo Antonio D'Archivio.
Abstract
One hundred and fourteen samples of saffron harvested in four different Italian areas (three in Central Italy and one in the South) were investigated by IR and UV-Vis spectroscopies. Two multi-block strategies, Sequential and Orthogonalized Partial Least Squares Linear Discriminant Analysis (SO-PLS-LDA) and Sequential and Orthogonalized Covariance Selection Linear Discriminant Analysis (SO-CovSel-LDA), were used to handle the two data blocks simultaneously and to classify samples according to their geographical origin. Both multi-block approaches provided very satisfactory results. Each model was inspected to understand which spectral variables contribute the most to the discrimination of samples, i.e., to the characterization of saffron harvested in the four different areas. The most accurate solution was provided by SO-PLS-LDA, which misclassified only three of the 31 test samples in external validation.
Keywords: SO-CovSel; SO-PLS; classification; data fusion; infrared; multi-block; saffron; ultraviolet
Year: 2020 PMID: 32429442 PMCID: PMC7287695 DOI: 10.3390/molecules25102332
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. Mean raw spectra per class: (a) IR signals; (b) UV-Vis signals.
Organization of samples into training and test sets.
| Class | Training (N. Samples) | Test (N. Samples) | Total (N. Samples) |
|---|---|---|---|
| Class Spoleto (SP) | 26 | 8 | 34 |
| Class Aquila (AQ) | 11 | 5 | 16 |
| Class Sicily (SIC) | 11 | 8 | 19 |
| Class Città della Pieve (CP) | 35 | 10 | 45 |
| Total | 83 | 31 | 114 |
Sequential and Orthogonalized Partial Least Squares Linear Discriminant Analysis (SO-PLS-LDA): external validation. Correct classification rates (%) and number of misclassified test samples.
| Predictions (on the Test Set) | Class. Rate (%) | Misclass. samples |
|---|---|---|
| Class SP | 100.0 | 0 |
| Class AQ | 80.0 | 1 |
| Class SIC | 75.0 | 2 |
| Class CP | 100.0 | 0 |
Figure 2. SO-PLS-LDA analysis. Samples projected onto the three canonical variate scores (CV).
Figure 3. Variable Importance in Projection (VIP) analysis: (a) on IR spectra; (b) on UV spectra. Selected variables (VIP > 1) are colored.
SO-CovSel-LDA analysis: External Validation. Correct classification rates (%) and number of misclassified test samples.
| Predictions (on the Test Set) | Class. Rate (%) | Misclass. samples |
|---|---|---|
| Class SP | 100.0 | 0 |
| Class AQ | 80.0 | 1 |
| Class SIC | 62.5 | 3 |
| Class CP | 100.0 | 0 |
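The per-class rates in the two external-validation tables follow directly from the test-set sizes and the misclassification counts. The short sketch below recomputes them and derives an overall test-set rate for each model; the overall figures are not reported in the tables and are computed here only as an illustration. The function and variable names are hypothetical.

```python
# Test-set sizes per class, taken from the training/test table.
test_sizes = {"SP": 8, "AQ": 5, "SIC": 8, "CP": 10}

# Misclassified test samples per class, taken from the two
# external-validation tables.
misclassified = {
    "SO-PLS-LDA": {"SP": 0, "AQ": 1, "SIC": 2, "CP": 0},
    "SO-CovSel-LDA": {"SP": 0, "AQ": 1, "SIC": 3, "CP": 0},
}


def class_rates(errors, sizes):
    """Correct classification rate (%) for each class."""
    return {c: round(100.0 * (sizes[c] - errors[c]) / sizes[c], 1) for c in sizes}


def overall_rate(errors, sizes):
    """Correct classification rate (%) over the whole test set."""
    total = sum(sizes.values())
    correct = total - sum(errors.values())
    return round(100.0 * correct / total, 1)


for model, errors in misclassified.items():
    print(model, class_rates(errors, test_sizes), overall_rate(errors, test_sizes))
```

Under these counts, SO-PLS-LDA classifies 28 of 31 test samples correctly (about 90.3%) and SO-CovSel-LDA 27 of 31 (about 87.1%), which agrees with the abstract's statement that SO-PLS-LDA gave the most accurate solution.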
Figure 4. SO-CovSel-LDA analysis. Graphical representation of variables selected on (a) the IR block and (b) the UV block. Legend: black line, mean spectrum; red circles, selected variables.
Figure 5. Details about the origin and the number of the analyzed samples.
PLS-DA analysis on IR data: Correct classification rates (%) for calibration (CV) and validation (test set).
| IR | Class SP | Class AQ | Class SIC | Class CP |
|---|---|---|---|---|
| | Class. Rate (%) | Class. Rate (%) | Class. Rate (%) | Class. Rate (%) |
| Calibration (CV) | 57.7 | 72.7 | 54.5 | 40.0 |
| Prediction (on the test set) | 75.0 | 80.0 | 25.0 | 60.0 |
PLS-DA analysis: Correct classification rates (%) for calibration (CV) and validation (test set).
| UV-Vis | Class SP | Class AQ | Class SIC | Class CP |
|---|---|---|---|---|
| | Class. Rate (%) | Class. Rate (%) | Class. Rate (%) | Class. Rate (%) |
| Calibration | 62.5 | 40.0 | 75.0 | 90.0 |
| Prediction | 91.3 | 84.6 | 87.5 | 72.9 |