Yan Hu, Youli Wu, Jie Sun, Jinping Geng, Rongsheng Fan, Zhiliang Kang.
Abstract
Oolong tea is a popular semi-fermented tea. This study aims to establish a classification method for oolong tea based on fluorescence hyperspectral technology (FHSI) combined with chemometrics. First, spectral data were obtained for four varieties: Tieguanyin, Benshan, Maoxie and Huangjingui. Then, standard normal variate (SNV) transformation and multiplicative scatter correction (MSC) were used for preprocessing. Principal component analysis (PCA) was used for data visualization, and outliers in the spectra were removed using tolerance ellipses drawn according to Hotelling's T² statistic. Variables with a variable importance in projection (VIP) score > 1 in partial least squares discriminant analysis (PLS-DA) were selected as features. Finally, the processed spectral data were input into support vector machine (SVM) and PLS-DA models. MSC_VIP_PLS-DA was the best model for the classification of oolong tea. The results showed that FHSI could accurately distinguish these four types of oolong tea and could identify the key wavelengths affecting the classification: 650.11, 660.29, 665.39, 675.6, 701.17, 706.31, 742.34 and 747.5 nm. At these wavelengths, the different kinds of tea differ significantly (p < 0.05). This study could provide a rapid, non-destructive method for future tea identification.
Keywords: chemometrics; classification; oolong tea; spectroscopy
Year: 2022 PMID: 35954110 PMCID: PMC9368096 DOI: 10.3390/foods11152344
Source DB: PubMed Journal: Foods ISSN: 2304-8158
Figure 1. (a) Average spectra of four oolong teas; (b) three-dimensional plot of the spectral curves; (c) spectra after MSC; (d) spectra after SNV.
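The two preprocessing steps named in the caption can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names `snv` and `msc`, the choice of the mean spectrum as the MSC reference, and the synthetic demo curve are all assumptions made for the example.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum is regressed on the reference (row = intercept + slope * ref)
    and then corrected as (row - intercept) / slope.
    """
    spectra = np.asarray(spectra, dtype=float)
    # Assumption: the mean spectrum serves as the reference when none is given.
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected


# Hypothetical demo: one base curve distorted by additive and multiplicative effects.
wavelengths = np.linspace(480, 1010, 104)
base = np.exp(-((wavelengths - 680.0) / 40.0) ** 2)
raw = np.vstack([0.1 + 1.0 * base, 0.3 + 1.5 * base, -0.2 + 0.7 * base])
```

On this synthetic input, `msc(raw)` collapses the three distorted curves onto a single curve, which is the effect MSC is meant to have on scatter-induced offsets and gains. SNV reaches a similar result without a reference spectrum, because it standardises each row independently.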
Figure 2. PCA score plots of four oolong teas. (a) The PCA of the raw spectra; (b) the PCA of the spectra after MSC; and (c) the PCA of the spectra after SNV. (0 represents Tieguanyin, 1 represents Maoxie, 2 represents Huangjingui, 3 represents Benshan).
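Outlier removal with a Hotelling tolerance ellipse on a two-component score plot can be sketched as below. This is an illustration under stated assumptions rather than the authors' procedure: the limit uses one common convention, p(n - 1)/(n - p) times the F quantile, the 95% level is assumed, and the function names and random demo data are hypothetical. For two components the F quantile has a closed form, so only NumPy is needed.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Project mean-centred data onto its first principal components via SVD."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def f_quantile_2df(q, d2):
    """Quantile of the F distribution with 2 numerator degrees of freedom.

    In this special case the CDF is 1 - (1 + 2x/d2) ** (-d2/2),
    which inverts in closed form.
    """
    return (d2 / 2.0) * ((1.0 - q) ** (-2.0 / d2) - 1.0)


def hotelling_outliers(scores, alpha=0.05):
    """Flag samples outside the (1 - alpha) Hotelling T^2 ellipse of a 2-D score plot."""
    scores = np.asarray(scores, dtype=float)
    n, p = scores.shape
    if p != 2:
        raise ValueError("this sketch handles exactly two components")
    # T^2 is the sum of squared scores, each scaled by its component variance.
    t2 = np.sum(scores ** 2 / scores.var(axis=0, ddof=1), axis=1)
    # Assumed limit convention: p (n - 1) / (n - p) * F(p, n - p).
    limit = p * (n - 1) / (n - p) * f_quantile_2df(1.0 - alpha, n - p)
    return t2 > limit, t2, limit


# Hypothetical demo: 60 noise spectra plus one grossly shifted sample.
rng = np.random.default_rng(0)
spectra = np.vstack([rng.normal(size=(60, 10)), np.full(10, 50.0)])
flags, t2, limit = hotelling_outliers(pca_scores(spectra))
```

Samples whose `flags` entry is true fall outside the ellipse and would be dropped before modelling. Because an extreme sample also inflates the score variances it is judged against, the check is sometimes repeated after each removal.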
Figure 3. The distributions of all variables in each wavelength after the selection of VIP.
Classification results for oolong tea (Tie represents Tieguanyin, Mao represents Maoxie, Huang represents Huangjingui, Ben represents Benshan. Total represents the average accuracy, precision and recall rate of each kind of tea).
| Model | Preprocessing | Variables | Class | Calibration Accuracy | Calibration Precision | Calibration Recall | Prediction Accuracy | Prediction Precision | Prediction Recall |
|---|---|---|---|---|---|---|---|---|---|
| SVM | RAW | 104 | Tie | 95.83% | 100.00% | 96.00% | 100.00% | 88.00% | 100.00% |
| | | | Mao | 100.00% | 100.00% | 100.00% | 88.89% | 100.00% | 89.00% |
| | | | Huang | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Ben | 100.00% | 92.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Total | 98.96% | 98.00% | 99.00% | 97.22% | 97.00% | 97.25% |
| | | 43 (VIP > 1) | Tie | 100.00% | 87.00% | 100.00% | 100.00% | 82.00% | 100.00% |
| | | | Mao | 100.00% | 100.00% | 100.00% | 88.89% | 89.00% | 89.00% |
| | | | Huang | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Ben | 71.43% | 100.00% | 71.00% | 80.00% | 100.00% | 80.00% |
| | | | Total | 92.86% | 96.75% | 92.75% | 92.22% | 92.75% | 92.25% |
| | SNV | 104 (no selection) | Tie | 92.86% | 100.00% | 93.00% | 100.00% | 100.00% | 100.00% |
| | | | Mao | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Huang | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Ben | 100.00% | 80.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Total | 98.21% | 95.00% | 98.25% | 100.00% | 100.00% | 100.00% |
| | | 33 (VIP > 1) | Tie | 100.00% | 100.00% | 100.00% | 100.00% | 88.00% | 100.00% |
| | | | Mao | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Huang | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Ben | 100.00% | 100.00% | 100.00% | 86.67% | 100.00% | 87.00% |
| | | | Total | 100.00% | 100.00% | 100.00% | 96.67% | 97.00% | 96.75% |
| | MSC | 104 (no selection) | Tie | 91.67% | 100.00% | 92.00% | 100.00% | 88.00% | 100.00% |
| | | | Mao | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Huang | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Ben | 100.00% | 83.00% | 91.00% | 86.67% | 100.00% | 87.00% |
| | | | Total | 97.92% | 95.75% | 95.75% | 96.67% | 97.00% | 96.75% |
| | | 35 (VIP > 1) | Tie | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Mao | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Huang | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Ben | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Total | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| PLS-DA | RAW | 104 | Tie | 100.00% | 92.00% | 100.00% | 95.65% | 100.00% | 96.00% |
| | | | Mao | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Huang | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Ben | 84.62% | 100.00% | 85.00% | 100.00% | 91.00% | 100.00% |
| | | | Total | 96.15% | 98.00% | 96.25% | 98.91% | 97.75% | 99.00% |
| | | 43 (VIP > 1) | Tie | 96.67% | 97.00% | 97.00% | 96.67% | 97.00% | 97.00% |
| | | | Mao | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Huang | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Ben | 90.00% | 90.00% | 90.00% | 90.00% | 90.00% | 100.00% |
| | | | Total | 96.67% | 96.75% | 96.75% | 96.67% | 96.75% | 99.25% |
| | SNV | 104 | Tie | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Mao | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Huang | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Ben | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Total | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | 33 (VIP > 1) | Tie | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Mao | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Huang | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Ben | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Total | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | MSC | 104 | Tie | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Mao | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Huang | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Ben | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Total | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | 35 (VIP > 1) | Tie | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Mao | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Huang | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Ben | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
| | | | Total | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
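The per-class precision and recall and the averaged Total rows can be illustrated with a short sketch. The confusion matrix below is hypothetical (the page does not give per-sample counts), and the helper names are assumptions. Only the final check uses figures taken from the table itself.

```python
def per_class_metrics(confusion):
    """Return (precision, recall) lists from a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    n = len(confusion)
    precision, recall = [], []
    for k in range(n):
        true_pos = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))
        actual_k = sum(confusion[k])
        precision.append(true_pos / predicted_k if predicted_k else 0.0)
        recall.append(true_pos / actual_k if actual_k else 0.0)
    return precision, recall


def macro_average(values):
    """Unweighted mean across classes, as the table caption describes for Total."""
    return sum(values) / len(values)


# Hypothetical counts (not from the paper): one Maoxie sample predicted as Tieguanyin.
confusion = [
    [9, 0, 0, 0],  # Tie
    [1, 8, 0, 0],  # Mao
    [0, 0, 9, 0],  # Huang
    [0, 0, 0, 9],  # Ben
]
precision, recall = per_class_metrics(confusion)

# Values from the table: SVM on RAW spectra, calibration accuracy per class.
svm_raw_calibration_accuracy = [95.83, 100.00, 100.00, 100.00]
total = macro_average(svm_raw_calibration_accuracy)
```

Averaging the four per-class calibration accuracies for SVM on RAW spectra gives about 98.96, matching the reported Total. In the hypothetical matrix, a single misclassified Maoxie sample lowers Maoxie recall and Tieguanyin precision and leaves the other classes untouched. The Tie and Mao rows in the table show the same pattern.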
The wavelengths selected for VIP > 1 in PLS-DA.
| Preprocessing Method | No. | Selected Wavelengths (nm) |
|---|---|---|
| RAW | 43 | 479.65; 484.59; 489.54; 494.49; 634.89; 639.95; 645.03; 650.11; 655.2; 660.29; 665.39; 670.49; 675.6; 696.06; 701.17; 706.31; 711.44; 716.58; 726.86; 732.03; 737.17; 742.34; 747.5; 752.65; 872.69; 877.95; 883.22; 888.51; 893.79; 914.95; 930.88; 936.2; 941.5; 946.84; 952.16; 957.5; 962.84; 968.16; 978.86; 984.23; 989.57; 994.94; 1011.05 |
| MSC | 35 | 489.54; 604.51; 609.56; 614.61; 619.69; 624.75; 629.81; 634.89; 639.95; 645.03; 650.11; 655.2; 660.29; 665.39; 670.49; 675.6; 680.7; 685.83; 690.94; 696.06; 701.17; 706.31; 711.44; 716.58; 737.17; 742.34; 747.5; 757.85; 763; 768.2; 773.39; 778.55; 783.75; 788.95; 794.15 |
| SNV | 33 | 489.54; 609.56; 614.61; 619.69; 624.75; 629.81; 634.89; 639.95; 645.03; 650.11; 655.2; 660.29; 665.39; 670.49; 675.6; 680.7; 690.94; 696.06; 701.17; 706.31; 711.44; 716.58; 742.34; 747.5; 752.65; 757.85; 763; 768.2; 773.39; 778.55; 783.75; 788.95; 794.15 |
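How a VIP > 1 selection like the one tabulated above can be computed is sketched below. The abstract does not say which software or PLS algorithm was used, so this is only an assumption-laden illustration: a NumPy-only PLS2 fit in which each weight vector is the dominant left singular vector of the cross-product of the deflated X and Y (equivalent to converged NIPALS weights). The name `pls_vip` and the synthetic data are hypothetical.

```python
import numpy as np


def pls_vip(X, Y, n_components):
    """VIP scores from a simple PLS2 fit on mean-centred X and Y."""
    Xr = np.asarray(X, dtype=float)
    Yr = np.asarray(Y, dtype=float)
    Xr = Xr - Xr.mean(axis=0)
    Yr = Yr - Yr.mean(axis=0)
    n_vars = Xr.shape[1]
    W = np.zeros((n_vars, n_components))
    ssy = np.zeros(n_components)  # Y variance explained by each component
    for a in range(n_components):
        # Unit-norm weight vector: dominant left singular vector of X^T Y.
        u_svd, _, _ = np.linalg.svd(Xr.T @ Yr, full_matrices=False)
        w = u_svd[:, 0]
        t = Xr @ w                  # X scores
        tt = t @ t
        p = Xr.T @ t / tt           # X loadings
        q = Yr.T @ t / tt           # Y loadings
        Xr = Xr - np.outer(t, p)    # deflate both blocks
        Yr = Yr - np.outer(t, q)
        W[:, a] = w
        ssy[a] = tt * (q @ q)
    return np.sqrt(n_vars * (W ** 2 @ ssy) / ssy.sum())


# Hypothetical two-class demo in which only variable 0 separates the classes.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 6))
X[:, 0] += 4.0 * labels
Y = np.eye(2)[labels]               # one-hot class matrix, as in PLS-DA
vip = pls_vip(X, Y, n_components=2)
selected = np.flatnonzero(vip > 1.0)
```

The squared VIP scores average to exactly 1 across variables, which is why 1 is the customary cut-off: a variable above it contributes more than average to explaining the class matrix.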
ANOVA results for each wavelength.
| Wavelength/nm | 489.54 | 634.89 | 639.95 | 645.03 | 650.11 | 655.2 | 660.29 | 665.39 |
|---|---|---|---|---|---|---|---|---|
| Tieguanyin | 231.56 ± 24.01 a | 231.77 ± 16.31 a | 231.85 ± 16.18 a | 254.21 ± 16.59 a | 317.14 ± 17.33 a | 484.37 ± 17.93 a | 897.98 ± 20.66 a | 1772.93 ± 25.47 a |
| Maoxie | 240.07 ± 14.20 a | 249.03 ± 10.93 a | 253.06 ± 11.19 b | 276.64 ± 11.26 b | 339.49 ± 10.54 b | 507.87 ± 12.00 a | 923.06 ± 15.42 b | 1793.24 ± 26.15 b |
| Huangjingui | 234.42 ± 13.73 a | 233.23 ± 10.53 b | 240.62 ± 10.61 c | 266.39 ± 11.15 c | 330.84 ± 11.34 c | 484.58 ± 12.33 b | 845.42 ± 15.34 c | 1584.60 ± 22.11 c |
| Benshan | 253.42 ± 29.82 b | 246.92 ± 21.95 b | 250.95 ± 21.97 c | 278.50 ± 21.93 c | 350.06 ± 23.24 d | 528.76 ± 24.87 c | 954.03 ± 29.64 d | 1845.04 ± 51.84 d |

| Wavelength/nm (continued) |  |  |  |  |  |  |  |  |
|---|---|---|---|---|---|---|---|---|
| Tieguanyin | 3058.43 ± 35.10 a | 4015.49 ± 37.25 a | 1421.99 ± 30.72 a | 1389.86 ± 31.68 a | 1441.51 ± 34.26 a | 1541.62 ± 36.67 a | 1348.80 ± 44.45 a | 1150.73 ± 41.73 a |
| Maoxie | 3074.16 ± 40.44 b | 4062.58 ± 44.93 b | 1496.39 ± 33.07 b | 1471.07 ± 33.73 b | 1534.77 ± 34.12 a | 1650.03 ± 34.87 a | 1443.74 ± 33.87 b | 1247.50 ± 30.70 b |
| Huangjingui | 2708.93 ± 35.92 b | 3672.91 ± 43.78 c | 1715.80 ± 29.71 c | 1674.04 ± 29.10 c | 1734.17 ± 29.06 b | 1859.20 ± 30.27 b | 1691.45 ± 45.21 c | 1453.68 ± 40.66 c |
| Benshan | 3142.46 ± 87.62 c | 4103.93 ± 99.52 d | 1446.08 ± 57.81 d | 1410.60 ± 59.42 d | 1455.16 ± 64.79 c | 1546.21 ± 71.97 c | 1282.70 ± 66.33 d | 1086.56 ± 57.35 d |
Data represent the mean ± standard deviation. Statistical analysis was carried out by analysis of variance (ANOVA) followed by Duncan's post hoc test. Different lowercase letters (a–d) within a column indicate a statistically significant difference between teas (p < 0.05); the same letter indicates no significant difference.
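The ANOVA step behind this table can be sketched for a single wavelength as follows. The table reports only means and standard deviations, so the input values here are hypothetical, and the function name is an assumption. The sketch computes only the F statistic and its degrees of freedom. A p-value would come from the F distribution (for example `scipy.stats.f.sf`), and Duncan's post hoc grouping is not covered.

```python
import numpy as np


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: samples around their own group mean.
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical intensities at one wavelength for three groups (not from the paper).
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

For the paper's design, the same function would be called once per wavelength with four groups, one per tea variety.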
Figure 4. Distribution of key wavelengths in the average spectra.