Zeling Chen, Ting Wu, Cheng Xiang, Xiaoyan Xu, Xingguo Tian.
Abstract
This study evaluates the potential of combining Raman spectroscopy with machine learning to rapidly identify rainbow trout adulteration in Atlantic salmon. The adulterated samples contained rainbow trout mixed into Atlantic salmon at concentrations of 0-100% (w/w) in 10% steps. Four spectral preprocessing methods were applied: first derivative, second derivative, multiple scattering correction (MSC), and standard normal variate (SNV). Supervised algorithms, namely recursive feature elimination (RFE), genetic algorithm (GA), and simulated annealing (SA), were combined with the unsupervised K-means clustering (KM) algorithm to select important spectral bands, reducing spectral complexity and improving model stability. Finally, the performance of various machine learning models, including linear regression, nonlinear regression, regression tree, and rule-based models, was verified and compared. The results showed that the GA-KM-Cubist model achieved satisfactory results with MSC preprocessing. On the test set, the coefficient of determination (R²) and the root mean square error of prediction (RMSEP) were 0.87 and 10.93, respectively. These results indicate that Raman spectroscopy is an effective method for identifying adulteration of Atlantic salmon, and that the developed model can be used to quantify rainbow trout adulteration in Atlantic salmon.
Keywords: Atlantic salmon; Raman spectroscopy; adulteration; machine learning
Year: 2019 PMID: 31390746 PMCID: PMC6696069 DOI: 10.3390/molecules24152851
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. The Raman spectra of fat in Atlantic salmon and rainbow trout. (a) The five spectra of salmon and rainbow trout after baseline correction and multiple scattering correction (MSC). (b) The mean and standard deviation spectra of salmon and rainbow trout.
Raman spectral distribution of the Atlantic salmon fat.
| Band (cm⁻¹) | Vibration Mode | Functional Group | Intensity |
|---|---|---|---|
| 1748 | ν(C=O) | Ester (RC=OOR) | Weak |
| 1659 | ν(C=C) | Unsaturated band (cis RHC=CHR) | Strong |
| 1441 | δγ(C–H) | Methylene (CH2) | Strong |
| 1303 | δτ(C–H) | Methylene (CH2) | Medium |
| 1268 | δIP(=C–H) | Non-conjugated cis (RHC=CHR) | Medium |
| 1079 | ν(C–C) | –(CH2)n– | Medium |
| 974 | δ(=C–H) | Trans RHC=CHR | Medium |
| 872 | ν(C–C) | –(CH2)n– | Medium |
Figure 2. The Raman spectra for different proportions of rainbow trout adulteration in Atlantic salmon.
Figure 3. The Raman spectra of samples obtained using different pretreatments: (a) the original spectrum; (b) the spectrum after baseline fitting; (c) the spectrum after applying the first derivative; (d) the spectrum after applying the second derivative; (e) the spectrum after applying standard normal variate (SNV); (f) the spectrum after applying MSC.
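The pretreatments named in this caption can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the record does not state which software or settings were used, the function names are hypothetical, the MSC reference is assumed to be the mean spectrum of the set, and the derivative is a plain first difference rather than a smoothed derivative.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in spectrum]


def msc(spectra, reference=None):
    """MSC: regress each spectrum on a reference, then remove offset and slope.

    Each spectrum x is fitted as x ~ a + b * r by ordinary least squares
    and corrected as (x - a) / b.
    """
    if reference is None:
        # Assumption: the mean spectrum of the set serves as the reference.
        reference = [mean(col) for col in zip(*spectra)]
    r_mean = mean(reference)
    r_var = sum((r - r_mean) ** 2 for r in reference)
    corrected = []
    for x in spectra:
        x_mean = mean(x)
        b = sum((r - r_mean) * (xi - x_mean) for r, xi in zip(reference, x)) / r_var
        a = x_mean - b * r_mean
        corrected.append([(xi - a) / b for xi in x])
    return corrected


def first_derivative(spectrum):
    """First difference between adjacent points (no smoothing applied)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


# Toy data: the second spectrum is the first with a different gain and offset.
base = [1.0, 2.0, 4.0, 3.0]
scaled = [2.0 * v + 1.0 for v in base]
msc_out = msc([base, scaled])
snv_out = snv(base)
```

In this toy case MSC maps both spectra onto the same curve, because they differ only by a gain and an offset, which is the kind of scatter effect the correction targets.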
The partial least squares regression (PLSR) modeling results for four preprocessing methods.
| Pretreatment Method | Ncomp | Calibration RMSE (%) | Calibration R² | Test RMSEP (%) | Test R² | Test MAE |
|---|---|---|---|---|---|---|
| NONE | 10 | 14.79 | 0.79 | 17.27 | 0.70 | 13.19 |
| FD | 10 | 21.38 | 0.58 | 23.10 | 0.48 | 18.18 |
| SD | 10 | 29.26 | 0.19 | 30.15 | 0.12 | 25.11 |
| SNV | 10 | 13.66 | 0.82 | 13.28 | 0.81 | 10.49 |
| MSC | 9 | 13.68 | 0.82 | 13.32 | 0.81 | 10.57 |
MAE denotes the mean absolute error. FD, first derivative; SD, second derivative; SNV, standard normal variate; MSC, multiple scattering correction.
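The error measures reported in these tables can be computed as in the following sketch. The function names are illustrative, and R² is assumed here to be the coefficient of determination, 1 - SS_res/SS_tot; some software reports the squared correlation between predicted and true values instead, and the record does not say which definition the authors used.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up adulteration ratios (% w/w), for illustration only.
y_true = [0.0, 10.0, 20.0, 30.0]
y_pred = [2.0, 8.0, 23.0, 29.0]
scores = (rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, which matches the pattern in every row of the tables.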
PLSR modeling results for three feature-wavelength selection methods.
| Dimension Reduction Method | Number of Wavelengths | Calibration RMSE (%) | Calibration R² | Test RMSEP (%) | Test R² | Test MAE |
|---|---|---|---|---|---|---|
| NONE | 882 | 13.68 | 0.82 | 13.32 | 0.81 | 10.57 |
| RFE–KM | 75 | 14.47 | 0.79 | 14.93 | 0.77 | 12.24 |
| GA–KM | 431 | 14.36 | 0.80 | 13.34 | 0.81 | 10.69 |
| SA–KM | 322 | 14.55 | 0.79 | 13.84 | 0.80 | 11.11 |
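MAE denotes the mean absolute error. RFE, recursive feature elimination; GA, genetic algorithm; SA, simulated annealing; KM, K-means clustering.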
Figure 4. The cross-validated RMSE (Root Mean Square Error) curve for different committee sizes and instance numbers.
Figure 5. The predicted and true values of the adulteration ratio of Atlantic salmon meat for the Cubist model on the test set.
Performance comparison of different machine learning models.
| Model | Calibration RMSE (%) | Test RMSE (%) | Calibration R² | Test R² | Calibration MAE | Test MAE |
|---|---|---|---|---|---|---|
| PLS | 14.36 | 13.34 | 0.80 | 0.81 | 11.18 | 10.69 |
| Ridge | 17.09 | 14.84 | 0.74 | 0.78 | 13.39 | 11.81 |
| Enet | 15.23 | 14.38 | 0.77 | 0.78 | 12.11 | 11.60 |
| Rqlasso | 15.72 | 14.92 | 0.76 | 0.77 | 12.46 | 11.94 |
| Earth | 16.30 | 16.84 | 0.74 | 0.71 | 12.93 | 13.14 |
| Kknn | 16.44 | 16.02 | 0.75 | 0.74 | 12.79 | 12.38 |
| ParRF | 15.91 | 14.87 | 0.77 | 0.79 | 12.92 | 11.91 |
| Qrf | 15.66 | 14.81 | 0.76 | 0.77 | 11.99 | 10.99 |
| Rf | 15.92 | 14.99 | 0.77 | 0.78 | 12.95 | 11.98 |
| Ctree | 21.74 | 22.71 | 0.55 | 0.48 | 17.09 | 16.95 |
| Cubist | 12.67 | 10.93 | 0.84 | 0.87 | 9.78 | 8.37 |
| Glmboost | 15.20 | 14.38 | 0.77 | 0.78 | 12.17 | 11.57 |
| XgbTree | 29.67 | 29.22 | 0.33 | 0.30 | 22.70 | 22.86 |
| Msaene | 15.33 | 14.39 | 0.77 | 0.78 | 12.37 | 11.69 |
MAE denotes the mean absolute error.