Pan Gao, Wei Xu, Tianying Yan, Chu Zhang, Xin Lv, Yong He.
Abstract
Narrow-leaved oleaster (Elaeagnus angustifolia) fruit is a natural product used as food and in traditional medicine. Fruits from different geographical origins vary in chemical and physical properties and differ in nutritional and commercial value. In this study, near-infrared hyperspectral imaging covering the spectral range of 874-1734 nm was combined with machine learning methods to identify the geographical origins of dry narrow-leaved oleaster fruits. The average spectrum of each individual fruit was extracted, and second derivative spectra were used to identify effective wavelengths. Partial least squares discriminant analysis (PLS-DA), support vector machine (SVM), and deep convolutional neural network (CNN) models were built for geographical origin identification using both full spectra and effective wavelengths. All three models performed well with either input, with classification accuracy above 90% on the calibration, validation, and prediction sets. Models using effective wavelengths achieved results close to those of models using full spectra, and the PLS-DA, SVM, and CNN models performed similarly. Overall, the results show that near-infrared hyperspectral imaging coupled with machine learning can be used to trace the geographical origins of dry narrow-leaved oleaster fruits.
Keywords: convolutional neural network; effective wavelengths; geographical origin; narrow-leaved oleaster fruits; near-infrared hyperspectral imaging
Year: 2019 PMID: 31783592 PMCID: PMC6963922 DOI: 10.3390/foods8120620
Source DB: PubMed Journal: Foods ISSN: 2304-8158
Figure 1. Samples of each geographical origin for hyperspectral imaging acquisition.
Figure 2. The proposed convolutional neural network (CNN) architecture for narrow-leaved oleaster fruit identification. Conv1D denotes a one-dimensional convolution layer, ReLU (Rectified Linear Unit) is the activation function, MaxPool1D denotes a one-dimensional max pooling layer, and Dense denotes a densely connected neural network layer. The Conv1D parameter ‘Channels’ is the number of kernels (filters), and the Dense parameter ‘units’ is the number of neurons.
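To make the layer vocabulary in the Figure 2 caption concrete, the following is a minimal pure-Python sketch of one forward pass through Conv1D, ReLU, MaxPool1D, and Dense. The layer sizes, kernel values, dense weights, and the toy six-point spectrum are illustrative assumptions; they are not the paper's architecture or trained weights.

```python
def conv1d(signal, kernels, biases):
    """Valid 1-D convolution (cross-correlation, stride 1).

    `kernels` holds one weight list per output channel, so
    len(kernels) plays the role of the 'Channels' parameter.
    """
    width = len(kernels[0])
    steps = len(signal) - width + 1
    return [
        [sum(w * signal[i + j] for j, w in enumerate(k)) + b for i in range(steps)]
        for k, b in zip(kernels, biases)
    ]


def relu(feature_maps):
    """Rectified Linear Unit: negative activations become zero."""
    return [[max(0.0, v) for v in channel] for channel in feature_maps]


def max_pool1d(feature_maps, size=2):
    """Non-overlapping max pooling along the spectral axis."""
    return [
        [max(channel[i:i + size]) for i in range(0, len(channel) - size + 1, size)]
        for channel in feature_maps
    ]


def dense(inputs, weights, biases):
    """Fully connected layer; len(weights) plays the role of 'units'."""
    return [sum(w * x for w, x in zip(row, inputs)) + b for row, b in zip(weights, biases)]


# Hypothetical toy input: a six-point "spectrum" with a single input channel.
spectrum = [0.0, 1.0, 3.0, 2.0, 0.0, 1.0]

# Two illustrative kernels (Channels = 2) of width 3.
kernels = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
conv_out = conv1d(spectrum, kernels, biases=[0.0, 0.0])
pooled = max_pool1d(relu(conv_out), size=2)

# Flatten the pooled feature maps and score three classes (units = 3),
# one per geographical origin; the weights are arbitrary placeholders.
flat = [v for channel in pooled for v in channel]
dense_weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
scores = dense(flat, dense_weights, biases=[0.0, 0.0, 0.0])
predicted_class = scores.index(max(scores))
```

In a trained network the kernels and dense weights would be learned from the calibration spectra, and the final scores would typically pass through a softmax before the class is chosen.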
Figure 3. Average spectra with standard deviation of each wavelength of narrow-leaved oleaster fruits from Gansu, Ningxia, and Xinjiang.
Figure 4. Effective wavelength selection using the second derivative spectra of average spectra of the samples from Gansu, Ningxia, and Xinjiang.
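As a rough illustration of how second derivative spectra can point to effective wavelengths, the sketch below takes a central finite-difference second derivative of a synthetic average spectrum and keeps the local extrema whose magnitude exceeds a threshold. The synthetic spectrum, the 10 nm sampling step, the threshold, and the helper names are assumptions for illustration; the source does not state which derivative algorithm or selection rule the authors used.

```python
import math


def second_derivative(values, step):
    """Central finite-difference second derivative for evenly spaced samples."""
    return [
        (values[i - 1] - 2.0 * values[i] + values[i + 1]) / step ** 2
        for i in range(1, len(values) - 1)
    ]


def pick_extrema(wavelengths, deriv, threshold):
    """Return wavelengths where |second derivative| is a local maximum above threshold."""
    picked = []
    for i in range(1, len(deriv) - 1):
        mag = abs(deriv[i])
        if mag >= threshold and mag > abs(deriv[i - 1]) and mag > abs(deriv[i + 1]):
            # deriv[i] corresponds to wavelengths[i + 1] because the
            # derivative drops one sample at each end.
            picked.append(wavelengths[i + 1])
    return picked


# Hypothetical average spectrum: two Gaussian bands on a flat baseline,
# sampled every 10 nm across the 874-1734 nm range.
step = 10.0
wavelengths = [874.0 + step * i for i in range(87)]
spectrum = [
    0.3
    + 0.20 * math.exp(-((w - 1204.0) / 30.0) ** 2)
    + 0.35 * math.exp(-((w - 1464.0) / 40.0) ** 2)
    for w in wavelengths
]

deriv = second_derivative(spectrum, step)
effective = pick_extrema(wavelengths, deriv, threshold=3e-4)
```

With this synthetic input the two band centres (1204 nm and 1464 nm) are the only wavelengths selected. Real spectra are noisy, so a smoothing step before differentiation would normally be needed.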
Figure 5. Principal component analysis (PCA) score scatter plots of (a) PC1 versus PC2; (b) PC1 versus PC3; and (c) PC2 versus PC3. The ellipse is the confidence ellipse (confidence level at 0.95).
Confusion matrices of the partial least squares discriminant analysis (PLS-DA), support vector machine (SVM), and convolutional neural network (CNN) models using full spectra.
| Model | Category value | Calibration 0 | Calibration 1 | Calibration 2 | Calibration total (%) | Validation 0 | Validation 1 | Validation 2 | Validation total (%) | Prediction 0 | Prediction 1 | Prediction 2 | Prediction total (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PLS-DA | 0 | 539 | 0 | 0 | 99.94 | 291 | 0 | 0 | 100 | 268 | 0 | 7 | 99.02 |
| | 1 | 0 | 601 | 1 | | 0 | 303 | 0 | | 0 | 299 | 1 | |
| | 2 | 0 | 0 | 481 | | 0 | 0 | 241 | | 0 | 0 | 240 | |
| SVM | 0 | 539 | 0 | 0 | 100 | 289 | 0 | 2 | 99.76 | 224 | 0 | 51 | 93.74 |
| | 1 | 0 | 602 | 0 | | 0 | 303 | 0 | | 0 | 300 | 0 | |
| | 2 | 0 | 0 | 481 | | 0 | 0 | 241 | | 0 | 0 | 240 | |
| CNN | 0 | 539 | 0 | 0 | 99.57 | 289 | 0 | 2 | 99.28 | 253 | 0 | 22 | 97.30 |
| | 1 | 1 | 601 | 0 | | 0 | 303 | 0 | | 0 | 300 | 0 | |
| | 2 | 6 | 0 | 475 | | 4 | 0 | 237 | | 0 | 0 | 240 | |
* 0, 1, and 2 are the assigned category values of the samples from Gansu, Ningxia, and Xinjiang, respectively.
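The "Total (%)" values are overall accuracies: the sum of the confusion-matrix diagonal divided by the number of samples. The sketch below recomputes the prediction-set totals for the full-spectra models from the counts above. It assumes that rows are actual categories and columns are predicted categories (the row sums are identical across models, which is consistent with that reading) and that the three blocks follow the caption's model order (PLS-DA, SVM, CNN).

```python
def overall_accuracy(matrix):
    """Percentage of samples on the diagonal of a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


# Prediction-set counts for the full-spectra models.
# Rows: actual category (0 Gansu, 1 Ningxia, 2 Xinjiang); columns: predicted.
# Assumption: the three blocks appear in caption order (PLS-DA, SVM, CNN).
prediction_full = {
    "PLS-DA": [[268, 0, 7], [0, 299, 1], [0, 0, 240]],
    "SVM": [[224, 0, 51], [0, 300, 0], [0, 0, 240]],
    "CNN": [[253, 0, 22], [0, 300, 0], [0, 0, 240]],
}

accuracy = {name: round(overall_accuracy(m), 2) for name, m in prediction_full.items()}
# accuracy -> {"PLS-DA": 99.02, "SVM": 93.74, "CNN": 97.3}
```

The recomputed values match the reported prediction totals of 99.02, 93.74, and 97.30.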
Confusion matrices of the PLS-DA, SVM, and CNN models using effective wavelengths.
| Model | Category value | Calibration 0 | Calibration 1 | Calibration 2 | Calibration total (%) | Validation 0 | Validation 1 | Validation 2 | Validation total (%) | Prediction 0 | Prediction 1 | Prediction 2 | Prediction total (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PLS-DA | 0 | 538 | 0 | 1 | 99.92 | 291 | 0 | 0 | 100 | 272 | 0 | 3 | 99.63 |
| | 1 | 1 | 601 | 0 | | 0 | 303 | 0 | | 0 | 300 | 0 | |
| | 2 | 1 | 0 | 480 | | 0 | 0 | 241 | | 0 | 0 | 240 | |
| SVM | 0 | 539 | 0 | 0 | 99.88 | 271 | 0 | 20 | 97.49 | 238 | 0 | 37 | 95.34 |
| | 1 | 0 | 602 | 0 | | 0 | 303 | 0 | | 0 | 300 | 0 | |
| | 2 | 2 | 0 | 479 | | 1 | 0 | 240 | | 1 | 0 | 239 | |
| CNN | 0 | 539 | 0 | 0 | 99.75 | 287 | 0 | 4 | 98.56 | 263 | 0 | 12 | 97.79 |
| | 1 | 0 | 602 | 0 | | 0 | 303 | 0 | | 0 | 299 | 1 | |
| | 2 | 4 | 0 | 477 | | 8 | 0 | 233 | | 5 | 0 | 235 | |
* 0, 1, and 2 are the assigned category values of the samples from Gansu, Ningxia, and Xinjiang, respectively.
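Overall accuracy does not show where the errors fall. The sketch below computes per-category recall and the largest off-diagonal cell for the SVM prediction set with effective wavelengths. It assumes that rows are actual categories, columns are predicted categories, and the second block of the table belongs to the SVM model (caption order). For this matrix, 37 of the 38 errors are Gansu samples (category 0) assigned to Xinjiang (category 2).

```python
def per_class_recall(matrix):
    """Recall (%) for each actual category: diagonal count over row sum."""
    return [round(100.0 * row[i] / sum(row), 2) for i, row in enumerate(matrix)]


def largest_confusion(matrix):
    """Return (actual, predicted, count) for the biggest off-diagonal cell."""
    cells = [
        (count, actual, predicted)
        for actual, row in enumerate(matrix)
        for predicted, count in enumerate(row)
        if actual != predicted
    ]
    count, actual, predicted = max(cells)
    return actual, predicted, count


# SVM, effective wavelengths, prediction set
# (assumed block order in the table: PLS-DA, SVM, CNN).
# Rows: actual category (0 Gansu, 1 Ningxia, 2 Xinjiang); columns: predicted.
svm_effective_prediction = [[238, 0, 37], [0, 300, 0], [1, 0, 239]]

recall = per_class_recall(svm_effective_prediction)
worst = largest_confusion(svm_effective_prediction)
```

Here `recall` holds one percentage per category in the order Gansu, Ningxia, Xinjiang, and `worst` identifies the Gansu-to-Xinjiang cell as the dominant confusion.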