Haitao Chang1, Lianqing Zhu2, Xiaoping Lou2, Xiaochen Meng2, Yangkuan Guo2, Zhongyu Wang1.
Abstract
Over the last decade, near-infrared spectroscopy, combined with chemometric models, has been widely employed as an analytical tool in several industries. However, most chemical processes or analytes are multivariate and nonlinear in nature. To address this problem, this paper presents a local errors regression method for building an accurate calibration model, in which a calibration subset is selected by a new similarity criterion that takes into account the full information of the spectra, the chemical property, and the predicted errors. After the calibration subset is selected, partial least squares regression is applied to build the calibration model. The performance of the proposed method is demonstrated on a near-infrared spectroscopy dataset of pharmaceutical tablets. Compared with other local strategies that use different similarity criteria, the proposed local errors regression yields a significant improvement in both prediction ability and calculation speed.
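As a rough illustration of the local strategy described in the abstract, the sketch below covers only the subset-selection step, in plain Python. It is not the authors' implementation: the abstract does not say how the error-based similarity criterion is computed, so the function names (`nearest_subset`, `range_subset`), the `radius` and `min_samples` parameters, and the toy feature vectors are all assumptions made for illustration. The first function picks a fixed number of nearest calibration samples by Euclidean distance; the second keeps every calibration sample within a given distance, so the subset size varies per query sample and can be flagged as insufficient.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_subset(cal_features, query, size):
    """Indices of the `size` calibration samples closest to the query."""
    order = sorted(range(len(cal_features)),
                   key=lambda i: euclidean(cal_features[i], query))
    return order[:size]


def range_subset(cal_features, query, radius, min_samples):
    """Indices of calibration samples within `radius` of the query.

    Returns None when fewer than `min_samples` qualify. Both the radius
    rule and the minimum count are hypothetical, not taken from the paper.
    """
    chosen = [i for i, f in enumerate(cal_features)
              if euclidean(f, query) <= radius]
    return chosen if len(chosen) >= min_samples else None


# Toy feature vectors (assumed data; these could stand for spectra or
# for vectors of predicted errors).
cal = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
query = [0.5, 0.0]

fixed = nearest_subset(cal, query, 2)
variable = range_subset(cal, query, radius=2.5, min_samples=2)
too_few = range_subset(cal, query, radius=0.1, min_samples=2)
```

In a full local regression, the selected indices would then be used to fit a partial least squares model on that subset and predict the query sample.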
Year: 2016 PMID: 27446631 PMCID: PMC4944088 DOI: 10.1155/2016/5416506
Source DB: PubMed Journal: J Anal Methods Chem ISSN: 2090-8873 Impact factor: 2.193
Descriptive statistics for the calibration set, validation set, and prediction set.
| Sample sets | Number | Range | Mean | Standard deviations |
|---|---|---|---|---|
| Calibration | 460 | 154.3~237.7 | 188.4 | 15.8 |
| Validation | 40 | 168.2~219.5 | 194.8 | 12.4 |
| Prediction | 155 | 151.6~239.1 | 192.9 | 22.0 |
Figure 1. NIR original spectra for pharmaceutical tablets.
Figure 2. Variation of RMSEP with the PLS factors for validation set. RMSEP = root mean squared error of prediction. PLS = partial least squares.
Figure 3. The probability of insufficient selected samples with the PLS factors and error ranges for the validation set. × = insufficient selected samples for building calibration model. PLS = partial least squares.
Size of calibration subset for each query sample with local errors strategy in prediction set.
| Number of samples in calibration subset | Number of query samples | Percentage |
|---|---|---|
| [13,50] | 122 | 78.7% |
| [51,100] | 7 | 4.5% |
| [101,150] | 9 | 5.8% |
| [151,200] | 11 | 7.1% |
| [201,210] | 6 | 3.9% |
Figure 4. Predicted versus reference values of prediction set for the local errors regression and global method.
Performance comparisons among local errors regression, the global method, and other local methods with different similarity criteria.
| Method | Similarity criterion | RMSEP | R² | RPD | Size of subset | Parameters | Time consumption (s) |
|---|---|---|---|---|---|---|---|
| Global | — | 4.30 | 0.96 | 5.11 | 460 | — | 4.4 |
| Other local methods | ED | 4.18 | 0.96 | 5.26 | 150 | — | 175.4 |
| | Cosine | 4.25 | 0.96 | 5.17 | 150 | — | 179.1 |
| | PC-M | 4.21 | 0.96 | 5.22 | 50 | PC factors = 10 | 53.5 |
| | | 4.24 | 0.96 | 5.18 | 100 | | 123.1 |
| | | 4.27 | 0.96 | 5.15 | 200 | | 233.9 |
| Local errors regression | Errors + ED | 3.21 | 0.98 | 6.85 | 13~205 | | 9.5 |
ED: Euclidean distance; PC-M: principal components-Mahalanobis distance; X + Y + ED: Euclidean distance considering both spectra X and property Y; X + Y + SLPP: Euclidean distance in the low-dimensional space obtained with the supervised locality preserving projection method; Errors + ED: Euclidean distance between predicted errors; RMSEP: root mean squared error of prediction; R²: correlation coefficient in the prediction set; RPD: residual prediction deviation; PC factors: principal component factors; γ: a trade-off parameter to balance the importance of spectra X and property Y; d: dimension of the transformation matrix; s: second.
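The two error metrics in the legend can be sketched in a few lines of Python. The source does not spell out the formulas, so this assumes the usual definitions: RMSEP as the square root of the mean squared difference between reference and predicted values, and RPD as the standard deviation of the reference values divided by RMSEP. Under that assumption, dividing the prediction-set standard deviation from the descriptive statistics (22.0) by the reported RMSEP values gives numbers close to the reported RPD values. The function names and the toy values are illustrative only.

```python
import math


def rmsep(reference, predicted):
    """Root mean squared error of prediction (standard definition)."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(reference_sd, rmsep_value):
    """Residual prediction deviation, assumed here to be SD / RMSEP."""
    return reference_sd / rmsep_value


# Prediction-set standard deviation and RMSEP values as reported above.
rpd_global = rpd(22.0, 4.30)  # close to the reported 5.11
rpd_local = rpd(22.0, 3.21)   # close to the reported 6.85

# Toy example (made-up values) showing rmsep on raw predictions.
toy_rmsep = rmsep([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Small differences from the tabulated RPD values are expected, because the standard deviation is itself reported to only one decimal place.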