Ashwin Kumar Myakalwar, Nicolas Spegazzini, Chi Zhang, Siva Kumar Anubham, Ramachandra R Dasari, Ishan Barman, Manoj Kumar Gundawar.
Abstract
Despite its intrinsic advantages, translation of laser-induced breakdown spectroscopy (LIBS) for material identification has often been impeded by the lack of robustness of the developed classification models, frequently due to the presence of spurious correlations. While a number of classifiers exhibiting high discriminatory power have been reported, efforts to establish the subset of relevant spectral features that enables a fundamental interpretation of the segmentation capability and avoids the 'curse of dimensionality' have been lacking. Using LIBS data acquired from a set of secondary explosives, we investigate judicious feature selection approaches and architect two different chemometric classifiers, based on feature selection through prior knowledge of the sample composition and through a genetic algorithm, respectively. While the full spectral input results in a classification rate of ca. 92%, selecting only the carbon-to-hydrogen spectral window gives near-identical performance. Importantly, the genetic algorithm-derived classifier shows a statistically significant improvement to ca. 94% accuracy for prospective classification, even though the number of features used is an order of magnitude smaller. Our findings demonstrate the impact of rigorous feature selection in LIBS and also hint at the feasibility of using a discrete filter-based detector, thereby enabling a cheaper, more compact system more amenable to field operations.
Year: 2015 PMID: 26286630 PMCID: PMC4541340 DOI: 10.1038/srep13169
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Representative LIBS spectra of high-energy material (HEM) samples: HMX, NTO, PETN, RDX and TNT.
The spectra are normalized and offset for visualization purposes.
Details of selected spectral data regions based on the sample compositional information.
| Sl. No. | Region Name | Description | Wavelengths included (nm) | % of full spectrum |
|---|---|---|---|---|
| 1 | R1 | C and H peaks | 246.98–248.05; 640–670 | 3.16 |
| 2 | R2 | C to CN region | 246.98–424.04 | 34.05 |
| 3 | R3 | C to H region | 246.98–670.04 | 62.85 |
| 4 | R4 | Post-H region | 670–875.99 | 17.06 |
| 5 | Full | Entire spectrum | 199–981 | 100 |
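The region-based selection in the table above amounts to masking the spectrum by wavelength window. The following is a minimal sketch of that idea, not the authors' code: the window limits are copied from the table, while the 1 nm wavelength axis, the placeholder spectrum, and the helper names `select_indices` and `subset_spectrum` are assumptions made for illustration.

```python
# Window limits (nm) are taken from the table above; all else is illustrative.
REGIONS = {
    "R1": [(246.98, 248.05), (640.0, 670.0)],
    "R2": [(246.98, 424.04)],
    "R3": [(246.98, 670.04)],
    "R4": [(670.0, 875.99)],
    "Full": [(199.0, 981.0)],
}


def select_indices(wavelengths, windows):
    """Return indices of wavelengths that fall inside any (low, high) window."""
    return [
        i
        for i, w in enumerate(wavelengths)
        if any(low <= w <= high for low, high in windows)
    ]


def subset_spectrum(intensities, indices):
    """Keep only the intensity values at the selected indices."""
    return [intensities[i] for i in indices]


# Hypothetical coarse axis: 1 nm steps from 199 to 981 nm (an assumption;
# the real spectrometer grid is much finer and need not be uniform).
wavelengths = [199.0 + i for i in range(783)]
intensities = [0.0] * len(wavelengths)  # placeholder spectrum

for name, windows in REGIONS.items():
    idx = select_indices(wavelengths, windows)
    reduced = subset_spectrum(intensities, idx)
    print(name, len(reduced))
```

On a uniform axis like this one, the point counts will not reproduce the "% of full spectrum" column, which presumably reflects the instrument's actual sampling density.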
Classification performance of PLS-DA models featuring selected spectral regions.
| Region Selected | Correct Classification (%) | Misclassification (%) | Unclassification (%) |
|---|---|---|---|
| R1 | 69.65 | 28.32 | 2.02 |
| R2 | 87.48 | 10.24 | 2.27 |
| R3 | 90.42 | 6.78 | 2.79 |
| R4 | 85.42 | 12.34 | 2.23 |
| Full | 92.61 | 4.65 | 2.73 |
Figure 2. Bar plots with classification error rates showing the comparative performance of wavelength-selected PLS-DA classification models.
The wavelength selection in this case is performed using a genetic algorithm. The length of each bar is proportional to the average root mean square error (RMSE), and the associated error bars represent the standard deviation over a hundred iterations.
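For readers unfamiliar with genetic-algorithm wavelength selection, the sketch below shows the general mechanics on a toy problem. It is a generic GA, not the authors' GA/PLS-DA implementation: the truncation selection, union-based crossover, elitism, all parameter values, and the `toy_fitness` function are assumptions. In a real workflow the fitness would be a cross-validated classifier error (such as the RMSE mentioned in the caption) rather than the stand-in used here.

```python
import random


def run_ga(n_features, n_select, fitness, pop_size=30, generations=40,
           mutation_rate=0.1, seed=0):
    """Toy GA that searches for a subset of n_select feature indices."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable

    def random_individual():
        return tuple(sorted(rng.sample(range(n_features), n_select)))

    def crossover(a, b):
        # The child draws its indices from the union of both parents.
        pool = sorted(set(a) | set(b))
        return tuple(sorted(rng.sample(pool, n_select)))

    def mutate(ind):
        # Occasionally swap a selected index for an unselected one.
        genes = set(ind)
        for g in ind:
            if rng.random() < mutation_rate:
                candidates = [j for j in range(n_features) if j not in genes]
                if candidates:
                    genes.remove(g)
                    genes.add(rng.choice(candidates))
        return tuple(sorted(genes))

    population = [random_individual() for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [ranked[0]]             # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b)))
        population = children
    best = max(population, key=fitness)
    return best, history


# Hypothetical stand-in for a classifier score: counts how many selected
# channels fall in an arbitrary "informative" band (an assumption).
INFORMATIVE = frozenset(range(40, 50))


def toy_fitness(individual):
    return len(INFORMATIVE.intersection(individual))


best, history = run_ga(n_features=200, n_select=10, fitness=toy_fitness)
print(best, history[0], history[-1])
```

Because the best individual is always carried over, the recorded best fitness never decreases from one generation to the next.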
Results of sensitivity analysis for the GA/PLS-DA models for HEM classification.
| Fraction of spectrum sampled | Sample | Correct Classification (%) | Misclassification (%) | Unclassification (%) |
|---|---|---|---|---|
| 90% | HMX | 95.78 | 1.93 | 2.30 |
| 90% | NTO | 95.87 | 3.47 | 0.67 |
| 90% | PETN | 93.50 | 0.00 | 6.50 |
| 90% | RDX | 81.83 | 17.00 | 1.17 |
| 90% | TNT | 97.66 | 0.00 | 2.34 |
| 10% | HMX | 96.54 | 0.80 | 2.65 |
| 10% | NTO | 97.67 | 1.44 | 0.89 |
| 10% | PETN | 95.42 | 1.11 | 2.92 |
| 10% | RDX | 83.19 | 13.75 | 3.33 |
| 10% | TNT | 98.16 | 0.00 | 1.84 |
| 1% | HMX | 93.56 | 2.67 | 3.78 |
| 1% | NTO | 93.07 | 4.80 | 2.13 |
| 1% | PETN | 88.67 | 5.17 | 6.17 |
| 1% | RDX | 79.33 | 17.83 | 2.83 |
| 1% | TNT | 98.62 | 0.07 | 1.31 |
Three separate cases are tabulated corresponding to 90%, 10% and 1% of the full spectrum being sampled.
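As a quick check on how the three cases compare, the per-class correct-classification rates can be averaged. This is a sketch under two assumptions: that the three blocks of five rows in the table appear in the order 90%, 10%, 1%, and that an unweighted mean over the five materials is a fair summary (the source does not say how its overall accuracy figures were aggregated).

```python
from statistics import mean

# Correct classification (%) per material, copied from the table above.
# The assignment of row blocks to the 90% / 10% / 1% cases is assumed
# from the order in which the cases are listed.
CORRECT = {
    "90%": {"HMX": 95.78, "NTO": 95.87, "PETN": 93.50, "RDX": 81.83, "TNT": 97.66},
    "10%": {"HMX": 96.54, "NTO": 97.67, "PETN": 95.42, "RDX": 83.19, "TNT": 98.16},
    "1%": {"HMX": 93.56, "NTO": 93.07, "PETN": 88.67, "RDX": 79.33, "TNT": 98.62},
}

# Unweighted mean over the five materials for each case.
means = {case: mean(list(rates.values())) for case, rates in CORRECT.items()}
# Material with the lowest correct-classification rate in each case.
weakest = {case: min(rates, key=rates.get) for case, rates in CORRECT.items()}

for case in CORRECT:
    print(f"{case}: mean correct = {means[case]:.2f}%, weakest class = {weakest[case]}")
```

Under these assumptions the 10% case averages about 94.2%, which is in line with the "ca. 94%" quoted in the abstract, and RDX is the hardest class in all three cases.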
Figure 3. Plot of the frequency of wavelength selection for the 1% selected spectral subset over 10 independent iterations, with a representative spectrum of HMX on top.
1% of the full spectrum corresponds to 259 spectral points.
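A selection-frequency plot of this kind can be built by counting how often each spectral index is chosen across independent runs. The sketch below illustrates the counting step only; the `runs` data are made up, and `selection_frequency` is a hypothetical helper, not a function from the source. It also works out the full-spectrum size implied by the caption, assuming the 1% figure is exact rather than rounded.

```python
from collections import Counter


def selection_frequency(runs):
    """Count, for each spectral index, how many runs selected it."""
    counts = Counter()
    for selected in runs:
        # Deduplicate so an index counts at most once per run.
        counts.update(sorted(set(selected)))
    return counts


# Made-up example: three runs, each selecting three spectral indices.
runs = [[3, 10, 42], [3, 42, 77], [3, 5, 42]]
freq = selection_frequency(runs)
print(freq.most_common(2))

# If 259 points are exactly 1% of the spectrum, the full spectrum has
# about 259 * 100 points (an inference, not a figure stated in the source).
implied_total_points = 259 * 100
print(implied_total_points)
```

Indices that are selected in most or all runs are the ones such a plot would show as tall bars, which is what makes them candidates for a discrete filter-based detector.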