Edward Duckworth1, Arti Hole2, Atul Deshmukh3, Pankaj Chaturvedi4, Murali Krishna Chilakapati2,5,4, Benjamin Mora1, Debdulal Roy1.
Abstract
We report a novel method that diagnoses buccal mucosa cancer with greater than 90% accuracy. The method applies Fourier transform infrared (FTIR) spectroscopic analysis to human serum while suppressing confounding high molecular weight signals, which relatively enhances the biomarkers' signals. A narrower molecular weight window of the serum was also investigated and yielded even higher diagnostic accuracy. The most accurate results for distinguishing the two hardest-to-discern classes, premalignant and cancer patients, came from the serum's 10-30 kDa molecular weight region. This work promises an avenue for earlier, high-accuracy diagnosis and, by identifying a key molecular weight region to focus on, greater insight into the molecular origins of these signals.
Year: 2022 PMID: 36161799 PMCID: PMC9558084 DOI: 10.1021/acs.analchem.2c02496
Source DB: PubMed Journal: Anal Chem ISSN: 0003-2700 Impact factor: 8.008
Table 1. FTIR Cross-Validation Sensitivity (Sen), Specificity (Spc), and Number of Principal Components (PCs) for Classifying Buccal Mucosa Cancer Samples against Healthy and Premalignant Samples Using PCA-SVM
| Fraction | Cancer vs healthy: Sen (%) | Cancer vs healthy: Spc (%) | Cancer vs healthy: PCs | Cancer vs premalignant: Sen (%) | Cancer vs premalignant: Spc (%) | Cancer vs premalignant: PCs | Cancer vs all other: Sen (%) | Cancer vs all other: Spc (%) | Cancer vs all other: PCs | Average CV accuracy (%): PCA-SVM | Average CV accuracy (%): LDA | Average CV accuracy (%): SVM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LMW | 88 | 88 | 29 | 83 | 84 | 46 | 65 | 81 | 30 | 82.3 | 76.5 | 77.2 |
| HMW | 94 | 82 | 10 | 83 | 83 | 15 | 81 | 89 | 24 | 86.1 | 83.9 | 83.1 |
| Whole | 89 | 86 | 29 | 90 | 84 | 27 | 84 | 90 | 43 | 87 | 82.7 | 79.7 |
Post-cross-validation results using LDA or SVM alone are also included for comparison, demonstrating a similar accuracy trend but with lower accuracies overall.
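The Sen and Spc columns follow the usual definitions: sensitivity is the share of cancer samples correctly flagged as cancer, and specificity is the share of non-cancer samples correctly left unflagged. The sketch below shows that arithmetic under stated assumptions: the function name `sensitivity_specificity` and the label lists are hypothetical, and the labels are made up for illustration, not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred, positive="cancer"):
    """Return (sensitivity, specificity) in percent for a binary task.

    Any label other than `positive` is treated as the negative class.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # cancer sample correctly called cancer
            else:
                fn += 1  # cancer sample missed
        else:
            if pred == positive:
                fp += 1  # non-cancer sample wrongly called cancer
            else:
                tn += 1  # non-cancer sample correctly rejected
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return sensitivity, specificity


# Made-up labels for illustration only; these are not data from the study.
y_true = ["cancer"] * 5 + ["healthy"] * 5
y_pred = ["cancer", "cancer", "cancer", "cancer", "healthy",
          "healthy", "healthy", "healthy", "cancer", "healthy"]

sen, spc = sensitivity_specificity(y_true, y_pred)
print(f"Sen = {sen:.0f}%, Spc = {spc:.0f}%")
```

In a cross-validated setting, the same counts would typically be accumulated over the held-out predictions of every fold before the two ratios are taken.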
Figure 1. Difference in the average spectra of cancer and premalignant patient serum from healthy serum, for whole serum. The error, shown as a faded band around each line, indicates the level of distinction for each spectrum.
Figure 2. (A) Explained variance as a function of the number of principal components (PCs) used. (B) Cross-validated sensitivity and specificity as a function of the number of PCs used in the model. This example graph demonstrates how the accuracy plateaus once a sufficient number of principal components is included. The cross-validated accuracy does not decrease beyond that point because the SVM algorithm ignores the unnecessary components, so minimal or no overfitting occurs. The two graphs mimic one another: the plateau in panel B starts at 25 principal components, at which point panel A shows 99.84% of the variance explained. This example is from the classification of the whole cancer vs premalignant subset.
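The explained variance curve in panel A is a cumulative sum of the per-component variances divided by their total. The following minimal sketch shows how such a curve, and the number of components needed to reach a chosen threshold, could be computed. The function names and the eigenvalues are hypothetical placeholders, not values from the study.

```python
def cumulative_explained_variance(eigenvalues):
    """Cumulative explained variance (%) after 1, 2, ..., n components."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    curve = []
    # Components are ordered from largest to smallest variance, as in PCA.
    for value in sorted(eigenvalues, reverse=True):
        running += value
        curve.append(100.0 * running / total)
    return curve


def components_needed(eigenvalues, threshold_pct):
    """Smallest number of components whose cumulative variance reaches the threshold."""
    curve = cumulative_explained_variance(eigenvalues)
    for k, pct in enumerate(curve, start=1):
        if pct >= threshold_pct:
            return k
    return len(curve)


# Hypothetical eigenvalues for illustration only.
eigenvalues = [50.0, 30.0, 15.0, 4.0, 1.0]
curve = cumulative_explained_variance(eigenvalues)
k = components_needed(eigenvalues, 90.0)
print(curve, k)
```

A curve like this only describes the variance retained. The plateau in panel B still has to be read from the cross-validated sensitivity and specificity themselves.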
Figure 3. Classification accuracies between FTIR spectra of premalignant and cancer patients for different serum molecular weight subsets (molecular windows). The 95% confidence interval is shown by the gray lines over the bars. The 10–30 kDa window performed significantly better than the whole serum.
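The caption does not state how the 95% confidence intervals were obtained. As one common option for an accuracy expressed as a proportion, the sketch below computes a Wilson score interval. The choice of the Wilson method, the function name `wilson_interval`, and the sample counts are all assumptions made for illustration, not the authors' procedure or data.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion; z=1.96 gives roughly 95% coverage."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    p = correct / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half_width = (z / denom) * math.sqrt(
        p * (1.0 - p) / total + z * z / (4.0 * total * total)
    )
    return centre - half_width, centre + half_width


# Hypothetical counts: 90 correct classifications out of 100 samples.
low, high = wilson_interval(90, 100)
print(f"accuracy = 0.90, 95% CI = ({low:.3f}, {high:.3f})")
```

Comparing two windows by checking whether their intervals overlap is only a rough guide. A formal claim of significance would need a test suited to how the samples and folds were paired.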