Daniel L D Freitas, Ingrid M Câmara, Priscila P Silva, Nathália R S Wanderley, Maria B C Alves, Camilo L M Morais, Francis L Martin, Tirzah B P Lajus, Kassio M G Lima.
Abstract
Mortality due to breast cancer could be reduced via screening programs in which preliminary clinical tests, employed in an asymptomatic well-population with the objective of identifying cancer biomarkers, allow earlier referral of women with altered results for deeper clinical analysis and treatment. The introduction of well-population screening using new and less-invasive technologies as a strategy for earlier detection of breast cancer is thus highly desirable. Herein, spectrochemical analyses coupled with multivariate classification techniques are used as a bio-analytical tool for a Breast Cancer Screening Program using liquid biopsy in the form of blood plasma samples collected from 476 patients recruited over a 2-year period. This methodology is based on acquiring and analysing the spectrochemical fingerprint of plasma samples by attenuated total reflection Fourier-transform infrared (ATR-FTIR) spectroscopy; the derived spectra reflect intrinsic biochemical composition, generating information on nucleic acids, carbohydrates, lipids and proteins. Excellent results in terms of sensitivity (94%) and specificity (91%) were obtained using this method in comparison with traditional mammography (88–93% and 85–94%, respectively). Additional advantages, such as better disease prognosis allowing more effective treatment, lower associated morbidity, fewer false-positive and false-negative results, lower cost, and higher analytical frequency, make this method attractive for translation to the clinical setting.
Year: 2020 PMID: 32733086 PMCID: PMC7393361 DOI: 10.1038/s41598-020-69800-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. ATR-FTIR spectra of plasma samples in the bio-fingerprint region (1,800–900 cm⁻¹). (a) Raw spectral data for breast cancer (BC) and healthy control (HC) samples; (b) pre-processed spectral data (Savitzky–Golay smoothing [7-point window, 2nd-order polynomial fitting] followed by AWLS baseline correction and normalization to the Amide I peak) for BC and HC samples.
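The pre-processing chain named in the caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the smoothing uses the standard tabulated 7-point, 2nd-order Savitzky–Golay coefficients; AWLS is taken here to mean automatic weighted least squares, and the baseline step below is a simplified iterative polynomial fit used as a stand-in for it; the Amide I window (1,600–1,700 cm⁻¹), the function name `preprocess`, and the synthetic spectrum are all hypothetical.

```python
import numpy as np

# Standard tabulated 7-point, 2nd-order Savitzky-Golay smoothing coefficients.
SG_7_2 = np.array([-2.0, 3.0, 6.0, 7.0, 6.0, 3.0, -2.0]) / 21.0


def preprocess(wavenumbers, spectrum, baseline_order=2, n_iter=50,
               amide_i_range=(1600.0, 1700.0)):
    """Smooth, baseline-correct and normalize one spectrum (illustrative only)."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)

    # 1) Savitzky-Golay smoothing; edges are padded by repeating the end values.
    padded = np.pad(spectrum, 3, mode="edge")
    smoothed = np.convolve(padded, SG_7_2, mode="valid")

    # 2) Baseline stand-in (NOT the exact AWLS algorithm): repeatedly fit a
    #    polynomial and clip points above the fit, so peaks stop pulling it up.
    x = (wavenumbers - wavenumbers.mean()) / wavenumbers.std()
    work = smoothed.copy()
    for _ in range(n_iter):
        baseline = np.polyval(np.polyfit(x, work, baseline_order), x)
        work = np.minimum(work, baseline)
    corrected = smoothed - baseline

    # 3) Normalize to the maximum inside the assumed Amide I window.
    low, high = amide_i_range
    in_band = (wavenumbers >= low) & (wavenumbers <= high)
    return corrected / corrected[in_band].max()


# Synthetic example: one Gaussian band near 1650 cm^-1 on a sloping baseline.
wn = np.linspace(900.0, 1800.0, 451)
raw = np.exp(-((wn - 1650.0) / 20.0) ** 2) + 0.2 + 1e-4 * wn
processed = preprocess(wn, raw)
```

After this step the largest value inside the assumed Amide I window is 1 by construction, which puts spectra from different samples on a common intensity scale.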
Statistical results (%) for the test set using the PCA-LDA/QDA/SVM, SPA-LDA/QDA/SVM and GA-LDA/QDA/SVM models to discriminate between healthy control and breast cancer samples.
| Model | AC | SENS | SPEC | YOU | PPV | NPV | F-score | G-score |
|---|---|---|---|---|---|---|---|---|
| PCA-LDA | 65.7 | 82.9 | 48.6 | 31.4 | 61.7 | 73.9 | 61.2 | 63.4 |
| PCA-QDA | 65.7 | 82.9 | 48.6 | 31.4 | 61.7 | 73.9 | 61.2 | 63.4 |
| PCA-SVM | 88.6 | 91.4 | 85.7 | 77.1 | 86.5 | 90.9 | 88.5 | 88.5 |
| SPA-LDA | 68.6 | 80.0 | 57.1 | 37.1 | 65.1 | 74.1 | 66.7 | 67.6 |
| SPA-QDA | 74.3 | 85.7 | 62.9 | 48.6 | 69.8 | 81.5 | 72.5 | 73.4 |
| GA-LDA | 75.7 | 74.3 | 77.1 | 51.4 | 76.5 | 75.0 | 75.7 | 75.7 |
| GA-QDA | 72.9 | 71.4 | 74.3 | 45.7 | 73.5 | 72.2 | 72.8 | 72.8 |
| GA-SVM | 87.1 | 88.6 | 85.7 | 74.3 | 86.1 | 88.2 | 87.1 | 87.1 |
AC, Accuracy; SENS, Sensitivity; SPEC, Specificity; YOU, Youden’s Index; PPV, Positive predictive value; NPV, Negative predictive value. The best model (SPA-SVM) is in bold.
Figure 2. Receiver operating characteristic (ROC) curve. Abbreviations: PCA-LDA, principal component analysis with linear discriminant analysis; PCA-QDA, principal component analysis with quadratic discriminant analysis; PCA-SVM, principal component analysis with support vector machines; SPA-LDA, successive projections algorithm with linear discriminant analysis; SPA-QDA, successive projections algorithm with quadratic discriminant analysis; SPA-SVM, successive projections algorithm with support vector machines; GA-LDA, genetic algorithm with linear discriminant analysis; GA-QDA, genetic algorithm with quadratic discriminant analysis; GA-SVM, genetic algorithm with support vector machines; AUC, area under the curve.
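For readers unfamiliar with the AUC, the following minimal sketch shows one common way to compute it: as the fraction of (positive, negative) sample pairs in which the positive sample receives the higher classifier score, with ties counted as half. The function name `roc_auc` and the labels and scores are hypothetical illustrations, not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (e.g. breast cancer), 0 for the negative class.
    scores: classifier outputs, where higher means "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive sample ranked above negative sample
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: one positive sample (0.4) is ranked below one negative (0.7).
example_auc = roc_auc([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.4, 0.7, 0.3, 0.2])
```

An AUC of 1 corresponds to perfect separation of the two classes, and 0.5 to a classifier that ranks samples no better than chance.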
Figure 3. Wavenumbers selected by the successive projections algorithm support vector machines (SPA-SVM) model.
Wavenumbers selected by the SPA-SVM model to distinguish healthy control and breast cancer samples.
| Selected wavenumber (cm⁻¹) | Tentative assignment |
|---|---|
| 901 | Phosphodiester (absorbances due to collagen and glycogen) |
| 959 | Symmetric stretching vibration of ν1 PO4 |
| 980 | OCH3 (polysaccharides) |
| 999 | Ring stretching vibrations mixed strongly with CH in-plane bending |
| 1,018 | ν(CO), ν(CC), δ(OCH), ring (polysaccharides, pectin) |
| 1,277 | Vibrational modes of collagen |
| 1,311 | Amide III band components of proteins |
| 1,364 | Stretching C–O, deformation C–H, deformation N–H |
| 1,402 | Symmetric CH3 bending modes of the methyl groups of proteins |
| 1,464 | CH2 scissoring mode of the acyl chain of lipid |
| 1,489 | In-plane CH bending vibration |
| 1,582 | Ring C–C stretch of phenyl |
| 1,626 | Peak of nucleic acids due to the base carbonyl stretching and ring breathing mode |
| 1,643 | Amide I band (arises from C=O stretching vibrations) |
| 1,661 | ν(C=C) cis in lipids and fatty acids |
| 1,742 | C=O stretching mode of lipids |
Equations to calculate the figures of merit for model evaluation.
| Parameter (%) | Equation |
|---|---|
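| Accuracy (AC) | $\dfrac{TP + TN}{TP + TN + FP + FN} \times 100$ |
| Sensitivity (SENS) | $\dfrac{TP}{TP + FN} \times 100$ |
| Specificity (SPEC) | $\dfrac{TN}{TN + FP} \times 100$ |
| Youden’s index (YOU) | $\mathrm{SENS} - (100 - \mathrm{SPEC})$ |
| Positive predictive value (PPV) | $\dfrac{TP}{TP + FP} \times 100$ |
| Negative predictive value (NPV) | $\dfrac{TN}{TN + FN} \times 100$ |
| F-score | $\dfrac{2 \times \mathrm{SENS} \times \mathrm{SPEC}}{\mathrm{SENS} + \mathrm{SPEC}}$ |
| G-score | $\sqrt{\mathrm{SENS} \times \mathrm{SPEC}}$ |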
FN stands for false negative, FP for false positive, TP for true positive, and TN for true negative.
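As an illustration, the figures of merit listed above can be computed from the four confusion-matrix counts roughly as follows. This is a sketch under assumptions: the F-score and G-score are taken to be the sensitivity/specificity-based forms (harmonic and geometric mean, respectively), and the counts in the example are hypothetical values for a balanced 70-sample test set, not numbers reported in the source.

```python
import math


def figures_of_merit(tp, tn, fp, fn):
    """Return the eight figures of merit (in %) from confusion-matrix counts."""
    sens = 100.0 * tp / (tp + fn)   # sensitivity: true-positive rate
    spec = 100.0 * tn / (tn + fp)   # specificity: true-negative rate
    return {
        "AC": 100.0 * (tp + tn) / (tp + tn + fp + fn),
        "SENS": sens,
        "SPEC": spec,
        "YOU": sens - (100.0 - spec),                       # Youden's index
        "PPV": 100.0 * tp / (tp + fp),
        "NPV": 100.0 * tn / (tn + fn),
        "F-score": 2.0 * sens * spec / (sens + spec),       # assumed definition
        "G-score": math.sqrt(sens * spec),                  # assumed definition
    }


# Hypothetical counts (illustration only): 35 positive and 35 negative samples.
metrics = figures_of_merit(tp=32, tn=30, fp=5, fn=3)
rounded = {name: round(value, 1) for name, value in metrics.items()}
print(rounded)
```

With these hypothetical counts, the rounded output coincides with the PCA-SVM row of the results table (88.6, 91.4, 85.7, 77.1, 86.5, 90.9, 88.5, 88.5), which is consistent with, but does not by itself confirm, the definitions assumed here.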