Nathan Blake1, Riana Gaifulina1, Lewis D Griffin2, Ian M Bell3, Geraint M H Thomas1.
Abstract
Raman Spectroscopy has long been anticipated to augment clinical decision making, such as classifying oncological samples. Unfortunately, the complexity of Raman data has thus far inhibited their routine use in clinical settings. Traditional machine learning models have been used to help exploit this information, but recent advances in deep learning have the potential to improve the field. However, there are a number of potential pitfalls with both traditional and deep learning models. We conduct a literature review to ascertain the recent machine learning methods used to classify cancers using Raman spectral data. We find that while deep learning models are popular, and ostensibly outperform traditional learning models, there are many methodological considerations which may be leading to an over-estimation of performance; primarily, small sample sizes which compound sub-optimal choices regarding sampling and validation strategies. Amongst several recommendations is a call to collate large benchmark Raman datasets, similar to those that have helped transform digital pathology, which researchers can use to develop and refine deep learning models.
Keywords: Raman Spectroscopy; cross-validation; deep learning; disease screening and diagnosis; machine learning; medical application
Year: 2022 PMID: 35741300 PMCID: PMC9222091 DOI: 10.3390/diagnostics12061491
Source DB: PubMed Journal: Diagnostics (Basel) ISSN: 2075-4418
Figure 1. Literature search strategy: PRISMA flowchart of the literature selection process. n = number of studies.
Literature review results: PCA—Principal Component Analysis, LDA—Linear Discriminant Analysis, QDA—Quadratic Discriminant Analysis, PLS—Partial Least Squares, SVM—Support Vector Machine, ANN—Artificial Neural Network, CNN—Convolutional Neural Network, RF—Random Forest, GB—Gradient Boost, CV—Cross Validation, LOOCV—Leave One Out Cross Validation, GA—Genetic Algorithm, NPC—Nasopharyngeal Carcinoma.
| Authors/Year | Pathology Sample Type | Model | Validation Strategy | Number of Subjects/ Samples | Number of Spectra | Level of Split | Number of Classes | Accuracy (Sensitivity/ Specificity) |
|---|---|---|---|---|---|---|---|---|
| Aubertin et al., 2018 | Prostate Cancer (tissue) | ANN | LOOCV | 32 subjects/ samples | 928 | Not Stated | 2 | 86% (87%/86%) |
| Baria et al., 2020 | Skin Cancer (cell lines) | PCA-ANN | 5-fold CV | Not Stated | 150 | Not Stated | 3 | 96.7% |
| Bury et al., 2019 | Brain Metastases (tissue) | PCA-LDA | Not Stated | 21 subjects | 525 | Not Stated | 2 | 80.2% |
| Chen et al., 2022 | Ovarian Cancer (plasma) | ANN ensemble | Outer fold–single 66/33 Inner fold–5-fold CV | 174 subjects | 870 | Spectra | 2 | 94.8% (95%/95%) |
| Chen et al., 2021 | Lung cancer & glioma (tissue) | CNN | 5-fold CV | 104 subjects/ samples | 520 (2700 post augmentation) | Subject | 2 | 99% (all pairwise comparisons > 95%) |
| Daniel et al., 2019 | Cervical Cancer (tissue) | PCA-ANN | Single 70/30 | 245 samples | Not Stated | Not Stated | 3 | 99.0% (87%/86%) |
| He et al., 2021 | Renal Cancer | SVM | LOOCV | 77 subjects/ samples | 4860 | Subject | 3 | 92.89% |
| Ito et al., 2020 | Colon Cancer (serum) | Boosted Tree | Not Stated | 184 subjects/ samples | 3 spectra per subject. Average used | N/A | 2 | 100% |
| Jeng et al., 2019 | Oral Cancer (tissue) | PCA-QDA | | 80 subjects/ samples | 400 | Sample | 2 | 82% (84%/75%) |
| Koya et al., 2020 | Breast Cancer (tissue) | CNN | Single split 60/20/20 | 88 subjects/ samples | 34,505 | Spectra | 2 | 90% (89%—precision, 89%—recall) |
| Lee et al., 2020 | Prostate Cancer (cell lines) | CNN | Single split 70/15/15 | 1 sample per class, 4 classes | 300 (1200 post augmentation) | Spectra | 4 | 97% |
| Ma et al., 2021 | Breast Cancer (tissue) | CNN | 10-fold CV | 20 subjects, 40 samples | 600 (5000 post augmentation) | Not Stated | 2 | 92% (98%/86%) |
| Mehta et al., 2018 | Brain Meningioma (serum) | PCA-LDA | LOOCV + independent test set | 20 subjects, 70 samples | ~8 spectra per subject. Average used | N/A | 2 | 86% |
| Qi et al., 2022 | Lung Cancer (tissue) | CNN | 10-fold CV | 77 subjects/ samples | 15 spectra per sample | Spectra | 2 | 98% (97%/99%) |
| Riva et al., 2021 | Glioma (tissue) | GB | LOOCV | 63 subjects/ samples | 3450 | Subject | 2 | 83% (82%—precision, 82%—recall) |
| Santos et al., 2018 | Skin (tissue) | PCA-LDA | Single split 60/40 | 128 samples | 9–19 spectra per sample | Sample | 2 | 62.5% |
| Sciortino et al., 2021 | Glioma (tissue) | SVM | LOOCV | 38 subjects/ samples | 2073 | Subject | 2 | 87% |
| Serzhantov et al., 2020 | Skin (tissue) | Gradient with soft voting | Single split 50/50, 1000 repeats | 139 subjects | 556 | Not Stated | 2 | 91% (93%/88%) |
| Shu et al., 2021 | Nasopharyngeal Cancer (in vivo tissue) | CNN | 10-fold Venetian Blind CV | 418 subjects, 888 samples | 15,354 (Augmented, quantity not specified) | Sample | 2 | 84% (99%/66%) |
| Wu et al., 2021 | Colon Cancer (tissue) | CNN | LOOCV | 45 subjects/ samples | 233 (2420 post augmentation) | Spectra AND Subject | 3 | 94%—by spectra, 81%—by subject |
| Xia et al., 2021 | Tongue Cancer (tissue) | CNN-SVM | 5-fold CV | 12 subjects, 24 samples | At least 216 | Not Stated | 2 | 99.5% (100%/100%) |
| Yan et al., 2021 | Tongue Cancer (tissue) | CNN ensemble | 5-fold CV | 22 subjects, 44 samples | 2004 | Not Stated | 2 | 99% (99%/98%) |
| Yu et al., 2021 | Tongue Cancer (tissue) | CNN | 5-fold CV | 12 subjects, 24 samples | 1440 | Not Stated | 2 | 97% (99%/94%) |
| Zhang et al., 2021 | Breast Cancer | PCA-SVM | Single split | 6 cell lines | 4500 | Not Stated | 2 | 99.0% (100%/96%) |
| Zuvela et al., 2019 | Nasopharyngeal Cancer (in vivo tissue) | GA- PLS-LDA | LOOCV | 62 subjects, 113 samples | 2126 | Sample | 2 | 98% (93%/100%) |
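
The last column of the table reports accuracy together with sensitivity and specificity, or precision and recall for some studies. As a reminder of how these quantities relate, the following minimal sketch computes them from the four cells of a binary confusion matrix. The function name `binary_metrics` and the counts are hypothetical and are not taken from any of the reviewed studies.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion-matrix counts.

    tp/fn: diseased cases predicted as diseased / as healthy.
    tn/fp: healthy cases predicted as healthy / as diseased.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # fraction of all predictions that are correct
        "sensitivity": tp / (tp + fn),    # also called recall or true positive rate
        "specificity": tn / (tn + fp),    # true negative rate
        "precision": tp / (tp + fp),      # fraction of positive predictions that are correct
    }


# Hypothetical counts, chosen only for illustration.
metrics = binary_metrics(tp=45, fp=10, tn=40, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and recall are the same quantity, so rows reporting recall can be compared directly with rows reporting sensitivity, whereas precision and specificity are not interchangeable.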
Figure 2. Validation strategy used in the reviewed literature. Some studies used more than one strategy.
Figure 3. Splitting data by spectra versus by patient. When the split is made by spectra, the test set includes spectra from every patient in the training set.
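
The difference between the two splitting levels described in the Figure 3 caption can be sketched in a few lines of Python. This is an illustrative toy, not the procedure of any reviewed study: the data are placeholder `(subject_id, spectrum_index)` pairs, and the function names, the 70/30 ratio, and the subject and spectra counts are all assumptions made for the example.

```python
import random


def make_spectra(n_subjects=10, spectra_per_subject=20):
    """Placeholder data: each spectrum is just a (subject_id, spectrum_index) pair."""
    return [(s, i) for s in range(n_subjects) for i in range(spectra_per_subject)]


def split_by_spectra(spectra, test_fraction=0.3, seed=0):
    """Shuffle individual spectra, ignoring which subject they came from."""
    rng = random.Random(seed)
    shuffled = spectra[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def split_by_subject(spectra, test_fraction=0.3, seed=0):
    """Assign whole subjects to train or test, so no subject appears in both."""
    rng = random.Random(seed)
    subjects = sorted({s for s, _ in spectra})
    rng.shuffle(subjects)
    n_test = max(1, int(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [x for x in spectra if x[0] not in test_subjects]
    test = [x for x in spectra if x[0] in test_subjects]
    return train, test


def shared_subjects(train, test):
    """Subjects that contribute spectra to both the train and the test set."""
    return {s for s, _ in train} & {s for s, _ in test}


spectra = make_spectra()
train_a, test_a = split_by_spectra(spectra)
train_b, test_b = split_by_subject(spectra)
print("split by spectra, subjects in both sets:", len(shared_subjects(train_a, test_a)))
print("split by subject, subjects in both sets:", len(shared_subjects(train_b, test_b)))
```

Splitting by subject guarantees an empty overlap, whereas splitting by spectra almost always leaves the same subjects on both sides, which is the leakage the figure illustrates. In practice, libraries such as scikit-learn provide group-aware splitters (for example `GroupKFold` and `LeaveOneGroupOut`) that serve the same purpose.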
Deep versus traditional learning models. If the models were tested against various sub-sets of the data, this is given in the data subset column. Boldface text indicates the best-performing model.
| Study | Deep Model | Traditional Model | Data Subsets |
|---|---|---|---|
| 96.0% (PCA-ANN) | SK-MEL-2 (Cell lines) | | |
| 90.0% | SK-MEL-28 | | |
| 90.0% | MW-266-4 | | |
| 92.7% | All | | |
| 98.0% (PCA-LDA) | | | |
| 92.3% (ANN) | | | |
| 78.3% (PCA-QDA) | Processed, whole spectra | | |
| 90.2% | Processed, fingerprint | | |
| 86.7% | Processed, high wavenumber | | |
| 68.3% | Unprocessed, whole spectra | | |
| 61.7% | Unprocessed, fingerprint | | |
| 60.0% | Unprocessed, high wavenumber | | |
| 86.5% (SVM) | | | |
| 86.6% (PCA-LDA) | Adenocarcinoma | | |
| 82.1% | Squamous cell carcinoma | | |
| 73.6% (PLS-LDA) | All data | | |
| 83.7% | NPC vs. control | | |
| 68.4% | NPC vs. post-treatment | | |
| 52.7% (KNN) | Processed data | | |
| 42.0% (SVM) | Unprocessed data | | |
| 95.4% (PCA-SVM) | | | |
| 88.5% (PCA-SVM) | | | |
| 88.5% (SVM) | | | |