Leon Jekel1,2,3, Waverly R Brim1,4, Marc von Reppert1, Lawrence Staib1, Gabriel Cassinelli Petersen1, Sara Merkaj1, Harry Subramanian1, Tal Zeevi1, Seyedmehdi Payabvash1, Khaled Bousabarah5, MingDe Lin1,6, Jin Cui1, Alexandria Brackett7, Amit Mahajan1, Antonio Omuro8, Michele H Johnson1, Veronica L Chiang9,10, Ajay Malhotra1, Björn Scheffler2,3, Mariam S Aboian1.
Abstract
Glioma and brain metastasis can be difficult to distinguish on conventional magnetic resonance imaging (MRI) due to the similarity of imaging features in specific clinical circumstances. Multiple studies have investigated the use of machine learning (ML) models for non-invasive differentiation of glioma from brain metastasis. Many of the studies report promising classification results; however, to date, none have been implemented into clinical practice. After screening 12,470 studies, we included 29 eligible studies in our systematic review. From each study, we aggregated data on model design, development, and best classifiers, as well as quality of reporting according to the TRIPOD statement. In a subset of eligible studies, we conducted a meta-analysis of the reported AUC. Data predominantly originated from single-center institutions (n = 25/29), and only two studies performed external validation. The median TRIPOD adherence was 0.48, indicating insufficient quality of reporting among surveyed studies. Our findings illustrate that despite promising classification results, reliable model assessment is limited by poor reporting of study design and by a lack of algorithm validation and generalizability. Therefore, adherence to quality guidelines and validation on outside datasets are critical for the clinical translation of ML for the differentiation of glioma and brain metastasis.
Keywords: artificial intelligence; brain metastasis; glioblastoma; glioma; machine learning; reporting quality assessment; systematic review
Year: 2022 PMID: 35326526 PMCID: PMC8946855 DOI: 10.3390/cancers14061369
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Figure 1. Characterization of search strategy using PRISMA. This flowchart represents the search and screening workflow and the eligibility criteria applied to the studies. BM = brain metastasis.
Overview of study characteristics and best performing classifier from each study. Abbreviations: GBM = Glioblastoma; MET = Brain metastasis; PCNSL = Primary central nervous system lymphoma; MEN = Meningioma; MED = Medulloblastoma; CV = Cross-validation; LOOCV = Leave-One-Out cross-validation; ML = Machine learning; DL = Deep learning; T1CE = contrast-enhanced T1-weighted sequence; DWI = Diffusion weighted imaging; DTI = Diffusion tensor imaging; PWI = Perfusion weighted imaging; rCBV = relative cerebral blood volume; FLAIR = Fluid-attenuated inversion recovery; TE = Time to echo; AUC = Area under the receiver operating characteristic curve; ADC = Apparent diffusion coefficient; LASSO = Least absolute shrinkage and selection operator; SVM = Support vector machine; MLP = Multilayer perceptron; NNW = Neural networks; LogReg = Logistic Regression; DNN = Deep neural network; LDA = Linear discriminant analysis; NB = Naïve Bayes; VFI = Voting feature intervals; KNN = k-nearest neighbors; PNN = Probabilistic neural networks; RF = Random Forest; RBF = Radial basis function kernel; n/a = not available.
| Paper | Total Patient Number | Number of Glioma Patients | Number of BM Patients | Ratio of Glioma/met Patients | Solitary BM Only | GBM Only | Tumor Types Studied | Number of Additional Tumors | Number of Patients (Training) | Number of Patients (Validation) | Testing | External Validation | Source of Data | ML Method | Algorithms Used for Classification | Gold Standard for Accuracy | Extracted Feature Types | Best Performing Classifier |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Swinburne et al., 2019 [ | 26 | 9 | 9 | 1.000 | no | yes | GBM vs. MET vs. PCNSL | 8 (PCNSL) | LOOCV | no | single-center | ML | SVM, MLP | Pathology | Perfusion | MLP (Ktrans on T1CE mask) | ||
| Park et al., 2020 [ | 276 | 137 | 59 | 2.322 | no | yes | GBM vs. MET vs. PCNSL | 80 (PCNSL) | 216 (109 GBM, 58 PCNSL, 49 MET) | 60 (28 GBM, 22 PCNSL, 10 MET) | no | multi-center | DL | CNN | Pathology | Perfusion (Temporal Patterns of Time-Signal | CNN (DSC, FLAIR, T1CE)—internally validated | |
| Shrot et al., 2019 [ | 141 | 41 | 38 | 1.079 | no | yes | GBM vs. MET vs. PCNSL vs. MEN | 12 (PCNSL), 50 (Meningioma) | LOOCV | no | single-center | ML | Decision tree (SVM) | Pathology | Morphology, Diffusion, Perfusion | Binary hierarchical tree with SVM classifier (T1, T1c, T2, FLAIR, DTI, DSC) | ||
| Yamashita et al., 2008 [ | 126 | 95 | 19 | 5.000 | multiple | no | Glioma vs. MET vs. PCNSL | 12 (PCNSL) | LOOCV | no | not specified | ML | 3-layered NNW | Pathology | Clinical, Qualitative/Semantic imaging features | ANN | ||
| Blanchet et al., 2011 [ | 33 | 18 | 15 | 1.200 | solitary | yes | GBM vs. MET | LOOCV | no | single-center | ML | k-means clustering | Pathology | Shape | k-means clustering (T1, T2) | |||
| Tsolaki et al., 2013 [ | 49 | 35 | 14 | 2.500 | solitary | yes | GBM vs. MET | 10-fold CV | no | single-center | ML | SVM, Naive Bayes, KNN | Pathology | Spectroscopy | SVM (MRS: NAA; rCBV)—peritumoral | |||
| Yang et al., 2014 [ | 48 | 30 | 18 | 1.667 | solitary | yes | GBM vs. MET | LOOCV | no | single-center | ML | QDA, NB, SVM, KNN, NNW (MLP architecture) | Pathology | Shape, Diffusion | Neural Network (DTI) | |||
| Tateishi et al., 2020 [ | 127 | 73 | 53 | 1.377 | multiple, largest selected for classification | yes | GBM vs. MET | 5-fold CV | no | single-center | ML | SVM | Pathology, clinical history of path-proven primary cancer | Texture | SVM (T1CE, T2, ADC) | |||
| Abidin et al., 2019 [ | 52 | 35 | 17 | 2.059 | solitary | yes | GBM vs. MET | stratified 10-fold CV | no | single-center | ML | AdaBoost | Pathology | First-order statistics, Texture, Higher-order-features: Topology (Minkowski functionals), Wavelet-transformed, Local Binary Patterns (LBP) | AdaBoost (Local Binary Pattern, T1CE) | |||
| Bae et al., 2020 [ | 248 | 159 | 89 | 1.787 | solitary | yes | GBM vs. MET | 166 (109 GBM, 57 MET) | 82 (50 GBM, 32 MET) | yes | single-center | ML and DL | DNN, | Pathology | DL extracted (DL) | Deep Neural Network (T1CE)—internal | ||
| Artzi et al., 2019 [ | 439 | 212 | 227 | 0.934 | solitary | yes | GBM vs. MET vs. MET-subtypes | 5-fold CV | no | single-center | ML | SVM, KNN, decision trees, ensemble classifiers | Pathology | Clinical features, Qualitative/semantic imaging features, Morphology, First-order statistics, Texture, Higher-order features: Wavelet features, Bag-of-features | SVM (T1CE) | |||
| Yang et al., 2016 [ | 48 | 30 | 18 | 1.667 | solitary | yes | GBM vs. MET | LOOCV | no | single-center | ML | SVM | Pathology | Shape | SVM (DTI, Cluster 1 & 4) | |||
| Dong et al., 2020 [ | 120 | 60 | 60 | 1.000 | solitary | n/a | Glioma vs. MET | 84 (42 GBM, 42 MET) | 36 (18 GBM, 18 MET) | no | single-center | ML | NNW, DT, NB, KNN, SVM | Radiological | Shape, First-order statistics, Texture | Naive Bayes (T1, T1CE, T2) | ||
| Meier et al., 2020 [ | 109 | 25 | 84 | 0.298 | 231 lesions in 109 patients | yes | GBM vs. MET | stratified 3-fold CV | no | single-center | ML | SVM | Pathology | Qualitative/Semantic imaging features | SVM (Qualitative image features) | |||
| Georgiadis et al., 2008 [ | 67 | 21 | 19 | 1.105 | no | no | Glioma vs. MET vs. MEN | 27 (Meningioma) | external cross-validation (ECV) with 3-fold split | no | single-center | ML | PNN, LSFT-PNN, SVM-RBF, ANN, Cubic LSFT-PNN, Quadratic LSFT-PNN | Radiological | Texture | ANN (T1)—Primary tumors vs. Secondary tumors (MET + Meningioma) | ||
| Tsolaki et al., 2015 [ | 126 | 80 | 22 | 3.636 | solitary | no | Glioma vs. MET vs. MEN | 24 (Meningioma) | 10-fold cross validation | no | single-center | ML | SVM, Naïve Bayes, k-NN, LDA | Pathology | Spectroscopy, Diffusion, Perfusion | SVM (DWI/DTI/PWI/short TE)—peritumoral | ||
| Zacharaki et al., 2009 [ | 98 | 74 | 24 | 3.083 | no | no | Glioma vs. MET vs. MEN | 4 (Meningioma) | LOOCV | no | single-center | ML | SVM, k-NN, LDA | Pathology | Shape, First-order statistics, Texture | SVM (FLAIR, T2, T1ce, rCBV, T1) | ||
| Zacharaki et al., 2011 [ | 97 | 73 | 23 | 3.174 | no | no | Glioma vs. MET vs. MEN | LOOCV | no | single-center | ML | VFI, KNN, Naive Bayes | Pathology | Clinical, Shape, First-order | kNN with wrapper evaluator | |||
| Svolos et al., 2013 [ | 115 | 73 | 18 | 4.056 | solitary | no | Glioma vs. MET vs. MEN | 24 (atypical Meningioma) | 10-fold cross validation | no | single-center | ML | SVM | Pathology | Diffusion, Perfusion | SVM (HGG Grade 4 vs. MET) (ADC, FA, rCBV)—peritumoral | ||
| Sachdeva et al., 2016 [ | 428 | 177 | 66 | 2.682 | no | no | Glioma vs. MET vs. MEN vs. MED | 97 (Meningioma), 88 (Medulloblastoma) | 40% training, 10% testing, 50% validation | 40% training, 10% testing, 50% validation | 40% training, 10% testing, 50% validation | no | public dataset | ML | GA, GA-SVM, GA-ANN | Radiological | First-order statistics, Texture | GA-ANN—no binary classification |
| Payabvash et al., 2020 [ | 248 | 99 | 65 | 1.523 | no | no | Glioma vs. MET vs. MED vs. Hemangioblastoma vs. Ependymoma | Hemangioblastoma (n = 44), Ependymoma (n = 27), Medulloblastoma (n = 26). | 10-fold cross validation | no | single center | ML | NB, RF, NN, SVM | Pathology | Clinical (Age), Qualitative/Semantic imaging features, Diffusion | Random Forest—MET vs. All primary tumors | ||
| Qin et al., 2019 [ | 42 | 24 | 18 | 1.333 | solitary | yes | GBM vs. MET | 5-fold cross validation | no | single center | ML | Decision trees, LDA, LogReg, linear SVM, KNN | Pathology | First-order, Second-order (Energy) | kNN | |||
| Chen et al., 2019 [ | 134 | n/a | n/a | no | yes | GBM vs. MET | 80% | 20% | no | single center | ML | LDA, SVM, RF, KNN, Gaussian NB | Pathology | Texture | LogReg + Distance correlation | |||
| Ortiz-Ramón et al., 2020 [ | 100 | 50 | 50 | 1.000 | no | yes | GBM vs. MET | nested cross-validation | no | single center | ML | random forest (RF), support vector machine (SVM) with linear kernel, k-nearest neighbors (KNN), naïve Bayes (NB) and multilayer perceptron (MLP) | Radiological | Texture | MLP | |||
| Shin et al., 2021 [ | 741 | 482 | 259 | 1.861 | solitary | yes | GBM vs. MET | 450 | 48 | 100 | 143 | multi-center | DL | CNN (2D) | Pathology | DL extracted | CNN (2D; T1CE, T2)—internal | |
| Priya et al., 2021 [ | 120 | 60 | 60 | 1.000 | no | yes | GBM vs. MET | nested cross-validation | no | single center | ML | Linear (LASSO, Elastic Net) and logistic regression, NNW, SVM- MLP, RF, AdaBoost | Clinico-Radiological | Shape, First-order statistics, Texture | LASSO (T1, T1CE, T2, FLAIR, ADC) | |||
| de Causans et al., 2021 [ | 180 | 92 | 88 | 1.045 | multiple, largest selected for classification | yes | GBM vs. MET | 143 (71 GBM, 72 BM) | nested cross-validation (10 repeated 5-fold CV) | 37 (21, 16) | no | multi-center | ML | LogReg (Yeo-Johnson scaling features) | Pathology | Shape, First-order statistics, Texture | LogReg (T1CE) | |
| Liu et al., 2021 [ | 268 | 140 | 128 | 1.094 | solitary | yes | GBM vs. MET | 208 (110 GBM, 98 BM) | 10-fold cross validation | 60 (30, 30) | no | single center | ML | RF, DT, LogReg, | Pathology | Shape, First-order statistics, Texture, Higher-order: Wavelet-transformed, Laplace of Gaussian | Random Forest (Boruta selection) (T1CE) | |
| Samani et al., 2021 [ | 136 | 86 | 50 | 1.720 | no, 3 patients with multifocal metastasis | yes | GBM vs. MET | 108 (66 GBM, 40 BM) | 5-fold cross validation | 30 (20, 10) | no | single center | DL | 2D CNN | Pathology | Diffusion | CNN (2D, DTI, FW-VP map)—patch wise | |
Figure 2. Source of datasets. The chart displays the distribution of types of datasets, from which MRI scans were derived, among the different studies. Note how the majority (89%) of studies trained and validated on single-center data.
Figure 3. Class distribution of gliomas and brain metastases (left) and total number of patients (right) in each study. The left-hand panel shows the ratio of glioma to brain metastasis patients in each dataset. The dotted line indicates equal class distribution, i.e., class balance. The right-hand panel shows the total number of patients in each study. Note that most studies were trained and validated on datasets with fewer than 200 patients.
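The class ratio plotted in the left-hand panel corresponds to the "Ratio of Glioma/met Patients" column of the table. As a minimal sketch, assuming the ratio is simply the glioma count divided by the brain metastasis count and rounded to three decimals, the values for three rows of the table can be reproduced as follows. The function name `class_ratio` and the dictionary layout are illustrative choices, not part of the original study.

```python
# (glioma patients, brain metastasis patients), copied from the table
counts = {
    "Swinburne et al., 2019": (9, 9),
    "Park et al., 2020": (137, 59),
    "Meier et al., 2020": (25, 84),
}


def class_ratio(n_glioma: int, n_bm: int) -> float:
    """Glioma-to-metastasis ratio; 1.0 means the two classes are balanced.

    Assumption: plain division rounded to three decimals, which matches
    the values printed in the table for the rows used here.
    """
    return round(n_glioma / n_bm, 3)


ratios = {paper: class_ratio(*pair) for paper, pair in counts.items()}

for paper, ratio in ratios.items():
    # Values above 1 mean glioma is the majority class, below 1 the minority.
    if ratio > 1:
        side = "glioma-heavy"
    elif ratio < 1:
        side = "metastasis-heavy"
    else:
        side = "balanced"
    print(f"{paper}: {ratio:.3f} ({side})")
```

A ratio far from 1.0 signals class imbalance, in which case accuracy alone can be a misleading summary of classifier performance.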
Figure 4. Most frequently reported performance metrics for the best performing classifier from each study included in this data aggregation (n = 26). The lines indicate the mean of each metric; all means reached a high level. Note that not every metric was reported for every classifier, so the number of data points differs between metrics.
Figure 5. Bar graph of TRIPOD adherence index, a measure for degree of satisfaction for each TRIPOD item applicable in model development studies. Item 11 (Risk groups) was applicable in none of the studies at hand. Item 21 (Supplementary information) is not shown, as it is not included in overall scoring according to official guidelines. Item labels are as follows: 1—Title, 2—Abstract, 3a—Background, 3b—Objectives, 4a—Source of data (Study design), 4b—Source of data (Study dates), 5a—Participants (Study setting), 5b—Eligibility criteria, 5c—Participants (Treatments received), 6a—Outcome (Definition), 6b—Outcome (Blind assessment), 7a—Predictors (Definition), 7b—Predictors (Blind assessment), 8—Sample size, 9—Missing data, 10a—Statistical analysis (Predictors), 10b—Statistical analysis (Model development), 10d—Statistical analysis (Model evaluation), 13a—Flow of participants, 13b—Participant characteristics, 14a—Number of participants and outcomes, 14b—Model development (Predictor and outcome association), 15a—Full model specification, 15b—Model explanation, 16—Model performance, 18—Limitations, 19b—Interpretations, 20—Implications, 22—Funding. Note that items 15a (Full model specification) and 16 (Model performance) are among the TRIPOD items with lowest adherence in the surveyed studies despite having a paramount role for model reproducibility and successful translation to the clinic.
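The record does not spell out how the adherence index was calculated. The sketch below shows one plausible reading, assuming each item is scored 1 (satisfied), 0 (not satisfied), or marked as not applicable, and that non-applicable items are excluded from the denominator. The scoring sheet, study names, and function names are hypothetical and do not reproduce the review's data.

```python
from statistics import median

# Hypothetical scoring sheet (NOT data from the review):
# 1 = item satisfied, 0 = not satisfied, None = not applicable.
scores = {
    "study_A": {"1": 1, "2": 1, "11": None, "15a": 0, "16": 1},
    "study_B": {"1": 1, "2": 1, "11": None, "15a": 0, "16": 0},
    "study_C": {"1": 0, "2": 1, "11": None, "15a": 0, "16": 0},
}


def study_adherence(items):
    """Share of applicable items that one study satisfies."""
    applicable = [v for v in items.values() if v is not None]
    return sum(applicable) / len(applicable) if applicable else None


def item_adherence(sheet, item):
    """Share of studies satisfying an item, among studies where it applies."""
    values = [s[item] for s in sheet.values() if s.get(item) is not None]
    return sum(values) / len(values) if values else None


per_study = {name: study_adherence(items) for name, items in scores.items()}
median_adherence = median(per_study.values())

print(per_study)
print("median adherence:", median_adherence)
print("item 15a adherence:", item_adherence(scores, "15a"))
print("item 11 adherence:", item_adherence(scores, "11"))  # None: never applicable
```

Under this reading, the per-item values correspond to the bars in the figure, and the median of the per-study values corresponds to the overall figure reported in the abstract.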