Sandip S Panesar, Rhett N D'Souza, Fang-Cheng Yeh, Juan C Fernandez-Miranda.
Abstract
BACKGROUND: Machine learning (ML) is the application of specialized algorithms to datasets for trend delineation, categorization, or prediction. ML techniques have been traditionally applied to large, highly dimensional databases. Gliomas are a heterogeneous group of primary brain tumors, traditionally graded using histopathologic features. Recently, the World Health Organization proposed a novel grading system for gliomas incorporating molecular characteristics. We aimed to study whether ML could achieve accurate prognostication of 2-year mortality in a small, highly dimensional database of patients with glioma.
Keywords: ANN, Artificial neural network; AUC, Area under the curve; CI, Confidence interval; DT, Decision tree; Diagnosis; Gliomas; LR, Logistic regression; Logistic regression; ML, Machine learning; Machine learning; NLR, Negative likelihood ratio; NPV, Negative predictive value; Neuro-oncology; PLR, Positive likelihood ratio; PPV, Positive predictive value; Prognostication; SVM, Support vector machine; WHO, World Health Organization
Year: 2019 PMID: 31218287 PMCID: PMC6581022 DOI: 10.1016/j.wnsx.2019.100012
Source DB: PubMed Journal: World Neurosurg X ISSN: 2590-1397
Figure 1. Graphical representation of traditional statistical approaches to regression, with logistic (A) and linear regression (B) on the top row. The bottom row demonstrates machine learning approaches graphically, with support vector machine (C), artificial neural network (D), and decision tree (E) approaches.
Figure 2. Diagram of study design demonstrating both approaches to machine learning (ML). Left demonstrates the raw approach, and right demonstrates the initial statistical testing of variable significance prior to ML input (feature selection). Outputs (performance) were analyzed independently and then compared. ANN, artificial neural network; DT, decision tree; LR, logistic regression; SVM, support vector machine.
Demographic and Variable Features of the Population Categorized by 2-Year Survival
| Variable | Total (N = 76) | Dead at 2 Years (n = 24) | Alive at 2 Years (n = 52) | Statistic | P value |
|---|---|---|---|---|---|
| Age (years) | 47.29 ± 16.78 | 60.48 ± 14.03 | 43.10 ± 15.85 | 4.81 | <0.05 |
| Sex | | | | | |
| Male | 37 | 14 | 23 | 3.84 | 0.25 |
| Female | 39 | 10 | 29 | | |
| Average diameter (cm) | 3.41 ± 1.61 | 3.40 | 3.42 | –0.06 | 0.95 |
| Initial intervention | | | | 10.69 | <0.05 |
| Total resection | 29 | 5 | 24 | | |
| Subtotal resection | 38 | 18 | 20 | | |
| Biopsy only | 6 | 0 | 6 | | |
| Gamma knife | 2 | 1 | 1 | | |
| None | 1 | 0 | 1 | | |
| ECOG Performance Status | | | | | |
| Preoperative score | 1.70 ± 0.67 | 1.92 | 1.60 | 1.98 | 0.05 |
| Postoperative score | 1.55 ± 0.85 | 1.92 | 1.38 | 2.44 | <0.05 |
| Adjunctive treatment | | | | 0.06 | 0.97 |
| Chemotherapy | 51 | 18 | 33 | | |
| Radiotherapy | 48 | 18 | 30 | | |
| Vaccine | 3 | 1 | 2 | | |
| Number of surgeries | | | | 0.22 | 0.64 |
| 1 | 51 | 17 | 34 | | |
| >1 | 25 | 7 | 18 | | |
| Lobe | | | | 10.16 | 0.12 |
| Frontal | 28 | 5 | 23 | | |
| Temporal | 22 | 11 | 11 | | |
| Parietal | 2 | 1 | 1 | | |
| Occipital | 2 | 0 | 2 | | |
| Brainstem | 2 | 0 | 2 | | |
| Other | 3 | 0 | 3 | | |
| Multiple | 17 | 7 | 10 | | |
| WHO grade | | | | 16.73 | <0.05 |
| 1 | 6 | 0 | 6 | | |
| 2 | 24 | 2 | 22 | | |
| 3 | 8 | 2 | 6 | | |
| 4 | 38 | 20 | 18 | | |
| Molecular features (number unknown) | | | | 23.71 | <0.05 |
| | 21 | 12 (1) | 9 (2) | | |
| PTEN deletion | 30 | 16 (1) | 14 (2) | | |
| p53 mutation | 29 | 6 (1) | 23 (2) | | |
| 1p deletion | 15 | 2 (1) | 13 (2) | | |
| 19q deletion | 19 | 5 (1) | 14 (2) | | |
| 9p(p16) deletion | 34 | 14 (1) | 20 (2) | | |
| | 24 | 3 (1) | 21 (2) | | |
| | 3 | 0 (1) | 3 (2) | | |
| MGMT methylation | 35 | 10 (1) | 25 (2) | | |
| Ki67 index | 18.80 ± 16.73 | 27.90 | 14.60 | 3.74 | <0.05 |
Values are mean ± SD, number of patients, or as otherwise indicated.
WHO, World Health Organization; MGMT, O6-methylguanine-DNA methyltransferase; PTEN, phosphatase and tensin homolog.
Statistic is either χ² (categorical variables) or t statistic (continuous variables).
Because multiple independent statistical tests were performed, P values have been adjusted via application of Bonferroni correction.
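The χ² statistics in the table can be checked directly from the listed counts. The following is a minimal sketch, not the authors' code: it computes the Pearson χ² statistic by hand for the WHO grade rows and includes a simple Bonferroni helper. The function names and the raw P values passed to the helper are illustrative assumptions; the number of tests the authors corrected for is not stated here.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column variables.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def bonferroni(p_values):
    """Multiply each P value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# WHO grade 1-4 counts from the table: [dead at 2 years, alive at 2 years].
who_grade = [[0, 6], [2, 22], [2, 6], [20, 18]]
chi2 = pearson_chi2(who_grade)
print(f"WHO grade chi-square: {chi2:.2f}")

# Hypothetical raw P values, used only to show the adjustment.
example_raw_p = [0.001, 0.02, 0.3]
print(bonferroni(example_raw_p))
```

With the counts above, the statistic agrees with the tabulated value of 16.73 (3 degrees of freedom for a 4 × 2 table).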
Performance for All Machine Learning Categories
| ANN: Raw Data | Alive | Dead | SVM: Raw Data | Alive | Dead | DT: Raw Data | Alive | Dead | LR: Raw Data | Alive | Dead |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Predicted alive | 19.13 | 6.20 | Predicted alive | 18.67 | 6.67 | Predicted alive | 17.33 | 6.40 | Predicted alive | 18.06 | 6.53 |
| Predicted dead | 4.33 | 6.26 | Predicted dead | 4.87 | 5.80 | Predicted dead | 6.20 | 6.07 | Predicted dead | 5.47 | 5.93 |
Smaller tables are 2 × 2 confusion matrices containing the averaged output variables of the 15 cycles of machine learning for each algorithm. Underneath each confusion matrix is the performance of each test, calculated from the matrix and given to 95% CI. The upper 2 rows of the tables are for the raw datasets, and the lower 2 rows of tables are for the feature selected datasets.
ANN, artificial neural network; SVM, support vector machine; DT, decision tree; LR, logistic regression; CI, confidence interval; PLR, positive likelihood ratio; NLR, negative likelihood ratio; PPV, positive predictive value; NPV, negative predictive value.
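The performance measures named in the abbreviations can be derived from a 2 × 2 confusion matrix using their standard definitions. The sketch below applies them to the averaged ANN raw-data matrix. Two assumptions are made that this excerpt does not confirm: the Alive/Dead columns are taken as the true status, and "dead" is treated as the positive class. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic test metrics from confusion-matrix cell counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "plr": sensitivity / (1 - specificity),
        "nlr": (1 - sensitivity) / specificity,
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# ANN raw-data matrix, averaged over 15 cycles (values from the table).
# Assumption: "dead" is the positive class, so
#   predicted dead  & dead  -> true positive  (6.26)
#   predicted alive & dead  -> false negative (6.20)
#   predicted dead  & alive -> false positive (4.33)
#   predicted alive & alive -> true negative  (19.13)
ann_raw = diagnostic_metrics(tp=6.26, fn=6.20, fp=4.33, tn=19.13)
for name, value in ann_raw.items():
    print(f"{name}: {value:.3f}")
```

If "alive" were the positive class instead, sensitivity and specificity would swap, as would PPV and NPV, so the choice of positive class should be confirmed against the full article before comparing numbers.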
Figure 3. A 4 × 3 array of figures demonstrating the algorithm performance using both raw data (far left column) and feature-selected (middle column) approaches. Change in performance between raw and feature-selected data is demonstrated in the far-right column. The first row shows sensitivity versus specificity performance; the second row shows positive predictive value versus negative predictive value performance; the third row shows positive likelihood ratio versus negative likelihood ratio performance; and the fourth row shows accuracy versus area under the curve performance. *Area under the curve metrics have been scaled to 100 to correlate with accuracy. ANN, artificial neural network; AUC, area under the curve; DT, decision tree; LR, logistic regression; NLR, negative likelihood ratio; NPV, negative predictive value; PLR, positive likelihood ratio; PPV, positive predictive value; SVM, support vector machine.
Receiver Operating Curve Characteristics for Uncensored and Censored Approaches
| Uncensored: Algorithm | AUC | SE | 95% CI (Lower) | 95% CI (Upper) | Feature Selected: Algorithm | AUC | SE | 95% CI (Lower) | 95% CI (Upper) |
|---|---|---|---|---|---|---|---|---|---|
| ANN | 69.19 | 9.86 | 49.86 | 88.52 | ANN | 75.40 | 6.40 | 62.90 | 87.90 |
| SVM | 71.88 | 9.43 | 53.40 | 90.36 | SVM | 74.71 | 6.50 | 62.10 | 87.40 |
| DT | 70.54 | 9.65 | 51.62 | 89.46 | DT | 63.00 | 7.10 | 49.10 | 76.90 |
| LR | 64.29 | 10.54 | 43.63 | 84.95 | LR | 72.87 | 6.60 | 60.00 | 85.80 |
Machine learning versus LR methods for 2-year mortality prognostication in a small, heterogeneous glioma database.
AUC, area under the curve; CI, confidence interval; ANN, artificial neural network; SVM, support vector machine; DT, decision tree; LR, logistic regression.
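The tabulated confidence limits appear consistent with a normal-approximation interval, AUC ± 1.96 × SE. That reading is inferred from the numbers rather than stated in this excerpt, so the sketch below is an illustration under that assumption; `wald_ci` is a hypothetical helper name. It recomputes the limits for the uncensored column and reports whether each interval includes 50, the scaled equivalent of random guessing.

```python
def wald_ci(estimate, se, z=1.96):
    """Normal-approximation interval: estimate +/- z * standard error."""
    return estimate - z * se, estimate + z * se


# (AUC, SE) pairs from the uncensored column, scaled to 100 as in the table.
uncensored = {
    "ANN": (69.19, 9.86),
    "SVM": (71.88, 9.43),
    "DT": (70.54, 9.65),
    "LR": (64.29, 10.54),
}

for algorithm, (auc, se) in uncensored.items():
    lower, upper = wald_ci(auc, se)
    includes_chance = lower <= 50.0 <= upper
    print(f"{algorithm}: {lower:.2f} to {upper:.2f}, includes 50: {includes_chance}")
```

Small differences in the second decimal place against the table are expected, because the SE values shown are themselves rounded.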
Figure 4. Receiver operating curves for raw (left) and feature-selected (right) data. Lines are color coded according to the figure legend in the bottom right corner of the graph. The dashed diagonal line is y = x, with an area under the curve of 0.5 (indicative of random guessing). The performance increase is seen in the greater distance between all curves and the line y = x. The lower right column chart demonstrates the scaled 95% confidence intervals of the area under the curve calculated for each machine learning method. Artificial neural network and logistic regression methods were not statistically different from 0.5. Support vector machine and decision tree methods were better than random guessing; however, their statistical significance was weak. Performance of algorithms using raw datasets can be compared with the chart for the feature-selected dataset in the lower right corner. There, the 95% confidence intervals for all machine learning methods are not only further from 0.5 but also narrower, indicating greater significance. ANN, artificial neural network; AUC, area under the curve; C.I., confidence interval; DT, decision tree; LR, logistic regression; ROC, receiver operating curve; SVM, support vector machine.