Shigao Huang1, Jie Yang2,3,4, Simon Fong5,6, Qi Zhao7.
Abstract
This study aimed to identify the optimal prognostic index for brain metastases using machine learning. Seven hundred cancer patients with brain metastases were enrolled and divided into a training cohort (n = 446) and a testing cohort (n = 254). Seven features and seven prediction methods were selected to evaluate prognostic performance for each patient. Mutual information and rough set with particle swarm optimization (MIRSPSO) predicted patient prognosis with the highest accuracy, reaching an area under the curve (AUC) of 0.978 ± 0.06. In terms of AUC, MIRSPSO was 1.72%, 1.29%, and 1.83% higher than the traditional statistical method, sequential feature selection (SFS), mutual information with particle swarm optimization (MIPSO), and mutual information with sequential feature selection (MISFS), respectively. Furthermore, the clinical performance of the best-performing method was superior to that of conventional statistical methods in accuracy, sensitivity, and specificity. In conclusion, identifying optimal machine-learning methods for predicting overall survival in brain metastases is essential for clinical applications. The accuracy of machine learning was markedly higher than that of conventional statistical methods.
Keywords: artificial intelligence; brain metastases; prognosis index; radiosurgery
Year: 2019 PMID: 31395825 PMCID: PMC6721536 DOI: 10.3390/cancers11081140
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Figure 1. Overall survival curves for the four prognostic indices. RPA, Recursive Partitioning Analysis; GPA, Graded Prognostic Assessment; SIR, Score Index for Radiosurgery; BSBM, Basic Score for Brain Metastases; and OS, overall survival curve for all patients.
Cancer features in patients and their letter labels.
| Cancer Feature | Label |
|---|---|
| Age | A |
| Karnofsky Performance Status | K |
| Extracranial metastasis | E |
| Primary tumor control | P |
| Number of lesions | N |
| Max lesion volume | M |
| Chemotherapy (yes/no) | C |
Importance ranking of cancer features in patients by different methods.
| No. | Methods | Cancer Features in Patients | | | | | | |
|---|---|---|---|---|---|---|---|---|
| 1 | SFS | C | A | E | K | P | M | N |
| 2 | MISFS | A | P | E | K | C | N | M |
| 3 | MIPSO | P | E | A | C | K | N | M |
| 4 | MIRSPSO | P | C | A | E | N | K | M |
| | | P | A | C | E | N | K | M |
| | | P | E | C | A | N | K | M |
| | | P | A | E | C | K | N | M |
| | | P | A | E | C | K | N | M |
| | | A | P | C | E | K | N | M |
| | | P | E | A | C | N | K | M |
| | | P | A | E | C | N | K | M |
| | | P | A | E | C | N | K | M |
SFS: Sequential Feature Selection; MISFS: Sequential Feature Selection with mutual information; MIPSO: mutual information with particle swarm optimization; MIRSPSO: mutual information and rough set with particle swarm optimization.
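Several of the methods above rank features by mutual information with the outcome. The record does not give the authors' implementation, so the following is only a minimal sketch of that one ingredient, assuming discrete features and a binary outcome. The function names and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), expressed with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(features, outcome):
    """Return (labels sorted by decreasing MI with the outcome, MI scores)."""
    scores = {name: mutual_information(col, outcome)
              for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Hypothetical toy data (NOT from the study): binary features and outcome.
toy_outcome = [1, 1, 1, 1, 0, 0, 0, 0]
toy_features = {
    "P": [1, 1, 1, 1, 0, 0, 0, 0],  # identical to the outcome
    "A": [1, 1, 1, 0, 0, 0, 0, 1],  # partly informative
    "M": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the outcome
}
ranking, scores = rank_features(toy_features, toy_outcome)
print(ranking, scores)
```

The sketch covers only the mutual-information score; the rough-set and particle swarm optimization steps named in the table are not reproduced here.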
Comparison of importance ranking of cancer features in patients for different weights in MIRSPSO.
| No. | Cancer Features in Patients | | | | | | | Weights | |
|---|---|---|---|---|---|---|---|---|---|
| 1 | P | C | A | E | N | K | M | 0.9 | 0.1 |
| | 0.9074 | 0.8535 | 0.7286 | 0.6303 | 0.4771 | 0.3896 | 0.3247 | | |
| 2 | P | A | C | E | N | K | M | 0.8 | 0.2 |
| | 0.9278 | 0.8634 | 0.8021 | 0.6935 | 0.455 | 0.4011 | 0.3546 | | |
| 3 | P | A | E | C | N | K | M | 0.7 | 0.3 |
| | 0.9322 | 0.8734 | 0.7559 | 0.6765 | 0.4834 | 0.3724 | 0.3618 | | |
| 4 | P | A | E | C | K | N | M | 0.6 | 0.4 |
| | 0.8921 | 0.8105 | 0.765 | 0.6267 | 0.4933 | 0.4 | 0.3911 | | |
| 5 | P | A | E | C | K | N | M | 0.5 | 0.5 |
| | 0.9213 | 0.7705 | 0.7068 | 0.703 | 0.4267 | 0.4201 | 0.3692 | | |
| 6 | A | P | C | E | K | N | M | 0.4 | 0.6 |
| | 0.7183 | 0.6437 | 0.5902 | 0.5559 | 0.4757 | 0.43 | 0.3783 | | |
| 7 | P | E | A | C | N | K | M | 0.3 | 0.7 |
| | 0.891 | 0.9211 | 0.7948 | 0.7774 | 0.4469 | 0.351 | 0.3 | | |
| 8 | P | A | E | C | N | K | M | 0.2 | 0.8 |
| | 0.9335 | 0.91 | 0.8283 | 0.7632 | 0.4528 | 0.3439 | 0.3267 | | |
| 9 | P | A | E | C | N | K | M | 0.1 | 0.9 |
| | 0.9546 | 0.8973 | 0.7751 | 0.6748 | 0.4486 | 0.3382 | 0.36 | | |
Figure 2. Distribution of degrees of importance for different features in patients.
Figure 3. Heat map illustrating the predictive performance (AUC) of the relationship between cancer features and degree of importance in patients. P (primary tumor control) had the highest degree of importance.
Figure 4. Predictive performance of the constructed classifier in the test cohort. (A) Heat map depicting the prognostic performance (AUC) of feature selection methods (in rows) and classification methods (in columns). It shows the mean AUC of all crossed methods, along with the standard deviation (SD) and the 0.25 and 0.75 quantiles. MIRSPSO+RF and MIRSPSO+SVM had the highest AUC values. (B) Confusion matrix of the MIRSPSO+RF classifier.
Figure 5. (A) Box plots of the AUC for the four methods. (B) Receiver operating characteristic (ROC) curve of the optimal classifier: the 5-fold cross-validated ROC curve of the optimal MIRSPSO + support vector machine (SVM) classifier in the test cohort.
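AUC is the metric used throughout these figures. As a reminder of what it measures, here is a minimal sketch that computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the toy labels and scores are hypothetical; this is not the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy example (NOT study data).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(toy_labels, toy_scores))  # 3 of 4 pairs ordered correctly -> 0.75
```

The pairwise form is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based formulation would be preferable for much larger data.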
Comparison of predictive performance between machine-learning and statistical methods in the test cohort.
| Statistical Method | RPA | GPA | SIR | BSBM | Overall Survival | Machine Learning |
|---|---|---|---|---|---|---|
| MST | 19 | 26 | 25 | 23 | 24 | - |
| Sensitivity | 0.67 | 0.71 | 0.59 | 0.88 | - | 0.92 |
| Specificity | 0.39 | 0.33 | 0.43 | 0.46 | - | 0.85 |
| Accuracy | 0.682 | 0.655 | 0.611 | 0.758 | - | 0.885 |
| PPV | - | - | - | - | 0.83 | 0.86 |
| NPV | - | - | - | - | 0.65 | 0.914 |
| P | <0.0001 a | <0.0001 a | <0.0001 a | <0.000 a | - | - |
a Chi-square test. Abbreviations: MST, median survival time (months); PPV, positive predictive value; NPV, negative predictive value.
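The sensitivity, specificity, accuracy, PPV, and NPV reported above are all derived from the four cells of a binary confusion matrix. The sketch below shows the standard definitions. The function name and the example counts are hypothetical and are not the counts behind this table.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts (NOT from the study).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are conditioned on the true outcome, whereas PPV and NPV are conditioned on the prediction, so the latter two depend on how common the outcome is in the cohort.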
The characteristics and demographics of the patients.
| Characteristics | | N (%) |
|---|---|---|
| Patients | | 700 |
| Gender | Male | 456 (65.1%) |
| | Female | 244 (34.9%) |
| Age (years) | Median | 55 |
| | Range | 48 (16–92) |
| KPS | Median | 75 |
| | Range | 30 (55–95) |
| Primary tumor type | NSCLC | 635 (90.7%) |
| | Breast cancer | 57 (8.1%) |
| | Other | 8 (1.2%) |
| Primary tumor control | No | 319 (45.6%) |
| | Yes | 381 (54.4%) |
| Number of lesions | Median | 3 |
| | Range | 5 (1–6) |
| Tumor volume (mL) | Median | 6 |
| | Range | 40.4 (0.04–49) |
| Maximum diameter (mm) | <10 | 28 (4%) |
| | 10–20 | 189 (27%) |
| | 21–30 | 245 (35%) |
| | 31–40 | 205 (29.3%) |
| | >40 | 33 (4.7%) |
| Type of therapy | SRS | 225 (32.1%) |
| | Fractionated SRS WBRT | 127 (18.1%) |
| | SBRT | 133 (19%) |
| | Surgical resection | 197 (28.1%) |
| Extracranial metastasis | No | 470 (67.1%) |
| | Yes | 230 (33.9%) |
| Histology classification | Adenocarcinoma | 170 (24.3%) |
| | Squamous cell carcinoma | 120 (17.1%) |
| | Large cell carcinoma | 87 (12.4%) |
| | In situ carcinoma | 195 (27.9%) |
| | Invasive carcinoma | 128 (18.3%) |
| Molecular classification | EGFR | 23 (3.3%) |
| | KRAS | 17 (2.4%) |
| | BRAF | 4 (0.6%) |
| | TP53 | 9 (1.3%) |
| | ALK | 15 (2.1%) |
| | TNBC | 8 (1.1%) |
| | HER2 | 9 (1.3%) |
| | PTEN | 7 (1.0%) |
| Pattern of dissemination | Blood | 361 (51.6%) |
| | Lymph | 255 (36.4%) |
| | Others | 84 (12%) |
KPS: Karnofsky Performance Status; NSCLC: Non-small Cell Lung Cancer; SRS: Stereotactic Radiosurgery; WBRT: Whole Brain Radiotherapy; SBRT: Stereotactic Body Radiation Therapy; EGFR: Epidermal Growth Factor Receptor; KRAS: K-Ras gene; BRAF: B-Raf gene; TP53: TP53 gene; ALK: Anaplastic lymphoma kinase; TNBC: Triple-negative breast cancer; HER2: human epidermal growth factor receptor 2; PTEN: Phosphatase and tensin homolog.
Figure 6. The relationship among the methods used in this study.