José Marcio Luna, Hann-Hsiang Chao, Russel T Shinohara, Lyle H Ungar, Keith A Cengel, Daniel A Pryma, Chidambaram Chinniah, Abigail T Berman, Sharyn I Katz, Despina Kontos, Charles B Simone, Eric S Diffenderfer
Abstract
Keywords: Chemoradiation; Intensity-modulated radiation therapy; Machine learning; Non-small cell lung cancer; Proton beam therapy; Radiation esophagitis; Radiation-induced toxicity
Year: 2020 PMID: 32274426 PMCID: PMC7132156 DOI: 10.1016/j.ctro.2020.03.007
Source DB: PubMed Journal: Clin Transl Radiat Oncol ISSN: 2405-6308
Fig. 1. Multivariate analysis workflow. Diagram of the workflow in which the input data undergo stepwise resampling to estimate model performance for prediction of radiation esophagitis.
Summary of categorical patient characteristics. Description of clinical characteristics of the cohort with their respective categorization and percentages.
| Categorical Predictors | Classes | Number of Patients | Percentage (%) |
|---|---|---|---|
| Sex | Male | 90 | 44.6 |
| | Female | 112 | 55.4 |
| Smoking History | Former | 136 | 67.3 |
| | Current | 26 | 12.9 |
| | Never | 17 | 8.4 |
| | Not Available | 23 | 11.4 |
| Ethnicity | White | 137 | 67.8 |
| | Black | 47 | 23.3 |
| | Asian | 4 | 2.0 |
| | Other | 14 | 6.9 |
| Pre Treatment ECOG Performance Status | 0 | 77 | 38.1 |
| | 1 | 55 | 27.2 |
| | 2 | 14 | 6.9 |
| | 3 | 2 | 1.0 |
| | 4 | 2 | 1.0 |
| | Not Recorded | 52 | 25.7 |
| Stage Grouping | IIB | 1 | 0.5 |
| | IIIA | 120 | 58.9 |
| | IIIB | 81 | 40.6 |
| Tumor Stage | Tx | 15 | 7.4 |
| | T1 | 51 | 25.2 |
| | T2 | 63 | 31.2 |
| | T3 | 32 | 15.8 |
| | T4 | 41 | 20.3 |
| Nodal Stage | Nx | 7 | 3.5 |
| | N0 | 9 | 4.5 |
| | N1 | 12 | 5.9 |
| | N2 | 126 | 62.4 |
| | N3 | 48 | 23.8 |
| Histology | Adenocarcinoma | 202 | 100.0 |
| Radiation Modality | Photon (IMRT) | 181 | 89.6 |
| | Proton | 21 | 10.4 |
| Chemotherapy | Concurrent | 176 | 86.6 |
| | Sequential | 21 | 10.4 |
| | None | 5 | 3.0 |
| Chemotherapy Agents | Carboplatin-based Doublet | 104 | 51.5 |
| | Cisplatin-based Doublet | 70 | 34.7 |
| | Platinum-based Triplet | 6 | 3.0 |
| | Single Agent | 2 | 1.0 |
| | Other | 20 | 9.9 |
Summary of numerical patient characteristics. Description of numerical characteristics of the cohort with their respective median and interquartile ranges.
| Continuous Predictors | Median | Range ‡ |
|---|---|---|
| Age (yr) | 64 | (56–73) |
| Pack-Year (current/former smokers) | 35 | (14.5–50) |
| BMI (kg/m²) | 26.0 | (23.0–30.0) |
| Radiation Dose Delivered (Gy) | 66.6 | (64.8–66.6) |
| Dose per fraction (Gy) | 1.8 | (1.8–1.8) |
| Esophagus Mean Dose (Gy) | 24.5 | (18.3–31.9) |
| Esophagus Maximum Dose (Gy) | 69.4 | (65.4–72.4) |
‡ Interquartile range.
Fig. 2. Feature correlation heat map. Heat map illustrating the Pearson correlation between the continuous features under study.
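As a rough illustration of what a heat map like this displays, the sketch below computes pairwise Pearson correlations in plain Python. The feature names and values in `toy` are hypothetical and are not taken from the study cohort; the helper names `pearson_r` and `correlation_matrix` are likewise assumptions made for this example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of centered cross-products and squares.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant feature")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(features):
    """Nested dict of pairwise correlations, one cell per heat-map tile."""
    names = list(features)
    return {a: {b: pearson_r(features[a], features[b]) for b in names}
            for a in names}


# Hypothetical toy values, NOT patient data from the study.
toy = {
    "eso_mean": [18.0, 24.5, 31.9, 27.0, 21.0],
    "eso_v60": [5.0, 12.0, 25.0, 18.0, 9.0],
    "bmi": [30.0, 26.0, 23.0, 28.0, 25.0],
}
matrix = correlation_matrix(toy)
```

Each entry `matrix[a][b]` lies in [-1, 1]; the diagonal is 1 and the matrix is symmetric, which is why such heat maps are often drawn as a single triangle.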
Univariate analysis. Predictive performance of individual features by AUC analysis, with significance assessed by the Wilcoxon rank-sum test for continuous features and a corresponding test for categorical features. None of the features predicts grade ≥ 3 RE at a significance level that survives Bonferroni correction for multiple comparisons.
| Feature | AUC § | P-value |
|---|---|---|
| T Stage | 0.41 (0.28,0.55) | 0.07 |
| Lung V20 | 0.39 (0.27,0.52) | 0.09 |
| BMI | 0.61 (0.48,0.72) | 0.09 |
| Pack Years | 0.40 (0.29,0.52) | 0.12 |
| Concurrent v Sequential | 0.57 (0.55,0.60) | 0.15 |
| Eso V60 | 0.59 (0.47,0.70) | 0.16 |
| Lung Mean | 0.41 (0.30,0.54) | 0.16 |
| Total Dose | 0.42 (0.31,0.55) | 0.18 |
| Heart Mean | 0.58 (0.43,0.71) | 0.23 |
| Agents Drugs | 0.43 (0.33,0.54) | 0.25 |
| Heart V30 | 0.57 (0.43,0.70) | 0.28 |
| Pre Treatment ECOG | 0.56 (0.43,0.68) | 0.31 |
| Sex | 0.56 (0.44,0.65) | 0.32 |
| Eso Max | 0.44 (0.33,0.56) | 0.33 |
| Eso V50 | 0.56 (0.44,0.67) | 0.34 |
| Ethnicity | 0.45 (0.37,0.58) | 0.34 |
| Eso V40 | 0.56 (0.43,0.67) | 0.38 |
| N Stage | 0.55 (0.45,0.62) | 0.38 |
| Heart V5 | 0.55 (0.41,0.69) | 0.42 |
| Heart V60 | 0.55 (0.41,0.67) | 0.43 |
| Best CS AJCC Stage | 0.43 (0.32,0.54) | 0.44 |
| Heart V50 | 0.55 (0.41,0.67) | 0.48 |
| Age at Diagnosis | 0.46 (0.36,0.57) | 0.52 |
| Lung V10 | 0.46 (0.33,0.58) | 0.52 |
| Nr of Fractions | 0.46 (0.37,0.59) | 0.55 |
| Eso Mean | 0.54 (0.41,0.66) | 0.55 |
| Grade Differentiation | 0.45 (0.36,0.56) | 0.56 |
| Fraction Size | 0.53 (0.41,0.61) | 0.57 |
| PFT DLCO pred | 0.47 (0.36,0.59) | 0.63 |
| Linac | 0.51 (0.46,0.62) | 0.66 |
| Proton | 0.49 (0.38,0.54) | 0.66 |
| Lung V5 | 0.47 (0.34,0.60) | 0.67 |
| PFT Pre Bronch Actual FEV1 L | 0.52 (0.40,0.63) | 0.76 |
| Primary Tumor Long Dim cm | 0.48 (0.35,0.61) | 0.78 |
| ECOG 3 mo Post-RT | 0.49 (0.37,0.61) | 0.85 |
§ Estimate with 95% confidence interval.
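To make the multiple-comparison argument concrete, the sketch below computes an AUC from pairwise score comparisons (the normalized Mann-Whitney U statistic, which underlies the Wilcoxon rank-sum test) and a Bonferroni-corrected threshold. The family-wise level `ALPHA = 0.05` is an assumption, since the threshold is not legible in the caption; the count of 35 tests matches the 35 features listed in the table. The function names are hypothetical.

```python
def auc_from_pairs(scores_pos, scores_neg):
    """AUC as the probability that a random positive case scores higher
    than a random negative case; ties count as one half."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


ALPHA = 0.05      # assumed family-wise level, not stated in the source
N_FEATURES = 35   # number of features listed in the univariate table

threshold = bonferroni_threshold(ALPHA, N_FEATURES)

# Smallest P-value in the table (T Stage, P = 0.07).
smallest_p = 0.07
survives_correction = smallest_p < threshold
```

Under these assumptions the per-test threshold is 0.05/35, roughly 0.0014, so the smallest reported P-value of 0.07 is far from significant. An AUC below 0.5 in the table indicates that higher feature values are associated with the negative class rather than with no association at all.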
Multivariate analysis. Combined predictive performance of features using 11 statistical models in three variants: a) all 35 handcrafted features, b) BSFS, and c) BSFS and SMOTE. None of the models predicts grade ≥ 3 RE at a significance level that survives Bonferroni correction for multiple comparisons.
| Experiment | Algorithm | AUC § | BACC § | P-value |
|---|---|---|---|---|
| All 35 Features | Logistic Regression | 0.58 (0.27,0.88) | 0.58 (0.29,0.87) | 0.09 |
| | Linear Discriminant | 0.57 (0.21,0.93) | 0.57 (0.25,0.89) | 0.30 |
| | Linear SVM | 0.56 (0.22,0.90) | 0.49 (0.38,0.60) | 0.50 |
| | Elastic Net | 0.52 (0.17,0.87) | 0.47 (0.28,0.66) | 0.88 |
| | RUSBoost | 0.52 (0.07,0.96) | 0.56 (0.23,0.88) | 0.62 |
| | k-NN | 0.50 (0.20,0.79) | 0.53 (0.27,0.79) | 0.72 |
| | Quadratic SVM | 0.49 (0.08,0.89) | 0.53 (0.19,0.88) | 0.74 |
| | Random Forest | 0.46 (0.10,0.82) | 0.50 (0.50,0.50) | 0.55 |
| | Quadratic Discriminant | 0.45 (0.09,0.80) | 0.48 (0.14,0.82) | 0.27 |
| | CART | 0.44 (0.17,0.71) | 0.47 (0.33,0.60) | 0.32 |
| | Gaussian SVM | 0.40 (0.03,0.77) | 0.50 (0.48,0.51) | 0.12 |
| BSFS | Logistic Regression | 0.61 (0.41,0.81) | 0.54 (0.31,0.78) | 0.26 |
| | Linear Discriminant | 0.59 (0.30,0.88) | 0.52 (0.34,0.70) | 0.25 |
| | Linear SVM | 0.57 (0.50,0.64) | 0.50 (0.46,0.54) | 0.54 |
| | Random Forest | 0.56 (0.22,0.90) | 0.53 (0.35,0.71) | 0.35 |
| | k-NN | 0.53 (0.19,0.86) | 0.56 (0.25,0.87) | 0.71 |
| | Elastic Net | 0.52 (0.17,0.87) | 0.47 (0.28,0.66) | 0.88 |
| | Quadratic SVM | 0.50 (0.14,0.85) | 0.50 (0.23,0.78) | 0.84 |
| | RUSBoost | 0.49 (0.05,0.93) | 0.49 (0.18,0.81) | 0.76 |
| | Quadratic Discriminant | 0.48 (0.09,0.88) | 0.50 (0.23,0.77) | 0.66 |
| | Gaussian SVM | 0.46 (0.13,0.78) | 0.52 (0.43,0.61) | 0.73 |
| | CART | 0.40 (0.20,0.60) | 0.46 (0.41,0.52) | 0.25 |
| BSFS and SMOTE | Elastic Net | 0.61 (0.16,1.00) | 0.63 (0.24,1.00) | 0.07 |
| | Linear Discriminant | 0.58 (0.17,0.98) | 0.55 (0.21,0.89) | 0.39 |
| | Logistic Regression | 0.55 (0.15,0.95) | 0.54 (0.18,0.90) | 0.42 |
| | Linear SVM | 0.50 (0.14,0.86) | 0.51 (0.34,0.68) | 0.85 |
| | Quadratic SVM | 0.49 (0.12,0.86) | 0.52 (0.13,0.90) | 0.92 |
| | RUSBoost | 0.49 (0.12,0.85) | 0.50 (0.18,0.82) | 0.73 |
| | Random Forest | 0.48 (0.12,0.83) | 0.53 (0.26,0.80) | 0.73 |
| | k-NN | 0.47 (0.10,0.85) | 0.51 (0.19,0.83) | 0.61 |
| | Gaussian SVM | 0.43 (0.02,0.84) | 0.49 (0.22,0.76) | 0.33 |
| | CART | 0.42 (0.01,0.83) | 0.48 (0.21,0.76) | 0.35 |
| | Quadratic Discriminant | 0.41 (0.12,0.69) | 0.48 (0.45,0.51) | 0.54 |
§ Estimate with 95% confidence interval.
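Assuming BACC denotes balanced accuracy (the record does not expand the abbreviation), the sketch below shows how it is computed as the mean of sensitivity and specificity, and why it suits a rare outcome better than plain accuracy. The labels are hypothetical toy data, not study results, and the function name is an assumption made for this example.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced toy labels: 1 event among 10 cases.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

# A classifier that always predicts "no event".
y_majority = [0] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_majority)) / len(y_true)
bacc_majority = balanced_accuracy(y_true, y_majority)
```

In this toy case plain accuracy is 0.9 while balanced accuracy is 0.5, the chance level. Under the stated assumption, a table entry such as 0.50 (0.50,0.50) would be consistent with a model that assigns every case to one class, although the record itself does not confirm this.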