Jian-Man Zhu, Lei Sun, Linjing Wang, Tong-Chong Zhou, Yawei Yuan, Xin Zhen, Zhi-Wei Liao.
Abstract
OBJECTIVE: This study aimed to identify the most appropriate radiomics modeling method for predicting progression-free survival under EGFR-TKI treatment in advanced non-small cell lung cancer with EGFR mutations. The performance of different machine learning methods may vary considerably, so selecting a proper model is essential for accurate treatment outcome prediction. We established 176 discrimination models by combining 22 feature selection methods with 8 classifiers. The predictive performance of each model was evaluated using the area under the ROC curve (AUC), accuracy (ACC), sensitivity and specificity, from which the optimal model was identified.
Keywords: EGFR-TKI; Machine learning; Non-small cell lung cancer; Radiomics
Year: 2022 PMID: 35422007 PMCID: PMC9008953 DOI: 10.1186/s13104-022-06019-x
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Feature selection methods and classifiers for the top ten models
| Classifier | Feature selection method | AUC | ACC | Sensitivity | Specificity |
|---|---|---|---|---|---|
| Logistic | gini-index | 0.797 | 0.722 | 0.758 | 0.693 |
| Logistic | ll-l21 | 0.763 | 0.681 | 0.716 | 0.653 |
| Bagging | CIFE | 0.764 | 0.671 | 0.636 | 0.707 |
| Logistic | reliefF | 0.759 | 0.662 | 0.676 | 0.655 |
| Bagging | MRMR | 0.743 | 0.662 | 0.616 | 0.729 |
| Bagging | MIFS | 0.742 | 0.670 | 0.656 | 0.673 |
| Adaboosting | f-score | 0.740 | 0.671 | 0.756 | 0.593 |
| SVM | MRMR | 0.739 | 0.712 | 0.733 | 0.695 |
| SVM | CIFE | 0.734 | 0.692 | 0.713 | 0.676 |
| SVM | MIFS | 0.734 | 0.712 | 0.733 | 0.695 |
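The four columns in the table above (AUC, ACC, sensitivity, specificity) can be computed from a model's predicted scores and the true group labels. The following is a minimal sketch in plain Python, not the authors' implementation: the source does not state which software was used, and the label coding (1 = fast progression, 0 = slow progression), the 0.5 decision threshold, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def binary_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Return AUC, ACC, sensitivity and specificity for one model.

    Assumed coding (hypothetical): 1 = fast progression, 0 = slow progression.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return {
        "AUC": roc_auc(y_true, scores),  # also raises if one class is missing
        "ACC": (tp + tn) / len(y_true),
        "Sensitivity": tp / (tp + fn),   # true positive rate
        "Specificity": tn / (tn + fp),   # true negative rate
    }


# Toy example with made-up labels and scores (not study data).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
toy_metrics = binary_metrics(toy_labels, toy_scores)
```

Applying `binary_metrics` to each of the 176 feature-selection and classifier combinations and sorting by AUC would yield a ranking of the kind shown in the table. Note that AUC is threshold-free, while ACC, sensitivity and specificity depend on the chosen cut-off.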
The top ten features, with their mean ± SD (or median (IQR)) values and the p-values for the comparison between the slow- and fast-progression groups
| Feature category | Feature | Slow-progress | Fast-progress | p-value |
|---|---|---|---|---|
| Shape-based (n = 4) | Elongation | 0.76 ± 0.12 | 0.71 ± 0.13 | 0.067ᵃ |
| | Least Axis Length | 30.06 ± 11.91 | 20.53 ± 8.85 | 0.234ᵃ |
| | Flatness | 0.59 ± 0.14 | 0.55 ± 0.13 | 0.229ᵃ |
| | Major Axis Length | 52.25 (19.52, 110.61) | 46.56 (23.28, 145.60) | 0.858ᵇ |
| First-order based (n = 1) | Inter quartile range | 140 (36, 560) | 154 (42, 385) | 0.962ᵇ |
| Texture, GLSZM (n = 1) | Small Area Emphasis | 0.68 ± 0.04 | 0.70 ± 0.03 | 0.003ᵃ |
| Texture, GLCM (n = 1) | Difference variance | 14.59 ± 6.63 | 18.16 ± 8.71 | 0.024ᵃ |
| Clinical based (n = 3) | Smoke | – | – | 0.238ᶜ |
| | Mutation | – | – | 0.030ᶜ |
| | Outcome | – | – | 0.000ᶜ |

ᵃ t-test
ᵇ Mann–Whitney U test
ᶜ Chi-square test
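Two of the three tests named in the footnotes can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the authors' analysis: the Mann–Whitney p-value uses the large-sample normal approximation without tie or continuity correction, the chi-square test assumes a 2×2 contingency table without continuity correction (the source does not give the table dimensions), and all numbers in the usage lines are made up. The t-test is omitted because its p-value needs the Student t distribution, which the standard library does not provide.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U statistic for x, approximate p-value). No tie or
    continuity correction is applied, so this is only a rough sketch.
    """
    n1, n2 = len(x), len(y)
    # U counts pairs where the x value exceeds the y value (ties count 0.5).
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return u, p


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Hypothetical inputs: feature values per group, and counts such as
# smokers / non-smokers in the slow- and fast-progression groups.
u_stat, u_p = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
chi2_stat, chi2_p = chi_square_2x2(10, 20, 20, 10)
```

For small samples an exact Mann–Whitney test is usually preferable to the normal approximation used here.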
Fig. 1 A Time-dependent ROC curve of the “gini-index-Logistic regression” model using the top-10 features at 10 months. B Kaplan–Meier survival curves of EGFR-positive NSCLC patients. The p-values were calculated using the log-rank test
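For readers unfamiliar with the methods behind panel B, the Kaplan–Meier estimator and the two-group log-rank test can be sketched as follows. This is a minimal standard-library illustration, not the authors' code: the function names, the event coding (1 = progression observed, 0 = censored), and the toy follow-up times are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if progression was observed at times[i], 0 if censored.
    """
    curve = []
    surv = 1.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1.0 - d / at_risk  # product-limit update
        curve.append((t, surv))
    return curve


def logrank_test(
    times_a: Sequence[float],
    events_a: Sequence[int],
    times_b: Sequence[float],
    events_b: Sequence[int],
) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square statistic, p-value, 1 df)."""
    times = list(times_a) + list(times_b)
    events = list(events_a) + list(events_b)
    observed_a = expected_a = variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        n = sum(1 for u in times if u >= t)        # at risk, both groups
        n_a = sum(1 for u in times_a if u >= t)    # at risk, group A
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e == 1)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance is undefined for n == 1
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("log-rank variance is zero; groups cannot be compared")
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square tail, 1 df
    return chi2, p_value


# Toy follow-up times in months (made up, not study data).
toy_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
toy_chi2, toy_p = logrank_test([1, 2], [1, 1], [3, 4], [1, 1])
```

Censored patients stay in the risk set until their last follow-up time but never trigger a drop in the survival curve, which is why the estimator only updates at observed progression times.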