Wenzheng Sun, Mingyan Jiang, Jun Dang, Panchun Chang, Fang-Fang Yin.
Abstract
BACKGROUND: To investigate the effect of machine learning methods on predicting overall survival (OS) in non-small cell lung cancer based on radiomics feature analysis.
Keywords: Machine learning; Non-small cell lung cancer; Overall survival; Radiomics analysis
Year: 2018 PMID: 30290849 PMCID: PMC6173915 DOI: 10.1186/s13014-018-1140-9
Source DB: PubMed Journal: Radiat Oncol ISSN: 1748-717X Impact factor: 3.481
Fig. 1 Flow chart of the prediction process for each ML method. (I) Dividing the total data into three folds using cross-validation. (II) Training each ML model using the selected radiomics features of the training folds. (III) Validating the prediction performance of each ML model on the validation fold
Fig. 2 Radiomics features used in this study. The definitions of the radiomics features can be found in the IBSI document [26]. (I) Intensity features (1–4): sections 3.4.19, 3.4.18, 3.3.4 and 3.3.3; (II) Fine texture features (5–26): sections 3.6.20, 3.6.23, 3.6.22, 3.6.21, 3.6.12, 3.6.19, 3.6.7, 3.6.5, 3.6.11, 3.6.4, 3.6.14, 3.6.16, 3.6.24, 3.6.25, 3.6.17, 3.6.15, 3.6.18, 3.6.1, 3.6.8, 3.6.10, 3.6.9 and 3.6.3; (III) Coarse texture features (27–37): sections 3.7.1, 3.7.2, 3.7.9, 3.7.11, 3.7.13 and 3.7.3–3.7.8; (IV) Morphological features: sections 3.1.5, 3.1.6, 3.1.8, 3.1.7, 3.1.3 and 3.1.1
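The three-fold procedure in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the random shuffling, and the seed are assumptions, and the model training step is left out.

```python
import random

def three_fold_indices(n_patients, n_folds=3, seed=0):
    """Shuffle patient indices and deal them into n_folds roughly equal folds."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)  # assumed: random assignment to folds
    return [indices[k::n_folds] for k in range(n_folds)]

def cross_validation_splits(n_patients, n_folds=3, seed=0):
    """Yield (training indices, validation indices) once per fold."""
    folds = three_fold_indices(n_patients, n_folds, seed)
    for k, validation in enumerate(folds):
        # The training set is every fold except the current validation fold.
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield sorted(training), sorted(validation)

splits = list(cross_validation_splits(n_patients=10))
for training, validation in splits:
    print(len(training), len(validation))
```

Each patient appears in exactly one validation fold, so the per-fold predictions can be pooled into a single merged validation set for scoring.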
The specifics of the packages for each feature selection and machine learning method
| Methods | Software | Packages | Website Links |
|---|---|---|---|
| PCC | SML toolbox | corr | |
| KCC | SML toolbox | corr | |
| SCC | SML toolbox | corr | |
| MI | MIToolbox | mi | |
| CI | Hmisc | rcorr.cens | |
| Cox | survival | coxph | |
| GB-Cox | mboost | mboost | |
| GB-Cindex | mboost | mboost | |
| CoxBoost | CoxBoost | CoxBoost | |
| BST | ipred | bagging | |
| RFS | randomForestSRC | rfsrc | |
| SR | survival | survreg | |
| SVCR | survivalsvm | survivalsvm | |

SML: statistics and machine learning
Fig. 3 The performance of feature selection and machine learning methods on the merged validation fold
Maximum concordance index (CI), with its confidence interval (CFI), for each machine learning method on the merged validation fold
| Methods | FS | Maximum CI | CFI of Maximum CI |
|---|---|---|---|
| GB-Cox | CI | 0.682 | [0.620, 0.744] |
| CoxBoost | CI | 0.674 | [0.615, 0.731] |
| Cox | MI | 0.646 | [0.578, 0.714] |
| GB-Cindex | SCC | 0.357 | [0.290, 0.423] |
| RFS | PCC | 0.627 | [0.558, 0.695] |
| SR | MI | 0.380 | [0.310, 0.452] |
| BST | SCC | 0.385 | [0.318, 0.450] |
| SVCR | KCC | 0.405 | [0.341, 0.470] |
FS: feature selection method
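To make the CI column easier to read, here is a minimal sketch of a concordance index for right-censored data. It assumes the convention that a higher risk score should go with a shorter survival time; the study itself used `rcorr.cens`, whose conventions may differ, and the data below are invented for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs ranked correctly by risk score.

    A pair is comparable when the patient with the shorter time had an
    observed event. Tied risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy data: survival time, event flag (1 = death observed), risk score.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.1]

c = concordance_index(times, events, risk)
c_negated = concordance_index(times, events, [-r for r in risk])
print(c, c_negated)
```

Under this definition, 0.5 corresponds to random ranking. A value below 0.5 means the score orders patients in the opposite direction on average, and with no tied scores, negating the score gives 1 minus the original value.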
The range of parameter tuning
| Methods | Parameters | Range of Parameters |
|---|---|---|
| Cox | | |
| GB-Cox | Number of boosting steps | [1, 500] |
| GB-Cindex | Number of boosting steps | [1, 500] |
| CoxBoost | Number of boosting steps | [1, 500] |
| BST | Minsplit | [1, 10] |
| | Number of trees | [1, 500] |
| RFS | Average terminal node size of forest | [1, 10] |
| | Number of trees | [1, 500] |
| SR | Assumed distribution | Weibull, Gaussian, Exponential |
| SVCR | Parameter of regularization | [0.01, 1] |
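The table gives ranges but not the search strategy or step size. As one possible reading, the sketch below enumerates an exhaustive integer grid over the two RFS ranges and keeps the best-scoring setting. The parameter names `node_size` and `n_trees` and the scoring function are hypothetical placeholders; in practice the score would be a validation metric such as the concordance index.

```python
from itertools import product

# Integer grids taken from the RFS rows of the table (assumed step size of 1).
RFS_RANGES = {
    "node_size": range(1, 11),   # average terminal node size, [1, 10]
    "n_trees": range(1, 501),    # number of trees, [1, 500]
}

def parameter_grid(ranges):
    """Yield every combination of parameter values as a dict."""
    names = list(ranges)
    for values in product(*(ranges[name] for name in names)):
        yield dict(zip(names, values))

def best_setting(ranges, score):
    """Return the parameter combination with the highest score."""
    return max(parameter_grid(ranges), key=score)

def toy_score(params):
    """Placeholder score that peaks at node_size=3, n_trees=200."""
    return -(abs(params["node_size"] - 3) + abs(params["n_trees"] - 200))

n_candidates = sum(1 for _ in parameter_grid(RFS_RANGES))
best = best_setting(RFS_RANGES, toy_score)
print(n_candidates, best)
```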
Fig. 4 Examples of the Kaplan-Meier evaluations. All the NSCLC patients in each validation fold were stratified into low- and high-risk groups based on the cut-off values determined from the corresponding training fold. Panels (a), (b) and (c) show the Kaplan-Meier curves of the three CV validation folds, respectively
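The stratification described in the caption can be sketched as below. The caption does not say how the cut-off was derived from the training fold, so the median of the training risk scores is used here purely as an assumption, and all data values are invented for illustration.

```python
from statistics import median

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve

def stratify(risk_scores, cutoff):
    """Label each patient 'high' if the risk score exceeds the cut-off, else 'low'."""
    return ["high" if s > cutoff else "low" for s in risk_scores]

# Hypothetical training-fold risk scores; the cut-off is assumed to be their median.
training_scores = [0.2, 0.5, 0.7, 0.9, 0.4]
cutoff = median(training_scores)

# Hypothetical validation fold: risk score, survival time (months), event flag.
validation_scores = [0.1, 0.8, 0.6, 0.3]
validation_times = [30, 6, 12, 24]
validation_events = [0, 1, 1, 1]

groups = stratify(validation_scores, cutoff)
curves = {}
for label in ("low", "high"):
    members = [i for i, g in enumerate(groups) if g == label]
    curves[label] = kaplan_meier(
        [validation_times[i] for i in members],
        [validation_events[i] for i in members],
    )
print(groups, curves)
```

Fixing the cut-off on the training fold keeps the validation patients out of the threshold choice, which is the point the caption makes.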