| Literature DB >> 35201534 |
Abstract
BACKGROUND: In radiomic studies, several models are often trained with different combinations of feature selection methods and classifiers. The features of the best model are usually considered relevant to the problem, and they represent potential biomarkers. Features selected from statistically similarly performing models are generally not studied. To understand the degree to which the selected features of these statistically similar models differ, 14 publicly available datasets, 8 feature selection methods, and 8 classifiers were used in this retrospective study. For each combination of feature selection and classifier, a model was trained, and its performance was measured with AUC-ROC. The best-performing model was compared to other models using a DeLong test. Models that were statistically similar were compared in terms of their selected features.
Keywords: Biomarkers; Feature relevance; Feature selection; Machine learning; Radiomics
Year: 2022 PMID: 35201534 PMCID: PMC8873309 DOI: 10.1186/s13244-022-01170-2
Source DB: PubMed Journal: Insights Imaging ISSN: 1869-4101
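The study design described in the abstract amounts to a simple experimental loop: for every feature-selection/classifier pair, fit a model, score it with AUC-ROC, and compare each model against the best one. The sketch below illustrates that loop with scikit-learn; the specific selectors, classifiers, cutoffs, and cross-validation setup are illustrative assumptions, not the authors' exact configuration, and the DeLong comparison is only referenced here (a self-contained sketch of it appears after the statistical-equivalence table further down).

```python
# Illustrative sketch of the "all feature-selection x classifier combinations" loop
# from the abstract. Methods, parameters, and CV setup are assumptions, not the
# paper's exact protocol.
from itertools import product

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif, SelectFromModel
from sklearn.linear_model import LogisticRegression, Lasso
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_predict, StratifiedKFold
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for a radiomics dataset (many features, few samples).
X, y = make_classification(n_samples=200, n_features=300, n_informative=10, random_state=0)

selectors = {
    "anova": SelectKBest(f_classif, k=20),
    "lasso": SelectFromModel(Lasso(alpha=0.05)),
}
classifiers = {
    "logreg": LogisticRegression(max_iter=1000),
    "rf": RandomForestClassifier(n_estimators=250, random_state=0),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for (s_name, sel), (c_name, clf) in product(selectors.items(), classifiers.items()):
    pipe = Pipeline([("select", sel), ("clf", clf)])
    # Out-of-fold probabilities so every model is scored on the same held-out samples.
    proba = cross_val_predict(pipe, X, y, cv=cv, method="predict_proba")[:, 1]
    results[(s_name, c_name)] = roc_auc_score(y, proba)

best = max(results, key=results.get)
print("best model:", best, "AUC-ROC = %.3f" % results[best])
# Each remaining model would then be compared to the best one with a DeLong test
# on the paired held-out scores (see the DeLong sketch further below).
```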
Overview of the datasets used for the study
| Dataset | N | d | N/d | Outcome balance [%] | Modality | Tumor type | Software for feature extraction | Feature selection and classifier | DOI |
|---|---|---|---|---|---|---|---|---|---|
| Arita2018 | 168 | 685 | 0.25 | 66 | MRI | Brain | Inhouse | LASSO and LASSO | |
| Carvalho2018 | 262 | 118 | 2.22 | 59 | FDG-PET | NSCLC | Inhouse | LASSO and Cox regression | |
| Hosny2018A (HarvardRT) | 293 | 1005 | 0.29 | 54 | CT | NSCLC | Pyradiomics | mRMR and random forest | |
| Hosny2018B (Maastro) | 211 | 1005 | 0.21 | 28 | CT | NSCLC | Pyradiomics | mRMR and random forest | |
| Hosny2018C (Moffitt) | 183 | 1005 | 0.18 | 73 | CT | NSCLC | Pyradiomics | mRMR and random forest | |
| Ramella2018 | 91 | 243 | 0.37 | 55 | CT | NSCLC | Inhouse | Random forest for both | |
| Lu2019 | 213 | 658 | 0.32 | 43 | CT | Ovarian cancer | Inhouse | Univariate and LASSO + Cox | |
| Sasaki2019 | 138 | 588 | 0.23 | 49 | MRI | Brain | Inhouse | Super PCA and LASSO | |
| Toivonen2019 | 100 | 7106 | 0.01 | 80 | MRI | Prostate cancer | Inhouse | Logistic regression for both | |
| Keek2020 | 273 | 1323 | 0.21 | 40 | CT | HNSCC | Inhouse | Univariate Concordance Index and Cox regression as well as random survival forest | |
| Li2020 | 51 | 397 | 0.13 | 63 | MRI | Glioma | Artificial Intelligence Kit, GE Healthcare | LASSO + Mann–Whitney-U + correlation and logistic regression | |
| Park2020 | 768 | 941 | 0.82 | 24 | US | Thyroid cancer | Inhouse | LASSO for both | |
| Song2020 | 260 | 265 | 0.98 | 49 | MRI | Prostate cancer | Pyradiomics | ANOVA, RFE, relief and 10 classifiers | |
| Veeraraghavan2020 | 150 | 201 | 0.75 | 31 | DCE-MRI | Breast | Inhouse | No feature selection and random forest | |
For reproducibility reasons, only publicly available datasets were used. The sample size is denoted by N and the number of features by d, which corresponds to the dimension of the data; the N/d column gives their ratio. Outcome balance denotes the percentage of events in the outcome used. The software used to extract the features as well as the feature selection and classifier methods are reported as stated in the corresponding study. Finally, DOI denotes the identifier of the publication corresponding to the dataset
Overview of all feature selection methods used
| Feature selection | Type | Hyperparameters |
|---|---|---|
| ANOVA | Filtering | – |
| Bhattacharyya distance | Filtering | – |
| ExtraTrees | Wrapper | – |
| Fast correlation-based filtering (FCBF) | Filtering | – |
| Kendall correlation | Filtering | – |
| LASSO | Wrapper | Regularization parameter, fixed at |
| Mutual information (MIM) | Filtering | – |
| Minimum redundancy maximum relevance ensemble (MRMRe) | Filtering | Number of ensembles, fixed at 5 |
Filtering methods assign a score to each feature directly, while wrapper methods use a classifier
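To make the filter/wrapper distinction above concrete, the hedged sketch below scores the same features two ways with scikit-learn: an ANOVA F-test filter that ranks each feature on its own, and a LASSO-based selection whose nonzero coefficients define the selected set. The k = 20 cutoff, the alpha value, and the synthetic data are arbitrary illustrations, not the study's settings.

```python
# Filter-style vs. model-based feature selection on the same data
# (all parameters are illustrative, not taken from the study).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif, mutual_info_classif
from sklearn.linear_model import Lasso

X, y = make_classification(n_samples=150, n_features=100, n_informative=8, random_state=1)

# Filter methods: score every feature independently of any classifier.
f_scores, _ = f_classif(X, y)                            # ANOVA F statistic per feature
mi_scores = mutual_info_classif(X, y, random_state=1)    # mutual information per feature
top_by_anova = np.argsort(f_scores)[::-1][:20]

# Model-based selection: a fitted LASSO decides which features survive.
lasso = Lasso(alpha=0.05).fit(X, y)
selected_by_lasso = np.flatnonzero(lasso.coef_)

print("ANOVA top-20:", sorted(top_by_anova))
print("LASSO kept %d features" % selected_by_lasso.size)
print("overlap:", len(set(top_by_anova) & set(selected_by_lasso)))
```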
Overview of all classifiers used during training
| Classifier | Hyperparameters |
|---|---|
| Linear discriminant analysis (LDA) | – |
| Linear SVM | Regularization parameter |
| Logistic regression | Regularization parameter, |
| Naive Bayes | – |
| Neural network (three layers) | Neurons in layer 1, 2, 3 in {4, 16, 64} |
| Random forest | Number of trees in {50, 250, 500} |
| Radial basis function-SVM (RBF-SVM) | Regularization parameter, |
| XGBoost | Learning rate in {0.001, 0.1, 0.3, 0.9}, number of estimators in {50, 250, 500} |
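The hyperparameter columns above translate naturally into search grids. The sketch below encodes only the grids the table states explicitly (random forest tree counts and the neural network's per-layer neuron choices) as scikit-learn parameter grids; the cross-validation setup and AUC-ROC scoring are assumed examples, and the truncated regularization ranges for the SVMs and logistic regression are deliberately left out rather than guessed.

```python
# Grid search over the hyperparameters that the classifier table states explicitly;
# CV setup and scoring below are illustrative assumptions.
from itertools import product

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

X, y = make_classification(n_samples=200, n_features=50, random_state=0)

grids = {
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 250, 500]},
    ),
    "neural_network": (
        MLPClassifier(max_iter=2000, random_state=0),
        # three hidden layers, each with 4, 16, or 64 neurons, as in the table
        {"hidden_layer_sizes": list(product([4, 16, 64], repeat=3))},
    ),
    # The XGBoost grid from the table would be
    # {"learning_rate": [0.001, 0.1, 0.3, 0.9], "n_estimators": [50, 250, 500]}
    # and needs the separate xgboost package.
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, (estimator, grid) in grids.items():
    search = GridSearchCV(estimator, grid, scoring="roc_auc", cv=cv)
    search.fit(X, y)
    print(name, search.best_params_, "AUC-ROC = %.3f" % search.best_score_)
```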
Fig. 1 Graphical overview of the predictive performance of all models. The AUC-ROC values of all computed models were plotted for all datasets. Models that could not be shown to be statistically different from the best model were marked in cyan, while those that were worse were marked in orange
Counts of how many models were statistically not different from the best model for each dataset, sorted by the AUC-ROC of the best model
| Dataset | AUC-ROC of best model | Number of statistically equivalent models |
|---|---|---|
| Song2020 | 0.98 | 9 |
| Li2020 | 0.89 | 34 |
| Toivonen2019 | 0.87 | 45 |
| Arita2018 | 0.83 | 18 |
| Ramella2018 | 0.81 | 23 |
| Lu2019 | 0.76 | 27 |
| Hosny2018B | 0.73 | 26 |
| Hosny2018C | 0.69 | 14 |
| Keek2020 | 0.69 | 56 |
| Carvalho2018 | 0.67 | 52 |
| Sasaki2019 | 0.67 | 61 |
| Park2020 | 0.65 | 48 |
| Hosny2018A | 0.64 | 53 |
| Veeraraghavan2020 | 0.64 | 42 |
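The "statistically not different" counts above come from pairwise DeLong tests against the best model of each dataset. A compact, self-contained sketch of DeLong's test for two correlated ROC curves (same test samples, two score vectors), following the standard structural-components formulation, is given below; it is not the authors' code, and the toy scores at the end are invented for illustration.

```python
# Sketch of DeLong's test for comparing two correlated AUC-ROCs (same samples,
# two models' scores). Standard structural-components formulation; not the
# authors' implementation.
import numpy as np
from scipy.stats import norm


def delong_test(y_true, scores_a, scores_b):
    """Return (auc_a, auc_b, two-sided p-value) for H0: AUC_a == AUC_b."""
    y_true = np.asarray(y_true)
    pos = [np.asarray(s)[y_true == 1] for s in (scores_a, scores_b)]
    neg = [np.asarray(s)[y_true == 0] for s in (scores_a, scores_b)]
    m, n = len(pos[0]), len(neg[0])

    # Heaviside kernel: 1 if the positive scored higher, 0.5 on ties, 0 otherwise.
    def psi(x, y):
        return (x[:, None] > y[None, :]) + 0.5 * (x[:, None] == y[None, :])

    aucs, v10, v01 = [], [], []
    for p, q in zip(pos, neg):
        k = psi(p, q)                  # m x n matrix of pairwise comparisons
        aucs.append(k.mean())          # AUC is the mean of the kernel
        v10.append(k.mean(axis=1))     # structural components over positives
        v01.append(k.mean(axis=0))     # structural components over negatives

    s10 = np.cov(np.vstack(v10))       # 2x2 covariance across the two models
    s01 = np.cov(np.vstack(v01))
    var = (s10[0, 0] + s10[1, 1] - 2 * s10[0, 1]) / m \
        + (s01[0, 0] + s01[1, 1] - 2 * s01[0, 1]) / n
    z = (aucs[0] - aucs[1]) / np.sqrt(var)
    return aucs[0], aucs[1], 2 * norm.sf(abs(z))


# Toy usage: two models' scores for the same 8 samples (invented numbers).
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
a = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.9, 0.65, 0.95])
b = np.array([0.2, 0.3, 0.45, 0.6, 0.55, 0.85, 0.7, 0.9])
print(delong_test(y, a, b))
```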
Fig. 2 Histogram of the Pearson correlation between all features. The “Normal” histogram was obtained by creating a dummy dataset that only contained independent and normally distributed features and serves as a reference
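The reference curve in Fig. 2 is straightforward to reproduce in spirit: compute all pairwise Pearson correlations of the feature matrix and of an equally sized matrix of independent standard-normal features, then compare the two histograms. The sketch below uses synthetic stand-in data and matplotlib; it is an illustration of the construction described in the caption, not the authors' plotting code.

```python
# Pairwise Pearson correlations of (stand-in) features vs. an independent-normal
# dummy dataset of the same shape, as the reference histogram in Fig. 2 is described.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Stand-in "radiomics" feature matrix with correlated columns (150 samples x 200 features).
X = rng.normal(size=(150, 200)) @ rng.normal(size=(200, 200))
dummy = rng.normal(size=X.shape)   # independent, normally distributed reference features

def upper_triangle_correlations(a):
    r = np.corrcoef(a, rowvar=False)           # feature-by-feature correlation matrix
    return r[np.triu_indices_from(r, k=1)]     # keep each pair once, drop the diagonal

plt.hist(upper_triangle_correlations(X), bins=50, alpha=0.6, density=True, label="features")
plt.hist(upper_triangle_correlations(dummy), bins=50, alpha=0.6, density=True, label="Normal (reference)")
plt.xlabel("Pearson correlation")
plt.legend()
plt.show()
```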
Fig. 3 Relation of feature stability with the number of selected features
Fig. 4 Analyzed measures for all statistically similar models. The range is given in parentheses; the color of each cell corresponds to the stated measure
Fig. 5 Association of the mean AUC-ROC for all statistically similar models with (a) the number of equivalent models, (b) stability, (c) similarity and (d) correlation. Each point represents one dataset