Seung-Hak Lee, Hwan-Ho Cho, Ho Yun Lee, Hyunjin Park.
Abstract
BACKGROUND: Radiomics suffers from poor feature reproducibility. We studied the variability of radiomics features and their relationship with tumor size and shape to determine guidelines for an optimal radiomics study.
Keywords: Computed tomography; Feature reproducibility; Guideline for multi-center analysis; Precision medicine; Radiomics
Year: 2019 PMID: 31349872 PMCID: PMC6660971 DOI: 10.1186/s40644-019-0239-z
Source DB: PubMed Journal: Cancer Imaging ISSN: 1470-7330 Impact factor: 3.909
Fig. 1 Overall design for Experiment 1. (a) Feature extraction and the 1st selection step, in which we selected features with ICC ≥ 0.7. (b) In the 2nd selection step, we applied LASSO to select features that can explain nodule status. (c) The selected features were used to train a RF classifier to classify nodule status. The classifier was later tested in a test cohort
Fig. 2 Overall design for Experiment 2. (a) Feature extraction and the 1st selection step, in which we selected features with ICC ≥ 0.7. In this process, we found that both histogram- and ISZM-based features have ICC ≥ 0.9, so we fixed the histogram- and ISZM-based features to the default bin settings. (b) In the 2nd selection step, we applied LASSO to select features that can explain nodule status. (c) The selected features were used to train a RF classifier to classify nodule status. The classifier was later tested in a test cohort
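The 1st selection step in both designs keeps only features whose ICC reaches 0.7. The captions do not say which ICC form was used, so the following is only a minimal sketch that assumes a one-way random-effects ICC(1,1). The function names and the toy measurements are hypothetical and are not taken from the study.

```python
from statistics import mean


def icc_oneway(ratings):
    """One-way random-effects ICC(1,1).

    `ratings` is a list of rows, one row per subject (e.g. nodule),
    each row holding the same feature measured under k settings.
    ASSUMPTION: the ICC form is not stated in the source.
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # repeated measurements per subject
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    # Mean squares from a one-way ANOVA: between and within subjects.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (v - m) ** 2 for row, m in zip(ratings, row_means) for v in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def select_reproducible(features, threshold=0.7):
    """Keep the names of features whose ICC is at least `threshold`."""
    return [
        name for name, ratings in features.items()
        if icc_oneway(ratings) >= threshold
    ]


# Hypothetical toy data: 4 nodules, each feature measured under 2 settings.
features = {
    "stable_feature": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.1]],
    "unstable_feature": [[1.0, 4.0], [2.0, 1.0], [3.0, 0.5], [4.0, 2.5]],
}
selected = select_reproducible(features)
print(selected)
```

In this toy data only the feature whose repeated measurements agree closely passes the 0.7 threshold. A real analysis would choose the ICC form to match the study design.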
Classification performance of test set using RF for two voxel settings (Experiment 1)
| | Original voxel setting | Isotropic voxel setting |
|---|---|---|
| Area under curve | 0.6967 | 0.6587 |
| Accuracy | 0.7250 | 0.7000 |
| Sensitivity | 0.9000 | 0.9000 |
| Specificity | 0.4333 | 0.3667 |
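Accuracy is a class-size-weighted average of sensitivity and specificity, which gives a quick consistency check on the table above. The class sizes are not stated in this section. The sketch below assumes 50 positive and 30 negative test cases, a split inferred only because it reproduces the reported accuracies, so treat it as an assumption rather than a reported fact.

```python
def accuracy_from_rates(sensitivity, specificity, n_pos, n_neg):
    """Accuracy implied by sensitivity, specificity, and class sizes."""
    true_pos = sensitivity * n_pos
    true_neg = specificity * n_neg
    return (true_pos + true_neg) / (n_pos + n_neg)


# ASSUMPTION: 50 positive and 30 negative test cases (inferred, not stated).
N_POS, N_NEG = 50, 30

original = accuracy_from_rates(0.9000, 0.4333, N_POS, N_NEG)
isotropic = accuracy_from_rates(0.9000, 0.3667, N_POS, N_NEG)
print(f"original voxel setting:  {original:.4f}")
print(f"isotropic voxel setting: {isotropic:.4f}")
```

Under this assumed split the implied accuracies agree with the reported 0.7250 and 0.7000 to within rounding. The check also shows that the high accuracy is driven mainly by the larger positive class, given the low specificity.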
Fig. 3 Performance curve of the RF classifier in the test set. (a) Receiver operating characteristic (ROC) curve of the original voxel setting. (b) ROC curve of the isotropic voxel setting
Classification performance of test set using RF for different GLCM bin settings (Experiment 2)
| | 32 bins | 64 bins | 128 bins |
|---|---|---|---|
| Area under curve | 0.7333 | 0.7297 | 0.7480 |
| Accuracy | 0.7250 | 0.7250 | 0.7375 |
| Sensitivity | 0.8800 | 0.8600 | 0.9000 |
| Specificity | 0.4667 | 0.5000 | 0.4667 |
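The bin setting controls how intensities are quantized before the GLCM is built, which is why GLCM features can change between 32, 64, and 128 bins. The sketch below illustrates the idea on a tiny image with 2 and 4 bins. It assumes equal-width bins over the observed intensity range and a single horizontal, symmetric offset. The study's actual quantization scheme and offsets are not given in this section, and all names here are hypothetical.

```python
import math


def quantize(values, n_bins):
    """Map intensities to bin indices 0..n_bins-1 (equal-width bins).

    ASSUMPTION: bins span the observed min..max range.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value falls on the upper edge, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def glcm_horizontal(image, n_bins):
    """Symmetric GLCM counting horizontally adjacent pixel pairs."""
    cols = len(image[0])
    flat = quantize([v for row in image for v in row], n_bins)
    rows = [flat[r * cols:(r + 1) * cols] for r in range(len(image))]
    glcm = [[0] * n_bins for _ in range(n_bins)]
    for row in rows:
        for a, b in zip(row, row[1:]):
            glcm[a][b] += 1
            glcm[b][a] += 1  # count each pair in both directions
    return glcm


def glcm_entropy(glcm):
    """Shannon entropy (base 2) of the normalized GLCM."""
    total = sum(map(sum, glcm))
    probs = [c / total for row in glcm for c in row if c > 0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical 4x4 toy image with four intensity levels.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
entropy_2_bins = glcm_entropy(glcm_horizontal(image, 2))
entropy_4_bins = glcm_entropy(glcm_horizontal(image, 4))
print(entropy_2_bins, entropy_4_bins)
```

On this toy image the coarser quantization merges intensity levels and yields a lower GLCM entropy than the finer one. This is the kind of bin-dependence that motivates comparing several bin settings.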
Fig. 4 Performance curve of the RF classifier in the test set. (a) Receiver operating characteristic (ROC) curve of the 32 bins setting. (b) ROC curve of the 64 bins setting. (c) ROC curve of the 128 bins setting
Features showing high reproducibility from two experiments
| Experiment | Category | Parameter | Description / Interpretation |
|---|---|---|---|
| Experiment 1 | Histogram-based features | Maximum | Measures maximum intensity value of histogram |
| | | Minimum | Measures minimum intensity value of histogram |
| | Shape-based features | Maximum 3D diameter | Measures maximum 3D ROI diameter as the largest pairwise Euclidean distance between surface voxels of the ROI |
| | | Spherical disproportion | Ratio of the surface area of the ROI to the surface area of a sphere with the same volume as the ROI |
| | Texture-based features (GLCM) | Cluster tendency | Measures homogeneity of GLCM |
| | | Dissimilarity | Measures differences of entries in GLCM |
| | | Entropy | Measures irregularity of GLCM |
| | Filter-based feature | Log Skewness ( | Measures skewness of the ROI image processed by the LoG filter |
| | Fractal-based feature | Lacunarity | Measures the texture or distribution of gaps within an image |
| Experiment 2 | Histogram-based features | Maximum | Same as Experiment 1 |
| | | Minimum | Same as Experiment 1 |
| | | Entropy | Measures irregularity of histogram |
| | Texture-based features | Difference entropy | Measures entropy of processed GLCM matrix $P_{x-y}$ |
| | | Homogeneity | Measures closeness of GLCM |
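The two shape-based definitions in the table translate directly into code. The following is a minimal sketch of both, assuming that surface points, surface area, and volume have already been obtained from the ROI. The function names and test shapes are hypothetical, and no particular radiomics toolkit is implied.

```python
import itertools
import math


def max_3d_diameter(surface_points):
    """Largest pairwise Euclidean distance between surface points."""
    return max(
        math.dist(p, q)
        for p, q in itertools.combinations(surface_points, 2)
    )


def spherical_disproportion(surface_area, volume):
    """ROI surface area divided by that of a sphere with equal volume."""
    # Radius of the sphere whose volume equals the ROI volume.
    radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return surface_area / (4.0 * math.pi * radius ** 2)


# Hypothetical test shapes: a unit cube (corner points only) and a unit sphere.
cube_corners = list(itertools.product((0.0, 1.0), repeat=3))
cube_diameter = max_3d_diameter(cube_corners)
cube_disproportion = spherical_disproportion(surface_area=6.0, volume=1.0)
sphere_disproportion = spherical_disproportion(
    surface_area=4.0 * math.pi, volume=4.0 * math.pi / 3.0
)
print(cube_diameter, cube_disproportion, sphere_disproportion)
```

A perfect sphere gives a spherical disproportion of 1, and less compact shapes give larger values. The pairwise search is quadratic in the number of surface points, which is acceptable for a sketch but slow for large ROIs.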
Confidence interval of various features for non-error group related to the failure of NGTDM
| Feature type | Feature | Confidence interval |
|---|---|---|
| Shape feature | Volume | 1045.5 ~ 1412.28 |
| | Maximum 3D diameter | 18.15 ~ 20.46 |
| | Surface area | 780.5 ~ 964.07 |
| | Surface volume ratio | 0.86 ~ 0.98 |
| Histogram feature | Mean | −182.03 ~ −141.26 |
| | Skewness | −0.8 ~ −0.55 |
| | Range | 756.35 ~ 805.08 |
| | Median | −158.52 ~ −107.86 |
Fig. 5 Various features compared between the error and non-error groups related to computation of NGTDM features. Blue plots show differences in shape-based features, and green plots show differences in histogram-based features
Fig. 6 Various features compared between the error and non-error groups related to computation of sub-sampled GLCM features. In each panel, the blue plot on the right is for the non-error group and the light blue plot on the left is for the error group
Confidence interval of various features for non-error group related to the failure of sub-sampled GLCM
| Feature type | Feature | Confidence interval |
|---|---|---|
| Shape feature | Volume | 1186.17 ~ 1567.5 |
| | Maximum 3D diameter | 19.37 ~ 21.34 |
| | Surface area | 871.56 ~ 1045.96 |
| | Compactness | 0.024 ~ 0.025 |
| | Sphericity | 0.58 ~ 0.61 |
| | Spherical disproportion | 1.66 ~ 1.76 |