Ahmad Chaddad, Camel Tanougast.
Abstract
Statistical features are widely used in radiology to assess tumor heterogeneity from magnetic resonance (MR) images. In this paper, decision tree based feature selection is examined to determine the relevant feature subset for glioblastoma (GBM) phenotypes in the statistical domain. To discriminate between the active tumor (vAT) and edema/invasion (vE) phenotypes, we selected the significant features using analysis of variance (ANOVA) with p value < 0.01. We then used a decision tree to define the optimal feature subset for the phenotype classifier. Naïve Bayes (NB), support vector machine (SVM), and decision tree (DT) classifiers were used to evaluate how well the feature-based scheme discriminates vAT from vE. All nine features were statistically significant for separating vAT from vE (p value < 0.01). In the comparison with the full feature set, decision tree based feature selection showed the best performance. The selected subset, kurtosis and skewness, achieved accuracies of 58.33-75.00% and AUC values of 73.88-92.50% across the classifiers. This study demonstrates that statistical features can provide a quantitative, individualized measurement for glioblastoma patients and help assess phenotype progression.
Year: 2015 PMID: 26640485 PMCID: PMC4660016 DOI: 10.1155/2015/728164
Source DB: PubMed Journal: Adv Bioinformatics ISSN: 1687-8027
Figure 1. Histogram of the GBM tumor. (a) Raw image of FLAIR sequence; (b) two Gaussian distributions represent vE and vAT and necrosis parts which are located inside the vAT with lower intensity values.
Figure 2. Block diagram of the proposed approach.
Figure 3. Example of phenotype segmentation. (a) Raw image of FLAIR sequence, (b) edema part vE, and (c) active tumor vAT.
Statistical features description.
| Symbol | Features |
|---|---|
| | Geometric mean, indicates the central tendency |
| | Harmonic mean, calculates the average sample |
| | Mean excluding outliers, measures the probability distribution |
| | Mean (average) |
| | Standard deviation (absolute deviation) |
| | 75th percentile, splits off the highest 25% of pixels from the lowest 75% |
| | Quantile |
| | Skewness, assesses the asymmetry of the distribution |
| | Kurtosis, measures the degree of peakedness of a distribution |
| | Features vector |
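The descriptors listed in the table above can be sketched with Python's standard `statistics` module. This is a minimal illustration, not the authors' implementation: the function names, the 10% trimming proportion for the mean excluding outliers, the use of the median for the unspecified "Quantile" row, the population (rather than sample) standard deviation, and the non-excess kurtosis convention are all assumptions made here, since the original formulas did not survive in the table.

```python
import statistics as st


def central_moment(values, order):
    """Population central moment of the given order."""
    mu = st.fmean(values)
    return st.fmean([(v - mu) ** order for v in values])


def trimmed_mean(values, proportion=0.1):
    """Mean after dropping `proportion` of the samples from each tail.

    The 10% default is an assumption; the source does not state the cut.
    """
    data = sorted(values)
    k = int(len(data) * proportion)
    kept = data[k:len(data) - k] if k else data
    return st.fmean(kept)


def statistical_features(values, trim=0.1):
    """Return the nine descriptors for a list of positive intensities."""
    m2 = central_moment(values, 2)
    # Quartile cut points; q3 is the 75th percentile.
    _, q2, q3 = st.quantiles(values, n=4, method="inclusive")
    return {
        "geometric_mean": st.geometric_mean(values),
        "harmonic_mean": st.harmonic_mean(values),
        "trimmed_mean": trimmed_mean(values, trim),
        "mean": st.fmean(values),
        "std": st.pstdev(values),  # population standard deviation (assumed)
        "percentile_75": q3,
        "quantile": q2,  # placeholder: the source does not say which quantile
        "skewness": central_moment(values, 3) / m2 ** 1.5,
        "kurtosis": central_moment(values, 4) / m2 ** 2,  # normal distribution gives 3
    }


# Hypothetical intensity sample, used only to exercise the function.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
features = statistical_features(sample)
```

In practice the input would be the pixel intensities of one segmented region (vAT or vE), giving one nine-element feature vector per region.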
Mean ± standard deviation of vAT and vE.
| Features | | | p value |
|---|---|---|---|
| | 473.02 ± 345.65 | 461.15 ± 341.98 | <0.01 |
| | 466.49 ± 344.81 | 453.57 ± 342.15 | <0.01 |
| | 478.88 ± 347.18 | 468.25 ± 342.53 | <0.01 |
| | 47.50 ± 31.33 | 53.69 ± 31.15 | <0.01 |
| | 37.92 ± 25.98 | 44.19 ± 25.18 | <0.01 |
| | 327.31 ± 294.16 | 321.74 ± 302.46 | <0.01 |
| | 519.27 ± 367.27 | 516.73 ± 359.66 | <0.01 |
| | −0.20 ± 0.29 | −0.09 ± 0.40 | <0.01 |
| | 3.58 ± 0.85 | 2.93 ± 0.59 | <0.01 |
Metrics (%) of vAT and vE discrimination.
| Features | DT accuracy | DT AUC | SVM accuracy | SVM AUC | NB accuracy | NB AUC |
|---|---|---|---|---|---|---|
| Full feature set | 68.33 | 96.05 | 68.33 | 80.22 | 53.33 | 77.66 |
| Subset feature | 75.00 | 92.50 | 58.33 | 73.88 | 58.33 | 76.44 |
Confusion matrix based on selected features.
| Features | DT | | SVM | | NB | |
|---|---|---|---|---|---|---|
| 30 | 20 | 10 | 15 | 15 | 13 | 17 |
| 30 | 5 | 25 | 10 | 20 | 8 | 22 |
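The accuracies reported for the selected feature subset can be recomputed from the confusion-matrix counts above. The short sketch below assumes that each classifier's pair of columns forms a 2×2 matrix whose diagonal holds the correctly classified samples; the row and column labels did not survive in the table, but accuracy does not depend on which row is vAT and which is vE. The variable and function names are illustrative.

```python
# Counts copied from the confusion-matrix table (30 samples per class).
confusion = {
    "DT": [[20, 10], [5, 25]],
    "SVM": [[15, 15], [10, 20]],
    "NB": [[13, 17], [8, 22]],
}


def accuracy(matrix):
    """Percentage of samples on the diagonal of a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


accuracies = {name: round(accuracy(m), 2) for name, m in confusion.items()}
# accuracies == {"DT": 75.0, "SVM": 58.33, "NB": 58.33}
```

These values match the accuracy entries in the subset feature row of the metrics table: (20 + 25)/60 = 75.00% for DT and 35/60 = 58.33% for both SVM and NB.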
Figure 4. Decision tree grown using 9 statistical features extracted from 30 vAT and 30 vE parts.
Figure 5. Receiver operating characteristic curves for distinguishing between vAT and vE. FFS denotes full feature set, and FS is the feature selection.
Figure 6. Heat map with correlation coefficients between statistical features.