Takahiro Nakamoto, Wataru Takahashi, Akihiro Haga, Satoshi Takahashi, Shigeru Kiryu, Kanabu Nawa, Takeshi Ohta, Sho Ozaki, Yuki Nozawa, Shota Tanaka, Akitake Mukasa, Keiichi Nakagawa.
Abstract
We conducted a feasibility study to predict malignant glioma grades via radiomic analysis using contrast-enhanced T1-weighted magnetic resonance images (CE-T1WIs) and T2-weighted magnetic resonance images (T2WIs). We proposed a framework and applied it to CE-T1WIs and T2WIs (with tumor region data) acquired preoperatively from 157 patients with malignant glioma (grade III: 55, grade IV: 102) as the primary dataset and 67 patients with malignant glioma (grade III: 22, grade IV: 45) as the validation dataset. Radiomic features, including size/shape, intensity, histogram, and texture features, were extracted from the tumor regions on the CE-T1WIs and T2WIs. The Wilcoxon-Mann-Whitney (WMW) test and least absolute shrinkage and selection operator logistic regression (LASSO-LR) were employed to select the radiomic features. Various machine learning (ML) algorithms were used to construct prediction models for the malignant glioma grades using the selected radiomic features. Leave-one-out cross-validation (LOOCV) was implemented to evaluate the performance of the prediction models in the primary dataset. The radiomic features selected in all folds of the LOOCV of the primary dataset were used to perform an independent validation. As evaluation indices, the accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were calculated for all prediction models. The mean AUC value for all prediction models constructed by the ML algorithms in the LOOCV of the primary dataset was 0.902 ± 0.024 (95% confidence interval (CI), 0.873–0.932). In the independent validation, the mean AUC value for all prediction models was 0.747 ± 0.034 (95% CI, 0.705–0.790). The results of this study suggest that malignant glioma grades can be predicted once the CE-T1WIs, T2WIs, and tumor delineations have been prepared for each patient. Our proposed framework may be an effective tool for preoperatively grading malignant gliomas.
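The evaluation scheme described in the abstract (screen features with the WMW test inside each LOOCV fold, then classify the held-out patient) can be sketched in pure Python. This is only an illustration under stated assumptions: the data are synthetic, the WMW p-value uses the normal approximation without tie correction, and a nearest-centroid rule stands in for the LASSO-LR step and the ML algorithms. The function names and the 0.05 threshold are hypothetical and are not taken from the paper.

```python
import random
from statistics import NormalDist, mean

def wmw_pvalue(a, b):
    """Two-sided WMW p-value (normal approximation, no tie correction)."""
    n1, n2 = len(a), len(b)
    # U counts pairs where a value from `a` exceeds one from `b`; ties count 0.5.
    u = sum((x > v) + 0.5 * (x == v) for x in a for v in b)
    mu = n1 * n2 / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mu) / sigma
    return 2 * (1 - NormalDist().cdf(abs(z)))

def select_features(X, y, alpha=0.05):
    """Indices of features whose values differ between the two classes."""
    keep = []
    for j in range(len(X[0])):
        pos = [row[j] for row, lab in zip(X, y) if lab == 1]
        neg = [row[j] for row, lab in zip(X, y) if lab == 0]
        if wmw_pvalue(pos, neg) < alpha:
            keep.append(j)
    return keep

def nearest_centroid(X, y, features, x_new):
    """Stand-in classifier: pick the class whose centroid is closer to x_new."""
    def sq_dist(label):
        rows = [row for row, lab in zip(X, y) if lab == label]
        return sum((mean(r[j] for r in rows) - x_new[j]) ** 2 for j in features)
    return 1 if sq_dist(1) < sq_dist(0) else 0

def loocv_predict(X, y):
    """Hold out one sample, select features on the rest, predict the held-out one."""
    preds = []
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        # Fall back to all features if the screen rejects everything.
        feats = select_features(X_train, y_train) or list(range(len(X[0])))
        preds.append(nearest_centroid(X_train, y_train, feats, X[i]))
    return preds

# Synthetic data: feature 0 separates the classes, features 1-3 are noise.
rng = random.Random(0)
y = [0] * 20 + [1] * 20
X = [[rng.gauss(3.0 * lab, 1.0)] + [rng.gauss(0, 1) for _ in range(3)] for lab in y]
preds = loocv_predict(X, y)
accuracy = mean(int(p == t) for p, t in zip(preds, y))
```

Running the feature screen inside the loop, rather than once on the full dataset, keeps the held-out sample from influencing which features are chosen.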
Year: 2019 PMID: 31857632 PMCID: PMC6923390 DOI: 10.1038/s41598-019-55922-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. A conceptual design for predicting glioma grades based on radiomic features.
Patients’ characteristics in the validation dataset for this study.
| Characteristic | Value |
|---|---|
| Total number of patients | 67 |
| Gender | Male: 45 (67.2%) |
| | Female: 22 (32.8%) |
| Mean age (years) | 55.2 ± 16.2 (range: 11–83) |
| Grade | III: 22 (32.8%) |
| | IV: 45 (67.2%) |
| Histological type | GBM: 45 (67.2%) |
| | AA: 8 (11.9%) |
| | AO: 9 (13.4%) |
| | AOA: 5 (7.5%) |
| IDH mutation status in GBM (n = 45) | Mutated: 2 (4.4%) |
| | Wild type: 19 (42.2%) |
| | Unknown: 24 (53.3%) |
| MGMT methylation status in GBM (n = 45) | Methylated: 7 (15.6%) |
| | Unmethylated: 13 (28.9%) |
| | Unknown: 25 (55.6%) |
GBM: glioblastoma, AA: anaplastic astrocytoma, AO: anaplastic oligodendroglioma, AOA: anaplastic oligoastrocytoma, IDH: isocitrate dehydrogenase, MGMT: O6-methylguanine-DNA methyltransferase.
Figure 2. Transverse images of a tumor on the original magnetic resonance (MR) image (T2-weighted MR image (T2WI)) and on eight frequency component-filtered images to which a three-dimensional (3D) Coiflet wavelet transform had been applied.
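The eight frequency components arise because a one-level 3D wavelet transform applies either a low-pass (L) or a high-pass (H) filter along each of the three axes, giving 2^3 = 8 combinations. The sketch below illustrates this with a Haar filter pair instead of the Coiflet wavelet used in the study, purely to keep the example short. The helper names, the axis ordering of the labels, and the toy 2×2×2 volume are assumptions.

```python
from itertools import product
from math import sqrt

def haar_along_axis(vol, axis, band):
    """Filter and downsample a volume {(i, j, k): value} along one axis.

    Assumes the size along `axis` is even.
    """
    out = {}
    for idx, a in vol.items():
        if idx[axis] % 2:          # handle each even/odd pair once, from the even index
            continue
        partner = list(idx)
        partner[axis] += 1
        b = vol[tuple(partner)]
        new_idx = list(idx)
        new_idx[axis] //= 2
        # Low-pass = scaled sum, high-pass = scaled difference (orthonormal Haar).
        out[tuple(new_idx)] = (a + b) / sqrt(2) if band == "L" else (a - b) / sqrt(2)
    return out

def decompose_3d(vol):
    """Return the eight one-level subbands keyed by labels such as 'LLL' or 'LHL'."""
    subbands = {}
    for bands in product("LH", repeat=3):
        filtered = vol
        for axis, band in enumerate(bands):
            filtered = haar_along_axis(filtered, axis, band)
        subbands["".join(bands)] = filtered
    return subbands

# Toy 2x2x2 volume whose intensity changes only along the second axis.
volume = {(i, j, k): float(10 * j) for i, j, k in product(range(2), repeat=3)}
subbands = decompose_3d(volume)
labels = sorted(subbands)
```

For this toy volume only the `LLL` and `LHL` subbands are non-zero, because the intensity varies along a single axis.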
Figure 3. Heat maps of radiomic features in primary and validation datasets.
Figure 4. Mean area under the curve (AUC) values for five-times five-fold cross-validation (CV) for each value of a regularization hyper-parameter. The dashed line marks the hyper-parameter value that maximizes the mean AUC value for five-times five-fold CV.
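The selection rule in this caption (average the AUC over five repetitions of five-fold CV for each candidate value, then take the maximizer) can be sketched as follows. This is a rough illustration, not the authors' implementation: the data are synthetic, the candidate grid is arbitrary, and a soft-thresholded class-mean difference stands in for LASSO-LR so that the example needs only the standard library. All function names are hypothetical.

```python
import random
from statistics import mean

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    return sum((p > n) + 0.5 * (p == n) for p in pos for n in neg) / (len(pos) * len(neg))

def stratified_folds(labels, k, rng):
    """Shuffle indices within each class, then deal them round-robin into k folds."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def fit_shrunken_weights(X, y, lam):
    """Stand-in for LASSO-LR: soft-threshold each feature's class-mean difference."""
    weights = []
    for j in range(len(X[0])):
        diff = (mean(r[j] for r, lab in zip(X, y) if lab == 1)
                - mean(r[j] for r, lab in zip(X, y) if lab == 0))
        weights.append(max(abs(diff) - lam, 0.0) * (1 if diff > 0 else -1))
    return weights

def mean_cv_auc(X, y, lam, k=5, repeats=5, seed=0):
    """Mean AUC over `repeats` independent shuffles of stratified k-fold CV."""
    rng = random.Random(seed)   # same seed, so every candidate sees the same folds
    aucs = []
    for _ in range(repeats):
        for test_idx in stratified_folds(y, k, rng):
            held_out = set(test_idx)
            train = [i for i in range(len(y)) if i not in held_out]
            w = fit_shrunken_weights([X[i] for i in train], [y[i] for i in train], lam)
            scores = [sum(wj * xj for wj, xj in zip(w, X[i])) for i in test_idx]
            aucs.append(auc(scores, [y[i] for i in test_idx]))
    return mean(aucs)

# Synthetic data: only feature 0 carries signal.
rng = random.Random(1)
y = [0] * 30 + [1] * 30
X = [[rng.gauss(1.5 * lab, 1.0)] + [rng.gauss(0, 1) for _ in range(4)] for lab in y]
candidates = [0.0, 0.25, 0.5, 1.0, 5.0]
cv_auc = {lam: mean_cv_auc(X, y, lam) for lam in candidates}
best_lam = max(cv_auc, key=cv_auc.get)
```

The largest candidate shrinks every weight to zero, so all test scores tie and its mean AUC is 0.5; the chosen value is whichever candidate gives the highest mean AUC.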
Selected radiomic features for all folds in a leave-one-out cross-validation (LOOCV) of the primary dataset.
| MRI sequence | Wavelet | Quantization levels | Feature type | Feature name |
|---|---|---|---|---|
| CE-T1 | LLL | — | Intensity | Median |
| CE-T1 | LHL | 8 bit | GLRLM | Run-length variance |
| CE-T1 | LLL | 5 bit | GLSZM | Gray-level non-uniformity normalized |
| CE-T1 | HLL | 7 bit | GLSZM | Gray-level variance |
| CE-T1 | HLL | 7 bit | NGLDM | High dependence low gray-level emphasis |
| T2 | LLL | — | Intensity | Root mean square |
CE-T1: contrast-enhanced T1, L: low-pass filter, H: high-pass filter, GLRLM: gray-level run length matrix, GLSZM: gray-level size zone matrix, NGLDM: neighboring gray-level dependence matrix.
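To make the table's columns concrete, the sketch below shows what an n-bit quantization level means (2^n gray levels) and how the two intensity features listed, median and root mean square, are computed. The discretization rule is an assumption: this record does not say how intensities were binned, so uniform binning between the region's minimum and maximum is used here. The sample intensities are hypothetical.

```python
from math import sqrt
from statistics import median

def quantize(values, bits):
    """Map intensities to integer gray levels 0 .. 2**bits - 1 by uniform binning."""
    levels = 2 ** bits
    lo, hi = min(values), max(values)
    if hi == lo:                      # flat region: everything falls in level 0
        return [0 for _ in values]
    # The maximum intensity would land on `levels`, so clamp it to the top level.
    return [min(int((v - lo) / (hi - lo) * levels), levels - 1) for v in values]

def root_mean_square(values):
    """Square root of the mean squared intensity."""
    return sqrt(sum(v * v for v in values) / len(values))

roi = [12.0, 40.0, 40.0, 95.0, 130.0, 255.0]   # hypothetical tumor-region intensities
levels_5bit = quantize(roi, 5)                  # 32 gray levels
roi_median = median(roi)
roi_rms = root_mean_square(roi)
```

Texture matrices such as the GLRLM and GLSZM are built on the quantized levels, which is why those rows carry a bit depth while the intensity rows do not.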
Figure 5. Receiver operating characteristic (ROC) curves of the prediction models constructed by the five machine learning (ML) algorithms in a leave-one-out cross-validation (LOOCV) of the primary dataset.
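An ROC curve of this kind is traced by sweeping a decision threshold over the predicted scores, and the AUC is the area under the resulting points. The sketch below uses made-up scores and labels, with grade IV assumed to be the positive class, and checks that the trapezoidal area matches the pairwise (WMW-style) definition of the AUC.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs obtained by lowering the threshold over every distinct score."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, lab in zip(scores, labels) if s >= thr and lab == 1)
        fp = sum(1 for s, lab in zip(scores, labels) if s >= thr and lab == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def trapezoid_auc(points):
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))

def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs in which the positive scores higher."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    return sum((p > n) + 0.5 * (p == n) for p in pos for n in neg) / (len(pos) * len(neg))

scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]  # hypothetical grade IV probabilities
labels = [1, 1, 0, 1, 1, 0, 1, 0]                    # 1 = grade IV, 0 = grade III
curve = roc_points(scores, labels)
area = trapezoid_auc(curve)
```

Here 11 of the 15 positive-negative pairs are ordered correctly, so both definitions give 11/15.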
Accuracies, sensitivities, specificities, and area under the curve (AUC) values of prediction models in a leave-one-out cross-validation (LOOCV) of the primary dataset.
| Machine learning algorithm | Accuracy | Sensitivity | Specificity | AUC |
|---|---|---|---|---|
| LR | 0.834 | 0.833 | 0.836 | 0.915 |
| SVM | 0.866 | 0.902 | 0.800 | 0.932 |
| SNN | 0.796 | 0.833 | 0.727 | 0.896 |
| RF | 0.815 | 0.892 | 0.673 | 0.902 |
| NB | 0.809 | 0.853 | 0.727 | 0.867 |
| Mean ± SD | 0.824 ± 0.027 | 0.863 ± 0.033 | 0.753 ± 0.065 | 0.902 ± 0.024 |
| 95% CI | 0.790–0.858 | 0.822–0.903 | 0.672–0.833 | 0.873–0.932 |
LR: logistic regression, SVM: support vector machine, SNN: standard neural network, RF: random forest, NB: naïve Bayes.
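As a worked check of how the three rates in the table relate, the sketch below recomputes the SVM row from confusion-matrix counts. The counts are not reported in this record; they are inferred from the rounded rates and the class sizes of the primary dataset (102 grade IV, 55 grade III), under the assumption that grade IV is the positive class.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # share of grade IV cases predicted as grade IV
        "specificity": tn / (tn + fp),   # share of grade III cases predicted as grade III
    }

# Inferred counts (an assumption, not source data) for the SVM row:
# 92 of 102 grade IV and 44 of 55 grade III patients classified correctly.
svm = classification_metrics(tp=92, fn=10, tn=44, fp=11)
rounded = {name: round(value, 3) for name, value in svm.items()}
```

With these counts the rounded values agree with the SVM row (0.866, 0.902, 0.800), which supports reading sensitivity as the grade IV detection rate.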
Figure 6. Receiver operating characteristic (ROC) curves for all prediction models in an independent validation, constructed by using the radiomic features selected in all folds of a leave-one-out cross-validation (LOOCV).
Accuracies, sensitivities, specificities, and area under the curve (AUC) values of prediction models in an independent validation.
| Machine learning algorithm | Accuracy | Sensitivity | Specificity | AUC |
|---|---|---|---|---|
| LR | 0.746 | 0.756 | 0.727 | 0.755 |
| SVM | 0.746 | 0.844 | 0.545 | 0.731 |
| SNN | 0.716 | 0.867 | 0.409 | 0.707 |
| RF | 0.806 | 0.822 | 0.773 | 0.800 |
| NB | 0.776 | 0.822 | 0.682 | 0.743 |
| Mean ± SD | 0.758 ± 0.034 | 0.822 ± 0.042 | 0.627 ± 0.149 | 0.747 ± 0.034 |
| 95% CI | 0.716–0.800 | 0.771–0.874 | 0.443–0.812 | 0.705–0.790 |
LR: logistic regression, SVM: support vector machine, SNN: standard neural network, RF: random forest, NB: naïve Bayes.
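The summary rows describe the spread across the five algorithms, not across patients. The sketch below recomputes the AUC summary from the five values in the table. The reported interval is consistent with a t-based interval on four degrees of freedom; that the authors computed it this way is an inference, not something stated in this record.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT_DF4 = 2.776  # two-sided 95% critical value of Student's t, 4 degrees of freedom

def summarize(values, t_crit=T_CRIT_DF4):
    """Mean, sample SD, and t-based 95% CI of the mean (t_crit assumes five values)."""
    m, sd = mean(values), stdev(values)
    half_width = t_crit * sd / sqrt(len(values))
    return m, sd, (m - half_width, m + half_width)

validation_auc = [0.755, 0.731, 0.707, 0.800, 0.743]  # LR, SVM, SNN, RF, NB
auc_mean, auc_sd, (ci_low, ci_high) = summarize(validation_auc)
```

The result agrees with the tabulated 0.747 ± 0.034 and 0.705–0.790 to within rounding of the inputs.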
Prediction performances for malignant glioma grade identification using a radiomic approach in the proposed framework and in previous studies.
| Study | No. of data | MRI sequence | Feature type | Filtering | Feature selection | ML algorithm | Data augmentation | Validation method | Accuracy | Sensitivity | Specificity | AUC value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Proposed framework | Primary dataset: 157 (III: 55, IV: 102) | CE-T1, T2 | Shape/size, Intensity, Histogram, GLCM, GLRLM, GLSZM, NGLDM, NGTDM | Wavelet transform high-pass and low-pass filters for all feature types excluding the shape/size | WMW test & LASSO-LR | SVM (rbf kernel) | No | LOOCV | 0.866 | 0.902 | 0.800 | 0.932 |
| | Entire dataset: 224 (Primary dataset: 157 & Validation dataset: 67 (III: 22, IV: 45)) | | | | Using the selected radiomic features for all folds in the LOOCV of the primary dataset | RF | No | Independent validation | 0.806 | 0.822 | 0.773 | 0.800 |
| Zacharaki | 52 (III: 18, IV: 34) | CE-T1, T1, T2, FLAIR, rCBV | Shape, Intensity, Rotation invariant texture | Gabor filter for rotation invariant texture features | SVM-RFE | SVM (rbf kernel) | No | LOOCV | 0.904 | 1.000 | 0.722 | 0.985 |
| | | | | | t-test with bagging | | | | 0.942 | NR | NR | 1.000 |
| Tian | 111 (III: 33, IV: 78) | CE-T1, T1, T2, Diffusion, 3D pCASL | GLCM, GLGCM | No | SVM-RFE | SVM (rbf kernel) | No | 100-times 10-fold CV | 0.937 | 0.942 | 0.927 | 0.982 |
| | | | | | | | SMOTE | | 0.981 | 0.987 | 0.974 | 0.992 |
CE-T1: contrast-enhanced T1, FLAIR: fluid-attenuated inversion recovery, rCBV: relative cerebral blood volume, 3D pCASL: three-dimensional pseudo-continuous arterial spin labeling, GLCM: gray-level co-occurrence matrix, GLRLM: gray-level run length matrix, GLSZM: gray-level size zone matrix, NGLDM: neighboring gray-level dependence matrix, NGTDM: neighborhood gray-tone difference matrix, GLGCM: gray-level gradient co-occurrence matrix, WMW: Wilcoxon-Mann-Whitney, LASSO-LR: least absolute shrinkage and selection operator logistic regression, RFE: recursive feature elimination, SMOTE: synthetic minority oversampling technique, SVM: support vector machine, rbf: radial basis function, RF: random forest, LOOCV: leave-one-out cross-validation, CV: cross-validation, AUC: area under the curve, NR: not reported.