Duyen Thi Do, Ming-Ren Yang, Luu Ho Thanh Lam, Nguyen Quoc Khanh Le, Yu-Wei Wu.
Abstract
O6-Methylguanine-DNA-methyltransferase (MGMT) promoter methylation has been shown in many studies to be an important predictive biomarker for temozolomide (TMZ) resistance and poor progression-free survival in glioblastoma multiforme (GBM) patients. However, identifying the MGMT methylation status using molecular techniques remains challenging due to technical limitations, such as the inability to obtain tumor specimens, high detection costs, and the high complexity of intralesional heterogeneity. To overcome these difficulties, we aimed to test the feasibility of using a novel radiomics-based machine learning (ML) model to preoperatively and noninvasively predict the MGMT methylation status. In this study, radiomics features extracted from multimodal images of GBM patients with annotated MGMT methylation status were downloaded from The Cancer Imaging Archive (TCIA) public database for retrospective analysis. The radiomics features extracted from multimodal magnetic resonance imaging (MRI) images underwent a two-stage feature selection method, consisting of an eXtreme Gradient Boosting (XGBoost) feature selection model followed by a genetic algorithm (GA)-based wrapper model, to extract the most meaningful radiomics features for predictive purposes. The cross-validation results suggested that the GA-based wrapper model achieved high performance, with a sensitivity of 0.894, specificity of 0.966, and accuracy of 0.925 for predicting the MGMT methylation status in GBM. Application of the extracted GBM radiomics features to a low-grade glioma (LGG) dataset also achieved a sensitivity of 0.780, specificity of 0.620, and accuracy of 0.750, indicating the potential of the selected radiomics features to be applied more widely to both low- and high-grade gliomas. The performance indicates that our model may confer significant improvements in prognosis and treatment responses in GBM patients.
Year: 2022 PMID: 35927323 PMCID: PMC9352871 DOI: 10.1038/s41598-022-17707-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. The overall feature selection steps. The left part shows the pre-processing and segmentation steps, while the right part lists the two-stage feature selection procedure. The extracted feature set is then evaluated for its efficacy.
Figure 2. The genetic algorithm workflow. The steps are: (1) generation of the initial population of solutions; (2) evaluation of the fitness value of each solution within the population; (3) the “mating” process of the solutions, in which the probability of a solution being selected is proportional to its estimated fitness value; (4) the random designation of crossover points on each solution vector during the “mating” process (SC and DC stand for single- and double-crossover, respectively); (5) the introduction of random mutations into the crossed-over solution vectors; (6) the replacement of the entire population by daughter solutions.
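The six steps in the Figure 2 caption can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: solutions are encoded as 0/1 feature masks, SC and DC are chosen with equal probability, and `toy_fitness`, `INFORMATIVE`, and all parameter values are hypothetical. In the study the fitness would come from a classifier's cross-validated performance on the selected features; here a made-up score stands in for it.

```python
import random

def roulette_select(population, fitnesses, rng):
    # Step 3: selection probability proportional to fitness (weights must be positive).
    return rng.choices(population, weights=fitnesses, k=1)[0]

def crossover(parent_a, parent_b, rng, double=False):
    # Step 4: single-crossover (SC) uses one random cut point, double-crossover (DC) two.
    n = len(parent_a)
    if double:
        i, j = sorted(rng.sample(range(1, n), 2))
        return parent_a[:i] + parent_b[i:j] + parent_a[j:]
    i = rng.randrange(1, n)
    return parent_a[:i] + parent_b[i:]

def mutate(mask, rate, rng):
    # Step 5: flip each bit independently with probability `rate`.
    return [bit ^ 1 if rng.random() < rate else bit for bit in mask]

def run_ga(fitness, n_features, pop_size=30, generations=40,
           mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population of 0/1 feature masks.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Step 2: evaluate every solution in the population.
        fitnesses = [fitness(mask) for mask in population]
        daughters = []
        for _ in range(pop_size):
            parent_a = roulette_select(population, fitnesses, rng)
            parent_b = roulette_select(population, fitnesses, rng)
            # Assumption: SC and DC are picked with equal probability.
            child = crossover(parent_a, parent_b, rng, double=rng.random() < 0.5)
            daughters.append(mutate(child, mutation_rate, rng))
        # Step 6: daughters replace the entire population.
        population = daughters
        # Bookkeeping only (not one of the six steps): remember the best mask seen.
        best = max(population + [best], key=fitness)
    return best

INFORMATIVE = {0, 3, 7}  # hypothetical indices of "useful" features

def toy_fitness(mask):
    # Made-up score: reward informative features, lightly penalize mask size.
    # It stays positive for every mask of length 12, as roulette selection requires.
    hits = sum(mask[i] for i in INFORMATIVE)
    return 1.0 + 2.0 * hits - 0.1 * sum(mask)

best_mask = run_ga(toy_fitness, n_features=12)
print(best_mask, toy_fitness(best_mask))
```

Replacing the whole population each generation (step 6) means a good solution can be lost, which is why the sketch keeps a separate record of the best mask found so far.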
Performance evaluations for different machine learning-incorporated genetic algorithm (GA) models on the GBM dataset.
| Classifiers | No. of features | Sensitivity | Specificity | Accuracy |
|---|---|---|---|---|
| GA-XGB | 18 | 0.889 | 0.88 | 0.889 |
| GA-SVM | 14 | 0.720 | 0.454 | 0.678 |
RF, random forest; XGB, XGBoost; SVM, support vector machine.
Model with the best performance is indicated in bold font.
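The sensitivity, specificity, and accuracy columns presumably follow the standard confusion-matrix definitions. A small sketch of those definitions is given here; the label vectors are invented for illustration, and treating `1` as "methylated" is an assumption, not something stated in the source.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy

# Made-up example: 4 positive and 6 negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sn, sp, acc = classification_metrics(y_true, y_pred)
print(f"SN={sn:.3f} SP={sp:.3f} ACC={acc:.3f}")
```

Because accuracy pools both classes, it can stay high on an imbalanced dataset even when specificity is low, so the three columns are best read together.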
Figure 3. Performance evaluations and comparisons of different GA-incorporated models in predicting MGMT methylation status. The y-axis represents accuracy. Statistical significance, evaluated by the Kolmogorov–Smirnov test, is indicated by stars; three stars (***) indicate p < 0.001.
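For readers unfamiliar with the Kolmogorov–Smirnov test, the sketch below computes the two-sample KS statistic, the largest gap between two empirical cumulative distribution functions. It assumes the two-sample variant was used to compare accuracy distributions between models, which the caption does not state, and the accuracy values are invented. In practice a library routine such as `scipy.stats.ks_2samp` would also supply the p-value.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |ECDF_a(x) - ECDF_b(x)| over all x."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # Both ECDFs are step functions, so the maximum gap occurs at a data point.
    for x in a + b:
        ecdf_a = bisect_right(a, x) / len(a)
        ecdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(ecdf_a - ecdf_b))
    return d

# Hypothetical per-run accuracies for two models.
acc_model_1 = [0.90, 0.91, 0.92, 0.93]
acc_model_2 = [0.70, 0.72, 0.75, 0.91]
print(ks_statistic(acc_model_1, acc_model_2))
```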
Figure 4. Receiver operating characteristic (ROC) curves of different feature sets as evaluated by the random forest (RF) algorithm. The feature sets are: (A) all 704 radiomics features; (B) 38 features selected by XGBoost; (C) the feature set selected by F-scores; and (D) the feature set selected by the genetic algorithm (GA)-RF algorithm.
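As a reminder of how an ROC curve such as those in Figure 4 is built, the following sketch sweeps a decision threshold over classifier scores and integrates the resulting curve with the trapezoidal rule. The labels and scores are invented, and the function names are hypothetical; this is not the authors' evaluation code.

```python
def roc_points(y_true, scores):
    """Return ROC points (false positive rate, true positive rate)."""
    pairs = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    n_pos = sum(1 for label in y_true if label == 1)
    n_neg = len(y_true) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up labels (1 = positive) and classifier scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
print(curve, auc(curve))
```

The area equals the fraction of positive-negative pairs in which the positive case receives the higher score, here 8 of 9.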
Figure 5. Common radiomics features selected by different methods. Solid circles represent the presence of certain features in each feature set.
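A presence chart like Figure 5 can be derived from the selected feature sets with basic set operations. The sketch below uses placeholder feature names (`feat_01` and so on) and invented memberships, since the actual selected features are not listed here.

```python
# Hypothetical feature sets; the names and memberships are placeholders.
feature_sets = {
    "XGBoost": {"feat_01", "feat_02", "feat_03", "feat_04"},
    "F-score": {"feat_02", "feat_03", "feat_05"},
    "GA-RF": {"feat_02", "feat_03", "feat_04"},
}

# Features chosen by every method.
common = set.intersection(*feature_sets.values())

# Presence row per feature, one boolean per method (in dict order).
all_features = sorted(set.union(*feature_sets.values()))
presence = {
    feature: [feature in selected for selected in feature_sets.values()]
    for feature in all_features
}

print("feature ", " ".join(feature_sets))
for feature, row in presence.items():
    # A solid circle marks presence, mirroring the figure's convention.
    print(feature, " ".join("●" if present else "·" for present in row))
```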
Classification performances of the models on the LGG dataset.
| Classifiers | Sensitivity | Specificity | Accuracy |
|---|---|---|---|
| GA-XGB | 0.780 | 0.460 | 0.718 |
| GA-SVM | 0.840 | 0.23 | 0.718 |
| XGB-Fscore | 0.670 | 0.380 | 0.615 |
RF, random forest; XGB, XGBoost; SVM, support vector machine; XGB-Fscore, the F-score technique proposed by Le et al.
Model with the best performance is indicated in bold font.
Comparisons between our models and other previous predictors of the MGMT methylation status in glioblastoma multiforme.
| Study | Year | No. of features | Classifier | SN | SP | ACC |
|---|---|---|---|---|---|---|
| Le et al. | 2020 | 9 | XGBoost | 0.88 | 0.887 | 0.887 |
| Xi et al. | 2018 | 63 | Support vector machine | 0.888 | 0.838 | 0.866 |
| Levner et al. | 2009 | 8 | L1-regularized neural networks | 0.854 | 0.9 | 0.877 |
| Korfiatis et al. | 2016 | 4 | Support vector machine | 0.803 | 0.813 | N/A (a) |
| Crisi et al. | 2020 | 14 | Multilayer perceptron | 0.75 | 0.85 | N/A |
| Kanas et al. | 2017 | N/A | K-Nearest Neighbor | 0.736 | 0.852 | 0.663 |
| Sasaki et al. | 2019 | 5 | LASSO (b) | 0.67 | 0.66 | 0.67 |
| L Han et al. | 2018 | N/A | CRNN (c) | 0.67 | 0.67 | 0.67 |
| Ahn et al. | 2014 | N/A | Mann–Whitney U-test | 0.563 | 0.852 | N/A |
SN, sensitivity; SP, specificity; ACC, accuracy.
(a) “N/A” means that the information was not shown in the research.
(b) LASSO, least absolute shrinkage and selection operator; GA-RF, genetic algorithm-random forest. Bold font indicates the results of this study.
(c) Bi-directional convolutional recurrent neural network architecture.