Luu Ho Thanh Lam, Ngan Thy Chu, Thi-Oanh Tran, Duyen Thi Do, Nguyen Quoc Khanh Le.
Abstract
Glioma is a central nervous system (CNS) neoplasm that arises from glial cells. Under the 2016 World Health Organization classification, lower-grade gliomas (LGGs) comprise grade II and III gliomas. Following the discovery of the suppression of negative immune regulation, immunotherapy has become a promising treatment for LGG patients. However, the therapy is not effective for all types of LGG, and tumor mutational burden (TMB) has been shown to be a potential biomarker for the susceptibility and prognosis of immunotherapy in these patients. Hence, predicting TMB could benefit brain cancer patients. In this study, we investigated the correlation between magnetic resonance imaging (MRI)-based radiomic features and TMB in LGG by applying machine learning methods. Six machine learning classifiers were examined on features selected by a genetic algorithm. Subsequently, a light gradient boosting machine (LightGBM) with 11 selected radiomics signatures was adopted for TMB classification. Our LightGBM model achieved an accuracy of 0.7936 with balanced sensitivity and specificity of 0.76 and 0.8107, respectively. To our knowledge, our study represents the best model for classification of TMB in LGG patients at present.
Keywords: genetic algorithm; lower-grade glioma; radiomics signature; tumor mutational burden
Year: 2022 PMID: 35884551 PMCID: PMC9324877 DOI: 10.3390/cancers14143492
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.575
Patient characteristics.
| Characteristic | Category | Training | Validation | p-value |
|---|---|---|---|---|
| Age | | 44.06 ± 14.00 | 49.76 ± 13.44 | 0.040399 * |
| Gender | Male | 26 | 23 | 0.088934 |
| | Female | 37 | 19 | |
| Histology | Astrocytoma | 21 | 11 | 0.127063 |
| | Oligoastrocytoma | 18 | 10 | |
| | Oligodendroglioma | 24 | 21 | |
| Grade | II | 26 | 20 | 0.262543 |
| | III | 37 | 22 | |
| Subtype | Classic-like | 2 | 3 | 0.09987 |
| | Codel | 13 | 15 | |
| | G-CIMP-high | 38 | 16 | |
| | G-CIMP-low | 2 | | |
| | Mesenchymal-like | 6 | 7 | |
| | PA-like | 2 | 1 | |
| Vital status | Dead | 13 | 13 | 0.117054 |
| | Alive | 29 | 50 | |
| IDH status | Mutant | 53 | 31 | 0.099583 |
| | Wildtype | 10 | 11 | |
| 1p_19q codeletion status | Codel | 13 | 14 | 0.073916 |
| | Non-codel | 50 | 28 | |
| MGMT promoter status | Methylated | 51 | 39 | 0.045106 * |
| | Unmethylated | 12 | 3 | |
| TMB group | TMB high | 25 | 18 | 0.37498 |
| | TMB low | 38 | 24 | |
* statistically significant with p < 0.05.
Comparative performance among different GA-based machine learning algorithms in predicting the TMB group.
| Algorithm | GA Features | Sensitivity | Specificity | Precision | Accuracy | Running Time (s) |
|---|---|---|---|---|---|---|
| LR | 13 | 0.64 ± 0.265 | 0.9 ± 0.079 | 0.8367 ± 0.162 | 0.7936 ± 0.132 | 0.159057 |
| SVM | 10 | 0.56 ± 0.15 | 0.8714 ± 0.177 | 0.7733 ± 0.228 | 0.7462 ± 0.133 | 0.050138 |
| RF | 6 | 0.64 ± 0.15 | 0.9179 ± 0.064 | 0.8833 ± 0.108 | 0.8089 ± 0.041 | 0.817011 |
| LDA | 4 | 0.56 ± 0.16 | 0.8714 ± 0.131 | 0.77 ± 0.131 | 0.7449 ± 0.056 | 0.125044 |
| LGBM | 11 | 0.72 ± 0.204 | 0.8893 ± 0.131 | 0.8367 ± 0.131 | 0.8218 ± 0.1 | 0.094271 |
| XGB | 7 | 0.6 ± 0.204 | 0.9 ± 0.009 | 0.8 ± 0.106 | 0.7808 ± 0.08 | 1.63018 |
GA: genetic algorithm; LR: logistic regression; RF: random forest; SVM: support vector machine; LDA: linear discriminant analysis; LGBM: light gradient boosting machine; XGB: extreme gradient boosting.
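The table reports how many features the genetic algorithm retained for each classifier. As a minimal sketch of GA-based feature selection, the code below evolves binary feature masks with tournament selection, one-point crossover, and bit-flip mutation. The function name `run_ga`, its default settings, and the toy fitness function are assumptions made for illustration; the source does not specify the GA configuration, and in practice the fitness would be something like a cross-validated classifier score on the selected features.

```python
import random


def run_ga(n_features, fitness, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)
    # Each individual is a mask: 1 keeps the feature, 0 drops it.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    def tournament():
        # Binary tournament: the fitter of two random individuals is a parent.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            child = list(p1)
            if rng.random() < crossover_rate:
                # One-point crossover between the two parents.
                cut = rng.randrange(1, n_features)
                child = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied independently to each gene.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        # Keep the best mask seen so far across generations.
        best = max(population + [best], key=fitness)
    return best


# Hypothetical stand-in for a cross-validated score: features 0, 3 and 7
# are "informative", and every selected feature costs a small penalty.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    hits = sum(1 for i, g in enumerate(mask) if g and i in INFORMATIVE)
    return hits - 0.1 * sum(mask)


best_mask = run_ga(n_features=10, fitness=toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

The penalty term in the toy fitness plays the role of a preference for small feature sets, which is one plausible reason the retained feature counts differ between classifiers.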
Comparative performance among different sampling strategies in predicting TMB group using LightGBM.
| Method | Sensitivity | Specificity | Precision | Accuracy |
|---|---|---|---|---|
| ADASYN | 0.72 ± 0.204 | 0.7571 ± 0.168 | 0.7010 ± 0.244 | 0.7449 ± 0.103 |
| BorderlineSMOTE | 0.8 ± 0.204 | 0.7143 ± 0.151 | 0.6573 ± 0.092 | 0.7462 ± 0.083 |
| RandomOversampler | 0.72 ± 0.219 | 0.8143 ± 0.148 | 0.7262 ± 0.129 | 0.7782 ± 0.127 |
| RandomUndersampler | 0.8 ± 0.4 | 0.7393 ± 0.15 | 0.6810 ± 0.169 | 0.7641 ± 0.069 |
| SMOTE | 0.8 ± 0.32 | 0.7714 ± 0.19 | 0.7219 ± 0.258 | 0.7808 ± 0.11 |
| SVMSMOTE | 0.76 ± 0.126 | 0.8107 ± 0.068 | 0.7952 ± 0.112 | 0.7936 ± 0.081 |
ADASYN: adaptive synthetic sampling; SMOTE: synthetic minority oversampling technique.
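Most of the strategies compared above are variants of SMOTE, which balances classes by interpolating between minority-class samples. The sketch below implements plain SMOTE only, not the SVMSMOTE variant, to show the core interpolation step. The function name `smote`, the two-feature toy data, and the choice of `k` are illustrative assumptions; the class counts (25 TMB-high, 38 TMB-low) are the training-set counts from the patient characteristics table.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Nearest minority neighbours of x, excluding x itself.
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        t = rng.random()  # position along the segment from x to nb
        synthetic.append([a + t * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Toy data: 25 minority points with two made-up features each.
data_rng = random.Random(1)
tmb_high = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(25)]

n_needed = 38 - 25  # synthetic samples needed to match the majority class
new_points = smote(tmb_high, n_needed)
print(len(tmb_high) + len(new_points))  # 38
```

Oversampling of this kind is normally applied to the training folds only, so that synthetic samples never leak into the data used for evaluation.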
Figure 1. LightGBM model interpretation. (A) SHAP analysis of the 11 optimal features, (B) correlation of the two top-ranked features with the TMB group. All model interpretation experiments were implemented in Python.
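SHAP attributes a prediction to individual features using Shapley values. As a minimal sketch of the underlying idea, the code below computes exact Shapley values for a tiny model by enumerating all feature subsets, replacing "absent" features with a baseline value. This is a simplification for illustration and is not the tree-specific algorithm the SHAP library uses for gradient-boosted models; the function name `shapley_values`, the linear toy model, and the zero baseline are assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to baseline."""
    n = len(x)

    def value(subset):
        # Features in the subset take their real value, the rest the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for r in range(n):
            # Weight of a coalition of size r that does not contain i.
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for s in combinations(others, r):
                s = set(s)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical linear model over three features.
def toy_model(z):
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[2]


phi = shapley_values(toy_model, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
print(phi)
```

For a linear model with a zero baseline, each value reduces to the coefficient times the feature value, and the values sum to the difference between the prediction at `x` and at the baseline.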
Figure 2. Comparative performance between training and validation data. (A) ROC curve, (B) PR curve.
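The area under the ROC curve (AUC) can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that definition, counting ties as one half. The function name `roc_auc` and the example labels and scores are made up for illustration and are not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = TMB high, 0 = TMB low.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.8, 0.4]
auc = roc_auc(labels, scores)
print(round(auc, 4))  # 0.8889
```

Here 8 of the 9 positive-negative pairs are ordered correctly, giving an AUC of 8/9. The pairwise loop is quadratic in the number of samples, which is acceptable for cohorts of this size.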
Comparison among different studies on TMB prediction.
| Study | Method Summary | Kind of Cancer | Result |
|---|---|---|---|
| Jain et al. | Machine learning algorithm, Image2TMB, integrated three deep learning models. | Lung cancer | auPRC = 0.92 |
| Shi et al. | Deep learning model is based on the ResNet18 architecture. | Lung cancer | AUC = 0.64 |
| Shimada et al. | Convolutional neural network (CNN)-based algorithm. | Colorectal cancer | AUC = 0.934 |
| Tang et al. | LASSO regression selected features. Nomogram model predicted TMB. | Bladder cancer | AUC = 0.853 |
| Liu et al. | Nomogram model predicted TMB. | Lower-grade glioma | AUC = 0.736 |
| The proposed study | The genetic algorithm selected radiomics signatures. LGBM algorithm predicted TMB. | Lower-grade glioma | AUC = 0.7875 |
Among these studies, only Liu et al. predicted TMB in LGG patients; the proposed study achieved a higher AUC (0.7875 vs. 0.736).
Figure 3. An example of MRI segmentation. The enhancing part of the tumor core (ET) in yellow, the non-enhancing part of the tumor core (NET) in red, and the peritumoral edema (ED) in green.
Hyperparameters of machine learning algorithms in predicting the TMB group.
| Algorithm | Optimal Hyperparameters |
|---|---|
| Logistic Regression | solver = 'saga', C = 2.015990003658406, penalty = 'l1' |
| Random Forest | n_estimators = 5, min_samples_split = 6, min_samples_leaf = 3, max_features = 'auto', max_depth = 30, bootstrap = False |
| Support Vector Machine | kernel = 'rbf', gamma = 1 × 10⁻⁴, C = 10 |
| Linear Discriminant Analysis | solver = 'svd' |
| Light GBM | learning_rate = 0.005, num_leaves = 15, max_depth = 25, min_data_in_leaf = 15, feature_fraction = 0.6, bagging_fraction = 0.6 |
| XGBoost | max_depth = 1, gamma = 9, colsample_bytree = 0.5, min_child_weight = 1 |
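The hyperparameters in the table can be collected into a single configuration, as sketched below. The values are copied from the table; the mapping to scikit-learn, LightGBM, and XGBoost estimator classes is an assumption, since the source does not name the implementations used, and `HYPERPARAMETERS`, `ESTIMATORS`, and `build_model` are hypothetical names. Imports are deferred so the configuration can be inspected without those libraries installed.

```python
import importlib

# Hyperparameters copied from the table, keyed by algorithm abbreviation.
HYPERPARAMETERS = {
    "LR": {"solver": "saga", "C": 2.015990003658406, "penalty": "l1"},
    "RF": {"n_estimators": 5, "min_samples_split": 6, "min_samples_leaf": 3,
           "max_features": "auto", "max_depth": 30, "bootstrap": False},
    "SVM": {"kernel": "rbf", "gamma": 1e-4, "C": 10},
    "LDA": {"solver": "svd"},
    "LGBM": {"learning_rate": 0.005, "num_leaves": 15, "max_depth": 25,
             "min_data_in_leaf": 15, "feature_fraction": 0.6,
             "bagging_fraction": 0.6},
    "XGB": {"max_depth": 1, "gamma": 9, "colsample_bytree": 0.5,
            "min_child_weight": 1},
}

# Assumed estimator classes (module path, class name) for each algorithm.
ESTIMATORS = {
    "LR": ("sklearn.linear_model", "LogisticRegression"),
    "RF": ("sklearn.ensemble", "RandomForestClassifier"),
    "SVM": ("sklearn.svm", "SVC"),
    "LDA": ("sklearn.discriminant_analysis", "LinearDiscriminantAnalysis"),
    "LGBM": ("lightgbm", "LGBMClassifier"),
    "XGB": ("xgboost", "XGBClassifier"),
}


def build_model(name):
    """Import the assumed estimator class lazily and instantiate it."""
    module_name, class_name = ESTIMATORS[name]
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**HYPERPARAMETERS[name])
```

Two caveats when reusing these settings: newer scikit-learn releases may reject `max_features = 'auto'` for random forests (for classifiers it was equivalent to `'sqrt'`), and in LightGBM `bagging_fraction` only takes effect when a nonzero bagging frequency is also set.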