Congxin Dai, Yanghua Fan, Yichao Li, Xinjie Bao, Yansheng Li, Mingliang Su, Yong Yao, Kan Deng, Bing Xing, Feng Feng, Ming Feng, Renzhi Wang.
Abstract
Background: Some patients with acromegaly do not reach the remission standard in the short term after surgery but achieve remission without additional postoperative treatment during long-term follow-up; this phenomenon is defined as postoperative delayed remission (DR). DR may complicate the interpretation of surgical outcomes in patients with acromegaly and interfere with decision-making regarding postoperative adjuvant therapy. Objective: We aimed to develop and validate machine learning (ML) models for predicting DR in acromegaly patients who have not achieved remission within 6 months of surgery.
Keywords: LIME; SHAP; acromegaly; delayed remission; machine learning
Year: 2020 PMID: 33042013 PMCID: PMC7525125 DOI: 10.3389/fendo.2020.00643
Source DB: PubMed Journal: Front Endocrinol (Lausanne) ISSN: 1664-2392 Impact factor: 5.555
Patients' characteristics of the training and the test datasets.
| Characteristic | Total | Training dataset | Test dataset | p-value |
| --- | --- | --- | --- | --- |
| Number of patients | 306 | 244 | 62 | |
| Age (mean ± SD) | 37.69 ± 11.90 | 38.37 ± 11.35 | 35.03 ± 13.63 | 0.05 |
| Gender | | | | |
| Female | 162 | 129 | 33 | 0.96 |
| Male | 144 | 115 | 29 | |
| Tumor size | | | | |
| Microadenoma | 54 | 45 | 9 | 0.469 |
| Macroadenoma | 252 | 199 | 53 | |
| Knosp grade | | | | |
| Grade 0 | 67 | 57 | 10 | 0.460 |
| Grade 1 | 33 | 27 | 6 | |
| Grade 2 | 50 | 36 | 14 | |
| Grade 3 | 99 | 77 | 22 | |
| Grade 4 | 57 | 47 | 10 | |
| Hypertension | | | | |
| No | 199 | 155 | 44 | 0.272 |
| Yes | 107 | 89 | 18 | |
| Fasting blood glucose | | | | |
| Normal | 119 | 94 | 25 | 0.234 |
| Impaired glucose tolerance | 109 | 92 | 17 | |
| Diabetes | 78 | 58 | 20 | |
| Pre-rGH (ng/ml) | 27.50 (9.05–66.00) | 28.37 (9.80–67.35) | 22.35 (8.20–63.20) | 0.337 |
| Pre-IGF-1 (ng/ml) | 921.31 ± 277.81 | 918.67 ± 278.95 | 931.74 ± 275.30 | 0.741 |
| Pre-nGH (ng/ml) | 17.60 (6.10–38.83) | 18.20 (6.40–39.83) | 14.40 (5.20–34.78) | 0.354 |
| Cavernous sinus invasion | | | | |
| No | 204 | 166 | 38 | 0.315 |
| Yes | 102 | 78 | 24 | |
| Tumor texture | | | | |
| Soft | 239 | 195 | 44 | 0.128 |
| Firm | 67 | 49 | 18 | |
| Ki-67 | | | | |
| <3% | 211 | 169 | 42 | 0.817 |
| ≥3% | 95 | 75 | 20 | |
| Post-1w rGH (ng/ml) | 3.99 (1.70–10.45) | 4.20 (1.70–10.95) | 3.40 (1.90–9.79) | 0.845 |
| Post-1w IGF-1 (ng/ml) | 701.50 (559.25–908.00) | 693.00 (554.75–893.5) | 730.33 (561.78–969.25) | 0.367 |
| Post-1w nGH (ng/ml) | 2.80 (1.10–6.65) | 2.87 (1.09–6.74) | 2.39 (1.14–6.90) | 0.914 |
| Post-6m rGH (ng/ml) | 3.45 (1.80–8.10) | 3.50 (1.65–7.95) | 3.40 (1.98–8.60) | 0.772 |
| Post-6m IGF-1 (ng/ml) | 524.50 (367.00–732.25) | 535.70 (367.25–739.50) | 499.00 (364.50–726.25) | 0.772 |
| Post-6m nGH (ng/ml) | 1.99 (0.90–4.56) | 1.99 (0.90–4.37) | 1.99 (0.97–5.74) | 0.541 |
| Delayed remission | | | | |
| No | 251 | 198 | 53 | 0.427 |
| Yes | 55 | 46 | 9 | |
SD, standard deviation; MTD, maximal tumor diameter; pre-, preoperative; rGH, random GH; nGH, nadir GH; post-1w, postoperative 1 week; post-6m, postoperative 6 months.
Continuous features consistent with a normal distribution are presented as mean ± standard deviation; otherwise, the median (lower–upper quartile) is given. The chi-square or Fisher's exact test was used to compare categorical features.
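As an illustration of the categorical comparison described in the note above, the following minimal sketch runs a Pearson chi-square test on the female/male counts of the training and test datasets. It assumes no continuity correction was applied, which this section does not state; the function name `chi_square_2x2` is hypothetical.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: female, male. Columns: training dataset, test dataset (counts from the table).
chi2, p = chi_square_2x2([[129, 33], [115, 29]])
print(f"chi2 = {chi2:.4f}, p = {p:.2f}")
```

Under that assumption the sketch gives a p-value that rounds to 0.96, in line with the value listed for the female/male split.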
Univariate analysis of the clinical characteristics of patients in the training and the test datasets.
| Characteristic | Training, non-DR (n = 198) | Training, DR (n = 46) | p-value (training) | Test, non-DR (n = 53) | Test, DR (n = 9) | p-value (test) |
| --- | --- | --- | --- | --- | --- | --- |
| Age (mean ± SD) | 37.62 ± 10.78 | 41.57 ± 13.19 | 0.033 | 35.17 ± 13.77 | 34.22 ± 13.56 | 0.849 |
| Gender | | | | | | |
| Female | 108 | 21 | 0.276 | 29 | 4 | 0.568 |
| Male | 90 | 25 | | 24 | 5 | |
| Tumor size | | | | | | |
| Microadenoma | 29 | 16 | 0.002 | 8 | 1 | 0.754 |
| Macroadenoma | 169 | 30 | | 45 | 8 | |
| Knosp grade | | | | | | |
| Grade 0 | 38 | 19 | <0.001 | 8 | 2 | 0.453 |
| Grade 1 | 17 | 10 | | 4 | 2 | |
| Grade 2 | 28 | 8 | | 12 | 2 | |
| Grade 3 | 71 | 6 | | 19 | 3 | |
| Grade 4 | 44 | 3 | | 10 | 0 | |
| Hypertension | | | | | | |
| No | 132 | 23 | 0.034 | 39 | 5 | 0.271 |
| Yes | 66 | 23 | | 14 | 4 | |
| Fasting blood glucose | | | | | | |
| Normal | 78 | 16 | 0.454 | 22 | 3 | 0.876 |
| Impaired glucose tolerance | 71 | 21 | | 14 | 3 | |
| Diabetes | 49 | 9 | | 17 | 3 | |
| Pre-rGH (ng/ml) | 33.60 (12.07–77.55) | 11.06 (4.48–30.70) | 0.001 | 22.60 (8.20–65.65) | 20.00 (10.35–26.35) | 0.589 |
| Pre-IGF-1 (ng/ml) | 939.14 ± 253.71 | 830.52 ± 358.63 | 0.057 | 922.95 ± 274.32 | 983.56 ± 291.94 | 0.546 |
| Pre-nGH (ng/ml) | 20.45 (8.38–43.78) | 9.43 (2.21–21.60) | 0.001 | 14.40 (5.20–37.55) | 14.40 (8.65–21.80) | 0.920 |
| Cavernous sinus invasion | | | | | | |
| No | 129 | 37 | 0.045 | 31 | 7 | 0.272 |
| Yes | 69 | 9 | | 22 | 2 | |
| Tumor texture | | | | | | |
| Soft | 156 | 39 | 0.361 | 36 | 8 | 0.200 |
| Firm | 42 | 7 | | 17 | 1 | |
| Ki-67 | | | | | | |
| <3% | 130 | 39 | 0.011 | 36 | 6 | 0.941 |
| ≥3% | 68 | 7 | | 17 | 3 | |
| Post-1w rGH (ng/ml) | 5.39 (2.18–14.85) | 1.29 (0.78–2.23) | 0.001 | 4.33 (2.30–10.10) | 1.30 (1.10–2.65) | 0.004 |
| Post-1w IGF-1 (ng/ml) | 756.81 ± 254.28 | 611.20 ± 251.13 | 0.001 | 738.66 (562.68–977.00) | 704.00 (471.50–939.00) | 0.478 |
| Post-1w nGH (ng/ml) | 3.91 (1.61–8.68) | 0.83 (0.46–1.63) | <0.001 | 2.92 (1.41–7.82) | 1.06 (0.93–2.27) | 0.033 |
| Post-6m rGH (ng/ml) | 4.50 (2.58–9.50) | 1.05 (0.275–2.15) | <0.001 | 3.70 (2.39–10.25) | 0.90 (0.45–4.20) | 0.033 |
| Post-6m IGF-1 (ng/ml) | 628.28 ± 248.67 | 332.63 ± 87.07 | <0.001 | 529.00 (379.00–782.00) | 384.00 (286.00–391.50) | 0.011 |
| Post-6m nGH (ng/ml) | 2.57 (1.33–5.09) | 0.39 (0.12–1.02) | <0.001 | 2.10 (1.33–6.45) | 0.71 (0.24–2.41) | 0.009 |
SD, standard deviation; MTD, maximal tumor diameter; pre-, preoperative; rGH, random GH; nGH, nadir GH; post-1w, postoperative 1 week; post-6m, postoperative 6 months.
Continuous features consistent with a normal distribution are presented as mean ± standard deviation; otherwise, the median (lower–upper quartile) is given. The chi-square or Fisher's exact test was used to compare categorical features.
The performance of multiple resampling methods on the six ML models.
| None | 0.4444 | 0.6667 | 0.5556 | 0.6667 | 0.5556 | 0.4445 |
| SMOTE | 0.5556 | 0.6667 | 0.5556 | 0.5556 | 0.5556 | 0.5556 |
| SMOTETomek | 0.5556 | 0.6667 | 0.5556 | 0.5556 | 0.5556 | 0.4445 |
| SMOTEENN | 0.5556 | 0.6667 | 0.6667 | 0.7778 | 0.6667 | 0.5556 |
SP, specificity; ML, machine learning; LR, logistic regression; RF, random forest.
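SMOTE, the oversampling step shared by the three resampling rows above, creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours; SMOTETomek and SMOTEENN additionally clean the resampled data with Tomek links or edited nearest neighbours. The sketch below is a simplified pure-Python illustration of the interpolation step only, not the implementation used in the study. The function name `smote` and the sample values are hypothetical.

```python
import random

def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating toward nearest neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to the base.
        others = [s for s in minority if s is not base]
        others.sort(key=lambda s: sum((x - y) ** 2 for x, y in zip(base, s)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([x + gap * (y - x) for x, y in zip(base, neighbour)])
    return synthetic

# Hypothetical two-feature minority class (illustrative values, not study data).
minority = [[1.0, 300.0], [1.3, 330.0], [0.8, 280.0], [2.1, 390.0], [0.9, 310.0]]
new_points = smote(minority, n_synthetic=10, k=3)
print(len(new_points), new_points[0])
```

Because every synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the range already covered by the observed data. In practice, features on very different scales would usually be standardized before computing distances.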
Figure 1. Receiver operating characteristic curves showing the delayed remission predictive performance of six machine learning algorithms based on the selected significant features in the training (A) and test (B) datasets. LR, logistic regression; GBDT, gradient boosting decision tree; XGBoost, extreme gradient boosting; AdaBoost, adaptive boosting; CatBoost, categorical boosting.
The best performance of the six ML algorithms in the test dataset.
| Feature number | 18 | 14 | 7 | 15 | 15 | 9 |
| AUC | 0.7945 | 0.7013 | 0.8061 | 0.8260 | 0.8239 | 0.7338 |
| Threshold | 0.00008 | 1 | 0.9997 | 0.6041 | 0.2743 | 0.9 |
| Youden index | 0.3858 | 0.4025 | 0.4759 | 0.6436 | 0.4025 | 0.3292 |
| ACC | 0.7903 | 0.7258 | 0.7097 | 0.7742 | 0.7258 | 0.7419 |
| Specificity | 0.8302 | 0.7358 | 0.6981 | 0.7547 | 0.7358 | 0.7736 |
| Sensitivity | 0.5556 | 0.6667 | 0.7778 | 0.8889 | 0.6667 | 0.5556 |
| PPV | 0.3571 | 0.3 | 0.3043 | 0.381 | 0.3 | 0.2941 |
| NPV | 0.9167 | 0.9286 | 0.9487 | 0.9756 | 0.9286 | 0.9111 |
| PLR | 3.2716 | 2.5238 | 2.5764 | 3.6239 | 2.5238 | 2.4537 |
| NLR | 0.5354 | 0.453 | 0.3183 | 0.1472 | 0.453 | 0.5745 |
LR, logistic regression; RF, random forest; AUC, area under the curve; ACC, accuracy; PPV, positive predictive value; NPV, negative predictive value; PLR, positive likelihood ratio; NLR, negative likelihood ratio.
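The derived metrics in the table follow from a single confusion matrix. As a worked check, the sketch below recomputes them for the column with AUC 0.8260. The counts (8 true positives, 1 false negative, 40 true negatives, 13 false positives) are not given in the source; they are inferred here from the reported sensitivity and specificity and from a test dataset of 9 delayed-remission and 53 non-delayed-remission patients. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "ACC": (tp + tn) / (tp + fn + tn + fp),
        "Sensitivity": sensitivity,
        "Specificity": specificity,
        "Youden index": sensitivity + specificity - 1,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "PLR": sensitivity / (1 - specificity),
        "NLR": (1 - sensitivity) / specificity,
    }

# Inferred counts (an assumption, see the text above), not values stated in the source.
metrics = diagnostic_metrics(tp=8, fn=1, tn=40, fp=13)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts, every metric agrees with that column to four decimal places, which suggests the Youden index and the likelihood ratios were computed with the usual definitions: sensitivity + specificity − 1, sensitivity / (1 − specificity), and (1 − sensitivity) / specificity.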
Figure 2. Feature importance ranking based on permutation importance (A) and SHapley Additive exPlanations (SHAP) values (B,C) in the XGBoost model. (A) The features are ranked based on the permutation importance method in the XGBoost model. (B) The features are ranked according to the sum of the SHAP values for all patients, and the SHAP values are used to show the distribution of the effect of each feature on the XGBoost model outputs. Red indicates that the value of a feature is high, and blue indicates that the value of a feature is low. The x-axis indicates the effect of SHAP values on the model output. The larger the value on the x-axis, the greater the probability of delayed remission. (C) Standard bar charts were drawn and sorted using the average absolute SHAP value of each feature in the XGBoost model.
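Permutation importance, used for panel (A), measures how much a model's score drops when the values of one feature are shuffled across patients. The sketch below illustrates the idea with a hypothetical rule-based stand-in for the classifier and made-up data; it uses accuracy as the score, which is an assumption, since the scoring metric is not stated here.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical stand-in model: predicts delayed remission (1) when feature 0 is low.
def toy_model(row):
    return int(row[0] < 400)

# Illustrative data only: feature 0 drives the label, feature 1 is ignored by the model.
X = [[330, 5], [350, 1], [380, 9], [300, 4], [600, 2], [650, 8], [700, 3], [550, 7]]
y = [toy_model(row) for row in X]
importances = permutation_importance(toy_model, X, y)
print(importances)
```

The ignored feature receives an importance of exactly zero, while the feature the model relies on shows a positive drop in accuracy. Unlike SHAP values, this gives one global number per feature and no per-patient direction of effect.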
Univariate and multivariate analyses of the association between clinical features and delayed remission.
| Feature | Univariate OR | Univariate 95% CI | Univariate p-value | Multivariate OR | Multivariate 95% CI | Multivariate p-value |
| --- | --- | --- | --- | --- | --- | --- |
| Age | 1.023 | 0.998–1.049 | 0.067 | | | |
| Gender | 1.442 | 0.803–2.591 | 0.221 | | | |
| Tumor size | 0.386 | 0.198–0.755 | 0.005 | 0.539 | 0.189–1.535 | 0.247 |
| Knosp grade | 0.592 | 0.476–0.736 | <0.001 | 0.725 | 0.522–1.029 | 0.072 |
| Hypertension | 2.061 | 1.141–3.724 | 0.017 | 1.674 | 0.714–3.925 | 0.236 |
| Fasting blood glucose | 1.013 | 0.701–1.465 | 0.945 | | | |
| Pre-rGH (ng/ml) | 0.991 | 0.984–0.998 | 0.018 | 1.005 | 0.974–1.037 | 0.771 |
| Pre-IGF-1 (ng/ml) | 0.999 | 0.998–1.000 | 0.054 | | | |
| Pre-nGH (ng/ml) | 0.989 | 0.978–0.999 | 0.036 | 1.016 | 0.974–1.037 | 0.473 |
| Cavernous sinus invasion | 0.44 | 0.216–0.893 | 0.023 | 0.814 | 0.272–2.438 | 0.713 |
| Tumor texture | 0.554 | 0.248–1.238 | 0.150 | | | |
| Ki-67 (%) | 0.434 | 0.208–0.904 | 0.026 | 0.605 | 0.213–1.720 | 0.346 |
| Post-1w rGH (ng/ml) | 0.917 | 0.863–0.973 | 0.004 | 1.031 | 0.839–1.267 | 0.771 |
| Post-1w IGF-1 (ng/ml) | 0.998 | 0.996–0.999 | 0.001 | 1.000 | 0.998–1.002 | 0.661 |
| Post-1w nGH (ng/ml) | 0.903 | 0.837–0.974 | 0.008 | 1.003 | 0.762–1.320 | 0.985 |
| Post-6m rGH (ng/ml) | 0.488 | 0.379–0.627 | <0.001 | 0.615 | 0.437–0.866 | 0.005 |
| Post-6m IGF-1 (ng/ml) | 0.991 | 0.988–0.994 | <0.001 | 0.991 | 0.987–0.995 | <0.001 |
| Post-6m nGH (ng/ml) | 0.249 | 0.155–0.400 | <0.001 | 0.54 | 0.285–1.022 | 0.058 |
Figure 3. Results of local interpretable model-agnostic explanation (LIME) with XGBoost classifiers applied to two correctly predicted patients [one negative (non-delayed remission) and one positive (delayed remission)] and one incorrectly predicted patient (a non-delayed remission patient incorrectly predicted to have a high probability of delayed remission). The figure reveals the role of various features in the incidence of delayed remission in each patient. The first column represents the prediction probabilities of negative and positive results obtained from the classifiers. The second column shows the contributions made by the features included in the models to the probability. The third column displays the original data values of these features. (A) LIME explanation for patient 1 as true positive, (B) LIME explanation for patient 2 as true negative, and (C) LIME explanation for patient 3 as false positive.
Figure 4. Partial correlation plot of delayed remission probability based on post-6m IGF-1 (A) and post-6m rGH (B) in the XGBoost model. The y-axis represents the predicted probability compared with the baseline, and the x-axis represents the value of post-6m IGF-1 or post-6m rGH. The blue areas represent confidence intervals.
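Assuming the plot in Figure 4 is a partial dependence-style curve (the caption does not name the exact method), the underlying computation can be sketched as follows: one feature is forced to each value on a grid for every patient, and the model's predicted probabilities are averaged. The model `toy_predict_proba`, its coefficients, and the data rows are hypothetical stand-ins, and the curve here is not centred on a baseline as the figure's y-axis is.

```python
import math

def partial_dependence(predict_proba, rows, feature_index, grid):
    """Average predicted probability when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # overwrite only the feature of interest
            total += predict_proba(modified)
        curve.append(total / len(rows))
    return curve

# Hypothetical stand-in for a fitted classifier: the probability of delayed remission
# falls as post-6m IGF-1 (feature 0) and post-6m rGH (feature 1) rise.
def toy_predict_proba(row):
    igf1, rgh = row
    return 1.0 / (1.0 + math.exp(0.01 * (igf1 - 400.0) + 0.5 * (rgh - 2.0)))

# Illustrative rows (post-6m IGF-1 and post-6m rGH in ng/ml), not study data.
rows = [[330.0, 1.0], [520.0, 3.5], [610.0, 4.2], [380.0, 0.9], [700.0, 8.0]]
grid = [200.0, 400.0, 600.0, 800.0]
curve = partial_dependence(toy_predict_proba, rows, feature_index=0, grid=grid)
for value, prob in zip(grid, curve):
    print(f"post-6m IGF-1 = {value:.0f}: mean predicted probability = {prob:.3f}")
```

Subtracting the mean prediction on the unmodified rows from each point of the curve would give a baseline-relative version closer to the y-axis described in the caption.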