Markus Huber1, Markus M Luedi1, Gerrit A Schubert2, Christian Musahl2, Angelo Tortora2, Janine Frey3, Jürgen Beck4,5, Luigi Mariani6, Emanuel Christ7, Lukas Andereggen2,8.
Abstract
BACKGROUND: First-line surgery for prolactinomas has gained increasing acceptance, but its indication remains controversial. Accurate prediction of unfavorable outcomes after upfront surgery in prolactinoma patients is therefore critical for the triage of therapy and for interdisciplinary decision-making.
Keywords: dopamine agonists; long-term outcome; machine learning; prediction modeling; primary surgical therapy; prolactinoma
Year: 2022 PMID: 35250868 PMCID: PMC8888454 DOI: 10.3389/fendo.2022.810219
Source DB: PubMed Journal: Front Endocrinol (Lausanne) ISSN: 1664-2392 Impact factor: 5.555
Patients’ characteristics at diagnosis.
| Characteristics | All patients |
|---|---|
| Age at diagnosis (years; | 32.0 [27.0;42.0] |
| BMI (kg/m2; | 26.4 (5.59) |
| Sex (female; | 71 (82.6%) |
| Macroadenoma ( | 41 (47.7%) |
| Secondary hypogonadism ( | 53 (76.8%) |
| Secondary hypothyroidism ( | 5 (6.25%) |
| Secondary hypocorticism ( | 3 (4.05%) |
| Cavernous sinus invasion ( | 17 (19.8%) |
| Serum prolactin levels ( | 199 [97.6;443] |
Data availability is indicated for each variable. Categorical variables are presented with counts and percentages; continuous variables are presented with median and interquartile range (IQR).
Patients’ characteristics at early (30 days postoperatively) and long-term follow-up.
| Characteristics | Early Follow-up | Long-term Follow-up |
|---|---|---|
| BMI (kg/m2) | 25.0 [21.4;28.7] (N=63) | 25.8 [21.3;29.0] (N=73) |
| Secondary hypocorticism | 3/75 (4.00%) | 3/84 (3.57%) |
| Secondary hypogonadism | 33/52 (63.5%) | 13/48 (27.1%) |
| Secondary hypothyroidism | 4/76 (5.26%) | 8/85 (9.41%) |
| Serum prolactin levels ( | 15.0 [7.33;72.8] (N=76) | 12.7 [7.60;20.4] (N=83) |
| DAs (i.e. Cabergoline) | 5/85 (5.88%) | 20/85 (23.5%) |
| DAs (i.e. Bromocriptine) | 14/85 (16.5%) | 11/85 (12.9%) |
| Outcomes | | |
| DA dependency [ | 19/85 (22.3%) | 31/85 (36.5%) |
| Control of hyperprolactinemia [ | 50/76 (65.8%) | 76/83 (91.6%) |
Data availability is indicated for each variable. Categorical variables are presented with counts and percentages; continuous variables are presented with median and interquartile range (IQR).
Figure 1. Hyperparameter tuning in our set of machine learning classifiers. The impact of varying a single hyperparameter from its default value on the area under the receiver operating characteristic curve (AUROC) is illustrated for a selection of hyperparameters in each algorithm (shown on the ordinate). Each hyperparameter is sampled 50 times and its performance is assessed within a repeated cross-validation sampling (three-fold, four repeats), resulting in an AUROC distribution, which is illustrated with a box-and-whisker plot. The outcome was dependence on dopamine agonists at long-term follow-up. For comparison, the range of AUROC values derived using the default hyperparameter settings is shown as DEFAULT in each panel. Because of the repeated cross-validation sampling, the default settings also yield an AUROC distribution, even though the hyperparameters are fixed.
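The tuning scheme described in this caption can be sketched in a few lines of standard-library Python. This is an illustration only, not the authors' pipeline: the data are synthetic, the classifier is a hypothetical one-feature k-nearest-neighbour scorer, and the names `repeated_stratified_folds`, `knn_score`, and `cv_auroc` are assumptions introduced here. Only the resampling layout (three folds, four repeats, 50 random draws of one hyperparameter) follows the caption.

```python
import random
import statistics


def auroc(y_true, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_stratified_folds(y, n_folds=3, n_repeats=4, seed=0):
    """Yield (train, test) index lists; each class is spread evenly over the folds."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        folds = [[] for _ in range(n_folds)]
        for label in sorted(set(y)):
            idx = [i for i, v in enumerate(y) if v == label]
            rng.shuffle(idx)
            for position, i in enumerate(idx):
                folds[position % n_folds].append(i)
        for k in range(n_folds):
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, folds[k]


def knn_score(x_train, y_train, x, k):
    """Hypothetical base learner: share of positives among the k nearest training points."""
    nearest = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x))[:k]
    return sum(y_train[i] for i in nearest) / len(nearest)


def cv_auroc(x, y, k):
    """One AUROC value per resampling iteration (3 folds x 4 repeats = 12 values)."""
    values = []
    for train, test in repeated_stratified_folds(y):
        x_tr = [x[i] for i in train]
        y_tr = [y[i] for i in train]
        scores = [knn_score(x_tr, y_tr, x[i], k) for i in test]
        values.append(auroc([y[i] for i in test], scores))
    return values


# Synthetic one-feature data set (NOT the study data).
rng = random.Random(1)
y = [1] * 31 + [0] * 54
x = [rng.gauss(1.0 if label else 0.0, 1.0) for label in y]

# Random search: draw the hyperparameter k 50 times and keep each AUROC distribution.
trials = []
for _ in range(50):
    k = rng.randint(1, 25)
    trials.append((k, cv_auroc(x, y, k)))

best_k, best_values = max(trials, key=lambda t: statistics.median(t[1]))
print(best_k, round(statistics.median(best_values), 3))
```

Each entry of `trials` holds the 12 AUROC values that a single box in such a plot would summarise. The same folds are reused for every draw, so differences between settings are not driven by different splits.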
Figure 2. Relationship between two performance metrics in a set of supervised classification algorithms, obtained by randomly sampling two hyperparameters in each algorithm (N=500 samples). The area under the receiver operating characteristic curve (AUROC) is shown on the abscissa and the corresponding Matthews correlation coefficient (MCC) on the ordinate. The outcomes are (A) dependency on dopamine agonists (DA) at long-term follow-up and (B) successful control of hyperprolactinemia at early follow-up. For illustration purposes, a locally weighted scatterplot smoothing (LOESS) curve with its associated 95% confidence interval is shown for each classification algorithm.
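To make the two metrics compared in this figure concrete, the following minimal sketch computes both from scratch. The labels, scores, and the 0.5 decision threshold are invented for illustration and are assumptions, not values from the study. AUROC depends only on how the scores rank the cases, whereas MCC depends on the thresholded class predictions, which is why the two can diverge.

```python
import math


def auroc(y_true, scores):
    """AUROC via pairwise ranking: share of (positive, negative) pairs ordered correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mcc(y_true, y_pred):
    """Matthews correlation coefficient from the four confusion-matrix counts."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when a row or column of the confusion matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Invented example: 3 positive and 5 negative cases with predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(auroc(y_true, scores))  # 14 of 15 pairs ranked correctly
print(mcc(y_true, y_pred))    # (2*4 - 1*1) / sqrt(3*3*5*5) = 7/15
```

In this toy case the ranking is nearly perfect (AUROC of 14/15), yet one false positive and one false negative at the chosen threshold pull the MCC down to 7/15.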
Figure 3. Area under the receiver operating characteristic curve (AUROC) and Matthews correlation coefficient (MCC) values for the outcomes at early and long-term follow-up. Median and 95% confidence intervals are shown; the latter were derived from a repeated cross-validation sampling (three-fold, 100 repeats). For each machine learning algorithm, two influential hyperparameters (refer to ) were sampled 100 times and the hyperparameter settings resulting in the best AUROC performance were selected.
Performance metrics of a stacked super learner combining the outcome predictions of the individual classifiers (referred to as base learners; see the Methods section).
| Outcome | AUROC | MCC | SENS | SPEC | PPV | NPV |
|---|---|---|---|---|---|---|
| Dopamine agonist dependency | | | | | | |
| Long-term | 0.97 (0.92–1.00) | 0.85 (0.60–1.00) | 0.94 (0.83–1.00) | 0.91 (0.64–1.00) | 0.95 (0.82–1.00) | 0.91 (0.75–1.00) |
| Early-term | 0.80 (0.57–0.94) | 0.38 (−0.08 to 0.77) | 0.89 (0.73–1.00) | 0.46 (0.14–0.86) | 0.86 (0.77–0.95) | 0.56 (0.15–1.00) |
| Control of hyperprolactinemia | | | | | | |
| Long-term | 0.80 (0.58–0.97) | 0.11 (−0.12 to 0.69) | 0.17 (0.00–0.67) | 0.95 (0.80–1.00) | 0.23 (0.00–1.00) | 0.93 (0.88–0.96) |
| Early-term | 0.69 (0.50–0.83) | 0.27 (−0.02 to 0.57) | 0.53 (0.22–0.78) | 0.74 (0.53–0.94) | 0.52 (0.33–0.76) | 0.76 (0.64–0.88) |
Outcomes are dependency on dopamine agonists and successful control of hyperprolactinemia at early and long-term follow-up. Mean and 95% confidence intervals from a repeated cross-validation are shown.
AUROC, area under the receiver operating characteristic curve; MCC, Matthews correlation coefficient; SENS, sensitivity; SPEC, specificity; PPV, positive predictive value; NPV, negative predictive value.
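For readers less familiar with the threshold-based metrics abbreviated above, the sketch below shows how SENS, SPEC, PPV, and NPV follow from the four cells of a confusion matrix. The function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN rather than 0.
        return numerator / denominator if denominator else float("nan")

    return {
        "SENS": ratio(tp, tp + fn),  # share of true cases that are detected
        "SPEC": ratio(tn, tn + fp),  # share of non-cases correctly ruled out
        "PPV": ratio(tp, tp + fp),   # share of positive predictions that are correct
        "NPV": ratio(tn, tn + fn),   # share of negative predictions that are correct
    }


# Hypothetical counts for a single test fold.
metrics = diagnostic_metrics(tp=9, fp=2, tn=20, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because PPV and NPV depend on how common the outcome is in the evaluated sample, they can differ markedly between outcomes even when sensitivity and specificity are similar.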
Figure 4. Importance of the available set of variables in predicting early and long-term outcome. The variable importance metric is based on a permutation approach, in which the impact of perturbing the values of a given predictor on a particular performance metric [here, the area under the receiver operating characteristic curve (AUROC)] is assessed: the larger the decrease in AUROC, the more important the predictor is considered. The variable importance is assessed for each classification algorithm with optimized hyperparameters, and the importance values for each predictor are stacked to illustrate the overall importance of that predictor and to visualize the inter-algorithm agreement on its importance.
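The permutation approach described in this caption can be illustrated with a short, self-contained sketch. It is an assumption-laden toy, not the authors' implementation: the data are synthetic, the "model" is a hypothetical fixed scoring function that uses only the first predictor, and `permutation_importance` is a name introduced here. The sketch covers a single model, whereas the figure stacks importance values across several algorithms.

```python
import random


def auroc(y_true, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, rows, y, n_repeats=20, seed=0):
    """Mean drop in AUROC after shuffling each predictor column in turn."""
    rng = random.Random(seed)
    baseline = auroc(y, [predict(r) for r in rows])
    importance = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between predictor j and the outcome
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - auroc(y, [predict(r) for r in permuted]))
        importance.append(sum(drops) / n_repeats)
    return importance


def predict(row):
    """Hypothetical fitted model: the risk score is simply the first predictor."""
    return row[0]


# Synthetic data: predictor 0 separates the classes, predictor 1 is pure noise.
rng = random.Random(42)
y = [1] * 30 + [0] * 30
rows = [[label + rng.uniform(-0.3, 0.3), rng.gauss(0.0, 1.0)] for label in y]

importance = permutation_importance(predict, rows, y)
print([round(v, 3) for v in importance])
```

Shuffling the informative predictor destroys the ranking and produces a large AUROC drop, whereas shuffling the noise predictor leaves the predictions unchanged and gives an importance of exactly zero.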