Andreas Schulz, Daniela Zöller, Stefan Nickels, Manfred E Beutel, Maria Blettner, Philipp S Wild, Harald Binder.
Abstract
BACKGROUND: A growing number of observational studies do not focus only on single biomarkers for predicting an outcome event, but address questions in a multivariable setting. For example, when quantifying the added value of new biomarkers in addition to established risk factors, the aim might be to rank several new markers with respect to their prediction performance. This makes it important to consider the correlation structure of the markers when planning such a study. Because of this complexity, a simulation approach may be required to adequately assess sample size or other aspects, such as the choice of a performance measure.
Keywords: AUC; Added value; Biomarker; Brier score; Nagelkerke R²; Planning; Random forest; Risk prediction; Simulation
Year: 2017 PMID: 28610631 PMCID: PMC5470184 DOI: 10.1186/s12874-017-0364-y
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Fig. 1 Structure of the simulation. The left side illustrates the determination of reference values, the right side the structure of the simulation procedure. *This step is skipped in the first part of the simulation, where the best simulation model was investigated; it is used only to determine the required sample size of a pilot study
Logistic regression models: results from population data
| Marker | OR per-SD | Lower 95%CI | Upper 95%CI | p-value | Brier score diff. | Increase in AUC | Increase in R² |
|---|---|---|---|---|---|---|---|
| MR-proADM | 1.18 | 1.08 | 1.30 | 0.00035 | -0.00048 | 0.00232 | 0.00349 |
| Nt-proBNP | 1.15 | 1.05 | 1.25 | 0.0017 | -0.00033 | 0.00151 | 0.00272 |
| hs-CRP | 1.08 | 1.00 | 1.17 | 0.044 | -0.00024 | 0.00084 | 0.00111 |
| CT-proAVP | 1.04 | 0.96 | 1.12 | 0.36 | 0.00000 | 0.00014 | 0.00022 |
| MR-proANP | 0.99 | 0.91 | 1.07 | 0.75 | 0.00000 | -0.00004 | 0.00003 |
All biomarkers were log-transformed. The models were adjusted for sex, age and BMI. The basic model had a Brier score of 0.163, an AUC of 0.759 and an R² of 0.227
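The added-value measures reported in this table can be outlined in code. The following is a minimal sketch on synthetic data, not the authors' implementation: the helper names (`fit_logistic`, `added_value` and so on) and the simulated `age` and `marker` variables are assumptions made for illustration, and a plain Newton-Raphson fit stands in for whatever software was actually used. The Brier score difference is taken as extended model minus basic model, so negative values indicate improvement, which matches the signs in the table.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        beta = beta + np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
    return beta

def predict(X, beta):
    return 1.0 / (1.0 + np.exp(-X @ beta))

def brier(y, p):
    return np.mean((y - p) ** 2)

def auc(y, p):
    """AUC from the Mann-Whitney rank statistic (assumes no tied predictions)."""
    ranks = np.empty(len(p))
    ranks[np.argsort(p)] = np.arange(1, len(p) + 1)
    n1 = y.sum()
    n0 = len(y) - n1
    return (ranks[y == 1].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)

def nagelkerke_r2(y, p):
    """Cox-Snell R-squared rescaled by its maximum attainable value."""
    n = len(y)
    ll_model = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    p0 = y.mean()
    ll_null = n * (p0 * np.log(p0) + (1 - p0) * np.log(1 - p0))
    cox_snell = 1 - np.exp(2 * (ll_null - ll_model) / n)
    return cox_snell / (1 - np.exp(2 * ll_null / n))

def added_value(X_base, marker, y):
    """Compare a basic model with the same model extended by one marker."""
    Xb = np.column_stack([np.ones(len(y)), X_base])
    Xe = np.column_stack([Xb, marker])
    pb = predict(Xb, fit_logistic(Xb, y))
    pe = predict(Xe, fit_logistic(Xe, y))
    return {
        "brier_diff": brier(y, pe) - brier(y, pb),
        "auc_increase": auc(y, pe) - auc(y, pb),
        "r2_increase": nagelkerke_r2(y, pe) - nagelkerke_r2(y, pb),
    }

# Synthetic example (hypothetical effect sizes, not the population data)
rng = np.random.default_rng(0)
n = 5000
age = rng.normal(size=n)
marker = rng.normal(size=n)
logit = -1.0 + 0.8 * age + 0.3 * marker
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)
result = added_value(age[:, None], marker, y)
print(result)
```

All three measures are computed in-sample here; a planning study might instead prefer cross-validated or bootstrap-corrected estimates.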
Mean ranks
| Ranking measure | MR-proADM | Nt-proBNP | hs-CRP |
|---|---|---|---|
| Brier score difference | 1.57 | 1.99 | 2.43 |
| Increase in AUC | 1.38 | 1.96 | 2.66 |
| Increase in R² | 1.45 | 1.86 | 2.69 |
Mean rank is calculated on 10000 bootstrap samples from population data. Only the top three markers were ranked
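The bootstrap mean rank described here can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the data are synthetic, the function names are hypothetical, only the Brier score difference is used as the ranking measure, and 200 bootstrap samples are drawn instead of 10000 to keep the run short.

```python
import numpy as np

def fitted_probs(X, y, n_iter=25):
    """In-sample fitted probabilities from a Newton-Raphson logistic fit."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        beta = beta + np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
    return 1.0 / (1.0 + np.exp(-X @ beta))

def brier_diffs(base, markers, y):
    """Brier score of each single-marker extended model minus the basic model."""
    Xb = np.column_stack([np.ones(len(y)), base])
    b0 = np.mean((y - fitted_probs(Xb, y)) ** 2)
    diffs = []
    for j in range(markers.shape[1]):
        Xe = np.column_stack([Xb, markers[:, j]])
        diffs.append(np.mean((y - fitted_probs(Xe, y)) ** 2) - b0)
    return np.array(diffs)

def bootstrap_mean_ranks(base, markers, y, n_boot, rng):
    """Rank the markers within each bootstrap sample, then average the ranks."""
    n = len(y)
    ranks = np.zeros((n_boot, markers.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        d = brier_diffs(base[idx], markers[idx], y[idx])
        # Rank 1 goes to the largest Brier score reduction (most negative difference)
        ranks[b] = np.argsort(np.argsort(d)) + 1
    return ranks.mean(axis=0)

# Synthetic example with three markers of decreasing (hypothetical) effect
rng = np.random.default_rng(1)
n = 2000
base = rng.normal(size=(n, 1))
markers = rng.normal(size=(n, 3))
logit = -1.0 + 0.5 * base[:, 0] + markers @ np.array([0.4, 0.2, 0.0])
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)
mean_ranks = bootstrap_mean_ranks(base, markers, y, n_boot=200, rng=rng)
print(mean_ranks)
```

A mean rank close to 1 indicates that a marker is ranked first in almost every bootstrap sample; values further from an integer reflect less stable rankings.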
Fig. 2 Simulated mean ranks. The mean rank is based on 10000 simulation runs with different methods of data generation. The dashed line represents the reference mean ranks from population data. Normal data: covariate data drawn from a multivariate normal distribution. Quantile data: covariate data drawn from the empirical distribution. GLM: modeling of the relationship with logistic regression. GAM: modeling of the relationship with generalized additive models
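The two covariate-generation schemes named in this caption could look roughly like the sketch below. The caption does not specify how the empirical distribution was combined with the marker correlation structure, so the Gaussian-copula mapping in `simulate_quantile` is an assumption, as are all function names, the synthetic `pilot` data and the coefficients in the logistic outcome model (the GLM variant only; a GAM is not shown).

```python
import numpy as np
from math import erf, sqrt

def simulate_normal(pilot, n, rng):
    """'Normal data': draw from a multivariate normal fitted to the pilot data."""
    mean = pilot.mean(axis=0)
    cov = np.cov(pilot, rowvar=False)
    return rng.multivariate_normal(mean, cov, size=n)

def simulate_quantile(pilot, n, rng):
    """'Quantile data' (assumed reading): correlated normal scores are mapped
    onto the empirical quantiles of each pilot column (Gaussian copula)."""
    k = pilot.shape[1]
    corr = np.corrcoef(pilot, rowvar=False)
    z = rng.multivariate_normal(np.zeros(k), corr, size=n)
    u = 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))  # standard normal CDF
    return np.column_stack([np.quantile(pilot[:, j], u[:, j]) for j in range(k)])

def simulate_outcome(X, intercept, coefs, rng):
    """Binary outcome from a logistic model (the GLM variant)."""
    p = 1.0 / (1.0 + np.exp(-(intercept + X @ coefs)))
    return (rng.random(len(p)) < p).astype(int)

# Synthetic, skewed and correlated "pilot" markers (hypothetical values)
rng = np.random.default_rng(2)
cov = np.array([[1.0, 0.5], [0.5, 1.0]])
pilot = np.exp(rng.multivariate_normal([0.0, 0.0], cov, size=500))

x_norm = simulate_normal(pilot, 1000, rng)
x_quant = simulate_quantile(pilot, 1000, rng)
# Markers enter the outcome model on the log scale
y_sim = simulate_outcome(np.log(x_quant), -1.0, np.array([0.2, 0.1]), rng)
```

Unlike the multivariate normal draw, the quantile-based draw cannot produce values outside the observed range of the pilot data, which is one reason the two schemes can yield different simulated rankings for skewed markers.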
Fig. 3 Simulated absolute values. Absolute values of the differences were generated with 10000 simulation runs and with different models. Dashed lines represent the reference values, i.e. the differences between the values of the basic model and the values of the extended model in population data. GLM: generalized linear models. GAM: generalized additive models
Fig. 4 Pilot sample size investigation. Mean ranks for MR-proADM with the random forest approach and different pilot sample sizes. For every step, 1000 simulation runs were used. The dashed line represents the reference mean rank from population data