Abstract
Over the past few decades, interest in biomarkers to enhance predictive modeling has soared. Methodology for evaluating these has also been an active area of research. There are now several performance measures available for quantifying the added value of biomarkers. This commentary provides an overview of methods currently used to evaluate new biomarkers, describes their strengths and limitations, and offers some suggestions on their use.
Keywords: Biomarkers; Calibration; Clinical utility; Model fit; Reclassification
Year: 2018 PMID: 31093563 PMCID: PMC6460632 DOI: 10.1186/s41512-018-0037-2
Source DB: PubMed Journal: Diagn Progn Res ISSN: 2397-7523
Summary of performance measures for quantifying added value
| Measure | Advantages | Disadvantages |
|---|---|---|
| Likelihood-based measures | Reflects probability of obtaining the observed data | Based on assumed model |
| Likelihood ratio (LR), change in AIC or BIC | The LR test is the uniformly most powerful test for nested models. The AIC and BIC can be used to assess non-nested models. | While powerful, statistical association or model improvement may not be of clinical importance. |
| Discrimination | Assesses separation of cases and non-cases | Only one component of model fit |
| Difference in ROC curves, AUC, | Assesses discrimination between those with and without outcome of interest across the whole range of a continuous predictor or score. Useful for classification | Based on ranks only. Does not assess calibration. Differences may not be of clinical importance. |
| Clinical risk reclassification | Examines difference in assigning to clinically important risk strata | Strata should be pre-defined. Loses information if strata are not clinically important |
| Reclassification calibration statistic | Assesses calibration within cross-classified risk strata | A test for each model is needed |
| Categorical NRI | Can assess changes in important risk strata. Cases and non-cases can be considered separately | Depends on the number of categories and cut points used |
| NRI( | Nice statistical properties. Does not vary by event rate in the data | May not be clinically relevant |
| Conditional NRI | Indicates improvement within clinically important risk subgroups | Biased in its crude form, and a correction based on the full data is needed. |
| Category-free measures | Does not require cut points | May lose clinical intuition |
| Brier score | Proper scoring rule | May be difficult to interpret; the maximum value depends on incidence of the outcome. |
| NRI(0) | Continuous, does not depend on categories | Based on ranks only. Measure of association rather than model improvement. Behavior may be erratic if the new predictor is not normally distributed. |
| IDI | Nice statistical properties. Related to the difference in model | Depends on event rate. Values are low and may be difficult to interpret. |
| Decision analytics | Estimates clinical impact of using model | Not a direct estimate of model fit or improvement. Need reasonable estimates of decision thresholds |
| Decision curve | Displays the net benefit across a range of thresholds | Does not compare model improvement directly but clinical consequences of using the models for treatment decisions |
| Cost-benefit analysis | Compares costs and benefits of one model or treatment strategy vs. another | Need detailed estimates of costs and benefits of misclassification, including further diagnostic workup and treatments |
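Several measures in the table reduce to short formulas over the predicted risks from a baseline model and a model that adds the biomarker. The following is a minimal sketch under stated assumptions, not the commentary's own code: the function names, the toy outcomes and risks, and the cut points of 0.10 and 0.20 are all hypothetical and chosen only for illustration. It uses the standard textbook definitions of the Brier score, the rank-based AUC, the IDI, and the categorical NRI.

```python
from bisect import bisect_right

def brier_score(y, p):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def auc(y, p):
    """Rank-based AUC: share of case/non-case pairs ranked correctly (ties count 0.5)."""
    cases = [pi for yi, pi in zip(y, p) if yi == 1]
    controls = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if c > n else 0.5 if c == n else 0.0
               for c in cases for n in controls)
    return wins / (len(cases) * len(controls))

def idi(y, p_old, p_new):
    """Mean change in risk among cases minus mean change among non-cases."""
    def mean_change(label):
        d = [pn - po for yi, po, pn in zip(y, p_old, p_new) if yi == label]
        return sum(d) / len(d)
    return mean_change(1) - mean_change(0)

def categorical_nri(y, p_old, p_new, cuts):
    """Net share of cases moving up a stratum plus net share of non-cases moving down."""
    def net_up(label):
        # bisect_right maps a risk to its stratum index given sorted cut points.
        moves = [bisect_right(cuts, pn) - bisect_right(cuts, po)
                 for yi, po, pn in zip(y, p_old, p_new) if yi == label]
        up = sum(m > 0 for m in moves)
        down = sum(m < 0 for m in moves)
        return (up - down) / len(moves)
    return net_up(1) - net_up(0)

# Hypothetical toy data: 3 cases, 3 non-cases.
y = [1, 1, 1, 0, 0, 0]
p_old = [0.30, 0.15, 0.08, 0.25, 0.12, 0.04]   # baseline model risks (assumed)
p_new = [0.40, 0.22, 0.06, 0.18, 0.12, 0.03]   # risks with biomarker (assumed)
cuts = [0.10, 0.20]                            # illustrative strata boundaries

print("Brier old/new:", brier_score(y, p_old), brier_score(y, p_new))
print("AUC old/new:  ", auc(y, p_old), auc(y, p_new))
print("IDI:          ", idi(y, p_old, p_new))
print("NRI:          ", categorical_nri(y, p_old, p_new, cuts))
```

Changing `cuts` in this sketch changes the categorical NRI while leaving the IDI and AUC untouched, which illustrates the table's point that the categorical NRI depends on the number of categories and the cut points used.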
Recommendations
1. Test for model improvement using a likelihood-based or similar test.
2. Assess overall calibration and discrimination of each model.
3. If relevant risk strata are available, compute the risk reclassification table with clinical cut points or the overall prevalence, if relevant.
4. If relevant, consider bias-corrected conditional NRI to enhance screening of individuals at intermediate risk.
5. If pre-specified risk strata are not available, consider cost tradeoffs to develop appropriate cut points.
6. Consider decision analysis to assess the net benefit of using models for treatment decisions.
7. Validate
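Recommendations 1 and 6 can be illustrated with two small calculations. The sketch below is offered under stated assumptions rather than as the commentary's method: the function names, the log-likelihood values, and the toy risks are hypothetical. It assumes the two models are nested and differ by a single parameter, so the likelihood ratio statistic is referred to a chi-square distribution with 1 degree of freedom, and it uses the usual decision-curve definition of net benefit, true positives per person minus false positives per person weighted by the odds of the threshold.

```python
import math

def lr_test_1df(loglik_base, loglik_extended):
    """LR statistic and p-value for nested models differing by one parameter.

    For 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    """
    stat = 2.0 * (loglik_extended - loglik_base)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value

def net_benefit(y, p, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(y)
    tp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 1)
    fp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

# Hypothetical maximized log-likelihoods from two nested fits (assumed values).
stat, p_value = lr_test_1df(-520.4, -515.1)
print("LR statistic:", stat, "p-value:", p_value)

# Hypothetical toy data: 3 cases, 3 non-cases.
y = [1, 1, 1, 0, 0, 0]
p_old = [0.30, 0.15, 0.08, 0.25, 0.12, 0.04]   # baseline model risks (assumed)
p_new = [0.40, 0.22, 0.06, 0.18, 0.12, 0.03]   # risks with biomarker (assumed)

# A decision curve plots net benefit over a range of thresholds.
for t in (0.05, 0.10, 0.20):
    treat_all = net_benefit(y, [1.0] * len(y), t)
    print(t, net_benefit(y, p_old, t), net_benefit(y, p_new, t), treat_all)
```

Comparing each model's net benefit against the treat-all strategy (and against zero, the treat-none strategy) at the same threshold is what makes the curve a statement about clinical consequences rather than about model fit.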