Suman Kundu, Madhu Mazumdar, Bart Ferket.
Abstract
BACKGROUND: The area under the ROC curve (AUC) of risk models is known to be influenced by differences in case-mix and effect size of predictors. The impact of heterogeneity in correlation among predictors has, however, been underinvestigated. We sought to evaluate how correlation among predictors affects the AUC in development and external populations.
Keywords: AUC; Correlation; External validation; Risk prediction; Simulation study
Year: 2017 PMID: 28420342 PMCID: PMC5395845 DOI: 10.1186/s12874-017-0345-1
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Input and estimated parameters in Approach I
| Population | Input: ρ | Input: Normal (μ, σ) | Input: adjusted OR | Estimated, cases: ρ | Estimated, cases: (μ, σ) | Estimated, controls: ρ | Estimated, controls: (μ, σ) | Estimated: SD of linear predictor | Estimated: AUC |
|---|---|---|---|---|---|---|---|---|---|
| A | 0.2 | | (1.5, 1.5) | 0.17 | | 0.17 | | 0.61 | 0.663 |
| B | -0.1 | ,, | ,, | -0.12 | | -0.12 | | 0.54 | 0.645 |
| C | -0.2 | ,, | ,, | -0.22 | | -0.22 | | 0.51 | 0.639 |
| D | 0.1 | ,, | ,, | 0.07 | | 0.07 | | 0.59 | 0.660 |
| E | 0.4 | ,, | ,, | 0.37 | | 0.37 | | 0.67 | 0.676 |
| F | 0.2 | ,, | (1.5, 1.2) | 0.18 | | 0.19 | | 0.47 | 0.629 |
| G | ,, | ,, | (1.2, 1.2) | 0.19 | | 0.19 | | 0.27 | 0.575 |
| H | ,, | ,, | (1.5, 3) | 0.10 | | 0.14 | | 1.25 | 0.789 |
| I | ,, | ,, | (0.8, 0.8) | 0.20 | | 0.19 | | 0.33 | 0.593 |
| J | -0.1 | ,, | (1.5, 0.8) | -0.09 | | -0.08 | | 0.49 | 0.632 |
| K | 0.2 | ,, | ,, | 0.21 | | 0.21 | | 0.42 | 0.616 |
| L | 0.4 | ,, | ,, | 0.40 | | 0.41 | | 0.37 | 0.603 |
| M | ,, | Mean: (0, 0); SD: (1, 3) | (1.5, 1.5) | 0.10 | | 0.14 | | 1.37 | 0.804 |
| N | -0.2 | ,, | ,, | -0.25 | | -0.24 | | 1.21 | 0.781 |
| O | 0.1 | ,, | ,, | 0.01 | | 0.04 | | 1.31 | 0.795 |
| P | 0.4 | ,, | ,, | 0.30 | | 0.33 | | 1.42 | 0.810 |
In each population, a disease prevalence of 20% was used
Population ‘A’ is considered the reference population; all other populations are compared with ‘A’
SD standard deviation, OR odds ratio
ρ: Pearson correlation between two continuous predictors
A risk factor X ~ Normal (μ, σ) implies ‘X’ follows a normal distribution with mean μ and variance σ
In Approach I, the adjusted ORs were pre-specified and thus considered as input parameters
Numbers are rounded to two decimals except for AUC estimates
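Reading the SD column as the SD of the model's linear predictor, it can be roughly cross-checked by hand. The sketch below assumes a logistic model whose coefficients are the natural logs of the adjusted ORs and applies the standard variance formula for a weighted sum of two correlated variables. The function name `linear_predictor_sd` is hypothetical, and the predictor SDs of (1, 1) are taken from the Fig. 1 legend rather than from the table itself. Because the tabulated values are estimates from simulated data, agreement should be expected to be approximate only.

```python
import math

def linear_predictor_sd(odds_ratios, sds, rho):
    """SD of b1*X1 + b2*X2, with b = ln(OR) and corr(X1, X2) = rho.

    Hypothetical helper: assumes the linear predictor is a plain
    weighted sum of two correlated predictors (intercept ignored).
    """
    b1, b2 = (math.log(o) for o in odds_ratios)
    s1, s2 = sds
    variance = (b1 * s1) ** 2 + (b2 * s2) ** 2 + 2 * rho * b1 * b2 * s1 * s2
    return math.sqrt(variance)

# Inputs for populations A and H; SDs (1, 1) assumed from the Fig. 1 legend
sd_a = linear_predictor_sd((1.5, 1.5), (1, 1), 0.2)
sd_h = linear_predictor_sd((1.5, 3), (1, 1), 0.2)
print(round(sd_a, 2), round(sd_h, 2))
```

Under these assumptions the results are about 0.63 and 1.24, against 0.61 and 1.25 in the table.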
Input and estimated parameters in Approach II
| Population | Input (cases and controls): ρ | Input (cases and controls): Normal (μ, σ) | Estimated (population): ρ | Estimated (population): (μ, σ) | Unadjusted OR * | Adjusted OR ** | SD of linear predictor | AUC |
|---|---|---|---|---|---|---|---|---|
| A | Cases = 0.2 | | 0.25 | | 1.28, 1.65 | 1.17, 1.60 | 1.13 | 0.770 |
| B | Cases = 0.2 | ,, | 0.40 | ,, | ,, | 1.09, 1.60 | 1.09 | 0.765 |
| C | Cases = 0.2 | ,, | -0.04 | ,, | ,, | 1.34, 1.68 | 1.25 | 0.785 |
| D | Cases = 0.1 | ,, | 0.16 | ,, | ,, | 1.22, 1.62 | 1.17 | 0.777 |
| E | Cases = -0.1 | ,, | -0.02 | ,, | ,, | 1.35, 1.70 | 1.28 | 0.795 |
| F | Cases = 0.2 | | 0.27 | | 1.28, 2.12 | 1.11, 2.07 | 1.77 | 0.858 |
| G | ,, | | 0.23 | | 1.28, 1.28 | 1.23, 1.23 | 0.67 | 0.676 |
| H | ,, | | 0.24 | | 1.28, 1.25 | 1.21, 1.22 | 0.80 | 0.705 |
| I | ,, | | 0.27 | | 1.28, 7.39 | 1.05, 7.23 | 2.56 | 0.922 |
In each population, a disease prevalence of 20% was used
Population ‘A’ is considered the reference population; all other populations are compared with ‘A’
SD: Standard Deviation; OR: Odds Ratio; Ctrls: controls
ρ: Pearson correlation between two continuous predictors
A risk factor X ~ Normal (μ, σ) implies ‘X’ follows a normal distribution with mean μ and variance σ
*when a risk factor is normally distributed in both cases and controls with a common variance (SD²), then unadjusted OR = exp((μ_cases - μ_controls)/SD²) [19]
**adjusted ORs estimated by fitting logistic model
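The unadjusted-OR footnote can be checked numerically. The sketch below reads the formula as exp((μ_cases - μ_controls)/SD²), with SD the common standard deviation of the predictor in cases and controls. The function name `unadjusted_or` is hypothetical, and the inputs for population A (case means (1, 2), control means (0, 0), SD 2 for both predictors) are taken from the Fig. 2 legend.

```python
import math

def unadjusted_or(mu_cases, mu_controls, sd):
    """Unadjusted odds ratio per unit of a predictor that is normal in
    both groups with a common standard deviation `sd` (hypothetical helper)."""
    return math.exp((mu_cases - mu_controls) / sd ** 2)

# Population A, predictors 1 and 2 (parameters read from the Fig. 2 legend)
or_x1 = unadjusted_or(1, 0, 2)
or_x2 = unadjusted_or(2, 0, 2)
print(round(or_x1, 2), round(or_x2, 2))
```

With these inputs the two values round to 1.28 and 1.65, the unadjusted ORs listed for population A.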
Fig. 1 Relationships between AUC and correlation coefficient of two predictors: a Odds Ratios pointing in the same direction; b Odds Ratios pointing in opposite directions. Legend: Modeling is based on Approach I with population μ: (0, 0); σ: (1, 1). ρ: Pearson correlation
Fig. 2 Amount of separation of linear predictor values for cases and controls in hypothetical populations with different AUCs. Legend: AUC of population ‘A’ is 0.770; ‘D’ is 0.777; and ‘E’ is 0.795. Modeling is based on Approach II with the following specifications: Population ‘A’: ρ_cases = 0.2, ρ_controls = 0.2; μ_cases: (1, 2); μ_controls: (0, 0); σ_cases: (2, 2); σ_controls: (2, 2). Population ‘D’: ρ_cases = 0.1, ρ_controls = 0.1; μ and σ the same as in ‘A’. Population ‘E’: ρ_cases = -0.1, ρ_controls = -0.1; μ and σ the same as in ‘A’. Note: When the two linear predictor distributions fully overlap, then for each cut-off value on the range of linear predictor values, the proportion of false positives (controls labeled as high risk) equals the proportion of true positives (cases labeled as high risk). This results in an AUC of 0.5. Similarly, when the two distributions do not overlap, the AUC approaches 1
Fig. 3 Relationships between AUC and fixed correlation coefficients in cases, while varying correlations of controls. Legend: a μ_cases: (1, 2), μ_controls: (0, 0), σ_cases: (2, 2), σ_controls: (2, 2). b μ_cases: (1, 3), μ_controls: (0, 0), σ_cases: (2, 2), σ_controls: (2, 2). c μ_cases: (1, 2), μ_controls: (0, 0), σ_cases: (2, 3), σ_controls: (2, 3). d μ_cases: (1, 2), μ_controls: (0, 0), σ_cases: (2, 1), σ_controls: (2, 1). ρ: Pearson correlation
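The link between separation and AUC described in the Fig. 2 legend can be made concrete for the special case where cases and controls share the same SDs and the same correlation. For two bivariate-normal groups with equal covariance, a standard closed-form result gives the AUC of the best linear predictor as Φ(D/√2), where D is the Mahalanobis distance between the group means and Φ is the standard normal CDF. The sketch below uses that textbook result, not the simulation used in the paper; the function name `binormal_auc` is hypothetical.

```python
import math

def binormal_auc(mu_cases, mu_controls, sds, rho):
    """AUC of the optimal linear predictor for two bivariate-normal groups
    sharing the same SDs and correlation (equal-covariance assumption)."""
    # Standardized mean differences for the two predictors
    z1 = (mu_cases[0] - mu_controls[0]) / sds[0]
    z2 = (mu_cases[1] - mu_controls[1]) / sds[1]
    # Squared Mahalanobis distance between the group means
    dist_sq = (z1 ** 2 - 2 * rho * z1 * z2 + z2 ** 2) / (1 - rho ** 2)
    # Phi(D / sqrt(2)) written with the error function
    return 0.5 * (1 + math.erf(math.sqrt(dist_sq) / 2))

# Populations A and D as specified in the Fig. 2 legend
auc_a = binormal_auc((1, 2), (0, 0), (2, 2), 0.2)
auc_d = binormal_auc((1, 2), (0, 0), (2, 2), 0.1)
# Identical group means: the two distributions fully overlap
auc_overlap = binormal_auc((0, 0), (0, 0), (2, 2), 0.2)
print(round(auc_a, 3), round(auc_d, 3), auc_overlap)
```

Under these assumptions the results for ‘A’ and ‘D’ come out close to the reported 0.770 and 0.777, and identical group means give exactly 0.5. Lowering the shared correlation while keeping the means fixed increases D here, which is consistent with the higher AUC reported for ‘E’.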
AUCs for risk models developed and validated in various populations: Approach I
| Validated in (rows) / Developed in (columns) | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | | * | * | * | * | 0.656 | 0.663 | 0.652 | 0.663 | 0.556 | ** | ** | * | * | * | * |
| B | 0.645 | | * | * | * | 0.632 | 0.645 | 0.631 | 0.644 | 0.534 | ** | ** | * | * | * | * |
| C | 0.639 | * | | * | * | 0.626 | 0.639 | 0.622 | 0.639 | 0.534 | ** | ** | * | * | * | * |
| D | 0.660 | * | * | | * | 0.649 | 0.660 | 0.649 | 0.659 | 0.545 | ** | ** | * | * | * | * |
| E | 0.676 | * | * | * | | 0.670 | 0.676 | 0.670 | 0.676 | 0.570 | ** | ** | * | * | * | * |
| F | 0.624 | * | * | * | * | | 0.624 | 0.602 | 0.625 | 0.578 | ** | ** | * | * | * | * |
| G | 0.575 | * | * | * | * | 0.571 | | 0.570 | 0.575 | 0.526 | ** | ** | * | * | * | * |
| H | 0.770 | * | * | * | * | 0.728 | 0.770 | | 0.767 | 0.502 | ** | ** | * | * | * | * |
| I | 0.593 | * | * | * | * | 0.590 | 0.593 | 0.587 | | 0.534 | ** | ** | * | * | * | * |
| J | 0.531 | * | * | * | * | 0.529 | 0.530 | 0.531 | 0.530 | | ** | ** | * | * | * | * |
| K | 0.540 | * | * | * | * | 0.571 | 0.540 | 0.502 | 0.543 | 0.615 | | ** | * | * | * | * |
| L | 0.542 | * | * | * | * | 0.541 | 0.540 | 0.541 | 0.542 | 0.602 | ** | | * | * | * | * |
| M | 0.804 | * | * | * | * | 0.792 | 0.804 | 0.800 | 0.804 | 0.693 | ** | ** | | * | * | * |
| N | 0.781 | * | * | * | * | 0.759 | 0.781 | 0.775 | 0.781 | 0.692 | ** | ** | * | | * | * |
| O | 0.795 | * | * | * | * | 0.782 | 0.795 | 0.790 | 0.795 | 0.686 | ** | ** | * | * | | * |
| P | 0.810 | * | * | * | * | 0.802 | 0.810 | 0.806 | 0.810 | 0.695 | ** | ** | * | * | * | |
*Risk models with the same adjusted ORs perform identically in a given external validation population. Prediction models developed in populations A-E and M-P therefore perform alike, and the values marked ‘*’ in these columns are identical to those in column A
**The adjusted ORs in populations J-L are the same, so these models also perform alike in an external validation population. The values marked ‘**’ in columns K and L are identical to those in column J
The numbers in bold indicate the AUC estimated in the development population
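Because the AUC depends only on how a model ranks individuals, two models whose log odds ratios are proportional produce the same AUC in any validation population. This may be why the G column (ORs 1.2, 1.2) tracks the A column (ORs 1.5, 1.5) so closely. The sketch below illustrates the point on a small made-up dataset; the data, the `auc` and `model_auc` helpers, and the pairwise-comparison estimator are illustrative assumptions and not the paper's code.

```python
import math

def auc(case_scores, control_scores):
    """Share of case/control pairs in which the case scores higher
    (ties count as one half)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def linear_predictor(odds_ratios, x):
    """Weighted sum of predictor values, with weights ln(OR)."""
    return sum(math.log(o) * xi for o, xi in zip(odds_ratios, x))

# Hypothetical toy validation data: (x1, x2) per individual
cases = [(1.0, 0.5), (0.2, 1.4), (-0.3, 0.9), (1.5, -0.2)]
controls = [(0.0, 0.1), (-1.0, 0.4), (0.8, -0.5), (0.3, 0.8), (-0.6, -0.7)]

def model_auc(odds_ratios):
    """AUC of a model with the given adjusted ORs on the toy data."""
    return auc([linear_predictor(odds_ratios, x) for x in cases],
               [linear_predictor(odds_ratios, x) for x in controls])

auc_model_a = model_auc((1.5, 1.5))  # ORs as in population A
auc_model_g = model_auc((1.2, 1.2))  # ORs as in population G
print(auc_model_a, auc_model_g)
```

Both calls return the same value on the toy data, since scaling both coefficients by a common positive factor leaves the ordering of the linear predictor unchanged.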
AUCs for risk models developed and validated in various populations: Approach II
| Validated in (rows) / Developed in (columns) | A | B | C | D | E | F | G | H | I |
|---|---|---|---|---|---|---|---|---|---|
| A | | 0.768 | 0.767 | 0.770 | 0.767 | 0.767 | 0.753 | 0.754 | 0.762 |
| B | 0.764 | | 0.759 | 0.763 | 0.759 | 0.764 | 0.745 | 0.746 | 0.761 |
| C | 0.783 | 0.777 | | 0.785 | 0.784 | 0.773 | 0.774 | 0.774 | 0.763 |
| D | 0.777 | 0.773 | 0.776 | | 0.776 | 0.771 | 0.763 | 0.764 | 0.763 |
| E | 0.790 | 0.781 | 0.795 | 0.793 | | 0.777 | 0.785 | 0.786 | 0.763 |
| F | 0.855 | 0.858 | 0.845 | 0.852 | 0.845 | | 0.819 | 0.821 | 0.857 |
| G | 0.664 | 0.655 | 0.672 | 0.667 | 0.671 | 0.651 | | 0.676 | 0.642 |
| H | 0.696 | 0.690 | 0.702 | 0.700 | 0.703 | 0.689 | 0.705 | | 0.682 |
| I | 0.896 | 0.914 | 0.864 | 0.885 | 0.863 | 0.917 | 0.811 | 0.814 | |
The numbers in bold indicate the AUC estimated in the development population