Sylvia Richardson, Andrew Thomson, Nicky Best, Paul Elliott.
Abstract
There is currently much interest in conducting spatial analyses of health outcomes at the small-area scale. This requires sophisticated statistical techniques, usually involving Bayesian models, to smooth the underlying risk estimates because the data are typically sparse. However, questions have been raised about the performance of these models for recovering the "true" risk surface, about the influence of the prior structure specified, and about the amount of smoothing of the risks that is actually performed. We describe a comprehensive simulation study designed to address these questions. Our results show that Bayesian disease-mapping models are essentially conservative, with high specificity even in situations with very sparse data but low sensitivity if the raised-risk areas have only a moderate (less than 2-fold) excess or are not based on substantial expected counts (> 50 per area). Semiparametric spatial mixture models typically produce less smoothing than their conditional autoregressive counterpart when there is sufficient information in the data (moderate-size expected count and/or high true excess risk). Sensitivity may be improved by exploiting the whole posterior distribution to try to detect true raised-risk areas rather than just reporting and mapping the mean posterior relative risk. For the widely used conditional autoregressive model, we show that a decision rule based on computing the probability that the relative risk is above 1 with a cutoff between 70 and 80% gives a specific rule with reasonable sensitivity for a range of scenarios having moderate expected counts (approximately 20) and excess risks (approximately 1.5- to 2-fold). Larger (3-fold) excess risks are detected almost certainly using this rule, even when based on small expected counts, although the mean of the posterior distribution is typically smoothed to about half the true value.
Year: 2004 PMID: 15198922 PMCID: PMC1247195 DOI: 10.1289/ehp.6740
Source DB: PubMed Journal: Environ Health Perspect ISSN: 0091-6765 Impact factor: 9.031
Figure 1. Histograms of the raw SMRs (A, E) and posterior means of the relative risks (B–H) for all the background areas of Simu 2 derived by each of the three models. Note that the crosses on the x-axes indicate the minimum and maximum values obtained. SF indicates the scale factor used for the expected values.
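For readers unfamiliar with the quantity plotted in Figure 1, the raw SMR of an area is its observed count divided by its expected count. The following Python sketch is illustrative only: it assumes that the scale factor SF simply multiplies each baseline expected count, and the function name `raw_smr` and the example numbers are hypothetical rather than taken from the study.

```python
def raw_smr(observed, expected, scale_factor=1.0):
    """Return observed / (scale_factor * expected) for each area.

    Assumption: the scale factor multiplies every baseline expected count.
    """
    smrs = []
    for o, e in zip(observed, expected):
        scaled = scale_factor * e
        if scaled <= 0:
            raise ValueError("expected counts must be positive")
        smrs.append(o / scaled)
    return smrs


# Hypothetical observed counts and baseline expected counts for three areas.
observed = [3, 0, 10]
expected = [1.5, 0.8, 2.5]
smrs = raw_smr(observed, expected, scale_factor=2)
print(smrs)
```

When the expected count is small, a single additional case changes the raw SMR substantially, which is the instability that motivates smoothing.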
Posterior mean relative risk estimates for the raised-risk areas for the BYM model (average over replicate data sets).
| Raised-risk area | SF = 1, θ = 1.5 | SF = 1, θ = 2 | SF = 1, θ = 3 | SF = 2, θ = 1.5 | SF = 2, θ = 2 | SF = 2, θ = 3 | SF = 4, θ = 1.5 | SF = 4, θ = 2 | SF = 4, θ = 3 | SF = 10, θ = 1.5 | SF = 10, θ = 2 | SF = 10, θ = 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Simu 1 | ||||||||||||
| 10% area ( | 1.01 | 1.02 | 1.06 | 1.01 | 1.02 | 1.12 | 1.01 | 1.03 | 1.20 | 1.01 | 1.07 | 1.40 |
| 25% area ( | 1.03 | 1.04 | 1.10 | 1.00 | 1.03 | 1.15 | 1.01 | 1.05 | 1.28 | 1.02 | 1.09 | 1.52 |
| 50% area ( | 1.02 | 1.05 | 1.15 | 1.00 | 1.05 | 1.28 | 1.02 | 1.08 | 1.46 | 1.03 | 1.16 | 1.79 |
| 75% area ( | 1.03 | 1.05 | 1.31 | 1.03 | 1.07 | 1.55 | 1.04 | 1.12 | 1.86 | 1.05 | 1.33 | 2.35 |
| 90% area ( | 1.03 | 1.07 | 1.34 | 1.03 | 1.10 | 1.62 | 1.04 | 1.15 | 2.07 | 1.07 | 1.40 | 2.47 |
| Simu 2 | ||||||||||||
| 1% cluster (& | 1.04 | 1.08 | 1.45 | 1.04 | 1.14 | 1.76 | 1.05 | 1.23 | 2.11 | 1.09 | 1.45 | 2.43 |
| Simu 3 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 |
| 20 × 1% clusters ( | 1.04 | 1.23 | 1.63 | 1.07 | 1.3 | 1.74 | 1.12 | 1.38 | 1.84 | 1.19 | 1.48 | 1.95 |
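The attenuation visible in the table above (posterior means well below the true θ when expected counts are small) can be illustrated with a much simpler model than BYM. The sketch below uses a conjugate Poisson-gamma model, not the conditional autoregressive prior used in the study: under a Gamma(a, b) prior on the relative risk, the posterior mean given observed count O and expected count E is (a + O) / (b + E). The prior values and the counts are hypothetical.

```python
def gamma_posterior_mean(observed, expected, prior_shape=5.0, prior_rate=5.0):
    """Posterior mean of a relative risk under a Gamma(shape, rate) prior
    and a Poisson likelihood with mean (relative risk * expected)."""
    return (prior_shape + observed) / (prior_rate + expected)


# Hypothetical areas whose observed count is exactly three times the expected count.
for expected in (1.0, 10.0, 100.0):
    observed = 3 * expected
    estimate = gamma_posterior_mean(observed, expected)
    print(f"E = {expected:6.1f}  raw SMR = 3.00  posterior mean = {estimate:.2f}")
```

With the prior mean fixed at 1, the estimate moves from near the prior mean toward the raw SMR of 3 as the expected count grows, which mirrors the qualitative pattern across scale factors in the table.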
Posterior mean relative risk estimates for the raised-risk areas for the L1-BYM model (average over replicate data sets).
| Raised-risk area | SF = 1, θ = 1.5 | SF = 1, θ = 2 | SF = 1, θ = 3 | SF = 2, θ = 1.5 | SF = 2, θ = 2 | SF = 2, θ = 3 | SF = 4, θ = 1.5 | SF = 4, θ = 2 | SF = 4, θ = 3 | SF = 10, θ = 1.5 | SF = 10, θ = 2 | SF = 10, θ = 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Simu 1 | ||||||||||||
| 10% area ( | 1.01 | 1.02 | 1.05 | 1.01 | 1.02 | 1.12 | 1.01 | 1.02 | 1.16 | 1.01 | 1.07 | 1.21 |
| 25% area ( | 1.01 | 1.03 | 1.11 | 1.00 | 1.04 | 1.15 | 1.00 | 1.06 | 1.24 | 1.03 | 1.09 | 1.35 |
| 50% area ( | 1.01 | 1.03 | 1.16 | 1.00 | 1.05 | 1.28 | 1.01 | 1.08 | 1.55 | 1.03 | 1.17 | 2.22 |
| 75% area ( | 1.02 | 1.05 | 1.32 | 1.03 | 1.08 | 1.56 | 1.03 | 1.13 | 1.98 | 1.05 | 1.35 | 2.67 |
| 90% area ( | 1.04 | 1.07 | 1.48 | 1.03 | 1.13 | 1.93 | 1.05 | 1.25 | 2.43 | 1.08 | 1.60 | 2.72 |
| Simu 2 | ||||||||||||
| 1% cluster ( | 1.04 | 1.08 | 1.45 | 1.04 | 1.14 | 1.76 | 1.05 | 1.23 | 2.11 | 1.09 | 1.45 | 2.43 |
| Simu 3 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 |
| 20 × 1% clusters (& | 1.04 | 1.22 | 1.61 | 1.07 | 1.29 | 1.74 | 1.12 | 1.38 | 1.85 | 1.19 | 1.49 | 1.97 |
Posterior mean relative risk estimates for the raised-risk areas for the MIX model (average over replicate data sets).
| Raised-risk area | SF = 1, θ = 1.5 | SF = 1, θ = 2 | SF = 1, θ = 3 | SF = 2, θ = 1.5 | SF = 2, θ = 2 | SF = 2, θ = 3 | SF = 4, θ = 1.5 | SF = 4, θ = 2 | SF = 4, θ = 3 | SF = 10, θ = 1.5 | SF = 10, θ = 2 | SF = 10, θ = 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Simu 1 | ||||||||||||
| 10% area ( | 1.00 | 1.01 | 1.02 | 1.00 | 1.02 | 1.27 | 1.00 | 1.01 | 1.53 | 1.01 | 1.10 | 2.50 |
| 25% area ( | 1.00 | 1.02 | 1.09 | 1.00 | 1.01 | 1.17 | 1.00 | 1.05 | 1.80 | 1.01 | 1.22 | 2.67 |
| 50% area ( | 1.00 | 1.02 | 1.25 | 1.00 | 1.04 | 1.88 | 1.00 | 1.23 | 2.78 | 1.02 | 1.72 | 3.02 |
| 75% area ( | 1.00 | 1.03 | 1.57 | 1.00 | 1.07 | 2.44 | 1.01 | 1.42 | 2.91 | 1.04 | 1.87 | 3.02 |
| 90% area ( | 1.00 | 1.03 | 1.60 | 1.01 | 1.09 | 2.46 | 1.01 | 1.49 | 2.91 | 1.06 | 1.89 | 3.02 |
| Simu 2 | ||||||||||||
| 1% cluster (& | 1.02 | 1.06 | 1.98 | 1.01 | 1.25 | 2.66 | 1.03 | 1.72 | 2.92 | 1.21 | 1.92 | 2.98 |
| Simu 3 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 |
| 20 × 1% clusters (& | 1.02 | 1.19 | 1.55 | 1.05 | 1.31 | 1.64 | 1.12 | 1.44 | 1.81 | 1.31 | 1.55 | 2.06 |
Figure 2. Histograms comparing the distribution of the posterior means of the relative risks estimated by the BYM or MIX models for the high-risk areas of Simu 2 or Simu 3 using a scale factor of 4 for the expected values and a true relative risk (marked by the vertical line on each plot) of θ = 2 (Simu 2) or θ* = 1.65 (Simu 3).
Figure 3. Box plots of the posterior means of the relative risks estimated by the three models for the high-risk areas of Simu 2 as a function of the scaling factor.
Figure 4. Posterior distribution of a relative risk θ, with shaded area indicating Prob(θ > 1).
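The shaded area in Figure 4 can be approximated from posterior draws as the fraction of draws above the threshold. The Python sketch below is an illustration, not the authors' code. It reads the notation D(c, R0) used in the table footnotes as "flag the area if Prob(θ > R0) exceeds c", an interpretation inferred from the abstract; the strict inequality, the function names, and the sample values are assumptions.

```python
def exceedance_probability(samples, threshold=1.0):
    """Fraction of posterior draws of the relative risk above `threshold`."""
    if not samples:
        raise ValueError("at least one posterior draw is required")
    return sum(1 for s in samples if s > threshold) / len(samples)


def decision_rule(samples, cutoff, threshold):
    """Assumed reading of D(cutoff, threshold): flag the area as raised-risk
    when Prob(theta > threshold) is greater than cutoff."""
    return exceedance_probability(samples, threshold) > cutoff


# Hypothetical posterior draws for one area (in practice, MCMC output).
draws = [0.9, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 2.0]
print(exceedance_probability(draws, threshold=1.0))   # 9 of 10 draws exceed 1
print(decision_rule(draws, cutoff=0.8, threshold=1.0))
print(decision_rule(draws, cutoff=0.05, threshold=1.5))
```

Because the rule uses the whole posterior distribution rather than only its mean, an area can be flagged even when its posterior mean has been smoothed well below the true risk.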
False-positive rates (1 – specificity) for the three models.
| Background | SF = 1, θ = 1.5 | SF = 1, θ = 2 | SF = 1, θ = 3 | SF = 2, θ = 1.5 | SF = 2, θ = 2 | SF = 2, θ = 3 | SF = 4, θ = 1.5 | SF = 4, θ = 2 | SF = 4, θ = 3 | SF = 10, θ = 1.5 | SF = 10, θ = 2 | SF = 10, θ = 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BYM | ||||||||||||
| Simu 1 | 0.08 | 0.10 | 0.05 | 0.04 | 0.06 | 0.04 | 0.03 | 0.08 | 0.06 | 0.03 | 0.05 | 0.08 |
| Simu 2 | 0.07 | 0.06 | 0.06 | 0.05 | 0.05 | 0.06 | 0.05 | 0.05 | 0.07 | 0.04 | 0.08 | 0.10 |
| Simu 3 | 0.02 | 0.03 | 0.02 | 0.02 | 0.03 | 0.02 | 0.03 | 0.03 | 0.01 | 0.03 | 0.02 | 0.01 |
| L1-BYM | ||||||||||||
| Simu 1 | 0.05 | 0.09 | 0.06 | 0.06 | 0.10 | 0.05 | 0.03 | 0.06 | 0.06 | 0.05 | 0.05 | 0.08 |
| Simu 2 | 0.07 | 0.09 | 0.06 | 0.05 | 0.07 | 0.06 | 0.05 | 0.06 | 0.06 | 0.04 | 0.07 | 0.08 |
| Simu 3 | 0.04 | 0.03 | 0.02 | 0.02 | 0.03 | 0.02 | 0.03 | 0.03 | 0.02 | 0.03 | 0.02 | 0.01 |
| MIX | ||||||||||||
| Simu 1 | 0.00 | 0.04 | 0.00 | 0.01 | 0.04 | 0.00 | 0.03 | 0.02 | 0.00 | 0.02 | 0.00 | 0.08 |
| Simu 2 | 0.00 | 0.01 | 0.11 | 0.00 | 0.04 | 0.04 | 0.00 | 0.06 | 0.01 | 0.01 | 0.02 | 0.00 |
| Simu 3 | 0.02 | 0.51 | 0.44 | 0.02 | 0.52 | 0.25 | 0.01 | 0.33 | 0.12 | 0.00 | 0.14 | 0.03 |
a. Decision rules are D(0.8, 1) for BYM and L1-BYM and D(0.05, 1.5) for MIX.
b. For Simu 3, θ* = 1.35, 1.65, or 2.1 instead of θ = 1.5, 2, or 3, respectively.
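The false-positive rates above, and the sensitivities reported for each model, are proportions of flagged areas within the background and raised-risk groups, respectively. A minimal Python sketch of that bookkeeping follows; the function name, the boolean encoding, and the example flags are hypothetical, and the averaging over replicate data sets used in the study is not reproduced here.

```python
def detection_rates(flagged, truly_raised):
    """Return (sensitivity, false_positive_rate) for one simulated map.

    flagged[i]      : True if the decision rule declared area i raised-risk
    truly_raised[i] : True if area i was simulated with a raised risk
    """
    if len(flagged) != len(truly_raised):
        raise ValueError("inputs must have the same length")
    raised = [f for f, t in zip(flagged, truly_raised) if t]
    background = [f for f, t in zip(flagged, truly_raised) if not t]
    if not raised or not background:
        raise ValueError("need at least one raised-risk and one background area")
    sensitivity = sum(raised) / len(raised)
    false_positive_rate = sum(background) / len(background)  # 1 - specificity
    return sensitivity, false_positive_rate


# Hypothetical map of six areas: three raised-risk, three background.
flagged = [True, True, False, False, True, False]
truly_raised = [True, False, False, False, True, True]
sens, fpr = detection_rates(flagged, truly_raised)
print(f"sensitivity = {sens:.2f}, false-positive rate = {fpr:.2f}")
```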
Simu 3: performance of the BYM and MIX models under alternative decision rules.
| | SF = 1, θ* = 1.35 | SF = 1, θ* = 1.65 | SF = 1, θ* = 2.1 | SF = 2, θ* = 1.35 | SF = 2, θ* = 1.65 | SF = 2, θ* = 2.1 | SF = 4, θ* = 1.35 | SF = 4, θ* = 1.65 | SF = 4, θ* = 2.1 | SF = 10, θ* = 1.35 | SF = 10, θ* = 1.65 | SF = 10, θ* = 2.1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BYM – | ||||||||||||
| Probability (false detection) | 0.10 | 0.07 | 0.05 | 0.07 | 0.07 | 0.04 | 0.08 | 0.06 | 0.03 | 0.08 | 0.05 | 0.02 |
| Probability (true detection) | 0.23 | 0.51 | 0.71 | 0.36 | 0.68 | 0.84 | 0.56 | 0.82 | 0.93 | 0.81 | 0.95 | 0.99 |
| MIX – | ||||||||||||
| Probability (false detection) | 0.00 | 0.03 | 0.07 | 0.00 | 0.06 | 0.05 | 0.00 | 0.07 | 0.03 | 0.00 | 0.03 | 0.01 |
| Probability (true detection) | 0.00 | 0.23 | 0.76 | 0.00 | 0.62 | 0.88 | 0.00 | 0.84 | 0.93 | 0.00 | 0.93 | 0.98 |
Sensitivity (1 – false-negative rate) for the BYM model.
| Raised-risk area | SF = 1, θ = 1.5 | SF = 1, θ = 2 | SF = 1, θ = 3 | SF = 2, θ = 1.5 | SF = 2, θ = 2 | SF = 2, θ = 3 | SF = 4, θ = 1.5 | SF = 4, θ = 2 | SF = 4, θ = 3 | SF = 10, θ = 1.5 | SF = 10, θ = 2 | SF = 10, θ = 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Simu 1 | ||||||||||||
| 10% area ( | 0.08 | 0.06 | 0.08 | 0.04 | 0.02 | 0.36 | 0 | 0.06 | 0.68 | 0.02 | 0.42 | 0.98 |
| 25% area ( | 0.36 | 0.48 | 0.38 | 0.20 | 0.24 | 0.36 | 0.20 | 0.50 | 0.82 | 0.28 | 0.54 | 1 |
| 50% area ( | 0.32 | 0.48 | 0.40 | 0.16 | 0.32 | 0.66 | 0.24 | 0.66 | 0.98 | 0.30 | 0.96 | 1 |
| 75% area ( | 0.08 | 0.30 | 0.74 | 0.12 | 0.52 | 0.98 | 0.22 | 0.76 | 1 | 0.66 | 1 | 1 |
| 90% area ( | 0.12 | 0.22 | 0.74 | 0.10 | 0.64 | 0.98 | 0.34 | 0.88 | 1 | 0.88 | 1 | 1 |
| Simu 2 | ||||||||||||
| 1% cluster ( | 0.18 | 0.42 | 0.95 | 0.30 | 0.74 | 1 | 0.53 | 0.97 | 1 | 0.90 | 1 | 1 |
| Simu 3 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 |
| 20 × 1% clusters ( | 0.09 | 0.34 | 0.56 | 0.17 | 0.51 | 0.74 | 0.37 | 0.71 | 0.88 | 0.66 | 0.90 | 0.94 |
a. Decision rule is D(0.8, 1).
Probability of true detection (sensitivity) for the L1-BYM model.
| Raised-risk area | SF = 1, θ = 1.5 | SF = 1, θ = 2 | SF = 1, θ = 3 | SF = 2, θ = 1.5 | SF = 2, θ = 2 | SF = 2, θ = 3 | SF = 4, θ = 1.5 | SF = 4, θ = 2 | SF = 4, θ = 3 | SF = 10, θ = 1.5 | SF = 10, θ = 2 | SF = 10, θ = 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Simu 1 | ||||||||||||
| 10% area ( | 0.02 | 0.04 | 0.04 | 0.04 | 0.08 | 0.32 | 0.02 | 0.02 | 0.54 | 0.04 | 0.28 | 0.54 |
| 25% area ( | 0.26 | 0.34 | 0.38 | 0.24 | 0.38 | 0.40 | 0.16 | 0.46 | 0.88 | 0.44 | 0.52 | 0.98 |
| 50% area ( | 0.28 | 0.38 | 0.42 | 0.30 | 0.42 | 0.66 | 0.26 | 0.56 | 0.96 | 0.44 | 0.86 | 1 |
| 75% area ( | 0.08 | 0.24 | 0.74 | 0.06 | 0.50 | 0.94 | 0.20 | 0.78 | 1 | 0.68 | 1 | 1 |
| 90% area ( | 0.16 | 0.22 | 0.76 | 0.10 | 0.68 | 0.98 | 0.24 | 0.90 | 1 | 0.86 | 1 | 1 |
| Simu 2 | ||||||||||||
| 1% cluster ( | 0.17 | 0.35 | 0.91 | 0.23 | 0.64 | 1 | 0.39 | 0.95 | 1 | 0.85 | 1 | 1 |
| Simu 3 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 |
| 20 × 1% clusters ( | 0.10 | 0.31 | 0.55 | 0.16 | 0.48 | 0.75 | 0.35 | 0.70 | 0.89 | 0.65 | 0.90 | 0.98 |
a. Decision rule is D(0.8, 1).
Probability of true detection (sensitivity) for the MIX model.
| Raised-risk area | SF = 1, θ = 1.5 | SF = 1, θ = 2 | SF = 1, θ = 3 | SF = 2, θ = 1.5 | SF = 2, θ = 2 | SF = 2, θ = 3 | SF = 4, θ = 1.5 | SF = 4, θ = 2 | SF = 4, θ = 3 | SF = 10, θ = 1.5 | SF = 10, θ = 2 | SF = 10, θ = 3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Simu 1 | ||||||||||||
| 10% area ( | 0 | 0 | 0.05 | 0 | 0.02 | 0.35 | 0 | 0.04 | 0.56 | 0 | 0.31 | 0.54 |
| 25% area ( | 0 | 0.02 | 0.20 | 0 | 0.01 | 0.30 | 0 | 0.16 | 0.72 | 0.06 | 0.53 | 0.98 |
| 50% area ( | 0 | 0.02 | 0.33 | 0 | 0.10 | 0.77 | 0 | 0.51 | 0.98 | 0.05 | 0.94 | 1 |
| 75% area ( | 0 | 0.02 | 0.51 | 0 | 0.18 | 0.90 | 0 | 0.67 | 0.99 | 0.10 | 0.98 | 1 |
| 90% area ( | 0 | 0.05 | 0.55 | 0 | 0.19 | 0.93 | 0 | 0.68 | 0.99 | 0.14 | 0.98 | 1 |
| Simu 2 | ||||||||||||
| 1% cluster ( | 0.02 | 0.10 | 0.86 | 0.01 | 0.46 | 0.99 | 0.05 | 0.95 | 1 | 0.47 | 1.00 | 1.00 |
| Simu 3 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 | θ* = 1.35 | θ* = 1.65 | θ* = 2.1 |
| 20 × 1% clusters ( | 0.04 | 0.85 | 0.99 | 0.04 | 0.99 | 0.99 | 0.06 | 0.99 | 0.99 | 0.0 | 0.99 | 1.00 |
a. Decision rule is D(0.5, 1.5).