Robin L Young, Janice Weinberg, Verónica Vieira, Al Ozonoff, Thomas F Webster.
Abstract
BACKGROUND: A common, important problem in spatial epidemiology is measuring and identifying variation in disease risk across a study region. In applying statistical methods, the problem has two parts. First, spatial variation in risk must be detected across the study region and, second, areas of increased or decreased risk must be correctly identified. The location of such areas may give clues to environmental sources of exposure and disease etiology. One statistical method applicable in spatial epidemiologic settings is a generalized additive model (GAM), which can be applied with a bivariate LOESS smoother to account for geographic location as a possible predictor of disease status. A natural question when applying this method is whether residential location of subjects is associated with the outcome, i.e., is the smoothing term necessary? Permutation tests are a reasonable hypothesis testing method and provide adequate power under a simple alternative hypothesis. These tests have yet to be compared to other spatial statistics.
Year: 2010 PMID: 20642827 PMCID: PMC2918545 DOI: 10.1186/1476-072X-9-37
Source DB: PubMed Journal: Int J Health Geogr ISSN: 1476-072X Impact factor: 3.918
Description of Hypothesis Testing Methods and Significance Cutoffs
| Hypothesis Testing Method | Abbreviation | Description | Significance Cutoff |
|---|---|---|---|
| Conditional Permutation Test | CPT | Select optimal span size for observed data by minimizing AIC statistic across range of spans. Compare difference in deviance statistic to conditional permutation distribution obtained by holding span size constant. | |
| Fixed Span Permutation Test | FSPT | Select span size | |
| Fixed Multiple Span Permutation Test | FMSPT | Select 3-5 span sizes | |
| Unconditional Permutation Test | UPT | Select optimal span size for observed data as in CPT. Compare difference in deviance statistic to unconditional permutation distribution obtained by selecting optimal span size for each permuted dataset. | |
| Spatial Scan Statistic | --- | Detects the most likely cluster through a likelihood ratio test comparing the likelihood of cases within to outside a circular zone of interest. P-values are obtained through Monte Carlo methods | |
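The conditional permutation logic described for the CPT can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: a k-nearest-neighbour average stands in for the bivariate LOESS smoother of the GAM, the span is passed in rather than selected by AIC, locations are assumed distinct, and every function name is hypothetical.

```python
import math
import random


def knn_indices(coords, k):
    """For each point, the indices of its k nearest points (itself included)."""
    neighbors = []
    for xi, yi in coords:
        order = sorted(
            range(len(coords)),
            key=lambda j: (coords[j][0] - xi) ** 2 + (coords[j][1] - yi) ** 2,
        )
        neighbors.append(order[:k])
    return neighbors


def binomial_deviance(outcomes, probs):
    """Minus twice the Bernoulli log-likelihood of 0/1 outcomes under probs."""
    total = 0.0
    for y, p in zip(outcomes, probs):
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return -2.0 * total


def deviance_difference(outcomes, neighbors):
    """Null-model deviance minus deviance of the locally smoothed risk surface.

    ASSUMPTION: the local risk is a plain neighbourhood mean, a stand-in for
    the GAM with a LOESS smoother described in the table.
    """
    n = len(outcomes)
    overall = sum(outcomes) / n
    local = [sum(outcomes[j] for j in nb) / len(nb) for nb in neighbors]
    return binomial_deviance(outcomes, [overall] * n) - binomial_deviance(outcomes, local)


def conditional_permutation_p_value(coords, outcomes, span, n_perm=199, seed=0):
    """Permutation p-value with the span held constant across permutations."""
    rng = random.Random(seed)
    k = max(2, round(span * len(coords)))  # span = fraction of points per neighbourhood
    neighbors = knn_indices(coords, k)     # depends on locations only, so computed once
    observed = deviance_difference(outcomes, neighbors)
    shuffled = list(outcomes)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)              # breaks any link between location and outcome
        if deviance_difference(shuffled, neighbors) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

Under the same assumptions, an unconditional variant would re-select the span inside the loop for each permuted dataset, which is the distinction the table draws between the CPT and the UPT.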
Figure 1. Case 1 Study Region Diagram. This figure is a diagram for the study region generated for Case 1.
Figure 2. Case 2 Study Region Diagram. This figure is a diagram for the study region generated for Case 2.
Figure 3. Case 3 Study Region Diagram. This figure is a diagram for the study region generated for Case 3.
Figure 4. Case 1 Example Data with Odds Ratio of 3.0. This figure is a replicate of data simulated for Case 1 with an odds ratio of 3.0 and a probability of disease outside the cluster of 20%. Cases are displayed in red while non-cases are displayed in blue.
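To make the simulation parameters concrete: with a probability of disease of 0.20 outside the cluster and an odds ratio of 3.0, the odds inside the cluster are 3.0 × 0.25 = 0.75, so the probability of disease inside is 0.75 / 1.75 ≈ 0.43. The sketch below generates one such replicate. The unit-square region, the cluster centre and radius, the sample size, and the function names are all hypothetical choices for illustration; the study geometry is not specified here.

```python
import math
import random


def probability_from_odds_ratio(p_baseline, odds_ratio):
    """Probability whose odds equal odds_ratio times the baseline odds."""
    odds = odds_ratio * p_baseline / (1.0 - p_baseline)
    return odds / (1.0 + odds)


def simulate_cluster_data(n, p_outside, odds_ratio, center, radius, seed=0):
    """Return n tuples (x, y, case) with elevated risk inside a circular cluster.

    ASSUMPTION: points are uniform on the unit square; center and radius are
    illustrative values supplied by the caller.
    """
    rng = random.Random(seed)
    p_inside = probability_from_odds_ratio(p_outside, odds_ratio)
    data = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        inside = math.hypot(x - center[0], y - center[1]) <= radius
        risk = p_inside if inside else p_outside
        data.append((x, y, 1 if rng.random() < risk else 0))
    return data
```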
Theoretical Power Based on Pearson Chi-Square Test and Simple Logistic Regression
| Case | Probability of Disease Unexposed | OR 0.5 | OR 1.0 | OR 1.5 | OR 2.0 | OR 2.5 | OR 3.0 | OR 3.5 |
|---|---|---|---|---|---|---|---|---|
| Case 1^ | 0.05 | 0.258 | 0.050 | 0.214 | 0.598 | 0.884 | >0.999 | |
| | 0.20 | 0.731 | 0.050 | 0.521 | 0.953 | 0.999 | >0.999 | |
| Case 2* | 0.05 | 0.255 | 0.050 | 0.167 | 0.331 | 0.502 | 0.650 | 0.766 |
| | 0.20 | 0.565 | 0.050 | 0.321 | 0.663 | 0.872 | 0.959 | 0.988 |
| Case 3* | 0.05 | 0.311 | 0.050 | 0.215 | 0.464 | 0.695 | 0.850 | 0.935 |
| | 0.20 | 0.691 | 0.050 | 0.432 | 0.832 | 0.970 | 0.996 | >0.999 |
^Power for comparison of odds inside to outside cluster
*Power for one standard deviation increase in distance from exposure source
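For the Case 1 comparison of odds inside and outside the cluster, theoretical power of a Pearson chi-square test can be approximated with the standard two-proportion normal approximation. The sketch below is illustrative only: the group sizes are hypothetical arguments (the sample sizes behind the table are not given here), so it is not expected to reproduce the tabulated values, and it does not cover the logistic regression calculation used for Cases 2 and 3.

```python
import math
from statistics import NormalDist


def exposed_probability(p_unexposed, odds_ratio):
    """Disease probability among the exposed implied by the odds ratio."""
    odds = odds_ratio * p_unexposed / (1.0 - p_unexposed)
    return odds / (1.0 + odds)


def two_proportion_power(p_unexposed, odds_ratio, n_exposed, n_unexposed, alpha=0.05):
    """Approximate two-sided power for comparing two independent proportions.

    ASSUMPTION: n_exposed and n_unexposed are hypothetical group sizes chosen
    by the caller; the normal approximation is used throughout.
    """
    std_normal = NormalDist()
    p1 = exposed_probability(p_unexposed, odds_ratio)
    p0 = p_unexposed
    pooled = (n_exposed * p1 + n_unexposed * p0) / (n_exposed + n_unexposed)
    # Standard error under the null (pooled) and under the alternative.
    se_null = math.sqrt(pooled * (1 - pooled) * (1 / n_exposed + 1 / n_unexposed))
    se_alt = math.sqrt(p1 * (1 - p1) / n_exposed + p0 * (1 - p0) / n_unexposed)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    diff = abs(p1 - p0)
    upper = std_normal.cdf((diff - z_crit * se_null) / se_alt)
    lower = std_normal.cdf((-diff - z_crit * se_null) / se_alt)
    return upper + lower
```

At an odds ratio of 1.0 the two standard errors coincide and the function returns the significance level, which is consistent with the 0.050 entries in the OR 1.0 column of the table.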
Observed Power for GAM Hypothesis Tests and Spatial Scan Statistic
| Case | Probability of Disease Unexposed | Method | OR 0.5 | OR 1.0 | OR 1.5 | OR 2.0 | OR 2.5 | OR 3.0 | OR 3.5 |
|---|---|---|---|---|---|---|---|---|---|
| Case 1 | 0.05 | CPT | 0.081 | 0.043 | 0.070 | 0.174 | 0.301 | 0.474 | |
| | | FMSPT-3 | 0.055 | 0.033 | 0.048 | 0.139 | 0.250 | 0.403 | |
| | | FMSPT-5 | 0.044 | 0.023 | 0.039 | 0.103 | 0.225 | 0.361 | |
| | | Scan Statistic | 0.087 | 0.052 | 0.075 | 0.169 | 0.302 | 0.494 | |
| | 0.20 | CPT | 0.239 | 0.047 | 0.149 | 0.447 | 0.764 | 0.923 | |
| | | FMSPT-3 | 0.166 | 0.027 | 0.111 | 0.378 | 0.697 | 0.890 | |
| | | FMSPT-5 | 0.152 | 0.020 | 0.094 | 0.330 | 0.673 | 0.880 | |
| | | Scan Statistic | 0.243 | 0.044 | 0.121 | 0.467 | 0.833 | 0.963 | |
| Case 2 | 0.05 | CPT | 0.095 | 0.069 | 0.054 | 0.094 | 0.176 | 0.230 | 0.381 |
| | | FMSPT-3 | 0.074 | 0.048 | 0.035 | 0.075 | 0.131 | 0.186 | 0.310 |
| | | FMSPT-5 | 0.053 | 0.037 | 0.029 | 0.061 | 0.115 | 0.154 | 0.281 |
| | | Scan Statistic | 0.059 | 0.052 | 0.048 | 0.091 | 0.137 | 0.194 | 0.293 |
| | 0.20 | CPT | 0.223 | 0.050 | 0.118 | 0.293 | 0.520 | 0.714 | 0.878 |
| | | FMSPT-3 | 0.173 | 0.033 | 0.089 | 0.220 | 0.441 | 0.658 | 0.833 |
| | | FMSPT-5 | 0.147 | 0.023 | 0.073 | 0.199 | 0.404 | 0.611 | 0.798 |
| | | Scan Statistic | 0.143 | 0.045 | 0.080 | 0.170 | 0.367 | 0.584 | 0.758 |
| Case 3 | 0.05 | CPT | 0.094 | 0.040 | 0.059 | 0.131 | 0.240 | 0.426 | 0.547 |
| | | FMSPT-3 | 0.071 | 0.027 | 0.037 | 0.078 | 0.178 | 0.359 | 0.478 |
| | | FMSPT-5 | 0.056 | 0.017 | 0.028 | 0.064 | 0.152 | 0.324 | 0.437 |
| | | Scan Statistic | 0.068 | 0.042 | 0.059 | 0.085 | 0.129 | 0.201 | 0.285 |
| | 0.20 | CPT | 0.276 | 0.058 | 0.137 | 0.400 | 0.712 | 0.882 | 0.970 |
| | | FMSPT-3 | 0.218 | 0.036 | 0.094 | 0.327 | 0.622 | 0.843 | 0.954 |
| | | FMSPT-5 | 0.191 | 0.026 | 0.074 | 0.298 | 0.587 | 0.818 | 0.942 |
| | | Scan Statistic | 0.158 | 0.047 | 0.067 | 0.192 | 0.348 | 0.534 | 0.703 |
Case 1 Sensitivity - Mean Proportion of True Cluster Detected as Hot- or Coldspot
| Probability of Disease Unexposed | Method | Odds Ratios | | | | |
|---|---|---|---|---|---|---|
| 0.05 | CPT | 0.380 | 0.360 | 0.582 | 0.671 | 0.740 |
| | FMSPT-3 | 0.418 | 0.513 | 0.697 | 0.807 | 0.855 |
| | FMSPT-5 | 0.364 | 0.538 | 0.725 | 0.834 | 0.876 |
| | Scan Statistic | 0.335 | 0.325 | 0.536 | 0.607 | 0.686 |
| 0.20 | CPT | 0.653 | 0.516 | 0.738 | 0.811 | 0.855 |
| | FMSPT-3 | 0.584 | 0.644 | 0.851 | 0.926 | 0.961 |
| | FMSPT-5 | 0.618 | 0.660 | 0.874 | 0.935 | 0.969 |
| | Scan Statistic | 0.684 | 0.498 | 0.688 | 0.782 | 0.842 |
*Sensitivity = mean(Percent Cluster Located | Rejected Ho)
Cases 2 and 3 Sensitivity - Detecting the Exposure Source Location
| Case | Probability of Disease Unexposed | Method | Odds Ratios | | | | | |
|---|---|---|---|---|---|---|---|---|
| Case 2^ | | | | | | | | |
| | 0.05 | CPT | 0.453 | 0.352 | 0.543 | 0.688 | 0.896 | 0.882 |
| | | FMSPT-3 | 0.541 | 0.543 | 0.773 | 0.824 | 0.952 | 0.952 |
| | | FMSPT-5 | 0.585 | 0.517 | 0.754 | 0.800 | 0.968 | 0.947 |
| | | Scan Statistic | 0.026 | 0.012 | 0.038 | 0.094 | 0.153 | 0.229 |
| | 0.20 | CPT | 0.740 | 0.559 | 0.860 | 0.950 | 0.976 | 0.987 |
| | | FMSPT-3 | 0.936 | 0.652 | 0.964 | 0.995 | 0.995 | 0.998 |
| | | FMSPT-5 | 0.973 | 0.644 | 0.965 | 0.995 | 0.997 | 0.999 |
| | | Scan Statistic | 0.082 | 0.029 | 0.120 | 0.301 | 0.533 | 0.725 |
| Case 3* | | | Mean | Mean | Mean | Mean | Mean | Mean |
| | 0.05 | CPT | 0.256 | 0.234 | 0.333 | 0.391 | 0.463 | 0.524 |
| | | FMSPT-3 | 0.308 | 0.344 | 0.436 | 0.516 | 0.590 | 0.634 |
| | | FMSPT-5 | 0.368 | 0.353 | 0.473 | 0.549 | 0.622 | 0.661 |
| | | Scan Statistic | 0.263 | 0.189 | 0.232 | 0.299 | 0.315 | 0.346 |
| | 0.20 | CPT | 0.394 | 0.339 | 0.454 | 0.548 | 0.641 | 0.704 |
| | | FMSPT-3 | 0.511 | 0.453 | 0.574 | 0.659 | 0.737 | 0.788 |
| | | FMSPT-5 | 0.550 | 0.475 | 0.598 | 0.638 | 0.758 | 0.807 |
| | | Scan Statistic | 0.363 | 0.242 | 0.282 | 0.390 | 0.441 | 0.443 |
^Sensitivity reflects the probability of the identification of the center of the region as high or low risk, given that the global null hypothesis was rejected, i.e.
Sensitivity = P(Exposure Source Located | Rejected Ho)
* Sensitivity reflects the mean percent of the vertical exposure source identified as high or low risk, given that the global null hypothesis was rejected, i.e.
Sensitivity = mean(Percent Exposure Source Located | Rejected Ho)
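Both sensitivity definitions are conditional averages over simulation replicates: only replicates in which the global null hypothesis was rejected contribute. A minimal sketch follows, assuming each replicate is summarized as a hypothetical pair (rejected, score), where the score is a 0/1 indicator for the probability version or a proportion between 0 and 1 for the mean-percent version.

```python
def sensitivity(replicates):
    """Mean score among replicates where the global null was rejected.

    replicates: iterable of (rejected, score) pairs (a hypothetical layout).
    Returns NaN when no replicate rejected the null, since the conditional
    mean is undefined in that case.
    """
    scores = [score for rejected, score in replicates if rejected]
    if not scores:
        return float("nan")
    return sum(scores) / len(scores)
```

For example, with made-up replicates `[(True, 1.0), (True, 0.0), (False, 1.0), (True, 0.5)]`, the third replicate is ignored and the result is the mean of 1.0, 0.0, and 0.5.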
Figure 5. Distributions of Optimal Span Size and Most Likely Cluster Radius Observed for Case 1 with Odds Ratio of 3.0. a: Case 1 Conditional Permutation Test Optimal Span Size for Odds Ratio of 3.0. This figure depicts the optimal span size selected by applying GAMs across a range of possible spans and selecting the optimal span as that which corresponds to the minimal model AIC statistic. b: Case 1 Scan Statistic Most Likely Cluster Radius for Odds Ratio of 3.0. This figure depicts the distribution of the observed radius for most likely clusters selected by the scan statistic. It is paired with Figure 5a so that the tendencies of the methods to over- or under-smooth can be compared. Figure 5a shows that for lower disease prevalence the GAM methods tend to choose a large span size, possibly over-smoothing and missing the cluster. The scan statistic tends to under-smooth and finds a most likely cluster that is much smaller than the true cluster radius, as shown in Figure 5b.
Radii of Scan Statistic Most Likely Cluster with Significant P-Value (p < 0.05)
| Probability of Disease Unexposed | OR 0.5 | OR 1.0 | OR 1.5 | OR 2.0 | OR 2.5 | OR 3.0 | OR 3.5 |
|---|---|---|---|---|---|---|---|
| Case 1 | | | | | | | |
| 0.05 | 0.39 | 0.34 | 0.32 | 0.37 | 0.35 | 0.37 | |
| 0.20 | 0.37 | 0.24 | 0.33 | 0.37 | 0.38 | 0.38 | |
| Case 2 | | | | | | | |
| 0.05 | 0.37 | 0.33 | 0.31 | 0.36 | 0.43 | 0.43 | 0.45 |
| 0.20 | 0.40 | 0.28 | 0.36 | 0.45 | 0.47 | 0.50 | 0.53 |
| Case 3 | | | | | | | |
| 0.05 | 0.40 | 0.43 | 0.34 | 0.35 | 0.41 | 0.47 | 0.46 |
| 0.20 | 0.46 | 0.38 | 0.39 | 0.41 | 0.49 | 0.52 | 0.52 |
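How a most likely cluster and its radius arise can be sketched with a simplified circular scan under a Bernoulli model. This is an illustration under stated assumptions rather than the software used in the study: circles are centred on observed points, zones are capped at half of all points (an assumed limit), high- and low-risk zones are both eligible, Monte Carlo replicates are generated by permuting case labels, and all function names are hypothetical.

```python
import math
import random


def group_log_likelihood(cases, total):
    """Maximized Bernoulli log-likelihood of one group, taking 0 * log(0) as 0."""
    value = 0.0
    if cases > 0:
        value += cases * math.log(cases / total)
    if total - cases > 0:
        value += (total - cases) * math.log((total - cases) / total)
    return value


def bernoulli_llr(c, n, C, N):
    """Log-likelihood ratio: separate risks inside and outside vs. one common risk."""
    return (group_log_likelihood(c, n)
            + group_log_likelihood(C - c, N - n)
            - group_log_likelihood(C, N))


def build_zones(coords):
    """For each centre, all points as (distance, index), nearest first."""
    return [sorted((math.hypot(x - cx, y - cy), j) for j, (x, y) in enumerate(coords))
            for cx, cy in coords]


def most_likely_cluster(zones, outcomes, max_fraction=0.5):
    """Return (llr, centre index, radius) of the zone with the largest LLR."""
    N, C = len(outcomes), sum(outcomes)
    best = (0.0, None, 0.0)
    for center, ordered in enumerate(zones):
        c = 0
        for n, (dist, j) in enumerate(ordered, start=1):
            if n > max_fraction * N:   # ASSUMPTION: cap zones at half of all points
                break
            c += outcomes[j]
            if ordered[n][0] == dist:  # wait until every point at this radius is inside
                continue
            llr = bernoulli_llr(c, n, C, N)
            if llr > best[0]:
                best = (llr, center, dist)
    return best


def scan_p_value(coords, outcomes, n_sim=99, seed=0):
    """Observed most likely cluster and its Monte Carlo p-value."""
    rng = random.Random(seed)
    zones = build_zones(coords)        # zones depend on locations only
    observed = most_likely_cluster(zones, outcomes)
    shuffled = list(outcomes)
    exceed = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)          # null replicate: cases placed at random
        if most_likely_cluster(zones, shuffled)[0] >= observed[0]:
            exceed += 1
    return observed, (exceed + 1) / (n_sim + 1)
```

The radius returned is the distance from the centre to the farthest point in the winning zone, which is the kind of quantity summarized in the table of radii for significant most likely clusters.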
Figure 6. Case 1 Points Detected at High Risk for Data with Scan Statistic Minimum Radius, Probability of Disease Outside Cluster = 0.20. This figure compares the area of the region detected as high risk by the methods discussed in this paper. This particular figure shows the minimum radius observed for a significant most likely cluster with an odds ratio of 3.0 and a probability of disease outside the cluster of 0.20.
Figure 7. Case 1 Points Detected at High Risk for Data with Scan Statistic Maximum Radius, Probability of Disease Outside Cluster = 0.20. This figure compares the area of the region detected as high risk by the methods discussed in this paper. This particular figure shows the maximum radius observed for a significant most likely cluster with an odds ratio of 3.0 and a probability of disease outside the cluster of 0.20.