Wei Wang1, Xiong Xiao1, Jian Qian1, Shiqi Chen2, Fang Liao3,4, Fei Yin1, Tao Zhang1, Xiaosong Li1, Yue Ma1.
Abstract
Most spatial models include a spatial weights matrix (W) derived from the first law of geography to adjust for spatial dependence so that the independence assumption is fulfilled. In fields such as epidemiological and environmental studies, the spatial dependence often shows clustering (or geographic discontinuity) due to natural or social factors. In such cases, adjustment using the first-law-of-geography-based W might be inappropriate and lead to inaccurate estimates and loss of statistical power. In this work, we propose a series of data-driven Ws (DDWs) built following the spatial pattern identified by the scan statistic, which can be easily carried out using existing tools such as SaTScan software. The DDWs take both the clustering (or discontinuous) and the intuitive first-law-of-geography-based spatial dependence into consideration. Aiming at two common purposes in epidemiological studies (ie, estimating the effect of an explanatory variable X and estimating the risk of each spatial unit in disease mapping), the common spatial autoregressive (SAR) models and the Leroux-prior-based conditional autoregressive (CAR) models were selected to evaluate the performance of the DDWs, respectively. Both simulation and case studies show that our DDWs achieve considerably better performance than the classic W in datasets with clustering (or discontinuous) spatial dependence. Furthermore, the most recently published density-based spatial clustering (DBSC) models, which aim to deal with such clustering (or discontinuous) spatial dependence in disease mapping, were also compared as references. The DDWs, incorporated into the CAR models, still show a considerable advantage, especially in datasets for common diseases.
Keywords: clustering spatial dependence; conditional autoregressive model; disease mapping; spatial autoregressive model; spatial weights matrix
Year: 2022 PMID: 35347729 PMCID: PMC9313839 DOI: 10.1002/sim.9395
Source DB: PubMed Journal: Stat Med ISSN: 0277-6715 Impact factor: 2.497
The method of assigning weights within a cluster or the baseline region
| Method of assigning weights | Notation |
|---|---|
| Geographic continuity weighting (GW) | … otherwise |
| Null weighting (NW) | |
| Risk weighting (RW) | |
Note: The two quantities in the notation are the observed values of the two spatial units being paired, respectively.
Six categories of DDWs
| Assigning weights for the baseline | Assigning weights for clusters | Label of the DDW |
|---|---|---|
| GW | GW | GG |
| GW | NW | GN |
| GW | RW | GR |
| NW | GW | NG |
| NW | NW | NN |
| NW | RW | NR |
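
As a rough illustration of how a label pairs a baseline rule with a cluster rule, the sketch below builds GG, GN, NG, or NN matrices from a binary contiguity matrix and a vector of region labels. The function name `build_ddw`, the label encoding (0 for the baseline, positive integers for clusters), the reading of GW as plain 0/1 contiguity, and the choice to set weights between units in different regions to zero are all assumptions made for illustration; none of them is specified in the tables above. RW is left out because its formula is not given here.

```python
import numpy as np

def build_ddw(adjacency, region, baseline_rule="G", cluster_rule="G"):
    """Sketch of a data-driven W (assumed construction, not the authors' code).

    adjacency : (n, n) binary contiguity matrix (1 if two units are neighbours).
    region    : length-n integer labels; 0 = baseline, k > 0 = cluster k.
    Rules     : "G" keeps contiguity weights inside a region (GW, assumed 0/1);
                "N" sets every weight inside a region to zero (NW).
    Assumption: units in different regions always get weight 0.
    """
    for rule in (baseline_rule, cluster_rule):
        if rule not in ("G", "N"):
            raise ValueError("rule must be 'G' or 'N'")
    adjacency = np.asarray(adjacency, dtype=float)
    region = np.asarray(region)

    same_region = region[:, None] == region[None, :]
    in_baseline = same_region & (region[:, None] == 0)
    in_cluster = same_region & (region[:, None] > 0)

    w = np.zeros_like(adjacency)
    if baseline_rule == "G":
        w[in_baseline] = adjacency[in_baseline]
    if cluster_rule == "G":
        w[in_cluster] = adjacency[in_cluster]
    np.fill_diagonal(w, 0.0)  # no self-neighbours
    return w

# Toy example: five units on a line; units 3 and 4 form one detected cluster.
A = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
labels = np.array([0, 0, 0, 1, 1])
W_GN = build_ddw(A, labels, baseline_rule="G", cluster_rule="N")
W_NG = build_ddw(A, labels, baseline_rule="N", cluster_rule="G")
```

In this toy layout, `W_GN` keeps the baseline chain 0-1-2 and drops every link touching the cluster, while `W_NG` keeps only the link between units 3 and 4. In practice the region labels would come from a scan-statistic run (for example, SaTScan output) rather than being typed by hand.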
FIGURE 1. The process of constructing the DDWs
Six scenarios built in the simulation study
| Scenario | The number of clusters | The type of clusters |
|---|---|---|
| SEM(SLM)_C1Hh | 1 | Hh |
| SEM(SLM)_C1Hl | 1 | Hl |
| SEM(SLM)_C1F | 1 | F |
| SEM(SLM)_C2Hh | 2 | Hh |
| SEM(SLM)_C2Hl | 2 | Hl |
| SEM(SLM)_C2F | 2 | F |
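
The scenario prefixes refer to the two SAR specifications used in the simulation, the spatial error model (SEM) and the spatial lag model (SLM). For orientation, their textbook forms are written out below; the symbols $\rho$, $\lambda$, $\beta$, $u$, and $\varepsilon$ are the conventional ones and are an assumption here, since this page does not show the authors' notation.

$$
\text{SLM:}\quad y = \rho W y + X\beta + \varepsilon, \qquad
\text{SEM:}\quad y = X\beta + u,\quad u = \lambda W u + \varepsilon, \qquad
\varepsilon \sim N(0, \sigma^{2} I)
$$

In both forms the choice of $W$ determines which units are allowed to influence one another, which is where a DDW would replace the classic contiguity-based matrix.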
FIGURE 2. The distribution of clusters in the simulation scenarios for the SAR models. The top map shows the location of Sichuan Province in China. The other six maps represent the position of the artificial clusters in Sichuan Province, where different colors correspond to different relative risks with respect to the white
Generating the simulated X for each spatial unit in the SLM
| X | The location of X | Distribution | Available scenarios |
|---|---|---|---|
| X0 | In the baseline | N (0.8, 0.25) | All scenarios |
| X1 | In a high H cluster | N (1.5, 0.25) | SLM_C1Hh and SLM_C2Hh |
| X2 | In a second high H cluster | N (1.0625, 0.25) | SLM_C1Hl and SLM_C2Hl |
| X3 | In a high‐risk unit in a F cluster | N (1.5, 0.25) | SLM_C1F and SLM_C2F |
| X4 | In a second high‐risk unit in a F cluster | N (1.325, 0.25) | SLM_C1F and SLM_C2F |
Note: The other parameters of the SLM are shown in Figure S1.
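
The table above can be read as a lookup from a unit's location type to a normal distribution. A minimal sketch of that sampling step follows; the table does not say whether 0.25 is a variance or a standard deviation, so a variance is assumed here, and the function name `simulate_x`, the seed, and the layout of unit types are hypothetical.

```python
import numpy as np

# (mean, second parameter) pairs copied from the table. The second parameter
# is ASSUMED to be a variance; the source does not state this.
X_PARAMS = {
    "X0": (0.8, 0.25),     # baseline
    "X1": (1.5, 0.25),     # high H cluster
    "X2": (1.0625, 0.25),  # second high H cluster
    "X3": (1.5, 0.25),     # high-risk unit in an F cluster
    "X4": (1.325, 0.25),   # second high-risk unit in an F cluster
}

def simulate_x(unit_types, rng):
    """Draw one X value per spatial unit given its type label (e.g. "X0")."""
    means = np.array([X_PARAMS[t][0] for t in unit_types])
    sds = np.sqrt(np.array([X_PARAMS[t][1] for t in unit_types]))
    return rng.normal(loc=means, scale=sds)

rng = np.random.default_rng(2022)
unit_types = ["X0"] * 6 + ["X1"] * 4  # hypothetical layout: 6 baseline, 4 cluster units
x = simulate_x(unit_types, rng)
```

If 0.25 is in fact a standard deviation, the `np.sqrt` call should be dropped.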
The simulation results in the SEM over 10 000 replicas
| Scenario | Metric | AG | GG | GN | NG | NN | GR | NR |
|---|---|---|---|---|---|---|---|---|
| C1Hh | ABias | 0.747 | 0.207 | 0.175 | 0.008 | 0.038 | 2.152 | 3.014 |
| | MSE | 12.414 | 0.956 | 0.949 | 1.110 | 1.102 | 0.964 | 1.125 |
| | | 0.639 | 0.978 | 0.978 | 0.973 | 0.973 | 0.978 | 0.973 |
| C1Hl | ABias | 0.030 | 0.003 | 0.017 | 0.030 | 0.016 | 2.643 | 3.467 |
| | MSE | 2.545 | 0.923 | 0.914 | 1.110 | 1.100 | 0.936 | 1.127 |
| | | 0.630 | 0.881 | 0.881 | 0.861 | 0.859 | 0.882 | 0.859 |
| C1F | ABias | 0.925 | 0.243 | 0.221 | 0.577 | 0.565 | 3.329 | 3.760 |
| | MSE | 5.576 | 1.864 | 1.581 | 2.578 | 2.041 | 1.453 | 1.806 |
| | | 0.706 | 0.938 | 0.943 | 0.932 | 0.937 | 0.947 | 0.941 |
| C2Hh | ABias | 1.460 | 0.337 | 0.278 | 0.185 | 0.259 | 5.273 | 7.032 |
| | MSE | 25.213 | 0.967 | 0.961 | 1.115 | 1.103 | 1.020 | 1.191 |
| | | 0.579 | 0.988 | 0.988 | 0.986 | 0.986 | 0.988 | 0.986 |
| C2Hl | ABias | 0.526 | 0.120 | 0.146 | 0.130 | 0.159 | 5.831 | 7.285 |
| | MSE | 4.469 | 0.937 | 0.936 | 1.113 | 1.112 | 1.001 | 1.198 |
| | | 0.584 | 0.927 | 0.927 | 0.915 | 0.914 | 0.928 | 0.915 |
| C2F | ABias | 0.231 | 0.114 | 0.086 | 0.101 | 0.072 | 7.515 | 9.022 |
| | MSE | 11.674 | 2.666 | 2.226 | 3.613 | 2.846 | 1.983 | 2.457 |
| | | 0.654 | 0.952 | 0.956 | 0.949 | 0.954 | 0.961 | 0.958 |

Note: For the ABias and MSE, the values have been multiplied by 1000 to clearly present the comparison.
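
For readers who want to reproduce summary rows of this kind, the sketch below computes an absolute bias and a mean squared error over simulation replicas and applies the ×1000 scaling mentioned in the note. The exact definitions are not given on this page, so the code assumes ABias = |mean(estimate) − truth| and MSE = mean((estimate − truth)²); the function name, the true value, and the simulated estimates are hypothetical.

```python
import numpy as np

def abias_and_mse(estimates, true_value, scale=1000.0):
    """Return (ABias, MSE) over replicas, both multiplied by `scale`.

    Assumed definitions (not confirmed by the source):
      ABias = |mean(estimate) - truth|
      MSE   = mean((estimate - truth)^2)
    """
    errors = np.asarray(estimates, dtype=float) - true_value
    abias = abs(errors.mean())
    mse = np.mean(errors ** 2)
    return scale * abias, scale * mse

# Hypothetical replicas: a slightly biased, noisy estimator of beta = 1.
rng = np.random.default_rng(0)
beta_true = 1.0
beta_hat = beta_true + 0.02 + rng.normal(0.0, 0.03, size=10_000)
abias_x1000, mse_x1000 = abias_and_mse(beta_hat, beta_true)
```

Under these definitions the unscaled MSE can never be smaller than the squared unscaled bias, which gives a quick sanity check when reading scaled tables: divide both entries by 1000 before comparing.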
The simulation results in the SLM over 10 000 replicas
| Scenario | Metric | AG | GG | GN | NG | NN | GR | NR |
|---|---|---|---|---|---|---|---|---|
| C1Hh | ABias | 436.661 | 3.7674 | 4.062 | 34.011 | 36.932 | 0.706 | 33.648 |
| | MSE | 198.881 | 1.087 | 1.092 | 2.485 | 2.706 | 1.067 | 2.469 |
| | | 0.728 | 0.976 | 0.976 | 0.971 | 0.970 | 0.976 | 0.970 |
| C1Hl | ABias | 63.584 | 0.514 | 0.241 | 31.956 | 34.805 | 3.966 | 30.992 |
| | MSE | 6.554 | 1.062 | 1.064 | 2.354 | 2.556 | 1.080 | 2.305 |
| | | 0.685 | 0.879 | 0.879 | 0.851 | 0.849 | 0.880 | 0.850 |
| C1F | ABias | 248.806 | 30.916 | 28.407 | 56.611 | 55.975 | 24.731 | 52.213 |
| | MSE | 66.986 | 2.690 | 2.397 | 5.248 | 5.010 | 2.044 | 4.388 |
| | | 0.771 | 0.942 | 0.946 | 0.933 | 0.937 | 0.951 | 0.944 |
| C2Hh | ABias | 865.224 | 7.087 | 7.5550 | 35.081 | 38.235 | 0.094 | 30.890 |
| | MSE | 760.985 | 1.143 | 1.149 | 2.563 | 2.806 | 1.087 | 2.297 |
| | | 0.712 | 0.987 | 0.987 | 0.984 | 0.984 | 0.987 | 0.984 |
| C2Hl | ABias | 144.303 | 2.790 | 3.254 | 33.653 | 36.765 | 4.387 | 29.169 |
| | MSE | 24.933 | 1.097 | 1.100 | 2.474 | 2.705 | 1.110 | 2.206 |
| | | 0.659 | 0.924 | 0.924 | 0.908 | 0.907 | 0.925 | 0.908 |
| C2F | ABias | 524.901 | 62.686 | 58.240 | 79.726 | 77.186 | 48.519 | 67.418 |
| | MSE | 283.911 | 6.244 | 5.472 | 8.965 | 8.314 | 4.083 | 6.483 |
| | | 0.749 | 0.956 | 0.960 | 0.953 | 0.956 | 0.966 | 0.963 |

Note: For the ABias and MSE, the values have been multiplied by 1000 to clearly present the comparison.
FIGURE 3. The distribution of clusters in the simulation scenarios for the CAR models. Scenario 1 has no clusters, scenario 2 has 11 high/low‐risk clusters, and scenario 3 has 9 intra‐heterogeneous clusters
Average values of MARB, MRRMSE, and LS in the CAR models
Column groups in the header: No background cluster; With background cluster. The six column labels between LCAR and NNCAR are missing.

| Scenario | Metric | LCAR | | | | | | | NNCAR | NGCAR | GNCAR | GGCAR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Scenario 1A | MARB | 0.012 | 0.013 | 0.014 | 0.013 | 0.012 | 0.012 | 0.012 | 0.012 | 0.012 | 0.012 | 0.012 |
| | MRRMSE | 0.014 | 0.034 | 0.038 | 0.036 | 0.029 | 0.033 | 0.032 | 0.016 | 0.016 | 0.016 | 0.016 |
| | LS | 1394.7 | 1420.4 | 1540.7 | 1417.7 | 1414.1 | 1411.5 | 1405.9 | 1394.5 | 1394.5 | 1394.4 | 1394.4 |
| Scenario 1B | MARB | 0.012 | 0.017 | 0.034 | 0.086 | 0.014 | 0.017 | 0.031 | 0.012 | 0.012 | 0.012 | 0.012 |
| | MRRMSE | 0.023 | 0.05 | 0.093 | 0.169 | 0.044 | 0.056 | 0.1 | 0.026 | 0.026 | 0.026 | 0.026 |
| | LS | 776.5 | 775.6 | 774.9 | 783.4 | 776.2 | 776.0 | 776.6 | 776.6 | 776.6 | 776.5 | 776.5 |
| Scenario 1C | MARB | 0.012 | 0.047 | 0.057 | 0.078 | 0.033 | 0.033 | 0.032 | 0.013 | 0.013 | 0.013 | 0.013 |
| | MRRMSE | 0.032 | 0.112 | 0.129 | 0.165 | 0.101 | 0.101 | 0.112 | 0.053 | 0.052 | 0.054 | 0.054 |
| | LS | 521.4 | 520.6 | 521.1 | 524.5 | 520.8 | 521.1 | 522.1 | 522.6 | 523.6 | 521.4 | 521.4 |
| Scenario 2A | MARB | 0.058 | 0.054 | 0.054 | 0.057 | 0.054 | 0.055 | 0.057 | 0.053 | 0.052 | 0.052 | 0.047 |
| | MRRMSE | 0.125 | 0.129 | 0.131 | 0.133 | 0.125 | 0.13 | 0.132 | 0.138 | 0.138 | 0.123 | 0.124 |
| | LS | 1533.9 | 1530.0 | 1532.7 | 1547.1 | 1514.4 | 1522.2 | 1539.8 | 1539.5 | 1539.5 | 1509.5 | 1516.6 |
| Scenario 2B | MARB | 0.096 | 0.099 | 0.113 | 0.127 | 0.099 | 0.102 | 0.104 | 0.117 | 0.117 | 0.11 | 0.112 |
| | MRRMSE | 0.17 | 0.213 | 0.243 | 0.268 | 0.207 | 0.213 | 0.219 | 0.143 | 0.143 | 0.145 | 0.143 |
| | LS | 826.5 | 817.0 | 820.8 | 833.8 | 817.0 | 819.6 | 822.7 | 807.6 | 807.6 | 803.3 | 804.9 |
| Scenario 2C | MARB | 0.132 | 0.13 | 0.134 | 0.125 | 0.121 | 0.119 | 0.115 | 0.111 | 0.111 | 0.108 | 0.108 |
| | MRRMSE | 0.202 | 0.266 | 0.276 | 0.285 | 0.268 | 0.267 | 0.266 | 0.187 | 0.186 | 0.187 | 0.183 |
| | LS | 555.3 | 547.8 | 551.9 | 558.9 | 547.5 | 550.3 | 552.4 | 545.5 | 545.3 | 541.8 | 542.9 |
| Scenario 3A | MARB | 0.049 | 0.04 | 0.045 | 0.046 | 0.041 | 0.045 | 0.047 | 0.051 | 0.051 | 0.041 | 0.04 |
| | MRRMSE | 0.097 | 0.1 | 0.104 | 0.104 | 0.098 | 0.1 | 0.103 | 0.116 | 0.116 | 0.095 | 0.100 |
| | LS | 1522.6 | 1522.9 | 1536.4 | 1531.8 | 1508.6 | 1519.5 | 1525.4 | 1520.9 | 1520.9 | 1493.1 | 1498.6 |
| Scenario 3B | MARB | 0.105 | 0.075 | 0.075 | 0.098 | 0.079 | 0.08 | 0.078 | 0.106 | 0.105 | 0.087 | 0.090 |
| | MRRMSE | 0.139 | 0.158 | 0.196 | 0.233 | 0.162 | 0.171 | 0.184 | 0.129 | 0.128 | 0.118 | 0.118 |
| | LS | 833.6 | 825.1 | 827.5 | 834.5 | 825.1 | 828.4 | 829.7 | 830.9 | 830.5 | 824.0 | 824.9 |
| Scenario 3C | MARB | 0.152 | 0.081 | 0.087 | 0.096 | 0.081 | 0.089 | 0.088 | 0.139 | 0.138 | 0.124 | 0.125 |
| | MRRMSE | 0.174 | 0.207 | 0.226 | 0.243 | 0.206 | 0.216 | 0.224 | 0.166 | 0.165 | 0.156 | 0.156 |
| | LS | 558.8 | 552.0 | 553.9 | 561.1 | 552.6 | 553.9 | 556.3 | 556.3 | 556.3 | 552.2 | 552.4 |
Note: In the DDWCAR models, fixed cluster‐level spatial effects are used for the relatively small number of identified clusters. We also tried random cluster‐level spatial effects, which gave similar results. Only the high‐risk clusters were considered in the scan statistic for the rare (or very rare) diseases.
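
The MARB and MRRMSE rows summarise how far estimated relative risks fall from the simulated truth. This page does not define them, so the sketch below assumes the definitions commonly used in disease-mapping simulation studies: for each unit, take the relative error across simulations, then average the absolute mean relative bias (MARB) or the root mean squared relative error (MRRMSE) over units. The function name and the toy numbers are hypothetical, and LS is omitted because it needs posterior predictive densities from the fitted model.

```python
import numpy as np

def marb_and_mrrmse(risk_hat, risk_true):
    """Assumed definitions, not taken from the source.

    risk_hat  : (n_sims, n_units) estimated relative risks.
    risk_true : (n_units,) true relative risks used in the simulation.
    MARB   = mean over units of |mean over sims of relative error|
    MRRMSE = mean over units of sqrt(mean over sims of relative error^2)
    """
    risk_hat = np.asarray(risk_hat, dtype=float)
    risk_true = np.asarray(risk_true, dtype=float)
    rel_err = (risk_hat - risk_true) / risk_true  # broadcasts over simulations
    marb = np.mean(np.abs(rel_err.mean(axis=0)))
    mrrmse = np.mean(np.sqrt(np.mean(rel_err ** 2, axis=0)))
    return marb, mrrmse

# Toy input: two simulations, two spatial units.
toy_true = np.array([1.0, 2.0])
toy_hat = np.array([[1.1, 2.0],
                    [0.9, 2.4]])
toy_marb, toy_mrrmse = marb_and_mrrmse(toy_hat, toy_true)
```

Because positive and negative errors cancel inside MARB but not inside MRRMSE, a model can show a small MARB alongside a noticeably larger MRRMSE, a pattern visible in several rows of the table.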
The results of parameter estimation in the SAR models for the case study
| X | AG | NG | NR | NN | GG | GR | GN |
|---|---|---|---|---|---|---|---|
| Temperature | 0.0169 | 0.0207 | 0.0176 | 0.0220 | 0.0160 | 0.0110 | 0.0174 |
| | (0.0036) | (0.0031) | (0.0027) | (0.0031) | (0.0031) | (0.0025) | (0.0032) |
| Sunshine | −0.0230 | −0.0265 | −0.0198 | −0.0241 | −0.0244 | −0.0156 | −0.0224 |
| | (0.0141) | (0.0131) | (0.0117) | (0.0132) | (0.0127) | (0.0105) | (0.0127) |
| Wind | 0.0066 | 0.0263 | 0.0650 | 0.0567 | 0.0204 | 0.0582 | 0.0494 |
| | (0.0378) | (0.0355) | (0.0314) | (0.0355) | (0.0341) | (0.0282) | (0.0341) |
| Humidity | −0.0164 | −0.0239 | −0.0072 | −0.0203 | −0.0199 | −0.0019 | −0.0165 |
| | (0.0197) | (0.0188) | (0.0166) | (0.0190) | (0.0178) | (0.0148) | (0.0179) |
| | 0.58 | 0.63 | 0.71 | 0.63 | 0.66 | 0.77 | 0.66 |
| AIC | 713.08 | 691.48 | 650.05 | 685.75 | 685.42 | 628.71 | 681.22 |
Note: The values in parentheses are the standard errors corresponding to the estimated parameters.
P < .1.
P < .05.
P < .001.
FIGURE 4. The estimated risk surface and logarithm score (LS) values using LCAR, DBSC, and DDWCAR models in the case study