Md Hamidul Huque1,2, Craig Anderson3,4, Richard Walton5, Louise Ryan3,4.
Abstract
BACKGROUND: Mapping disease rates over a region provides a visual illustration of the underlying geographical variation of the disease and can help generate new hypotheses about its aetiology. However, methods for fitting the widely used conditional autoregressive (CAR) models for disease mapping are infeasible in many applications because of memory constraints, particularly when the sample size is large. We propose a new algorithm, termed indiCAR, to fit a CAR model that accommodates both individual- and group-level covariates while adjusting for spatial correlation in the disease rates. Our method scales well and works on very large datasets where other methods fail.
Keywords: Covariate adjustment; Disease mapping; Geographical variation; Neutropenia; Spatial model
Year: 2016 PMID: 27473270 PMCID: PMC4966783 DOI: 10.1186/s12942-016-0055-7
Source DB: PubMed Journal: Int J Health Geogr ISSN: 1476-072X Impact factor: 3.918
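As a rough illustration of the kind of spatial prior the abstract refers to, the sketch below builds the precision matrix of a Leroux-type CAR prior, (ρ(D − W) + (1 − ρ)I)/σ², from a binary neighbourhood matrix W. This is only a minimal sketch of the standard textbook form, not the indiCAR algorithm itself; the function name `leroux_precision`, the four-area adjacency matrix, and the parameter values are assumptions made for illustration.

```python
import numpy as np


def leroux_precision(W, rho, sigma2=1.0):
    """Precision matrix of a Leroux-type CAR prior (illustrative helper).

    W      : symmetric 0/1 neighbourhood matrix with a zero diagonal
    rho    : spatial dependence parameter in [0, 1)
    sigma2 : variance parameter
    """
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))  # diagonal matrix of neighbour counts
    n = W.shape[0]
    return (rho * (D - W) + (1.0 - rho) * np.eye(n)) / sigma2


# Hypothetical example: four areas arranged in a line (1-2-3-4).
W = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

Q_indep = leroux_precision(W, rho=0.0)     # no spatial dependence
Q_spatial = leroux_precision(W, rho=0.99)  # strong spatial dependence
```

With ρ = 0 the precision reduces to the identity (independent area effects), while values of ρ close to 1 approach the intrinsic CAR structure D − W.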
Simulation results for estimated regression coefficients following indiCAR and Leroux et al. [7] where each area consists of a random number of subjects between 10 and 1000
| True value | indiCAR | Leroux et al. [7] | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Estimated coefficient | |||||||||||||
| 0.00 | −0.178 | −2.500 | 0.700 | −1.997 | −1.500 | 0.201 | 0.501 | 0.800 | 0.199 | 0.396 | 0.019 | 0.198 | 0.442 | 0.069 |
| 0.25 | −0.174 | −2.499 | 0.699 | −1.997 | −1.498 | 0.200 | 0.501 | 0.801 | 0.198 | 0.395 | 0.251 | 0.198 | 0.421 | 0.304 |
| 0.50 | −0.162 | −2.500 | 0.700 | −2.001 | −1.501 | 0.200 | 0.500 | 0.801 | 0.198 | 0.396 | 0.503 | 0.198 | 0.413 | 0.523 |
| 0.75 | −0.152 | −2.501 | 0.701 | −2.005 | −1.499 | 0.201 | 0.500 | 0.800 | 0.198 | 0.394 | 0.736 | 0.198 | 0.407 | 0.722 |
| 0.99 | −0.144 | −2.499 | 0.700 | −2.000 | −1.500 | 0.200 | 0.499 | 0.799 | 0.199 | 0.396 | 0.958 | 0.199 | 0.412 | 0.950 |
| Empirical standard error | ||||||||||||||
| 0.00 | 0.033 | 0.016 | 0.012 | 0.059 | 0.039 | 0.025 | 0.026 | 0.026 | 0.021 | 0.030 | 0.028 | 0.022 | 0.035 | 0.056 |
| 0.25 | 0.037 | 0.016 | 0.011 | 0.060 | 0.039 | 0.027 | 0.026 | 0.027 | 0.017 | 0.033 | 0.101 | 0.017 | 0.035 | 0.106 |
| 0.50 | 0.042 | 0.016 | 0.012 | 0.058 | 0.040 | 0.027 | 0.027 | 0.026 | 0.016 | 0.029 | 0.130 | 0.016 | 0.028 | 0.120 |
| 0.75 | 0.054 | 0.016 | 0.012 | 0.061 | 0.039 | 0.028 | 0.027 | 0.028 | 0.013 | 0.025 | 0.114 | 0.014 | 0.025 | 0.106 |
| 0.99 | 0.209 | 0.017 | 0.012 | 0.064 | 0.038 | 0.027 | 0.027 | 0.027 | 0.012 | 0.020 | 0.037 | 0.013 | 0.020 | 0.043 |
| Average of the simulated standard error | ||||||||||||||
| 0.00 | 0.034 | 0.016 | 0.012 | 0.061 | 0.039 | 0.027 | 0.027 | 0.027 | 0.021 | 0.016 | 0.026 | 0.022 | 0.018 | 0.031 |
| 0.25 | 0.036 | 0.016 | 0.012 | 0.061 | 0.039 | 0.028 | 0.027 | 0.027 | 0.017 | 0.017 | 0.053 | 0.018 | 0.018 | 0.058 |
| 0.50 | 0.041 | 0.016 | 0.012 | 0.062 | 0.039 | 0.028 | 0.027 | 0.027 | 0.015 | 0.018 | 0.079 | 0.015 | 0.018 | 0.080 |
| 0.75 | 0.028 | 0.016 | 0.012 | 0.062 | 0.039 | 0.028 | 0.027 | 0.027 | 0.014 | 0.019 | 0.086 | 0.014 | 0.019 | 0.086 |
| 0.99 | 0.130 | 0.016 | 0.012 | 0.062 | 0.040 | 0.028 | 0.027 | 0.028 | 0.013 | 0.019 | 0.034 | 0.013 | 0.020 | 0.038 |
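The three panels of this table follow a common simulation-study layout. The sketch below shows how such summaries are usually computed from replicate-level output, assuming that "empirical standard error" means the standard deviation of the estimates across replicates and that the "average of the simulated standard error" is the mean of the model-reported standard errors. Those definitions, the function name `summarise_simulation`, and the synthetic replicates are assumptions, not details taken from the paper.

```python
import numpy as np


def summarise_simulation(estimates, std_errors, true_value):
    """Summarise one parameter across simulation replicates (illustrative)."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    return {
        # Mean of the point estimates across replicates
        "mean_estimate": estimates.mean(),
        # Difference between the mean estimate and the true value
        "bias": estimates.mean() - true_value,
        # Standard deviation of the estimates across replicates
        "empirical_se": estimates.std(ddof=1),
        # Mean of the standard errors reported by the fitted model
        "average_model_se": std_errors.mean(),
    }


# Synthetic replicates for a hypothetical coefficient with true value 0.5.
rng = np.random.default_rng(0)
est = rng.normal(loc=0.5, scale=0.03, size=500)
ses = np.full(500, 0.03)

summary = summarise_simulation(est, ses, true_value=0.5)
```

When the empirical and average model-based standard errors agree closely, the model's uncertainty estimates are well calibrated for that parameter; a large gap between them suggests the reported standard errors are too small or too large.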
Simulation results for estimated regression coefficients following indiCAR and Leroux et al. [7] where each area consists of a random number of subjects between 10 and 50
| True value | indiCAR | Leroux et al. [7] | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Estimated coefficient | |||||||||||||
| 0.00 | −0.161 | −2.495 | 0.697 | −2.000 | −1.481 | 0.210 | 0.509 | 0.811 | 0.197 | 0.380 | 0.043 | 0.189 | 0.444 | 0.059 |
| 0.25 | −0.175 | −2.501 | 0.699 | −2.020 | −1.500 | 0.205 | 0.505 | 0.804 | 0.197 | 0.382 | 0.247 | 0.193 | 0.426 | 0.225 |
| 0.50 | −0.179 | −2.500 | 0.701 | −2.011 | −1.509 | 0.199 | 0.501 | 0.802 | 0.199 | 0.380 | 0.462 | 0.195 | 0.430 | 0.396 |
| 0.75 | −0.177 | −2.504 | 0.700 | −2.038 | −1.504 | 0.207 | 0.507 | 0.806 | 0.200 | 0.380 | 0.656 | 0.195 | 0.468 | 0.496 |
| 0.99 | −0.156 | −2.498 | 0.700 | −2.039 | −1.505 | 0.204 | 0.505 | 0.802 | 0.199 | 0.402 | 0.929 | 0.197 | 0.510 | 0.856 |
| Empirical standard error | ||||||||||||||
| 0.00 | 0.117 | 0.148 | 0.056 | 0.301 | 0.184 | 0.115 | 0.114 | 0.123 | 0.032 | 0.064 | 0.070 | 0.033 | 0.065 | 0.078 |
| 0.25 | 0.086 | 0.050 | 0.038 | 0.200 | 0.120 | 0.086 | 0.081 | 0.083 | 0.026 | 0.047 | 0.154 | 0.027 | 0.053 | 0.147 |
| 0.50 | 0.085 | 0.049 | 0.039 | 0.183 | 0.128 | 0.083 | 0.081 | 0.086 | 0.025 | 0.042 | 0.185 | 0.026 | 0.048 | 0.179 |
| 0.75 | 0.125 | 0.071 | 0.051 | 0.260 | 0.163 | 0.118 | 0.110 | 0.115 | 0.027 | 0.051 | 0.200 | 0.029 | 0.056 | 0.217 |
| 0.99 | 0.225 | 0.064 | 0.051 | 0.259 | 0.163 | 0.118 | 0.116 | 0.124 | 0.029 | 0.052 | 0.094 | 0.031 | 0.058 | 0.147 |
| Average of the simulated standard error | ||||||||||||||
| 0.00 | 0.116 | 0.066 | 0.049 | 0.254 | 0.160 | 0.114 | 0.112 | 0.113 | 0.031 | 0.036 | 0.075 | 0.032 | 0.035 | 0.062 |
| 0.25 | 0.092 | 0.051 | 0.038 | 0.197 | 0.125 | 0.088 | 0.086 | 0.087 | 0.025 | 0.033 | 0.114 | 0.027 | 0.032 | 0.096 |
| 0.50 | 0.093 | 0.052 | 0.038 | 0.197 | 0.125 | 0.088 | 0.086 | 0.087 | 0.024 | 0.036 | 0.163 | 0.025 | 0.035 | 0.134 |
| 0.75 | 0.122 | 0.067 | 0.049 | 0.261 | 0.163 | 0.115 | 0.113 | 0.114 | 0.028 | 0.051 | 0.191 | 0.029 | 0.047 | 0.163 |
| 0.99 | 0.164 | 0.067 | 0.049 | 0.260 | 0.163 | 0.115 | 0.112 | 0.114 | 0.027 | 0.051 | 0.059 | 0.029 | 0.052 | 0.088 |
Comparison of time to convergence and conditional AIC between indiCAR and other methods when data are generated without a spatial random effect
| Sample per group | Total sample | Time to convergence (s): indiCAR | Time to convergence (s): glmer with random intercept | Time to convergence (s): hlmer with random intercept | Time to convergence (s): hlmer with CAR | Conditional AIC: indiCAR | Conditional AIC: glmer with random intercept | Conditional AIC: hlmer with random intercept | Conditional AIC: hlmer with CAR |
|---|---|---|---|---|---|---|---|---|---|
| Data generated in 100 groups | |||||||||
| 1:50 | 2373 | 0.73 | 1.98 | 0.43 | 2.36 | 1419.26 | 1492.26 | 1445.65 | 1445.87 |
| 1:100 | 5056 | 2.09 | 5.26 | 0.55 | 2.93 | 3170.9 | 3225.28 | 3194.03 | 3193.98 |
| 1:500 | 26,473 | 10.34 | 23.63 | 1.65 | 12.96 | 15,996.05 | 15,968.02 | 15,955.41 | 15,955.40 |
| 1:1000 | 48,778 | 37.25 | 53.25 | 3.01 | 29.68 | 31,063.34 | 31,011.67 | 31,001.58 | 31,001.72 |
| Data generated in 400 groups | |||||||||
| 1:50 | 10,192 | 51.39 | 9.44 | 2.64 | 97.45 | 6027.28 | 6242.24 | 6097.72 | 6097.84 |
| 1:100 | 19,843 | 73.31 | 33.13 | 11.39 | 244.02 | 12,017.28 | 12,185.15 | 12,037.20 | 12,037.27 |
| 1:500 | 98,870 | 140.74 | 71.96 | 38.91 | Not feasible | 59,061.39 | 58,929.01 | 58,879.30 | Not feasible |
| 1:1000 | 205,952 | 207.84 | 214.51 | 149.96 | Not feasible | 121,733.50 | 121,533.80 | 121,510.20 | Not feasible |
Comparison of time to convergence and conditional AIC between indiCAR and other methods when data are generated with a spatial random effect
| Sample per group | Total sample | Time to convergence (s): indiCAR | Time to convergence (s): glmer with random intercept | Time to convergence (s): hlmer with random intercept | Time to convergence (s): hlmer with CAR | Conditional AIC: indiCAR | Conditional AIC: glmer with random intercept | Conditional AIC: hlmer with random intercept | Conditional AIC: hlmer with CAR |
|---|---|---|---|---|---|---|---|---|---|
| Data generated in 100 groups | |||||||||
| 1:50 | 2517 | 0.48 | 2.34 | 0.39 | 2.10 | 1748.78 | 1883.67 | 1881.69 | 1881.81 |
| 1:100 | 4688 | 3.30 | 3.40 | 0.38 | 6.75 | 2821.61 | 2899.41 | 2897.73 | 2897.83 |
| 1:500 | 26,519 | 4.23 | 23.71 | 1.92 | 15.52 | 15,865.50 | 15,943.65 | 15,943.58 | 15,943.55 |
| 1:1000 | 52,911 | 188.19 | 61.84 | 3.62 | Not feasible | 32,632.45 | 32,669.39 | 32,669.18 | Not feasible |
| Data generated in 400 groups | |||||||||
| 1:50 | 10,118 | 51.55 | 14.81 | 2.65 | 138.33 | 5935.14 | 6323.31 | 6309.56 | 6309.44 |
| 1:100 | 20,652 | 36.74 | 25.53 | 4.20 | 434.66 | 12,476.61 | 12,893.85 | 12,889.04 | 12,889.13 |
| 1:500 | 103,267 | 85.75 | 73.45 | 22.31 | Not feasible | 60,233.22 | 60,533.49 | 60,533.24 | Not feasible |
| 1:1000 | 205,739 | 113.65 | 236.95 | 46.23 | Not feasible | 120,212.20 | 120,423.70 | 120,423.00 | Not feasible |
Descriptive analysis of neutropenia data
| Variables | Neutropenia n (%) | Total |
|---|---|---|
| Age group (years) | ||
| 20–30 | 408 (9.2) | 4418 |
| 30–39 | 851 (7.7) | 10,988 |
| 40–49 | 1649 (6.2) | 26,395 |
| 50–59 | 2942 (5.6) | 52,281 |
| 60–69 | 3465 (4.8) | 71,446 |
| 70–79 | 2577 (3.7) | 69,236 |
| 80+ | 769 (1.7) | 44,859 |
| Sex | ||
| Female | 6363 (5.0) | 127,519 |
| Male | 6298 (4.1) | 152,104 |
| Year of diagnosis | ||
| 2001 | 1343 (4.9) | 27,356 |
| 2002 | 1411 (5.0) | 28,451 |
| 2003 | 1503 (5.1) | 29,560 |
| 2004 | 1478 (4.8) | 30,970 |
| 2005 | 1596 (5.1) | 31,533 |
| 2006 | 1452 (4.6) | 31,865 |
| 2007 | 1453 (4.5) | 32,603 |
| 2008 | 1405 (4.2) | 33,343 |
| 2009 | 1020 (3.0) | 33,942 |
| ARIA | ||
| Major cities | 9199 (4.9) | 189,322 |
| Inner regional Australia | 2638 (3.9) | 67,086 |
| Outer regional Australia | 774 (3.6) | 21,664 |
| Remote or very remote Australia | 50 (3.2) | 1551 |
| Cancer type | ||
| Breast cancer | 2059 (5.3) | 38,620 |
| Lung cancer | 1401 (6.2) | 22,744 |
| Colon and rectum cancer | 1011 (3.0) | 34,018 |
| Haematological malignancy | 5134 (25.0) | 20,518 |
| Other cancer | 3056 (1.9) | 163,723 |
| No. of major comorbidities | ||
| 0 | 6072 (3.7) | 163,645 |
| 1 | 2228 (4.9) | 45,817 |
| 2 | 2315 (6.7) | 34,670 |
| 3 | 976 (5.7) | 17,264 |
| 4+ | 1070 (5.9) | 18,227 |
| SEIFA | ||
| Most disadvantaged | 1388 (4.6) | 30,302 |
| 2 | 1750 (4.1) | 42,558 |
| 3 | 3546 (4.5) | 78,006 |
| 4 | 2800 (4.6) | 60,880 |
| Least disadvantaged | 3177 (4.7) | 67,877 |
Comparison of individual covariate adjusted conditional autoregressive model (indiCAR) with the Leroux et al. [7] method based on age-sex adjustments
| Regression coefficients | indiCAR estimate | indiCAR SE | Leroux et al. [7] estimate | Leroux et al. [7] SE |
|---|---|---|---|---|
| Intercept | −2.781 | 0.110 | – | – |
| Age group (years) | ||||
| 20–30 | 0.124 | 0.056 | – | – |
| 30–39 | 0.208 | 0.042 | – | – |
| 40–49 | Ref. | |||
| 50–59 | −0.119 | 0.031 | – | – |
| 60–69 | −0.287 | 0.031 | – | – |
| 70–79 | −0.712 | 0.033 | – | – |
| 80+ | −1.586 | 0.045 | – | – |
| Sex | ||||
| Female | Ref. | |||
| Male | −0.082 | 0.020 | – | – |
| Year of diagnosis | ||||
| 2001 | Ref. | |||
| 2002 | 0.018 | 0.038 | – | – |
| 2003 | 0.083 | 0.038 | – | – |
| 2004 | 0.021 | 0.038 | – | – |
| 2005 | 0.096 | 0.037 | – | – |
| 2006 | 0.036 | 0.038 | – | – |
| 2007 | 0.026 | 0.038 | – | – |
| 2008 | −0.001 | 0.038 | – | – |
| 2009 | −0.315 | 0.042 | – | – |
| ARIA | ||||
| Major cities | Ref. | |||
| Inner regional Australia | −0.023 | 0.047 | – | – |
| Outer regional Australia | −0.147 | 0.068 | – | – |
| Remote/very remote Australia | −0.231 | 0.163 | – | – |
| Cancer type | ||||
| Breast cancer | Ref. | – | – | – |
| Lung cancer | 0.253 | 0.038 | – | – |
| Colon and rectum cancer | −0.434 | 0.040 | – | – |
| Haematological malignancy | 1.572 | 0.029 | – | – |
| Other cancer | −0.942 | 0.031 | – | – |
| No. of major comorbidities | ||||
| 0 | Ref. | – | – | |
| 1 | 0.413 | 0.026 | – | – |
| 2 | 0.670 | 0.026 | – | – |
| 3 | 0.609 | 0.036 | – | – |
| 4+ | 0.605 | 0.035 | – | – |
| SEIFA | ||||
| Most disadvantaged | Ref. | |||
| 2 | −0.083 | 0.044 | −0.075 | 0.042 |
| 3 | −0.071 | 0.041 | −0.068 | 0.038 |
| 4 | −0.125 | 0.047 | −0.121 | 0.044 |
| Least disadvantaged | −0.131 | 0.056 | −0.129 | 0.052 |
| Variance parameter | ||||
| | 0.204 | 0.022 | 0.210 | 0.022 |
| | 0.992 | 0.012 | 0.989 | 0.015 |
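To compare the two sets of estimates in this table on a common footing, one option is an approximate 95% Wald interval, estimate ± 1.96 × SE. The sketch below applies this to the "Least disadvantaged" SEIFA row. It assumes the estimators are approximately normal, which the source does not state, and the helper name `wald_interval` is hypothetical.

```python
def wald_interval(estimate, se, z=1.96):
    """Approximate 95% Wald interval on the coefficient scale (illustrative)."""
    half_width = z * se
    return estimate - half_width, estimate + half_width


# "Least disadvantaged" SEIFA row: (estimate, SE) for each method.
indicar_ci = wald_interval(-0.131, 0.056)
leroux_ci = wald_interval(-0.129, 0.052)

# Do the two intervals share any values?
intervals_overlap = indicar_ci[0] <= leroux_ci[1] and leroux_ci[0] <= indicar_ci[1]
```

Overlap of the two intervals is only an informal check of agreement between the methods, not a formal test, because both fits use the same data.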
Fig. 1 SIR of neutropenia admissions in NSW region following indiCAR
Fig. 2 Estimated spatial random effect across NSW using proposed indiCAR method
Application of indiCAR with age as a continuous predictor
| Regression coefficients | Estimates | SE |
|---|---|---|
| Intercept | −1.493 | 0.047 |
| Age | −0.027 | 0.001 |
| Sex | ||
| Female | Ref. | |
| Male | −0.043 | 0.020 |
| Year of diagnosis | ||
| 2001 | Ref. | |
| 2002 | 0.021 | 0.038 |
| 2003 | 0.083 | 0.038 |
| 2004 | 0.019 | 0.038 |
| 2005 | 0.095 | 0.037 |
| 2006 | 0.038 | 0.038 |
| 2007 | −0.022 | 0.038 |
| 2008 | −0.004 | 0.038 |
| 2009 | −0.315 | 0.042 |
| ARIA | ||
| Major cities | Ref. | |
| Inner regional Australia | −0.006 | 0.022 |
| Outer regional Australia | −0.118 | 0.037 |
| Remote or very remote Australia | −0.192 | 0.142 |
| Cancer type | ||
| Breast cancer | Ref. | |
| Lung cancer | 0.240 | 0.037 |
| Colon and rectum cancer | −0.463 | 0.040 |
| Haematological malignancy | 1.497 | 0.029 |
| Other cancer | −0.986 | 0.031 |
| No. of major comorbidities | ||
| 0 | Ref. | |
| 1 | 0.413 | 0.026 |
| 2 | 0.682 | 0.026 |
| 3 | 0.589 | 0.036 |
| 4+ | 0.594 | 0.035 |
| SEIFA | ||
| Most disadvantaged | Ref. | |
| 2 | −0.089 | 0.043 |
| 3 | −0.078 | 0.038 |
| 4 | −0.134 | 0.045 |
| Least disadvantaged | −0.144 | 0.053 |
| Variance parameter | ||
| | 0.209 | 0.022 |
| | 0.992 | 0.012 |