Paramita Dasgupta, Susanna M Cramb, Joanne F Aitken, Gavin Turrell, Peter D Baade.
Abstract
BACKGROUND: Multilevel and spatial models are being increasingly used to obtain substantive information on area-level inequalities in cancer survival. Multilevel models assume independent geographical areas, whereas spatial models explicitly incorporate geographical correlation, often via a conditional autoregressive prior. However, the relative merits of these methods for large population-based studies have not been explored. Using a case-study approach, we report on the implications of using multilevel and spatial survival models to study geographical inequalities in all-cause survival.
Year: 2014 PMID: 25280499 PMCID: PMC4197252 DOI: 10.1186/1476-072X-13-36
Source DB: PubMed Journal: Int J Health Geogr ISSN: 1476-072X Impact factor: 3.918
A comparison of multilevel discrete-time and Bayesian spatial survival models used in this case study
| | Multilevel | Bayesian spatial |
|---|---|---|
| Software | MLwiN 2.26¹ | WinBUGS version 1.4 |
| Cost | Once-off purchase | Free |
| Available interfaces | Stata | Stata, SAS, R |
| Initial data structure | | |
| | Yes | No |
| | Yes | No |
| Geographical Structure | None | Preserves adjacent areas |
| Explanatory variables | Unit Record Individual and higher-level | Aggregated at individual-level and higher-level |
| Modelled Outcome | Individual deaths | Aggregated deaths |
| Random Effects | Yes | Yes |
| Prior distributions | Gamma, Uniform | Any including Gamma, Uniform, CAR |
| Default Priors | Yes | requires user specification of priors; greater flexibility |
| Estimation Method: MCMC | Yes | Yes |
| Number of MCMC chains | Single only | Single (multiple allowed also) |
| Level of random effects | Individual and higher-level | Higher-level |
| Within-area correlation | Yes | No |
| Between-area correlation | No | Yes |
| Adjacency matrix | No | Yes |
| Computational efficiency (5-year data)² | 5-7 days | 5-7 days |
| Ease of Implementation | Requires prior data expansion | Requires specification of model including prior distributions |
| Diagnostic Tests/ convergence plots | Yes | Yes |
| Questions answered: | | |
| | Yes | No |
| | Yes | No |
| | No | Yes |
| | No | Yes |
| Cross level interactions | Yes | No |
| Allow unit record individual-level inferences | Yes | No |
| Parameter estimates | Odds ratio (OR) | Relative risk (RR) |
CAR: Conditional autoregressive prior; MCMC: Markov chain Monte Carlo.
1. Can also be run with MLwiN/WinBUGS interface.
2. On an Intel® Xeon® 2 Duo processor 64 bit CPU with 2.39 GHz processor speed and 24.0 GB RAM.
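The "Adjacency matrix" and "Prior distributions" rows can be made concrete with a small sketch. It assumes an intrinsic CAR prior with binary (0/1) adjacency weights, under which the spatial random effect of an area, conditional on its neighbours, is normal with mean equal to the neighbours' average and variance equal to the spatial variance divided by the number of neighbours. The four-area map, the effect values and the function name `car_conditional` are hypothetical and are not taken from the study.

```python
# Hypothetical adjacency list for four areas (not the study's SLAs):
# area 0 borders areas 1 and 2, area 3 borders only area 2, and so on.
neighbours = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}


def car_conditional(area, effects, neighbours, spatial_var):
    """Conditional mean and variance of one area's spatial effect,
    assuming an intrinsic CAR prior with 0/1 adjacency weights."""
    adj = neighbours[area]
    mean = sum(effects[j] for j in adj) / len(adj)  # average of neighbours
    var = spatial_var / len(adj)                    # more neighbours, tighter prior
    return mean, var


# Illustrative spatial effects on the log relative risk scale.
effects = [0.10, -0.05, 0.02, 0.20]
mean, var = car_conditional(2, effects, neighbours, spatial_var=0.018)
```

Under this assumption an area with many neighbours is pulled more strongly towards the local average, which is the mechanism behind the "Between-area correlation" row.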
Assumptions, underlying concepts and interpretation of area-level effects: multilevel discrete-time and Bayesian spatial survival models
| | Multilevel discrete-time | Bayesian spatial |
|---|---|---|
| | | |
| Data structure | Data is hierarchically structured with individuals nested within geographical areas. | Data is assumed to be spatially structured at the aggregated level. |
| Individuals | Individuals (level 1) living in the same area (level 2) are assumed to be correlated | No individual-level data is retained |
| Hazard | Constant hazards over each follow-up interval. | Constant hazards over each follow-up interval. |
| Area-level effects | Area-level random effect is constant and normally distributed. Area-level random effects for different geographical areas are independent of each other; hence any spatial associations between neighboring areas are ignored. | Area-level random effect is not assumed to be constant; rather it depends on the spatial relationship between areas with the assumption that the mean outcome between two neighboring areas is more similar than that between two more distant areas. |
| Modelled outcome | These are essentially logistic regression models with the outcome variable being a binary indicator that gives the probability of a death occurring in a follow-up interval given that no death has occurred in the previous year. | A Poisson distribution is assumed for the modeled outcome (i.e. observed mortality count) in each aggregated stratum. However the usual assumption for a Poisson model, that the variance equals the mean, is relaxed since additional random effect parameters are included. |
| | | |
| Baseline hazard | The baseline hazard is modelled on the logistic scale as a function of the follow-up interval. | The baseline hazard is not specifically defined as this is a semi-parametric model. |
| Censoring | The censoring information is included. A censored individual has a sequence of zeros for each year, whereas a person who dies has a value of one for the year of death and zero for previous years. | The censoring information is included. A censored individual has a sequence of zeros for each year, whereas a person who dies has a value of one for the year of death and zero for previous years. However, deaths are then aggregated across each stratum. |
| Equivalence to Cox model | Multilevel logistic regression with expanded dataset is a good approximation to the Cox proportional hazard model [ | The Poisson survival model is a good approximation to the Cox proportional hazards model [ |
| Spatial smoothing | No spatial smoothing is incorporated | Models borrow information from adjacent regions (termed ‘spatial smoothing’) to help overcome data sparseness, allow shrinkage towards overall risk, produce more robust estimates and account for between-area spatial associations [ |
| Spatial structure | An individual’s probability of death is statistically dependent on their area of residence at diagnosis. Spatial proximity to other areas is not considered. | The spatial structure is encoded into the prior distribution specified for the random effects and requires the definition of relationships between spatially close SLAs [ |
| Levels of variance | The total variance is partitioned at different levels: between individuals living in the same area (individual-level) and that between two different areas (area-level). | The overall variance cannot be decomposed over different analytical levels. However the 2 random effects at the area-level allow the variance to be partitioned into spatially structured and unstructured variance. |
| | | |
| Number | One type | Two types |
| Nature | Area-level random effects disregard any spatial correlation that may be present in the data and ignore the specific effect of location. | The spatially correlated area-level random effect assumes similarity between neighboring areas and quantifies the residual variation that is associated with geographical location. The uncorrelated or unstructured area-level random effect assumes independence between areas and allows for area-level variation that is not spatially correlated. |
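The "Modelled outcome" and "Censoring" rows describe two data layouts: one row per person per follow-up year for the multilevel model, and death counts per stratum for the spatial model. The sketch below illustrates both under simplifying assumptions: follow-up is measured in whole years, strata are defined only by area and follow-up interval, and every row counts as one full person-year. The function names and the three patients are hypothetical.

```python
from collections import defaultdict


def expand_person_period(patient_id, area, years_followed, died, max_years=5):
    """One row per year at risk: (id, area, interval, death indicator)."""
    rows = []
    for interval in range(1, min(years_followed, max_years) + 1):
        # A death is flagged only in the year it occurs; censored
        # patients contribute a sequence of zeros.
        event = int(died and interval == years_followed)
        rows.append((patient_id, area, interval, event))
    return rows


def aggregate(rows):
    """Collapse person-period rows to (deaths, person-years) per stratum."""
    counts = defaultdict(lambda: [0, 0])
    for _pid, area, interval, event in rows:
        counts[(area, interval)][0] += event
        counts[(area, interval)][1] += 1  # each row treated as one person-year
    return {key: tuple(value) for key, value in counts.items()}


# Hypothetical patients: (id, area, years followed, died).
patients = [("a", "X", 3, True), ("b", "X", 2, False), ("c", "Y", 5, False)]
rows = [r for p in patients for r in expand_person_period(*p)]
strata = aggregate(rows)
```

The expanded `rows` correspond to the input of a discrete-time logistic model, while `strata` holds the counts and exposure that a Poisson model would use. In practice the strata would also be split by the covariates of interest.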
Cohort description and five year all-cause survival estimates for colorectal cancer patients, Queensland, 1997-2007
| Subgroup | N (%) | % Deaths | All-cause survival [95% CI]¹ | p |
|---|---|---|---|---|
| All patients in cohort | 22,727 | 41.1 | 58.1 [57, 58] | |
| | | | | |
| Major city | 13,155 (57.9) | 39.6 | 59.6 [59, 60] | |
| Inner regional | 5,139 (22.6) | 41.4 | 57.8 [56, 59] | |
| Outer regional | 3,485 (15.3) | 45.1 | 54.1 [52, 56] | |
| Remote² | 948 (4.2) | 46.2 | 53.1 [50, 56] | |
| | | | | |
| Quintile 5 (least disadvantaged) | 3,193 (14.1) | 36.4 | 62.8 [61, 65] | |
| Quintile 4 | 5,101 (22.4) | 38.9 | 60.2 [59, 62] | |
| Quintile 3 | 6,075 (26.7) | 41.0 | 58.2 [57, 59] | |
| Quintile 2 | 5,335 (23.5) | 44.5 | 54.6 [53, 56] | |
| Quintile 1 (most disadvantaged) | 3,023 (13.3) | 43.8 | 55.4 [54, 57] | |
| | | | | |
| 20 to 49 | 1,873 (8.2) | 32.0 | 67.4 [65, 70] | |
| 50 to 59 | 3,938 (17.3) | 32.8 | 66.7 [65, 68] | |
| 60 to 69 | 6,578 (28.9) | 37.1 | 62.1 [61, 63] | |
| 70 to 79 | 7,718 (34.1) | 45.6 | 53.5 [52, 55] | |
| 80 to 84 | 2,620 (11.5) | 56.7 | 41.9 [40, 44] | |
| | | | | |
| Male | 12,879 (56.7) | 42.9 | 56.2 [55, 57] | |
| Female | 9,848 (43.3) | 38.8 | 60.6 [60, 62] | |
| | | | | |
| Non Indigenous | 20,868 (91.8) | 43.1 | 56.1 [55, 57] | |
| Indigenous | 181 (0.8) | 45.3 | 53.7 [45, 61] | |
| Not stated | 1,678 (7.4) | 16.7 | 82.9 [81, 85] | |
| | | | | |
| Married | 14,532 (63.9) | 39.0 | 60.1 [59, 61] | |
| Never married/single | 1,541 (6.8) | 46.5 | 52.6 [50, 55] | |
| Widowed | 3,951 (17.4) | 48.2 | 51.1 [49, 52] | |
| Divorced | 1,822 (8) | 44.4 | 54.7 [52, 57] | |
| Separated | 454 (2) | 31.9 | 67.3 [63, 71] | |
| Not stated | 427 (1.9) | 20.6 | 79.3 [75, 83] | |
| | | | | |
| Professional | 4,783 (21.1) | 48.6 | 50.6 [49, 52] | |
| White collar | 2,665 (11.7) | 52.6 | 46.7 [44, 49] | |
| Blue collar | 3,789 (16.7) | 59.5 | 39.4 [38, 41] | |
| Not in labor force | 7,529 (33.1) | 33.5 | 65.9 [65, 67] | |
| Not stated/Inadequately described | 3,961 (17.4) | 21.0 | 78.2 [77, 79] | |
| | | | | |
| Australia | 17,367 (76.4) | 41.9 | 57.2 [57, 58] | |
| Other English-speaking | 4,580 (20.2) | 39.2 | 60.2 [59, 62] | |
| Non-English-speaking | 780 (3.4) | 34.0 | 64.2 [61, 68] | |
| | | | | |
| Proximal (R) colon | 7,874 (34.6) | 41.8 | 57.5 [56, 59] | |
| Distal (L) colon | 5,865 (25.9) | 39.5 | 59.6 [58, 61] | |
| Colon NOS | 1,299 (5.7) | 54.0 | 45.3 [43, 48] | |
| Rectal | 7,689 (33.8) | 39.4 | 59.8 [59, 60] | |
| | | | | |
| Stage A | 4,332 (19.1) | 18.3 | 81.1 [80, 83] | |
| Stage B | 6,323 (27.8) | 28.9 | 70.3 [69, 71] | |
| Stage C | 5,846 (25.7) | 47.9 | 50.8 [50, 52] | |
| Stage D | 2,576 (11.3) | 84.7 | 13.9 [12, 15] | |
| Unknown stage | 3,650 (16.1) | 47.4 | 51.9 [50, 54] | |
| | | | | |
| Well differentiated | 1,107 (4.9) | 31.9 | 67.3 [65, 70] | |
| Moderately differentiated | 13,953 (61.4) | 36.7 | 62.4 [62, 63] | |
| Poorly differentiated | 4,206 (18.5) | 52.9 | 46.2 [45, 48] | |
| Not stated | 3,461 (15.2) | 47.2 | 52.2 [50, 54] | |
| | | | | |
| Clear | 16,664 (73.4) | 36.3 | 62.9 [62, 64] | |
| Positive | 530 (2.3) | 39.8 | 59.7 [55, 61] | |
| Unknown | 5,533 (24.3) | 55.7 | 43.6 [43, 45] | |
CI = confidence interval; p-values calculated using log-rank test for equality of survivor functions restricting follow-up to five years for each patient.
1. From Kaplan-Meier survival analysis.
2. Includes remote and very remote categories.
3. Other English-speaking: those born in New Zealand, United Kingdom, Ireland, or North America; non-English-speaking: those not born in Australia, New Zealand, United Kingdom, Ireland or North America.
4. Colorectal sites defined as proximal colon (ICDO3: C180 to C184), distal colon (ICDO3: C185-C187), unspecified colon (ICDO3: C188-C189) and rectal (ICDO3: C19-C20, C218).
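Footnote 1 attributes the survival estimates to Kaplan-Meier analysis. As a reminder of how that estimator works, here is a minimal sketch of the product-limit calculation. It assumes that deaths occurring at a given time are processed before censorings at the same time, and it uses a made-up six-patient example rather than any data from the cohort. The function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate at each distinct death time.

    times  : follow-up duration for each patient
    events : 1 if the patient died at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censorings
        if deaths:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Illustrative follow-up in years; 1 = died, 0 = censored.
curve = kaplan_meier([1, 2, 2, 3, 4, 5], [1, 1, 0, 1, 0, 0])
```

Censored patients never lower the curve directly; they only shrink the risk set used at later death times.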
Estimated area-level random effects from multilevel discrete-time survival models
| Model | Description¹ | Area-level variance (95% CrI)² | p³ | MOR (95% CrI)⁴ |
|---|---|---|---|---|
| 1 | | 0.025 (0.014, 0.039) | <0.001 | 1.16 (1.13, 1.21) |
| 2 | | 0.011 (0.006, 0.018) | 0.04 | 1.10 (1.08, 1.14) |
| 3 | | 0.007 (0.003, 0.014) | 0.10 | 1.08 (1.05, 1.12) |
| 4 | | 0.006 (0.001, 0.014) | 0.08 | 1.08 (1.03, 1.12) |
| 5 | | 0.005 (0.001, 0.012) | 0.12 | 1.07 (1.03, 1.11) |
CrI: Credible Interval.
1. Models 2–5 adjusted for all individual-level covariates; Model 4 also adjusted for area disadvantage; Model 5 also adjusted for area remoteness and area disadvantage.
2. The residual area-level variance from the MCMC simulations for multilevel analysis.
3. From Wald χ² test.
4. Median odds ratio; refer to text and Appendix 1 for details.
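The median odds ratio (MOR) translates the area-level variance onto the odds ratio scale. Appendix 1 is not reproduced here, so the sketch below assumes the formula commonly used in the multilevel literature, MOR = exp(sqrt(2 × variance) × z), where z is the 75th percentile of the standard normal distribution (about 0.6745). The function name `median_odds_ratio` is hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def median_odds_ratio(area_variance):
    """MOR from an area-level variance on the logit scale,
    assuming the standard formula exp(sqrt(2 * variance) * z_0.75)."""
    z75 = NormalDist().inv_cdf(0.75)  # about 0.6745
    return exp(sqrt(2 * area_variance) * z75)


# Point estimates of the variance for Models 1 and 5 in the table above.
mor_model1 = median_odds_ratio(0.025)
mor_model5 = median_odds_ratio(0.005)
```

Applied to the tabulated variance point estimates, this gives values that round to 1.16 and 1.07, in line with the MOR column. The credible intervals in the table come from the MCMC samples and cannot be recovered from the point estimates alone.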
Estimated area-level random effects from Bayesian spatial survival models
| Model | Description¹ | Spatial variance² (95% CrI) | Unstructured variance³ (95% CrI) | Total variance⁴ (95% CrI) | Spatial fraction⁵ (95% CrI) |
|---|---|---|---|---|---|
| 7 | | 0.018 (0.016, 0.23) | 0.006 (0.004, 0.012) | 0.024 (0.012, 0.28) | 0.70 (0.51, 0.82) |
| 8 | | 0.010 (0.005, 0.018) | 0.005 (0.002, 0.011) | 0.015 (0.009, 0.024) | 0.64 (0.38, 0.86) |
| 9 | | 0.007 (0.003, 0.015) | 0.005 (0.002, 0.010) | 0.012 (0.006, 0.021) | 0.56 (0.26, 0.82) |
| 10 | | 0.006 (0.003, 0.13) | 0.005 (0.002, 0.011) | 0.011 (0.007, 0.019) | 0.58 (0.29, 0.81) |
| 11 | | 0.006 (0.002, 0.013) | 0.005 (0.003, 0.009) | 0.011 (0.006, 0.019) | 0.55 (0.35, 0.73) |
| 12 | | - | 0.009 (0.003, 0.013) | | |
| 13 | | 0.009 (0.003, 0.014) | - | | |
CrI: Credible Interval.
1. Models 8 to 13 adjusted for all individual-level covariates; Model 9 also adjusted for area remoteness; Model 10 also adjusted for area disadvantage; Model 11 also adjusted for area remoteness and area disadvantage. Models 12 and 13 are adjusted for all covariates in Model 11 but exclude the spatial and unstructured random effects respectively.
2. Spatial variance (spat_σ ).
3. Unstructured variance (spat_σ ).
4. Total variance (sum of the spatial and unstructured variances).
5. Refer to Appendix 2 for details.
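Appendix 2 is not reproduced here, so the following sketch assumes the simplest reading of the spatial fraction: the share of the total area-level variance that is spatially structured, that is, spatial variance divided by the sum of spatial and unstructured variances. The function name `spatial_fraction` is hypothetical.

```python
def spatial_fraction(spatial_var, unstructured_var):
    """Share of the total area-level variance that is spatially
    structured (an assumed definition; see the note on Appendix 2)."""
    return spatial_var / (spatial_var + unstructured_var)


# Point estimates for Model 11 from the table above.
frac_model11 = spatial_fraction(0.006, 0.005)
```

For Model 11 this gives roughly 0.55, matching the tabulated value. For other models the ratio of point estimates need not equal the reported fraction (Model 7 gives 0.75 against a reported 0.70), which would be expected if the published figure summarises the ratio across MCMC samples rather than the ratio of the summaries.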
Covariate fixed effects from multilevel discrete-time and Bayesian spatial survival models
| Variable | Multilevel model: OR (95% CrI)¹ | Spatial model: RR (95% CrI)² |
|---|---|---|
| | | |
| Major city | Reference | Reference |
| Inner regional | 0.95 (0.88, 1.02) | 0.98 (0.89, 1.07) |
| Outer regional | 1.09 (1.01, 1.18) | 1.06 (1.01, 1.19) |
| Remote | 1.15 (1.02, 1.28) | 1.09 (1.01, 1.21) |
| | | |
| Most advantaged | Reference | Reference |
| Advantaged | 1.14 (1.03, 1.23) | 1.08 (1.01, 1.17) |
| Middle | 1.18 (1.08, 1.29) | 1.15 (1.06, 1.25) |
| Disadvantaged | 1.22 (1.11, 1.34) | 1.17 (1.07, 1.28) |
| Most disadvantaged | 1.23 (1.10, 1.36) | 1.18 (1.07, 1.32) |
| | | |
| 20 to 49 | 0.24 (0.21, 0.27) | 0.29 (0.26, 0.32) |
| 50 to 59 | 0.29 (0.26, 0.32) | 0.35 (0.32, 0.38) |
| 60 to 69 | 0.42 (0.39, 0.46) | 0.48 (0.45, 0.52) |
| 70 to 79 | 0.68 (0.63, 0.73) | 0.73 (0.69, 0.77) |
| 80 to 85 | Reference | Reference |
| | | |
| Male | Reference | Reference |
| Female | 1.08 (1.02, 1.14) | 1.07 (1.02, 1.13) |
| | | |
| Married | Reference | Reference |
| Never married/single | 1.33 (1.21, 1.46) | 1.31 (1.20, 1.40) |
| Widowed | 1.11 (1.03, 1.19) | 1.09 (1.02, 1.15) |
| Divorced | 1.18 (1.08, 1.29) | 1.16 (1.08, 1.25) |
| Separated | 0.94 (0.77, 1.13) | 0.95 (0.81, 1.15) |
| Not stated | 1.32 (1.02, 1.68) | 1.36 (1.08, 1.69) |
| | | |
| Professional | Reference | Reference |
| White collar | 1.11 (1.02, 1.20) | 1.07 (1.01, 1.15) |
| Blue collar | 1.38 (1.29, 1.49) | 1.29 (1.22, 1.37) |
| Not in labor force | 0.46 (0.43, 0.50) | 0.52 (0.49, 0.56) |
| Not stated/Inadequately described | 0.35 (0.32, 0.39) | 0.39 (0.35, 0.42) |
| | | |
| Australia | Reference | Reference |
| Other English-speaking | 0.96 (0.90, 1.02) | 0.95 (0.92, 1.00) |
| Non-English-speaking | 0.88 (0.76, 0.97) | 0.87 (0.78, 0.98) |
| | | |
| Non Indigenous | Reference | Reference |
| Indigenous | 1.16 (0.89, 1.49) | 1.12 (0.97, 1.38) |
| Not stated | 0.45 (0.39, 0.51) | 0.48 (0.42, 0.54) |
| | | |
| Proximal (R) colon | 1.02 (1.01, 1.08) | 1.06 (1.01, 1.11) |
| Distal (L) colon | 1.03 (0.96, 1.10) | 1.07 (1.01, 1.13) |
| Colon NOS | 1.04 (1.01, 1.16) | 1.08 (1.00, 1.18) |
| Rectal | Reference | Reference |
| | | |
| Stage I | Reference | Reference |
| Stage II | 1.61 (1.47, 1.77) | 1.57 (1.44, 1.71) |
| Stage III | 3.17 (2.91, 3.45) | 2.85 (2.64, 3.10) |
| Stage IV | 11.41 (10.30, 12.57) | 7.88 (7.23, 8.59) |
| Unknown stage | 2.09 (1.86, 2.34) | 2.10 (1.91, 2.32) |
| | | |
| Well differentiated | Reference | Reference |
| Moderately differentiated | 1.14 (1.01, 1.29) | 1.18 (1.06, 1.32) |
| Poorly differentiated | 1.64 (1.44, 1.87) | 1.65 (1.47, 1.85) |
| Not stated differentiation | 1.25 (1.09, 1.43) | 1.35 (1.20, 1.52) |
| | | |
| Clear | Reference | Reference |
| Positive | 1.42 (1.19, 1.66) | 1.37 (1.19, 1.57) |
| Unknown margin | 1.84 (1.68, 2.01) | 1.73 (1.61, 1.85) |
CrI: Credible interval; OR: Odds ratio; RR: Relative risk.
1. Estimates derived from best fitting fully adjusted Model 5 as described in text.
2. Estimates derived from best fitting fully adjusted Model 11 as described in text.