
A multilevel model for cardiovascular disease prevalence in the US and its application to micro area prevalence estimates.

Peter Congdon.

Abstract

BACKGROUND: Estimates of disease prevalence for small areas are increasingly required for the allocation of health funds according to local need. Both individual level and geographic risk factors are likely to be relevant to explaining prevalence variations, and in turn relevant to the procedure for small area prevalence estimation. Prevalence estimates are of particular importance for major chronic illnesses such as cardiovascular disease.
METHODS: A multilevel prevalence model for cardiovascular outcomes is proposed that incorporates both survey information on patient risk factors and the effects of geographic location. The model is applied to derive micro area prevalence estimates, specifically estimates of cardiovascular disease for Zip Code Tabulation Areas in the USA. The model incorporates prevalence differentials by age, sex, ethnicity and educational attainment from the 2005 Behavioral Risk Factor Surveillance System survey. Influences of geographic context are modelled at both county and state level, with the county effects relating to poverty and urbanity. State level influences are modelled using a random effects approach that allows both for spatial correlation and spatial isolates.
RESULTS: To assess the importance of geographic variables, three types of model are compared: a model with person level variables only; a model with geographic effects that do not interact with person attributes; and a full model, allowing for state level random effects that differ by ethnicity. There is clear evidence that geographic effects improve statistical fit.
CONCLUSION: Geographic variations in disease prevalence partly reflect the demographic composition of area populations. However, prevalence variations may also show distinct geographic 'contextual' effects. The present study demonstrates by formal modelling methods that improved explanation is obtained by allowing for distinct geographic effects (for counties and states) and for interaction between geographic and person variables. Thus an appropriate methodology to estimate prevalence at small area level should include geographic effects as well as person level demographic variables.

Year:  2009        PMID: 19183458      PMCID: PMC2647533          DOI: 10.1186/1476-072X-8-6

Source DB:  PubMed          Journal:  Int J Health Geogr        ISSN: 1476-072X            Impact factor:   3.918


Background

Estimates of prevalence of disease and health behaviours for different areas are increasingly required for the equitable allocation of health funds according to local need and to target interventions. As stressed by Bazos et al [1] community health need assessments are ideally based on locally disaggregated (i.e. small area) health status and disease prevalence information. To estimate prevalence in different small areas, a commonly adopted approach involves synthetic estimation whereby prevalence rates for demographic subgroups of the population are obtained (e.g. from national health surveys) and an indicative rate then obtained based on the demographic composition of each area. Thus prevalence of most health conditions varies considerably with age, and often also by sex and race: so a synthetic estimate may be obtained by using age, sex and race specific prevalence rates. However, synthetic estimates of this kind do not take account of geographic context, exemplified by interactions between demographic risk factors and geographic location, or by independent effects of geographic variables (e.g. area poverty or urbanity-rurality) on prevalence that remain even after taking account of patient level risk factors. By contrast, the multilevel prevalence model for cardiovascular outcomes proposed here as a basis for small area prevalence estimates incorporates the modifying effects of geographic context as well as patient risk factors. In the US, a number of population health surveys are carried out and provide cumulative evidence on CVD trends and epidemiology. Thus the National Health Interview Survey (NHIS) for 2005 estimated the prevalence of cardiovascular disease (CVD) at 68 million among adults aged 18 years and over in the US, which includes coronary heart disease, hypertension, stroke, angina pectoris or heart attack. 
The analysis here is concerned with a positive response to one or more of three questions included in the 2005 Behavioral Risk Factor Surveillance System (BRFSS) survey; these questions encompass the different forms of CVD, namely, had the subject ever been told by a health professional that they had experienced a heart attack, or told they had undergone a stroke, or told they had CHD or angina. The epidemiology of these conditions differs to some degree, for example in terms of male-female differentials in prevalence and incidence [2], in trends through time [3], and in ethnic group differentials. However, for these and related conditions there is evidence for a role of geographic context, in terms of wide geographic disparities by region, state and urbanity [4-8]. In particular, there is evidence of direct effects of area variables after controlling for person level risk factors, and evidence of interactions between place and person variables. For example, Cubbin et al [9] report higher levels of hypertension and diabetes among African American women living in socioeconomically deprived neighborhoods as against African American women from more affluent neighborhoods, after allowing for individual-level socioeconomic status, while Halverson et al [10] report local clustering of excess CVD mortality after controlling for area population composition. As for place-person interactions, Barnett et al [6] and Casper et al [11] report that ethnic disparities in CHD mortality vary by area of residence. The prevalence model and small area prevalence estimates described here are based on around 336,000 survey responses, and on a regression analysis relating CVD status both to individual level risk factors and to county level measures of poverty and urban-rural status. The analysis further adjusts for differentiation at US state level in the impact of ethnicity on prevalence.
Thus adjustment for geographic context is much more comprehensive than is possible using disease status data from the Health Survey for England where only broad regional identifiers are available – an example being the work of Congdon [12] on CHD prevalence. One goal of the analysis here is to develop prevalence estimates for micro areas, namely 32000 ZIP Code Tabulation Areas (ZCTAs) for which certain population tabulations are provided by the US Census Bureau [13]. Inclusion in the prevalence model of patient risk categories such as gender and ethnicity (and interactions between them) therefore requires that such categories are available in these tabulations for micro area populations.

Methods

The regression model for prevalence includes person level attributes (age, gender, ethnicity, education level) that are known to have significant CVD risk gradients. A pronounced gradient in CVD prevalence by age is reported by Neyer et al [14]; thus the MI rate among 18–44 year olds is 0.8%, among 45–64 year olds is 4.8% and among the over 65s is 12.9%. In terms of the main ethnic groups in the US (white non-hispanic, black, hispanic, other), elevated CVD mortality and morbidity for nonwhite groups are reported by Barnett et al [6] and Casper et al [11], though ethnic differentials may to some degree express socioeconomic disadvantage. Certain subgroups, such as black females, have more clearly elevated CVD prevalence [15]. As to education level, Neyer et al [14] report that prevalence of one or more of an MI history or a CHD/angina history decreases with educational attainment: of persons with less than a high school diploma, 9.8% report a history of one or more of the conditions, nearly twice the proportion (5%) among college graduates. Education is interrelated with issues such as linguistic competence and health literacy that affect health status [16], and with health insurance [17].

Methods: Translating Survey Model to Small Area Prevalence Estimates

However, to permit small area (ZCTA) prevalence estimation, inclusion of risk variables (and interactions between them) in the regression model is subject to the constraint that such variables are available both in the BRFSS and in tabulations for ZCTA populations. So an interaction between risk factors requires a matching cross-tabulation in the ZCTA population. Impacts of age group, gender and ethnic group are straightforward to include since they are available as BRFSS variables and in a ZCTA level cross-tabulation of adult populations by gender, ethnicity, and quinquennial age. For particular gender-ethnic-age subgroups, parameters from the survey model (e.g. relative risk for white males aged 65–69) can then be applied to the ZCTA sub-population. For other person level variables (e.g. education, marital status), either primary ZCTA tabulations are available from the 2000 census, or a restricted cross tabulation (e.g. adult population by education, ethnicity and gender in US census tabulation P148), but not tabulations involving cross-hatching against all other risk factors. A small area prevalence adjustment can be applied only for the main effect of such variables, or for a partial interaction. Thus the BRFSS regression models include gender-education effects, and so gradients in CVD relative risk can be applied to ZCTA male and female adult populations subdivided by education level. Gender-education-ethnic interactions are not adopted as the relevant ZCTA cross tabulation often includes very small numbers.

Methods: The Prevalence Model

The regression involves 129 thousand male and 207 thousand female respondents, and is confined to adults aged 18 and over. Separate regressions are carried out for males and females, in view of evidence of gender effect modification over a range of risk variables [18]. The regression also takes account of varying survey weights $w_i$ for different respondents to account for differential response between demographic categories and for different sampling rates in different US states. The detailed derivation of weights is discussed in CDC [19] and is based on the inverse of the sampling fraction in each area stratum and age-by-race-by-gender category. Let $y_i = 1$ if a subject reports a particular CVD symptom, with $y_i = 0$ otherwise, and denote $p_i$ as the probability that a respondent reports a symptom. Then a weighted likelihood [20] over subjects i and gender r (r = 1 for males, 2 for females) is used, giving greater weight to undersampled demographic categories or areas, namely

$$L = \prod_{r=1}^{2} \prod_{i} p_i^{w_i y_i} (1 - p_i)^{w_i (1 - y_i)} \qquad (1)$$

where the inner product runs over the respondents of gender r. To facilitate a relative risk interpretation for parameters a log link is used in the binary regression [21] – see Appendix 1 for model details. In WinBUGS this requires (a) a model regression statement linking the log of an uncapped rate, written here as $\log(p^{*}_i)$, to risk factor covariates and any random effects and (b) a statement selecting the minimum of 1 and $p^{*}_i$ as the actual probability $p_i$ that $y_i = 1$. The occurrence of values $p^{*}_i > 1$ was confined within the first hundred or so MCMC iterations (depending on how close the starting parameter values are to the posterior means), and thereafter convergence was straightforward. Three types of regression model are applied in order to assess geographic effects. The first baseline model (model 1) includes only person level risk variables. It allows first for differential risks of each CVD symptom for black, hispanic and other ethnic groups as against whites as the reference category.
Second, it allows differential risk according to education attainment with categories 1 = never attended, elementary only, or some high school; 2 = high school graduate; 3 = some college or technical school; 4 = college graduate (with level 1 as reference category for statistical estimation). Finally, since age gradients are known to vary by ethnic group, differential risks are assumed specific to combinations of age group (12 levels) and the four ethnic groups; the age bands are 18–24,25–29,30–34,..,70–74, and 75+. The second type of model (model 2) includes geographic effects but without any interaction between area and person attributes (except for gender). Although prevalence is to be estimated for ZCTAs, the ZCTA of residence for BRFSS respondents is not available for confidentiality reasons. However, county and state of residence are provided, and one may model their impact on CVD prevalence. Since there are over 3000 US counties, some counties are sparsely represented in the survey, and so random effects at this level are not adopted. However, county level variables are used as predictors, these being the 2005 percent of population in poverty and a category variable, namely the 9 category rural-urban continuum coding [22] – see Table 1.
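The weighted likelihood and the capped log link described in this section can be illustrated with a short Python sketch. It is not the author's WinBUGS code: the function names and the toy data (three respondents with made-up weights and linear predictors) are assumptions introduced only to show how the survey weights enter the log-likelihood and how the cap at 1 is applied.

```python
import math

def cvd_probability(linear_predictor):
    """Log link with a cap: p = min(1, exp(linear predictor))."""
    return min(1.0, math.exp(linear_predictor))

def weighted_log_likelihood(y, w, linear_predictors):
    """Sum over respondents of w_i * [y_i log p_i + (1 - y_i) log(1 - p_i)]."""
    total = 0.0
    for y_i, w_i, lp_i in zip(y, w, linear_predictors):
        p_i = cvd_probability(lp_i)
        # Keep p_i strictly inside (0, 1) so the logs stay finite when the cap is active.
        p_i = min(max(p_i, 1e-12), 1.0 - 1e-12)
        total += w_i * (y_i * math.log(p_i) + (1 - y_i) * math.log(1.0 - p_i))
    return total

# Hypothetical toy data, not taken from the BRFSS.
y = [1, 0, 0]              # reported CVD status
w = [2.0, 1.0, 0.5]        # survey weights
lp = [-2.49, -2.49, -3.05] # linear predictors on the log scale

log_lik = weighted_log_likelihood(y, w, lp)
print(round(log_lik, 4))
```

A respondent with weight 2 contributes twice as much to the log-likelihood as an otherwise identical respondent with weight 1, which is how undersampled categories gain influence.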
Table 1

Categorisation of Counties by Rural-Urban Continuum*

Code | Description | Metropolitan Type
1 | Counties in metro areas of 1 million population or more | Metropolitan
2 | Counties in metro areas of 250,000 to 1 million | Metropolitan
3 | Counties in metro areas of fewer than 250,000 | Metropolitan
4 | Urban population of 20,000 or more, adjacent to a metropolitan area | Non-metro
5 | Urban population of 20,000 or more, not adjacent to a metropolitan area | Non-metro
6 | Urban population of 2,500 to 19,999, adjacent to a metropolitan area | Non-metro
7 | Urban population of 2,500 to 19,999, not adjacent to a metropolitan area | Non-metro
8 | Completely rural or less than 2,500 urban population, adjacent to a metropolitan area | Non-metro
9 | Completely rural or less than 2,500 urban population, not adjacent to a metropolitan area | Non-metro

* see

Many geographic influences may be unobserved (e.g. various environmental and health behavioral influences) and these are represented in the second and third models by state level random effects. These are modelled using a random effects approach (see Appendix 2) that allows both for spatial correlation between effects for contiguous states and for the presence of spatially isolated states. It is sensible to allow unobserved state influences to be spatially correlated to reflect smoothly varying risk factors in space [23]. However, application of conditional autoregressive spatial schemes [24], with spatial interaction typically based on contiguity of areas, is complicated by the presence of two spatially isolated states (Alaska, Hawaii). A different approach based on Congdon [25] is applied instead, which allows for varying strength in spatial clustering over the mainland states and also encompasses spatial isolates. In model 2, effects of county poverty and urbanity are included together with random effects for the 51 states (the 50 states and the District of Columbia). The third model (model 3) allows for area-person interactions, in that state random effects are taken to be ethnicity specific. Differentiation of area effects by ethnicity reflects epidemiological evidence such as that from Casper et al [11] that CVD mortality and prevalence disparities between ethnic groups vary by place of residence. Let $C_i$ and $S_i$ respectively denote the county and state in which subject i is resident. Let $r_i$ denote a subject's gender, $g_i$ denote their ethnic group, $x_i$ denote their age group, and $e_i$ denote their education level.
Then, for a subject with $r_i = r$, $g_i = g$, $x_i = x$, $e_i = e$, $C_i = C$ and $S_i = S$, the prevalence probability is specified under the full model as

$$\log(p_i) = \alpha_r + \beta_{rg} + \eta_{re} + \gamma_{rxg} + \kappa_r \mathrm{Pov}_C + \delta_{r,U_C} + w_{rSg}$$

where the $\alpha_r$ are gender specific intercepts measuring the overall prevalence level, the $\beta_{rg}$ parameters measure varying prevalence by ethnicity, the $\eta_{re}$ terms measure varying prevalence by education, the $\gamma_{rxg}$ measure ethnic specific age gradients, $\kappa_r$ is the coefficient for county poverty (written here as $\mathrm{Pov}_C$), the $\delta_{r,U_C}$ terms reflect the effect of the category $U_C$ of county C in the rural-urban continuum, and the $w_{rSg}$ terms are state random effects specific for ethnic group. County poverty rates (for all ages in 2005) are expressed as proportions and range from 0.025 to 0.51, and are centred around the average poverty rate.
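As a hedged illustration of how the additive terms of the full model combine on the log scale, the following Python sketch evaluates the prevalence probability for one hypothetical male respondent. The intercept, ethnic, education, poverty and urbanity coefficients are the model 3 posterior means for males reported in Table 3; the age-ethnic term, the centred poverty value and the state effect are placeholder assumptions, since their estimates are not tabulated in the text.

```python
import math

def prevalence_probability(alpha, beta, eta, gamma, kappa, centred_poverty,
                           delta, state_effect):
    """Sum the log-scale terms of the full model and cap the probability at 1."""
    log_rate = (alpha + beta + eta + gamma
                + kappa * centred_poverty + delta + state_effect)
    return min(1.0, math.exp(log_rate))

# Reference profile: white, lowest education level, largest metro category,
# average county poverty; gamma and the state effect are set to 0 (assumption).
p_reference = prevalence_probability(
    alpha=-2.45, beta=0.0, eta=0.0, gamma=0.0,
    kappa=0.46, centred_poverty=0.0, delta=0.0, state_effect=0.0)

# "Other" ethnicity, college graduate, county 5 points above average poverty
# (hypothetical), urban population of 20,000 or more adjacent to a metro area.
p_example = prevalence_probability(
    alpha=-2.45, beta=0.21, eta=-0.55, gamma=0.0,
    kappa=0.46, centred_poverty=0.05, delta=-0.07, state_effect=0.0)

print(round(p_reference, 4), round(p_example, 4))
```

Because every term is additive on the log scale, each coefficient exponentiates to a multiplicative relative risk, which is the interpretation the log link is chosen to support.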

Methods: ZCTA Prevalence Rates

To translate the prevalence model parameters into ZCTA level estimates requires categorisations of the ZCTA populations that match the survey derived individual and geographic risk factors used in the prevalence model. The goal is to obtain ZCTA age-sex-ethnic prevalence rates (and case totals) that reflect not only demographic gradients, but also the impact that the location and socioeconomic character of the ZCTA have on prevalence. Among important socioeconomic influences on disease (including CVD) that are available for ZCTAs in 2000 Census tabulations are education, income, poverty status, and household tenure. Here education is used as a socioeconomic measure of small area populations because of established CVD prevalence gradients by education level [14], and because it is available both as a BRFSS survey question and in ZCTA census tabulations. Education has been used as a measure of socioeconomic status in other area health studies [26]. Essentially the age-sex-ethnic rates obtained from the survey prevalence model (for the reference education group) are adjusted according to a sex-specific education effect that is also estimated in the model. Let $C_j$ and $S_j$ respectively denote the county and state in which ZCTA j is located. Let r denote gender, g denote ethnic group and x denote age group. Then given a particular county $C_j$ and state of residence $S_j$, prevalence rates for ZCTA j specific to age-sex-ethnic group, but unadjusted for that ZCTA's education mix (written here as $p_{\mathrm{ref}}[j, r, x, g]$), are obtained from the full model as

$$p_{\mathrm{ref}}[j, r, x, g] = \exp\left(\alpha_r + \beta_{rg} + \gamma_{rxg} + \kappa_r \mathrm{Pov}_{C_j} + \delta_{r,U_{C_j}} + w_{r S_j g}\right)$$

This is the model for the reference education group (namely, the group with less than high school education). As described in Appendix 1, the β and γ parameters represent ethnic and age-ethnic effects for gender r; the parameters κ and δ represent county poverty and urban-rural effects, and the w parameters are state level random effects.
To take account of the impact on CVD prevalence of education attainment mix, let π[j, r, e] be the 2000 census data relative proportions at education level e in each gender's adult population in ZCTA j. Also let $\exp(\eta_{re})$ be the survey model estimate of CVD relative risk at education level e after controlling for age, ethnicity and geographic effects (county and state effects). The composite relative risk associated with the educational mix in ZCTA j (written here as R[j, r]) can be represented as a weighted total of the relative risks for each education level, namely

$$R[j, r] = \sum_{e} \pi[j, r, e] \exp(\eta_{re})$$

Finally, age-sex-ethnic prevalence rates p[j, r, x, g] in ZCTA j adjusted for its education mix are obtained as

$$p[j, r, x, g] = p_{\mathrm{ref}}[j, r, x, g] \times R[j, r]$$
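The education adjustment for a single ZCTA can be sketched in Python as follows. This is an illustration under stated assumptions rather than the author's procedure: the education coefficients are the model 3 posterior means for females from Table 4, whereas the education shares and the reference-group prevalence are hypothetical values chosen only to make the arithmetic visible.

```python
import math

def composite_education_rr(education_shares, eta):
    """Weighted total of education-specific relative risks exp(eta_e)."""
    return sum(share * math.exp(coef) for share, coef in zip(education_shares, eta))

def adjusted_prevalence(p_reference, education_shares, eta):
    """Scale the reference-education prevalence by the composite relative risk."""
    return min(1.0, p_reference * composite_education_rr(education_shares, eta))

# Female education coefficients, model 3 (Table 4); level 1 is the reference.
eta_female = [0.0, -0.37, -0.42, -0.92]

# Hypothetical ZCTA: shares of adult women at education levels 1 to 4.
shares = [0.2, 0.3, 0.3, 0.2]

# Hypothetical prevalence for one age-ethnic group at the reference education level.
p_ref = 0.10

rr_mix = composite_education_rr(shares, eta_female)
p_adj = adjusted_prevalence(p_ref, shares, eta_female)
print(round(rr_mix, 3), round(p_adj, 4))
```

A ZCTA whose adult population is entirely in the reference education group has a composite relative risk of 1, so its rates are left unchanged; a better educated population scales the rates downwards.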

Results

Estimation of the three models follows the Bayesian method, whereby pre-existing knowledge regarding parameters is expressed in prior densities, and updated or posterior knowledge is obtained by combining the prior densities with the likelihood (1) of the observed data. Estimation uses iterative Markov chain Monte Carlo (MCMC) sampling methods [27], as provided in the WinBUGS program [28]. Goodness of fit is assessed by the Deviance Information Criterion or DIC [29], whereby the average deviance is adjusted to account for model complexity. The DIC is the average deviance plus the complexity, with lower DICs representing better fit. Summaries of parameters (means and 95% intervals) are based on the second halves of two chain runs of 5000 iterations, with dispersed initial values. Convergence was achieved in all models using Brooks-Gelman-Rubin criteria [30]. Table 2 summarises the fit of the models, while Tables 3 and 4 show gender-specific estimates of the parameters {α, β, η, κ, δ} from the three models. The DIC criteria in Table 2 show a gain in introducing geographic contextual variables (model 2 vs model 1), and a clear gain also in making state random effects specific to ethnic groups (model 3 vs model 2).
Table 2

Summary of Model Fit

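Gender | Model | Average Deviance | Complexity | DIC
Males | Model 1 | 64490 | 33 | 64523
Males | Model 2 | 64400 | 73 | 64473
Males | Model 3 | 64182 | 105 | 64287
Females | Model 1 | 91157 | 41 | 91198
Females | Model 2 | 90858 | 86 | 90944
Females | Model 3 | 90567 | 107 | 90674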
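The DIC comparison can be reproduced from the reported average deviance and complexity values. The Python sketch below uses the figures from Table 2; the dictionary layout and variable names are illustrative choices, not part of the original analysis.

```python
# Average deviance and complexity (effective number of parameters) from Table 2.
model_fit = {
    ("Males", "Model 1"): (64490, 33),
    ("Males", "Model 2"): (64400, 73),
    ("Males", "Model 3"): (64182, 105),
    ("Females", "Model 1"): (91157, 41),
    ("Females", "Model 2"): (90858, 86),
    ("Females", "Model 3"): (90567, 107),
}

# DIC = average deviance + complexity; lower values indicate better fit.
dic = {key: deviance + complexity
       for key, (deviance, complexity) in model_fit.items()}

# Pick the lowest-DIC model separately for each gender.
best_model = {}
for (gender, model), value in dic.items():
    if gender not in best_model or value < dic[(gender, best_model[gender])]:
        best_model[gender] = model

for (gender, model), value in sorted(dic.items()):
    print(gender, model, value)
print(best_model)
```

For both genders the drop in average deviance from model 1 to model 3 outweighs the added complexity, so model 3 has the lowest DIC.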
Table 3

Cardiovascular Prevalence Models 1 to 3, Parameter Estimates for Males

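Parameter | Model 1 Mean | Model 1 2.5% | Model 1 97.5% | Model 1 Relative Risk | Model 2 Mean | Model 2 2.5% | Model 2 97.5% | Model 2 Relative Risk | Model 3 Mean | Model 3 2.5% | Model 3 97.5% | Model 3 Relative Risk
α | -2.49 | -2.53 | -2.45 | | -2.51 | -2.72 | -2.43 | | -2.45 | -2.54 | -2.41 |
Ethnic Coefficients (log relative risk)*
β11 | 0.00 | | | 1.00 | 0.00 | | | 1.00 | 0.00 | | | 1.00
β12 | 0.06 | 0.00 | 0.14 | 1.07 | -0.05 | -0.12 | 0.03 | 0.95 | -0.08 | -0.17 | -0.01 | 0.92
β13 | 0.04 | -0.01 | 0.08 | 1.04 | 0.08 | 0.00 | 0.20 | 1.09 | 0.01 | -0.11 | 0.10 | 1.01
β14 | 0.25 | 0.18 | 0.31 | 1.28 | 0.17 | 0.08 | 0.24 | 1.18 | 0.21 | 0.10 | 0.32 | 1.24
Education Coefficients (log relative risk)**
η11 | 0.00 | | | 1.00 | 0.00 | | | 1.00 | 0.00 | | | 1.00
η12 | -0.22 | -0.26 | -0.17 | 0.81 | -0.20 | -0.24 | -0.15 | 0.82 | -0.21 | -0.26 | -0.15 | 0.81
η13 | -0.24 | -0.29 | -0.19 | 0.78 | -0.22 | -0.27 | -0.16 | 0.80 | -0.24 | -0.29 | -0.17 | 0.79
η14 | -0.56 | -0.60 | -0.51 | 0.57 | -0.53 | -0.58 | -0.47 | 0.59 | -0.55 | -0.59 | -0.48 | 0.58
County Effects
κ1 (County poverty)*** | | | | | 0.48 | 0.18 | 0.83 | 1.11 | 0.46 | 0.17 | 0.78 | 1.10
δ1 Parameters (Urbanity)
Metro > 1m | | | | | 0.00 | | | 1.00 | 0.00 | | | 1.00
Metro, 250th-1m | | | | | -0.03 | -0.09 | 0.02 | 0.97 | -0.02 | -0.06 | 0.02 | 0.98
Metro < 250th | | | | | 0.02 | -0.04 | 0.09 | 1.02 | 0.02 | -0.03 | 0.07 | 1.02
Urban > 20th, adj Metro | | | | | -0.08 | -0.15 | -0.01 | 0.92 | -0.07 | -0.14 | -0.01 | 0.93
Urban > 20th, not adj Metro | | | | | -0.12 | -0.23 | -0.01 | 0.88 | -0.11 | -0.20 | 0.00 | 0.89
Urban 2.5–20th, adj Metro | | | | | -0.02 | -0.09 | 0.06 | 0.98 | 0.00 | -0.07 | 0.06 | 1.00
Urban 2.5–20th, not adj Metro | | | | | 0.04 | -0.04 | 0.12 | 1.04 | 0.04 | -0.03 | 0.12 | 1.04
Rural or < 2.5th, adj Metro | | | | | 0.01 | -0.09 | 0.16 | 1.01 | 0.03 | -0.08 | 0.14 | 1.03
Rural or < 2.5th, not adj Metro | | | | | 0.01 | -0.14 | 0.19 | 1.01 | 0.04 | -0.07 | 0.13 | 1.04
State Spatial Effects
λa, Average Spatial Dependence | | | | | 0.57 | 0.27 | 0.81 | | 0.38 | 0.15 | 0.62 |
τw Overall Spatial Variance (Model 2) | | | | | 0.011 | 0.005 | 0.025 | | | | |
φ11 Spatial Variance, Wh (Model 3) | | | | | | | | | 0.010 | 0.005 | 0.017 |
φ22 Spatial Variance, Blk (Model 3) | | | | | | | | | 0.077 | 0.033 | 0.167 |
φ33 Spatial Variance, Hisp (Model 3) | | | | | | | | | 0.041 | 0.019 | 0.078 |
φ44 Spatial Variance, Oth (Model 3) | | | | | | | | | 0.021 | 0.009 | 0.045 |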

* 1 White; 2 Black; 3 Hispanic; 4 Other

** 1 No school, elementary only, or some high school without graduating; 2 High school graduate; 3 Some college; 4 College Graduate

*** Relative Risks for County Poverty Compares Risks at 5th and 95th percentiles of 2005 all age poverty rate

Table 4

Cardiovascular Prevalence Models 1 to 3, Parameter Estimates for Females

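Parameter | Model 1 Mean | Model 1 2.5% | Model 1 97.5% | Model 1 Relative Risk | Model 2 Mean | Model 2 2.5% | Model 2 97.5% | Model 2 Relative Risk | Model 3 Mean | Model 3 2.5% | Model 3 97.5% | Model 3 Relative Risk
α | -2.58 | -2.63 | -2.55 | | -2.67 | -2.72 | -2.61 | | -2.66 | -2.72 | -2.61 |
Ethnic Coefficients (log relative risk)*
β21 | 0.00 | | | 1.00 | 0.00 | | | 1.00 | 0.00 | | | 1.00
β22 | 0.38 | 0.33 | 0.43 | 1.46 | 0.35 | 0.27 | 0.43 | 1.42 | 0.32 | 0.25 | 0.41 | 1.38
β23 | 0.10 | 0.04 | 0.16 | 1.11 | 0.11 | 0.04 | 0.16 | 1.11 | -0.01 | -0.12 | 0.06 | 0.99
β24 | 0.34 | 0.28 | 0.40 | 1.41 | 0.37 | 0.29 | 0.43 | 1.44 | 0.48 | 0.42 | 0.56 | 1.62
Education Coefficients (log relative risk)**
η21 | 0.00 | | | 1.00 | 0.00 | | | 1.00 | 0.00 | | | 1.00
η22 | -0.41 | -0.45 | -0.38 | 0.66 | -0.38 | -0.41 | -0.34 | 0.69 | -0.37 | -0.40 | -0.32 | 0.69
η23 | -0.47 | -0.51 | -0.42 | 0.63 | -0.43 | -0.47 | -0.39 | 0.65 | -0.42 | -0.47 | -0.38 | 0.65
η24 | -0.99 | -1.03 | -0.94 | 0.37 | -0.93 | -0.98 | -0.88 | 0.39 | -0.92 | -0.97 | -0.87 | 0.40
County Effects
κ2 (County poverty)*** | | | | | 0.80 | 0.48 | 1.10 | 1.18 | 0.83 | 0.51 | 1.17 | 1.19
δ2 Parameters (Urbanity)
Metro > 1m | | | | | 0.00 | | | 1.00 | 0.00 | | | 1.00
Metro, 250th-1m | | | | | 0.01 | -0.03 | 0.05 | 1.01 | 0.02 | -0.02 | 0.06 | 1.02
Metro < 250th | | | | | 0.07 | 0.03 | 0.12 | 1.07 | 0.06 | 0.01 | 0.11 | 1.06
Urban > 20th, adj Metro | | | | | 0.07 | -0.01 | 0.12 | 1.07 | 0.08 | 0.01 | 0.16 | 1.09
Urban > 20th, not adj Metro | | | | | -0.03 | -0.14 | 0.09 | 0.97 | 0.01 | -0.06 | 0.10 | 1.01
Urban 2.5–20th, adj Metro | | | | | 0.10 | 0.05 | 0.15 | 1.11 | 0.10 | 0.05 | 0.15 | 1.10
Urban 2.5–20th, not adj Metro | | | | | 0.04 | -0.03 | 0.11 | 1.04 | 0.05 | -0.02 | 0.13 | 1.05
Rural or < 2.5th, adj Metro | | | | | 0.04 | -0.13 | 0.17 | 1.04 | 0.04 | -0.07 | 0.18 | 1.04
Rural or < 2.5th, not adj Metro | | | | | 0.01 | -0.07 | 0.10 | 1.01 | 0.01 | -0.12 | 0.14 | 1.01
State Spatial Effects
λa, Average Spatial Dependence | | | | | 0.59 | 0.24 | 0.86 | | 0.28 | 0.11 | 0.57 |
τw Overall Spatial Variance (Model 2) | | | | | 0.031 | 0.016 | 0.054 | | | | |
φ11 Spatial Variance, Wh (Model 3) | | | | | | | | | 0.023 | 0.011 | 0.049 |
φ22 Spatial Variance, Blk (Model 3) | | | | | | | | | 0.028 | 0.012 | 0.052 |
φ33 Spatial Variance, Hisp (Model 3) | | | | | | | | | 0.064 | 0.033 | 0.117 |
φ44 Spatial Variance, Oth (Model 3) | | | | | | | | | 0.078 | 0.046 | 0.132 |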

* 1 White; 2 Black; 3 Hispanic; 4 Other

** 1 No school, elementary only, or some high school without graduating; 2 High school graduate; 3 Some college; 4 College Graduate

*** Relative Risks for County Poverty Compares Risks at 5th and 95th percentiles of 2005 all age poverty rate
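The relative risk columns in Tables 3 and 4 are exponentiated log relative risk coefficients. The Python sketch below converts three female model 3 posterior means from Table 4. Exponentiating a rounded posterior mean will not always reproduce the tabulated relative risk to the last digit, so only coefficients where the two agree are used. The poverty function follows the table footnote (risk compared between two poverty rates), but the two rates passed to it are hypothetical, because the 5th and 95th percentile values are not given in the text.

```python
import math

def relative_risk(log_relative_risk):
    """Relative risk implied by a log-scale coefficient."""
    return math.exp(log_relative_risk)

def poverty_relative_risk(kappa, low_rate, high_rate):
    """Risk at the higher county poverty rate relative to the lower one."""
    return math.exp(kappa * (high_rate - low_rate))

# Female model 3 posterior means (Table 4).
coefficients = {
    "Black (beta22)": 0.32,
    "Other (beta24)": 0.48,
    "College graduate (eta24)": -0.92,
}
risks = {name: round(relative_risk(value), 2) for name, value in coefficients.items()}
print(risks)

# Hypothetical poverty rates of 7% and 27%; kappa2 = 0.83 is from Table 4.
print(round(poverty_relative_risk(0.83, 0.07, 0.27), 2))
```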


Results: Person Level Attributes

In terms of person-level attributes, it can be seen from Tables 3 and 4 that there is a steeper educational gradient for females than males. In model 3, the relative risk for female college graduates, exp(η24) = 0.40, is under a half that of the first education category, those with limited education (elementary education only or did not graduate from high school). Black females also show excess CVD risk (an excess that remains after controlling for socioeconomic and geographic effects), whereas black males do not. However, both males and females in the other ethnic group have elevated risk. The ethnic specific age gradients for males (γ1) and for females (γ2) under model 3 are shown in Figures 1 and 2. The age gradients are presented in the form $p_{rxg} = \exp(\alpha_r + \beta_{rg} + \gamma_{rxg})$.
Figure 1

Ethnic specific age gradients, males.

Figure 2

Ethnic specific age gradients, females.

These gradients are probabilities of CVD caseness by gender, age and ethnicity at reference levels of education and county urbanity and average county poverty. There are cross-over effects between black and white males, with higher rates for black males up to early old age, but lower rates thereafter. This reflects a wider finding that blacks "experience heart disease and die of heart-related problems at earlier ages than whites" [31]. For black females prevalence rates exceed those among white females except among the very old. Probabilities of CVD by gender, age, ethnicity and education at reference levels of county urbanity and average county poverty are obtained as

$$p_{rxge} = \exp(\alpha_r + \beta_{rg} + \gamma_{rxg} + \eta_{re})$$

The overall age adjusted prevalence $p_{rge}$ for ethnic groups g at education level e may be obtained by using age weights $w_x$ for a standard population (e.g. the European Standard Population), scaled to sum to 1, namely

$$p_{rge} = \sum_{x} w_x \, p_{rxge}$$

Table 5 contains posterior summaries (expressed as percents CVD caseness) of the $p_{rge}$ over the four ethnic groups and four education levels. The widest contrast is among women, exemplified by the rates for white, college-educated women (mean prevalence of 3.0%), as opposed to women of other ethnicity with limited education (mean prevalence of 11.8%). The stronger effect of education on female risk means that the male to female risk ratio is higher for college graduates than those with lesser education.
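Age adjustment of this kind is a weighted average of age-specific rates, which the following Python sketch illustrates. The three age bands, their rates and the standard population weights are hypothetical values chosen for easy arithmetic; they are not the 12 age bands of the model or the European Standard Population weights.

```python
def age_standardised_prevalence(age_specific_rates, standard_weights):
    """Weighted average of age-specific rates using standard population weights."""
    if len(age_specific_rates) != len(standard_weights):
        raise ValueError("rates and weights must have the same length")
    total_weight = sum(standard_weights)
    weighted_sum = sum(w * p for w, p in zip(standard_weights, age_specific_rates))
    # Dividing by the total lets the weights be given as counts or proportions.
    return weighted_sum / total_weight

# Hypothetical age-specific prevalence rates for three broad age bands.
rates = [0.01, 0.05, 0.15]
# Hypothetical standard population counts for the same bands.
weights = [50, 30, 20]

p_adjusted = age_standardised_prevalence(rates, weights)
print(round(p_adjusted, 4))
```

Using one fixed set of weights for every ethnic and education group is what makes the resulting rates comparable across groups with different age structures.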
Table 5

Posterior Mean Cardiovascular Prevalence Rates (Percents) by Gender, Ethnicity, and Education

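Ethnicity | Education | Males Mean | Males 2.5% | Males 97.5% | Females Mean | Females 2.5% | Females 97.5% | Male-Female Risk Ratio
White | No High School | 9.8 | 9.0 | 10.5 | 7.6 | 7.5 | 7.7 | 1.28
White | High Sch Graduate | 8.2 | 7.8 | 8.7 | 5.4 | 5.2 | 5.5 | 1.53
White | Some College | 8.1 | 7.6 | 8.6 | 5.1 | 5.0 | 5.3 | 1.59
White | College Graduate | 5.9 | 5.6 | 6.2 | 3.0 | 3.0 | 3.1 | 1.94
Black | No High School | 8.7 | 8.2 | 9.4 | 10.3 | 9.9 | 11.0 | 0.85
Black | High Sch Graduate | 7.3 | 6.9 | 7.8 | 7.2 | 7.0 | 7.6 | 1.01
Black | Some College | 7.2 | 6.8 | 7.6 | 6.8 | 6.6 | 7.3 | 1.05
Black | College Graduate | 5.2 | 4.9 | 5.6 | 4.1 | 3.8 | 4.4 | 1.28
Hisp | No High School | 8.4 | 7.9 | 9.1 | 6.5 | 6.2 | 6.7 | 1.30
Hisp | High Sch Graduate | 7.1 | 6.6 | 7.6 | 4.5 | 4.3 | 4.8 | 1.55
Hisp | Some College | 7.0 | 6.5 | 7.5 | 4.3 | 4.1 | 4.5 | 1.62
Hisp | College Graduate | 5.1 | 4.7 | 5.5 | 2.6 | 2.5 | 2.7 | 1.97
Other | No High School | 13.0 | 12.3 | 13.8 | 11.8 | 11.5 | 12.1 | 1.10
Other | High Sch Graduate | 10.9 | 10.4 | 11.6 | 8.3 | 8.1 | 8.5 | 1.31
Other | Some College | 10.7 | 10.2 | 11.3 | 7.8 | 7.7 | 8.0 | 1.36
Other | College Graduate | 7.8 | 7.4 | 8.4 | 4.7 | 4.5 | 4.9 | 1.66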

Results: Geographic Variables

Tables 3 and 4 show that the county poverty effect is more pronounced for female than male CVD caseness. Whereas all county poverty effects are significant, many of the coefficients for the county urban-rural category are not significant. Significance of the urban-rural categories differs according to whether model 2 or model 3 is considered, and also differs to some extent by gender. Under model 3, male risks are significantly low in the non-metropolitan category "urban population of 20,000 or more, adjacent to a metropolitan area", while under model 2, significantly lower risk prevails in both categories of "urban population of 20,000 or more". These may be interpreted as categories intermediate between highly metropolitan and highly rural settings, and the lower risks there fit with the view of Ingram & Franco [32] that metropolitan and rural areas tend to have worse health than intermediate area types. However, for females under model 3, counties in smaller metropolitan areas, as well as those with urban populations over 2500 and adjacent to a metropolitan area, have a significantly elevated risk. The absence of clear patterns may be because the association between urban status and health is linked to the uneven distribution of poverty in the US, which tends to be disproportionately concentrated in metropolitan centres as well as in some rural areas [33]. So rural-urban prevalence gradients may be attenuated once poverty levels are controlled for. State level random effects are included in both models 2 and 3 (see Appendix 2). A summary expression of unobserved state level influences applicable across all ethnic groups is obtainable from the additive person and area effects model 2 – see Table 6. These are residual relative risks in the form $\rho_{rs} = \exp(w_{rs})$.
Table 6

Residual State Effects (Relative Risk) Model 2 Significantly high in bold, significantly low in bold and italicised

Regional Division | State | Males Mean | Males 2.5% | Males 97.5% | Females Mean | Females 2.5% | Females 97.5%
East North Central | Illinois | 1.09 | 0.99 | 1.23 | 0.96 | 0.89 | 1.04
East North Central | Indiana | 1.10 | 1.03 | 1.23 | 1.00 | 0.91 | 1.11
East North Central | Michigan | 1.06 | 0.96 | 1.16 | 1.10 | 1.01 | 1.20
East North Central | Ohio | 0.97 | 0.90 | 1.10 | 1.05 | 0.97 | 1.15
East North Central | Wisconsin | 0.98 | 0.88 | 1.06 | 0.82 | 0.70 | 0.94
East South Central | Alabama | 1.11 | 0.99 | 1.26 | 1.04 | 0.93 | 1.16
East South Central | Kentucky | 1.10 | 1.02 | 1.22 | 1.18 | 1.07 | 1.30
East South Central | Mississippi | 1.07 | 0.96 | 1.20 | 1.14 | 1.00 | 1.27
East South Central | Tennessee | 1.06 | 0.97 | 1.19 | 1.16 | 1.07 | 1.28
Middle Atlantic | New Jersey | 0.97 | 0.90 | 1.07 | 0.98 | 0.89 | 1.06
Middle Atlantic | New York | 1.00 | 0.94 | 1.09 | 0.90 | 0.83 | 0.97
Middle Atlantic | Pennsylvania | 1.01 | 0.94 | 1.12 | 0.99 | 0.92 | 1.06
Mountain | Arizona | 1.02 | 0.96 | 1.11 | 0.96 | 0.88 | 1.03
Mountain | Colorado | 0.88 | 0.79 | 0.96 | 0.91 | 0.79 | 0.99
Mountain | Idaho | 0.99 | 0.91 | 1.10 | 0.99 | 0.87 | 1.13
Mountain | Montana | 0.92 | 0.82 | 1.01 | 1.02 | 0.88 | 1.19
Mountain | Nevada | 0.99 | 0.91 | 1.08 | 1.15 | 0.97 | 1.31
Mountain | New Mexico | 0.96 | 0.84 | 1.10 | 0.89 | 0.77 | 1.03
Mountain | Utah | 0.98 | 0.90 | 1.06 | 0.93 | 0.77 | 1.05
Mountain | Wyoming | 0.96 | 0.86 | 1.05 | 0.96 | 0.81 | 1.13
New England | Connecticut | 0.99 | 0.89 | 1.09 | 0.92 | 0.83 | 1.02
New England | Maine | 0.96 | 0.84 | 1.12 | 0.98 | 0.82 | 1.16
New England | Massachusetts | 1.02 | 0.95 | 1.09 | 0.92 | 0.80 | 1.04
New England | New Hampshire | 0.99 | 0.90 | 1.10 | 1.06 | 0.89 | 1.28
New England | Rhode Island | 0.96 | 0.83 | 1.12 | 0.89 | 0.77 | 1.07
New England | Vermont | 1.02 | 0.92 | 1.17 | 0.89 | 0.73 | 1.12
Pacific | California | 0.91 | 0.85 | 0.97 | 1.04 | 0.98 | 1.11
Pacific | Oregon | 0.96 | 0.88 | 1.06 | 0.96 | 0.85 | 1.08
Pacific | Washington | 0.96 | 0.88 | 1.06 | 0.93 | 0.83 | 1.03
Pacific | Alaska | 1.00 | 0.80 | 1.25 | 1.00 | 0.71 | 1.42
Pacific | Hawaii | 0.94 | 0.83 | 1.09 | 0.83 | 0.67 | 0.99
South Atlantic | Delaware | 1.03 | 0.95 | 1.19 | 1.02 | 0.88 | 1.17
South Atlantic | District of Columbia | 1.00 | 0.88 | 1.15 | 0.94 | 0.77 | 1.12
South Atlantic | Florida | 1.06 | 0.99 | 1.17 | 1.21 | 1.13 | 1.30
South Atlantic | Georgia | 1.04 | 0.94 | 1.14 | 0.96 | 0.87 | 1.04
South Atlantic | Maryland | 0.99 | 0.90 | 1.09 | 0.98 | 0.87 | 1.09
South Atlantic | North Carolina | 1.05 | 0.96 | 1.16 | 1.01 | 0.93 | 1.10
South Atlantic | South Carolina | 1.03 | 0.92 | 1.12 | 0.94 | 0.82 | 1.05
South Atlantic | Virginia | 1.10 | 1.02 | 1.21 | 0.99 | 0.91 | 1.10
South Atlantic | West Virginia | 1.06 | 0.97 | 1.21 | 1.40 | 1.26 | 1.55
West North Central | Iowa | 1.00 | 0.91 | 1.08 | 0.95 | 0.83 | 1.06
West North Central | Kansas | 1.02 | 0.93 | 1.11 | 1.01 | 0.87 | 1.15
West North Central | Minnesota | 0.93 | 0.82 | 1.01 | 0.86 | 0.78 | 0.97
West North Central | Missouri | 1.06 | 0.99 | 1.17 | 1.09 | 0.98 | 1.19
West North Central | Nebraska | 0.98 | 0.87 | 1.07 | 0.92 | 0.80 | 1.03
West North Central | North Dakota | 0.96 | 0.85 | 1.08 | 0.95 | 0.82 | 1.08
West North Central | South Dakota | 0.97 | 0.84 | 1.06 | 0.99 | 0.85 | 1.15
West South Central | Arkansas | 1.02 | 0.89 | 1.19 | 1.04 | 0.91 | 1.17
West South Central | Louisiana | 1.10 | 1.01 | 1.22 | 1.02 | 0.93 | 1.16
West South Central | Oklahoma | 1.07 | 0.95 | 1.20 | 1.04 | 0.95 | 1.15
West South Central | Texas | 1.02 | 0.94 | 1.17 | 1.15 | 1.06 | 1.24
Residual State Effects (Relative Risk), Model 2. Significantly high in bold, significantly low in bold and italicised.

over states s, and amount to residual effects after controlling for the age, ethnic and educational composition of populations, and also for county poverty and urbanity. High residual relative risks, namely those significantly exceeding 1 (in the sense that the 95% credible interval is confined to values over 1), tend to occur in the South East and South of the US. For males, elevated unexplained risks are present in Indiana, Kentucky, Louisiana and Virginia, and for females in Kentucky, Mississippi, Tennessee, Texas and West Virginia. Significantly low relative risks, those significantly under 1, occur for males in California and Colorado, and for females in Colorado, Minnesota, New York and Hawaii.

When residual state effects are made ethnic-specific in model 3, there are clear contrasts in variability between ethnic groups (see the spatial variance estimates in Tables 3 and 4). For males, there is greater variability in black and Hispanic unexplained relative risk than for non-Hispanic whites, while for females variability is greatest for Hispanic and other ethnicities.

To summarise the relative risk patterns, and in particular the location of states with two or more ethnic-specific relative risks ρ = exp(w) significantly above 1, the nine Census Bureau Regional Divisions (listed in Table 6) are used to categorise the states (Table 7). There are consistent patterns, with multiple elevated residual effects tending to occur in the South (South Atlantic, East South Central) and East North Central divisions. This pattern shows similarities with that found by studies such as [8], though here the pattern is one that persists after controlling for important person and county risk factors.
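The significance rule used in this discussion can be sketched in a few lines of Python. This is a minimal illustration rather than the author's code: it takes four male rows from Table 6, assumes that the three figures reported per gender are the posterior mean followed by the lower and upper limits of the 95% credible interval, and labels a state as high when the whole interval lies above 1 and low when it lies below 1. The function name `classify_interval` is hypothetical.

```python
# Selected male rows from Table 6: state -> (posterior mean RR, lower limit, upper limit).
# Reading the three columns as (mean, 2.5%, 97.5%) is an assumption.
MALE_RR = {
    "Kentucky": (1.10, 1.02, 1.22),
    "California": (0.91, 0.85, 0.97),
    "Ohio": (0.97, 0.90, 1.10),
    "Alaska": (1.00, 0.80, 1.25),
}


def classify_interval(lower, upper):
    """Label a residual relative risk from its 95% credible interval."""
    if lower > 1.0:
        return "high"  # interval confined to values over 1
    if upper < 1.0:
        return "low"  # interval confined to values under 1
    return "not significant"  # interval includes 1


labels = {state: classify_interval(lo, hi) for state, (_, lo, hi) in MALE_RR.items()}

for state, label in labels.items():
    print(f"{state}: {label}")
```

Because the table reports limits rounded to two decimals, a limit printed as 1.00 cannot be classified reliably from the table alone; the rule would need the unrounded posterior summaries in such cases.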
Table 7

States with Elevated Residual Risk (95% interval exceeding 1) according to Census Division.

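Number of states with elevated residual RR

| Division | Males: Zero | Males: One | Males: Two | Females: Zero | Females: One | Females: Two |
|---|---|---|---|---|---|---|
| East North Central | 1 | 3 | 1 | 0 | 3 | 2 |
| East South Central | 1 | 2 | 1 | 0 | 2 | 2 |
| Middle Atlantic | 3 | 0 | 0 | 3 | 0 | 0 |
| Mountain | 7 | 1 | 0 | 7 | 0 | 1 |
| New England | 6 | 0 | 0 | 4 | 2 | 0 |
| Pacific | 5 | 0 | 0 | 3 | 1 | 1 |
| South Atlantic | 5 | 2 | 2 | 7 | 1 | 1 |
| West North Central | 6 | 1 | 0 | 5 | 2 | 0 |
| West South Central | 3 | 0 | 1 | 3 | 1 | 0 |
| Total | 37 | 9 | 5 | 32 | 12 | 7 |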


Results: ZCTA Prevalence Estimates

As discussed above, the model provides estimates of p[j, r, x, g] for approximately 32 thousand ZCTAs in 51 states. These are gender-ethnic-age prevalence rates adjusted for the education mix of each ZCTA. Summary ZCTA prevalence rates for gender-ethnic combinations may then be obtained by applying standard population age weights w, namely

$$p[j, r, g] = \sum_x w[x] \, p[j, r, x, g]$$

Implications for prevalence levels and prevalence inequalities by state or county can then be assessed by considering relevant subsets of the gender-ethnic rates. Being able to assess small area inequality in health is important in health needs assessment [1]. Thus Table 8 presents female prevalence levels for the three main ethnic groups across the 51 states, obtained by averaging p[j, 2, g] within states. Also shown are within state variances and ranges of the ZCTA prevalences. Prevalence levels and within state variability both tend to be higher in southern states such as Alabama, Kentucky, Louisiana, Mississippi, Texas and West Virginia.
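The age standardisation and the state summaries described in this paragraph can be illustrated with a short Python sketch. All numbers here are hypothetical (three age bands, three ZCTAs), and two choices are assumptions not confirmed by the text: ZCTA rates are averaged within a state without population weighting, and the variance is the population variance. The names `standardised_rate`, `AGE_WEIGHTS` and `ZCTA_AGE_RATES` are illustrative only.

```python
from statistics import mean, pvariance


def standardised_rate(age_rates, age_weights):
    """Weighted average of age-specific rates; weights are normalised to sum to 1."""
    total = sum(age_weights)
    return sum(w * p for w, p in zip(age_weights, age_rates)) / total


# Hypothetical standard-population weights for three age bands.
AGE_WEIGHTS = [0.40, 0.35, 0.25]

# Hypothetical age-specific prevalence rates (%) for one gender-ethnic group in three ZCTAs.
ZCTA_AGE_RATES = {
    "zcta_a": [2.0, 6.0, 15.0],
    "zcta_b": [3.0, 8.0, 20.0],
    "zcta_c": [1.5, 5.0, 12.0],
}

# Summary rate per ZCTA: sum over age bands of weight times age-specific rate.
zcta_rates = {z: standardised_rate(r, AGE_WEIGHTS) for z, r in ZCTA_AGE_RATES.items()}

# State-level summaries in the style of Table 8.
values = list(zcta_rates.values())
state_average = mean(values)
state_variance = pvariance(values)  # population variance (assumed)
state_range = max(values) - min(values)  # maximum minus minimum (assumed)

print(zcta_rates)
print(state_average, state_variance, state_range)
```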
Table 8

ZCTA Cardiovascular Prevalence Estimates for Females: State Averages and Variability by Ethnicity

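| State | White Females: Average | White Females: Variance | White Females: Range | Black Females: Average | Black Females: Variance | Black Females: Range | Hispanic Females: Average | Hispanic Females: Variance | Hispanic Females: Range |
|---|---|---|---|---|---|---|---|---|---|
| Alabama | 8.26 | 1.22 | 7.02 | 9.21 | 1.51 | 7.83 | 7.14 | 0.91 | 6.06 |
| Arizona | 6.70 | 1.30 | 6.08 | 10.41 | 3.14 | 9.48 | 7.55 | 1.65 | 6.86 |
| Arkansas | 8.42 | 0.75 | 5.86 | 10.58 | 1.18 | 7.36 | 7.20 | 0.55 | 5.01 |
| California | 6.06 | 0.79 | 4.62 | 9.91 | 2.12 | 7.54 | 8.59 | 1.59 | 6.54 |
| Colorado | 5.41 | 0.52 | 3.63 | 7.96 | 1.12 | 5.37 | 6.39 | 0.72 | 4.31 |
| Connecticut | 4.98 | 0.27 | 2.70 | 6.73 | 0.49 | 3.65 | 5.04 | 0.27 | 2.73 |
| Delaware | 7.70 | 0.70 | 3.37 | 9.02 | 0.95 | 3.92 | 7.96 | 0.74 | 3.46 |
| District of Columbia | 5.73 | 0.76 | 2.94 | 6.31 | 0.93 | 3.25 | 5.37 | 0.66 | 2.75 |
| Florida | 7.99 | 1.00 | 5.57 | 10.09 | 1.59 | 7.03 | 5.00 | 0.39 | 3.47 |
| Georgia | 7.46 | 1.12 | 5.22 | 9.17 | 1.69 | 6.42 | 5.87 | 0.69 | 4.13 |
| Idaho | 7.32 | 0.44 | 3.88 | 9.89 | 0.80 | 5.23 | 6.94 | 0.39 | 3.65 |
| Illinois | 5.70 | 0.45 | 4.12 | 7.95 | 0.88 | 5.75 | 4.65 | 0.30 | 3.36 |
| Indiana | 6.39 | 0.34 | 3.74 | 8.96 | 0.66 | 5.26 | 5.04 | 0.21 | 2.94 |
| Iowa | 6.06 | 0.18 | 2.72 | 8.92 | 0.39 | 3.99 | 6.82 | 0.23 | 3.07 |
| Kansas | 6.19 | 0.32 | 3.74 | 8.01 | 0.54 | 4.83 | 6.03 | 0.30 | 3.66 |
| Kentucky | 8.86 | 1.39 | 6.43 | 10.14 | 1.82 | 7.37 | 6.68 | 0.79 | 4.85 |
| Louisiana | 7.88 | 0.86 | 5.26 | 11.11 | 1.71 | 7.42 | 8.83 | 1.08 | 5.88 |
| Maine | 6.79 | 0.52 | 3.61 | 8.38 | 0.80 | 4.45 | 3.74 | 0.16 | 1.97 |
| Maryland | 6.31 | 0.81 | 4.94 | 7.76 | 1.22 | 6.09 | 5.41 | 0.59 | 4.22 |
| Massachusetts | 5.22 | 0.36 | 3.19 | 7.51 | 0.74 | 4.61 | 5.34 | 0.38 | 3.29 |
| Michigan | 6.56 | 0.42 | 3.89 | 11.18 | 1.21 | 6.65 | 7.52 | 0.55 | 4.47 |
| Minnesota | 5.32 | 0.26 | 2.99 | 7.64 | 0.53 | 4.30 | 5.49 | 0.27 | 3.11 |
| Mississippi | 9.08 | 1.38 | 6.48 | 11.32 | 2.15 | 8.06 | 10.51 | 1.85 | 7.48 |
| Missouri | 7.85 | 0.80 | 5.35 | 10.06 | 1.31 | 6.84 | 7.58 | 0.74 | 5.14 |
| Montana | 6.08 | 0.33 | 3.43 | 8.10 | 0.58 | 4.53 | 5.08 | 0.23 | 2.83 |
| Nebraska | 5.82 | 0.20 | 3.43 | 7.77 | 0.36 | 4.63 | 5.49 | 0.18 | 3.24 |
| Nevada | 6.82 | 0.43 | 3.35 | 10.83 | 1.09 | 5.32 | 8.57 | 0.68 | 4.20 |
| New Hampshire | 5.63 | 0.24 | 2.99 | 9.35 | 0.67 | 4.96 | 5.79 | 0.25 | 3.08 |
| New Jersey | 6.00 | 0.48 | 3.79 | 8.09 | 0.87 | 5.11 | 4.78 | 0.30 | 3.01 |
| New Mexico | 6.87 | 0.88 | 5.03 | 9.97 | 1.85 | 7.33 | 5.43 | 0.55 | 3.99 |
| New York | 5.66 | 0.41 | 4.08 | 7.93 | 0.81 | 5.73 | 5.88 | 0.45 | 4.26 |
| North Carolina | 7.20 | 0.75 | 4.72 | 8.66 | 1.09 | 5.68 | 7.85 | 0.90 | 5.14 |
| North Dakota | 5.92 | 0.24 | 3.37 | 8.91 | 0.54 | 5.08 | 6.29 | 0.27 | 3.59 |
| Ohio | 7.08 | 0.62 | 4.92 | 10.49 | 1.37 | 7.29 | 6.15 | 0.47 | 4.27 |
| Oklahoma | 7.57 | 0.65 | 4.18 | 10.32 | 1.20 | 5.74 | 6.84 | 0.53 | 3.80 |
| Oregon | 6.23 | 0.34 | 3.38 | 9.33 | 0.77 | 5.07 | 6.44 | 0.37 | 3.47 |
| Pennsylvania | 6.43 | 0.48 | 4.10 | 8.95 | 0.92 | 5.73 | 5.95 | 0.41 | 3.82 |
| Rhode Island | 5.20 | 0.29 | 2.49 | 7.13 | 0.55 | 3.41 | 6.11 | 0.40 | 2.90 |
| South Carolina | 7.47 | 0.89 | 5.14 | 8.92 | 1.26 | 6.13 | 6.74 | 0.72 | 4.62 |
| South Dakota | 6.96 | 0.54 | 4.36 | 8.95 | 0.89 | 5.62 | 5.53 | 0.34 | 3.46 |
| Tennessee | 8.86 | 1.22 | 7.04 | 9.47 | 1.39 | 7.53 | 6.67 | 0.69 | 5.31 |
| Texas | 9.12 | 1.62 | 9.25 | 10.88 | 2.31 | 11.06 | 5.78 | 0.65 | 5.88 |
| Utah | 6.07 | 0.55 | 5.58 | 8.93 | 1.19 | 8.20 | 6.80 | 0.69 | 6.27 |
| Vermont | 5.21 | 0.18 | 2.46 | 7.61 | 0.38 | 3.59 | 5.49 | 0.20 | 2.56 |
| Virginia | 6.75 | 0.97 | 5.14 | 7.79 | 1.28 | 5.93 | 4.49 | 0.43 | 3.42 |
| Washington | 6.28 | 0.69 | 4.27 | 8.52 | 1.26 | 5.81 | 6.11 | 0.65 | 4.16 |
| West Virginia | 10.68 | 1.34 | 6.52 | 13.55 | 2.15 | 8.32 | 7.19 | 0.61 | 4.41 |
| Wisconsin | 4.94 | 0.18 | 3.04 | 7.68 | 0.44 | 4.72 | 4.69 | 0.17 | 2.88 |
| Wyoming | 6.66 | 0.32 | 2.99 | 8.22 | 0.48 | 3.68 | 5.83 | 0.24 | 2.62 |
| Alaska | 7.01 | 0.69 | 4.04 | 8.96 | 1.13 | 5.17 | 6.08 | 0.52 | 3.51 |
| Hawaii | 6.72 | 0.20 | 1.96 | 8.46 | 0.32 | 2.44 | 7.12 | 0.22 | 2.07 |
| US | 6.89 | 2.35 | 11.61 | 9.26 | 2.96 | 13.42 | 6.27 | 2.10 | 11.43 |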

Conclusion

Geographic variations in the prevalence of chronic disease partly reflect the demographic composition of area populations. However, prevalence variations may also show distinct geographic 'contextual' effects that are differentiated between ethnic and other demographic categories. Studies of cardiovascular disease in the US have found major geographic variations that do not seem to be explicable by area demography alone. The present study has demonstrated by formal modelling methods applied to BRFSS data that improved explanation is obtained by allowing for distinct geographic effects (for counties and states) and for interaction between geographic and person variables. There are significant spatial effects (e.g. county poverty effects, state residual effects) after adjusting for CVD gradients over person level variables, namely age, education and ethnicity. This has direct implications for an appropriate methodology to estimate prevalence at small area level, with the focus here being ZIP Code Tabulation Areas. Thus, on the basis of the model estimates in the above analysis, prevalence estimates for a ZCTA need to reflect its region of location (e.g. in a South East state as opposed to a northern or mountain state) and the poverty level of the county containing it.

In methodological terms, this paper is distinct in using a log link multilevel binary regression model that takes account of both person level risk factors and the spatial context for a major chronic disease. The use of a log link allows straightforward inferences on relative risks and potentially allows the incorporation into the model of cumulative prior evidence (e.g. on relative CVD risks over ethnic groups). Statewide contextual effects have been represented by a structured random effect that allows for spatial correlation in unobserved risk factors but also extends to include spatially isolated areas (see Appendix 2). In an extended model (model 3), state random effects are differentiated by ethnic group, reflecting evidence from other sources that ethnic relativities are not constant geographically.

Variations and extensions to the models presented above are possible. One option is to include state or county averages of the person level variables such as ethnicity and education level (e.g. county percent black or county percent college graduates). This has been proposed as a way of measuring contextual effects [34], though there is likely to be a positive correlation with the already included county poverty rate. Another possibility would be a longitudinal analysis over a sequence of successive surveys, which can indicate whether gradients over person level risk factors are changing, or whether geographic variability is changing.

Appendix 1 Formal statement of model

Let $C_i$, $S_i$ and $U_i$ denote the county, state and (county level) rural-urban category of residence for respondent i. Also let $\{x_i, g_i, e_i\}$ denote the age, ethnicity and education level of respondent i. Prevalence models are specific for gender r. Prevalence model 3 (with ethnic-specific state effects) is a binomial regression with a log link, in which Bin(n, p) denotes the binomial density, the parameters {α, β, δ, η, κ} are fixed effects, and the parameters {γ, w} are random. This model is run separately for males and females.

Since the parameters operate on the log relative risk scale, state level relative risks by ethnic group ρ (after controlling for known person and county attributes) may be obtained by exponentiating the state effect, namely

$$\rho_{sg} = \exp(w_{sg})$$

for state s and ethnic group g. Thus excess risk or unduly low risk may reflect geographic variations in prevalence that remain even after the impact of a range of important person and county attributes has been allowed for. Excess risk can be defined in terms of the 95% estimation interval for ρ being confined to values above 1.

The baseline model 1 includes person level risk factors only. The intermediate model (model 2) includes county regression terms and state random effects, but not area-ethnicity interactions. Thus in model 2 the unobserved state effects are assumed to be equal across ethnic groups.

For the unknown fixed effects parameters, namely {α, β, η} in model 1, and {α, β, η, κ, δ} in models 2 and 3, diffuse normal priors with mean zero and variance 1000 are adopted. Corner constraints are used for the β, η and δ parameters for identifiability, with the parameter for one reference category of each factor set to zero. To pool strength across the age profiles of different ethnic groups, a first order random walk prior is used for the G-dimensional vector $\gamma_x = (\gamma_{1x},.., \gamma_{Gx})$, x = 1,.., X, of age effects across G ethnic groups. This has conditional form

$$\gamma_x \mid \gamma_{x-1} \sim N_G(\gamma_{x-1}, \Omega^{-1})$$

where the G × G matrix $\Omega^{-1}$ represents covariation between age mortality profiles of ethnic groups. The precision (inverse covariance) matrices Ω are assigned a Wishart prior with identity scale matrix and G degrees of freedom, namely Ω ~ Wish(I, G).
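The three nested specifications described in this appendix can be sketched as follows. This is a hedged reconstruction from the verbal description, not the author's displayed equations. The gender suffix r is dropped; $y_i$ is an assumed name for the binary response of respondent i; the pairing of symbols with covariates (β with ethnicity, η with education, δ with the rural-urban category, κ with a county poverty coefficient) and the covariate name Pov are assumptions.

```latex
\begin{aligned}
y_i &\sim \mathrm{Bin}(1, p_i) \\
\text{Model 1:}\quad \log p_i &= \alpha + \beta_{g_i} + \eta_{e_i} + \gamma_{g_i x_i} \\
\text{Model 2:}\quad \log p_i &= \alpha + \beta_{g_i} + \eta_{e_i} + \gamma_{g_i x_i}
  + \delta_{U_i} + \kappa \, \mathrm{Pov}_{C_i} + w_{S_i} \\
\text{Model 3:}\quad \log p_i &= \alpha + \beta_{g_i} + \eta_{e_i} + \gamma_{g_i x_i}
  + \delta_{U_i} + \kappa \, \mathrm{Pov}_{C_i} + w_{S_i g_i}
\end{aligned}
```

Under this reading, moving from model 2 to model 3 changes only the last term: the single state effect becomes one effect per state and ethnic group, which is what allows state relative risks to differ by ethnicity.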

Appendix 2 State random effects

The 51 states in the model are the mainland US states (k = 1,.., 49) arranged alphabetically (Alabama to Wyoming, including the District of Columbia), together with Alaska and Hawaii (k = 50, 51). The presence of these two spatially isolated states complicates applications of standard approaches for spatially correlated effects, at least those based on a spatial contiguity matrix. It would still be possible to use a spatial model based on interstate distances, but this means that a spatial decay function in distance has to be specified and its parameters estimated. Here we follow the most common approach to spatial clustering, based on contiguity of areas, with a spatial effect that "should describe the fact that areas close to each other tend to behave similarly" [26].

One option that brings in all 51 states would be to follow the convolution approach of Besag et al [35] and assume there are two effects, one of which follows a conditional autoregressive scheme and applies only to the mainland states (k = 1,.., 49), while the other effect, applying to all 51 states, is unstructured in the sense of not incorporating spatial structure. Thus for states k = 1,.., 49 the total state effect would be

$$h_k + w_k \qquad \text{(A2.1)}$$

where $h_k$ represents spatially unstructured heterogeneity, and $w_k$ represents a conditional autoregressive scheme based on contiguity. The suffix r for gender is omitted for simplicity. Specified conditionally on effects $w_{[-k]}$ in the remaining 48 states, one has for mainland states k = 1,.., 49

$$w_k \mid w_{[-k]} \sim N(W_k, \tau_w / L_k) \qquad \text{(A2.2)}$$

where $\tau_w$ is a variance parameter, $L_k$ is the number of states adjacent to state k, and $W_k$ is the average of $w_m$ over the $L_k$ states m adjacent to state k. For example, $W_1$ (for Alabama) would be an average of the four w effects for the contiguous states, Mississippi, Georgia, Florida and Tennessee. The prior for the $h_k$ would be over all 51 states, rather than the mainland 49 states, and typically specified as

$$h_k \sim N(0, \tau_h)$$

where $\tau_h$ is a variance parameter.
Under this convolution approach, for states 50 and 51 (Alaska and Hawaii) the state effect would consist of $h_k$ only. While this approach is an option when a collection of areas includes spatial isolates, it is not used here. The problems that occur with the model (A2.1) include identifiability, since only the total $h_k + w_k$ is identified by the data, and the heavy (i.e. non-parsimonious) parameterisation.

Leroux et al [36] propose an alternative more parsimonious model that uses a single random effect, with a conditional form

$$w_k \mid w_{[-k]} \sim N\left( \frac{\lambda \sum_{m \sim k} w_m}{1 - \lambda + \lambda L_k}, \; \frac{\tau_w}{1 - \lambda + \lambda L_k} \right) \qquad \text{(A2.4)}$$

where m ~ k denotes states m adjacent to state k. This reduces to a purely spatial model, as in (A2.2), when λ = 1 and to pure heterogeneity (i.e. no spatial clustering) when λ = 0. The λ parameter can be estimated and provides a measure of spatial dependence actually present in the data.

Congdon [25] extends model (A2.4) to allow the spatial dependence parameters to vary by area, and a version of such an approach is used in the CVD prevalence modelling here. This extension allows spatial dependence to vary over sub-regions of the total region or nation being considered, and also allows for spatial outliers, distinct from their neighbours in terms of outcome level such as disease risk. Outliers would have relatively low $\lambda_k$ values, since spatial pooling (towards the neighbourhood average) is contra-indicated by the disparity between an area's risk and that of its neighbours. By contrast, areas surrounded by areas with similar levels of the outcome would have relatively high $\lambda_k$ values, since spatial pooling (towards the neighbourhood average) is supported by the data. The conditional specification now takes the form

$$w_k \mid w_{[-k]} \sim N\left( \frac{\lambda_k \sum_{m \sim k} w_m}{1 - \lambda_k + \lambda_k L_k}, \; \frac{\tau_w}{1 - \lambda_k + \lambda_k L_k} \right) \qquad \text{(A2.5)}$$

This model for spatial effects adapts to spatial outliers by taking $\lambda_k = 0$, so that for the subset of areas which are not connected to other areas one has

$$w_k \sim N(0, \tau_w)$$

This approach extends to a multivariate random effect $w_k = (w_{k1},.., w_{kG})$ for G ethnic groups.
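The behaviour of the single-effect prior at its two limits can be checked numerically. The sketch below assumes the usual form of the Leroux et al conditional prior, with mean λ Σ w_m / (1 − λ + λ L) and variance τ / (1 − λ + λ L), where the sum runs over the L adjacent areas. The neighbour effects are hypothetical values standing in for Alabama's four contiguous states, and the function name `leroux_conditional` is illustrative.

```python
def leroux_conditional(neighbour_effects, lam, tau):
    """Conditional mean and variance of one area's effect given its neighbours.

    neighbour_effects: effects w_m of the adjacent areas (empty for an isolate)
    lam: spatial dependence parameter in [0, 1]
    tau: variance parameter
    """
    n_adjacent = len(neighbour_effects)
    denom = 1.0 - lam + lam * n_adjacent
    if denom <= 0.0:
        # Only happens for an area with no neighbours when lam = 1.
        raise ValueError("undefined: an area with no neighbours needs lam < 1")
    return lam * sum(neighbour_effects) / denom, tau / denom


# Hypothetical effects for Mississippi, Georgia, Florida and Tennessee.
alabama_neighbours = [0.10, 0.05, 0.15, 0.10]

# lam = 1: purely spatial prior, mean is the neighbourhood average, variance tau / L.
print(leroux_conditional(alabama_neighbours, 1.0, 0.2))

# lam = 0: pure heterogeneity, mean zero and variance tau regardless of neighbours.
print(leroux_conditional(alabama_neighbours, 0.0, 0.2))

# An isolate (no neighbours) with lam = 0 also reduces to mean zero, variance tau.
print(leroux_conditional([], 0.0, 0.2))
```

Intermediate values of λ shrink the conditional mean towards zero and inflate the variance relative to the purely spatial case, which is how an area-specific λ can downweight pooling for a spatial outlier.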
With a uniform value of λ over areas the conditional mean under the Leroux et al [36] model is

$$E(w_k \mid w_{[-k]}) = \frac{\lambda \sum_{m \sim k} w_m}{1 - \lambda + \lambda L_k}$$

with inverse dispersion matrix (precision matrix)

$$(1 - \lambda + \lambda L_k) \, \Phi$$

where Φ is a symmetric matrix of dimension G. Allowing for varying spatial dependence over the entire region/nation being considered, one has

$$w_k \mid w_{[-k]} \sim N_G\left( \frac{\lambda_k \sum_{m \sim k} w_m}{1 - \lambda_k + \lambda_k L_k}, \; \left[ (1 - \lambda_k + \lambda_k L_k) \, \Phi \right]^{-1} \right) \qquad \text{(A2.9)}$$

In the application of (A2.5) in model 2, it is assumed that $1/\tau_w$ is gamma distributed a priori, namely $1/\tau_w$ ~ Ga(1, 0.001). This is approximately equivalent to assuming $1/\tau_w$ to be uniformly distributed while constrained to positive values. Such a choice of gamma prior for $1/\tau_w$ follows the strategy of studies such as [35] and [37]. In the application of (A2.9) in model 3, it is assumed that Φ is Wishart distributed, with G degrees of freedom and an identity scale matrix.

The varying spatial parameters in models 2 and 3 are assumed to be beta distributed

$$\lambda_k \sim \mathrm{Beta}(\nu_1, \nu_2)$$

where ν1 and ν2 are positive quantities equal to or exceeding 0.5. Thus ν1 = ν2 = 1 corresponds to a diffuse uniform prior $\lambda_k$ ~ U(0, 1), while more informative priors are obtained for ν1 > 1 and ν2 > 1. A baseline is provided when ν1 = ν2 = 0.5, equivalent to a prior sample size of 1. It is assumed that ν1 ~ U(0.5, 5) and ν2 ~ U(0.5, 5). The average value of the $\lambda_k$ over all contiguous states can be obtained as

$$\bar{\lambda} = \frac{1}{49} \sum_{k=1}^{49} \lambda_k$$
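The remarks on the beta prior can be checked with the standard moments of a Beta(ν1, ν2) distribution. The sketch below is illustrative only: it reads the "prior sample size" as ν1 + ν2, which matches the statement that ν1 = ν2 = 0.5 corresponds to a prior sample size of 1, and the function names and the λ values passed to `average_lambda` are hypothetical.

```python
def beta_prior_summary(nu1, nu2):
    """Mean, variance and implied prior sample size of a Beta(nu1, nu2) prior."""
    total = nu1 + nu2
    return {
        "mean": nu1 / total,
        "variance": nu1 * nu2 / (total ** 2 * (total + 1.0)),
        "prior_sample_size": total,  # interpretation assumed: nu1 + nu2
    }


def average_lambda(lambdas):
    """Simple average of area-specific spatial dependence parameters."""
    return sum(lambdas) / len(lambdas)


# nu1 = nu2 = 1 is the uniform prior on (0, 1): mean 0.5, variance 1/12.
print(beta_prior_summary(1.0, 1.0))

# nu1 = nu2 = 0.5 is the baseline case with prior sample size 1.
print(beta_prior_summary(0.5, 0.5))

# Hypothetical posterior means of lambda for four contiguous states.
print(average_lambda([0.9, 0.7, 0.2, 0.6]))
```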

Competing interests

The author declares that they have no competing interests.