Mary G Krauland, Robert J Frankeny, Josh Lewis, LuAnn Brink, Eric G Hulsey, Mark S Roberts, Karen A Hacker.
Abstract
Importance: Evaluating the association of social determinants of health with chronic diseases at the population level requires access to individual-level factors associated with disease, which are rarely available for large populations. Synthetic populations are a possible alternative for this purpose.
Objective: To construct and validate a synthetic population that statistically mimics the characteristics and spatial disease distribution of a real population, using real and synthetic data.
Design, Setting, and Participants: This population-based decision analytical model used data for Allegheny County, Pennsylvania, collected from January 2015 to December 2016, to build a semisynthetic population based on the synthetic population used by the modeling and simulation platform FRED (A Framework for Reconstructing Epidemiological Dynamics). Disease status was assigned to this population using health insurer claims data from the 3 major insurance providers in the county or from the National Health and Nutrition Examination Survey. Biological, social, and other variables were also obtained from the National Health Interview Survey, Allegheny County, and public databases. Data analysis was performed from November 2016 to February 2020.
Exposures: Risk of cardiovascular disease (CVD) death.
Main Outcomes and Measures: Difference between expected and observed CVD death risk. A validated risk equation was used to estimate CVD death risk.
Year: 2020 PMID: 32870312 PMCID: PMC7489828 DOI: 10.1001/jamanetworkopen.2020.15047
Source DB: PubMed Journal: JAMA Netw Open ISSN: 2574-3805
Figure 1. Flowchart for Population Creation and Estimation of Cardiovascular Disease (CVD) Deaths
FRED indicates A Framework for Reconstructing Epidemiological Dynamics; NHANES, National Health and Nutrition Examination Survey.
Characteristics of Synthetic Population
| Characteristic | Synthetic population, No. (%) | 2010 Census population, No. (%) |
|---|---|---|
| Total population, No. | 1 188 112 | 1 223 348 |
| Age range, y | ||
| 0-4 | 63 016 (5.3) | 63 614 (5.2) |
| 5-9 | 67 435 (5.7) | 61 167 (5.0) |
| 10-14 | 71 306 (6.0) | 70 954 (5.8) |
| 15-19 | 72 376 (6.1) | 79 518 (6.5) |
| 20-24 | 75 512 (6.4) | 89 304 (7.3) |
| 25-29 | 77 178 (6.5) | 84 411 (6.9) |
| 30-34 | 68 702 (5.8) | 72 177 (5.9) |
| 35-39 | 71 183 (6.0) | 68 507 (5.6) |
| 40-44 | 76 275 (6.4) | 77 070 (6.3) |
| 45-49 | 89 920 (7.6) | 88 081 (7.2) |
| 50-54 | 96 528 (8.1) | 97 868 (8.0) |
| 55-59 | 88 166 (7.4) | 89 304 (7.3) |
| 60-64 | 70 504 (5.9) | 74 624 (6.1) |
| 65-69 | 53 524 (4.5) | 52 604 (4.3) |
| 70-74 | 41 771 (3.5) | 42 817 (3.5) |
| 75-79 | 41 026 (3.5) | 37 924 (3.1) |
| 80-84 | 33 856 (2.9) | 35 477 (2.9) |
| ≥85 | 29 822 (2.5) | 35 477 (2.9) |
| % Male sex in total population | 47.6 | 47.8 |
| Population count per census tract | 120-14 222 | NA |
| Range per census tract (mean [SD]) | ||
| Age, y | 23.9-64.4 (42.7 [6.8]) | 40.5 |
| % Male sex by census tract | 21.9-73.9 (45.4 [6.1]) | 49.2 |
| % Smoker | 9.2-21.1 (15.5 [5.0]) | 15.9 |
| % With diabetes | 1.4-69.2 (15.3 [14.8]) | 10 |
| Range per census tract (mean [SD]) {total population mean} | ||
| Total cholesterol level, mg/dL | 167.1-194.7 (183.4 [4.4]) {189.36} | 189 |
| Blood pressure, mm Hg | 112.7-133.2 (121.0 [3.9]) {122.6} | 122 |
| HDL-C level, mg/dL | 48.8-57.7 (52.2 [1.4]) {52.17} | 40-59 |
Abbreviations: FRED, A Framework for Reconstructing Epidemiological Dynamics; HDL-C, high-density lipoprotein cholesterol; NA, not applicable; NHANES, National Health and Nutrition Examination Survey.
SI conversion factors: To convert HDL-C and total cholesterol to millimoles per liter, multiply by 0.0259.
Values are mean ranges per census tract, except those values that are specific to the total population. Population means are given for reference when applicable and are specific to Allegheny County, Pennsylvania, when such data were available.
Data from US Census Bureau.[30]
Data from synthetic population used by FRED.
Group quarters included 38 305 individuals and were not used in this study.
Percentage of male individuals in the US; data from Howden et al.[31]
Data from Data USA.[32]
Mean of values assigned by matching to an individual selected from NHANES based on demographic characteristics.
Percentage of smokers in Pennsylvania; data from Centers for Disease Control and Prevention.[33]
Data from claims collected for the Data Across Sectors for Health project combined with values assigned using the NHANES for population outside of that covered by claims data.
Data from Allegheny County Health Department.[34]
Data from Miller.[35]
Data from Wright et al.[36]
Data from Lab Tests Online.[37]
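The SI conversion factor quoted in the table footnotes can be applied as in the following minimal sketch. The function and constant names are illustrative and do not come from the article; the input values are the total population means for the synthetic population listed in the table.

```python
# Conversion factor quoted in the table footnote (mg/dL -> mmol/L).
# The constant and function names are illustrative, not from the article.
MGDL_TO_MMOLL = 0.0259


def cholesterol_mgdl_to_mmoll(value_mgdl: float) -> float:
    """Convert a total cholesterol or HDL-C value from mg/dL to mmol/L."""
    return value_mgdl * MGDL_TO_MMOLL


# Total population means for the synthetic population, from the table.
total_cholesterol = cholesterol_mgdl_to_mmoll(189.36)
hdl_c = cholesterol_mgdl_to_mmoll(52.17)

print(f"Total cholesterol: {total_cholesterol:.2f} mmol/L")  # 4.90 mmol/L
print(f"HDL-C: {hdl_c:.2f} mmol/L")  # 1.35 mmol/L
```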
Figure 2. Expected, Observed, and Difference in per Census Tract 4-Year Cardiovascular Disease (CVD) Death Risk per 100 000 in Allegheny County, Pennsylvania
In panels A and B, the scale refers to number of deaths per 100 000 persons per census tract. Gray areas were excluded from the analysis because their populations were too small for deidentified values to be provided, so most data were missing. In panels C to E, green indicates lower observed than expected CVD death risk, and red indicates greater observed than expected CVD death risk. The scales show excess deaths over expected deaths per 100 000 persons per census tract; negative values indicate greater risk, and positive values indicate less risk.
Correlation Between Expected and Observed Cardiovascular Disease Death Risk and Social and Biological Variables
| Determinant | Univariate regression: slope (95% CI) | P value | Income model: slope (95% CI) | P value | Income and education model: slope (95% CI) | P value | Biological model: slope (95% CI) | P value | Combined social and biological model: slope (95% CI) | P value |
|---|---|---|---|---|---|---|---|---|---|---|
| Percentage of households receiving food stamps | −0.02 (−0.02 to −0.01) | <.001 | −0.02 (−0.02 to −0.01) | <.001 | −0.01 (−0.02 to −0.01) | <.001 | NA | NA | −0.02 (−0.02 to −0.01) | <.001 |
| Percentage without jobs | −0.02 (−0.03 to −0.016) | <.001 | 0.02 (0.01 to 0.03) | .0016 | 0.02 (0.01 to 0.03) | <.001 | NA | NA | 0.02 (0.012 to 0.035) | <.001 |
| Median household income | 1.0 × 10⁻⁷ (8.0 × 10⁻⁸ to 1.20 × 10⁻⁷) | <.001 | 5.2 × 10⁻⁸ (2.5 × 10⁻⁸ to 7.9 × 10⁻⁷) | <.001 | 3.6 × 10⁻⁸ (7.1 × 10⁻⁹ to 6.5 × 10⁻⁸) | <.0145 | NA | NA | −2.6 × 10⁻¹⁰ (−3.1 × 10⁻⁸ to 3.1 × 10⁻⁸) | .99 |
| Percentage with high school educational level | 0.05 (0.04 to 0.06) | <.001 | NA | NA | 0.03 (0.01 to 0.04) | <.001 | NA | NA | 0.02 (0.01 to 0.03) | .006 |
| Percentage with diabetes | NA | NA | NA | NA | NA | NA | −0.03 (−0.05 to −0.01) | .002 | −0.01 (−0.03 to 0.01) | .36 |
| Percentage with hyperlipidemia | NA | NA | NA | NA | NA | NA | 0.04 (0.03 to 0.06) | <.001 | 0.02 (0.004 to 0.037) | .02 |
| Percentage with hypertension | NA | NA | NA | NA | NA | NA | −0.03 (−0.05 to −0.01) | .007 | −0.02 (−0.044 to −0.001) | .04 |
| High level of particulate matter in the census tract | −0.005 (−0.008 to −0.003) | <.001 | NA | NA | NA | NA | NA | NA | NA | NA |
| Percentage with house in poor condition | −0.05 (−0.06 to −0.03) | <.001 | NA | NA | NA | NA | NA | NA | NA | NA |
| Percentage of households renting | −0.007 (−0.01 to −0.005) | <.001 | NA | NA | NA | NA | NA | NA | NA | NA |
| Percentage with college education | 0.012 (0.010 to 0.015) | <.001 | NA | NA | NA | NA | NA | NA | NA | NA |
| Individuals living below federal poverty level | −2 × 10⁻³ (−5 × 10⁻³ to 1 × 10⁻³) | .22 | NA | NA | NA | NA | NA | NA | NA | NA |
| Percentage without insurance | −0.04 (−0.05 to −0.03) | <.001 | NA | NA | NA | NA | NA | NA | NA | NA |
| Percentage of vacant houses in the census tract | −0.04 (−0.05 to −0.03) | <.001 | NA | NA | NA | NA | NA | NA | NA | NA |
| Neighborhood walk score | −1 × 10⁻⁵ (−4 × 10⁻⁵ to 1 × 10⁻⁵) | .28 | NA | NA | NA | NA | NA | NA | NA | NA |
| Percentage of households living below federal poverty level | −0.016 (−0.020 to −0.013) | <.001 | NA | NA | NA | NA | NA | NA | NA | NA |
| Poverty index | −8 × 10⁻⁴ (−9 × 10⁻⁴ to −6 × 10⁻⁴) | <.001 | NA | NA | NA | NA | NA | NA | NA | NA |
| No. of supermarkets per census tract | 5 × 10⁻⁴ (−2 × 10⁻⁴ to 1.3 × 10⁻³) | .16 | NA | NA | NA | NA | NA | NA | NA | NA |
| No. of fast food restaurants per census tract | 1 × 10⁻⁴ (2 × 10⁻⁵ to 2.5 × 10⁻⁴) | .02 | NA | NA | NA | NA | NA | NA | NA | NA |
| Percentage of households with no vehicle | −0.02 (−0.022 to −0.013) | <.001 | NA | NA | NA | NA | NA | NA | NA | NA |
| Homicide counts per census tract | −3 × 10⁻⁴ (−5 × 10⁻⁴ to −2 × 10⁻⁴) | <.001 | NA | NA | NA | NA | NA | NA | NA | NA |
| Median age | −8 × 10⁻⁵ (−2 × 10⁻⁴ to −9 × 10⁻⁷) | .05 | NA | NA | NA | NA | NA | NA | NA | NA |
| Percentage with obesity | −0.02 (−0.021 to −0.01) | <.001 | NA | NA | NA | NA | NA | NA | NA | NA |
Abbreviation: NA, not applicable.
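To make the univariate column of the table above concrete, the sketch below fits a least-squares slope of a per-tract outcome on one determinant and attaches an approximate 95% CI. It is an illustration only, not the authors' analysis code: the function name and the data are hypothetical, the outcome is assumed to be the per-tract difference between expected and observed CVD death risk, and the interval uses a normal approximation, which is reasonable only when the number of census tracts is large.

```python
from math import sqrt
from statistics import NormalDist, mean


def univariate_slope_ci(x, y, level=0.95):
    """Least-squares slope of y on x with an approximate confidence interval.

    Uses a normal quantile rather than a t quantile (an assumption made here
    for simplicity; the two are close when there are many census tracts).
    """
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and standard error of the slope.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se = sqrt(sse / (n - 2) / sxx)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return slope, (slope - z * se, slope + z * se)


# Hypothetical per-tract values, for illustration only.
pct_without_insurance = [2.0, 5.0, 8.0, 11.0, 14.0, 17.0]
risk_difference = [0.010, 0.004, -0.001, -0.004, -0.012, -0.015]

slope, (ci_low, ci_high) = univariate_slope_ci(pct_without_insurance, risk_difference)
print(f"slope = {slope:.4f} (95% CI, {ci_low:.4f} to {ci_high:.4f})")
```

A negative slope in this toy data means that the difference falls as the determinant rises, which is the pattern the table reports for most measures of disadvantage.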
Figure 3. Plot of Difference in 4-Year Cardiovascular Disease (CVD) Death Risk by Mean Rank of Social Determinants per Census Tract
Census tracts (circles) were ranked for level of each social determinant, and mean ranks were calculated for each census tract to get an overall ranking. Twenty census tracts with greatest excess in CVD death risk are plotted in orange. Social determinants associated with CVD death risk include percentage of high school graduates, percentage with a college degree, percentage with food stamps, percentage living below federal poverty level, percentage with obesity, median income, percentage of households with no vehicle, percentage without jobs, percentage without insurance, and percentage of vacant houses in the census tract. Regression line is in orange.
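The mean-rank construction described in the Figure 3 legend can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and data are hypothetical, ties receive the average of their ranks, and each determinant is assumed to be oriented in the same direction before ranking (for example, higher values meaning greater disadvantage), which the legend does not specify.

```python
def rank_values(values):
    """Return 1-based ranks, smallest value first; ties get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def mean_ranks(determinants):
    """Average the per-determinant ranks for each census tract.

    determinants maps a determinant name to a list of per-tract values,
    with every list in the same tract order.
    """
    per_determinant = [rank_values(v) for v in determinants.values()]
    n_tracts = len(per_determinant[0])
    return [
        sum(r[i] for r in per_determinant) / len(per_determinant)
        for i in range(n_tracts)
    ]


# Hypothetical values for 4 census tracts, for illustration only.
tract_determinants = {
    "pct_below_poverty": [5.0, 20.0, 12.0, 30.0],
    "pct_without_insurance": [3.0, 9.0, 10.0, 15.0],
}
overall = mean_ranks(tract_determinants)
print(overall)  # [1.0, 2.5, 2.5, 4.0]
```

The resulting mean rank per tract is the quantity that would go on the horizontal axis, with the per-tract difference in CVD death risk on the vertical axis.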