Seth E Spielman, David C Folch.
Abstract
The American Community Survey (ACS) is the largest survey of US households and is the principal source for neighborhood-scale information about the US population and economy. The ACS is used to allocate billions in federal spending and is a critical input to social scientific research in the US. However, estimates from the ACS can be highly unreliable. For example, in over 72% of census tracts, the estimated number of children under 5 in poverty has a margin of error greater than the estimate. Uncertainty of this magnitude complicates the use of social data in policy making, research, and governance. This article presents a heuristic spatial optimization algorithm that is capable of reducing the margins of error in survey data via the creation of new composite geographies, a process called regionalization. Regionalization is a complex combinatorial problem. Here, rather than focusing on the technical aspects of regionalization, we demonstrate how to use a purpose-built open-source regionalization algorithm to process survey data in order to reduce the margins of error to a user-specified threshold.
Year: 2015 PMID: 25723176 PMCID: PMC4344219 DOI: 10.1371/journal.pone.0115626
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. An illustration of the “combining geographic areas” strategy.
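The intuition behind combining geographic areas can be sketched with a few lines of Python. This is not the article's regionalization algorithm; it only illustrates why a merged area tends to have a lower coefficient of variation (CV) than its parts. The tract counts and margins of error below are hypothetical, and the combined margin of error uses the common root-sum-of-squares approximation for a sum of estimates, which ignores covariance between areas.

```python
import math

Z_90 = 1.645  # ACS margins of error are published at the 90% confidence level


def cv(estimate, moe, z=Z_90):
    """Coefficient of variation: standard error divided by the estimate."""
    return (moe / z) / estimate


def combine(areas):
    """Merge (estimate, moe) pairs for counts into one composite area.

    The estimate is the sum of the counts; the MOE is approximated as the
    square root of the sum of squared MOEs (assumes independent errors).
    """
    total = sum(est for est, _ in areas)
    moe = math.sqrt(sum(m ** 2 for _, m in areas))
    return total, moe


# Hypothetical counts of children in poverty for three adjacent tracts
tracts = [(40, 45), (55, 50), (35, 38)]

tract_cvs = [cv(est, moe) for est, moe in tracts]
region_est, region_moe = combine(tracts)
region_cv = cv(region_est, region_moe)

print("tract CVs:", [round(c, 3) for c in tract_cvs])
print("region estimate, MOE, CV:", region_est, round(region_moe, 1), round(region_cv, 3))
```

The estimate grows linearly with the number of merged areas while the approximate margin of error grows more slowly, so the composite CV falls below that of every constituent tract in this example.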
Estimates and Uncertainty for Selected Census Tracts, Poverty Scenario for Chicago MSA (Cook County).
| | 222600 | 222700 | 222800 | 222900 |
|---|---|---|---|---|
| Housing cost as share of income (owners) | | | | |
| Estimate | 28.9% | 28.4% | 63.3% | 46.5% |
| MOE | 15.7% | 10.2% | 114.5% | 19.7% |
| SE | 9.6% | 6.2% | 69.6% | 12.0% |
| CV | 0.331 | 0.218 | 1.099 | 0.257 |
| Housing cost as share of income (renters) | | | | |
| Estimate | 32% | 38.8% | 39.2% | 48.3% |
| MOE | 9.1% | 15.4% | 15.6% | 23.1% |
| SE | 5.5% | 9.4% | 9.5% | 14.0% |
| CV | 0.173 | 0.242 | 0.243 | 0.291 |
| Children above poverty | | | | |
| Estimate | 89.7% | 53% | 33.7% | 59.2% |
| MOE | 19.5% | 26.8% | 29.9% | 18.2% |
| SE | 11.8% | 16.3% | 18.2% | 11.1% |
| CV | 0.132 | 0.308 | 0.54 | 0.187 |
| Total above poverty | | | | |
| Estimate | 78.7% | 74.7% | 61.9% | 64.6% |
| MOE | 10.2% | 7.5% | 21% | 14.8% |
| SE | 6.2% | 4.5% | 12.8% | 9.0% |
| CV | 0.078 | 0.061 | 0.207 | 0.139 |
| Percent employed | | | | |
| Estimate | 94.9% | 78.5% | 88.9% | 87.7% |
| MOE | 3.6% | 8.5% | 6.2% | 31.0% |
| SE | 2.2% | 5.2% | 3.8% | 18.8% |
| CV | 0.023 | 0.066 | 0.042 | 0.215 |
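The relationship between the rows of the table above can be sketched as follows. The sketch assumes the standard ACS convention that margins of error are reported at the 90% confidence level, so the standard error is the MOE divided by 1.645 and the CV is the standard error divided by the estimate. The function names are illustrative and not taken from the article's software.

```python
Z_90 = 1.645  # z-value for the 90% confidence level used for ACS MOEs


def standard_error(moe, z=Z_90):
    """Convert a published margin of error into a standard error."""
    return moe / z


def coefficient_of_variation(estimate, moe, z=Z_90):
    """Standard error expressed as a fraction of the estimate."""
    return standard_error(moe, z) / estimate


# Owner housing cost as share of income, tract 222700 (values from the table)
estimate, moe = 28.4, 10.2

se = standard_error(moe)
cv = coefficient_of_variation(estimate, moe)
print(f"SE = {se:.1f}%, CV = {cv:.3f}")
```

For this tract the computed values agree with the tabulated SE of 6.2% and CV of 0.218. Other cells can differ in the last digit because the published estimates and MOEs are themselves rounded.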
Attribute and Scenario Summary.
| Attribute | General | Poverty | Transportation | Housing |
|---|---|---|---|---|
| Average number of rooms | X | | | X |
| Average household income | X | | | |
| Persons per housing unit | X | | | |
| Percent occupied | X | | | X |
| Percent married | X | | | |
| Percent bachelor’s degree or higher | X | | | |
| Percent same housing unit last year | X | | | |
| Percent white | X | | | |
| Percent black | X | | | |
| Percent Hispanic | X | | | |
| Percent under 18 | X | | | |
| Percent 65 and older | X | | | |
| Housing cost as share of income (owners) | | X | | |
| Housing cost as share of income (renters) | | X | | |
| Percent of children above poverty | | X | | |
| Percent of population above poverty | | X | | |
| Percent employed | | X | | |
| Vehicles per person | | | X | |
| Average commute time | | | X | |
| Percent drove alone | | | X | |
| Percent transit | | | X | |
| Average home value (owners) | | | | X |
| Average rent | | | | X |
| Percent owner occupied | | | | X |
| Percent renter occupied | | | | X |
| Percent single family housing unit | | | | X |
Metropolitan Statistical Areas Studied.
| MSA | Census Division | 2010 Population | 2000–2010 Population Change |
|---|---|---|---|
| Atlanta GA | South Atlantic | 5,268,860 | 24.0% |
| Austin TX | West South Central | 1,716,289 | 37.3% |
| Birmingham AL | East South Central | 1,128,047 | 7.2% |
| Boston MA | New England | 4,552,402 | 3.7% |
| Buffalo NY | Middle Atlantic | 1,135,509 | −3.0% |
| Chicago IL | East North Central | 9,461,105 | 4.0% |
| Cleveland OH | East North Central | 2,077,240 | −3.3% |
| Hartford CT | New England | 1,212,381 | 5.6% |
| Kansas City MO | West North Central | 2,035,334 | 10.9% |
| Los Angeles CA | Pacific | 12,828,837 | 3.7% |
| Minneapolis MN | West North Central | 3,279,833 | 10.5% |
| Nashville TN | East South Central | 1,589,934 | 21.2% |
| Oklahoma City OK | West South Central | 1,252,987 | 14.4% |
| Phoenix AZ | Mountain | 4,192,887 | 28.9% |
| Pittsburgh PA | Middle Atlantic | 2,356,285 | −3.1% |
| Portland OR | Pacific | 2,226,009 | 15.5% |
| Salt Lake City UT | Mountain | 1,124,197 | 16.0% |
| Washington DC | South Atlantic | 5,582,170 | 16.4% |
Note: Population change measured using 2009 MSA definitions.
Fig 2. Maps of regionalization input and outputs for Washington, DC.
Fig 3. Region and tract boundaries in central Washington, DC.
Background image source: Stamen Design/OpenStreetMap.
Fig 4. Chicago diagnostic plots.
Number of Tracts Meeting Uncertainty Threshold (CV = 0.12), Poverty Scenario for Chicago MSA.
| Number of Attributes | Number of Tracts |
|---|---|
| 0 | 32 |
| 1 | 164 |
| 2 | 645 |
| 3 | 698 |
| 4 | 593 |
| 5 | 78 |
| Total | 2,210 |
Number of Tracts Meeting Uncertainty Threshold (CV = 0.12), Poverty Scenario for Chicago MSA.
| Attribute | Number of Tracts |
|---|---|
| Housing cost as share of income (owners) | 800 |
| Housing cost as share of income (renters) | 194 |
| Children above poverty | 1,248 |
| Total above poverty | 2,056 |
| Percent employed | 2,012 |
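The two tabulations above can be reproduced in miniature. The sketch below uses only the four Cook County tracts whose CVs are listed in the earlier estimates-and-uncertainty table, and it assumes that "meeting the threshold" means a CV at or below 0.12; the article's exact comparison rule is not stated here, so treat the `<=` as an assumption.

```python
from collections import Counter

THRESHOLD = 0.12  # maximum acceptable CV (assumed to be inclusive)

tracts = ["222600", "222700", "222800", "222900"]

# CVs copied from the table of selected Cook County tracts
cv_by_attribute = {
    "Housing cost as share of income (owners)": [0.331, 0.218, 1.099, 0.257],
    "Housing cost as share of income (renters)": [0.173, 0.242, 0.243, 0.291],
    "Children above poverty": [0.132, 0.308, 0.54, 0.187],
    "Total above poverty": [0.078, 0.061, 0.207, 0.139],
    "Percent employed": [0.023, 0.066, 0.042, 0.215],
}

# Per attribute: how many tracts have a CV within the threshold
tracts_per_attribute = {
    name: sum(cv <= THRESHOLD for cv in cvs)
    for name, cvs in cv_by_attribute.items()
}

# Per tract: how many of the five attributes are within the threshold
attributes_per_tract = {
    tract: sum(cvs[i] <= THRESHOLD for cvs in cv_by_attribute.values())
    for i, tract in enumerate(tracts)
}

# Histogram in the style of the first table: attributes met -> tract count
histogram = Counter(attributes_per_tract.values())

print(tracts_per_attribute)
print(attributes_per_tract)
print(dict(histogram))
```

Even in this small sample, none of the four tracts meets the threshold on all five attributes, which mirrors the pattern in the full Chicago tabulation, where only 78 of 2,210 tracts do.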
Regionalization Results Summary, Variation in Maximum CV Value, Poverty Scenario for Chicago MSA.
| Maximum CV | S | Number of Regions | Areas Per Region |
|---|---|---|---|
| 0.05 | 0.785 | 51 | 43.196 |
| 0.10 | 0.823 | 193 | 11.415 |
| 0.15 | 0.846 | 393 | 5.606 |
| 0.20 | 0.875 | 639 | 3.448 |
| 0.40 | 0.925 | 1573 | 1.401 |
Accuracy Results by Attribute (S ), Variation in Number of Attributes, Poverty Scenario for Chicago MSA.
| Attribute (columns: number of attributes) | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Percent employed | 0.991 | 0.980 | 0.928 | 0.887 | 0.832 |
| Total above poverty | | 0.968 | 0.887 | 0.845 | 0.758 |
| Children above poverty | | | 0.915 | 0.879 | 0.814 |
| Housing cost as share of income (owners) | | | | 0.922 | 0.882 |
| Housing cost as share of income (renters) | | | | | 0.897 |
Regionalization Results Summary, Variation in Number of Attributes, Poverty Scenario for Chicago MSA.
Note that S is the average of the columns in Table 7.
| Number of Attributes | S | Number of Regions | Areas Per Region |
|---|---|---|---|
| 1 | 0.991 | 2021 | 1.090 |
| 2 | 0.974 | 1950 | 1.130 |
| 3 | 0.910 | 1346 | 1.637 |
| 4 | 0.883 | 923 | 2.387 |
| 5 | 0.836 | 256 | 8.605 |
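The note that S is the average of the columns in Table 7 can be checked directly. The sketch below assumes that each attribute's accuracy values in the per-attribute table align to the rightmost columns (an attribute enters once the scenario includes enough attributes), and it compares the column means with the S values reported in the summary table. Variable names are illustrative.

```python
# Per-attribute accuracy by number of attributes (1..5); None marks cells
# where the attribute is not yet part of the scenario (assumed alignment).
per_attribute_accuracy = {
    "Percent employed": [0.991, 0.980, 0.928, 0.887, 0.832],
    "Total above poverty": [None, 0.968, 0.887, 0.845, 0.758],
    "Children above poverty": [None, None, 0.915, 0.879, 0.814],
    "Housing cost as share of income (owners)": [None, None, None, 0.922, 0.882],
    "Housing cost as share of income (renters)": [None, None, None, None, 0.897],
}

# S values as reported in the summary table
reported_s = [0.991, 0.974, 0.910, 0.883, 0.836]


def column_means(rows, n_cols=5):
    """Average the non-missing values in each column."""
    means = []
    for j in range(n_cols):
        values = [row[j] for row in rows.values() if row[j] is not None]
        means.append(sum(values) / len(values))
    return means


computed_s = column_means(per_attribute_accuracy)

for n, (computed, reported) in enumerate(zip(computed_s, reported_s), start=1):
    print(f"{n} attributes: computed {computed:.4f}, reported {reported:.3f}")
```

The computed means agree with the reported S values to within about 0.001, the size of difference expected when averaging inputs that were already rounded to three decimals.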
Fig 5. Regionalization results diagnostics, 18 MSAs and 4 scenarios.