Kevin A Henry, Francis P Boscoe.
Abstract
BACKGROUND: To reduce the number of non-geocoded cases, researchers and organizations sometimes include cases geocoded to postal code centroids along with cases geocoded with the greater precision of a full street address. Some analysts then use the postal code to assign information to the cases from finer-level geographies such as a census tract. Assignment is commonly completed using either a postal centroid or a geographical imputation method, which assigns a location by using both the demographic characteristics of the case and the population characteristics of the postal delivery area. To date, no systematic evaluation of geographical imputation methods ("geo-imputation") has been completed. The objective of this study was to determine the accuracy of census tract assignment using geo-imputation.
Year: 2008 PMID: 18215308 PMCID: PMC2266732 DOI: 10.1186/1476-072X-7-3
Source DB: PubMed Journal: Int J Health Geogr ISSN: 1476-072X Impact factor: 3.918
Summary of population subsets used in study.
| Population Subset | Cases |
| Total Population | 95,303 |
| Non-Hispanic White | 75,577 |
| Asian | 2,235 |
| Non-Hispanic Black | 11,123 |
| Hispanic | 6,368 |
| Non-Hispanic White | |
| 20–49 | 8,597 |
| 50–64 | 22,191 |
| 65–84 | 39,388 |
| >85 | 5,401 |
| Asian | |
| 20–49 | 563 |
| 50–64 | 852 |
| 65–84 | 773 |
| >85 | 47 |
| Non-Hispanic Black | |
| 20–49 | 1,502 |
| 50–64 | 4,157 |
| 65–84 | 5,032 |
| >85 | 432 |
| Hispanic | |
| 20–49 | 1,269 |
| 50–64 | 2,252 |
| 65–84 | 2,605 |
| >85 | 242 |
Source: Data from the New Jersey State Cancer Registry, New Jersey Department of Health and Senior Services, 2005
Figure 1. Census block centroid populations are used to calculate the proportion of census tract populations which fall within the boundaries of ZIP codes. For example, the portion of census tract 1811.00 within ZIP code 07524 receives only 3,101 individuals of the total census tract population of 6,774.
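The weighting described in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the tuple layout, the individual block populations, and the placeholder ZIP label `"OTHER"` are all hypothetical. Only the totals 3,101 and 6,774 come from the caption.

```python
from collections import defaultdict

def tract_zip_populations(blocks):
    """Sum block populations for every (ZIP code, census tract) pair.

    `blocks` is an iterable of (zip_code, tract_id, population) tuples,
    one per census block, where zip_code is the ZIP area that contains
    the block centroid (hypothetical layout).
    """
    totals = defaultdict(int)
    for zip_code, tract_id, population in blocks:
        totals[(zip_code, tract_id)] += population
    return dict(totals)

def share_of_tract_in_zip(totals, zip_code, tract_id):
    """Fraction of a tract's population whose block centroids fall in zip_code."""
    tract_total = sum(p for (_, t), p in totals.items() if t == tract_id)
    if tract_total == 0:
        raise ValueError(f"tract {tract_id} has no population")
    return totals.get((zip_code, tract_id), 0) / tract_total

# Illustrative blocks: the block-level values are made up so that they sum
# to the figures quoted in the caption (3,101 of 6,774 for tract 1811.00).
blocks = [
    ("07524", "1811.00", 1500),
    ("07524", "1811.00", 1601),
    ("OTHER", "1811.00", 2000),  # placeholder for the rest of the tract
    ("OTHER", "1811.00", 1673),
]
totals = tract_zip_populations(blocks)
share = share_of_tract_in_zip(totals, "07524", "1811.00")
print(totals[("07524", "1811.00")], round(share, 3))
```

With these inputs, the part of tract 1811.00 inside ZIP code 07524 carries about 46% of the tract population, which is the weight that part of the tract would contribute to any population-based assignment within that ZIP code.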
Figure 2. Procedures used for geo-imputation.
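The figure itself is not reproduced here, so the exact procedure cannot be restated. Going only by the description in the abstract, the core step of geo-imputation can be read as a population-weighted random draw among the census tracts that overlap a case's ZIP code, optionally restricted to the case's demographic stratum. The sketch below illustrates that reading; the function `impute_tract`, the layout of `weights`, the ZIP label, the tract IDs, and all counts are hypothetical.

```python
import random

def impute_tract(zip_code, stratum, weights, rng):
    """Draw one census tract for a case, weighted by population.

    weights[(zip_code, stratum)] maps tract_id -> population of that
    stratum living in the part of the tract inside the ZIP code.
    Use stratum=None for the overall population distribution, or a
    label such as ("Hispanic", "50-64") for a race/ethnicity-age one.
    """
    tract_pops = weights[(zip_code, stratum)]
    tracts = sorted(tract_pops)  # fixed order keeps seeded runs repeatable
    pops = [tract_pops[t] for t in tracts]
    if sum(pops) == 0:
        # Assumed fallback: nobody in this stratum, so draw uniformly.
        return rng.choice(tracts)
    return rng.choices(tracts, weights=pops, k=1)[0]

# Hypothetical weights for one ZIP code; tract IDs and counts are made up.
weights = {
    ("ZIP1", None): {"A": 3101, "B": 1200, "C": 700},
    ("ZIP1", ("Hispanic", "50-64")): {"A": 40, "B": 0, "C": 160},
}
rng = random.Random(2008)  # seeded so the run is repeatable
stratum = ("Hispanic", "50-64")
draws = [impute_tract("ZIP1", stratum, weights, rng) for _ in range(1000)]
print({t: draws.count(t) for t in "ABC"})
```

Because the assignment is random, a single run gives one of many possible allocations. Repeating the whole assignment several times and recording the percent of correct matches per run is one way to obtain a mean with a minimum and maximum.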
Summary of results from different census tract assignment methods (ZIP code centroids and geo-imputation).
| Population | Cases | Random (a) | Geographic Centroid (b) | Population Centroid (c) | Overall Population Distribution | Population Distribution by Race/Ethnicity | Population Distribution by Race/Ethnicity-Age |
| | N | Mean % (Min, Max %) | Total % | Total % | Mean % (Min, Max %) | Mean % (Min, Max %) | Mean % (Min, Max %) |
| All Cases | 95,303 | 14.1 (13.5, 14.7) | 20.7 | 25.9 | 25.8 (25.3, 26.3) | 26.7 (26.0, 27.4) | 27.7 (26.5, 29.0) |
| Non-Hispanic White | 75,577 | 15.0 (14.6, 15.4) | 22.0 | 27.9 | 27.8 (27.4, 28.1) | 28.2 (27.7, 28.7) | 29.3 (28.4, 30.3) |
| Non-Hispanic Black | 11,123 | 9.8 (9.0, 10.8) | 15.3 | 16.6 | 16.5 (15.7, 17.4) | 20.1 (18.7, 21.0) | 21.3 (19.5, 23.2) |
| Hispanic | 6,368 | 11.1 (10.0, 12.4) | 16.3 | 19.2 | 18.8 (17.4, 20.0) | 19.8 (18.7, 21.3) | 20.2 (17.5, 22.7) |
| Asian | 2,235 | 13.7 (11.6, 15.8) | 17.3 | 25.4 | 26.0 (24.2, 28.5) | 28.8 (26.4, 31.5) | 28.8 (24.5, 33.5) |
| Age | |||||||
| 20–49 | 11,931 | 14.2 (13.4, 15.2) | 22.0 | 26.5 | 26.5 (26.4, 27.4) | 27.5 (26.4, 28.4) | 27.7 (25.6, 29.5) |
| 50–64 | 29,452 | 14.2 (13.6, 14.8) | 21.1 | 26.4 | 26.2 (25.6, 26.6) | 27.0 (26.2, 27.3) | 27.4 (26.2, 28.6) |
| 65–84 | 47,798 | 14.1 (13.6, 14.5) | 20.3 | 25.4 | 25.6 (25.1, 26.0) | 26.5 (26.1, 27.2) | 27.8 (26.8, 28.8) |
| >85 | 6,122 | 13.9 (12.8, 15.3) | 19.4 | 25.9 | 25.2 (23.9, 26.4) | 25.9 (24.1, 27.4) | 29.1 (26.9, 31.3) |
| Population Density | |||||||
| <1,132 | 17,262 | 16.5 (15.9, 17.2) | 29.6 | 29.6 | 30.8 (30.1, 31.2) | 31.4 (30.3, 32.1) | 31.8 (30.5, 33.1) |
| 1,133–2,882 | 23,036 | 15.3 (14.9, 15.8) | 24.1 | 29.0 | 30.2 (29.7, 31.0) | 30.8 (30.2, 31.2) | 32.4 (31.5, 33.4) |
| 2,883–5,078 | 21,744 | 14.1 (13.4, 14.6) | 19.2 | 27.1 | 25.6 (24.9, 26.2) | 26.4 (26.7, 27.4) | 27.8 (26.7, 28.7) |
| 5,079–11,579 | 18,822 | 14.7 (13.9, 15.5) | 17.1 | 26.3 | 24.9 (24.8, 25.3) | 26.4 (25.5, 27.2) | 26.8 (25.3, 28.2) |
| >11,579 | 14,439 | 8.6 (7.9, 9.3) | 12.0 | 14.3 | 14.3 (13.6, 15.0) | 15.5 (14.3, 16.2) | 16.3 (14.9, 17.6) |
(a) Random assignment was based on census tracts having equal probability within each postal ZIP code.
(b) Geographic postal ZIP centroids were based on the center of the bounding rectangle for each ZIP code area.
(c) Population weighted postal ZIP centroids were calculated as the mean of the census block centroid coordinates weighted by the number of persons per block.
Source: Data from the New Jersey State Cancer Registry, New Jersey Department of Health and Senior Services, 2005.
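The three comparison methods defined in notes a to c are simple enough to sketch directly. The code below is an illustration under assumptions, not the study's code: function names, data layouts, and all coordinates and populations are made up, and coordinates are treated as planar (projected) values. Locating the tract that contains a centroid needs a point-in-polygon test, which is not shown.

```python
import random

def random_tract(tracts, rng):
    """Note a: every tract in the ZIP code is equally likely."""
    return rng.choice(sorted(tracts))

def bounding_box_center(vertices):
    """Note b: center of the bounding rectangle of a ZIP area's (x, y) vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)

def population_weighted_centroid(blocks):
    """Note c: mean of block centroid coordinates weighted by block population.

    `blocks` is a list of (x, y, population) tuples, one per census block.
    """
    total = sum(p for _, _, p in blocks)
    if total == 0:
        raise ValueError("no population in this ZIP area")
    x = sum(bx * p for bx, _, p in blocks) / total
    y = sum(by * p for _, by, p in blocks) / total
    return (x, y)

# Made-up ZIP area: a 10 x 4 rectangle with two populated blocks.
zip_vertices = [(0, 0), (10, 0), (10, 4), (0, 4)]
zip_blocks = [(1, 1, 100), (9, 3, 300)]

print(bounding_box_center(zip_vertices))         # (5.0, 2.0)
print(population_weighted_centroid(zip_blocks))  # (7.0, 2.5)
print(random_tract({"A", "B", "C"}, random.Random(0)))
```

In this toy example the population-weighted centroid is pulled toward the more populous block, while the geographic centroid ignores where people live.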
Figure 3. Mean percent of correct census tract matches using geo-imputation based on the overall population distribution and on population distributions by race/ethnicity and by race/ethnicity-age.
Figure 4. Mean percent of correct census tract matches by geo-imputation based on race/ethnicity/age groups.