Quynh C Nguyen, Sahil Khanna, Pallavi Dwivedi, Dina Huang, Yuru Huang, Tolga Tasdizen, Kimberly D Brunisholz, Feifei Li, Wyatt Gorman, Thu T Nguyen, Chengsheng Jiang.
Abstract
Neighborhood attributes have been shown to influence health, but advances in neighborhood research have been constrained by the lack of neighborhood data for many geographical areas, and few neighborhood studies examine features of nonmetropolitan locations. We leveraged a massive source of Google Street View (GSV) images and computer vision to automatically characterize national neighborhood built environments. Using road network data and the Google Street View API, we retrieved over 16 million GSV images of street intersections across the United States from December 15, 2017 to May 14, 2018. Computer vision was applied to label each image. We implemented regression models to estimate associations between built environments and county health outcomes, controlling for county-level demographics, economics, and population density. At the county level, greater presence of highways was related to lower rates of chronic disease and premature mortality. Areas characterized by street view images as 'rural' (having limited infrastructure) had higher obesity, diabetes, fair/poor self-rated health, premature mortality, physical distress, physical inactivity and teen birth rates but lower rates of excessive drinking. Analyses at the census tract level for 500 cities revealed adverse associations similar to those seen at the county level for neighborhood indicators of less urban development. Possible mechanisms include the greater abundance of services and facilities found in more developed areas with roads, enabling access to places and resources for promoting health. GSV images represent an underutilized resource for building national data on neighborhoods and examining the influence of built environments on community health outcomes across the United States.
Keywords: Built environment; Computer vision systems; Geographic information system; Google Street View; Health; Neighborhood; Rural
Year: 2019 PMID: 31061781 PMCID: PMC6488538 DOI: 10.1016/j.pmedr.2019.100859
Source DB: PubMed Journal: Prev Med Rep ISSN: 2211-3355
Fig. 1. Percent of street intersection images with highways, by county.
Data source: Google Street View images.
Fig. 2. Percent of street intersection images with rural area, by county.
Data source: Google Street View images.
Fig. 3. Percent of street intersection images with grassland, by county.
Data source: Google Street View images.
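The county maps in Figs. 1 to 3 rest on turning image-level labels into a per-county percentage. A minimal sketch of that aggregation follows, assuming each image record is a (county identifier, set of labels) pair. The record layout, the sample records, and the function name `county_percent` are illustrative assumptions, not the study's actual pipeline.

```python
from collections import defaultdict


def county_percent(images, label):
    """Percent of images in each county that carry the given label.

    images: iterable of (county_id, labels) pairs, where labels is a set of
    strings such as {"highway"} (a hypothetical record layout).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for county_id, labels in images:
        totals[county_id] += 1
        if label in labels:
            hits[county_id] += 1
    return {county: 100.0 * hits[county] / totals[county] for county in totals}


# Made-up sample records for two hypothetical counties.
sample_images = [
    ("county_A", {"highway"}),
    ("county_A", set()),
    ("county_A", {"rural", "grassland"}),
    ("county_A", {"highway"}),
    ("county_B", {"rural"}),
    ("county_B", {"rural", "grassland"}),
]

for name in ("highway", "rural", "grassland"):
    print(name, county_percent(sample_images, name))
```

Keeping the denominator as all images in the county, including those with no label of interest, matches the "percent of street intersection images" wording of the captions.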
Descriptive characteristics of Google Street View-derived built environment characteristics.
| Characteristic | Google Street View images: N | Google Street View images: Percent (standard deviation) | County summaries: N | County summaries: Percent (standard deviation) |
|---|---|---|---|---|
| Highway | 16,172,373 | 11.36 (31.73) | 2144 | 18.41 (14.31) |
| Rural area | 16,172,373 | 14.23 (34.93) | 2144 | 22.99 (16.95) |
| Grassland | 16,172,373 | 5.49 (22.78) | 2144 | 14.47 (18.23) |
Neighborhood characteristics derived from street images collected between December 2017–April 2018 from Google's Street View Image API.
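Because each image-level characteristic is a yes/no label, its standard deviation follows from its prevalence: for a proportion p, the standard deviation of the indicator is sqrt(p(1 − p)). The sketch below checks this identity against the image-level column of the table above. The function name is illustrative, and small discrepancies are expected because the published percentages are rounded.

```python
from math import sqrt


def binary_sd_percent(percent):
    """Standard deviation, in percentage points, of a 0/1 indicator
    whose mean is `percent` percent."""
    p = percent / 100.0
    return 100.0 * sqrt(p * (1.0 - p))


# (percent, reported SD) pairs copied from the image-level column of the table.
reported = {
    "Highway": (11.36, 31.73),
    "Rural area": (14.23, 34.93),
    "Grassland": (5.49, 22.78),
}

for name, (percent, reported_sd) in reported.items():
    implied_sd = binary_sd_percent(percent)
    print(f"{name}: reported SD {reported_sd:.2f}, implied SD {implied_sd:.2f}")
```

The identity does not apply to the county summaries column, where each county contributes a continuous percentage instead of a binary label.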
Google Street View-derived predictors of county health outcomes (a).
| | Percent with fair/poor health | Percent with diabetes | Percent with obesity | Years of potential life lost (per 100,000 people) | Percent with physical distress | Percent with mental distress |
|---|---|---|---|---|---|---|
| | Prevalence difference | Prevalence difference | Prevalence difference | Prevalence difference | Prevalence difference | Prevalence difference |
| Indicator of greater development | ||||||
| Highway | ||||||
| 3rd tertile (highest) | −0.51 (−0.74, −0.29) | −0.64 (−0.82, −0.46) | −0.10 (−0.48, 0.29) | −452.07 (−626.65, −277.50) | −0.27 (−0.40, −0.15) | −0.36 (−0.47, −0.26) |
| 2nd tertile | −0.14 (−0.36, 0.07) | −0.23 (−0.40, −0.06) | −0.07 (−0.42, 0.28) | −190.21 (−340.13, −40.30) | −0.07 (−0.19, 0.05) | −0.13 (−0.23, −0.02) |
| Indicators of less development | ||||||
| Rural area | ||||||
| 3rd tertile (highest) | 0.79 (0.55, 1.03) | 0.68 (0.50, 0.87) | 1.85 (1.44, 2.25) | 270.18 (77.13, 463.23) | 0.26 (0.13, 0.39) | 0.10 (−0.02, 0.22) |
| 2nd tertile | 0.44 (0.20, 0.67) | 0.37 (0.18, 0.55) | 1.35 (0.95, 1.74) | 73.11 (−107.02, 253.25) | 0.13 (0.01, 0.26) | −0.01 (−0.12, 0.10) |
| Grassland | ||||||
| 3rd tertile (highest) | 0.10 (−0.14, 0.34) | −0.12 (−0.32, 0.08) | 1.41 (0.99, 1.82) | −24.69 (−202.35, 152.98) | −0.24 (−0.38, −0.11) | −0.48 (−0.60, −0.36) |
| 2nd tertile | 0.18 (−0.06, 0.42) | 0.14 (−0.04, 0.33) | 1.07 (0.69, 1.45) | −38.24 (−201.87, 125.38) | −0.01 (−0.14, 0.12) | −0.12 (−0.23, −0.01) |
| N | 2108 | 2108 | 2108 | 2074 | 2044 | 2044 |
(a) County built environment characteristics categorized into tertiles with the lowest tertile serving as the referent group. Adjusted linear regression models were run for each predictor and outcome separately. Models controlled for county-level demographics: percent <18 years old, percent 65 years and older, percent Hispanic, percent non-Hispanic black, percent non-Hispanic Asian, percent American Indian/Alaska Native, economic disadvantage, percent not proficient in English, and population density. Robust standard errors reported.
p < 0.05.
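The tertile coding described in the note above can be sketched as follows. This is an unadjusted illustration only: it splits a predictor into tertiles and reports each upper tertile's mean outcome minus the lowest tertile's mean, which is what a linear model with only the two tertile indicators would estimate. It does not include the covariates or robust standard errors of the reported models, and the cut-point rule (Python's `statistics.quantiles` default) and function names are assumptions.

```python
from statistics import mean, quantiles


def assign_tertiles(values):
    """Label each value 1 (lowest tertile), 2, or 3 (highest tertile)."""
    low_cut, high_cut = quantiles(values, n=3)
    return [1 if v <= low_cut else 2 if v <= high_cut else 3 for v in values]


def differences_vs_lowest(predictor, outcome):
    """Unadjusted mean outcome differences, with tertile 1 as the referent."""
    tertiles = assign_tertiles(predictor)
    group_means = {
        t: mean(y for y, g in zip(outcome, tertiles) if g == t)
        for t in (1, 2, 3)
    }
    return {
        "2nd tertile": group_means[2] - group_means[1],
        "3rd tertile (highest)": group_means[3] - group_means[1],
    }


# Made-up values for nine hypothetical counties.
percent_rural_images = [1, 2, 3, 4, 5, 6, 7, 8, 9]
percent_obese = [10, 10, 10, 12, 12, 12, 15, 15, 15]

print(differences_vs_lowest(percent_rural_images, percent_obese))
```

A positive difference means the outcome is more common in that tertile than in the lowest one, which is how the signs in the tables read.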
Google Street View-derived predictors of county behavioral health outcomes (a).
| | Physical inactivity | Teen births | Excessive drinking |
|---|---|---|---|
| | Prevalence difference | Prevalence difference | Prevalence difference |
| Indicator of greater development | |||
| Highway | |||
| 3rd tertile (highest) | −0.99 (−1.41, −0.56) | −2.20 (−3.19, −1.21) | 0.81 (0.54, 1.08) |
| 2nd tertile | −0.26 (−0.68, 0.15) | −0.54 (−1.52, 0.44) | 0.14 (−0.10, 0.39) |
| Indicators of less development | |||
| Rural area | |||
| 3rd tertile (highest) | 2.57 (2.09, 3.05) | 2.88 (1.77, 4.00) | −0.36 (−0.65, −0.06) |
| 2nd tertile | 1.40 (0.95, 1.85) | 2.00 (0.92, 3.08) | 0.05 (−0.23, 0.33) |
| Grassland | |||
| 3rd tertile (highest) | 1.47 (0.98, 1.95) | 1.19 (0.10, 2.28) | 0.28 (−0.01, 0.56) |
| 2nd tertile | 1.23 (0.78, 1.68) | 0.86 (−0.14, 1.86) | 0.09 (−0.17, 0.36) |
| N | 2108 | 2044 | 2108 |
(a) County built environment characteristics categorized into tertiles with the lowest tertile serving as the referent group. Adjusted linear regression models were run for each predictor and outcome separately. Models controlled for county-level demographics: percent <18 years old, percent 65 years and older, percent Hispanic, percent non-Hispanic black, percent non-Hispanic Asian, percent American Indian/Alaska Native, economic disadvantage, percent not proficient in English, and population density. Robust standard errors reported.
p < 0.05.
Google Street View-derived rural area (limited infrastructure) as a predictor of county health care access and exercise opportunities.
| Rural area | Primary care physician rate | Exercise opportunities |
|---|---|---|
| | Prevalence difference | Prevalence difference |
| 3rd tertile (highest) | −13.96 (−17.89, −10.03) | −9.39 (−11.73, −7.06) |
| 2nd tertile | −8.69 (−12.35, −5.03) | −4.86 (−7.04, −2.69) |
| N | 2022 | 2108 |
Primary care physician = primary care physicians per 100,000 population, 2015.
Exercise opportunities = percent of the population with access to places for physical activity. Access was defined for urban census blocks as living within half a mile from a park or a mile from a recreational facility and defined for rural census blocks as living within 3 miles from a recreational facility, 2016.
County rural area indicator categorized into tertiles, with the lowest tertile serving as the referent group. Adjusted linear regression models were run for each predictor and outcome separately. Models controlled for county-level demographics: percent <18 years old, percent 65 years and older, percent Hispanic, percent non-Hispanic black, percent non-Hispanic Asian, percent American Indian/Alaska Native, economic disadvantage, percent not proficient in English, and population density. Robust standard errors reported.
p < 0.05.
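The access rule behind the exercise opportunities measure, as defined in the notes above, can be written as a small function. This is a sketch under stated assumptions: "within" is treated as inclusive, distances are in miles, and the population-weighted roll-up and all names are illustrative, not the data provider's implementation.

```python
def has_exercise_access(is_urban, miles_to_park, miles_to_rec_facility):
    """Apply the urban/rural access definition to one census block."""
    if is_urban:
        # Urban blocks: half a mile from a park or one mile from a facility.
        return miles_to_park <= 0.5 or miles_to_rec_facility <= 1.0
    # Rural blocks: three miles from a recreational facility.
    return miles_to_rec_facility <= 3.0


def percent_with_access(blocks):
    """Population-weighted percent with access.

    blocks: list of (population, is_urban, miles_to_park,
    miles_to_rec_facility) tuples (a hypothetical layout).
    """
    total = sum(population for population, _, _, _ in blocks)
    covered = sum(
        population
        for population, is_urban, park, rec in blocks
        if has_exercise_access(is_urban, park, rec)
    )
    return 100.0 * covered / total


# Made-up census blocks for one hypothetical county.
sample_blocks = [
    (100, True, 0.3, 5.0),   # urban, near a park: has access
    (300, True, 2.0, 2.0),   # urban, too far from both: no access
    (100, False, 0.1, 2.5),  # rural, facility within 3 miles: has access
    (500, False, 0.1, 4.0),  # rural, a nearby park alone does not count
]

print(percent_with_access(sample_blocks))
```

Note that under this definition a nearby park does not give a rural block access; only the distance to a recreational facility counts there.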
Google Street View-derived predictors of census tract health outcomes (a).
| | Obesity | Diabetes | Physical distress | Mental distress | Physical inactivity | Binge drinking | Limited access to healthy food | Dental care |
|---|---|---|---|---|---|---|---|---|
| | Prevalence difference (95% CI) | Prevalence difference (95% CI) | Prevalence difference (95% CI) | Prevalence difference (95% CI) | Prevalence difference (95% CI) | Prevalence difference (95% CI) | Prevalence difference (95% CI) | Prevalence difference (95% CI) |
| Google Street View rural area | ||||||||
| 3rd tertile (highest) | 4.80 (4.48, 5.12) | 1.28 (1.11, 1.44) | 1.70 (1.53, 1.86) | 1.42 (1.29, 1.55) | 4.84 (4.50, 5.18) | −1.88 (−2.13, −1.63) | 34.48 (32.78, 36.17) | −5.55 (−5.93, −5.17) |
| 2nd tertile | 3.39 (3.20, 3.58) | 0.76 (0.68, 0.83) | 0.81 (0.72, 0.90) | 0.41 (0.34, 0.48) | 2.77 (2.57, 2.98) | −1.38 (−1.51, −1.25) | 23.10 (21.76, 24.45) | −2.78 (−2.99, −2.57) |
| Census derived | ||||||||
| Population density | ||||||||
| 1st tertile (lowest) | 2.82 (2.56, 3.07) | 0.54 (0.42, 0.67) | 0.81 (0.69, 0.94) | 0.72 (0.62, 0.81) | 2.36 (2.08, 2.65) | −1.04 (−1.21, −0.87) | 36.82 (35.76, 37.88) | −3.46 (−3.77, −3.16) |
| 2nd tertile | 2.16 (2.04, 2.28) | 0.51 (0.46, 0.56) | 0.64 (0.58, 0.70) | 0.37 (0.32, 0.41) | 1.52 (1.39, 1.66) | −1.12 (−1.20, −1.05) | 23.20 (22.36, 24.05) | −2.26 (−2.41, −2.12) |
| Rural census tract | 1.72 (1.38, 2.05) | 0.22 (0.06, 0.38) | 0.55 (0.40, 0.70) | 0.66 (0.55, 0.78) | 1.65 (1.29, 2.02) | −0.37 (−0.60, −0.15) | 22.33 (21.29, 23.38) | −2.55 (−2.93, −2.18) |
| USDA Rural-urban continuum codes | ||||||||
| Small town & rural (vs. metropolitan tracts) | 1.06 (0.92, 1.20) | 2.72 (2.64, 2.79) | 3.93 (3.84, 4.01) | 1.49 (1.43, 1.55) | −1.74 (−1.91, −1.58) | −1.78 (−1.87, −1.68) | 32.67 (19.50, 45.84) | −1.44 (−1.64, −1.24) |
| N | 9991 | 9991 | 9991 | 9991 | 9991 | 9991 | 10,529 | 9991 |
(a) Data source of health outcomes: City Health Dashboard on 500 U.S. Cities. Census tract built environment characteristics categorized into tertiles with the lowest tertile serving as the referent group. Adjusted linear regression models were run for each predictor and outcome separately. Models controlled for census tract-level demographics: population density, rural census tract designation, percent 10–24 years old, percent 65 years and older, percent Hispanic, percent non-Hispanic black, households with relatives (other than spouse and children), households with unmarried partner, owner-occupied housing, economic disadvantage, and household size. A census tract was urban if the geographic centroid of the tract was in an area with >2500 people; all other tracts were rural. Robust standard errors reported. Separate models were run for each outcome and for each predictor (Google Street View-derived rural area, census population density, rural census tract) because the predictors were collinear with each other.
Rural-Urban continuum codes: https://www.ers.usda.gov/data-products/rural-urban-commuting-area-codes.aspx#.U9lO7GPDWHo.
p < 0.05.
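The tract-level table reports each estimate with a 95% CI, and a 95% CI that excludes zero corresponds to p < 0.05 for a two-sided test of the same estimate. A minimal sketch for reading a table cell and applying that check follows; the parsing pattern and function names are assumptions made for illustration, and the cells shown are copied from the table above.

```python
import re

# A signed decimal number; Unicode minus signs are normalized before matching.
NUM = r"-?\d+(?:\.\d+)?"
CELL = re.compile(rf"\s*({NUM})\s*\(\s*({NUM})\s*,\s*({NUM})\s*\)\s*")


def parse_cell(cell):
    """Split 'estimate (lower, upper)' into three floats."""
    match = CELL.fullmatch(cell.replace("\u2212", "-"))
    if match is None:
        raise ValueError(f"unrecognized cell format: {cell!r}")
    estimate, lower, upper = (float(group) for group in match.groups())
    return estimate, lower, upper


def excludes_zero(cell):
    """True when the interval lies entirely above or entirely below zero."""
    _, lower, upper = parse_cell(cell)
    return lower > 0 or upper < 0


# Cells from the Google Street View rural area rows (3rd tertile).
cells = {
    "Binge drinking": "−1.88 (−2.13, −1.63)",
    "Limited access to healthy food": "34.48 (32.78, 36.17)",
}

for outcome, cell in cells.items():
    print(outcome, parse_cell(cell), excludes_zero(cell))
```

Normalizing the typographic minus sign (U+2212) matters here because `float` would otherwise reject the negative values as they appear in the table.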