Daniel Carrión (1), Kodi B Arfer (2), Johnathan Rush (2), Michael Dorman (3), Sebastian T Rowland (4), Marianthi-Anna Kioumourtzoglou (4), Itai Kloog (5), Allan C Just (6).
1. Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA. Electronic address: daniel.carrion@mssm.edu.
2. Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
3. Department of Geography and Environmental Development, Ben-Gurion University of the Negev, Beer-Sheva, Israel.
4. Department of Environmental Health Sciences, Columbia University, New York, USA.
5. Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Geography and Environmental Development, Ben-Gurion University of the Negev, Beer-Sheva, Israel.
6. Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Institute for Exposomic Research, Icahn School of Medicine at Mount Sinai, New York, USA.
Abstract
BACKGROUND: Accurate and precise estimates of ambient air temperature that capture fine-scale within-day variability are necessary for studies of air temperature and health.
METHOD: We developed statistical models to predict temperature at each hour in each cell of a 927-m square grid across the Northeast and Mid-Atlantic United States from 2003 to 2019, using data from ~4000 meteorological stations from the Integrated Mesonet, with inputs such as elevation, an inverse-distance-weighted interpolation of temperature, and satellite-based vegetation and land surface temperature. We used a rigorous spatial cross-validation scheme and spatially weighted the errors to estimate how well model predictions would generalize to new cell-days. As an example application, we assessed the within-county association of temperature and social vulnerability during a heat wave.
RESULTS: We found that a model based on the XGBoost machine-learning algorithm was fast and accurate, obtaining weighted root mean square errors (RMSEs) around 1.6 K, compared to standard deviations around 11.0 K. We found similar accuracy when validating our model on an external dataset from Weather Underground. Assessing predictions from the North American Land Data Assimilation System-2 (NLDAS-2), another hourly model, in the same way, we found it was much less accurate, with RMSEs around 2.5 K. This is likely due to the NLDAS-2 model's coarser spatial resolution and the dynamic variability of temperature within its grid cells. Finally, we demonstrated the health relevance of our model by showing that our temperature estimates were associated with social vulnerability across the region during a heat wave, whereas NLDAS-2 showed a much weaker association.
CONCLUSION: Our high spatiotemporal resolution air temperature model makes a strong contribution to future health studies in this region.
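Two of the ingredients named in the abstract, an inverse-distance-weighted (IDW) interpolation of station temperature and a weighted RMSE, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the distance power of 2, planar (x, y) coordinates, and the way weights are supplied are all assumptions made for illustration, and the abstract does not specify how the spatial weights were actually derived.

```python
import numpy as np


def idw_interpolate(station_xy, station_temp, target_xy, power=2.0):
    """Inverse-distance-weighted temperature at each target point.

    Hypothetical helper: the power of 2 and planar coordinates are
    assumptions, not details taken from the abstract.
    """
    station_xy = np.asarray(station_xy, dtype=float)
    station_temp = np.asarray(station_temp, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)

    # Pairwise distances, shape (n_targets, n_stations).
    diff = target_xy[:, None, :] - station_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # A target that coincides with a station takes that station's value.
    exact = dist == 0.0
    with np.errstate(divide="ignore"):
        weights = 1.0 / dist ** power
    weights[exact.any(axis=1)] = 0.0
    weights[exact] = 1.0

    return (weights * station_temp).sum(axis=1) / weights.sum(axis=1)


def weighted_rmse(observed, predicted, weights):
    """Root mean square error with per-observation weights.

    The weights are an assumed input here (for example, larger for
    stations that represent more area).
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    weights = np.asarray(weights, dtype=float)
    squared_error = (observed - predicted) ** 2
    return float(np.sqrt((weights * squared_error).sum() / weights.sum()))


# Made-up example: two stations 2 units apart, temperatures in kelvin.
stations = [(0.0, 0.0), (2.0, 0.0)]
temps = [290.0, 294.0]
targets = [(1.0, 0.0), (0.0, 0.0)]
estimates = idw_interpolate(stations, temps, targets)
error = weighted_rmse([0.0, 0.0], [1.0, 3.0], [3.0, 1.0])
```

In this toy example the midpoint target receives the average of the two stations, and the target located on a station reproduces that station's reading. Giving the first observation three times the weight of the second pulls the weighted RMSE toward the smaller error.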