Konstantina Dimakopoulou1, Evangelia Samoli1, Antonis Analitis1, Joel Schwartz2,3, Sean Beevers4, Nutthida Kitwiroon4, Andrew Beddows4, Benjamin Barratt4,5, Sophia Rodopoulou1, Sofia Zafeiratou1, John Gulliver6, Klea Katsouyanni1,4.
Abstract
Land use regression (LUR) and dispersion/chemical transport models (D/CTMs) are frequently applied to predict exposure to air pollution concentrations at a fine scale for use in epidemiological studies. Moreover, satellite aerosol optical depth data have been a key predictor, especially for particulate matter pollution and when studying large populations. Within the STEAM project we present a hybrid spatio-temporal modeling framework by (a) incorporating predictions from dispersion modeling of nitrogen dioxide (NO2), ozone (O3) and particulate matter with an aerodynamic diameter equal to or less than 10 μm (PM10) and less than 2.5 μm (PM2.5) into a spatio-temporal LUR model; and (b) combining the predictions from LUR and dispersion modeling and additionally, only for PM2.5, from an ensemble machine learning approach, using a generalized additive model (GAM). We used air pollution measurements from 2009 to 2013 from 62 fixed monitoring sites for O3, 115 for particles and up to 130 for NO2, obtained from the dense network in the Greater London Area, UK. We assessed all models following a 10-fold cross validation (10-fold CV) procedure. The hybrid models performed better than the separate LUR models. Incorporating the dispersion estimates in the LUR models as a predictor improved the LUR model fit: CV-R2 increased to 0.76 from 0.71 for NO2, to 0.79 from 0.57 for PM10, to 0.81 from 0.66 for PM2.5 and to 0.75 from 0.62 for O3. The CV-R2 obtained from the hybrid GAM framework was also higher than that of the separate LUR models (CV-R2 = 0.80 for NO2, 0.76 for PM10, 0.79 for PM2.5 and 0.75 for O3). Our study supports the combined use of different air pollution exposure assessment methods in a single modeling framework to improve the accuracy of spatio-temporal predictions for subsequent use in epidemiological studies.
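The gain from adding a dispersion estimate to a LUR-type model, judged by 10-fold cross validation, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, an ordinary linear model stands in for both the LUR model and the GAM, folds are drawn over individual observations rather than monitoring sites, and CV-R2 is computed as 1 − SSres/SStot on the held-out predictions. The names `fit_ols`, `predict` and `cv_r2` are hypothetical.

```python
import random

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(beta, X):
    return [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in X]

def cv_r2(X, y, k=10, seed=0):
    """R2 of held-out predictions from k-fold cross validation."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    pred = [0.0] * len(y)
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        beta = fit_ols([X[i] for i in train], [y[i] for i in train])
        for i, value in zip(fold, predict(beta, [X[i] for i in fold])):
            pred[i] = value
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-ins (assumption): two imperfect predictors of a measured series.
rng = random.Random(1)
lur = [rng.uniform(20, 60) for _ in range(200)]
disp = [rng.uniform(20, 60) for _ in range(200)]
obs = [0.5 * a + 0.5 * b + rng.gauss(0, 3) for a, b in zip(lur, disp)]

r2_lur = cv_r2([[a] for a in lur], obs)
r2_hybrid = cv_r2([[a, b] for a, b in zip(lur, disp)], obs)
```

In this toy setting `r2_hybrid` exceeds `r2_lur` because the second predictor carries information the first lacks; the size of the gain says nothing about the values reported above.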
Keywords: air pollution; chemical transport models; exposure modeling; land use regression; machine learning; particulate matter
Year: 2022 PMID: 35564796 PMCID: PMC9103954 DOI: 10.3390/ijerph19095401
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 4.614
Figure 1. Map of the fixed monitoring network operated in (a) the Greater London study area for NO2, PM10 and PM2.5 and (b) the extended study area for O3.
Figure 2. Predictor variables included in the final spatio-temporal land use regression (ST LUR) models developed for NO2, O3, PM10 and PM2.5 (μg/m3), in the Greater London Area for the years 2009 to 2013. TRAFMLOAD_50; TRAFMLOAD_100; TRAFMLOAD_300: traffic load of major roads (veh*m/day) in a buffer of 50 m, 100 m and 300 m around each fixed monitoring site, respectively; MROADLENGTH_100: total length of major roads (m) in a buffer of 100 m around each fixed monitoring site; INVDIST: inverse distance of fixed monitoring sites to the nearest major road (m−1); URBAN_300: urban areas (m2) in a buffer of 300 m around each fixed monitoring site; DAYCOUNT: day count variable accounting for trends within each year, coded from 1 to 365 or 366 (included as penalized splines with 6 degrees of freedom (df)); YEARS: years of the study period (4 dummy variables with year 2009 as reference category); TEMP: daily mean temperature (°C, included as penalized splines with 3 df); WDIR: daily mean wind direction (°N, included as penalized splines with 3 df); WSPEED: daily mean wind speed (m/s); RHUM: daily mean relative humidity (%); CLOUD: daily mean cloud coverage (okta); BARPRESS: daily mean barometric pressure (mBar/hPa, included as penalized splines with 3 df).
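Some of the predictors listed in the caption can be derived directly from a date or a distance. The sketch below is illustrative only: the helper names `temporal_terms` and `inverse_distance` and the `YEAR_…` column names are hypothetical, and the spline terms applied to DAYCOUNT in the models are not reproduced.

```python
from datetime import date

# 2009 is the reference category, so only 2010-2013 get a dummy variable.
DUMMY_YEARS = (2010, 2011, 2012, 2013)

def temporal_terms(day):
    """DAYCOUNT (1 to 365 or 366) and the four year dummies for one date."""
    terms = {"DAYCOUNT": day.timetuple().tm_yday}
    for year in DUMMY_YEARS:
        terms[f"YEAR_{year}"] = 1 if day.year == year else 0
    return terms

def inverse_distance(distance_m):
    """INVDIST: inverse distance (1/m) to the nearest major road."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return 1.0 / distance_m

example = temporal_terms(date(2012, 12, 31))
```

A date in 2009 yields zeros for all four dummies, which is what makes 2009 the reference year.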
Distribution of air pollutant concentrations measured from the fixed monitoring network and of estimated concentrations by the independent and hybrid exposure assessment methods, by pollutant, in the Greater London Area (n = 5373 LSOAs) for the years 2009–2013.
| Concentrations (μg/m3) | NO2 | O3 | PM10 | PM2.5 |
|---|---|---|---|---|
| Pollutant measurements | | | | |
| Number of monitoring sites | 130 | 62 | 115 | 104 |
| Measured concentration | 52.1 (24.39) | 53.2 (11.62) | 24.2 (5.18) | 14.5 (2.48) |
| Estimated concentrations | | | | |
| LUR ¹ | 41.4 (20.77) | 50.8 (8.23) | 20.3 (4.01) | 12.9 (2.12) |
| Dispersion ² | 37.7 (12.76) | 59.8 (6.66) | 19.7 (2.56) | 14.2 (1.80) |
| Ensemble machine learning ³ | - | - | - | 15.8 (1.28) |
| Hybrid 1 ⁴ | 43.8 (34.31) | 57.7 (12.77) | 21.9 (8.67) | 14.1 (2.47) |
| Hybrid 2 ⁵ | 35.2 (13.47) | 51.7 (7.27) | 19.5 (2.29) | 14.4 (1.18) |
1 spatio-temporal Land Use Regression (LUR) model. 2 spatio-temporal dispersion model. 3 PM2.5 prediction model based on an ensemble machine learning spatio-temporal approach. 4 Incorporation of estimates derived from independent exposure assessment methods, within the LUR model. 5 Combination of estimates derived from independent exposure assessment methods, within a GAM.
Summary of agreement between the independent exposure assessment methods.
| Agreement | NO2 Spatial | NO2 Temporal | O3 Spatial | O3 Temporal | PM10 Spatial | PM10 Temporal |
|---|---|---|---|---|---|---|
| Mean difference ¹ (μg/m3) | −3.7 | −3.7 | 9.0 | 9.0 | −0.6 | −0.6 |
| Lin’s | 0.39 * | 0.78 * | 0.25 * | 0.61 * | 0.31 * | 0.69 * |
| r | 0.45 * | 0.82 * | 0.44 * | 0.71 * | 0.37 * | 0.75 * |

| Agreement (PM2.5) | a difference Spatial | a difference Temporal | b difference Spatial | b difference Temporal | c difference Spatial | c difference Temporal |
|---|---|---|---|---|---|---|
| Mean difference ² (μg/m3) | 0.4 | 0.4 | −1.6 | −1.6 | −2.0 | −2.0 |
| Lin’s | 0.26 * | 0.67 * | 0.24 * | 0.96 * | 0.12 * | 0.71 * |
| r | 0.28 * | 0.77 * | 0.39 * | 0.98 * | 0.22 * | 0.81 * |
Mean difference 1: applicable to NO2, O3 and PM10; difference = Dispersion − LUR predicted concentrations. Mean difference 2: applicable only to PM2.5; a difference = Dispersion − LUR predicted concentrations; b difference = Dispersion − ensemble machine learning approach model predicted concentrations; c difference = LUR − ensemble machine learning approach model predicted concentrations. Lin: Lin’s concordance correlation coefficient. r: Pearson correlation coefficient. LoA: 95% limits of agreement (Bland–Altman method). * p-value < 0.05.
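The agreement statistics named in the footnote follow standard definitions, sketched below for two paired series. This is an illustration on made-up numbers, not the study's computation: the function name `agreement` is hypothetical, Lin's coefficient uses 1/n moments as in its usual definition, and the limits of agreement are taken as mean difference ± 1.96 standard deviations of the differences.

```python
import math

def agreement(x, y):
    """Mean difference (x - y), Pearson r, Lin's CCC and 95% limits of agreement."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    r = sxy / math.sqrt(sxx * syy)
    # Lin's CCC penalizes both scatter and any shift between the two means.
    ccc = 2 * sxy / (sxx + syy + (mx - my) ** 2)
    diffs = [a - b for a, b in zip(x, y)]
    mean_diff = sum(diffs) / n
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    loa = (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff)
    return {"mean_diff": mean_diff, "r": r, "ccc": ccc, "loa": loa}

# Made-up example: the second series equals the first shifted by a constant.
lur = [1.0, 2.0, 3.0, 4.0, 5.0]
dispersion = [2.0, 3.0, 4.0, 5.0, 6.0]
result = agreement(dispersion, lur)  # difference = Dispersion - LUR
```

A constant shift leaves r at 1 but pulls Lin's coefficient below 1, which is why the tables report both.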
Model performance evaluated by the value of adjusted R2 and 10-fold cross validated (CV) R2, root mean square error (RMSE) and mean bias, for the independent and hybrid modeling approaches.
| Pollutant (μg/m3) | NO2 | O3 | PM10 | PM2.5 |
|---|---|---|---|---|
| Number of monitoring sites | 130 | 62 | 115 | 104 |
| LUR ¹ | | | | |
| R2adj and (CV-R2) | 0.72 (0.71) | 0.69 (0.62) | 0.61 (0.57) | 0.69 (0.66) |
| RMSE | 4.28 | 13.67 | 7.42 | 3.64 |
| Mean bias ² | −5.60 | −0.14 | 0.96 | 0.59 |
| Dispersion ³ | | | | |
| R2adj and (CV-R2) | 0.73 (0.70) | 0.60 (0.59) | 0.71 (0.69) | 0.75 (0.74) |
| RMSE | 4.13 | 15.60 | 6.41 | 4.26 |
| Mean bias ² | 0.73 | −0.02 | 0.81 | 0.45 |
| Ensemble machine learning ⁴ | | | | |
| R2adj and (CV-R2) | - | - | - | 0.88 (0.83) |
| RMSE | - | - | - | |
| Mean bias ² | - | - | - | 0.058 |
| Hybrid 1 ⁵ | | | | |
| R2adj and (CV-R2) | 0.84 (0.76) | 0.79 (0.75) | 0.82 (0.79) | 0.84 (0.81) |
| RMSE | 3.71 | 10.24 | 2.72 | 0.20 |
| Mean bias ² | −7.13 | −11.12 | −0.35 | 0.13 |
| Hybrid 2 ⁶ | | | | |
| R2adj and (CV-R2) | 0.81 (0.80) | 0.76 (0.75) | 0.77 (0.76) | 0.80 (0.79) |
| RMSE | 3.64 | 11.94 | 4.02 | 1.91 |
| Mean bias ² | 1.64 | 0.03 | 0.68 | 0.36 |
1 Developed spatio-temporal (ST) LUR models. RMSE: root mean square error. 2 Bias = measured concentrations from fixed monitoring sites − 10-fold CV predicted concentrations. 3 Spatio-temporal dispersion model. 4 PM2.5 prediction model based on an ensemble machine learning spatio-temporal approach. 5 Incorporation of estimates derived from independent exposure assessment methods, within the LUR model. 6 Combination of estimates derived from independent exposure assessment methods, within a GAM. R2adj: adjusted R2 value of model. CV: 10-fold cross validation. CV-R2: R2 value of cross validated model.
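The performance measures in the table can be written out as a short sketch. The bias follows the footnote (measured minus predicted); the adjusted R2 uses the textbook correction for the number of predictors, which is an assumption about how R2adj was obtained here. The function name `performance` and the numbers are hypothetical.

```python
import math

def performance(measured, predicted, n_predictors):
    """RMSE, mean bias (measured - predicted), R2 and adjusted R2."""
    n = len(measured)
    resid = [m - p for m, p in zip(measured, predicted)]
    ss_res = sum(e ** 2 for e in resid)
    mean_m = sum(measured) / n
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot
    # Textbook adjustment: penalize R2 for the number of predictors used.
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mean_bias": sum(resid) / n,
        "r2": r2,
        "r2_adj": r2_adj,
    }

# Made-up values for illustration only.
stats = performance([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 37.0], n_predictors=1)
```

Passing held-out predictions from cross validation instead of fitted values gives the cross-validated counterparts of the same measures.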
Temporal and spatial fit of the hybrid modeling approaches. Results from 10-fold cross validation.
| R2 | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| Hybrid 1 ¹ | 0.67 | 0.61 | 0.59 | 0.74 | 0.52 | 0.70 | 0.47 | 0.82 |
| Hybrid 2 ² | 0.72 | 0.63 | 0.61 | 0.72 | 0.62 | 0.76 | 0.59 | 0.87 |
1 Incorporation of estimates derived from independent exposure assessment methods, within the LUR model. 2 Combination of estimates derived from independent exposure assessment methods, within a GAM. R2: R2 value of 10-fold cross validated model.
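One common way to separate spatial from temporal fit, shown here as an assumption because the exact definition is not given in this record, is to compare site-level averages (spatial) and deviations of each observation from its site average (temporal). The sketch uses the squared Pearson correlation for both; the function names and the records are hypothetical.

```python
from collections import defaultdict

def squared_pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def spatial_temporal_r2(records):
    """records: iterable of (site, measured, cv_predicted) tuples."""
    by_site = defaultdict(list)
    for site, measured, predicted in records:
        by_site[site].append((measured, predicted))
    mean_meas, mean_pred, dev_meas, dev_pred = [], [], [], []
    for pairs in by_site.values():
        m_bar = sum(m for m, _ in pairs) / len(pairs)
        p_bar = sum(p for _, p in pairs) / len(pairs)
        mean_meas.append(m_bar)   # spatial component: one value per site
        mean_pred.append(p_bar)
        for m, p in pairs:        # temporal component: within-site deviations
            dev_meas.append(m - m_bar)
            dev_pred.append(p - p_bar)
    return squared_pearson(mean_meas, mean_pred), squared_pearson(dev_meas, dev_pred)

# Made-up records for three sites, three days each.
records = [
    ("A", 10.0, 11.0), ("A", 12.0, 13.0), ("A", 14.0, 15.0),
    ("B", 20.0, 19.0), ("B", 22.0, 23.0), ("B", 24.0, 21.0),
    ("C", 28.0, 24.0), ("C", 30.0, 25.0), ("C", 32.0, 26.0),
]
spatial_r2, temporal_r2 = spatial_temporal_r2(records)
```

Under this definition a model can rank sites well while tracking day-to-day variation poorly, or the reverse, which is why the two components are reported separately.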
Figure 3. Yearly average (years 2009 to 2013) of estimated NO2 (24 h; μg/m3), O3 (8 h-max; μg/m3), PM10 (24 h; μg/m3) and PM2.5 (24 h; μg/m3) concentrations from the hybrid 2 model, in the Greater London Area. Hybrid 2: Combination of estimates derived from independent exposure assessment methods, within a GAM.
Figure 4. Long-term average (years 2009–2013) of estimated NO2 (24 h; μg/m3), O3 (8 h-max; μg/m3), PM10 (24 h; μg/m3) and PM2.5 (24 h; μg/m3) concentrations from the hybrid 2 model, per LSOA in the Greater London Area. Hybrid 2: Combination of estimates derived from independent exposure assessment methods, within a GAM.