Lianfa Li, Jiehao Zhang, Wenyang Qiu, Jinfeng Wang, Ying Fang.
Abstract
Although fine particulate matter with a diameter of <2.5 μm (PM2.5) has a greater negative impact on human health than particulate matter with a diameter of <10 μm (PM10), measurements of PM2.5 have only recently been performed, and the spatial coverage of these measurements is limited. Comprehensively assessing PM2.5 pollution levels and the cumulative health effects is difficult because PM2.5 monitoring data for prior time periods and certain regions are not available. In this paper, we propose a promising approach for robustly predicting PM2.5 concentrations. In our approach, a generalized additive model is first used to quantify the non-linear associations between predictors and PM2.5, the bagging method is used to sample the dataset and train different models to reduce the bias in prediction, and the variogram for the daily residuals of the ensemble predictions is then simulated to improve our predictions. Shandong Province, China, is the study region, and data from 96 monitoring stations were included. To train and validate the models, we used PM2.5 measurement data from 2014 with other predictors, including PM10 data, meteorological parameters, remote sensing data, and land-use data. The validation results revealed that the R² value was improved and reached 0.89 when PM10 was used as a predictor and a kriging interpolation was performed for the residuals. However, when PM10 was not used as a predictor, our method still achieved a CV R² value of up to 0.86. The ensemble of spatial characteristics of relevant factors explained approximately 32% of the variance and improved the PM2.5 predictions. The spatiotemporal modeling approach to estimating PM2.5 concentrations presented in this paper has important implications for assessing PM2.5 exposure and its cumulative health effects.
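The bagging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a one-predictor least-squares line stands in for the generalized additive model, the data are synthetic, and the names `fit_line` and `bagged_predict` are hypothetical. The sketch resamples the training set with replacement, fits one model per resample, and reports the mean and standard deviation of the ensemble predictions.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (assumed stand-in for the GAM base learner)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0  # degenerate resample: fall back to the mean
    return my - b * mx, b

def bagged_predict(xs, ys, x_new, n_models=50, seed=0):
    """Return (mean, standard deviation) of predictions from bootstrap-trained models."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_models):
        # Bootstrap sample: draw n indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        a, b = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        preds.append(a + b * x_new)
    return statistics.fmean(preds), statistics.stdev(preds)

# Synthetic illustration only: a noisy linear relation between a predictor and a response.
noise = random.Random(42)
predictor = [float(v) for v in range(10, 110, 5)]
response = [0.6 * v + noise.gauss(0.0, 5.0) for v in predictor]
mean_pred, sd_pred = bagged_predict(predictor, response, x_new=80.0)
print(f"ensemble mean = {mean_pred:.2f}, ensemble sd = {sd_pred:.2f}")
```

The spread of the ensemble predictions gives a per-location uncertainty measure of the kind summarized in the standard deviation table.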
Keywords: PM10 predictor; PM2.5; ensemble model; exposure estimation; kriging
Year: 2017 PMID: 28531151 PMCID: PMC5451999 DOI: 10.3390/ijerph14050549
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Study region (Shandong Province in China) with monitoring stations.
Figure 2. Daily averages of PM2.5 and PM10 over all the monitoring stations in 2014.
Figure 3. Temporal basis functions for Shandong Province of China: (a) The first temporal basis function; (b) The second temporal basis function.
Figure 4. Daily averages of the ratio of PM2.5 to PM10 over all the monitoring stations across 2014.
Variance explained for each predictive variable.
| Predictive Variable (Unit) | Variance Explained in the Univariate Model | Variance Explained in the Multivariate Model (without PM10) | Variance Explained in the Multivariate Model (Including PM10) |
|---|---|---|---|
| PM10 ( | 73.00% | - | 67.97% |
| Aerosol optical thickness (AOT) | 7.38% | 4.77% | 0.48% |
| Normalized difference vegetation index (NDVI) | 3.14% | 0.24% | 0.24% |
| Precipitation (kg/(m²·s)) | 1.75% | 0.02% | 0.02% |
| Temperature (°C) | 1.08% | 2.62% | 0.48% |
| Mean specific humidity (kg/kg) | 9.08% | 0.48% | 0.48% |
| Roadway length within the 10 km buffer of a monitoring station (m) | 2.62% | 2.86% | 1.45% |
| Shortest distance of roadway to a monitoring station (m) | 1.73% | 2.15% | 1.21% |
| Wind speed vector | 3.76% | 1.43% | 0.48% |
| Area proportion of the factories and mines, oil fields and stone-pit land-use | 2.29% | 2.15% | 0.73% |
| Area proportion of the forest land-use | 2.06% | 2.62% | 0.48% |
| Number of the emission plants | 4.48% | 1.67% | 0.97% |
| Shortest distance to the emission plants | 1.70% | 1.67% | 0.24% |
| The first temporal basis function | 37.00% | 26.71% | 4.84% |
| The second temporal basis function | 5.79% | 1.43% | 0.48% |
| Time (day of year) | 14.70% | 2.38% | 0.73% |
| Total | - | 53.20% | 81.30% |
Figure 5. The non-linear associations between each predictor and PM2.5: (a) Log-PM10 and PM2.5; (b) aerosol optical thickness (AOT) and PM2.5; (c) number of the emission plants and PM2.5; (d) wind and PM2.5; (e) precipitation and PM2.5; (f) temperature and PM2.5; (g) day of year and PM2.5.
Comparison of multiple models.
| Model | Use of Predictive Variables and Residual Kriging | R2 | CV a R2 | CV RMSE b ( |
|---|---|---|---|---|
| Model 1 | GAM without PM10 data and without residual kriging | 0.53 | 0.53 | 34.69 |
| Model 2 | GAM with PM10 data but without residual kriging | 0.81 | 0.81 | 21.87 |
| Model 3 | Bagging without PM10 data and without residual kriging | - | 0.53 | 34.79 |
| Model 4 | Bagging without PM10 data but with residual kriging | - | 0.86 | 18.85 |
| Model 5 | Bagging with PM10 data but without residual kriging | - | 0.82 | 21.82 |
| Model 6 | Bagging with PM10 data and residual kriging | - | 0.89 | 17.06 |
a CV: Cross Validation; b RMSE: Root Mean Squared Error.
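The two accuracy measures in the comparison table can be computed from paired observed and predicted values. The sketch below assumes R² is the coefficient of determination, 1 − SSE/SST; the source does not state which definition was used, and the function names are hypothetical.

```python
import math

def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def r_squared(obs, pred):
    """Coefficient of determination, 1 - SSE/SST (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst
```

For a cross-validated figure, the same functions would be applied to predictions made for held-out stations or days rather than to the fitted values.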
Summary of variogram parameters for Models 4 and 6.
| Model | Parameter | Minimum | 1st Qu. a | Median | Mean | 3rd Qu. a | Maximum |
|---|---|---|---|---|---|---|---|
| Model 4 | Range | 5551 | 63,780 | 93,050 | 107,100 | 137,000 | 712,900 |
| | Partial sill | 1.65 | 96.18 | 208 | 448.7 | 507 | 6560 |
| | Nugget | 20.74 | 97.12 | 159.6 | 260.4 | 284.3 | 3096 |
| Model 6 | Range | 4250 | 60,350 | 95,660 | 103,100 | 144,500 | 475,200 |
| | Partial sill | 0.0839 | 29.7 | 71.27 | 144.4 | 162.3 | 2866 |
| | Nugget | 0.0912 | 85.33 | 146.6 | 228.1 | 264.5 | 2650 |
a Qu.: Quartile.
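To show how the three parameters in the table shape a variogram, the sketch below evaluates a spherical model. The spherical form is an assumption made for illustration, since the functional form of the fitted variograms is not given here, and `spherical_variogram` is a hypothetical name. The example values are the Model 4 medians from the table.

```python
def spherical_variogram(h, nugget, partial_sill, range_):
    """Semivariance at lag distance h under a spherical model (assumed form)."""
    if h <= 0:
        return 0.0  # semivariance is zero at zero lag by definition
    if h >= range_:
        return nugget + partial_sill  # plateau (total sill) beyond the range
    ratio = h / range_
    return nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)

# Model 4 median parameters from the table above.
nugget, partial_sill, range_ = 159.6, 208.0, 93050.0
for lag in (0.0, 0.5 * range_, range_, 2.0 * range_):
    print(lag, spherical_variogram(lag, nugget, partial_sill, range_))
```

The nugget sets the discontinuity near zero lag, the partial sill sets how much spatially structured variance is added on top of it, and the range sets the distance beyond which residuals are treated as spatially uncorrelated.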
Figure 6. The temporal trends of the three variogram parameters for the residuals of Model 4: (a) the temporal trend of range; (b) the temporal trend of partial sill; (c) the temporal trend of nugget.
Summary of standard deviation.
| Model | Minimum | 1st Qu. a | Median | Mean | 3rd Qu. a | Max. |
|---|---|---|---|---|---|---|
| Models 3 and 4 | 0.43 | 1.33 | 1.80 | 2.18 | 2.54 | 21.43 |
| Models 5 and 6 | 0.26 | 0.79 | 1.19 | 1.47 | 1.76 | 26.81 |
a Qu.: Quartile.
Figure 7. The distribution of standard deviation across monitoring stations: (a) the distribution of standard deviation for Models 3 and 4; (b) the distribution of standard deviation for Models 5 and 6.