Chin-Yu Hsu, Yu-Ting Zeng, Yu-Cheng Chen, Mu-Jean Chen, Shih-Chun Candice Lung, Chih-Da Wu.
Abstract
This paper uses machine learning to refine a land-use regression (LUR) model and to estimate the spatiotemporal variation in BTEX (benzene, toluene, ethylbenzene, and xylenes) concentrations in Kaohsiung, Taiwan. A new LUR model is developed from Taiwanese Environmental Protection Agency (EPA) data on BTEX concentrations from 2015 to 2018, which include local emission sources arising from Asian cultural characteristics. The 2019 data were then used as external data to verify the reliability of the model. We used hybrid Kriging-land-use regression (Hybrid Kriging-LUR) models, geographically weighted regression (GWR), and two machine learning algorithms, random forest (RF) and extreme gradient boosting (XGBoost), for model development. Initially, the proposed Hybrid Kriging-LUR models explained 37% to 52% of the variation in each BTEX compound. Using a machine learning algorithm (XGBoost) increased the explanatory power of the models to between 61% and 79% for each BTEX compound. This study compared combinations of the Hybrid Kriging-LUR model with (i) GWR, (ii) RF, and (iii) XGBoost for estimating the spatiotemporal variation in BTEX concentration. The combination of Hybrid Kriging-LUR and the XGBoost algorithm performed better than the other integrated methods.
Keywords: culture-specific sources; hybrid Kriging-LUR model; nitrogen dioxide (NO2); spatiotemporal variations
Year: 2020 PMID: 32977562 PMCID: PMC7579284 DOI: 10.3390/ijerph17196956
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. The diurnal variation in the BTEX (benzene, toluene, ethylbenzene, and xylenes) concentrations for each season, averaged over 17 sampling sites.
Prediction variables for the Hybrid Kriging-LUR model. LUR—land-use regression; VIF—variance inflation factor.
| BTEX | Variable | Coefficient | p-value | Partial R² | VIF |
|---|---|---|---|---|---|
| Benzene | Intercept | 1.964 | <0.05 | - | - |
| | Benzene, Kriging-based | 0.223 | <0.05 | 0.006 | 1.395 |
| | Ultraviolet | −0.163 | <0.05 | 0.045 | 1.394 |
| | Rice farm, 150 m | 0.002 | <0.05 | 0.068 | 1.272 |
| | Harbor, nearest distance | −1.113 × 10⁻⁴ | <0.05 | 0.070 | 1.163 |
| | Industry, 500 m | 0.002 | <0.05 | 0.240 | 1.185 |
| Toluene | Intercept | −1.229 | <0.05 | - | - |
| | Toluene, Kriging-based | 0.581 | <0.05 | 0.061 | 2.366 |
| | Nitrogen oxides | 0.068 | <0.05 | 0.246 | 2.311 |
| | Water body, nearest distance | 5.966 × 10⁻⁴ | <0.05 | 0.001 | 1.412 |
| | Purely residential area, 250 m | 0.002 | <0.05 | 0.048 | 1.649 |
| | Sandstone field, 150 m | −0.005 | <0.05 | 0.058 | 1.102 |
| | Sandstone field, 2500 m | 0.002 | <0.05 | 0.002 | 1.257 |
| | Industry, 150 m | 6.208 × 10⁻⁴ | <0.05 | 0.025 | 1.406 |
| | All types of road (width), 50 m | 3.241 × 10⁻⁴ | 0.153 | 0.005 | 1.359 |
| | Temple, 250 m | 0.515 | <0.05 | 0.071 | 1.403 |
| Ethylbenzene | Intercept | −0.105 | 0.442 | - | - |
| | Ethylbenzene, Kriging-based | 0.072 | 0.239 | 0.007 | 1.097 |
| | SO2 | 0.094 | <0.05 | 0.032 | 1.342 |
| | Winter | 0.114 | <0.05 | 0.011 | 1.299 |
| | Industry, 250 m | 3.737 × 10⁻⁴ | <0.05 | 0.160 | 1.072 |
| | Temple, 250 m | 0.105 | <0.05 | 0.096 | 1.056 |
| | Sandstone field, 500 m | −3.224 × 10⁻⁴ | <0.05 | 0.010 | 1.928 |
| | Fruit orchard, 50 m | 6.428 × 10⁻⁴ | <0.05 | 0.038 | 1.635 |
| | Fruit orchard, 1500 m | 5.927 × 10⁻⁴ | 0.161 | 0.003 | 2.434 |
| m,p-Xylene | Intercept | −0.045 | 0.778 | - | - |
| | m,p-Xylene, Kriging-based | 0.432 | <0.05 | 0.041 | 1.062 |
| | Sandstone field, 150 m | −8.339 × 10⁻⁴ | 0.079 | 0.040 | 1.169 |
| | Funerary services, 1250 m | 0.003 | <0.05 | 0.011 | 1.041 |
| | Industry, 50 m | 6.963 × 10⁻⁴ | <0.05 | 0.075 | 1.516 |
| | Local road, 250 m | 16.121 | <0.05 | 0.010 | 1.518 |
| | Temple, 250 m | 0.364 | <0.05 | 0.248 | 1.042 |
BTEX = benzene, toluene, ethylbenzene, and xylenes.
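As a reading aid for the coefficient table, the sketch below evaluates the benzene row set as a plain linear combination (intercept plus coefficient times predictor value). It assumes an untransformed response and an additive linear form, which the record does not state explicitly; the dictionary keys, the function name `predict_benzene`, and the example site values are hypothetical, and predictor units are not given in the source.

```python
# Coefficients for the benzene Hybrid Kriging-LUR model, copied from the table.
# Key names are hypothetical labels chosen for this sketch.
BENZENE_COEFFICIENTS = {
    "intercept": 1.964,
    "benzene_kriging": 0.223,
    "ultraviolet": -0.163,
    "rice_farm_150m": 0.002,
    "harbor_nearest_distance": -1.113e-4,
    "industry_500m": 0.002,
}


def predict_benzene(predictors):
    """Return intercept + sum(coefficient * predictor value).

    Assumes a linear, untransformed model; the source does not confirm this.
    """
    estimate = BENZENE_COEFFICIENTS["intercept"]
    for name, coefficient in BENZENE_COEFFICIENTS.items():
        if name == "intercept":
            continue
        estimate += coefficient * predictors[name]
    return estimate


# Hypothetical predictor values, chosen only to exercise the function.
example_site = {
    "benzene_kriging": 2.0,
    "ultraviolet": 5.0,
    "rice_farm_150m": 0.0,
    "harbor_nearest_distance": 1000.0,
    "industry_500m": 100.0,
}

print(round(predict_benzene(example_site), 4))
```

With these made-up inputs, the negative harbor-distance coefficient lowers the estimate as distance grows, while the industry term raises it, which matches the signs reported in the table.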
Performance of the Hybrid Kriging-LUR, GWR-Hybrid LUR, RF-Hybrid LUR and XGBoost-Hybrid LUR models. GWR = geographically weighted regression; LUR = land-use regression; RF = random forest; XGBoost = extreme gradient boosting; RMSE = root mean square error.
| BTEX | Statistic | Hybrid Kriging-LUR | GWR-Hybrid LUR | RF-Hybrid LUR | XGBoost-Hybrid LUR |
|---|---|---|---|---|---|
| Benzene | R² (training, testing) | 0.45 (0.43, 0.55) | 0.47 (0.46, 0.45) | 0.57 (0.59, 0.42) | 0.63 (0.65, 0.53) |
| | Adjusted R² | 0.45 (0.42, 0.54) | 0.47 (0.46, 0.44) | 0.56 (0.59, 0.38) | 0.63 (0.64, 0.50) |
| | RMSE (training, testing) | 1.24 (1.29, 1.06) | 1.22 (1.23, 0.44) | 1.10 (1.11, 1.04) | 1.02 (1.01, 1.03) |
| Toluene | R² (training, testing) | 0.52 (0.52, 0.56) | 0.54 (0.52, 0.60) | 0.69 (0.70, 0.63) | 0.72 (0.74, 0.60) |
| | Adjusted R² | 0.52 (0.51, 0.56) | 0.54 (0.52, 0.59) | 0.68 (0.69, 0.59) | 0.71 (0.73, 0.56) |
| | RMSE (training, testing) | 1.35 (1.42, 1.10) | 1.33 (1.32, 1.36) | 1.09 (1.07, 1.16) | 1.03 (1.03, 1.16) |
| Ethylbenzene | R² (training, testing) | 0.37 (0.36, 0.49) | 0.38 (0.31, 0.23) | 0.50 (0.50, 0.45) | 0.61 (0.62, 0.54) |
| | Adjusted R² | 0.37 (0.34, 0.49) | 0.38 (0.31, 0.22) | 0.49 (0.49, 0.40) | 0.61 (0.61, 0.50) |
| | RMSE (training, testing) | 0.31 (0.33, 0.23) | 0.31 (0.32, 0.17) | 0.28 (0.29, 0.22) | 0.60 (0.25, 0.22) |
| m,p-Xylene | R² (training, testing) | 0.42 (0.42, 0.43) | 0.44 (0.40, 0.29) | 0.77 (0.77, 0.77) | 0.79 (0.79, 0.79) |
| | Adjusted R² | 0.42 (0.41, 0.42) | 0.44 (0.40, 0.29) | 0.77 (0.77, 0.77) | 0.79 (0.79, 0.77) |
| | RMSE (training, testing) | 0.70 (0.72, 0.67) | 0.69 (0.72, 0.27) | 0.44 (0.41, 0.44) | 0.42 (0.36, 0.61) |
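For readers unfamiliar with the three statistics in the performance table, the following is a minimal sketch of how R², adjusted R², and RMSE are commonly computed. It assumes R² is the coefficient of determination (1 − SS_res / SS_tot) and the usual adjusted-R² correction for the number of predictors; the record does not say which definitions the authors used, and the observed and predicted values are made up for illustration.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalise R² for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


# Hypothetical observations and predictions, not taken from the study.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = r_squared(observed, predicted)
print(r2, adjusted_r_squared(r2, len(observed), 1), rmse(observed, predicted))
```

Because adjusted R² shrinks as predictors are added without a matching gain in fit, it is never larger than R² for a model with at least one predictor, which is the pattern visible in the table.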
External data validation for the proposed models.
| BTEX | Statistic | Hybrid Kriging-LUR | GWR-Hybrid LUR | RF-Hybrid LUR | XGBoost-Hybrid LUR |
|---|---|---|---|---|---|
| Benzene | R² | 0.52 | 0.52 | 0.44 | 0.41 |
| | Adjusted R² | 0.52 | 0.52 | 0.43 | 0.40 |
| | RMSE | 0.29 | 0.29 | 0.31 | 0.80 |
| Toluene | R² | 0.65 | 0.58 | 0.56 | 0.55 |
| | Adjusted R² | 0.64 | 0.58 | 0.55 | 0.54 |
| | RMSE | 0.81 | 0.88 | 0.90 | 0.91 |
| Ethylbenzene | R² | 0.47 | 0.43 | 0.42 | 0.45 |
| | Adjusted R² | 0.47 | 0.42 | 0.41 | 0.44 |
| | RMSE | 0.15 | 0.16 | 0.16 | 0.16 |
| m,p-Xylene | R² | 0.34 | 0.28 | 0.51 | 0.52 |
| | Adjusted R² | 0.34 | 0.27 | 0.51 | 0.52 |
| | RMSE | 0.24 | 0.25 | 0.23 | 0.19 |
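The external validation table can be scanned programmatically to see which model has the highest R² for each compound. The sketch below does only that; the R² values are copied from the table, while the dictionary layout and the helper name `best_models` are hypothetical. Ties are reported rather than broken, and no claim is made here about statistical significance.

```python
# External-validation R² values copied from the table above.
EXTERNAL_R2 = {
    "Benzene": {
        "Hybrid Kriging-LUR": 0.52,
        "GWR-Hybrid LUR": 0.52,
        "RF-Hybrid LUR": 0.44,
        "XGBoost-Hybrid LUR": 0.41,
    },
    "Toluene": {
        "Hybrid Kriging-LUR": 0.65,
        "GWR-Hybrid LUR": 0.58,
        "RF-Hybrid LUR": 0.56,
        "XGBoost-Hybrid LUR": 0.55,
    },
    "Ethylbenzene": {
        "Hybrid Kriging-LUR": 0.47,
        "GWR-Hybrid LUR": 0.43,
        "RF-Hybrid LUR": 0.42,
        "XGBoost-Hybrid LUR": 0.45,
    },
    "m,p-Xylene": {
        "Hybrid Kriging-LUR": 0.34,
        "GWR-Hybrid LUR": 0.28,
        "RF-Hybrid LUR": 0.51,
        "XGBoost-Hybrid LUR": 0.52,
    },
}


def best_models(scores):
    """Return every model whose R² equals the maximum (ties are kept)."""
    top = max(scores.values())
    return [model for model, value in scores.items() if value == top]


for compound, scores in EXTERNAL_R2.items():
    print(compound, best_models(scores))
```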
Figure 2. Monthly average concentration of BTEX: (a) benzene, (b) toluene, (c) ethylbenzene, and (d) m,p-xylene.