Jiaoyan Chen, Huajun Chen, Zhaohui Wu, Daning Hu, Jeff Z Pan.
Abstract
Smog disasters are becoming increasingly frequent and can have severe consequences for the environment and public health, especially in urban areas. Social media, as a real-time urban data source, has become an increasingly effective channel for observing people's reactions to smog-related health hazards. It can be used to capture possible smog-related public health disasters at an early stage. We propose a predictive analytic approach that uses both social media and physical sensor data to forecast the next-day smog-related health hazard. First, we model smog-related health hazards and smog severity by mining raw microblogging text and network information diffusion data. Second, we develop an artificial neural network (ANN)-based model that forecasts the smog-related health hazard from the current health hazard and smog severity observations. We evaluate the performance of the approach against alternative machine learning methods. To the best of our knowledge, we are the first to integrate social media and physical sensor data for smog-related health hazard forecasting. The empirical findings can help researchers better understand the non-linear relationships between current smog observations and the next-day health hazard. In addition, this forecasting approach can provide decision support for smog-related health hazard management through functions such as early warning.
Keywords: Data mining; Forecasting; Health hazard; Smog disaster; Social media; Urban data
Year: 2016 PMID: 32287937 PMCID: PMC7127716 DOI: 10.1016/j.is.2016.03.011
Source DB: PubMed Journal: Inf Syst ISSN: 0306-4379 Impact factor: 2.309
Fig. 1. Predicting smog-related health hazard with social media and physical sensor data.
Fig. 2. The proposed predictive analytics framework for smog-related health hazard forecasting.
The phrases used in Weibo about smog-related health hazard and smog severity.
(a) Smog-related health hazard phrases
| Type | Phrases |
|---|---|
| Nose | |
| Eye | |
| Throat | |
| Respiratory | |
| Heart | |
| Others | |
Fig. 3. The model and its inputs for smog-related health hazard forecasting.
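The abstract describes forecasting the next-day health hazard from the current day's observations. As a minimal sketch of that pairing, assuming daily records keyed by the PHI, SSI and AQI names that appear in the figure captions, a supervised training set could be assembled roughly as follows. The helper `make_next_day_pairs`, the dictionary layout and the sample values are hypothetical and are not taken from the paper; the exact inputs of the authors' model are not listed here.

```python
from typing import Dict, List, Sequence, Tuple


def make_next_day_pairs(
    days: Sequence[Dict[str, float]],
    feature_keys: Sequence[str],
    target_key: str,
) -> Tuple[List[List[float]], List[float]]:
    """Pair the features of day t with the target value of day t+1.

    `days` must be ordered chronologically with one record per day.
    """
    inputs: List[List[float]] = []
    targets: List[float] = []
    # zip stops at the shorter sequence, so the last day has no target
    # and is used only as "tomorrow" for the day before it.
    for today, tomorrow in zip(days, days[1:]):
        inputs.append([today[key] for key in feature_keys])
        targets.append(tomorrow[target_key])
    return inputs, targets


# Made-up normalized daily values, for illustration only.
daily = [
    {"PHI": 0.20, "SSI": 0.30, "AQI": 0.40},
    {"PHI": 0.50, "SSI": 0.60, "AQI": 0.70},
    {"PHI": 0.35, "SSI": 0.40, "AQI": 0.45},
]
X, y = make_next_day_pairs(daily, ["PHI", "SSI", "AQI"], "PHI")
```

Any regressor (an ANN, an SVR or a random forest) can then be fitted on `X` and `y`; restricting `feature_keys` to a subset is one way to compare feature sets.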
Fig. 4. Normalized daily PHI, D-PHI, SSI, D-SSI, AQI and records in Shanghai and Beijing.
Test errors (RMSEs) of the health hazard prediction model under different training methods and feature sets. ELM-ANN denotes a single-hidden-layer ANN trained with ELM (extreme learning machine), and BP-ANN denotes a multi-hidden-layer ANN trained with BP (backpropagation). P and S denote physical sensor features and social media features, respectively; the prefix D- indicates that network diffusion is also taken into account.
| City | ELM-ANN | | | | BP-ANN | | | | nu-SVR | | | | epsilon-SVR | | | | Random Forest | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | P | S | PS | PDS | P | S | PS | PDS | P | S | PS | PDS | P | S | PS | PDS | P | S | PS | PDS |
| Beijing | .085 | .096 | . | .086 | .093 | . | .068 | .074 | .099 | .069 | .068 | .086 | .100 | .080 | .074 | .095 | .099 | .078 | .077 | |
| Tianjin | .091 | .107 | . | .089 | .105 | .077 | .072 | .081 | .110 | .078 | .075 | .085 | .110 | .081 | .073 | .092 | .109 | .082 | .079 | |
| Shijiazhuang | .091 | .105 | .072 | .081 | .089 | .103 | . | .084 | .093 | .112 | .081 | .086 | .084 | .113 | .078 | .094 | .088 | .110 | .077 | |
| Shanghai | .092 | .096 | . | .063 | .091 | .098 | .069 | .089 | .107 | .078 | .071 | .081 | .108 | .075 | .078 | .089 | .099 | .070 | .066 | |
| Hangzhou | .102 | .117 | . | .102 | .118 | .105 | .075 | .122 | .120 | .108 | .078 | .119 | .118 | .112 | .080 | .105 | .103 | .092 | .081 | |
| Nanjing | .104 | .086 | . | .103 | .086 | .081 | .074 | .094 | .087 | .085 | .075 | .097 | .084 | .083 | .079 | .092 | .095 | .079 | .073 | |
| Wuhan | .101 | .125 | .084 | .100 | .122 | .085 | .076 | .102 | .121 | .086 | .077 | .104 | .125 | .088 | .077 | .099 | .119 | . | .079 | |
| Guangzhou | .099 | .096 | .088 | .079 | .096 | .100 | . | .079 | .118 | .125 | .102 | .087 | .112 | .125 | .106 | .101 | .105 | . | ||
| Average | .096 | .103 | . | .095 | .103 | .079 | .073 | .097 | .110 | .086 | .077 | .096 | .110 | .088 | .079 | .095 | .104 | .081 | .076 | |
Underlining marks the best PDS value in each row, and bold marks the best PS value in each row.
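The values in the table are root-mean-square errors on test data. As a brief sketch of how such a score can be computed from a model's predictions and the observed next-day values, the function below follows the standard RMSE definition. The function name and the sample numbers are illustrative assumptions and are not drawn from the paper's data.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error between two equally long value sequences."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one value is required")
    # Average the squared differences, then take the square root.
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up normalized predictions and observations, for illustration only.
example_score = rmse([0.5, 0.5], [0.4, 0.6])
```

Because RMSE is expressed in the units of the target, scores are directly comparable across methods only when every method is evaluated on the same normalized target.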
Fig. 5. RMSEs under different AQI ranges with physical sensor features (P) and social media features (S/DS).
Fig. 6. Relative errors under different AQI ranges with physical sensor features (P) and social media features (S/DS).
Fig. 7. Results of smog-related health hazard (PHI) forecasting for Beijing and Shanghai.