Siqing Shan, Xijie Ju, Yigang Wei, Zijin Wang.
Abstract
PM2.5 not only harms physical health but also negatively affects the public's wellbeing and cognitive and behavioral patterns. However, traditional air quality assessments may fail to provide comprehensive, real-time monitoring because air quality monitoring stations are sparsely distributed. Web-based social media platforms, such as Twitter, Weibo, and Facebook, overcome some key limitations of traditional surface monitoring data and provide a promising tool and novel perspective for environmental monitoring, prediction, and evaluation. This study investigates the relationship between PM2.5 levels and people's emotional intensity by observing social media postings. It defines an "emotional intensity" indicator, measured by the number of negative posts on Weibo, based on haze-related Weibo data from 2016 and 2017. Sentiment polarity is estimated using a recurrent neural network model based on LSTM (Long Short-Term Memory), and the correlation between high PM2.5 levels and negative posts on Weibo is verified using the Pearson correlation coefficient and a multiple linear regression model. The study makes the following observations: (1) Over the two-year sample, PM2.5 levels had a significant influence on netizens' posting behavior. (2) Air quality, meteorological factors, the seasons, and other factors have a strong influence on netizens' emotional intensity. (3) Quantitatively, a 1-unit change in the PM2.5 level is associated with a change of 1.0168 units in the number of negative Weibo posts. Thus, it can be concluded that netizens' emotional intensity is significantly positively affected by PM2.5 levels. The high correlation between PM2.5 levels and emotional intensity and the sensitivity of social media data show that social media data can provide a new perspective on the assessment of air quality.
Keywords: PM2.5; machine learning; sentiment analysis; social media data
Year: 2021 PMID: 34069467 PMCID: PMC8159131 DOI: 10.3390/ijerph18105422
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Distribution map of cities with national air monitoring sites (2018).
Figure 2. Analytical framework for using social media data for PM2.5 research.
Figure 3. Implementation of the SVM classifier.
Figure 4. Diagram of the LSTM model.
Correlation between PM2.5 levels and various statistical indicators (N = 113459).
| Microblog Statistical Indicators | Pearson Correlation Coefficient | Statistical Significance |
|---|---|---|
| Number of positive microblogs | 0.589 ** | <0.01 |
| Number of negative microblogs | 0.667 ** | <0.01 |
| Total number of microblogs | 0.625 ** | <0.01 |
| Positive microblogs’ share | −0.254 ** | <0.01 |
| Negative microblogs’ share | 0.254 ** | <0.01 |
Note: ** The correlation is significant at the 0.01 level (two-tailed test). Pearson correlation coefficient: (0.8, 1.0) very strongly correlated; (0.6, 0.8) strongly correlated; (0.4, 0.6) moderately correlated; (0.2, 0.4) weakly correlated; (0, 0.2) very weakly correlated or not correlated.
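The coefficient and the strength bands in the note above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `pearson_r` and `strength_label` are hypothetical, the handling of values that fall exactly on a band boundary is an assumption (the note uses open intervals), and applying the bands to the absolute value of a negative coefficient is also an assumption.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def strength_label(r):
    """Map a coefficient to the bands in the table note.

    Assumption: bands apply to |r|, and a value exactly on a boundary
    falls into the lower band.
    """
    a = abs(r)
    if a > 0.8:
        return "very strongly correlated"
    if a > 0.6:
        return "strongly correlated"
    if a > 0.4:
        return "moderately correlated"
    if a > 0.2:
        return "weakly correlated"
    return "very weakly correlated or not correlated"


if __name__ == "__main__":
    # Coefficients taken from the table above.
    for name, r in [("negative microblogs", 0.667),
                    ("positive microblogs", 0.589),
                    ("positive share", -0.254)]:
        print(name, r, strength_label(r))
```

Under these bands, the reported 0.667 for the number of negative microblogs reads as a strong correlation, while the ±0.254 for the shares reads as weak.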
Figure 5. The number of negative microblogs and concentration levels of PM2.5.
Correlation between PM2.5 levels and the number of negative microblogs (N = 113459).
| Month | Pearson Correlation Coefficient | Statistical Significance |
|---|---|---|
| In 2016 | | |
| January | 0.752 ** | <0.01 |
| February | 0.856 ** | <0.01 |
| March | 0.930 ** | <0.01 |
| April | 0.689 ** | <0.01 |
| May | 0.732 ** | <0.01 |
| June | 0.602 ** | <0.01 |
| July | 0.626 ** | <0.01 |
| August | 0.461 ** | <0.01 |
| September | 0.734 ** | <0.01 |
| October | 0.813 ** | <0.01 |
| November | 0.711 ** | <0.01 |
| December | 0.771 ** | <0.01 |
| In 2017 | | |
| January | 0.616 ** | <0.01 |
| February | 0.887 ** | <0.01 |
| March | 0.829 ** | <0.01 |
| April | 0.841 ** | <0.01 |
| May | 0.927 ** | <0.01 |
| June | 0.527 ** | <0.01 |
| July | 0.566 ** | <0.01 |
| August | 0.610 ** | <0.01 |
| September | 0.804 ** | <0.01 |
| October | 0.846 ** | <0.01 |
| November | 0.699 ** | <0.01 |
| December | 0.823 ** | <0.01 |
Note: ** The correlation is significant at the 0.01 level (two-tailed test).
Figure 6. The trend of negative microblogs and concentration levels of PM2.5 for every month during 2016 and 2017.
Figure 7. The trend of negative microblogs and concentration levels of PM2.5 in August 2016.
Test results of correlation between PM2.5 levels and people’s emotional intensity.
| Explanatory Variable | Explained Variable | | | |
|---|---|---|---|---|
| | Number of Negative Weibo Posts | | Share of Negative Weibo Posts | |
| | Model 1 | Model 2 | Model 3 | Model 4 |
| Weekend | 5.8296 | 3.8199 | −0.0289 *** | −0.0294 *** |
| | (6.5029) | (5.2182) | (0.0071) | (0.0071) |
| Holiday | −22.1751 ** | −44.2820 *** | −0.0270 ** | −0.0323 *** |
| | (9.1351) | (9.0503) | (0.0109) | (0.0111) |
| Temperature | 0.1088 | 3.3306 *** | −0.0046 *** | −0.0038 *** |
| | (1.1620) | (1.0039) | (0.0011) | (0.0011) |
| Square of temperature | −0.1203 *** | −0.1350 *** | 0.0001 *** | 0.0001 *** |
| | (0.0327) | (0.0268) | (0.0000) | (0.0000) |
| Humidity | 2.3040 *** | 0.5547 *** | 0.0009 *** | 0.0005 ** |
| | (0.2952) | (0.1722) | (0.0002) | (0.0002) |
| Precipitation | −1.5675 *** | −0.2797 * | −0.0002 | 0.0001 |
| | (0.5098) | (0.1575) | (0.0006) | (0.0006) |
| Sea level pressure | −1.2848 ** | 1.0594 * | −0.0018 ** | −0.0012 |
| | (0.6455) | (0.5485) | (0.0007) | (0.0007) |
| Wind speed | 0.0508 | 2.7047 *** | −0.0033 *** | −0.0027 *** |
| | (0.6709) | (0.6352) | (0.0007) | (0.0007) |
| Major event | −0.2190 | 0.7872 | −0.0150 ** | −0.0148 ** |
| | (6.5950) | (5.2985) | (0.0076) | (0.0074) |
| Spring | −46.7479 *** | −58.0878 *** | 0.0091 | 0.0064 |
| | (12.5696) | (10.9529) | (0.0111) | (0.0109) |
| Summer | −78.2494 *** | −38.8843 *** | −0.0422 ** | −0.0327 * |
| | (15.2589) | (11.2806) | (0.0175) | (0.0172) |
| Autumn | −41.8634 *** | −11.4606 | −0.0220 ** | −0.0146 |
| | (14.8600) | (11.0593) | (0.0093) | (0.0094) |
| PM2.5 | | 1.0168 *** | | 0.0002 *** |
| | | (0.0935) | | (0.0000) |
| R² | 0.3343 | 0.5792 | 0.1368 | 0.1509 |
Note: The values in the table are the regression coefficients of the model variables, with the standard error of each coefficient in parentheses. ***, **, * indicate significance at the 0.01, 0.05 and 0.1 levels, respectively. Winter is the reference category for the season variables.
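The significance stars can be cross-checked from each coefficient and its standard error. The sketch below is an illustration, not the authors' procedure. It assumes a two-sided test and uses a standard normal approximation to the t distribution, which is reasonable only for large samples. The function names `z_statistic`, `two_sided_p` and `stars` are hypothetical.

```python
import math


def z_statistic(coef, se):
    """Ratio of a regression coefficient to its standard error."""
    if se <= 0:
        # Standard errors printed as (0.0000) are rounded and cannot be used here.
        raise ValueError("standard error must be positive")
    return coef / se


def two_sided_p(z):
    """Two-sided p-value under a standard normal approximation (assumption)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def stars(coef, se):
    """Significance marker using the 0.01 / 0.05 / 0.1 thresholds from the note."""
    p = two_sided_p(z_statistic(coef, se))
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


if __name__ == "__main__":
    # (label, coefficient, standard error) copied from the table above.
    rows = [
        ("PM2.5, number of negative posts", 1.0168, 0.0935),
        ("Holiday, Model 1", -22.1751, 9.1351),
        ("Precipitation, Model 2", -0.2797, 0.1575),
        ("Weekend, Model 1", 5.8296, 6.5029),
    ]
    for label, coef, se in rows:
        print(f"{label}: z = {z_statistic(coef, se):.2f} {stars(coef, se)}")
```

For the PM2.5 coefficient, 1.0168 / 0.0935 is roughly 10.9, far beyond the 0.01 threshold, which is consistent with the three stars reported.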
Regression results grouped by season.
| Explanatory Variable | Explained Variable: Number of Negative Weibo Posts | | | |
|---|---|---|---|---|
| | Spring Sample | Summer Sample | Autumn Sample | Winter Sample |
| | Model 1 | Model 2 | Model 3 | Model 4 |
| Weekend | −2.2482 | 0.0871 | 6.2589 | −8.4189 |
| | (2.6921) | (2.1938) | (11.2871) | (13.5364) |
| Holiday | −5.6358 | −1.2324 | −47.9146 *** | −67.8414 *** |
| | (3.6301) | (3.0269) | (13.7614) | (20.1369) |
| Temperature | 0.6360 | 8.0492 ** | 7.9258 *** | 5.6999 ** |
| | (1.0657) | (3.3116) | (2.2704) | (2.3145) |
| Square of temperature | −0.0424 | −0.1510 ** | −0.3072 *** | −0.2588 |
| | (0.0296) | (0.0656) | (0.0700) | (0.2622) |
| Humidity | −0.1998 * | 0.1838 *** | −0.3889 | 2.5306 *** |
| | (0.1091) | (0.0656) | (0.4263) | (0.7840) |
| Precipitation | 0.4642 * | −0.0656 | 1.9974 * | −20.4136 *** |
| | (0.2790) | (0.0478) | (1.0437) | (7.0949) |
| Sea level pressure | 0.6597 ** | 0.4025 ** | 1.2726 | 2.8981 ** |
| | (0.3241) | (0.2010) | (1.2203) | (1.3610) |
| Wind speed | 0.4996 | 1.2986 ** | 5.2688 *** | 4.3613 *** |
| | (0.3360) | (0.6216) | (1.2799) | (1.2222) |
| Major event | 0.6764 | −0.4948 | 6.8629 | 1.5558 |
| | (4.9188) | (1.7036) | (11.7907) | (18.1195) |
| PM2.5 | 0.5568 *** | 0.1989 *** | 1.8648 *** | 0.8980 *** |
| | (0.0430) | (0.0558) | (0.2196) | (0.1695) |
| R² | 0.7682 | 0.3350 | 0.6598 | 0.5936 |
Note: The values in the table are the regression coefficients of the model variables, with the standard error of each coefficient in parentheses. ***, **, * indicate significance at the 0.01, 0.05 and 0.1 levels, respectively. Winter is the default variable.
Regression results grouped by holiday and weekend.
| Explanatory Variable | Explained Variable: Number of Negative Weibo Posts | | | |
|---|---|---|---|---|
| | Holiday Sample | Non-Holiday Sample | Weekend Sample | Non-Weekend Sample |
| | Model 1 | Model 2 | Model 3 | Model 4 |
| Weekend | 18.0618 * | 5.5692 | | |
| | (10.7563) | (5.6416) | | |
| Holiday | | | −46.7156 *** | −32.3766 *** |
| | | | (17.3686) | (8.4650) |
| Temperature | −0.6179 | 0.4666 | 3.5571 ** | −0.2817 |
| | (1.3368) | (0.8598) | (1.5658) | (0.9612) |
| Square of temperature | 0.1149 ** | −0.0763 *** | −0.1586 *** | −0.0434 * |
| | (0.0501) | (0.0212) | (0.0429) | (0.0238) |
| Humidity | 1.0464 *** | 1.1335 *** | 1.3154 *** | 1.0190 *** |
| | (0.2814) | (0.1753) | (0.3307) | (0.1747) |
| Precipitation | −2.7737 | −0.5381 *** | −1.1821 ** | −0.4681 ** |
| | (1.8766) | (0.2042) | (0.5630) | (0.1972) |
| Sea level pressure | 3.5770 *** | 1.3219 ** | 2.1149 * | 1.4178 ** |
| | (0.9727) | (0.5838) | (1.2398) | (0.6250) |
| Wind speed | 0.2875 | 3.2390 *** | 3.7215 *** | 2.7884 *** |
| | (0.9436) | (0.7243) | (1.2767) | (0.7722) |
| Major event | −25.5973 | −3.0546 | −12.4966 * | 3.7993 |
| | (15.7975) | (5.7231) | (7.1763) | (6.9226) |
| PM2.5 | 0.5046 *** | 1.0451 *** | 0.9978 *** | 0.9540 *** |
| | (0.0983) | (0.0978) | (0.1704) | (0.1075) |
| R² | 0.7013 | 0.5668 | 0.5537 | 0.5628 |
Note: A blank cell indicates that the corresponding variable was dropped because of how the sample was split. The values in the table are the regression coefficients of the model variables, with the standard error of each coefficient in parentheses. ***, **, * indicate significance at the 0.01, 0.05 and 0.1 levels, respectively. Winter is the default variable.