
Design of a Spark Big Data Framework for PM2.5 Air Pollution Forecasting.

Dong-Her Shih, Thi Hien To, Ly Sy Phu Nguyen, Ting-Wei Wu, Wen-Ting You.

Abstract

In recent years, rapid economic development has been accompanied by severe air pollution, with negative effects on health, the environment, and medical costs. PM2.5 is one of the main components of air pollution, so knowing PM2.5 levels in advance is important for protecting health. Many air quality studies rely on the government's official monitoring stations, which cannot be widely deployed because of their high cost. Furthermore, government stations update only once an hour, which makes it hard to capture short-term PM2.5 concentration peaks that arrive with little warning. Short-interval data from many stations, however, is huge in volume and complex to calculate, analyze, and predict. Distributed processing alleviates the high computational requirements of the original predictor, which makes Spark suitable for the problem considered. This study proposes a real-time PM2.5 prediction architecture based on the Spark big data framework to handle the large volume of data from the LASS community. The proposed framework is divided into three modules. It collects real-time PM2.5 data and performs ensemble learning with three machine learning algorithms (Linear Regression, Random Forest, Gradient Boosting Decision Tree) to predict the PM2.5 concentration over the next 30 to 180 min, with an accompanying visualization. The experimental results show that the proposed Spark ensemble prediction model performs best for the next-30-min prediction (R² up to 0.96), and that the ensemble model outperforms any single machine learning model. Taiwan has suffered from relatively poor air quality for a long time. Air pollutant monitoring data from the LASS community can provide broader monitoring coverage, but the data is large and difficult to integrate or analyze. The proposed Spark big data framework can provide short-term PM2.5 forecasts and help decision-makers take proper action immediately.
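
The ensemble idea and the R² metric mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the three base models are combined, so simple averaging is assumed here. The function names and all numeric values are hypothetical and are not taken from the study. Plain Python stands in for the Spark pipeline.

```python
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def ensemble_average(*model_predictions):
    """Average the per-sample predictions of several base models.

    Averaging is an assumed combiner; the abstract does not specify one.
    """
    return [mean(sample) for sample in zip(*model_predictions)]


# Hypothetical PM2.5 readings 30 min ahead (ug/m3); illustrative only.
observed = [12.0, 18.0, 25.0, 31.0, 22.0]

# Hypothetical outputs of the three base learners named in the abstract.
pred_lr = [14.0, 17.0, 23.0, 28.0, 24.0]    # Linear Regression
pred_rf = [11.0, 20.0, 26.0, 33.0, 21.0]    # Random Forest
pred_gbdt = [12.5, 18.5, 27.0, 30.0, 20.0]  # Gradient Boosting Decision Tree

ensemble = ensemble_average(pred_lr, pred_rf, pred_gbdt)

scores = {
    "linear_regression": r2_score(observed, pred_lr),
    "random_forest": r2_score(observed, pred_rf),
    "gbdt": r2_score(observed, pred_gbdt),
    "ensemble": r2_score(observed, ensemble),
}

for name, score in scores.items():
    print(f"{name}: R2 = {score:.4f}")
```

In this toy data the base models err in different directions, so averaging cancels part of the error and the ensemble scores a higher R² than any single model. Real data need not behave this way.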


Keywords:  PM2.5 predictions; Spark; air pollution; big data; ensemble model; machine learning

Year:  2021        PMID: 34281023      PMCID: PMC8296958          DOI: 10.3390/ijerph18137087

Source DB:  PubMed          Journal:  Int J Environ Res Public Health        ISSN: 1660-4601            Impact factor:   3.390

