Ground-level ozone prediction by support vector machine approach with a cost-sensitive classification scheme.

Wei-Zhen Lu, Dong Wang.

Abstract

For ground-level ozone (O3) prediction, public authorities favor a model that performs reliably not only on non-polluted days but, more importantly, on polluted days, so that they can issue alerts and concerned citizens and industrial organizations can take precautions to avoid exposure and reduce harmful emissions. However, the class imbalance problem, i.e., in some collected field data the number of O3-polluted days is much smaller than the number of non-polluted days, degrades model performance on the minority class (O3-polluted days). Although the support vector machine (SVM) has obtained promising results in air quality prediction, this study proposes a cost-sensitive classification scheme for the standard support vector classification model (S-SVC) in order to investigate whether class imbalance affects S-SVC. The S-SVC with this scheme is named CS-SVC. Experiments on imbalanced data sets collected from two air quality monitoring sites in Hong Kong show that 1) S-SVC is still sensitive to the class imbalance problem; 2) compared with S-SVC, CS-SVC effectively avoids the class imbalance problem, with a lower percentage of false negatives on O3-polluted days but a higher percentage of false positives on non-polluted days; 3) the support vector regression model (SVR), after its output is converted to a binary one, performs only similarly to S-SVC, which indicates that the class imbalance problem also impairs the regression model. From the standpoint of protecting public health, CS-SVC, which is less likely to miss ozone-polluted days in its forecasts, is recommended.
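The abstract does not give the exact form of the cost-sensitive scheme, so the following is only a minimal sketch of the general idea, assuming the common formulation in which hinge-loss errors on the minority class (polluted days) carry a larger penalty than errors on the majority class. It also shows how a continuous output, such as a regression forecast, might be converted to a binary label and scored by false negative and false positive percentages. The function names, the cost values, and the threshold are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

POLLUTED = 1       # minority class (assumed label encoding)
NON_POLLUTED = -1  # majority class (assumed label encoding)


def weighted_hinge_loss(scores: Sequence[float], labels: Sequence[int],
                        cost_pos: float, cost_neg: float) -> float:
    """Sum of hinge losses, each scaled by a class-dependent cost.

    With cost_pos > cost_neg, margin violations on polluted days are
    penalized more heavily, which is the usual cost-sensitive idea.
    """
    total = 0.0
    for score, label in zip(scores, labels):
        cost = cost_pos if label == POLLUTED else cost_neg
        total += cost * max(0.0, 1.0 - label * score)
    return total


def binarize(values: Sequence[float], threshold: float) -> List[int]:
    """Convert continuous outputs to labels using a hypothetical threshold."""
    return [POLLUTED if v > threshold else NON_POLLUTED for v in values]


def error_rates(predicted: Sequence[int],
                actual: Sequence[int]) -> Tuple[float, float]:
    """Return (false negative % on polluted days,
    false positive % on non-polluted days)."""
    n_pos = sum(1 for a in actual if a == POLLUTED)
    n_neg = sum(1 for a in actual if a == NON_POLLUTED)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    fn = sum(1 for p, a in zip(predicted, actual)
             if a == POLLUTED and p == NON_POLLUTED)
    fp = sum(1 for p, a in zip(predicted, actual)
             if a == NON_POLLUTED and p == POLLUTED)
    return 100.0 * fn / n_pos, 100.0 * fp / n_neg


# Toy decision scores and labels, invented purely for illustration.
scores = [0.5, -0.2, -1.5, -0.3]
labels = [POLLUTED, POLLUTED, NON_POLLUTED, NON_POLLUTED]

equal_cost = weighted_hinge_loss(scores, labels, cost_pos=1.0, cost_neg=1.0)
high_pos_cost = weighted_hinge_loss(scores, labels, cost_pos=5.0, cost_neg=1.0)
fn_pct, fp_pct = error_rates(binarize(scores, threshold=0.0), labels)
```

Raising `cost_pos` relative to `cost_neg` makes a missed polluted day more expensive during training, which is consistent with the trade-off reported above: fewer false negatives at the price of more false positives.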

Year:  2008        PMID: 18329697     DOI: 10.1016/j.scitotenv.2008.01.035

Source DB:  PubMed          Journal:  Sci Total Environ        ISSN: 0048-9697            Impact factor:   7.963


  4 in total

1.  Improving predictions in imbalanced data using Pairwise Expanded Logistic Regression.

Authors:  Xiaoqian Jiang; Robert El-Kareh; Lucila Ohno-Machado
Journal:  AMIA Annu Symp Proc       Date:  2011-10-22

2.  Observational studies and a statistical early warning of surface ozone pollution in Tangshan, the largest heavy industry city of North China.

Authors:  Pei Li; Jinyuan Xin; Xiaoping Bai; Yuesi Wang; Shigong Wang; Shixi Liu; Xiaoxin Feng
Journal:  Int J Environ Res Public Health       Date:  2013-03-13       Impact factor: 3.390

3.  Prediction of Indoor Air Exposure from Outdoor Air Quality Using an Artificial Neural Network Model for Inner City Commercial Buildings.

Authors:  Avril Challoner; Francesco Pilla; Laurence Gill
Journal:  Int J Environ Res Public Health       Date:  2015-12-01       Impact factor: 3.390

4.  Optimization of Skewed Data Using Sampling-Based Preprocessing Approach.

Authors:  Sushruta Mishra; Pradeep Kumar Mallick; Lambodar Jena; Gyoo-Soo Chae
Journal:  Front Public Health       Date:  2020-07-16