A framework for exploration and cleaning of environmental data--Tehran air quality data experience.

Mansour Shamsipour, Farshad Farzadfar, Kimiya Gohari, Mahboubeh Parsaeian, Hassan Amini, Katayoun Rabiei, Mohammad Sadegh Hassanvand, Iman Navidi, Akbar Fotouhi, Kazem Naddafi, Nizal Sarrafzadegan, Anita Mansouri, Alireza Mesdaghinia, Bagher Larijani, Masud Yunesian.

Abstract

BACKGROUND: Managing and cleaning large environmental monitoring data sets is a particular challenge. In this article, we present a novel framework for exploring and cleaning such data sets. As a case study, we applied the method to air quality data from Tehran, Iran, covering 1996 to 2013.
METHODS: The framework consists of data acquisition [here, data on particulate matter with aerodynamic diameter ≤10 µm (PM10)], development of databases, initial descriptive analyses, removal of inconsistent data using a plausibility range (PR), and detection of missing-data patterns. Additionally, we developed a novel tool, the spatiotemporal screening tool (SST), which considers both the spatial and the temporal nature of the data during outlier detection. We also evaluated the effect of dust storms in the outlier detection phase.
RESULTS: Before the algorithms were applied, the raw mean PM10 concentration in Tehran for 1996-2013 was 88.96 µg/m3. After the algorithms were applied, 5.7% of all data points were recognized as unacceptable outliers; of these, 69% were detected by the SST and 1% by the dust storm algorithm, and 29% fell outside the plausibility range. The mean PM10 concentration after cleaning was 88.41 µg/m3. However, the standard deviation decreased significantly, from 90.86 µg/m3 to 61.64 µg/m3. There was no distinguishable significant pattern in the missing data by hour, day, month, or year.
CONCLUSION: We developed a novel framework for cleaning large environmental monitoring data sets, which can identify hidden patterns. We also presented a complete picture of PM10 in Tehran from 1996 to 2013. Finally, we propose applying our framework to large spatiotemporal databases, especially in developing countries.
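
The abstract does not specify how the plausibility range or the SST are implemented, so the following is only a minimal illustrative sketch of the general idea: first reject readings outside a plausibility range, then flag a reading as an outlier only when it disagrees with both its temporal neighbours (same station, nearby hours) and its spatial neighbours (other stations, same hour). The range limits, the window size, the median/MAD rule, the threshold `k`, and all function names are assumptions made for illustration and are not taken from the paper.

```python
from statistics import median

# Hypothetical plausibility range for hourly PM10 in µg/m3 (assumed, not from the paper).
PR_LOW, PR_HIGH = 0.0, 1000.0


def in_plausibility_range(value, low=PR_LOW, high=PR_HIGH):
    """Return True when a reading lies inside the plausibility range."""
    return low <= value <= high


def mad(values):
    """Median absolute deviation of a list of numbers."""
    m = median(values)
    return median(abs(v - m) for v in values)


def is_outlier(value, reference, k=5.0):
    """Flag value when it lies more than k MADs from the reference median."""
    if len(reference) < 3:
        return False  # too little context to judge
    spread = mad(reference) or 1.0  # avoid a zero threshold on flat data
    return abs(value - median(reference)) > k * spread


def screen(readings, window=2, k=5.0):
    """Screen readings given as {station: [hourly values, None = missing]}.

    Returns the set of (station, hour_index) pairs flagged as unacceptable.
    """
    flagged = set()
    for station, series in readings.items():
        for t, value in enumerate(series):
            if value is None:
                continue
            if not in_plausibility_range(value):
                flagged.add((station, t))
                continue
            # Temporal neighbours: same station, hours within the window.
            temporal = [
                v for i, v in enumerate(series)
                if i != t and abs(i - t) <= window
                and v is not None and in_plausibility_range(v)
            ]
            # Spatial neighbours: other stations at the same hour.
            spatial = [
                s[t] for name, s in readings.items()
                if name != station and t < len(s)
                and s[t] is not None and in_plausibility_range(s[t])
            ]
            # Require disagreement in both dimensions before flagging.
            if is_outlier(value, temporal, k) and is_outlier(value, spatial, k):
                flagged.add((station, t))
    return flagged
```

One design choice in this sketch is that a spike shared by all stations at the same hour is kept, because the spatial neighbours agree with it; a city-wide event such as a dust storm would therefore not be discarded by this rule alone. How the paper's own dust storm algorithm handles such events is not described in the abstract.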

Year:  2014        PMID: 25481321     DOI: 0141712/AIM.008

Source DB:  PubMed          Journal:  Arch Iran Med        ISSN: 1029-2977            Impact factor:   1.354


  6 in total

1.  Outlier Detection in Urban Air Quality Sensor Networks.

Authors:  V M van Zoest; A Stein; G Hoek
Journal:  Water Air Soil Pollut       Date:  2018-03-08       Impact factor: 2.520

2.  Methods for the Identification of Outliers and Their Influence on Exposure Assessment in Agricultural Pesticide Applicators: A Proposed Approach and Validation Using Biological Monitoring.

Authors:  Stefan Mandić-Rajčević; Claudio Colosio
Journal:  Toxics       Date:  2019-07-12

3.  Common and Unique Barriers to the Exchange of Administrative Healthcare Data in Environmental Public Health Tracking Program.

Authors:  Mikyong Shin; Charles Hawley; Heather Strosnider
Journal:  Int J Environ Res Public Health       Date:  2021-04-20       Impact factor: 3.390

4.  Health system performance in Iran: a systematic analysis for the Global Burden of Disease Study 2019.

Authors: 
Journal:  Lancet       Date:  2022-04-06       Impact factor: 202.731

5.  The burden of cardiovascular and respiratory diseases attributed to ambient sulfur dioxide over 26 years.

Authors:  Katayoun Rabiei; Nizal Sarrafzadegan; Ali Ghanbari; Mansour Shamsipour; Mohammad Sadegh Hassanvand; Heresh Amini; Masud Yunesian; Farshad Farzadfar
Journal:  J Environ Health Sci Eng       Date:  2020-04-21

6.  Temporal variations of ambient air pollutants and meteorological influences on their concentrations in Tehran during 2012-2017.

Authors:  Fatemeh Yousefian; Sasan Faridi; Faramarz Azimi; Mina Aghaei; Mansour Shamsipour; Kamyar Yaghmaeian; Mohammad Sadegh Hassanvand
Journal:  Sci Rep       Date:  2020-01-15       Impact factor: 4.379
