
A new method for evaluating air quality using an ideal grey close function cluster correlation analysis method.

Xiaoling Ren, Zhenfu Luo, Shuyu Qin, Xinqian Shu, Yuanyuan Zhang.

Abstract

To scientifically and reasonably evaluate air quality with a large amount of monitored data, this paper proposes a new evaluation method called ideal grey close function cluster correlation analysis (IGCFCCA). Taking the air quality in Ningxia Province, China, as an example, according to China's air quality standard, SO2, NO2, PM10, PM2.5 and O3 are selected as evaluation indexes to perform the evaluation. The results show that the air quality in this region in 2018 can be divided into three classifications, among which the relatively poor air quality in March, April and May is the first classification, the better air quality in August and September is the third classification, and the air quality in other months falls under the second classification. Correlation analysis is used to qualitatively determine that these three classifications correspond to first-level air quality in China's air quality standard, and the correlation degree, which is the distance between the three classifications and the first-level air quality, is quantitatively determined. Specifically, the correlation degrees of the first-classification, second-classification and third-classification of air quality are 0.674, 0.697 and 0.71, respectively. The research results indicate potential directions and objectives for air quality management to achieve scientific management.
© 2021. The Author(s).

Year:  2021        PMID: 34857891      PMCID: PMC8639721          DOI: 10.1038/s41598-021-02880-1

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

The air environment is a dynamic and complex system. Air quality is influenced by pollutants such as SO2, NO2, PM10 and O3, whose concentrations change constantly. However, the monitored data used in analyses are usually averaged over a certain period, for example one hour, a few hours, one month or one year, because instantaneous data recorded every minute or second are difficult to collect and analyse. Air quality monitoring can therefore be regarded as a grey system, in which some information is known and some information is unknown[1-7].
At present, China’s air quality standard (GB3095-2012) divides air quality into two levels and stipulates the pollutant concentrations for first-level and second-level air[8,9]. The concentration limits are lower for first-level air and higher for second-level air. The major pollutants include SO2, NO2, PM10, O3 and others. However, evaluating air quality according to GB3095-2012 raises two problems. First, the common evaluation methods can only determine which level the current air belongs to. They do not show to what degree the current air belongs to that level or how far it is from the standard level, so the room for improving the current air quality remains vague. A method is needed to quantitatively calculate the correlation degree, which measures the distance between the current air and each of the two standard levels. Second, to determine the air quality in a certain area over a period of time, pollutant concentrations are usually monitored every day, which produces a very large amount of data. Comparing and analysing each recorded value individually would create an almost impossible workload. Therefore, the data are usually averaged first, and the averages are then analysed. The difficulty lies in deciding which data should be grouped together for averaging; in other words, the key is to classify the data scientifically. Data with similar characteristics can be placed in one group, and the resulting classifications can then be analysed and evaluated, which gives the analysis a scientific basis.
At present, there are many methods for comprehensively evaluating atmospheric environmental quality, including the air pollution index (API) method, the ambient air quality index (AQI) method, the single factor index method, Green's comprehensive air pollution index method, the analytic hierarchy process (AHP), artificial neural network models, and the fuzzy comprehensive evaluation method[8]. Because these methods rest on different evaluation principles, each has its own advantages and disadvantages. The API and AQI methods are simple, intuitive and convenient to use but are only applicable for evaluating the short-term air quality in cities[9]. The single factor index method is clear and easy to implement, but it cannot consider the air quality status as a whole, and its evaluation results are one dimensional[9]. Green's comprehensive air pollution index method is easy to understand and implement, but it is only applicable to areas where coal pollution is the main pollution type[9]. The AHP is simple, practical and systematic, but its quantitative results are limited; additionally, when there are many indicators, the statistics become complex, and the weights are difficult to determine[9]. The artificial neural network evaluation method offers fast operation, self-adaptation and strong fault tolerance, but when the data are poorly correlated, its evaluation results tend to become homogeneous[10-13].
IGCFCCA is a fuzzy comprehensive evaluation method based on fuzzy mathematics, the fuzzy principle and the grey close function. The method is designed for problems with uncertain and incomplete information and covers analysis, model building and forecasting. It needs only a small amount of data and can achieve good prediction results. In this paper, the IGCFCCA method is used to evaluate the air quality in Ningxia Province. The method can not only scientifically classify a large amount of data but also calculate the correlation degree between each classification and the relevant standard. This approach can provide an important basis for comprehensive environmental management. Moreover, it provides a scientific reference for the establishment and optimization of other industry standards in the future.

Basic principle and methods

A sample, which comes from the monitored data reports of some environmental management departments, is first classified by ideal grey close function cluster analysis. Then, the level of the sample is determined by grey correlation analysis, and comprehensive evaluation conclusions are established according to the correlation degree between the classification of the sample and the levels specified in GB3095-2012.

The classification of the sample to be evaluated

Establishing the evaluation index sequence matrix for the selected sample

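Let S be the sequence of clustering objects, i.e., $S = \{s_1, s_2, \ldots, s_m\}$, and let X be the sequence of air-influencing indexes, i.e., $X = \{x_1, x_2, \ldots, x_n\}$. Let $x_{ik}$ be the original monitored value of index $x_k$ (k = 1, 2, …, n) for object $s_i$ (i = 1, 2, …, m), where m is the number of objects considered in clustering and n is the number of influencing indexes, which are the pollutants mentioned above. Accordingly, the following matrix can be established (Eq. 1).
$$X = (x_{ik})_{m \times n} = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix} \tag{1}$$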

Establishing the matrix of ideal-value grey close function clusters

Let $X_0 = \{x_{01}, x_{02}, \ldots, x_{0n}\}$ be the ideal-value sequence corresponding to each influencing index. The ideal value is determined according to one of three situations (Eqs. 2, 3, 4).
First situation: the larger the influencing index $x_k$ is, the better the air quality is; in this case, the ideal value is
$$x_{0k} = \max_{1 \le i \le m} x_{ik} \tag{2}$$
Second situation: the smaller the influencing index $x_k$ is, the better the air quality is; in this case, the ideal value is
$$x_{0k} = \min_{1 \le i \le m} x_{ik} \tag{3}$$
Third situation: the air quality is best when the influencing index $x_k$ takes a moderate value, and the ideal value is that moderate value (Eq. 4).
According to the ideal value $x_{0k}$ (Eq. 2, 3 or 4) and the original monitored data $x_{ik}$, the grey close function value $y_{ik}$ is calculated by using Eq. 5.
$$y_{ik} = \frac{\min(x_{ik}, x_{0k})}{\max(x_{ik}, x_{0k})} \tag{5}$$
where $x_{ik}$ is the original monitored data and $x_{0k}$ is the ideal value corresponding to the k-th influencing index. The function value $y_{ik}$ is dimensionless, and $y_{ik} \in [0, 1]$. It denotes the correlation degree of $s_i$ and the ideal object $s_0$ for the k-th index: the larger $y_{ik}$ is, the closer $s_i$ is to $s_0$, and the smaller $y_{ik}$ is, the farther $s_i$ is from $s_0$. Thus, the following grey close matrix Y can be established (Eq. 6).
$$Y = (y_{ik})_{m \times n} \tag{6}$$
Moreover, $(y_{01}, y_{02}, \ldots, y_{0n}) = (1, 1, \ldots, 1)_{1 \times n}$ is the ideal sequence; the larger $y_{ik}$ is, the better $s_i$ is, and the largest possible value of $y_{ik}$ is 1.
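The grey close function can be illustrated with a short Python sketch. It assumes the ratio form y = min(x, x0) / max(x, x0), which reproduces the February row reported later in Table 2, and it assumes strictly positive concentrations. The function names `ideal_min` and `grey_close` are hypothetical and are not taken from the paper.

```python
def ideal_min(column):
    """Ideal value for a 'smaller is better' index: the column minimum."""
    return min(column)


def grey_close(x, x0):
    """Assumed grey close function: smaller value over larger value, in (0, 1]."""
    return min(x, x0) / max(x, x0)


# SO2 monthly averages, February to December 2018 (Table 1), in ug/m3
so2 = [40, 30, 18, 14, 14, 9, 10, 13, 19, 27, 32]
print(ideal_min(so2))  # 9, the ideal value x01 stated in the text

# February concentrations (SO2, NO2, PM10, PM2.5, O3) and the stated ideal values
february = [40, 27, 86, 40, 104]
ideal = [9, 17, 56, 25, 76]
y_feb = [round(grey_close(x, x0), 3) for x, x0 in zip(february, ideal)]
print(y_feb)  # [0.225, 0.63, 0.651, 0.625, 0.731]
```

A value of 1 means that the month already sits at the ideal value for that pollutant; February's SO2 value of 0.225 is far from it.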

The classification of the sample to be evaluated

Because the influence of each influencing index is different, the weight of each influencing index needs to be considered. Let $P_i$ be the comprehensive analysis value of $s_i$. $P_i$ can be expressed as follows (Eq. 7).
$$P_i = \sum_{k=1}^{n} W_k \, y_{ik} \tag{7}$$
where $W_k$ is the weight of the k-th influencing index; since there are n indexes, there are also n weights ($W_1, W_2, \ldots, W_n$). Correspondingly, the weights are given by Eq. 8. The comprehensive analysis values form the vector $P = (P_1, P_2, \ldots, P_m)^T$. The following equation (Eq. 9) can be used to calculate the grey close value $P_{ij}$ of $P_i$ in relation to $P_j$.
$$P_{ij} = \frac{\min(P_i, P_j)}{\max(P_i, P_j)} \tag{9}$$
These values form the grey close matrix (Eq. 10).
$$\mathbf{P} = (P_{ij})_{m \times m} \tag{10}$$
If the matrix in Eq. 10 satisfies the following three conditions: (1) reflexivity, where $P_{ij} = 1$ (i = j); (2) symmetry, where $P_{ij} = P_{ji}$; and (3) normativity, where $P_{ij} \in [0, 1]$, we can select an appropriate threshold value λ, which is the similarity coefficient[4,5], from the matrix, cut the branches whose values are less than λ, and establish the classifications $S'_t$ (t = 1, 2, …, c) when the λ level meets the relevant requirement. $S'_t$ represents each classification of the air in a given region. The following equations (Eqs. 11, 12) can be established.
$$S'_t = \{\bar{x}_{t1}, \bar{x}_{t2}, \ldots, \bar{x}_{tn}\} \tag{11}$$
$$\bar{x}_{tk} = \frac{1}{m_t} \sum_{s_i \in S'_t} x_{ik} \tag{12}$$
where $S'_t$ is the t-th classification, $\bar{x}_{tk}$ is the k-th index of the t-th classification, $m_t$ is the number of objects in the t-th classification, c is the number of classifications, and n is the number of influencing indexes. The classifications can be expressed in the following matrix form (Eq. 13).
$$S' = (\bar{x}_{tk})_{c \times n} \tag{13}$$
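As a minimal sketch of this step, the following Python code builds the pairwise grey close matrix and checks the three conditions. It assumes the ratio form P_ij = min(P_i, P_j) / max(P_i, P_j), which matches the values reported later in Table 3. The function names are hypothetical, and the three input values are the comprehensive analysis values of S1, S2 and S3 from Table 2.

```python
def comprehensive_value(y_row, weights):
    """Weighted sum of grey close function values for one object."""
    return sum(w * y for w, y in zip(weights, y_row))


def closeness_matrix(p):
    """Pairwise grey close values, assuming the min/max ratio form."""
    return [[min(a, b) / max(a, b) for b in p] for a in p]


# Comprehensive analysis values of S1, S2 and S3 (Table 2)
p = [0.651, 0.464, 0.481]
matrix = closeness_matrix(p)

n = len(p)
reflexive = all(matrix[i][i] == 1.0 for i in range(n))
symmetric = all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))
normalized = all(0.0 <= v <= 1.0 for row in matrix for v in row)
print(reflexive, symmetric, normalized)

# Lower-triangle entries, rounded to four decimals as in Table 3
print(round(matrix[1][0], 4), round(matrix[2][0], 4), round(matrix[2][1], 4))
```

With a threshold of λ = 0.9, only the S2 and S3 pair in this small example exceeds the threshold, which is consistent with these two months falling in the same classification.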

Correlation degree analysis of the sample to be evaluated

Let $S'_t$ be the sample to be evaluated, and let $X = (x_1, x_2, \ldots, x_n)$, which is the influencing index set mentioned above, be the evaluation index set used for $S'_t$. Let $S'_0$ be the stated air quality level in GB3095-2012. Then, the equation for the correlation coefficient is as follows (Eq. 14)[14].
$$\zeta_t(k) = \frac{\min_k \Delta_t(k) + \varepsilon \max_k \Delta_t(k)}{\Delta_t(k) + \varepsilon \max_k \Delta_t(k)}, \qquad \Delta_t(k) = \left| x'_0(k) - x'_t(k) \right| \tag{14}$$
where $x'_0(k)$ and $x'_t(k)$ are the initialized values of the standard and of the sample for the k-th index, $\zeta_t(k)$ is the correlation coefficient and ε is the resolution coefficient, with a general value of 0.5[4,5]. Moreover, the correlation degree R is calculated by using Eq. 15.
$$R_t = \frac{1}{n} \sum_{k=1}^{n} \zeta_t(k) \tag{15}$$
The maximum value of R indicates the air quality level with which the sample to be evaluated has the highest correlation degree. Therefore, the sample is assigned to that level.
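The correlation analysis can be sketched in Python as follows. The sketch assumes the usual grey relational form, with the minimum and maximum absolute differences taken over the indexes of a single sample-standard pair and the correlation degree taken as the unweighted mean of the coefficients; under these assumptions it reproduces the first row of Table 6 to within rounding. The function names are hypothetical, and the input values are the initialized data for the first classification and the first-level standard from Table 5.

```python
def relational_coefficients(sample, standard, eps=0.5):
    """Grey relational coefficients of one sample against one standard.

    Assumes at least one index differs, so the largest difference is non-zero.
    """
    deltas = [abs(a - b) for a, b in zip(sample, standard)]
    d_min, d_max = min(deltas), max(deltas)
    return [(d_min + eps * d_max) / (d + eps * d_max) for d in deltas]


def relational_degree(coefficients):
    """Correlation degree as the unweighted mean of the coefficients."""
    return sum(coefficients) / len(coefficients)


# Initialized values from Table 5 (SO2, NO2, PM10, PM2.5, O3)
first_class = [1.000, 1.322, 7.676, 2.371, 6.967]
first_level = [1.000, 2.000, 2.000, 0.750, 5.000]

zeta = relational_coefficients(first_class, first_level)
r = relational_degree(zeta)
print([round(z, 3) for z in zeta])
print(round(r, 3))
```

Because the inputs are already rounded to three decimals, individual coefficients may differ from the published table in the last digit.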

Air quality assessment—taking Ningxia Province in China as an example

The classification of the samples to be evaluated

Monthly reports of the air quality in Ningxia Province in 2018 were provided by the Department of Ecology and Environment of Ningxia Province. The monthly report data were used to establish the cluster of samples S (Table 1) (Eq. 1). Each sample included five kinds of pollutants. Moreover, the concentrations of SO2, NO2, PM10 and PM2.5 were based on monthly averages calculated from 24-h averages, and the concentration of O3 was the monthly average calculated from the 8-h average values.
Table 1

Air quality in Ningxia Province in 2018. Monthly average concentrations of major monitored pollutants (μg/m3).

Month       SO2 (x1)   NO2 (x2)   PM10 (x3)   PM2.5 (x4)   O3 (x5)
January     9346 (incomplete)
February    40         27         86          40           104
March       30         32         167         55           129
April       18         27         159         47           141
May         14         23         150         45           162
June        14         24         74          26           178
July        9          17         81          29           160
August      10         20         56          25           150
September   13         27         65          26           129
October     19         37         87          39           112
November    27         43         155         57           83
December    32         37         141         50           76
In Table 1, x1 is the SO2 concentration; x2 is the NO2 concentration; x3 is the PM10 concentration; x4 is the PM2.5 concentration; and x5 is the O3 concentration. For these pollutants, the lower the concentration is, the better the air quality is. Because the management department provided only part of the monitored data and the data for January are incomplete, only the data listed in the table from February to December can be effectively analysed; these months form the samples S1 to S11. However, the focus of this study is on the new analysis and evaluation method (IGCFCCA), and almost all of the data can be analysed by this method. According to Eq. 3, the five ideal values are as follows: x01 is 9, x02 is 17, x03 is 56, x04 is 25, and x05 is 76. Based on the sample data in Table 1, the ideal-value grey close matrix (Eq. 6) can be obtained from Eq. 5; according to Eq. 8, the weights of x1, x2, x3, x4 and x5 are w1 = 0.06, w2 = 0.09, w3 = 0.34, w4 = 0.12, and w5 = 0.39, respectively. Consequently, the comprehensive analysis value $P_i$ (i = 1, 2, …, 11) of each sample in S is calculated with Eq. 7. The grey close function value $y_{ik}$ (Eq. 5) and the comprehensive analysis value $P_i$ are shown in Table 2.
Table 2

Grey close function value and the comprehensive analysis value.

Index   X1      X2      X3      X4      X5      Comprehensive analysis value (Pi)
S1      0.225   0.630   0.651   0.625   0.731   0.651
S2      0.300   0.531   0.335   0.455   0.589   0.464
S3      0.500   0.630   0.352   0.532   0.539   0.481
S4      0.643   0.739   0.373   0.556   0.469   0.482
S5      0.643   0.708   0.757   0.962   0.427   0.641
S6      1.000   1.000   0.691   0.862   0.475   0.673
S7      0.900   0.850   1.000   1.000   0.507   0.787
S8      0.692   0.630   0.862   0.962   0.589   0.736
S9      0.474   0.459   0.644   0.641   0.679   0.631
S10     0.333   0.395   0.361   0.439   0.916   0.590
S11     0.281   0.459   0.397   0.500   1.000   0.645
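The comprehensive analysis values in Table 2 can be recomputed from Table 1 with the short Python sketch below. The weight equation (Eq. 8) is not reproduced in this text, so the sketch relies on an assumption: weights proportional to each pollutant's column sum over February to December. This choice happens to reproduce the reported weights (0.06, 0.09, 0.34, 0.12, 0.39) and the Pi column to within rounding, but it should not be read as the authors' definition. All variable and function names are hypothetical.

```python
# Monthly averages from Table 1 (February to December), in ug/m3.
# Column order: SO2, NO2, PM10, PM2.5, O3.
table1 = {
    "Feb": [40, 27, 86, 40, 104], "Mar": [30, 32, 167, 55, 129],
    "Apr": [18, 27, 159, 47, 141], "May": [14, 23, 150, 45, 162],
    "Jun": [14, 24, 74, 26, 178], "Jul": [9, 17, 81, 29, 160],
    "Aug": [10, 20, 56, 25, 150], "Sep": [13, 27, 65, 26, 129],
    "Oct": [19, 37, 87, 39, 112], "Nov": [27, 43, 155, 57, 83],
    "Dec": [32, 37, 141, 50, 76],
}
columns = list(zip(*table1.values()))

# Ideal values: column minima, since lower concentrations are better (Eq. 3)
ideal = [min(col) for col in columns]

# ASSUMPTION: weights proportional to column sums (not confirmed by the text)
col_sums = [sum(col) for col in columns]
weights = [s / sum(col_sums) for s in col_sums]


def comprehensive_value(row):
    """Weighted sum of min/max grey close values for one month."""
    return sum(w * min(x, x0) / max(x, x0)
               for w, x, x0 in zip(weights, row, ideal))


p_values = {month: comprehensive_value(row) for month, row in table1.items()}
print([round(w, 2) for w in weights])
for month, p in p_values.items():
    print(month, round(p, 3))
```

Using the unrounded weights matters here: with weights rounded to two decimals, some Pi values shift in the third decimal.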
With $P_i$ ($P_1$, $P_2$, …, $P_{11}$) as known numbers, $P_{ij}$ (i, j = 1, 2, …, 11) can be calculated from Eq. 9. The corresponding elements of the grey close matrix (Eq. 10) are shown in Table 3.
Table 3

Grey close values P.

S     S1       S2       S3       S4       S5       S6       S7       S8       S9       S10      S11
S1    1.0000
S2    0.7127   1.0000
S3    0.7389   0.9647   1.0000
S4    0.7404   0.9627   0.9979   1.0000
S5    0.9846   0.7239   0.7504   0.7520   1.0000
S6    0.9673   0.6895   0.7147   0.7162   0.9525   1.0000
S7    0.8272   0.5896   0.6112   0.6125   0.8145   0.8551   1.0000
S8    0.8845   0.6304   0.6535   0.6549   0.8709   0.9144   0.9352   1.0000
S9    0.9693   0.7353   0.7623   0.7639   0.9844   0.9376   0.8018   0.8573   1.0000
S10   0.9063   0.7864   0.8153   0.8169   0.9204   0.8767   0.7497   0.8016   0.9350   1.0000
S11   0.9908   0.7194   0.7457   0.7473   0.9938   0.9584   0.8196   0.8764   0.9783   0.9147   1.0000
The following information can be obtained from Table 3. If λ = 0.9[4,5], S2, S3 and S4 correspond to the first classification $S'_1$; S7 and S8 correspond to the third classification $S'_3$; and the other samples correspond to the second classification $S'_2$. S2, S3 and S4 are the samples for March, April and May, respectively, and S7 and S8 are the samples for August and September, respectively. The cluster (Eq. 13) (Table 4) includes $S'_1$, $S'_2$ and $S'_3$.
Table 4

The classifications of air (μg/m3).

Index                   x1      x2      x3       x4      x5
First classification    20.67   27.33   158.67   49.00   144.00
Second classification   23.50   30.83   104.00   40.17   118.83
Third classification    11.50   23.50   60.50    25.50   139.50
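The values in Table 4 are consistent with taking, for each classification, the mean of its member months for every pollutant. The Python sketch below recomputes them from Table 1 using the class membership stated in the text; it does not attempt to derive the membership itself from the λ threshold. Variable names are hypothetical.

```python
# Monthly averages from Table 1 (ug/m3): SO2, NO2, PM10, PM2.5, O3
table1 = {
    "Feb": [40, 27, 86, 40, 104], "Mar": [30, 32, 167, 55, 129],
    "Apr": [18, 27, 159, 47, 141], "May": [14, 23, 150, 45, 162],
    "Jun": [14, 24, 74, 26, 178], "Jul": [9, 17, 81, 29, 160],
    "Aug": [10, 20, 56, 25, 150], "Sep": [13, 27, 65, 26, 129],
    "Oct": [19, 37, 87, 39, 112], "Nov": [27, 43, 155, 57, 83],
    "Dec": [32, 37, 141, 50, 76],
}

# Class membership as stated in the text (not derived here)
classes = {
    "first": ["Mar", "Apr", "May"],
    "second": ["Feb", "Jun", "Jul", "Oct", "Nov", "Dec"],
    "third": ["Aug", "Sep"],
}

class_means = {}
for name, months in classes.items():
    columns = zip(*(table1[m] for m in months))
    class_means[name] = [round(sum(col) / len(col), 2) for col in columns]

for name, values in class_means.items():
    print(name, values)
```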
The samples (Table 1) can be divided into three classifications, and the class-based approach provides two main advantages. First, if the data for each month are compared and analysed against the air standards individually, the workload will be large, and errors will easily accumulate. In contrast, analysing only the three classifications can greatly improve the work efficiency. Second, this classification method can be used to establish national or local standards. For example, actual statistical data over many years can be classified by this method, and the classification results can be used as new comparison standards, which would benefit the analysis and evaluation of statistical data in the future.

Sample evaluation and correlation degree analysis

In the preceding section, the samples from each month in 2018 were divided into three classifications ($S'_1$, $S'_2$ and $S'_3$). The pollutant concentrations in the air quality standard (GB3095-2012) are used for comparison, and the comparison of the data is shown in Fig. 1.
Figure 1

Comparison of the samples to be evaluated with the two levels of air standards.

As shown in Fig. 1, the SO2 concentration in the third classification of air is lower than that in the first-level air standard, and the NO2 concentrations in all three classifications are lower than that in the first-level air standard. In other words, the concentration of NO2 in the region meets the first-level air standard throughout the year, and the concentration of SO2 in August and September meets the first-level air standard. Therefore, according to the first-level air standard, the region should strengthen the management of PM10, PM2.5 and O3 emissions throughout the year, and the management of SO2 emissions in months other than August and September should be strengthened. Compared with the second-level air standard, the concentrations of SO2, NO2 and O3 in all three classifications are lower than those in the standard, and the concentrations of PM10 and PM2.5 in the third classification of air are lower than those in the standard. In other words, the concentrations of NO2, SO2 and O3 in the region meet the second-level air standard throughout the year. Moreover, the concentrations of PM10 and PM2.5 in August and September meet the second-level air standard. Therefore, according to the second-level air standard, the region should strengthen the management of PM10 and PM2.5 emissions. According to grey theory, the cluster data ($S'_1$, $S'_2$ and $S'_3$) and the comparison data ($S'_{01}$ and $S'_{02}$ from the air quality standard) must be initialized[4,5], and the initial values are shown in Table 5.
Table 5

Data initialization results.

Index   x1      x2      x3      x4      x5
S'1     1.000   1.322   7.676   2.371   6.967
S'2     1.000   1.312   4.426   1.709   5.057
S'3     1.000   2.043   5.261   2.217   12.130
S'01    1.000   2.000   2.000   0.750   5.000
S'02    1.000   0.667   1.167   0.583   2.667
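The initialized values for the three classifications are consistent with dividing each row of Table 4 by its first entry (the SO2 value), so that every sequence starts at 1. The Python sketch below applies this assumed initialization to the class rows only; the standard concentrations behind the two standard rows are not listed in this text, so those rows are not recomputed. The function name `initialize` is hypothetical.

```python
def initialize(row):
    """Assumed initialization: divide every entry by the first entry."""
    return [round(v / row[0], 3) for v in row]


# Class values from Table 4 (ug/m3): SO2, NO2, PM10, PM2.5, O3
table4 = {
    "first": [20.67, 27.33, 158.67, 49.00, 144.00],
    "second": [23.50, 30.83, 104.00, 40.17, 118.83],
    "third": [11.50, 23.50, 60.50, 25.50, 139.50],
}

initialized = {name: initialize(row) for name, row in table4.items()}
for name, row in initialized.items():
    print(name, row)
```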
According to Eqs. 14 and 15, the correlation coefficient ζ and the correlation degree R with respect to the first-level standard are shown in Table 6, and those with respect to the second-level standard are shown in Table 7.
Table 6

Correlation with the first-level air standard.

Correlation coefficient and correlation degree   ζ1      ζ2      ζ3      ζ4      ζ5      R1
S'1                                              1.000   0.807   0.333   0.637   0.591   0.674
S'2                                              1.000   0.638   0.333   0.558   0.955   0.697
S'3                                              1.000   0.988   0.522   0.708   0.333   0.710
Table 7

Correlation with the second-level air standard.

Correlation coefficient and correlation degree   ζ1      ζ2      ζ3      ζ4      ζ5      R2
S'1                                              1.000   0.832   0.333   0.646   0.431   0.648
S'2                                              1.000   0.716   0.333   0.591   0.405   0.609
S'3                                              1.000   0.775   0.536   0.743   0.333   0.677
According to Tables 6 and 7, all three classifications have a higher correlation with the first-level air standard than with the second-level air standard. Therefore, the air quality in Ningxia Province in 2018 was associated with the first-level standard. More importantly, this result quantifies the correlation between the three classifications and the first-level air standard. The correlation degrees of the first classification, second classification and third classification with the first-level air standard are 0.674, 0.697 and 0.71, respectively. Therefore, the gaps between the three classifications and the compared air standard are 0.326, 0.303 and 0.29. The correlation degree cannot reach 1 because some pollutant concentrations in the monitored data for these classifications are lower than the first-level air standard, while the remaining pollutant values are higher. Therefore, there is still room to improve the air quality in the region. The region should continue to reduce the concentrations of pollutants and further improve the correlation degrees of all classifications of air with the first-level air standard.
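The final decision step amounts to comparing the two correlation degrees for each classification and subtracting the larger one from 1. A minimal Python sketch using the R1 and R2 values reported in Tables 6 and 7 (variable names are hypothetical):

```python
# Correlation degrees reported in Table 6 (R1) and Table 7 (R2)
r_first_level = {"first": 0.674, "second": 0.697, "third": 0.710}
r_second_level = {"first": 0.648, "second": 0.609, "third": 0.677}

results = {}
for name in r_first_level:
    r1, r2 = r_first_level[name], r_second_level[name]
    level = "first-level" if r1 > r2 else "second-level"
    gap = round(1 - max(r1, r2), 3)  # distance from a perfect correlation of 1
    results[name] = (level, gap)

for name, (level, gap) in results.items():
    print(f"{name} classification: {level} standard, gap {gap}")
```

All three classifications are assigned to the first-level standard, with gaps of 0.326, 0.303 and 0.29.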

Conclusions

A new method of air quality assessment, IGCFCCA, is proposed. The advantage of the method is that it can quantitatively characterize the correlation degree between the current air quality and the corresponding standard level. Specifically, the results of this method indicated that the air quality in Ningxia Province in 2018 was correlated with first-level air in China’s air quality standard. The correlation degrees of the first classification, second classification and third classification of air quality with the first-level air standard are 0.674, 0.697 and 0.71, respectively. Therefore, the region should continue to reduce the concentrations of pollutants, especially PM10, PM2.5 and O3, and further improve the correlation degrees of all classifications with the first-level air standard. Notably, this method can be used in other industries. The air quality in Ningxia Province in 2018 was divided into three classifications by ideal grey close function cluster analysis. Specifically, the relatively poor air quality in March, April and May corresponds to the first classification, the comparatively better air quality in August and September corresponds to the third classification, and the air quality in the remaining months corresponds to the second classification. In addition, the classification method can be used as a reference when establishing other classification standards, such as national standards, regional standards, and industry standards.