Dikshit-Ratnaparkhi A1, Bormane D2, Ghongade R3. 1. All India Shri Shivaji Memorial Society's Institute of Information Technology (AISSMS IOIT), Savitribai Phule Pune University, Pune, Maharashtra, India. 2. All India Shri Shivaji Memorial Society's College of Engineering (AISSMSCOE), Savitribai Phule Pune University, Pune, Maharashtra, India. 3. Bharati Vidyapeeth College of Engineering (BVCOE), Pune.
Abstract
BACKGROUND: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to the weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that algorithms able to effectively handle the nonstationarity and uncertainty of the signals should be used for ECG analysis. Extensive research on incorporating vagueness in the form of fuzzy sets, fuzzy rough sets and hesitant fuzzy sets (HFS) has been carried out in past decades. OBJECTIVE: The paper aims to develop an enhanced entropy-based clustering technique for calculating the weights of the attributes to finally generate appropriately clustered attributes. MATERIAL AND METHODS: Finding optimal attributes to make a decision has always been a matter of concern for researchers. Metrics used for optimal attribute generation can be broadly classified into mutual dependency, similarity, correlation and entropy-based metrics in the fuzzy domain. The experimentation has been carried out on an ECG dataset in a hesitant fuzzy framework with four attributes. RESULTS: We propose a novel correlation-based algorithm that takes entropy-based weighted attributes as input and effectively generates a relevant and non-redundant set of attributes. We have also derived correlation coefficient formulas for HFSs and applied them to clustering analysis under the framework of hesitant fuzzy sets. The results compare the proposed mathematical model with the existing similarity-based algorithms. CONCLUSION: The selection of optimal relevant attributes highlights the robustness and efficacy of the proposed approach. The experimentation and comparative results help us conclude that the selection of optimal attributes in the hesitant fuzzy domain proves to be a powerful tool for expressing uncertainty in the process of data acquisition and classification.
Introduction
Electrocardiogram (ECG) classification and segmentation have long been subjects of research, since a large number of deaths occur worldwide owing to uncertainty and variability in the diagnosis of heart diseases. Health care systems have undergone revolutionary changes to provide timely and relevant information about patients' health periodically, thereby providing a cost-effective way to predict heart condition. The signal acquired from 12-lead data in real-time ECG acquisition systems is contaminated by artifacts due to power line noise, respiration, muscular movements and electrode impedance. This changes the electrical characteristics of the ECG, which appear as baseline drifts and motion artifacts. Generally, the amplitude and duration of the major features used in the ECG segmentation process undergo considerable changes. Hence the fuzzy domain and the framework of hesitant fuzzy sets are solutions that can deal effectively with such variations and uncertainty.
Hesitant fuzzy sets and intuitionistic fuzzy sets have been introduced and comprehensively modelled as extensions of Zadeh's fuzzy set theory. A hesitant set allows the membership degree of an element of a set to take several possible values and can express hesitant information more comprehensively than other fuzzy sets [1]. Multicriteria decision making in accordance with intuitionistic fuzzy sets, using optimal weights for the attributes, is presented in [2-3]. Clustering methods for optimal criteria selection have been discussed at length in [4-6]. The significance of hesitant fuzzy sets (HFS) has been assessed on the basis of the similarities between fuzzy sets and HFS [1,5]. However, the classical fuzzy set presents limitations when various sources of vagueness appear simultaneously.
A comparison of type-1 and type-2 fuzzy sets with hesitant fuzzy sets, intended to give a clear perspective on the different concepts, tools and trends related to extensions of fuzzy sets, is provided in [7]. ECG segmentation and k-nearest-neighbour-based classification approaches for dataset reduction have been discussed in [8-10]. Type-1 fuzzy sets, which are widely used and have been applied successfully to different data to handle uncertainties, are discussed in [11]. Rough set theory, developed by Pawlak, deals with crisp upper and lower sets to handle imprecision, vagueness and uncertainty in data analysis [12]. This theory was further developed in a fuzzy framework to incorporate imprecise information and improve accuracy in classification problems [13-15]. The problems associated with the formation of indiscernibility relations or partitions in rough sets are also addressed. The fusion of rough and fuzzy sets is explored through axiomatic and constructive approaches [16]. With reference to the practical problems associated with ECG signal acquisition, hesitant fuzzy set theory has been developed while taking into account the fact that patients might exhibit multiple symptoms at the same time. In ECG analysis, this could include variation in the ST segment, the QRS duration or any other attribute that could create a complex decision system. Such uncertainty can be resolved with hesitant fuzzy sets. More recently, HFS have been applied effectively to medical data diagnosis [17,18]. Thus the aforementioned literature supports applying the concepts of hesitant fuzzy sets to ECG analysis. In the next section, the preliminaries of hesitant fuzzy sets are discussed.
PRELIMINARIES
The concept of fuzziness is further extended in hesitant fuzzy sets, which allow the membership degree of an element to a set to have several possible values [19,20]. We propose a hesitant fuzzy set model for attribute reduction of an ECG dataset. In a real-world scenario, a patient's medical report may show variation in a number of parameters, complicating the decision to be made by a physician. In such cases, a physician might assign several probable membership values to the attributes or the classes.
a. Properties of Hesitant Fuzzy sets (HFS)
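Definition 1. Let X be a reference set. A hesitant fuzzy set (HFS) A on X is defined in terms of a function $h_A$ that, when applied to X, returns a subset of $[0,1]$:
$$A = \{\langle x, h_A(x)\rangle \mid x \in X\},$$
where $h_A(x)$ is a set of possible membership degrees of the element $x \in X$ to A. Let $h = h_A(x)$ be called the hesitant fuzzy element (HFE) [1-2,21].
Definition 2. Given an HFE h, its lower and upper bounds are defined respectively as
$$h^{-}(x) = \min h(x), \qquad h^{+}(x) = \max h(x).$$
Definition 3. For an HFE h, the score function s(h) is defined as
$$s(h) = \frac{1}{l_h}\sum_{\gamma \in h}\gamma,$$
where $l_h$ is the number of values in h, so that $s(h) \in [0,1]$.
Definition 4. Correlation coefficients of HFS. Let $X=\{x_1, x_2, \ldots, x_n\}$ be a universe of discourse in discrete form, and let A and B be two HFSs on X denoted by $A = \{\langle x_i, h_A(x_i)\rangle \mid x_i \in X\}$ and $B = \{\langle x_i, h_B(x_i)\rangle \mid x_i \in X\}$. Write $h_{A\sigma(j)}(x_i)$ for the j-th largest value in $h_A(x_i)$ and $l_i$ for the number of values in the HFE at $x_i$. The informational energy of the set A is defined as
$$E(A) = \sum_{i=1}^{n}\left(\frac{1}{l_i}\sum_{j=1}^{l_i} h_{A\sigma(j)}^{2}(x_i)\right).$$
For two HFSs A and B, their correlation is defined by
$$C(A,B) = \sum_{i=1}^{n}\left(\frac{1}{l_i}\sum_{j=1}^{l_i} h_{A\sigma(j)}(x_i)\, h_{B\sigma(j)}(x_i)\right).$$
The above equation satisfies the following properties:
1) $C(A,A) = E(A)$
2) $C(A,B) = C(B,A)$
The correlation coefficient between two HFSs A and B is then given by [21]:
$$\rho(A,B) = \frac{C(A,B)}{\sqrt{C(A,A)}\,\sqrt{C(B,B)}}.$$
If the length of $h_A(x_i)$ is less than that of $h_B(x_i)$, then $h_A(x_i)$ should be extended by adding its minimum value until it has the same length as $h_B(x_i)$.
This concept has also been applied to hesitant fuzzy sets [1-2].
The correlation coefficient has been further modified by [21] as follows:
$$\rho_w(A,B) = \frac{\sum_{i=1}^{n} w_i\left(\frac{1}{l_i}\sum_{j=1}^{l_i} h_{A\sigma(j)}(x_i)\, h_{B\sigma(j)}(x_i)\right)}{\sqrt{\sum_{i=1}^{n} w_i\left(\frac{1}{l_i}\sum_{j=1}^{l_i} h_{A\sigma(j)}^{2}(x_i)\right)}\,\sqrt{\sum_{i=1}^{n} w_i\left(\frac{1}{l_i}\sum_{j=1}^{l_i} h_{B\sigma(j)}^{2}(x_i)\right)}},$$
where $w = (w_1, w_2, \ldots, w_n)$ is the weight vector.
Definition 5: The novel correlation measure is an enhanced version of the correlation measure put forth in [17, 21].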
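The correlation coefficient of Definition 4 can be illustrated with a minimal Python sketch. It assumes the standard form of the measure (mean of pairwise products of the sorted membership values, normalized by the two informational energies) and pads the shorter hesitant fuzzy element with its minimum value, as described above. The function names `extend`, `correlation` and `correlation_coefficient` are hypothetical, and the optional weights `w` are indexed over the elements $x_i$.

```python
import math

def extend(h, length):
    """Sort an HFE in descending order and pad it with its minimum value."""
    return sorted(h, reverse=True) + [min(h)] * (length - len(h))

def correlation(a, b, w=None):
    """(Weighted) correlation C(A, B); a and b are lists of HFEs over the same x_i."""
    w = w if w is not None else [1.0] * len(a)
    total = 0.0
    for wi, ha, hb in zip(w, a, b):
        l = max(len(ha), len(hb))            # common length after padding
        ea, eb = extend(ha, l), extend(hb, l)
        total += wi * sum(x * y for x, y in zip(ea, eb)) / l
    return total

def correlation_coefficient(a, b, w=None):
    """rho(A, B) = C(A, B) / sqrt(C(A, A) * C(B, B))."""
    return correlation(a, b, w) / math.sqrt(correlation(a, a, w) * correlation(b, b, w))

# Two illustrative HFSs over four elements (values chosen only for demonstration)
A = [[0.80, 0.82], [0.40, 0.43], [0.81, 0.83], [0.43, 0.44]]
B = [[0.71, 0.74], [0.35, 0.36], [0.76, 0.77], [0.31]]
rho = correlation_coefficient(A, B)
```

Since the informational energy is simply `correlation(a, a)`, a single function covers both quantities, and uniform weights reduce the weighted form to the unweighted one.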
A systematic mathematical approach for calculation of attribute weights is provided in the following section.
The attribute weights are calculated on the basis of the entropy of the attributes rather than assuming random values [21].
Two entropy based ordered weighted attribute selection (EBOW-AS) approaches for hesitant fuzzy sets are provided to calculate the correlation measure.
Method 1: Let A and B be two hesitant fuzzy sets on $X=\{x_1, x_2, \ldots, x_n\}$ with an associated weight vector $W=\{w_1, w_2, \ldots, w_n\}$, where $w_i \ge 0$ and $\sum_{i=1}^{n} w_i = 1$. The proposed correlation-based EBOW-AS measure $\delta(A,B)$ for HFS satisfies the properties
(1) $\delta(A,B) = \delta(B,A)$;
(2) $0 \le \delta(A,B) \le 1$;
(3) $\delta(A,B) = 1$ if $A = B$.
The attribute weight vector W is defined in terms of E, the Shannon entropy, which indicates the importance of the attributes. E is in turn computed from S, the normalized score matrix described in [17].
Method 2: The correlation measure considering the attribute weights can also be calculated using a formula motivated by Zhang et al. For m hesitant fuzzy sets $A_j$ ($j = 1, 2, \ldots, m$), $C = (\rho_{ij})_{m \times m}$ is the correlation matrix as stated in Definition 4.
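As a rough illustration of entropy-based weighting, the sketch below uses the common entropy weight method: each attribute's scores are normalized to sum to one, the Shannon entropy is scaled by $\ln m$, and the weights are proportional to $1 - E_j$. This normalization is an assumption rather than the formula used in this paper, so it is not guaranteed to reproduce the entropy and weight values reported in the Results section. The function name `entropy_weights` and the sample score matrix are hypothetical.

```python
import math

def entropy_weights(scores):
    """scores[i][j] is the (positive) score of attribute j for example i; needs m > 1 examples."""
    m, n = len(scores), len(scores[0])
    entropies = []
    for j in range(n):
        column = [scores[i][j] for i in range(m)]
        total = sum(column)
        p = [v / total for v in column]                      # normalized scores
        e = -sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(m)
        entropies.append(e)
    spread = [1.0 - e for e in entropies]                    # degree of diversification
    s = sum(spread)
    if s == 0:                                               # all attributes equally uninformative
        return entropies, [1.0 / n] * n
    return entropies, [d / s for d in spread]

# Hypothetical score matrix: 2 examples x 2 attributes
E, W = entropy_weights([[0.9, 0.5], [0.1, 0.5]])
```

An attribute whose scores are identical across examples has entropy 1 and receives zero weight, which matches the intuition that it carries no discriminating information.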
Material and Methods
In this section, we present the enhanced clustering algorithm pertaining to Entropy Based Ordered Weighted Attribute Selection (EBOW-AS) for hesitant fuzzy sets. A detailed description is given below.
Step 1: Let $\{A_j\}$ be the set of attributes of a hesitant fuzzy set (HFS) in $X=\{x_i\}$. Form the hesitant fuzzy decision matrix D, whose entries are the hesitant fuzzy elements $h_{A_j}(x_i)$.
Step 2: Determine the normalized score matrix S, where S is the score matrix according to Definition 3.
Step 3: Determine the attribute weights from entropy, as given in Definition 4, for j = 1, ..., n.
Step 4: Calculate the correlation coefficients and the correlation matrix as discussed in Definition 5.
Step 5: Calculate the resultant equivalent correlation matrix $C_\lambda$ for a deviation threshold λ. Elements of the correlation matrix $C = (c_{ij})$ that are greater than or equal to the threshold are replaced by 1, and elements that are less than the threshold are replaced by 0, i.e.
$$c^{\lambda}_{ij} = \begin{cases} 1, & c_{ij} \ge \lambda \\ 0, & c_{ij} < \lambda \end{cases}$$
The resultant thresholded matrix indicates the required informative attributes and also shows the redundant attributes.
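The thresholding in Step 5 can be sketched in a few lines of Python. The function name `lambda_cut` and the sample matrix are hypothetical and serve only to show the rule (entries greater than or equal to λ become 1, all others become 0).

```python
def lambda_cut(c, lam):
    """Replace entries >= lam by 1 and entries < lam by 0."""
    return [[1 if value >= lam else 0 for value in row] for row in c]

# Hypothetical 2 x 2 correlation matrix
C = [[1.00, 0.85],
     [0.85, 1.00]]
cut_high = lambda_cut(C, 0.9)   # the off-diagonal entries fall below the threshold
cut_low = lambda_cut(C, 0.8)    # every entry passes the threshold
```

Lowering λ can only turn zeros into ones, so a smaller threshold merges more attributes into the same group.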
Results and Discussion
An illustration of the proposed Entropy Based Ordered Weighted Attribute Selection approach, an algorithm inspired by the clustering algorithm given in [1-2,21], is discussed below. An HFS clustering algorithm for clustering different objects or instances is given in [21]. The idea here is to apply a novel entropy-based attribute weighting algorithm to attribute reduction of hesitant fuzzy sets. The algorithm is explained in detail through an example of attribute reduction for an ECG dataset.
Let x1, x2, x3, x4 be 4 examples for ECG classification and A1, A2, A3, A4 be the attributes for feature selection, namely:
A1 is QRS_width
A2 is Rpeak
A3 is ST_segment duration
A4 is Pwave onset
Initially, the weights of the attributes are not known, and the hesitant fuzzy decision matrix is as shown in Table 1.
Table 1. Hesitant Fuzzy Decision Matrix
D  | x1          | x2          | x3          | x4
A1 | {0.8,0.82}  | {0.4,0.43}  | {0.81,0.83} | {0.43,0.44}
A2 | {0.81,0.84} | {0.85,0.86} | {0.66,0.67} | {0.61,0.64}
A3 | {0.51,0.54} | {0.45,0.46} | {0.56,0.57} | {0.41,0.44}
A4 | {0.71,0.74} | {0.35,0.36} | {0.76,0.77} | {0.31,0.34}
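To make the score step concrete, the following Python sketch computes a score for every entry of Table 1, assuming that the score of a hesitant fuzzy element is the arithmetic mean of its values. The dictionary layout and the names `D`, `score` and `S` are illustrative choices, not part of the original method description.

```python
# Hesitant fuzzy decision matrix from Table 1: attribute -> HFEs for x1..x4
D = {
    "A1": [[0.80, 0.82], [0.40, 0.43], [0.81, 0.83], [0.43, 0.44]],
    "A2": [[0.81, 0.84], [0.85, 0.86], [0.66, 0.67], [0.61, 0.64]],
    "A3": [[0.51, 0.54], [0.45, 0.46], [0.56, 0.57], [0.41, 0.44]],
    "A4": [[0.71, 0.74], [0.35, 0.36], [0.76, 0.77], [0.31, 0.34]],
}

def score(h):
    """Score of an HFE, assumed here to be the mean of its membership values."""
    return sum(h) / len(h)

# Score matrix with the same layout as Table 1
S = {name: [score(h) for h in row] for name, row in D.items()}
```

For example, the entry {0.8, 0.82} for A1 and x1 has a score of about 0.81 under this assumption.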
For attribute reduction, the correlation coefficients of the attributes need to be calculated. With the weight vector w = [0.12 0.56 0.08 0.24], the correlation matrix is calculated according to the algorithm proposed in [18]. The calculation of the correlation matrix is followed by the calculation of the equivalent correlation matrix, which helps determine the optimal attributes according to the value of the threshold λ. The above algorithm, as given by Chen, has been enhanced to provide a mathematical approach for calculating the weight vector, which is not given in [21]. Accordingly, the steps to be followed in the proposed Entropy Based Ordered Weighted Attribute Selection (EBOW-AS) approach for hesitant fuzzy sets are as follows.
Step 1: Determine the hesitant fuzzy decision matrix D from the given data, as shown in Table 1.
Step 2: Determine the normalized score matrix S, where S is the score matrix according to Definition 3.
Step 3: Determine the attribute weights from entropy, a significant step that finally determines the significance of the attributes [17,19]. For j = 1, ..., n, this gives
E = [0.99 0.95 0.99 0.98]
W = [0.12 0.56 0.08 0.24]
This is the weight vector used in the correlation matrix calculation described above.
Step 4: Calculate the correlation coefficients to generate the correlation matrix using Method 2, which is a modified approach [17].
The modified approach involves the clustering of the attributes by calculating the resultant equivalent matrix to generate the non-redundant informative attributes.
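The clustering idea can be sketched as follows, assuming that the equivalent matrix is obtained by repeated max-min composition of the correlation matrix with itself, as is common for fuzzy equivalence matrices; the composition rule is an assumption here. Attributes whose columns coincide after thresholding are placed in the same cluster. The function names and the sample matrix are hypothetical.

```python
def compose(a, b):
    """Max-min composition of two square matrices."""
    n = len(a)
    return [[max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]

def equivalent_matrix(c):
    """Compose c with itself until it no longer changes."""
    while True:
        c2 = compose(c, c)
        if c2 == c:
            return c
        c = c2

def clusters(c, lam):
    """Group indices whose columns are identical in the lambda-cut of c."""
    cut = [[1 if v >= lam else 0 for v in row] for row in c]
    groups = {}
    for j in range(len(cut)):
        column = tuple(row[j] for row in cut)
        groups.setdefault(column, []).append(j)
    return list(groups.values())

# Hypothetical symmetric correlation matrix with unit diagonal
C = [[1.00, 0.95, 0.60],
     [0.95, 1.00, 0.70],
     [0.60, 0.70, 1.00]]
C_eq = equivalent_matrix(C)
groups_high = clusters(C_eq, 0.9)   # indices 0 and 1 share a cluster
groups_low = clusters(C_eq, 0.7)    # all indices share a cluster
```

Because max-min composition only selects existing entries, the exact equality test in `equivalent_matrix` is safe for floating-point values, and the loop terminates for a reflexive matrix.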
The correlation coefficients are as given in [17]. Let $A_j$ ($j = 1, 2, \ldots, m$) be m HFSs and C be their correlation matrix.
Step 5: Calculate the resultant equivalent correlation matrix and, for a deviation threshold λ, find the thresholded matrix $C_\lambda$. Elements of the correlation matrix with values greater than or equal to the threshold are replaced by 1, and elements with values less than the threshold are replaced by 0. If the elements of the i-th line (column) of $C_\lambda$ are the same as the corresponding elements of the j-th line (column), the corresponding HFSs are of the same type.
Step 6: Evaluate $C_\lambda$ for the weighted attribute reduction of the ECG example at deviation thresholds of λ=0.9 and λ=0.8. At the lower threshold, the attribute similarity clusters would all look alike, i.e. the redundant attributes would be large in number. The proper deviation threshold depends on the application. Selecting a deviation threshold of 0.9 shows that two of the attributes are redundant while the other two are non-redundant, or essential, attributes. The above approach provides an appropriate model for attribute reduction and clustering on the basis of entropy. This approach has been compared with classical approaches that use distance-based similarity measures to rank the significance of attributes. The results are tabulated below in Table 2.
Table 2. Comparative Analysis
Method | Measure values | Attribute ranking
Relative closeness measure [22] based on ordered weighted square root distance measure | 0.51>0.49>0.48>0.38 | A4>A1>A3>A2
Ordered weighted geometric method [3] | 0.80>0.78>0.77>0.63 | A4>A3>A1>A2
The tabulated results indicate the order of significance of the attributes; however, these methods cannot identify the redundant attributes. The proposed entropy-based model for hesitant fuzzy sets is able to effectively determine the optimal informative attributes. Moreover, the mathematical model proposed for calculating the weight vector for hesitant fuzzy sets is able to provide an exact evaluation of significant instances.
Conclusion
In this paper, we have proposed a novel Entropy Based Ordered Weighted Attribute Selection approach for hesitant fuzzy sets to select optimal attributes. The development of a comprehensive mathematical model to calculate the weight vector for hesitant fuzzy sets is a novel feature of the proposed work. A comparison with the existing distance-based measures has been carried out to demonstrate the efficacy of the proposed model. The illustrative examples indicate the efficiency and simplicity of the developed HFS application. Future directions include the development of a generic mathematical model for weight calculation that supports an appropriate selection of the most informative attributes. This work can also be extended to the development of an appropriate classification model for hesitant fuzzy sets.