Vector similarity measures of hesitant fuzzy linguistic term sets and their applications.

Abstract

In decision making, similarity and distance measures are crucial for determining the relationship between two objects, and the subject has received much attention from researchers. In this study, we propose two novel similarity measures between hesitant fuzzy linguistic term sets (HFLTSs). In addition, two extensions of the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) are proposed for hesitant fuzzy linguistic environments. An application example concerning traditional Chinese medical diagnosis and a multi-criteria decision making (MCDM) problem are given to illustrate the applicability and validity of these similarity measures. The results of the examples further indicate that the Dice and Jaccard similarity measures are more reasonable than the cosine similarity measure for HFLTSs.

Year: 2017 PMID: 29261710 PMCID: PMC5738036 DOI: 10.1371/journal.pone.0189579

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

Introduction

Similarity measures and distances are widely used to determine the relationship between two individuals in many domains [1-8], including medical engineering, decision making, pattern recognition, and network comparison. The “classical” similarity measures comprise Dice’s measure, Jaccard’s measure, the cosine formula, the overlap measures, and Pearson’s correlation coefficient [9]. The Dice, Jaccard, and cosine measures are vector similarity measures. With the development of fuzzy sets, the “classical” similarity measures have been extended to various fuzzy environments [10-24]. Xu et al. [19] introduced cosine similarity measures for the hesitant fuzzy environment. Zhang et al. [24] defined an integrated similarity measure based on the Dice and cosine measures for intuitionistic fuzzy sets. Ye [22] defined the Dice similarity measure for intuitionistic fuzzy sets. Ye [23] extended the Dice, Jaccard, and cosine similarity measures to hesitant fuzzy sets and, through an example, pointed out that the Dice and Jaccard measures are more reasonable than the cosine measure when applied to hesitant fuzzy sets. Chiclana et al. [1] carried out a statistical comparative study of how five distance functions (Manhattan, Euclidean, cosine, Dice, and Jaccard) affect the consensus process in group decision-making problems.

In some practical problems, decision makers may hesitate among several linguistic terms when expressing an assessment, so a single linguistic term is often insufficient. To deal with this situation, Rodríguez et al. [25] introduced the hesitant fuzzy linguistic term set (HFLTS), a powerful structure for reflecting decision makers’ hesitant attitudes [26]. Moreover, different similarity measures for HFLTSs have been put forth [15-17]; for example, Liao et al. [16] introduced cosine similarity measures for HFLTSs.

In this paper, based on vector similarity measures, we extend the Dice and Jaccard measures to HFLTSs. Moreover, the Dice-distance-based and Jaccard-distance-based technique for order of preference by similarity to an ideal solution (HFL-TOPSIS) methods are established. The examples show that the extended Dice and Jaccard measures for HFLTSs are more reasonable than the cosine measure.

Similarity measures with hesitant fuzzy linguistic term set

Vector similarity functions

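In the following, we introduce two classical vector similarity measures: Dice similarity [27] and Jaccard similarity [28]. Given two vectors $X = (x_1, x_2, \dots, x_n)$ and $Y = (y_1, y_2, \dots, y_n)$, they are defined as

$$D(X,Y) = \frac{2\,X \cdot Y}{\|X\|_2^2 + \|Y\|_2^2} = \frac{2\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} y_i^2},$$

$$J(X,Y) = \frac{X \cdot Y}{\|X\|_2^2 + \|Y\|_2^2 - X \cdot Y} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} y_i^2 - \sum_{i=1}^{n} x_i y_i}.$$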
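As an illustration, the classical vector forms, 2X·Y / (‖X‖² + ‖Y‖²) for Dice and X·Y / (‖X‖² + ‖Y‖² − X·Y) for Jaccard, can be sketched in Python next to the cosine formula. The function names and the sample vectors are illustrative choices, not taken from the paper.

```python
import math


def _dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def dice(x, y):
    """Dice similarity: 2 X.Y / (|X|^2 + |Y|^2)."""
    return 2 * _dot(x, y) / (_dot(x, x) + _dot(y, y))


def jaccard(x, y):
    """Jaccard similarity: X.Y / (|X|^2 + |Y|^2 - X.Y)."""
    d = _dot(x, y)
    return d / (_dot(x, x) + _dot(y, y) - d)


def cosine(x, y):
    """Cosine similarity: X.Y / (|X| |Y|)."""
    return _dot(x, y) / math.sqrt(_dot(x, x) * _dot(y, y))


# Illustrative vectors (hypothetical): Y is exactly twice X.
x, y = (1, 2, 3), (2, 4, 6)
print(dice(x, y), jaccard(x, y), cosine(x, y))
```

For these two parallel vectors the cosine depends only on the angle and therefore equals 1, whereas Dice (0.8) and Jaccard (about 0.667) also react to the difference in magnitude.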

Hesitant fuzzy linguistic term set

Definition 1. [16] Let $x_i \in X$ ($i = 1,2,\dots,N$) and let $S = \{s_t \mid t = -\tau,\dots,-1,0,1,\dots,\tau\}$ be a linguistic term set. A hesitant fuzzy linguistic term set (HFLTS) $H_S$ on $X$ is denoted as

$$H_S = \{\langle x_i, h_S(x_i)\rangle \mid x_i \in X\},$$

where $h_S(x_i)$ can be written as

$$h_S(x_i) = \{s_{\varphi_l}(x_i) \mid s_{\varphi_l}(x_i) \in S,\ l = 1,\dots,L(x_i)\},$$

where $\varphi_l \in \{-\tau,\dots,-1,0,1,\dots,\tau\}$ is the subscript of a linguistic term and $L(x_i)$ is the total number of linguistic terms in $h_S(x_i)$.

The linguistic terms of an HFLTS might be unordered. For simplicity, we arrange them in ascending or descending order. The ascending order rule arranges the linguistic terms from the smallest to the largest subscript, whereas the descending order rule does the opposite. Different HFLTSs generally contain different numbers of linguistic terms. Zhu and Xu [29] recommend a method for extending the shorter HFLTS until it has the same length as the longer one. The extension rule depends mainly on the risk preference of the decision maker: adding the maximum value, the minimum value, or the mean value corresponds to the optimistic, pessimistic, and neutral rules, respectively. Without loss of generality, we extend the shorter sets according to the neutral rule in this paper.
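The extension step can be sketched in Python as follows. This is a minimal sketch under stated assumptions: an element of an HFLTS is represented by the list of its term subscripts, the function name `extend_hfle` is hypothetical, and the "mean value" of the neutral rule is taken to be the midpoint of the smallest and largest subscript, which is one possible reading of the rule.

```python
def extend_hfle(subscripts, length, rule="neutral"):
    """Pad a list of linguistic-term subscripts to `length`, sorted ascending.

    rule = "optimistic"  -> repeat the largest subscript
    rule = "pessimistic" -> repeat the smallest subscript
    rule = "neutral"     -> repeat the midpoint of smallest and largest
                            (assumed reading of "mean value")
    """
    terms = sorted(subscripts)
    if length < len(terms):
        raise ValueError("target length is shorter than the set itself")
    lo, hi = terms[0], terms[-1]
    pad = {"optimistic": hi, "pessimistic": lo, "neutral": (lo + hi) / 2}[rule]
    return sorted(terms + [pad] * (length - len(terms)))


# {s1, s2} extended to three terms under each rule
print(extend_hfle([1, 2], 3))                 # neutral
print(extend_hfle([1, 2], 3, "optimistic"))
print(extend_hfle([1, 2], 3, "pessimistic"))
```

Under the neutral rule, {s1, s2} becomes {s1, s1.5, s2}; the optimistic and pessimistic rules give {s1, s2, s2} and {s1, s1, s2}.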

Similarity measures for HFLTSs

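Let $S = \{s_t \mid t = -\tau,\dots,-1,0,1,\dots,\tau\}$ be a linguistic term set. For two HFLTSs $H_S^k = \{\langle x_i, h_S^k(x_i)\rangle \mid x_i \in X\}$ ($k = 1,2$) with $h_S^k(x_i) = \{s_{\varphi_l^k}(x_i) \mid l = 1,\dots,L_i\}$, Liao et al. [16] defined the information energy of $H_S^k$ and the correlation between $H_S^1$ and $H_S^2$ as

$$E(H_S^k) = \sum_{i=1}^{N}\left(\frac{1}{L_i}\sum_{l=1}^{L_i}\left(\frac{|\varphi_l^k(x_i)|}{2\tau+1}\right)^2\right), \qquad C(H_S^1,H_S^2) = \sum_{i=1}^{N}\left(\frac{1}{L_i}\sum_{l=1}^{L_i}\frac{|\varphi_l^1(x_i)|}{2\tau+1}\cdot\frac{|\varphi_l^2(x_i)|}{2\tau+1}\right),$$

respectively. Here, $L_i$ is the maximum number of linguistic terms in $h_S^1(x_i)$ or $h_S^2(x_i)$ (the shorter of the two is extended to the same length), and $N$ is the cardinality of $X$. Based on the definitions of the information energy and correlation of HFLTSs, two vector similarity measures for HFLTSs are proposed.

Definition 2. The Dice similarity measure between $H_S^1$ and $H_S^2$ is defined as

$$S_D(H_S^1,H_S^2) = \frac{2\,C(H_S^1,H_S^2)}{E(H_S^1) + E(H_S^2)}. \tag{6}$$

Theorem 1. The similarity measure $S_D$ between the HFLTSs $H_S^1$ and $H_S^2$ possesses the following properties:

(1) $S_D(H_S^1,H_S^2) = S_D(H_S^2,H_S^1)$;
(2) $S_D(H_S^1,H_S^2) = 1$ if and only if $H_S^1 = H_S^2$;
(3) $0 \le S_D(H_S^1,H_S^2) \le 1$.

Proof: (1) and (2) are obvious. (3) It is obvious that $S_D(H_S^1,H_S^2) \ge 0$. According to the inequality $a^2 + b^2 \ge 2ab$, we have $E(H_S^1) + E(H_S^2) \ge 2\,C(H_S^1,H_S^2)$. Thus, we obtain $S_D(H_S^1,H_S^2) \le 1$, and property (3) holds. □

Definition 3. The Jaccard similarity measure between $H_S^1$ and $H_S^2$ is defined as

$$S_J(H_S^1,H_S^2) = \frac{C(H_S^1,H_S^2)}{E(H_S^1) + E(H_S^2) - C(H_S^1,H_S^2)}. \tag{7}$$

Theorem 2. The similarity measure $S_J$ between the HFLTSs $H_S^1$ and $H_S^2$ possesses the following properties:

(1) $S_J(H_S^1,H_S^2) = S_J(H_S^2,H_S^1)$;
(2) $S_J(H_S^1,H_S^2) = 1$ if and only if $H_S^1 = H_S^2$;
(3) $0 \le S_J(H_S^1,H_S^2) \le 1$.

Proof: (1) and (2) are obvious. (3) It is obvious that $S_J(H_S^1,H_S^2) \ge 0$. According to the inequality $a^2 + b^2 \ge 2ab$, we have $E(H_S^1) + E(H_S^2) - C(H_S^1,H_S^2) \ge C(H_S^1,H_S^2)$. Thus, we obtain $S_J(H_S^1,H_S^2) \le 1$, and property (3) holds. □

Example 1. Let $S = \{s_\alpha \mid \alpha = -3,\dots,-1,0,1,\dots,3\}$ be a linguistic term set and let $H_S^1$ and $H_S^2$ be two HFLTSs on $S$. The shorter of the two is extended to the length of the longer one by adding the linguistic term $s_{1.5}$, after which the similarity measure between $H_S^1$ and $H_S^2$ is obtained from the definitions above.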
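A minimal Python sketch of the two measures follows. It rests on several assumptions: an HFLTS is a list of elements, each element a list of term subscripts; the information energy and correlation are taken as per-element averages of squared absolute subscripts and of products of absolute subscripts after neutral extension and ascending sorting; any constant normalization of the subscripts is omitted because it cancels in both ratios. All function names are hypothetical.

```python
def extend(terms, length):
    """Neutral extension: pad with the midpoint of min and max, sort ascending."""
    terms = sorted(terms)
    pad = (terms[0] + terms[-1]) / 2
    return sorted(terms + [pad] * (length - len(terms)))


def energy_and_correlation(H1, H2):
    """Return (E1, E2, C) for two HFLTSs given as lists of subscript lists."""
    e1 = e2 = c = 0.0
    for h1, h2 in zip(H1, H2):
        L = max(len(h1), len(h2))
        a = [abs(t) for t in extend(h1, L)]
        b = [abs(t) for t in extend(h2, L)]
        e1 += sum(p * p for p in a) / L
        e2 += sum(q * q for q in b) / L
        c += sum(p * q for p, q in zip(a, b)) / L
    return e1, e2, c


def hflts_dice(H1, H2):
    """Dice form: 2C / (E1 + E2). Undefined if both energies are zero."""
    e1, e2, c = energy_and_correlation(H1, H2)
    return 2 * c / (e1 + e2)


def hflts_jaccard(H1, H2):
    """Jaccard form: C / (E1 + E2 - C). Undefined if both energies are zero."""
    e1, e2, c = energy_and_correlation(H1, H2)
    return c / (e1 + e2 - c)


# Subscripts of Kevin's symptoms and of the pneumonia profile (Tables 1 and 2)
kevin = [[1], [-1, 0], [2], [-3]]
pneumonia = [[0, 1], [-1, 0], [2, 3], [-3]]
print(round(hflts_dice(kevin, pneumonia), 4), round(hflts_jaccard(kevin, pneumonia), 4))
```

For this pair the sketch gives 30/31 ≈ 0.9677 and 15/16 = 0.9375, the values listed for Kevin and pneumonia in Tables 3 and 4.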

Weighted similarity measures for HFLTSs

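Let $w_i$ be the weight of element $x_i$ ($i = 1,2,\dots,N$), with $w_i \in [0,1]$ and $\sum_{i=1}^{N} w_i = 1$. For two HFLTSs $H_S^1$ and $H_S^2$, write $e_i^k = \frac{1}{L_i}\sum_{l=1}^{L_i}\left(\frac{|\varphi_l^k(x_i)|}{2\tau+1}\right)^2$ and $c_i = \frac{1}{L_i}\sum_{l=1}^{L_i}\frac{|\varphi_l^1(x_i)|}{2\tau+1}\cdot\frac{|\varphi_l^2(x_i)|}{2\tau+1}$ for the contributions of $x_i$ to the information energy of $H_S^k$ and to the correlation. Then, the similarity measure formulas given in Eqs (6) and (7) can be extended to the weighted Dice and Jaccard similarity measures as follows:

$$S_D^w(H_S^1,H_S^2) = \frac{2\sum_{i=1}^{N} w_i c_i}{\sum_{i=1}^{N} w_i e_i^1 + \sum_{i=1}^{N} w_i e_i^2}, \tag{8}$$

$$S_J^w(H_S^1,H_S^2) = \frac{\sum_{i=1}^{N} w_i c_i}{\sum_{i=1}^{N} w_i e_i^1 + \sum_{i=1}^{N} w_i e_i^2 - \sum_{i=1}^{N} w_i c_i}. \tag{9}$$

It is obvious that, if $w = (1/N, 1/N, \dots, 1/N)^T$, then Eqs (8) and (9) reduce to Eqs (6) and (7), respectively. Likewise, each of the two weighted similarity measures possesses the following properties: (1) its value lies in $[0,1]$; (2) it is symmetric in $H_S^1$ and $H_S^2$; (3) it equals 1 if and only if $H_S^1 = H_S^2$.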
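The weighted measures can be sketched in the same way. The sketch assumes that each weight multiplies the per-element contribution to the correlation and to the two information energies before the ratio is formed, which is the form that reduces to the unweighted measures for equal weights. The function names and the weight vector in the usage lines are hypothetical.

```python
def extend(terms, length):
    """Neutral extension: pad with the midpoint of min and max, sort ascending."""
    terms = sorted(terms)
    pad = (terms[0] + terms[-1]) / 2
    return sorted(terms + [pad] * (length - len(terms)))


def contributions(h1, h2):
    """Per-element (e1, e2, c) from absolute subscripts after extension."""
    L = max(len(h1), len(h2))
    a = [abs(t) for t in extend(h1, L)]
    b = [abs(t) for t in extend(h2, L)]
    e1 = sum(p * p for p in a) / L
    e2 = sum(q * q for q in b) / L
    c = sum(p * q for p, q in zip(a, b)) / L
    return e1, e2, c


def weighted_sums(H1, H2, w):
    """Weighted totals (E1, E2, C); weights are applied inside the sums (assumed form)."""
    parts = [contributions(h1, h2) for h1, h2 in zip(H1, H2)]
    e1 = sum(wi * p[0] for wi, p in zip(w, parts))
    e2 = sum(wi * p[1] for wi, p in zip(w, parts))
    c = sum(wi * p[2] for wi, p in zip(w, parts))
    return e1, e2, c


def weighted_dice(H1, H2, w):
    e1, e2, c = weighted_sums(H1, H2, w)
    return 2 * c / (e1 + e2)


def weighted_jaccard(H1, H2, w):
    e1, e2, c = weighted_sums(H1, H2, w)
    return c / (e1 + e2 - c)


kevin = [[1], [-1, 0], [2], [-3]]
pneumonia = [[0, 1], [-1, 0], [2, 3], [-3]]
w = [0.4, 0.3, 0.2, 0.1]  # hypothetical weights, for illustration only
print(weighted_dice(kevin, pneumonia, w), weighted_jaccard(kevin, pneumonia, w))
```

With equal weights of 0.25 the two functions return the unweighted values, since the common factor cancels between numerator and denominator.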

Ordered weighted similarity measure for HFLTSs

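Inspired by the OWA operators proposed by Yager [30], Liao et al. [17] defined the ordered weighted correlation of any two HFLTSs, $H_S^1$ and $H_S^2$, in which the per-element terms are weighted in the order given by a permutation $\zeta(1),\zeta(2),\dots,\zeta(N)$ of the element indices. Here, $w_i$ are the weights of the ordered positions for the elements $x_i$ ($i = 1,2,\dots,N$), with $\sum_{i=1}^{N} w_i = 1$. Similarly, the ordered weighted information energy of an HFLTS is defined. Replacing the weighted correlation and information energy in Eqs (8) and (9) with their ordered weighted counterparts extends Eqs (8) and (9) to Eqs (13) and (14), the ordered weighted Dice and Jaccard similarity measures, respectively. Each of the two ordered weighted similarity measures also has the following properties: (1) its value lies in $[0,1]$; (2) it is symmetric in $H_S^1$ and $H_S^2$; (3) it equals 1 if and only if $H_S^1 = H_S^2$.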

The Dice-distance-based HFL-TOPSIS method

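In recent years, multi-criteria decision making (MCDM) methods [31-36] have been developed and widely applied in diverse fields, such as water resource utilization, energy management, machine tool evaluation, and supplier selection. TOPSIS is a simple and widely used MCDM method [37-39] that orders alternatives by their closeness to an ideal solution [40-42]. With the development of fuzzy sets, TOPSIS has been extended to fuzzy environments [43-48]. Furthermore, distance and similarity measures can be transformed into each other. Liao et al. [15] defined this relationship for HFLTSs as follows:

$$d(H_S^1,H_S^2) = 1 - \rho(H_S^1,H_S^2), \tag{15}$$

where $\rho$ is a similarity measure and $d$ is the corresponding distance measure. The distance measures corresponding to the Dice and Jaccard similarity measures can then be obtained easily using Eq (15). Inspired by the cosine-distance-based HFL-TOPSIS method [16], the Dice-distance-based HFL-TOPSIS method can be defined as follows:

Step 1. Let $A = \{A_1,A_2,\dots,A_m\}$ and $C = \{C_1,C_2,\dots,C_n\}$ be a set of alternatives and a set of criteria, respectively. Let $w_j$ be the weight of criterion $C_j$, where $w_j \in [0,1]$ and $\sum_{j=1}^{n} w_j = 1$. The characteristics of alternative $A_i$ with respect to criterion $C_j$ are represented by a hesitant fuzzy linguistic element (HFLE) $h_S^{ij}$, where $S = \{s_t \mid t = -\tau,\dots,-1,0,1,\dots,\tau\}$ is a linguistic term set.

Step 2. The positive ideal solution, $A^+ = (h_S^{1+},h_S^{2+},\dots,h_S^{n+})$, and the negative ideal solution, $A^- = (h_S^{1-},h_S^{2-},\dots,h_S^{n-})$, are constructed by taking, for each benefit criterion $C_j$, $h_S^{j+} = \max_i h_S^{ij}$ and $h_S^{j-} = \min_i h_S^{ij}$, and the reverse for each cost criterion.

Step 3. Construct the positive ideal distance matrix, $D^+ = \left(d(h_S^{ij},h_S^{j+})\right)_{m \times n}$, and the negative ideal distance matrix, $D^- = \left(d(h_S^{ij},h_S^{j-})\right)_{m \times n}$, where the distance between two HFLEs, $h_S^1$ and $h_S^2$, with term subscripts $\varphi_l^1$ and $\varphi_l^2$ ($l = 1,\dots,L$), is given by

$$d_D(h_S^1,h_S^2) = 1 - \frac{2\sum_{l=1}^{L} |\varphi_l^1|\,|\varphi_l^2|}{\sum_{l=1}^{L} (\varphi_l^1)^2 + \sum_{l=1}^{L} (\varphi_l^2)^2}. \tag{16}$$

Step 4. Calculate the closeness coefficient of each alternative,

$$RC(A_i) = \frac{D(A_i,A^-)}{D(A_i,A^+) + D(A_i,A^-)}, \tag{17}$$

where $D(A_i,A^+) = \sum_{j=1}^{n} w_j\, d(h_S^{ij},h_S^{j+})$ and $D(A_i,A^-) = \sum_{j=1}^{n} w_j\, d(h_S^{ij},h_S^{j-})$.

Step 5. Rank the alternatives in decreasing order of $RC(A_i)$.

In the same way, the distance used in the Jaccard-distance-based HFL-TOPSIS method is given by

$$d_J(h_S^1,h_S^2) = 1 - \frac{\sum_{l=1}^{L} |\varphi_l^1|\,|\varphi_l^2|}{\sum_{l=1}^{L} (\varphi_l^1)^2 + \sum_{l=1}^{L} (\varphi_l^2)^2 - \sum_{l=1}^{L} |\varphi_l^1|\,|\varphi_l^2|}. \tag{18}$$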
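Steps 3 to 5 can be sketched in Python as below. The sketch assumes that the distance is one minus the similarity, that the similarity between two HFLEs is formed from absolute subscripts after neutral extension, and that the closeness coefficient is the weighted distance to the negative ideal divided by the sum of the two weighted distances. The inputs are the alternatives A1 and A3, the ideal solutions and the weights quoted in Example 3; all function names are hypothetical.

```python
def extend(terms, length):
    """Neutral extension: pad with the midpoint of min and max, sort ascending."""
    terms = sorted(terms)
    pad = (terms[0] + terms[-1]) / 2
    return sorted(terms + [pad] * (length - len(terms)))


def hfle_distance(h1, h2, kind="dice"):
    """1 - similarity between two HFLEs given as lists of term subscripts."""
    L = max(len(h1), len(h2))
    a = [abs(t) for t in extend(h1, L)]
    b = [abs(t) for t in extend(h2, L)]
    c = sum(p * q for p, q in zip(a, b))
    e = sum(p * p for p in a) + sum(q * q for q in b)
    if e == 0:
        return 0.0  # both elements are {s0}: treated as identical (assumption)
    sim = 2 * c / e if kind == "dice" else c / (e - c)
    return 1 - sim


def closeness(alt, pos, neg, weights, kind="dice"):
    """Closeness coefficient: D(A, A-) / (D(A, A+) + D(A, A-))."""
    d_pos = sum(w * hfle_distance(h, p, kind) for w, h, p in zip(weights, alt, pos))
    d_neg = sum(w * hfle_distance(h, n, kind) for w, h, n in zip(weights, alt, neg))
    return d_neg / (d_pos + d_neg)


# Data quoted in Example 3 (term subscripts only)
W = [0.3, 0.5, 0.2]
POS = [[2, 3], [2, 3], [3]]
NEG = [[1, 2, 3], [1, 2, 3], [-2, -1, 0]]
ALTS = {"A1": [[1, 2, 3], [2, 3], [1, 2, 3]], "A3": [[2, 3], [1, 2, 3], [3]]}

for kind in ("dice", "jaccard"):
    rc = {name: closeness(alt, POS, NEG, W, kind) for name, alt in ALTS.items()}
    ranking = sorted(rc, key=rc.get, reverse=True)
    print(kind, {k: round(v, 4) for k, v in rc.items()}, ranking)
```

For these inputs the sketch gives closeness coefficients of about 0.79 for A1 and 0.84 for A3 under the Dice distance, and about 0.74 and 0.80 under the Jaccard distance, in line with the values reported in Example 3, with A3 ranked ahead of A1 in both cases.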

Application of the similarity measures of HFLTS

Example 2 [17]. In traditional Chinese medical diagnosis, a doctor obtains imprecise information about a patient’s symptoms, such as temperature, headache, cough, and stomach pain, through seeing, smelling, asking, and touching. Assume that a doctor wants to make a proper diagnosis for a patient based on four symptoms, V = {temperature, headache, cough, stomach pain}, with four possible diseases, D = {Viral infection, Typhoid, Pneumonia, Stomach problem}. Each symptom can be seen as a linguistic variable with a corresponding linguistic term set. According to existing experience, we can generate the following knowledge-based data set in terms of HFLTSs (see Table 1). Assume there are four patients, P = {Richard, Catherine, Nicole, Kevin}, whose symptoms, given as linguistic expressions, can be transformed into HFLTSs (Table 2).

Table 1

Symptoms characteristic for the considered diagnosis in terms of HFLTSs.

	Temperature	Headache	Cough	Stomach pain
Viral infection	{s₁,s₂,s₃}	{s₀,s₁,s₂}	{s₁,s₂,s₃}	{s₋₃}
Typhoid	{s₂,s₃}	{s₁,s₂,s₃}	{s₁,s₂,s₃}	{s₋₃,s₋₂}
Pneumonia	{s₀,s₁}	{s₋₁,s₀}	{s₂,s₃}	{s₋₃}
Stomach problem	{s₀}	{s₋₃}	{s₋₃}	{s₁,s₂,s₃}

Table 2

Symptoms characteristic for the considered patients in terms of HFLTSs.

	Temperature	Headache	Cough	Stomach pain
Richard	{s₂}	{s₂}	{s₁,s₂}	{s₋₃}
Catherine	{s₀}	{s₋₃}	{s₋₃}	{s₁,s₂}
Nicole	{s₃}	{s₁}	{s₂}	{s₋₃}
Kevin	{s₁}	{s₋₁,s₀}	{s₂}	{s₋₃}

To diagnose these four patients, we calculate the similarity between each patient’s symptoms and the symptom profile of each diagnosis. We use the two new similarity measures, the Dice and Jaccard measures for HFLTSs, to derive the relationship between each patient and each disease. The resulting Dice and Jaccard similarity values are displayed in Tables 3 and 4, respectively.

Table 3

Dice similarity values between each patient’s symptoms and each possible diagnosis.

	Viral infection	Typhoid	Pneumonia	Stomach problem
Richard	0.9283	0.9482	0.8333	0.7826
Catherine	0.6667	0.7237	0.7297	0.9884
Nicole	0.9302	0.9265	0.8101	0.6569
Kevin	0.8792	0.7964	0.9677	0.7265

Table 4

Jaccard similarity values between each patient’s symptoms and each possible diagnosis.

	Viral infection	Typhoid	Pneumonia	Stomach problem
Richard	0.8661	0.9015	0.7143	0.6428
Catherine	0.5000	0.5671	0.5745	0.9771
Nicole	0.8696	0.8630	0.6809	0.4891
Kevin	0.7845	0.6617	0.9375	0.5704
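The computation behind Tables 3 and 4 can be sketched in Python from the subscripts in Tables 1 and 2. The sketch assumes the Dice form 2C / (E1 + E2), with information energy and correlation built from absolute subscripts after neutral extension of the shorter element; the names `dice` and `diagnose` are hypothetical.

```python
DISEASES = {
    "Viral infection": [[1, 2, 3], [0, 1, 2], [1, 2, 3], [-3]],
    "Typhoid": [[2, 3], [1, 2, 3], [1, 2, 3], [-3, -2]],
    "Pneumonia": [[0, 1], [-1, 0], [2, 3], [-3]],
    "Stomach problem": [[0], [-3], [-3], [1, 2, 3]],
}
PATIENTS = {
    "Richard": [[2], [2], [1, 2], [-3]],
    "Catherine": [[0], [-3], [-3], [1, 2]],
    "Nicole": [[3], [1], [2], [-3]],
    "Kevin": [[1], [-1, 0], [2], [-3]],
}


def extend(terms, length):
    """Neutral extension: pad with the midpoint of min and max, sort ascending."""
    terms = sorted(terms)
    pad = (terms[0] + terms[-1]) / 2
    return sorted(terms + [pad] * (length - len(terms)))


def dice(H1, H2):
    """Dice similarity of two HFLTSs from absolute subscripts (assumed form)."""
    e1 = e2 = c = 0.0
    for h1, h2 in zip(H1, H2):
        L = max(len(h1), len(h2))
        a = [abs(t) for t in extend(h1, L)]
        b = [abs(t) for t in extend(h2, L)]
        e1 += sum(p * p for p in a) / L
        e2 += sum(q * q for q in b) / L
        c += sum(p * q for p, q in zip(a, b)) / L
    return 2 * c / (e1 + e2)


def diagnose(symptoms):
    """Return the disease whose profile is most similar to the symptoms."""
    return max(DISEASES, key=lambda name: dice(symptoms, DISEASES[name]))


for patient, symptoms in PATIENTS.items():
    print(patient, "->", diagnose(symptoms))
```

Under these assumptions the sketch assigns typhoid to Richard, stomach problem to Catherine, viral infection to Nicole, and pneumonia to Kevin. Individual similarity values may differ from the tabulated ones in the third decimal place when a two-term set has to be extended, depending on whether the information energy is evaluated before or after the extension.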

The principle behind the diagnosis is that the larger the value of the similarity measure, the more likely the diagnosis is for the patient. From Tables 3 and 4, we can see that Richard, Catherine, Nicole, and Kevin are suffering from typhoid, stomach problem, viral infection, and pneumonia, respectively, which agrees with the correlation coefficient values of ρ1 calculated in [17], but not with those of ρ2 calculated in [17]. This is because the Dice and Jaccard similarity measures are obtained from the normalized inner product within a vector space, whereas ρ2 [17] was defined using the classical overlap measure. The two kinds of measure reflect different points of view.

Example 3. In the following, we discuss an MCDM problem [16] using both the Dice- and Jaccard-distance-based HFL-TOPSIS methods. Assume a company intends to select an ERP system from three candidates, $A = \{A_1,A_2,A_3\}$, according to three criteria: $C_1$ (potential cost), $C_2$ (function), and $C_3$ (operational complexity), with weights 0.3, 0.5, and 0.2, respectively. As ERP systems are very complicated, it is not easy for the decision maker to express an opinion with just one linguistic term. Thus, the decision maker may be hesitant when determining the values of each ERP system over the criteria. We transform the linguistic expressions of the CIO (Chief Information Officer) into an HFLTS judgment matrix $H$ using the transformation function [25]. We now use the Dice- and Jaccard-distance-based HFL-TOPSIS methods to solve this MCDM problem.

Using the Dice-distance-based HFL-TOPSIS method

Step 1. This is given above, so we go to Step 2 directly.

Step 2. Since all criteria are benefit criteria, according to the score function and the variance function [49] we obtain the positive and negative ideal HFLE for each criterion. Thus, the positive ideal solution and the negative ideal solution for this problem are $A^+ = (\{s_2,s_3\},\{s_2,s_3\},\{s_3\})$ and $A^- = (\{s_1,s_2,s_3\},\{s_1,s_2,s_3\},\{s_{-2},s_{-1},s_0\})$, respectively.

Step 3. According to Eq (16), we construct $D^+$ and $D^-$.

Step 4. Using Eq (17), we calculate the closeness coefficients and obtain $RC(A_1) = 0.7904$, $RC(A_2) = 0$, and $RC(A_3) = 0.8401$.

Step 5. By means of the closeness coefficient of each alternative, the ranking of these ERP systems is $A_3 \succ A_1 \succ A_2$, which implies that the third ERP system, $A_3$, is the best choice for the company.

Using the Jaccard-distance-based HFL-TOPSIS method

Steps 1 and 2 are the same as those in the Dice-distance-based HFL-TOPSIS method.

Step 3. According to Eq (18), we construct $D^+$ and $D^-$.

Step 4. Using Eq (17), we calculate the closeness coefficients and obtain $RC(A_1) = 0.73712$, $RC(A_2) = 0$, and $RC(A_3) = 0.79832$.

Step 5. By means of the closeness coefficient of each alternative, the ranking of these ERP systems is $A_3 \succ A_1 \succ A_2$, which implies that the third ERP system, $A_3$, is the best choice for the company.

From the above results, it can be concluded that our two TOPSIS methods yield the same ranking, whereas this ranking differs from the ranking of the three ERP systems obtained in [16], as a result of applying different distance measures within the TOPSIS method. However, the closeness coefficients of the third and first ERP systems are both close to that of the ideal solution. In fact, the third ERP system, $A_3 = (\{s_2,s_3\},\{s_1,s_2,s_3\},\{s_3\})$, is closer to the positive ideal solution, $A^+ = (\{s_2,s_3\},\{s_2,s_3\},\{s_3\})$, than the first ERP system, $A_1 = (\{s_1,s_2,s_3\},\{s_2,s_3\},\{s_1,s_2,s_3\})$. Therefore, the third ERP system is a better choice for the company than the first, which supports the effectiveness of our methods. The ranking result should be regarded as a support to the decision-making process. Afterwards, decision makers can choose an ERP system according to their preferences, based on the ranking results of the TOPSIS method.

Example 4. Suppose that the assessed values of two alternatives with respect to three criteria are $A_1 = (\{s_2,s_1,s_{-1}\},\{s_2,s_1\},\{s_2,s_1,s_{-1}\})$ and $A_2 = (\{s_3,s_2,s_1\},\{s_2,s_1\},\{s_{-1},s_{-2}\})$, that the criteria weight vector is $\omega = (0.35, 0.25, 0.4)$, and that the ideal alternative is $A^* = (\{s_4,s_4,s_4\},\{s_4,s_4\},\{s_4,s_4\})$. We calculate the Dice, Jaccard, and cosine [16] similarity measures between $A_1$ and $A^*$, and between $A_2$ and $A^*$. Although all three similarity measures lead to the same conclusion that $A_2$ is better than $A_1$, the cosine similarity $C(A_1,A^*)$ is approximately equal to $C(A_2,A^*)$. This indicates that the Dice and Jaccard similarities discriminate between HFLTSs more strongly than the cosine similarity, which further supports our conclusion that the Dice and Jaccard similarity measures are more reasonable than the cosine similarity measure for HFLTSs.

Conclusions

In this paper, we introduced two novel similarity measures for HFLTSs and enumerated some of their properties. Furthermore, the weighted and the ordered weighted versions of these similarity measures were established and analyzed. Inspired by the cosine-distance-based HFL-TOPSIS method, the Dice- and Jaccard-distance-based HFL-TOPSIS methods were introduced. An application example concerning traditional Chinese medical diagnosis and an MCDM problem were discussed to illustrate the applicability and validity of both HFLTS similarity measures. The examples show that the Dice and Jaccard measures are more reasonable than the cosine measure for HFLTSs.
