
Discrimination Power of Polynomial-Based Descriptors for Graphs by Using Functional Matrices.

Matthias Dehmer, Frank Emmert-Streib, Yongtang Shi, Monica Stefu, Shailesh Tripathi.

Abstract

In this paper, we study the discrimination power of graph measures that are based on graph-theoretical matrices. The paper generalizes the work of [M. Dehmer, M. Moosbrugger, Y. Shi, Encoding structural information uniquely with polynomial-based descriptors by employing the Randić matrix, Applied Mathematics and Computation, 268 (2015), 164-168]. We demonstrate that, by using the new functional matrix approach, exhaustively generated graphs can be discriminated more uniquely than in this previous work.

Year:  2015        PMID: 26479495      PMCID: PMC4610680          DOI: 10.1371/journal.pone.0139265

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Polynomial representations have been investigated extensively in several application areas such as mathematical chemistry and discrete mathematics, see [1-5]. Some of these polynomials have been used as counting polynomials [2]. Another idea has been to define structural network measures based on graph polynomials and their zeros, see [4, 6, 7]. A well-known polynomial-based measure is the Hosoya index [8], which is defined via the coefficients of the so-called matching polynomial [3]. Another example is the definition of spectral-based graph measures [9, 10]. Dehmer et al. [11] also contributed to this area by studying graph measures based on the zeros of the so-called information polynomial, and they explored the discrimination power of these polynomial-based measures, that is, the ability of the descriptors to distinguish non-isomorphic graphs structurally.

This paper is a successor of [6]. In [6], Dehmer et al. explored the discrimination power of quantitative graph measures that are based on the eigenvalues of the Randić matrix. In this paper, we define a functional matrix that is more general than the Randić matrix. We therefore expect an impact on the results when evaluating, on exhaustively generated graphs, the discrimination power of the indices already used in [6]. In fact, we find that by choosing a structural setting different from the one encoded by the Randić matrix, the proposed measures are even more unique than in [6].

Methods and Results

Novel Graph Measures Based on Complex Zeros

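We define a matrix whose elements encode as much structural information as possible. For this, we use the information functional approach due to Dehmer [12]. An information functional is a function that maps the vertices of a graph to the reals; several information functionals have already been defined and have proven useful for discriminating graphs uniquely or classifying graphs efficiently, see [12, 13].

Let $G = (V, E)$ be a graph. As in [6, 14], we define the graph descriptors based on the zeros of

$$P_{\mathcal{M}^{g}_{f}}(G, x) := \det\left(\mathcal{M}^{g}_{f} - xI\right), \tag{1}$$

where $I$ is the identity matrix and $\mathcal{M}^{g}_{f}$ is the functional matrix defined by

$$\mathcal{M}^{g}_{f} := (m_{ij}), \qquad m_{ij} := \begin{cases} g\big(f(v_i), f(v_j)\big) & \text{if } \{v_i, v_j\} \in E, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$

As we can see, $\mathcal{M}^{g}_{f}$ is based on an information functional $f$ and a function $g$ composing $f(v_i)$ and $f(v_j)$, $v_i, v_j \in V$. The composition function $g$ should be symmetric, so that the matrix is symmetric and the zeros of Eq 1 are real-valued. It is clear that $P_{\mathcal{M}^{g}_{f}}(G, x)$ is a polynomial of degree $|V|$ with real coefficients $a_k$,

$$P_{\mathcal{M}^{g}_{f}}(G, x) = \sum_{k=0}^{|V|} a_k x^{k}. \tag{3}$$

We define the symmetric functions $g_1$, $g_2$ and $g_3$, where $g_1(x, y) := 1/\sqrt{xy}$ [the formulas for $g_2$ and $g_3$ are missing in the source], and the information functionals [13] $f_1(v_i) := \sigma(v_i)$, $f_2(v_i) := \delta(v_i)$ and $f_3$. Here $\sigma(v_i)$ is the eccentricity of a vertex $v_i \in V$ [15], $\delta(v_i)$ is the degree of $v_i \in V$ [15], and $f_3$ is based on vertex spheres [12]; for $f_3$ we have chosen linearly decreasing coefficients $c_j$ [formula missing in the source].

Note that for $f = f_2$ and $g = g_1$ we obtain the well-known Randić matrix [16]

$$R := (r_{ij}), \qquad r_{ij} := \begin{cases} \dfrac{1}{\sqrt{\delta(v_i)\,\delta(v_j)}} & \text{if } \{v_i, v_j\} \in E, \\ 0 & \text{otherwise.} \end{cases}$$

This matrix has already been used when evaluating the discrimination power of the graph measures given by Eqs 12–14, see [6]. Based on Eq 2, we see that $g(f(v_i), f(v_j))$ is taken into account and, hence, Eq 2 generalizes the Randić matrix. The functional matrix defined here therefore enables us to search for cases in which it encodes the structural information of graphs more uniquely than the Randić matrix alone, see [6].

Solving Eq 3, i.e., computing $P_{\mathcal{M}^{g}_{f}}(G, x) = 0$, we determine the non-zero roots $x_1, \dots, x_k$. In order to compare the new results with previous ones, we define, as in [4], the graph measures $M_1^{g,f}$, $M_2^{g,f}$ and $M_3^{g,f}$ (Eqs 12–14) based on the zeros of $P_{\mathcal{M}^{g}_{f}}(G, x)$ [formulas missing in the source].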
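As an illustration of the construction in Eq 2, the following minimal sketch builds a functional matrix from an information functional and a symmetric composition function. The adjacency-dictionary representation and the names `functional_matrix`, `degree`, `eccentricity` and `g1` are assumptions made for this example, not the authors' R implementation. The choice $g_1(x, y) = 1/\sqrt{xy}$ is inferred from the statement that $f = f_2$, $g = g_1$ yields the Randić matrix.

```python
import math
from collections import deque


def degree(adj, v):
    """Information functional f_2: the degree of vertex v."""
    return len(adj[v])


def eccentricity(adj, v):
    """Information functional f_1: largest BFS distance from v.

    Assumes the graph is connected.
    """
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())


def g1(x, y):
    """Symmetric composition function (assumed form): 1 / sqrt(x * y)."""
    return 1.0 / math.sqrt(x * y)


def functional_matrix(adj, f, g):
    """Entry (i, j) is g(f(v_i), f(v_j)) for adjacent vertices, else 0."""
    vertices = sorted(adj)
    fv = {v: f(adj, v) for v in vertices}
    return [
        [g(fv[u], fv[v]) if v in adj[u] else 0.0 for v in vertices]
        for u in vertices
    ]


# Path on three vertices: 0 - 1 - 2 (hypothetical toy input).
path3 = {0: [1], 1: [0, 2], 2: [1]}

# f = degree and g = g1 reproduce the Randić matrix of the path.
randic = functional_matrix(path3, degree, g1)

# Swapping the information functional changes the encoded structure.
ecc_matrix = functional_matrix(path3, eccentricity, g1)
```

For this path, the two non-zero entries in each row or column equal $1/\sqrt{2}$, so a hand computation gives $\det(R - xI) = -x^3 + x$ with zeros $0$ and $\pm 1$; the non-zero roots $\pm 1$ are the values that would enter the measures.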

Numerical Results

To interpret the numerical results, we start with some technical preliminaries and describe how we generated the trees and graphs exhaustively. We use the tree classes $T_i$, $14 \le i \le 19$, containing all non-isomorphic trees with $i$ vertices. $N_i$, $5 \le i \le 9$, are the sets of all non-isomorphic graphs with $i$ vertices. The sizes of these classes are given in the corresponding tables. As in [6], we generated the graphs by employing the package Nauty by McKay [17] and implemented the graph measures $M_i^{g,f}$, $1 \le i \le 3$, in R.

We now interpret and discuss the concrete numerical results, starting with trees. Tables 1–6 show the results when using $g = g_1$ and $f = f_1, f_2, f_3$. We see that the measure given by the sum of the square roots of the roots of the polynomial in Eq 1 is highly unique on exhaustively generated trees. In most of the cases, this measure is fully unique (ndv = 0 and S = 1, where ndv is the number of graphs whose measured value cannot be distinguished from that of another graph and $S = (|\mathcal{G}| - \mathrm{ndv})/|\mathcal{G}|$ for a graph class $\mathcal{G}$). Also, it is evident that the case $f = f_2$, $g = g_1$ corresponds to the Randić matrix (see Tables 2 and 5 and [6]). We see that the cases $g = g_1$ and $f = f_1, f_3$ give better results than in [6]. An explanation for this could be the better coverage of the local topological neighborhood of a vertex by the information functionals $f_1$ and $f_3$. In the case of $f_3$, this seems very plausible, as $f_3$ captures the full neighborhood of each vertex by using $j$-spheres [12]. In summary, all graph measures are highly unique on the chosen graph classes when using $g = g_1$ and $f = f_1, f_2, f_3$.

However, the results are even more striking when considering Tables 7–18. We see that the measures are more unique when using $g = g_2$ and $f = f_1, f_2, f_3$. It seems that $g = g_2$ spreads its values more efficiently. This corresponds to an earlier finding due to Balaban et al. [18], where a descriptor composed of several topological indices $I_1, I_2, \dots$ was found to be highly unique; for further details, see [18]. In this light, the zeros of the graph polynomial given by Eq 1 can be interpreted as structural graph descriptors (or topological indices). In particular, we observe the high uniqueness of the measures when considering $g = g_2$, $g = g_3$ and the graph classes $T_{18}$ and $T_{19}$. Also note that $T_{19}$ contains more than 300,000 trees with 19 vertices each. Clearly, these results outperform the ones obtained in [6].
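The ndv and S columns of the tables can be computed from a list of measure values, one per graph. The sketch below is a hedged illustration: the function name `ndv_and_sensitivity` and the rounding precision `digits` are assumptions, and the formula S = (|G| − ndv)/|G| is taken to be the definition because it reproduces the tabulated entries (for example, ndv = 6 on the 3159 trees of $T_{14}$ gives 0.9981).

```python
from collections import Counter


def ndv_and_sensitivity(values, digits=10):
    """Return (ndv, S) for a list of measure values, one per graph.

    ndv counts the graphs whose (rounded) value is shared with at least
    one other graph; S = (n - ndv) / n is the fraction of graphs that
    the measure identifies uniquely. The rounding precision is an
    assumption of this sketch.
    """
    rounded = [round(v, digits) for v in values]
    counts = Counter(rounded)
    ndv = sum(1 for r in rounded if counts[r] > 1)
    return ndv, (len(values) - ndv) / len(values)


# Toy input: the two graphs with value 2.0 cannot be told apart.
example = ndv_and_sensitivity([1.0, 2.0, 2.0, 3.5])
```

In practice the result depends on the numerical precision at which two values are considered equal, which is why the precision is exposed as a parameter here.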
Table 1

Exhaustively generated sets of non-isomorphic trees.

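$|T_{14}| = 3159$, $|T_{15}| = 7741$, $|T_{16}| = 19320$, $|T_{17}| = 48629$. Here we used $f = f_1$, $g = g_1$.

| Measure | $T_{14}$ ndv | $T_{14}$ S | $T_{15}$ ndv | $T_{15}$ S | $T_{16}$ ndv | $T_{16}$ S | $T_{17}$ ndv | $T_{17}$ S |
|---|---|---|---|---|---|---|---|---|
| $M_1^{g,f}$ | 6 | 0.9981 | 6 | 0.9992 | 12 | 0.9993 | 22 | 0.9995 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 4 | 0.9994 | 8 | 0.9995 | 8 | 0.9998 |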
Table 6

Exhaustively generated sets of non-isomorphic trees.

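$|T_{18}| = 123867$, $|T_{19}| = 317955$. Here we used $f = f_3$, $g = g_1$.

| Measure | $T_{18}$ ndv | $T_{18}$ S | $T_{19}$ ndv | $T_{19}$ S |
|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |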
Table 2

Exhaustively generated sets of non-isomorphic trees.

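$|T_{14}| = 3159$, $|T_{15}| = 7741$, $|T_{16}| = 19320$, $|T_{17}| = 48629$. Here we used $f = f_2$, $g = g_1$.

| Measure | $T_{14}$ ndv | $T_{14}$ S | $T_{15}$ ndv | $T_{15}$ S | $T_{16}$ ndv | $T_{16}$ S | $T_{17}$ ndv | $T_{17}$ S |
|---|---|---|---|---|---|---|---|---|
| $M_1^{g,f}$ | 30 | 0.9905 | 62 | 0.9919 | 126 | 0.9934 | 228 | 0.9953 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 2 | 0.9998 | 0 | 1.0000 |
| $M_3^{g,f}$ | 2 | 0.9993 | 4 | 0.9994 | 8 | 0.9995 | 8 | 0.9998 |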
Table 5

Exhaustively generated sets of non-isomorphic trees.

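$|T_{18}| = 123867$, $|T_{19}| = 317955$. Here we used $f = f_2$, $g = g_1$.

| Measure | $T_{18}$ ndv | $T_{18}$ S | $T_{19}$ ndv | $T_{19}$ S |
|---|---|---|---|---|
| $M_1^{g,f}$ | 528 | 0.9957 | 693 | 0.9978 |
| $M_2^{g,f}$ | 8 | 0.9999 | 0 | 1.0000 |
| $M_3^{g,f}$ | 14 | 0.9998 | 40 | 0.9998742 |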
Table 7

Exhaustively generated sets of non-isomorphic trees.

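$|T_{14}| = 3159$, $|T_{15}| = 7741$, $|T_{16}| = 19320$, $|T_{17}| = 48629$. Here we used $f = f_1$, $g = g_2$.

| Measure | $T_{14}$ ndv | $T_{14}$ S | $T_{15}$ ndv | $T_{15}$ S | $T_{16}$ ndv | $T_{16}$ S | $T_{17}$ ndv | $T_{17}$ S |
|---|---|---|---|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |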
Table 18

Exhaustively generated sets of non-isomorphic trees.

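$|T_{18}| = 123867$, $|T_{19}| = 317955$. Here we used $f = f_3$, $g = g_3$.

| Measure | $T_{18}$ ndv | $T_{18}$ S | $T_{19}$ ndv | $T_{19}$ S |
|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |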

As to the graphs, we look at Tables 19–21, which show the results for $g = g_1$ and $f = f_1, f_2, f_3$. The results for $g = g_2$ and $g = g_3$ are very similar and are therefore not shown. We emphasize that most of the obtained results are much better than those in [6]. In this paper, the lowest ndv-value for $N_9$ equals 8 and is achieved by $M_2^{g,f}$ with $f = f_3$, $g = g_1$ (see Table 21); that is, 99.997% of the 261080 graphs could be discriminated uniquely. This is a striking result and outperforms all earlier results by Dehmer and co-workers, see [4, 6, 19–22].
Table 19

Exhaustively generated sets of non-isomorphic graphs.

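$|N_5| = 21$, $|N_6| = 112$, $|N_7| = 853$, $|N_8| = 11117$, $|N_9| = 261080$. Here we used $f = f_1$, $g = g_1$.

| Measure | $N_5$ ndv | $N_5$ S | $N_6$ ndv | $N_6$ S | $N_7$ ndv | $N_7$ S | $N_8$ ndv | $N_8$ S | $N_9$ ndv | $N_9$ S |
|---|---|---|---|---|---|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 5 | 0.9553 | 18 | 0.9788 | 408 | 0.9633 | 13305 | 0.9490 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 2 | 0.9976 | 181 | 0.9837 | 6668 | 0.9744 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 4 | 0.9953 | 112 | 0.9899 | 6392 | 0.9755 |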
Table 21

Exhaustively generated sets of non-isomorphic graphs.

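$|N_5| = 21$, $|N_6| = 112$, $|N_7| = 853$, $|N_8| = 11117$, $|N_9| = 261080$. Here we used $f = f_3$, $g = g_1$.

| Measure | $N_5$ ndv | $N_5$ S | $N_6$ ndv | $N_6$ S | $N_7$ ndv | $N_7$ S | $N_8$ ndv | $N_8$ S | $N_9$ ndv | $N_9$ S |
|---|---|---|---|---|---|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 2 | 0.9821 | 0 | 1.0000 | 4 | 0.9996 | 20 | 0.9999 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 2 | 0.9998 | 8 | 0.9999 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 4 | 0.9996 | 18 | 0.9999 |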
Table 20

Exhaustively generated sets of non-isomorphic graphs.

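$|N_5| = 21$, $|N_6| = 112$, $|N_7| = 853$, $|N_8| = 11117$, $|N_9| = 261080$. Here we used $f = f_2$, $g = g_1$.

| Measure | $N_5$ ndv | $N_5$ S | $N_6$ ndv | $N_6$ S | $N_7$ ndv | $N_7$ S | $N_8$ ndv | $N_8$ S | $N_9$ ndv | $N_9$ S |
|---|---|---|---|---|---|---|---|---|---|---|
| $M_1^{g,f}$ | 6 | 0.7142 | 13 | 0.8839 | 33 | 0.9613 | 71 | 0.9936 | 272 | 0.9989 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 4 | 0.9953 | 13 | 0.9988 | 103 | 0.9996 |
| $M_3^{g,f}$ | 2 | 0.9047 | 0 | 1.0000 | 8 | 0.9906 | 23 | 0.9979 | 183 | 0.9992 |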

In comparison, the lowest ndv-value for $N_9$ achieved in [6] equals 126 and was obtained by using an entropy-like measure. Finally, we see that the results obtained in this paper are in parts slightly better than, and overall outperform, the previous approach based on the Randić matrix.

Conclusion

In this paper, we generalized the work done by Dehmer et al. [6]. In [6], the eigenvalues of the Randić matrix were used to define new quantitative network measures. A result of that work was the high discrimination power of the measures on exhaustively generated networks. Note that the definition of the Randić matrix comes from the famous Randić index [23, 24], from which some new invariants have been derived, such as the Randić spectral radius [25] and the Randić energy [26].

Based on the construction of the functional matrix (see Eq 2), we expected that the approach proposed here may discriminate graphs more efficiently than earlier methods, see, e.g., [4, 6]. From a mathematical point of view, this seems plausible, as the involved information functionals $f_1, f_2, f_3$ encode structural information differently. In particular, $f_3$ captures the full topological neighborhood of a vertex and, hence, encodes structural information more efficiently than $f_1$. This effect can be seen in Tables 1–21. On top of that, the composite functions $g_1, g_2, g_3$ may optimize the spread of the values of the involved information functionals. We therefore conclude that the ability of the final graph measure to discriminate graphs structurally also depends on the composite function $g$. Evidence for this statement follows from the fact that we defined settings different from the one yielding the Randić matrix, i.e., $f = f_2$, $g = g_1$.
Table 3

Exhaustively generated sets of non-isomorphic trees.

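$|T_{14}| = 3159$, $|T_{15}| = 7741$, $|T_{16}| = 19320$, $|T_{17}| = 48629$. Here we used $f = f_3$, $g = g_1$.

| Measure | $T_{14}$ ndv | $T_{14}$ S | $T_{15}$ ndv | $T_{15}$ S | $T_{16}$ ndv | $T_{16}$ S | $T_{17}$ ndv | $T_{17}$ S |
|---|---|---|---|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |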
Table 4

Exhaustively generated sets of non-isomorphic trees.

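$|T_{18}| = 123867$, $|T_{19}| = 317955$. Here we used $f = f_1$, $g = g_1$.

| Measure | $T_{18}$ ndv | $T_{18}$ S | $T_{19}$ ndv | $T_{19}$ S |
|---|---|---|---|---|
| $M_1^{g,f}$ | 42 | 0.9996 | 68 | 0.9997 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 10 | 0.9999 | 24 | 0.9999 |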
Table 8

Exhaustively generated sets of non-isomorphic trees.

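$|T_{14}| = 3159$, $|T_{15}| = 7741$, $|T_{16}| = 19320$, $|T_{17}| = 48629$. Here we used $f = f_2$, $g = g_2$.

| Measure | $T_{14}$ ndv | $T_{14}$ S | $T_{15}$ ndv | $T_{15}$ S | $T_{16}$ ndv | $T_{16}$ S | $T_{17}$ ndv | $T_{17}$ S |
|---|---|---|---|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |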
Table 9

Exhaustively generated sets of non-isomorphic trees.

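$|T_{14}| = 3159$, $|T_{15}| = 7741$, $|T_{16}| = 19320$, $|T_{17}| = 48629$. Here we used $f = f_3$, $g = g_2$.

| Measure | $T_{14}$ ndv | $T_{14}$ S | $T_{15}$ ndv | $T_{15}$ S | $T_{16}$ ndv | $T_{16}$ S | $T_{17}$ ndv | $T_{17}$ S |
|---|---|---|---|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |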
Table 10

Exhaustively generated sets of non-isomorphic trees.

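$|T_{18}| = 123867$, $|T_{19}| = 317955$. Here we used $f = f_1$, $g = g_2$.

| Measure | $T_{18}$ ndv | $T_{18}$ S | $T_{19}$ ndv | $T_{19}$ S |
|---|---|---|---|---|
| $M_1^{g,f}$ | 4 | 0.9987 | 2 | 0.9993 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |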
Table 11

Exhaustively generated sets of non-isomorphic trees.

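$|T_{18}| = 123867$, $|T_{19}| = 317955$. Here we used $f = f_2$, $g = g_2$.

| Measure | $T_{18}$ ndv | $T_{18}$ S | $T_{19}$ ndv | $T_{19}$ S |
|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |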
Table 12

Exhaustively generated sets of non-isomorphic trees.

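$|T_{18}| = 123867$, $|T_{19}| = 317955$. Here we used $f = f_3$, $g = g_2$.

| Measure | $T_{18}$ ndv | $T_{18}$ S | $T_{19}$ ndv | $T_{19}$ S |
|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |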
Table 13

Exhaustively generated sets of non-isomorphic trees.

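$|T_{14}| = 3159$, $|T_{15}| = 7741$, $|T_{16}| = 19320$, $|T_{17}| = 48629$. Here we used $f = f_1$, $g = g_3$.

| Measure | $T_{14}$ ndv | $T_{14}$ S | $T_{15}$ ndv | $T_{15}$ S | $T_{16}$ ndv | $T_{16}$ S | $T_{17}$ ndv | $T_{17}$ S |
|---|---|---|---|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |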
Table 14

Exhaustively generated sets of non-isomorphic trees.

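$|T_{14}| = 3159$, $|T_{15}| = 7741$, $|T_{16}| = 19320$, $|T_{17}| = 48629$. Here we used $f = f_2$, $g = g_3$.

| Measure | $T_{14}$ ndv | $T_{14}$ S | $T_{15}$ ndv | $T_{15}$ S | $T_{16}$ ndv | $T_{16}$ S | $T_{17}$ ndv | $T_{17}$ S |
|---|---|---|---|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 2 | 0.9993 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |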
Table 15

Exhaustively generated sets of non-isomorphic trees.

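$|T_{14}| = 3159$, $|T_{15}| = 7741$, $|T_{16}| = 19320$, $|T_{17}| = 48629$. Here we used $f = f_3$, $g = g_3$.

| Measure | $T_{14}$ ndv | $T_{14}$ S | $T_{15}$ ndv | $T_{15}$ S | $T_{16}$ ndv | $T_{16}$ S | $T_{17}$ ndv | $T_{17}$ S |
|---|---|---|---|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 | 0 | 1.0000 |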
Table 16

Exhaustively generated sets of non-isomorphic trees.

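$|T_{18}| = 123867$, $|T_{19}| = 317955$. Here we used $f = f_1$, $g = g_3$.

| Measure | $T_{18}$ ndv | $T_{18}$ S | $T_{19}$ ndv | $T_{19}$ S |
|---|---|---|---|---|
| $M_1^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |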
Table 17

Exhaustively generated sets of non-isomorphic trees.

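$|T_{18}| = 123867$, $|T_{19}| = 317955$. Here we used $f = f_2$, $g = g_3$.

| Measure | $T_{18}$ ndv | $T_{18}$ S | $T_{19}$ ndv | $T_{19}$ S |
|---|---|---|---|---|
| $M_1^{g,f}$ | 10 | 0.9968 | 24 | 0.9924 |
| $M_2^{g,f}$ | 0 | 1.0000 | 0 | 1.0000 |
| $M_3^{g,f}$ | 2 | 0.9993 | 2 | 0.9993 |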
  5 in total

1.  Complexity of chemical graphs in terms of size, branching, and cyclicity.

Authors:  A T Balaban; D Mills; V Kodali; S C Basak
Journal:  SAR QSAR Environ Res       Date:  2006-08       Impact factor: 3.000

2.  New polynomial-based molecular descriptors with low degeneracy.

Authors:  Matthias Dehmer; Laurin A J Mueller; Armin Graber
Journal:  PLoS One       Date:  2010-07-30       Impact factor: 3.240

3.  Information indices with high discriminative power for graphs.

Authors:  Matthias Dehmer; Martin Grabner; Kurt Varmuza
Journal:  PLoS One       Date:  2012-02-29       Impact factor: 3.240

4.  Structural differentiation of graphs using Hosoya-based indices.

Authors:  Matthias Dehmer; Abbe Mowshowitz; Yongtang Shi
Journal:  PLoS One       Date:  2014-07-14       Impact factor: 3.240

5.  Knowledge Discovery and interactive Data Mining in Bioinformatics--State-of-the-Art, future challenges and research directions.

Authors:  Andreas Holzinger; Matthias Dehmer; Igor Jurisica
Journal:  BMC Bioinformatics       Date:  2014-05-16       Impact factor: 3.169
