Yichuan Li, Weihong Cai, Yao Li, Xin Du.
Abstract
Many problems in a wide range of fields can be solved effectively by modeling them as complex networks and analyzing those networks. Finding key nodes is one of the most important and challenging problems in network analysis. Previous studies have proposed methods for identifying key nodes; however, these methods rely mainly on a limited field of local information, lack large-scale access to global information, and are usually NP-hard. In this paper, a novel entropy and mutual information-based centrality approach (EMI) is proposed, which attempts to capture a far wider range and a greater abundance of information for assessing how vital a node is. To assess the influence of nodes, EMI is no longer confined to neighbor nodes, and both topological and digital network characteristics are taken into account. We employ mutual information to fix a flaw that exists in many methods. Experiments on real-world connected networks demonstrate the outstanding performance of the proposed approach in both correctness and efficiency compared with previous approaches.
Keywords: complex network; entropy; key nodes; mutual information
Year: 2019 PMID: 33285827 PMCID: PMC7516483 DOI: 10.3390/e22010052
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. Results of recognizing key nodes in an artificial network by (a) betweenness centrality; (b) closeness centrality; (c) eigenvector centrality; (d) degree centrality; (e) harmonic centrality; and (f) Katz centrality. All six methods are applied to the same artificial network, yet the resulting centralities differ considerably from each other (the red nodes are considered the most important nodes of the network; the deeper the blue, the less important the node).
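The disagreement between centrality measures described in this caption can be reproduced on a very small example. The sketch below is only an illustration: the toy graph and all names in it are hypothetical and are not the artificial network of Figure 1. It compares just two of the six measures, degree and closeness, and assumes an undirected, connected graph.

```python
from collections import deque

# Hypothetical toy graph (NOT the network of Figure 1): hub "h" with four
# leaves, hub "c" with three leaves, and a bridge "b" that links both hubs
# and has one leaf "x" of its own.
EDGES = [
    ("h", "l1"), ("h", "l2"), ("h", "l3"), ("h", "l4"),
    ("c", "d1"), ("c", "d2"), ("c", "d3"),
    ("b", "h"), ("b", "c"), ("b", "x"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Number of neighbors divided by (n - 1)."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """(n - 1) divided by the sum of shortest-path distances (connected graph)."""
    n = len(adj)
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (n - 1) / total if total > 0 else 0.0
    return result


adj = build_adjacency(EDGES)
deg = degree_centrality(adj)
clo = closeness_centrality(adj)
top_degree = max(deg, key=deg.get)
top_closeness = max(clo, key=clo.get)
print(top_degree, top_closeness)  # prints "h b"
```

In this toy graph the largest hub wins on degree, while the bridge node wins on closeness, which is the kind of disagreement the figure illustrates.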
Figure 2. The additive and subtractive relationships among the information measures associated with correlated variables X and Y. The intermediate part (violet) is the mutual information.
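The relationships sketched in this figure are the standard identities I(X;Y) = H(X) + H(Y) − H(X,Y), H(X|Y) = H(X,Y) − H(Y), and H(Y|X) = H(X,Y) − H(X). The following minimal sketch computes them from a joint distribution. The function names, the dictionary layout of the joint distribution, and the default logarithm base are assumptions made for illustration; they are not taken from the paper.

```python
import math


def entropy(probabilities, base=2.0):
    """Shannon entropy of a discrete distribution (zero terms are skipped)."""
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)


def information_measures(joint, base=2.0):
    """Compute the measures of the diagram; `joint` maps (x, y) -> probability."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p  # marginal of X
        py[y] = py.get(y, 0.0) + p  # marginal of Y
    h_x = entropy(px.values(), base)
    h_y = entropy(py.values(), base)
    h_xy = entropy(joint.values(), base)
    return {
        "H(X)": h_x,
        "H(Y)": h_y,
        "H(X,Y)": h_xy,
        "H(X|Y)": h_xy - h_y,
        "H(Y|X)": h_xy - h_x,
        "I(X;Y)": h_x + h_y - h_xy,  # the overlapping region
    }


# Two fair binary variables: identical copies vs. independent coins.
correlated = {(0, 0): 0.5, (1, 1): 0.5}
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
mi_correlated = information_measures(correlated)["I(X;Y)"]
mi_independent = information_measures(independent)["I(X;Y)"]
```

For the perfectly correlated pair the mutual information equals the full entropy of either variable (1 bit), and for the independent pair it vanishes.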
Figure 3. A simple directed and weighted network with its corresponding incidence matrix.
This table contains the parameters obtained from Formulas (12), (13), (17), and (18). For example, nodes a and c are neighbors of node b, and their outdegrees are 2 and 1, respectively. Therefore, the total outdegree of the neighbors of node b is 2 + 1 = 3. The sum of the total weights carried by these outdegrees is 8.
| Node | | | | |
|---|---|---|---|---|
| | 3 | 2 | 7 | 5 |
| | 2 | 3 | 5 | 8 |
| | 2 | 3 | 4 | 7 |
| | 1 | 2 | 3 | 5 |
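The neighbor-based sums in the worked example (total outdegree 3 and total outgoing weight 8 for the neighbors of node b) can be sketched as follows. The edge list is hypothetical: it is chosen only so that node a has outdegree 2 and node c has outdegree 1, and it is not the network of Figure 3. The function name and the choice to treat nodes joined by an edge in either direction as neighbors are likewise assumptions, not the paper's Formulas (12), (13), (17), and (18).

```python
from collections import defaultdict

# Hypothetical directed, weighted edges (source, target, weight).
# NOT the network of Figure 3; chosen to match the numbers in the example.
EDGES = [("a", "b", 2), ("a", "d", 3), ("c", "b", 3)]


def neighbor_out_totals(edges, node):
    """Return (sum of neighbor outdegrees, sum of neighbor outgoing weights)."""
    out_degree = defaultdict(int)
    out_weight = defaultdict(int)
    neighbors = defaultdict(set)
    for u, v, w in edges:
        out_degree[u] += 1
        out_weight[u] += w
        # Assumption: an edge in either direction makes two nodes neighbors.
        neighbors[u].add(v)
        neighbors[v].add(u)
    nbrs = neighbors[node]
    return (
        sum(out_degree[x] for x in nbrs),
        sum(out_weight[x] for x in nbrs),
    )


totals_b = neighbor_out_totals(EDGES, "b")
print(totals_b)  # prints "(3, 8)"
```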
The overall power of each node.
| Node | | | | |
|---|---|---|---|---|
| | 0.7782 | 0.7073 | 1.44674 | |
| | 0.5775 | 0.5796 | 1.10648 | |
| | 0.5775 | 0.5040 | 1.03088 | |
| | 0.3010 | 0.2966 | 0 | 0.59760 |
The ranking list.
| Node | Placing |
|---|---|
Figure 4. Layout of the datasets. (a) Dutch college; (b) US-Airports; (c) Air traffic control; (d) E-road; (e) Chicago; and (f) Dolphins.
The basic topological properties of six real-world networks.
| Network | | | | | | | |
|---|---|---|---|---|---|---|---|
| Dc | 32 | 3062 | 191.38 (overall) | 290 | 0.904 | −0.05898 | 0.981 |
| US-Air | 1574 | 28,236 | 35.878 (overall) | 596 | 0.384 | −0.11330 | 0.842 |
| A-tc | 1226 | 2615 | 4.2659 (overall) | 37 | 0.0639 | −0.01520 | 0.951 |
| E-road | 1174 | 1417 | 2.4140 | 10 | 0.0339 | 0.12668 | 0.985 |
| Chicago | 1467 | 1298 | 1.7696 | 12 | 0 | −0.50492 | 0.928 |
| Dolphins | 62 | 159 | 5.1290 | 12 | 0.309 | −0.04359 | 0.957 |
The M values that were obtained by different methods on the six real networks.
| Network | | | | | | |
|---|---|---|---|---|---|---|
| Dc | 0.69037 | 0.00000 | 0.36785 | 0.99570 | 0.12366 | |
| US-Air | 0.05925 | 0.06400 | 0.00442 | 0.33589 | 0.04230 | 0.85336 |
| A-tc | 0.11111 | 0.03290 | 0.14406 | 0.56714 | 0.86431 | |
| E-road | 0.96893 | 0.40198 | 0.43244 | 0.00777 | 0.81910 | |
| Chicago | 0.61174 | 0.38272 | 0.09452 | 0.00000 | 0.53828 | 0.83838 |
| Dolphins | 0.94540 | 0.74809 | 0.11111 | 0.17084 | 0.54066 | |

| Network | | | | | | |
|---|---|---|---|---|---|---|
| Dc | 0.97895 | 1.00000 | 0.69037 | 1.00000 | 1.00000 | |
| US-Air | 0.26639 | 0.27627 | | 0.36178 | 0.39860 | 0.85336 |
| A-tc | 0.75375 | 0.93115 | 0.90503 | 0.92784 | 0.93413 | |
| E-road | 0.53086 | 0.90656 | 0.96976 | 0.86180 | 0.88559 | |
| Chicago | 0.07802 | 0.06905 | 0.69762 | 0.69650 | | 0.83838 |
| Dolphins | 0.34183 | 0.93976 | 0.94540 | 0.95030 | 0.95031 | |
The maximum value and mean value are highlighted in bold.
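The record does not define M. A minimal sketch, assuming M is the monotonicity index commonly used to evaluate ranking lists, M(R) = [1 − Σ_r n_r(n_r − 1) / (n(n − 1))]², where n is the number of nodes and n_r is the number of nodes tied at rank r, looks like this. Under that assumption M = 1 means every node receives a distinct rank and M = 0 means all nodes share one rank, which is consistent with the 0.00000 and 1.00000 entries in the table. The function name is hypothetical.

```python
from collections import Counter


def monotonicity(scores):
    """Monotonicity index of a score list, ASSUMING the definition
    M = (1 - sum_r n_r * (n_r - 1) / (n * (n - 1))) ** 2.

    Nodes with exactly equal scores are treated as tied at the same rank.
    """
    n = len(scores)
    if n < 2:
        return 1.0  # a single node cannot be tied with anything
    ties = Counter(scores)
    tied_pairs = sum(c * (c - 1) for c in ties.values())
    return (1.0 - tied_pairs / (n * (n - 1))) ** 2


m_distinct = monotonicity([0.9, 0.7, 0.4, 0.1])  # all ranks distinct
m_uniform = monotonicity([5, 5, 5, 5])            # every node tied
m_partial = monotonicity([1, 1, 2, 3])            # one tied pair
```

Because ties are detected by exact equality, floating-point scores may need rounding before they are passed in.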
Figure 5. The node distribution as ranked by the top five measures in four datasets: (a) Dutch college; (b) US-Airports; (c) Air traffic control; and (d) Dolphins.
Figure 6. The complementary cumulative distribution function (CCDF) plots for the ranking lists that were obtained by the top five measures on (a) Dutch college; (b) US-Airports; (c) E-road; and (d) Dolphins.
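The record does not state how the CCDF of a ranking list is computed. One common construction, shown here as an assumption rather than as the authors' procedure, groups nodes with equal scores into the same rank and reports, for each rank r, the fraction of nodes ranked strictly below r. A ranking that separates nodes well produces a curve that decays slowly over many ranks, whereas a ranking with many ties drops to zero after only a few ranks. The function name is hypothetical.

```python
from collections import Counter


def rank_ccdf(scores):
    """Return [(rank, fraction of nodes ranked strictly below that rank)].

    Equal scores share a rank; rank 1 is the highest score.
    """
    n = len(scores)
    counts = Counter(scores)
    remaining = n
    points = []
    for rank, score in enumerate(sorted(counts, reverse=True), start=1):
        remaining -= counts[score]  # nodes at this rank are no longer "below"
        points.append((rank, remaining / n))
    return points


ccdf_points = rank_ccdf([3, 3, 2, 1])
print(ccdf_points)  # prints "[(1, 0.5), (2, 0.25), (3, 0.0)]"
```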
The top 10 key nodes as ranked by different methods.
| Rank | Deg | K-s | Clo | Bet | Str | Rad | Bri | Cen | Ecc | DMNC | EMI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Dc | | | | | | | | | | | |
| 1 | 22 | 19 | 22 | 7 | 22 | 26 | 7 | - | - | | 4 |
| 2 | 17 | 7 | 3 | 3 | 7 | 22 | 3 | - | - | | 5 |
| 3 | 3 | 3 | 26 | 22 | 3 | 3 | | - | - | 18 | 1 |
| 4 | 7 | 26 | 31 | 6 | 31 | 31 | | - | - | 9 | 32 |
| 5 | 13 | 6 | 20 | 26 | 26 | 20 | 26 | - | - | | 24 |
| 6 | 6 | 31 | 17 | 31 | 21 | 29 | 6 | - | - | 2 | 30 |
| 7 | 31 | 22 | 6 | 21 | 13 | 21 | 31 | - | - | 8 | 11 |
| 8 | 21 | 21 | 21 | 13 | 29 | 17 | 22 | - | - | 16 | 25 |
| 9 | 29 | 29 | 29 | 29 | 6 | 15 | 21 | - | - | 19 | 10 |
| 10 | 28 | 13 | 15 | 27 | 17 | 6 | 27 | - | - | 12 | 14 |
| US-Air | | | | | | | | | | | |
| 1 | 46 | - | 68 | 32 | 22 | - | 424 | - | - | 168 | 99 |
| 2 | 88 | - | 52 | 22 | 10 | - | 308 | - | - | | 620 |
| 3 | 69 | - | 174 | 165 | 195 | - | 498 | - | - | 169 | 374 |
| 4 | 74 | - | 147 | 74 | 165 | - | 1418 | - | - | 127 | 187 |
| 5 | 165 | - | 69 | 174 | 147 | - | 986 | - | - | 166 | 265 |
| 6 | 150 | - | 88 | 46 | 174 | - | 744 | - | - | | 594 |
| 7 | 174 | - | 74 | 418 | 46 | - | 804 | - | - | | 198 |
| 8 | 147 | - | 159 | 159 | 74 | - | 520 | - | - | | 485 |
| 9 | 159 | - | 150 | 69 | 317 | - | 1186 | - | - | 650 | 170 |
| 10 | 57 | - | 60 | 136 | 205 | - | 781 | - | - | | 665 |
| A-tc | | | | | | | | | | | |
| 1 | | 116 | | | | | 1226 | - | 734 | 3 | 312 |
| 2 | | 34 | | | 213 | | 953 | - | 369 | 221 | 52 |
| 3 | | | | 212 | 212 | | 842 | - | 618 | | 68 |
| 4 | | 109 | | | | | 1196 | - | 678 | 139 | 187 |
| 5 | | 102 | 116 | 135 | 220 | 116 | 1007 | - | 827 | 28 | 113 |
| 6 | | 77 | | 213 | | | 254 | - | 992 | 29 | 44 |
| 7 | 110 | 424 | 110 | 523 | 135 | 110 | 307 | - | 787 | 1051 | 89 |
| 8 | | 81 | | 220 | | | 300 | - | 162 | 1053 | 47 |
| 9 | | 308 | 51 | 148 | 148 | 51 | 1095 | - | 464 | 994 | 604 |
| 10 | 135 | 470 | | 629 | 119 | | 396 | - | 829 | 841 | 46 |
| E-road | | | | | | | | | | | |
| 1 | | 453 | 1174 | 402 | 277 | | 404 | 1174 | 647 | - | 284 |
| 2 | | 433 | 1173 | 284 | 402 | 402 | 452 | 1173 | 59 | - | 236 |
| 3 | | 542 | 1163 | 277 | 837 | 403 | 837 | 1163 | 62 | - | 137 |
| 4 | | 8 | 1162 | 453 | 836 | 432 | 835 | 1163 | 1174 | - | 39 |
| 5 | | 179 | 1152 | 452 | 228 | 1019 | 801 | 1151 | 1173 | - | 7 |
| 6 | | 280 | 1151 | 403 | 225 | 253 | 546 | 1148 | 1163 | - | 107 |
| 7 | | 57 | 1148 | | | 452 | 799 | 1147 | 1162 | - | 401 |
| 8 | 499 | 253 | 1147 | 404 | 453 | 404 | 866 | 1094 | 1152 | - | 43 |
| 9 | | 543 | 1094 | 837 | 452 | 232 | 811 | 1093 | 1151 | - | 141 |
| 10 | | 479 | 1093 | 836 | 224 | 284 | 889 | 1077 | 1148 | - | 181 |
| Chicago | | | | | | | | | | | |
| 1 | | 1157 | 918 | | | | 1157 | 918 | | - | 552 |
| 2 | | | 798 | 1157 | | 1157 | | 798 | 918 | - | 922 |
| 3 | | | 900 | | | | | 900 | 798 | - | 1147 |
| 4 | | | 1204 | | | | | 1204 | | - | 1150 |
| 5 | | | 1163 | | | | | 1163 | 900 | - | 1146 |
| 6 | | | 1085 | | | | | 917 | 1082 | - | 1153 |
| 7 | | | 917 | | | | | 1292 | 917 | - | 1154 |
| 8 | | 1134 | 1292 | | | | | 924 | 559 | - | 1155 |
| 9 | | 1148 | 924 | 1138 | 1138 | 1138 | 1158 | 797 | 1292 | - | 1156 |
| 10 | 1138 | 1138 | 797 | 499 | 499 | 1100 | 1159 | 1343 | 183 | - | 817 |
| Dolphins | | | | | | | | | | | |
| 1 | 52 | 7 | | | | | 40 | | | | 21 |
| 2 | 34 | | | | | | 24 | | | | 38 |
| 3 | | 22 | | | | | 8 | | | 19 | 15 |
| 4 | | 10 | | | | | | | | 55 | 25 | 2 |
| 5 | 58 | 25 | | 8 | | | 29 | | 8 | 7 | 46 |
| 6 | | | | | | | | 8 | 29 | 58 | 41 |
| 7 | | 19 | 8 | | 55 | 29 | | 34 | 40 | 44 | 18 |
| 8 | | 52 | 29 | 55 | 8 | 8 | 55 | | 31 | 35 | 30 |
| 9 | 14 | | 34 | 52 | 58 | 34 | 53 | 9 | 28 | 6 | 51 |
| 10 | 39 | | 9 | 58 | 29 | 9 | 33 | 29 | | 17 | 37 |
Red means that this node is in common with EMI; ‘-’ means that the method has neither reliability nor reference value on that dataset because it assigns many nodes to the same rank.
Figure 7. Epidemic spreading (infection) with different seed sets on the six datasets: (a) Dutch college; (b) US-Airports; (c) Air traffic control; (d) E-road; (e) Chicago; and (f) Dolphins.
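An experiment of this kind seeds an epidemic process with the top-ranked nodes of each method and compares how far the infection spreads. The record does not name the spreading model or its parameters, so the sketch below assumes a simple discrete-time SIR process in which every infected node tries once to infect each susceptible neighbor with probability `beta` and then recovers. All function and parameter names are hypothetical.

```python
import random


def sir_spread(adj, seeds, beta, rng):
    """Run one SIR cascade (ASSUMED model) and return how many nodes were ever infected."""
    infected = set(seeds)
    recovered = set()
    while infected:
        newly = set()
        for u in sorted(infected):       # sorted for reproducibility
            for v in sorted(adj[u]):
                susceptible = v not in infected and v not in recovered
                if susceptible and rng.random() < beta:
                    newly.add(v)
        recovered |= infected            # infected nodes recover after one step
        infected = newly
    return len(recovered)


def average_spread(adj, seeds, beta, runs=100, seed=0):
    """Average final outbreak size over several independent cascades."""
    rng = random.Random(seed)
    return sum(sir_spread(adj, seeds, beta, rng) for _ in range(runs)) / runs


# Hypothetical path graph 0 - 1 - 2 - 3, used only to exercise the code.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
full_outbreak = average_spread(path, seeds=[0], beta=1.0)
no_outbreak = average_spread(path, seeds=[0], beta=0.0)
```

Comparing `average_spread` for seed sets chosen by different centrality measures, at the same `beta`, gives the kind of curve comparison the figure describes.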
Time consumption (in seconds) of the different methods on the six datasets. For example, obtaining the key nodes of the Dutch college network with EMI takes only 0.0015 s. For each dataset, the shortest running time is highlighted in bold.
| Data | Deg | K-s | Clo | Bet | Str | Rad | Bri | Cen | Ecc | DMNC | EMI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Dc | 0.043 | 0.078 | 0.055 | 0.040 | 0.041 | 0.042 | 0.032 | 0.023 | 0.025 | 0.059 | 0.0015 |
| US-Air | 4.750 | 4.120 | 4.520 | 20.59 | 4.500 | 4.390 | 8.700 | 4.650 | 4.580 | 6.220 | |
| A-tc | 0.400 | 0.377 | 0.413 | 0.393 | 0.402 | 0.412 | 0.580 | 0.362 | 0.400 | 0.418 | |
| E-road | 0.495 | 0.399 | 0.402 | 0.413 | 0.412 | 0.419 | 0.501 | 0.471 | 0.333 | 0.413 | |
| Chicago | 0.520 | 0.411 | 0.780 | 0.790 | 0.690 | 0.512 | 0.742 | 0.720 | 0.810 | 0.991 | |
| Dolphins | 0.0033 | 0.0030 | 0.0043 | 0.0044 | 0.0041 | 0.0037 | 0.0048 | 0.0046 | 0.0050 | 0.0051 | |