Felipe Montes, Ana María Jaramillo, Jose D Meisel, Albert Diaz-Guilera, Juan A Valdivia, Olga L Sarmiento, Roberto Zarama.
Abstract
The explosion of network science has permitted an understanding of how the structure of social networks affects the dynamics of social contagion. In community-based interventions with spill-over effects, identifying influential spreaders may be harnessed to increase the spreading efficiency of social contagion, measured as the time needed to reach the entire largest connected component of the network. Several strategies have been shown to be efficient, using only data and simulation-based models, in specific network topologies, but without consensus on an overall result. Hence, the purpose of this paper is to benchmark the spreading efficiency of seeding strategies in relation to network structural properties and sizes. We simulate spreading processes on empirical and simulated social networks within a wide range of densities, clustering coefficients, and sizes. We also propose three new decentralized seeding strategies that are structurally different from well-known strategies: community hubs, ambassadors, and random hubs. We observe that the efficiency ranking of strategies varies with the network structure. In general, for sparse networks with community structure, decentralized influencers are suitable for increasing the spreading efficiency. By contrast, when the networks are denser, centralized influencers perform better. These results provide a framework for selecting efficient strategies according to the different contexts in which social networks emerge.
Year: 2020 PMID: 32111953 PMCID: PMC7048861 DOI: 10.1038/s41598-020-60239-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Centralized and decentralized seeding strategies in undirected networks. Node colors represent communities detected using the Louvain method. The highlighted nodes and their corresponding edges represent the seednodes selected using each strategy, and node size represents the selection order within the set of seednodes. Centralized seednodes are those with (a) the highest degree centrality: Hubs, (b) the highest betweenness centrality, (c) the highest closeness centrality, (d) the highest Page-Rank, and (e) nodes in the k-core. Decentralized seednodes are (f) nodes with the highest voting score, calculated as the sum of the voting ability of their neighbors: Vote-Rank, (g) nodes of a detected community with the highest external degree: Ambassadors, (h) nodes of a detected community with the highest internal degree: Community Hubs, and (i) the most connected neighbor of randomly chosen nodes: Random Hubs. The Random seeding strategy is not represented in the figure.
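The three decentralized strategies proposed here (Ambassadors, Community Hubs, and Random Hubs) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the adjacency-dict representation, the function names, the toy graph, the choice of one seed per community, and the tie-breaking rule are all assumptions, and the community labels are taken as given (for example, from a Louvain run).

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency dict {node: set(neighbors)}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def _best_per_community(adj, community, internal):
    """Pick, per community, the node with the most neighbors inside
    (internal=True) or outside (internal=False) its own community.
    Ties go to the smallest node id (an arbitrary, assumed rule)."""
    best = {}
    for node in sorted(adj):
        label = community[node]
        count = sum((community[nbr] == label) == internal for nbr in adj[node])
        if label not in best or count > best[label][0]:
            best[label] = (count, node)
    return {node for _, node in best.values()}


def community_hubs(adj, community):
    """Node with the highest internal degree in each community."""
    return _best_per_community(adj, community, internal=True)


def ambassadors(adj, community):
    """Node with the highest external degree in each community."""
    return _best_per_community(adj, community, internal=False)


def random_hubs(adj, n_seeds, rng):
    """Visit nodes in random order and seed each one's best-connected neighbor.
    May return fewer than n_seeds if the same hubs keep being selected."""
    order = sorted(adj)
    rng.shuffle(order)
    seeds = set()
    for node in order:
        if len(seeds) == n_seeds:
            break
        if adj[node]:
            seeds.add(max(sorted(adj[node]), key=lambda nbr: len(adj[nbr])))
    return seeds


# Toy graph (hypothetical): community "a" = {0, 1, 2, 3}, community "b" = {4, 5, 6}.
edges = [(0, 1), (0, 2), (0, 3), (3, 4), (4, 5), (5, 6)]
adj = build_adjacency(edges)
community = {0: "a", 1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}

print(sorted(community_hubs(adj, community)))  # [0, 5]
print(sorted(ambassadors(adj, community)))     # [3, 4]
print(sorted(random_hubs(adj, 2, random.Random(0))))
```

In the toy graph, node 0 has the most links inside community "a" while node 3 holds its only link to community "b", so the two community-based strategies pick different seeds from the same community.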
Table 1. Acronyms of network structures categorized according to density and clustering coefficient. For example, LD-LC describes a network with low density and low clustering coefficient. We classified the empirical networks into two of these categories, and we generated random networks for the six categories not marked with *. An asterisk (*) marks the categories where it is not possible to generate a connected network within the given ranges of density and clustering coefficient. For each category, we generated networks of three sizes: Small networks with 200 nodes, Medium networks with 1000 nodes, and Large networks with 2000 nodes.
| | | Clustering Coefficient | | |
|---|---|---|---|---|
| | | Low [0-0.1] | Medium (0.1-0.2] | High (0.2-1] |
| Density | Low [0-0.1] | LD-LC | LD-MC | LD-HC |
| | Medium (0.1-0.2] | * | MD-MC | MD-HC |
| | High (0.2-1] | * | * | HD-HC |
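The ranges in the table translate directly into a small classifier. The sketch below is illustrative only: the function names `level` and `category` are hypothetical, and the half-open intervals follow the bracket notation of the table.

```python
def level(value):
    """Map a density or clustering coefficient in [0, 1] to L, M, or H,
    using the intervals [0, 0.1], (0.1, 0.2], and (0.2, 1]."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie in [0, 1]")
    if value <= 0.1:
        return "L"
    if value <= 0.2:
        return "M"
    return "H"


def category(density, clustering):
    """Build the acronym, for example 'LD-HC' for low density, high clustering."""
    return f"{level(density)}D-{level(clustering)}C"


# Categories for which connected random networks were generated (not marked with *).
GENERATED = {"LD-LC", "LD-MC", "LD-HC", "MD-MC", "MD-HC", "HD-HC"}

print(category(0.004, 0.69))  # LD-HC
print(category(0.15, 0.47))   # MD-HC
print(category(0.15, 0.05) in GENERATED)  # False: MD-LC is marked with *
```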
Characteristics reported for the largest connected component (LCC) of the empirical networks. Columns, in order: number of nodes, number of edges, density, mean clustering coefficient, mean degree, number of communities, modularity, diameter of the network, average shortest path length, and degree assortativity coefficient. Networks are ordered from lowest to highest density.
| Network | Nodes | Edges | Density | Mean clustering coefficient | Mean degree | Communities | Modularity | Diameter | Average shortest path length | Degree assortativity |
|---|---|---|---|---|---|---|---|---|---|---|
| Spanish physicists co-authorship network | (Medium) 1162 | 3017 | 0.004 | 0.69 | 5.19 | 31 | 0.9 | 22 | 8.57 | 0.03 |
| Karnataka network | (Medium) 1118 | 5185 | 0.01 | 0.68 | 9.28 | 25 | 0.75 | 9 | 4.18 | 0.28 |
| Global supply chain project network | (Small) 211 | 1507 | 0.03 | 0.62 | 7.14 | 5 | 0.28 | 4 | 2.23 | |
| Recreovía Facebook friendship network | (Small) 231 | 2542 | 0.10 | 0.52 | 22 | 5 | 0.29 | 7 | 2.65 | 0.08 |
| School children friendship network | (Small) 25 | 87 | 0.15 | 0.47 | 3.48 | 3 | 0.33 | 4 | 1.93 | 0.24 |
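As a quick sanity check on how density and mean degree relate to the node and edge counts, the sketch below applies the standard definitions for a simple undirected graph. The text here does not state which formulas were used, so these definitions are an assumption, and they are checked only against the first two rows of the table.

```python
def density(n_nodes, n_edges):
    """Density of a simple undirected graph: 2E / (N (N - 1))."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def mean_degree(n_nodes, n_edges):
    """Mean degree of an undirected graph: 2E / N."""
    return 2 * n_edges / n_nodes


# Spanish physicists co-authorship network: 1162 nodes, 3017 edges.
print(round(density(1162, 3017), 3), round(mean_degree(1162, 3017), 2))  # 0.004 5.19

# Karnataka network: 1118 nodes, 5185 edges.
print(round(density(1118, 5185), 2), round(mean_degree(1118, 5185), 2))  # 0.01 9.28
```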
Figure 2. Ranking of the seeding strategies according to their spreading efficiency in five empirical networks, varying the initial percentage of seednodes and the probability of contagion. The panels are ordered from the lowest to the highest network density: (a) Spanish physicists co-authorship network, (b) Karnataka network, (c) Global supply chain project network, (d) Recreovía Facebook friendship network, and (e) School children friendship network. We ranked the 10 seeding strategies according to the number of strategies each one outperforms in terms of spreading efficiency. The least efficient strategy is colored reddest, and the strategy that outperforms the most strategies is colored greenest. The axis shows the ranking according to the total number of strategies outperformed by each strategy across all the simulation scenarios of the spreading processes.
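To make "spreading efficiency" concrete, the sketch below measures the number of steps a contagion needs to cover a connected network from a given set of seednodes. The contagion model is not specified in this excerpt, so a simple susceptible-infected process is assumed: at every step, each infected node infects each susceptible neighbor independently with probability `p`. The function name, the `max_steps` cutoff, and the toy path graph are hypothetical.

```python
import random


def spreading_time(adj, seeds, p, rng, max_steps=10_000):
    """Return the number of steps until every node in adj is infected,
    or None if the graph is not covered within max_steps.
    adj is an undirected adjacency dict {node: set(neighbors)}."""
    infected = set(seeds)
    steps = 0
    while len(infected) < len(adj) and steps < max_steps:
        newly_infected = set()
        for node in sorted(infected):
            for nbr in sorted(adj[node]):
                # Each infected-susceptible contact transmits with probability p.
                if nbr not in infected and rng.random() < p:
                    newly_infected.add(nbr)
        infected |= newly_infected
        steps += 1
    return steps if len(infected) == len(adj) else None


# Path graph 0-1-2-3-4: with p = 1 the outcome is deterministic.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(spreading_time(path, {0}, 1.0, random.Random(0)))     # 4
print(spreading_time(path, {2}, 1.0, random.Random(0)))     # 2
print(spreading_time(path, {0, 4}, 1.0, random.Random(0)))  # 2
```

A lower value means a more efficient seed set. With `p` below 1 the result is random, so comparisons between strategies would be made over many runs.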
Characteristics reported for the largest connected component (LCC) of the generated random networks. Each value is the average of that measure over the 30 generated networks. Reported measures: size of the networks (number of nodes), network type (acronyms explained in Table 1), average number of edges, density, density standard deviation, average clustering coefficient, clustering standard deviation, average degree, number of communities with its confidence interval (values with * do not have a standard deviation), modularity, and modularity standard deviation.
| Size | Type | Average edges | Density | Average clustering coefficient | Clustering SD | Average degree | Communities | Modularity | Modularity SD |
|---|---|---|---|---|---|---|---|---|---|
| Small (200) | LD-LC | 590.97 | 0.03 | 0.10 | 0.017 | 5.91 | 9.37 | 0.37 | 0.01 |
| | LD-MC | 1535.03 | 0.08 | 0.16 | 0.011 | 15.35 | 7.60 | 0.20 | 0.01 |
| | LD-HC | 396.00 | 0.02 | 0.40 | 0.030 | 3.96 | 10.57 | 0.58 | 0.01 |
| | MD-MC | 2253.60 | 0.11 | 0.20 | 0.007 | 22.54 | 7.37 | 0.16 | 0 |
| | MD-HC | 3512.40 | 0.18 | 0.31 | 0.005 | 35.12 | 6.40 | 0.14 | 0 |
| | HD-HC | 9119.60 | 0.46 | 0.55 | 0.004 | 91.20 | 3.60 | 0.06 | 0 |
| Medium (1000) | LD-LC | 19590.93 | 0.04 | 0.10 | 0.002 | 39.18 | 9.07 | 0.13 | 0 |
| | LD-MC | 19411.23 | 0.04 | 0.15 | 0.003 | 38.82 | 8.67 | 0.15 | 0 |
| | LD-HC | 38398.63 | 0.08 | 0.43 | 0.025 | 76.80 | 4.07 | 0.44 | 0 |
| | MD-MC | 56400.00 | 0.11 | 0.19 | 0.002 | 112.80 | 8.37 | 0.07 | 0 |
| | MD-HC | 73585.63 | 0.15 | 0.36 | 0.029 | 147.17 | 4.10 | 0.23 | 0.03 |
| | HD-HC | 181035.50 | 0.36 | 0.43 | 0.001 | 362.07 | 2.03 | 0.11 | 0.00 |
| Large (2000) | LD-LC | 59100.00 | 0.03 | 0.08 | 0.001 | 59.10 | 9.43 | 0.11 | 0 |
| | LD-MC | 96297.00 | 0.05 | 0.14 | 0.001 | 96.30 | 7.47 | 0.11 | 0 |
| | LD-HC | 153588.50 | 0.08 | 0.29 | 0.025 | 153.59 | 4.30 | 0.33 | 0.04 |
| | MD-MC | 225600.00 | 0.11 | 0.19 | 0.001 | 225.60 | 8.30 | 0.05 | 0 |
| | MD-HC | 357568.07 | 0.18 | 0.34 | 0.001 | 357.57 | 3.00 * | 0.11 | 0 |
| | HD-HC | 494638.50 | 0.25 | 0.35 | 0.001 | 494.64 | 3.00 * | 0.10 | 0 |
Figure 3. Ranking of the seeding strategies according to their spreading efficiency in 540 random networks classified according to their size, density, and clustering. Thirty networks were generated for each combination of size, density, and clustering ranges using an algorithm for growing networks with tunable clustering[39]. Each panel represents a network size: (a) Small networks (200 nodes), (b) Medium networks (1000 nodes), and (c) Large networks (2000 nodes). Within each panel, network structures are labeled with acronyms according to the ranges of density and clustering coefficient, as explained in Table 1, and ordered from left to right according to their clustering coefficient range. For each network category, we ranked the 10 seeding strategies according to the number of strategies each one outperforms in terms of spreading efficiency. The ranking was obtained by adding the number of strategies outperformed by each strategy in 30 simulation runs of the spreading process for each of the 30 networks of that category. The heatmap was then colored according to the ranking, with the reddest color for a seeding strategy with zero outperforms and the bluest for a strategy with nine outperforms. For each network category, strategies are ordered from top to bottom according to the ranking. To initialize the spreading process, we fixed the number of seednodes as the number of communities detected in each network using the Louvain method for community detection[40].
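The ranking by "number of outperformed strategies" can be sketched as follows. This is a hedged illustration rather than the authors' procedure: it assumes that one strategy outperforms another in a run when its spreading time is strictly lower, and that these counts are summed over runs. The function names and the spreading times are hypothetical.

```python
def outperform_counts(times):
    """times maps each strategy name to a list of spreading times, one per run.
    Returns, for each strategy, how many (run, rival) pairs it wins."""
    strategies = list(times)
    n_runs = len(next(iter(times.values())))
    counts = {s: 0 for s in strategies}
    for run in range(n_runs):
        for s in strategies:
            counts[s] += sum(
                times[s][run] < times[t][run] for t in strategies if t != s
            )
    return counts


def ranking(times):
    """Order strategies from most to fewest outperforms (ties broken by name)."""
    counts = outperform_counts(times)
    return sorted(counts, key=lambda s: (-counts[s], s))


# Hypothetical spreading times for three strategies over three runs.
times = {
    "Hubs": [3, 4, 5],
    "Vote-Rank": [2, 3, 6],
    "Random": [5, 5, 7],
}
print(outperform_counts(times))  # {'Hubs': 4, 'Vote-Rank': 5, 'Random': 0}
print(ranking(times))            # ['Vote-Rank', 'Hubs', 'Random']
```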
Summary of the top three most efficient strategies ranked for each combination of density, clustering, and size of the networks. Network structures are shown with acronyms according to the ranges of density, clustering coefficient and size as explained in Table 1. Network size is represented by S: Small networks (200 nodes), M: Medium networks (1000 nodes), and L: Large networks (2000 nodes). Strategies are ranked from 1 (most efficient strategy) to 3.
| Type | Size | Centralized | Decentralized | Random | |||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Hubs | Betweenness | Closeness | Page-Rank | K-Core | Vote-Rank | Ambassadors | Community-Hubs | Random-Hubs | |||
| LD-LC | S | 1 | 2 | 3 | |||||||
| M | 3 | 2 | 1 | ||||||||
| L | 1 | 3 | 2 | ||||||||
| LD-MC | S | 2 | 3 | 1 | |||||||
| M | 3 | 1 | 2 | ||||||||
| L | 3 | 2 | 1 | ||||||||
| LD-HC | S | 2 | 3 | 1 | |||||||
| M | 1 | 3 | 2 | ||||||||
| L | 2 | 1 | 3 | ||||||||
| MD-MC | S | 3 | 1 | 2 | |||||||
| M | 1 | 3 | 2 | ||||||||
| L | 1 | 2 | 3 | ||||||||
| MD-HC | S | 3 | 1 | 2 | |||||||
| M | 3 | 1 | 2 | ||||||||
| L | 1 | 2 | 3 | ||||||||
| HD-HC | S | 1 | 2 | 3 | |||||||
| M | 2 | 3 | 1 | ||||||||
| L | 1 | 3 | 2 | ||||||||
Figure 4. Degeneracy coefficient of seeding strategies for 30 simulated networks in the LD-LC category. Each panel represents a network size: (a) Small networks (200 nodes), (b) Medium networks (1000 nodes), and (c) Large networks (2000 nodes). We define the degeneracy coefficient of two sets of seednodes (not to be confused with the k-degeneracy used in graph theory) as the fraction of nodes that belong to both sets of a pair. Let A and B be two sets, /. The lightest blue indicates a degeneracy coefficient of 0 between a pair of strategies, meaning that the two strategies have no nodes in common. The darkest blue, as on the diagonal, indicates a degeneracy coefficient of 1, meaning that both strategies contain the same nodes.
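A pairwise overlap matrix of this kind can be computed as sketched below. The caption does not spell out the normalization, so this sketch assumes the Jaccard form |A ∩ B| / |A ∪ B|, which gives 0 for disjoint sets and 1 for identical sets, consistent with the description. The function names and the seed sets are hypothetical.

```python
def degeneracy(a, b):
    """Overlap between two seed sets, assumed here to be |A ∩ B| / |A ∪ B|.
    Two empty sets are treated as identical (coefficient 1)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def degeneracy_matrix(seed_sets):
    """Return {(row, column): coefficient} for every ordered pair of strategies."""
    names = list(seed_sets)
    return {(r, c): degeneracy(seed_sets[r], seed_sets[c]) for r in names for c in names}


# Hypothetical seed sets chosen by three strategies on the same network.
seed_sets = {
    "Hubs": {1, 2, 3},
    "Page-Rank": {2, 3, 4},
    "Ambassadors": {7, 8, 9},
}
matrix = degeneracy_matrix(seed_sets)
print(matrix[("Hubs", "Hubs")])         # 1.0
print(matrix[("Hubs", "Page-Rank")])    # 0.5
print(matrix[("Hubs", "Ambassadors")])  # 0.0
```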