Christian L Staudt, Michael Hamann, Alexander Gutfraind, Ilya Safro, Henning Meyerhenke.
Abstract
Research on generative models plays a central role in the emerging field of network science, studying how statistical patterns found in real networks could be generated by formal rules. Output from these generative models is then the basis for designing and evaluating computational methods on networks, including verification and simulation studies. During the last two decades, a variety of models have been proposed with the ultimate goal of achieving comprehensive realism for the generated networks. In this study, we (a) introduce a new generator, termed ReCoN; (b) explore how ReCoN and some existing models can be fitted to an original network to produce a structurally similar replica; (c) use ReCoN to produce networks much larger than the original exemplar; and finally (d) discuss open problems and promising research directions. In a comparative experimental study, we find that ReCoN is often superior to many other state-of-the-art network generation methods. We argue that ReCoN is a scalable and effective tool for modeling a given network while preserving important properties at both micro- and macroscopic scales, and for scaling the exemplar data by orders of magnitude in size.
Keywords: Communities; Multiscale modeling; Network generation; Network modeling
Year: 2017 PMID: 30533515 PMCID: PMC6225971 DOI: 10.1007/s41109-017-0054-z
Source DB: PubMed Journal: Appl Netw Sci ISSN: 2364-8228
Fig. 1 Scaling behavior of 100 Facebook networks; from left to right and top to bottom: number of edges, maximum degree, Gini coefficient of degree distribution, average local clustering coefficient, diameter, number of components, number of communities found by Parallel Louvain Method
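Two of the scalar properties named in this caption, the Gini coefficient of the degree distribution and the average local clustering coefficient, can be computed directly from an adjacency structure. The following is a minimal Python sketch, not the authors' implementation: the function names `gini` and `average_local_clustering`, the dict-of-sets graph representation, and the toy graph are assumptions made for illustration. Nodes with fewer than two neighbours are assigned a clustering coefficient of 0, which is one common convention and may differ from the one used in the study.

```python
from itertools import combinations


def gini(values):
    """Gini coefficient of a non-negative sequence (0 means perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula on the ascending-sorted values.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def average_local_clustering(adj):
    """Mean over all nodes of the fraction of neighbour pairs that are linked.

    adj maps each node to the set of its neighbours (undirected, no self-loops).
    Nodes with fewer than two neighbours contribute 0 (assumed convention).
    """
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs) if coeffs else 0.0


# Toy graph (illustrative only): triangle 0-1-2 plus node 3 attached to node 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
degrees = [len(nbrs) for nbrs in toy.values()]

print(gini(degrees))                  # degree inequality of the toy graph
print(average_local_clustering(toy))  # average local clustering of the toy graph
```

For graphs of the size discussed here, a dedicated network analysis library would be the practical choice; the sketch only spells out what the two quantities measure.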
Parameters set to fit a model to a given graph, and to produce a scaled-up replica
| Model | Parameters | Fitting | Fitting scaling by |
|---|---|---|---|
| Erdős–Rényi | | | |
| Barabási–Albert | | | |
| Chung-Lu | | | |
| Edge-Switching Markov Chain | | | |
| R-MAT | | | |
| Hyperbolic Unit-Disk | | | |
| BTER | | | |
| LFR | | | |
Fig. 2 Running time of 50 iterations of the kronfit algorithm in relation to the number of edges m of the input network
Fig. 3 Speedup of NetworKit implementation of LFR compared to the reference implementation (Fortunato 2017) when replicating the set of 100 Facebook networks with m edges. Each point represents one network. The curve represents a linear regression model fit with its confidence interval (shaded area)
Fig. 4 Scaling behavior of the different generators on the fb-Caltech36 network. From left to right and top to bottom: number of edges, max. degree, Gini coefficient of the degree distribution, average local clustering coefficient, diameter, number of components, number of communities. Each data point is the average over ten runs, the error bars show the standard deviation
Fig. 5 Running time replication of a set of network analysis algorithms. Running times are in edges per second, i.e., higher is faster
Fig. 6 Fitting and generating: processing speed measured in edges per second (size of replica graph/total running time, measured on 100 Facebook graphs)
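The processing speed in Fig. 6 is defined in the caption as the size of the replica graph divided by the total running time of fitting and generating. A minimal Python sketch of that measurement follows. The names `edges_per_second` and `timed_replication`, and the `fit` and `generate` callables passed to them, are hypothetical placeholders and do not correspond to any particular library API.

```python
import time


def edges_per_second(num_edges, seconds):
    """Processing speed: replica size in edges divided by total running time."""
    if seconds <= 0:
        raise ValueError("running time must be positive")
    return num_edges / seconds


def timed_replication(fit, generate, original):
    """Time one fit-and-generate cycle.

    fit(original) returns model parameters; generate(params) returns the
    number of edges of the replica. Both callables are hypothetical.
    """
    start = time.perf_counter()
    params = fit(original)
    replica_edges = generate(params)
    elapsed = time.perf_counter() - start
    return replica_edges, elapsed


# Example with fixed numbers: a replica with 183,831 edges built in 2 seconds.
print(edges_per_second(183831, 2.0))
```

Timing fitting and generation together, as the caption does, keeps the comparison fair for models whose fitting step is much more expensive than generation.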
Fig. 7 A small social network and its scale-2 replicas produced by different models. ReCoN is the model that best reproduces a set of essential properties, including degree distributions, clustering and community structure. a Original, b R-MAT with kronfit, c BTER, d ReCoN
Fig. 8 Colorado Springs epidemiological contact network. a original network, b scale-2 replica and c sample from a scale-200000 replica
Fig. 9 Structure replication of Facebook networks. a Relative deviation of scalar network properties, b Distribution of centrality scores
Fig. 11 Structure replication of Facebook networks with scaling factor 4. a Relative deviation of scalar network properties, b Distribution of centrality scores
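The relative deviation of a scalar network property, as plotted in panel a of the two structure replication figures, can be illustrated with a short Python sketch. It assumes the signed form (replica value minus original value, divided by original value); the exact definition used in the study may differ. The function names and the property values below are hypothetical and are not taken from the paper.

```python
def relative_deviation(original, replica):
    """Signed relative deviation of a replica's property from the original's."""
    if original == 0:
        raise ValueError("relative deviation is undefined for a zero reference value")
    return (replica - original) / original


def compare_properties(original_props, replica_props):
    """Relative deviation for every property present in original_props."""
    return {
        name: relative_deviation(value, replica_props[name])
        for name, value in original_props.items()
    }


# Hypothetical property values for illustration only.
original_props = {"edges": 1000, "max_degree": 50, "clustering": 0.25}
replica_props = {"edges": 1100, "max_degree": 40, "clustering": 0.25}

deviations = compare_properties(original_props, replica_props)
for name, dev in deviations.items():
    print(f"{name}: {dev:+.2f}")
```

A value of 0 means the replica matches the original on that property; positive and negative values indicate overshoot and undershoot respectively.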
Additional networks used
| Network | Type | Nodes | Edges |
|---|---|---|---|
| Email-Enron | Email communication | 36,692 | 183,831 |
| PGPgiantcompo | PGP web of trust | 10,680 | 24,316 |
| As-22july06 | Internet topology | 22,963 | 48,436 |
| Hep-th | Scientific coauthorship | 8,361 | 15,751 |
| CoAuthorsDBLP | Scientific coauthorship | 299,067 | 977,676 |
| Dolphins | Animal social network | 62 | 159 |
| Power | Power grid | 4,941 | 6,594 |
| Cnr-2000 | Web graph | 325,557 | 2,738,969 |
The network email-Enron was taken from the Stanford Large Network Dataset Collection (Leskovec and Krevl 2014); all other networks are from the clustering instances of the 10th DIMACS Implementation Challenge (Bader et al. 2014).