Thong Pham, Paul Sheridan, Hidetoshi Shimodaira.
Abstract
Preferential attachment is a stochastic process that has been proposed to explain certain topological features characteristic of complex networks from diverse domains. The systematic investigation of preferential attachment is an important area of research in network science, not only for the theoretical matter of verifying whether this hypothesized process is operative in real-world networks, but also for the practical insights that follow from knowledge of its functional form. Here we describe a maximum-likelihood-based estimation method for the measurement of preferential attachment in temporal complex networks. We call the method PAFit, and implement it in an R package of the same name. PAFit constitutes an advance over previous methods primarily because we based it on a nonparametric statistical framework that enables attachment kernel estimation free of any assumptions about its functional form. We show that this results in PAFit outperforming the popular methods of Jeong and Newman in Monte Carlo simulations. What is more, we found that the application of PAFit to a publicly available Flickr social network dataset yielded clear evidence for a deviation of the attachment kernel from the popularly assumed log-linear form. Independent of our main work, we provide a correction to a consequential error in Newman's original method which has evidently gone unnoticed since its publication over a decade ago.
Year: 2015 PMID: 26378457 PMCID: PMC4574777 DOI: 10.1371/journal.pone.0137796
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Table 1. Summary of the existing attachment kernel estimation methods.
| Method | Form of A | Estimation method |
|---|---|---|
| Newman [ | Nonparametric | Weighted sum of multiple histograms |
| Jeong et al. [ | Nonparametric | Single histogram |
| Massen et al. [ | | Iterative fixed-point algorithm |
| Sheridan et al. [ | | Markov chain Monte Carlo |
| Gomez et al. [ | | ML by grid search |
| Kunegis et al. [ | | Regression |
Summary of the existing methods for estimating the attachment kernel A. Nonparametric methods do not assume a functional form for A.
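To make the "single histogram" entry concrete, the following is a minimal sketch of a histogram-style estimate, not the exact procedure of Jeong et al. It assumes a snapshot of node degrees at some reference time and a list of the existing nodes that received new edges in a following window; the function name `single_histogram_kernel` and both argument names are hypothetical. For each degree k, the estimate is the number of new edges received by degree-k nodes divided by the number of degree-k nodes, which is proportional to A at k under these assumptions.

```python
from collections import Counter


def single_histogram_kernel(degree_at_t0, new_edge_targets):
    """Histogram-style estimate of the attachment kernel (illustrative sketch).

    degree_at_t0     : dict mapping node -> degree at the reference snapshot
    new_edge_targets : nodes that received a new edge in the observation window
    Returns a dict k -> value proportional to the kernel at degree k.
    """
    # n_k: how many nodes had degree k at the snapshot
    nodes_with_degree = Counter(degree_at_t0.values())
    # hits[k]: new edges received by nodes whose snapshot degree was k;
    # targets that did not exist at the snapshot are ignored
    hits = Counter(
        degree_at_t0[v] for v in new_edge_targets if v in degree_at_t0
    )
    # Degrees that received no edges are left out rather than reported as 0
    return {
        k: hits[k] / n_k
        for k, n_k in nodes_with_degree.items()
        if hits[k] > 0
    }


if __name__ == "__main__":
    snapshot = {"a": 1, "b": 1, "c": 2, "d": 5}
    targets = ["a", "c", "c", "d", "d", "d", "z"]  # "z" is a new node
    print(single_histogram_kernel(snapshot, targets))
```

Because only one snapshot and one window are used, degrees that are rare at the snapshot get noisy estimates; this is one reason a method might instead combine several histograms, as in the first row of the table.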
Fig 1. Estimation of the attachment kernel when the true model is A = 3(log(max(k, 1)))^2 + 1.
A: Jeong’s method. B: Newman’s method. C: Corrected Newman’s method. D: PAFit. The solid line depicts the true model. The plots are on a log-log scale. The gray vertical lines are the confidence intervals of the values estimated by PAFit. Confidence intervals are not available for the other methods.
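As an illustration of what a "true model" means here, the sketch below evaluates the kernel from the caption and grows a small network under it. Several things are assumptions rather than details from the source: the logarithm is taken to be natural, k is taken to be the in-degree, one node with one outgoing edge arrives per time-step, and the names `kernel` and `simulate` are hypothetical.

```python
import math
import random


def kernel(k):
    """Kernel from the caption: A = 3 * (log(max(k, 1)))^2 + 1.
    Natural logarithm is an assumption of this sketch."""
    return 3.0 * math.log(max(k, 1)) ** 2 + 1.0


def simulate(n_nodes, seed=0):
    """Grow a directed network with one new node and one new edge per step.

    The new node links to an existing node chosen with probability
    proportional to kernel(in-degree). Returns (time, source, target) triples.
    """
    rng = random.Random(seed)
    in_degree = [0]  # the network starts from a single node with id 0
    edges = []
    for t in range(1, n_nodes):
        weights = [kernel(k) for k in in_degree]
        target = rng.choices(range(len(in_degree)), weights=weights, k=1)[0]
        edges.append((t, t, target))  # the node arriving at step t has id t
        in_degree[target] += 1
        in_degree.append(0)
    return edges


if __name__ == "__main__":
    edges = simulate(1000)
    print(len(edges), "edges; first five:", edges[:5])
```

A time-stamped edge list of this kind is the sort of input an estimation method would be scored against, since the kernel that generated it is known.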
Table 2. Summary of true attachment kernels used in the Monte Carlo simulation.
| Attachment kernel | Parameter | Value |
|---|---|---|
| | | 0.5, 0.6, …, 1.5 (11 values) |
| | | 0.8, 1.0, 1.2 |
| | | 2, 3 |
Summary of true attachment kernels used in the Monte Carlo simulation. There is a total of 16 different kernels.
Fig 2. Comparison of the five methods by average relative error.
A: B = 100. B: B = 20. See Table 2 for the details of the true attachment kernels A used here.
Table 3. Summary statistics for the Flickr social network dataset.
| ∣V∣ | ∣E∣ | T | Δ∣V∣ | Δ∣E∣ | Power-law exponent |
|---|---|---|---|---|---|
| 2302925 | 33140018 | 134 | 815867 | 16105211 | 2.15 |
Summary statistics for the Flickr social network dataset [35]. This is a directed simple network. The numbers ∣V∣ and ∣E∣ are the total numbers of nodes and edges in the final network, respectively. T is the number of observed time-steps, while Δ∣V∣ and Δ∣E∣ are the increments of nodes and edges after time t = 0, respectively. The last value is the power-law exponent of the degree distribution of the final network [6].
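The quantities defined in this caption can be computed from a time-stamped edge list. The sketch below is illustrative only: the input format of (time, source, target) triples, the function name `summarize`, and the choice to count T as the number of distinct time stamps after t = 0 are all assumptions, not details taken from the source. The power-law exponent is not computed here.

```python
def summarize(edges, t0=0):
    """Summary statistics of a temporal directed network (illustrative sketch).

    edges : iterable of (time, source, target) triples with no duplicates
    t0    : time stamp of the initial network
    Returns a dict with |V|, |E|, T, delta|V| and delta|E|.
    """
    all_nodes = set()
    initial_nodes = set()
    initial_edges = 0
    total_edges = 0
    later_times = set()
    for t, u, v in edges:
        total_edges += 1
        all_nodes.update((u, v))
        if t <= t0:
            # edge already present in the initial network
            initial_nodes.update((u, v))
            initial_edges += 1
        else:
            later_times.add(t)
    return {
        "V": len(all_nodes),                             # nodes in the final network
        "E": total_edges,                                # edges in the final network
        "T": len(later_times),                           # time-steps after t0 (assumed convention)
        "delta_V": len(all_nodes) - len(initial_nodes),  # nodes added after t0
        "delta_E": total_edges - initial_edges,          # edges added after t0
    }


if __name__ == "__main__":
    toy = [(0, 1, 2), (0, 2, 3), (1, 4, 1), (2, 5, 1), (2, 4, 2)]
    print(summarize(toy))
```

In this toy input, nodes 1 to 3 and two edges exist at t = 0, and two nodes and three edges arrive over two later time-steps.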
Fig 3. Estimation of the attachment kernel in the Flickr social network dataset.
A: Jeong’s method. B: Newman’s method. C: Corrected Newman’s method. D: PAFit. The plots are on a log-log scale. The solid line corresponding to A = k is plotted as a visual guide.