Mikołaj Morzy, Tomasz Kajdanowicz, Bolesław K. Szymański.
Abstract
Many collections of numbers do not have a uniform distribution of the leading digit, but conform to a very particular pattern known as Benford's distribution. This distribution has been found in numerous areas such as accounting data, voting registers, census data, and even in natural phenomena. Recently it has been reported that Benford's law applies to online social networks. Here we introduce a set of rigorous tests for adherence to Benford's law and apply them to verify this claim, extending the scope of the experiment to various complex networks and to artificial networks created by several popular generative models. We find insufficient evidence, for either real or artificial networks, that network structural properties commonly conform to Benford's distribution. We find very weak evidence suggesting that three measures (degree centrality, betweenness centrality and local clustering coefficient) could adhere to Benford's law for scale-free networks, but only for a very narrow range of their parameters.
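As an illustration of the pattern the abstract refers to, Benford's law assigns the leading digit d the probability log10(1 + 1/d), so about 30.1% of values start with 1 and about 4.6% start with 9. The following is a minimal sketch, not the authors' code; the function names are hypothetical, and the powers of two are used only as a well-known sequence that follows the law closely.

```python
import math
from collections import Counter


def benford_probabilities():
    """Expected first-digit probabilities P(d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """Return the first significant digit of a non-zero number."""
    # Scientific notation puts the first significant digit at position 0,
    # e.g. 0.00123 -> '1.230000000000000e-03'.
    return int(f"{abs(x):.15e}"[0])


def first_digit_frequencies(values):
    """Observed relative frequency of each leading digit 1..9 (zeros are skipped)."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("no non-zero values supplied")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


# Illustrative data only: the first 1000 powers of two.
powers = [2 ** k for k in range(1, 1001)]
observed = first_digit_frequencies(powers)
expected = benford_probabilities()
for d in range(1, 10):
    print(d, round(observed[d], 3), round(expected[d], 3))
```

For a network, `values` would be a list of a structural measure such as vertex degrees; how the measures were computed in the study is not shown in this record.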
Year: 2016 PMID: 27748398 PMCID: PMC5066226 DOI: 10.1038/srep34917
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Centrality measures: (a) degree, (b) betweenness, (c) closeness, (d) clustering coefficient.
Real-world datasets (the sets used by Golbeck [14] are marked with a † after their name).
| Name | Description | Vertices | Edges |
|---|---|---|---|
| | product co-purchase network | 262 111 | 1 234 877 |
| | paper citation network | 27 770 | 352 807 |
| | scientific collaboration network | 317 080 | 1 049 866 |
| | email communication network | 36 692 | 367 662 |
| | friend counts | 18 298 | 88 234 |
| | social circles network | 107 614 | 30 494 866 |
| | peer-to-peer network | 36 682 | 88 323 |
| | friendship network | 2 793 657 | 6 898 682 |
| | followers counts | 67 648 287 | 67 648 287 |
| | scientific collaboration network | 12 008 | 237 010 |
| | friendship network | 82 168 | 948 464 |
| | website hyperlink network | 281 903 | 2 312 497 |
| | social circles network | 81 306 | 2 420 766 |
| | adminship voting network | 7 115 | 103 689 |
| | friendship network | 1 134 890 | 2 987 624 |
Real world network properties which pass at least 2 goodness of fit tests.
| Dataset | Measure | No. of passed tests |
|---|---|---|
| | degree | 2 |
| | betweenness | 3 |
| | betweenness | 2 |
| | betweenness | 7 |
| | betweenness | 2 |
| | betweenness | 11 |
Artificial network properties which pass at least 2 goodness of fit tests.
| Model | Parameter | Measure | No. of passed tests |
|---|---|---|---|
| preferential.attachment | 1.00 | clustering | 2 |
| preferential.attachment | 1.22 | clustering | 2 |
| preferential.attachment | 1.44 | clustering | 2 |
| preferential.attachment | 1.67 | clustering | 2 |
| preferential.attachment | 1.89 | clustering | 2 |
| preferential.attachment | 2.11 | clustering | 2 |
| preferential.attachment | 2.33 | betweenness | 2 |
| preferential.attachment | 2.33 | clustering | 2 |
| preferential.attachment | 2.56 | betweenness | 2 |
| preferential.attachment | 2.56 | clustering | 2 |
| preferential.attachment | 2.78 | betweenness | 2 |
| preferential.attachment | 2.78 | clustering | 2 |
| preferential.attachment | 3.00 | betweenness | 2 |
| preferential.attachment | 3.00 | clustering | 2 |
| random.graph | 0.001 | clustering | 2 |
| random.graph | 0.001 | clustering | 2 |
| random.graph | 0.001 | clustering | 2 |
| small.world | 0.001 | betweenness | 2 |
Number of accepted goodness-of-fit tests across 60 real-world and 320 artificial network centrality measure distributions.
| Test | Accepted |
|---|---|
| Mantissa Arc test | 21 |
| | 17 |
| Judge-Schechter Mean Deviation test | 11 |
| Joenssen’s | 8 |
| Distortion Factor | 1 |
Summary of distributions used in tests comparing goodness of fit.
| Distribution | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
|---|---|---|---|---|---|---|
| Normal | 0.001028 | 2.676 | 3.981 | 4.031 | 5.335 | 11.69 |
| Benford | 1 | 1.804 | 3.202 | 3.927 | 5.645 | 9.996 |
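The Benford row above is consistent with a sample drawn as 10^U, where U is uniform on [0, 1): such a sample lies in [1, 10), has quartiles near 10^0.25 ≈ 1.78, 10^0.5 ≈ 3.16 and 10^0.75 ≈ 5.62, and has mean 9 / ln 10 ≈ 3.91. The record does not say how the sample was actually generated, so the sketch below is an assumption, and `benford_sample` and `summary` are hypothetical names.

```python
import math
import random
import statistics


def benford_sample(n, seed=0):
    """Draw n values in [1, 10) whose leading digits follow Benford's law.

    Assumption: values are generated as 10**U with U uniform on [0, 1).
    """
    rng = random.Random(seed)
    return [10 ** rng.random() for _ in range(n)]


def summary(values):
    """Six-number summary in the same layout as the table above."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "Min.": min(values),
        "1st Qu.": q1,
        "Median": median,
        "Mean": statistics.fmean(values),
        "3rd Qu.": q3,
        "Max.": max(values),
    }


sample = benford_sample(10_000)
for name, value in summary(sample).items():
    print(f"{name:8s} {value:.3f}")
print("theoretical mean:", round(9 / math.log(10), 3))
```

The exact figures depend on the seed and sample size, so they will not match the table digit for digit.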
Figure 2. Average p-values of tests for different levels of Benford’s distribution purity.
Average purity of Benford’s distribution accepted by each test.
| Test | Test number | Average Purity |
|---|---|---|
| Chi-Square Test for Benford Distribution | t1 | 0.97 |
| Euclidean Distance Test for Benford Distribution | t2 | 0.97 |
| Joint Digits Test | t3 | 0.97 |
| JP-Square Correlation Statistic Test for Benford Distribution | t4 | 0.97 |
| K-S Test for Benford Distribution | t5 | 0.97 |
| Chebyshev Distance Test for Benford Distribution | t6 | 0.97 |
| Freedman-Watson U-squared Test for Benford Distribution | t7 | 0.98 |
| Judge-Schechter Mean Deviation Test for Benford Distribution | t8 | 0.91 |
| Mantissa Arc Test | t9 | 0.98 |
| Distortion Factor | t11 | 0.79 |
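To make the first test in the table concrete, here is a minimal sketch of a chi-square goodness-of-fit check on leading digits. It is not the implementation used in the study, the helper names are hypothetical, and the decision rule is an assumption: conformity is accepted when the statistic stays below 15.507, the upper 5% point of the chi-square distribution with 8 degrees of freedom.

```python
import math
from collections import Counter

# Upper 5% critical value of chi-square with 8 degrees of freedom (9 digits - 1).
CHI2_CRITICAL_8DF_05 = 15.507


def leading_digit(x):
    """Return the first significant digit of a non-zero number."""
    return int(f"{abs(x):.15e}"[0])


def benford_chi_square(values):
    """Pearson chi-square statistic of observed first digits against Benford's law."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no non-zero values supplied")
    statistic = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        statistic += (counts.get(d, 0) - expected) ** 2 / expected
    return statistic


def conforms_to_benford(values, critical=CHI2_CRITICAL_8DF_05):
    """True when the chi-square test does not reject Benford's law at the 5% level."""
    return benford_chi_square(values) < critical


# Illustrative inputs: powers of two follow the law closely,
# while the integers 100..999 have uniformly distributed first digits.
print(conforms_to_benford([2.0 ** k for k in range(1000)]))
print(conforms_to_benford(range(100, 1000)))
```

With large samples this test rejects even small deviations, which may be one reason to combine several tests rather than rely on a single one.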