Amr Elsisy1,2, Boleslaw K Szymanski1,2, Jasmine A Plum1,2, Miao Qi1,2, Alex Pentland3.
Abstract
Milgram empirically showed that people knowing only connections to their friends could locate any person in the U.S. in a few steps. Later research showed that social network topology enables a node aware of its full routing to find an arbitrary target in even fewer steps. Yet, the success of people in forwarding efficiently knowing only personal connections is still not fully explained. To study this problem, we emulate it on a real location-based social network, Gowalla. It provides explicit information about friends and temporal locations of each user useful for studies of human mobility. Here, we use it to conduct a massive computational experiment to establish new necessary and sufficient conditions for achieving social search efficiency. The results demonstrate that only the distribution of friendship edges and the partial knowledge of friends of friends are essential and sufficient for the efficiency of social search. Surprisingly, the efficiency of the search using the original distribution of friendship edges is not dependent on how the nodes are distributed into space. Moreover, the effect of using a limited knowledge that each node possesses about friends of its friends is strongly nonlinear. We show that gains of such use grow statistically significantly only when this knowledge is limited to a small fraction of friends of friends.
Year: 2021 PMID: 34412110 PMCID: PMC8376299 DOI: 10.1371/journal.pone.0255982
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Success rates with error bars collected from 500 runs for each of the five selections of search weights defined in Eq 1 with different seeds for the random number generator for each of 500 distinct pairs of starting and target nodes selected to be at least 1,609 km apart.
The plots include the baseline random search with all metric weights set to 0, three searches using a single metric with weight 1, and the search with the metric weights yielding the highest performance. We plot each rate as a function of the maximum number of i-friends of which a node is aware for each of its friends. The error bars show the standard error.
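As a reading aid for the error bars, the following is a minimal sketch of how a mean success rate and its standard error could be computed from per-run results. The function name and the sample values are hypothetical and are not taken from the paper; the sketch assumes the usual definition of the standard error of the mean (sample standard deviation divided by the square root of the number of runs).

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for per-run values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    mean = statistics.fmean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


# Hypothetical per-run success rates, for illustration only.
rates = [0.5, 0.7, 0.6, 0.6]
mean, sem = mean_and_standard_error(rates)
print(f"mean={mean:.3f}, standard error={sem:.3f}")
```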
Fig 2. (a-b) show plots with error bars of success rates (a) and stretches (b) achieved with partial knowledge of i-friends under the different distributions of friendship edges as a function of various distributions of nodes into space. Plots represent the results of running 10 samples of each distribution resulting in 100 samples for each case of the considered friendship edge distributions. Each of these samples was executed 10 times and the averaged results are plotted. Each friendship edge distribution has a unique color assigned to its plots, and the two best performing distributions also have plots of their stretches achieved without knowledge of i-friends, marked with the dashed line. We describe all distributions of friendship edges and of nodes into space in the text. Plots for runs with awareness of i-friends were computed with the limit kappa on the number of i-friends set to 15; when a friend of the sender has more i-friends than this limit, they are chosen uniformly at random from that friend's i-friends. The error bars were in the range of [0.002, 0.039] for success rates (a) and in the range of [0.07, 1.61] for stretches (b).
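The kappa limit described in this caption can be illustrated with a short sketch. It assumes that the i-friends of a friend are that friend's own friends, excluding the sender, and that the network is stored as a dictionary of neighbor sets. Both are assumptions made for illustration; the function and variable names are hypothetical rather than the authors' code.

```python
import random


def known_i_friends(friends, sender, kappa, rng):
    """Map each friend of `sender` to the i-friends the sender is aware of.

    friends: dict mapping node -> set of friend nodes (assumed representation).
    kappa:   maximum number of i-friends known per friend.
    rng:     a random.Random instance, so runs can be seeded.
    """
    known = {}
    for friend in friends[sender]:
        # Assumption: i-friends of a friend are its friends other than the sender.
        candidates = sorted(friends[friend] - {sender})
        if len(candidates) > kappa:
            # Uniform random choice without replacement, only when needed.
            candidates = rng.sample(candidates, kappa)
        known[friend] = set(candidates)
    return known


# Hypothetical toy network, for illustration only.
graph = {
    "a": {"b", "c"},
    "b": {"a", "x", "y", "z"},
    "c": {"a", "x"},
    "x": {"b", "c"},
    "y": {"b"},
    "z": {"b"},
}
view = known_i_friends(graph, "a", kappa=2, rng=random.Random(0))
print({f: len(v) for f, v in view.items()})
```

With kappa set to 15, as in the caption, only friends with more than 15 i-friends would be subsampled; all others would be known in full.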
Fig 3. The degree distribution for the giant component of a network with 75,803 Gowalla users located in the U.S., 454,350 friendship edges, and γ ≈ 1.49.
Percentages of Gowalla users and the populations of metropolitan areas in the United States.
| No. | Name | Percentage of U.S. population | Percentage of Gowalla Users | Difference |
|---|---|---|---|---|
| 1 | Baltimore-Washington DC | 2.46 | 2.93 | -0.47 |
| 2 | Los Angeles | 7.42 | 1.22 | 6.21 |
| 3 | Dallas-Fort Worth | 6.97 | 2.18 | 4.79 |
| 4 | Austin and San Antonio | 12.71 | 0.97 | 11.74 |
| 5 | Seattle-Tacoma-Bellevue | 2.06 | 1.17 | 0.89 |
| 6 | New York City | 4.25 | 6.30 | -2.05 |
| 7 | Boston | 1.55 | 1.48 | 0.06 |
| 8 | Houston | 2.04 | 2.04 | 0.01 |
| 9 | San Francisco and San Jose | 7.75 | 1.44 | 6.31 |
| 10 | Chicago | 1.89 | 3.00 | -1.12 |
| 11 | Philadelphia | 1.01 | 1.90 | -0.89 |
| 12 | Salt Lake City | 1.14 | 0.36 | 0.78 |
| 13 | Portland | 1.19 | 0.74 | 0.46 |
| 14 | Denver | 1.35 | 0.88 | 0.47 |
| 15 | Atlanta | 1.60 | 1.98 | -0.37 |
| 16 | Oklahoma City | 1.82 | 0.41 | 1.40 |
| 17 | Orlando | 2.77 | 0.73 | 2.04 |
Distributions of fractions of friends, i-friends and communities over the ranges of distances from nodes.
The second column shows the fractions of friends at each distance range, computed by summing the numbers of friends in each range for each individual user and then dividing the result by the total number of friends of all users. The third column shows fractions computed as ratios of sums of the numbers of i-friends of each user at each distance range to the total number of i-friends for all users. The fourth column shows fractions of members of communities of each individual user at each distance range listed, computed as the sum of these numbers divided by the total number of members of all relevant communities. The fifth, sixth and seventh columns list cumulative values of the second, third and fourth columns, respectively.
| Range (km) | Friends (%) | i-friends (%) | Communities (%) | Cumulative friends (%) | Cumulative i-friends (%) | Cumulative communities (%) |
|---|---|---|---|---|---|---|
| ≤ 6.25 | 18.6 | 2.6 | 14.0 | | 2.6 | 14.0 |
| 6.25–12.5 | 8.6 | 1.3 | 4.3 | | 3.9 | 18.3 |
| 12.5–25 | 10.3 | 1.7 | 5.5 | | 5.6 | 23.9 |
| 25–50 | 7.6 | 1.5 | 4.6 | | 7.1 | 28.5 |
| 50–100 | 3.9 | 1.0 | 2.6 | | 8.1 | 31.1 |
| 100–200 | 3.8 | 1.6 | 3.1 | | 9.7 | 34.1 |
| 200–400 | 6.4 | 6.0 | 6.8 | 59.2 | 15.7 | |
| 400–800 | 6.4 | 8.8 | 8.2 | 65.6 | 24.5 | |
| 800–1600 | 11.8 | 23.8 | 17.2 | 77.4 | 48.3 | |
| 1600–3200 | 14.8 | 36.4 | 23.3 | 92.2 | | |
| 3200–6400 | 7.5 | 14.2 | 10.0 | 99.8 | | 99.7 |
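The per-range and cumulative percentages described for this table can be sketched as follows. This is an illustrative reconstruction of the computation, not the authors' code. It assumes that each distance is assigned to the range whose upper edge it does not exceed and that the denominator counts all distances, including any beyond 6400 km, so the cumulative column need not reach 100. The function name and the input values are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate

# Upper edges of the distance ranges in km, matching the table rows.
BIN_EDGES_KM = [6.25, 12.5, 25, 50, 100, 200, 400, 800, 1600, 3200, 6400]


def binned_percentages(distances_per_user):
    """Return (percentages, cumulative percentages) over the distance ranges.

    distances_per_user: one list per user holding the distances in km to that
    user's friends (or i-friends, or community members).
    """
    counts = [0] * len(BIN_EDGES_KM)
    total = 0
    for distances in distances_per_user:
        for d in distances:
            total += 1  # every distance counts toward the denominator
            i = bisect_left(BIN_EDGES_KM, d)
            if i < len(counts):  # beyond the last edge: counted in total only
                counts[i] += 1
    if total == 0:
        raise ValueError("no distances supplied")
    percentages = [100 * c / total for c in counts]
    return percentages, list(accumulate(percentages))


# Hypothetical distances for two users, for illustration only.
pct, cum = binned_percentages([[1.0, 10.0], [30.0, 7000.0]])
print(pct[:4], cum[-1])
```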