Dan Braha, Marcus A M de Aguiar.
Abstract
Social influence plays an important role in human behavior and decisions. Sources of influence can be divided into external sources, which are independent of social context, and peer sources, such as family and friends. An important question is how to disentangle social contagion by peers from external influences. While a variety of experimental and observational studies have provided insight into this problem, identifying the extent of contagion from large-scale observational data with an unknown network structure remains largely unexplored. By bridging the gap between the large-scale complex systems perspective of collective human dynamics and the detailed approach of the social sciences, we present a parsimonious model of social influence and apply it to a central topic in political science: elections and voting behavior. We provide an analytical expression for the county vote-share distribution, which is in excellent agreement with almost a century of observed U.S. presidential election data. Analyzing the social influence topography over this period reveals an abrupt phase transition from low to high levels of social contagion, and robust differences among regions. These results suggest that social contagion effects are becoming more instrumental in shaping large-scale collective political behavior, with implications for democratic electoral processes and policies.
Year: 2017 PMID: 28542409 PMCID: PMC5436881 DOI: 10.1371/journal.pone.0177970
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Observed county vote-share distributions, 1920–2012.
The observed distributions of Democratic vote-share per county are presented for states with the greatest number of counties, for each of the nine census divisions established by the U.S. Census Bureau. Plots are shown for various presidential elections from 1920 to 2012. The vote-share per county is measured as the percentage of the vote in the county received by the Democratic candidate. The main panels show the cumulative distributions. The insets show the corresponding probability density functions (for clarity, the x-scale has been shifted to the right and contracted). Curves are based on kernel estimation with Gaussian kernels.
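The kernel estimation step mentioned in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: the county vote shares are synthetic, and the bandwidth is SciPy's default (Scott's rule), which the caption does not specify.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical data: Democratic vote share (%) in 120 counties of one state.
rng = np.random.default_rng(0)
vote_share = np.clip(rng.normal(45.0, 12.0, size=120), 0.0, 100.0)

# Gaussian-kernel density estimate; the bandwidth choice is an assumption.
kde = gaussian_kde(vote_share)

grid = np.linspace(0.0, 100.0, 201)
pdf = kde(grid)  # smoothed probability density (as in the insets)

# Smoothed cumulative distribution (as in the main panels):
# integrate the estimated density from -infinity up to each grid point.
cdf = np.array([kde.integrate_box_1d(-np.inf, x) for x in grid])
```

Plotting `cdf` against `grid` gives a curve of the kind shown in the main panels, and `pdf` against `grid` one of the kind shown in the insets.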
Fig 2. Scaled vote-share distributions and predicted curves, 1920–2012.
Without loss of generality, curves are presented for states with the greatest number of counties in each of the four U.S. census regions (Northeast, Midwest, South, and West). The panels show the cumulative distributions. Observed values (dashed lines) are based on kernel estimation with Gaussian kernels. Solid lines are Gaussian distributions with mean 0 and variance 1. The scaled vote-shares are calculated from the estimated external influence parameters N0 and N1. The goodness of fit of the Gaussian to the empirically observed county vote-share distributions was assessed with a Kolmogorov-Smirnov test. The fit of the model is excellent: the test fails to reject the normality null hypothesis at the 5% significance level for 95% of all states in every election year, and at the 1% significance level for 98% of all states in every election year.
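The goodness-of-fit summary in this caption (the share of states for which normality is not rejected) can be illustrated with a short sketch. The scaling that maps raw vote-shares to scaled values depends on N0 and N1 and is not given in this caption, so the example below assumes the scaled values are already available; the synthetic samples and the helper name `fraction_not_rejected` are hypothetical.

```python
import numpy as np
from scipy.stats import kstest

def fraction_not_rejected(samples_by_state, alpha):
    """Share of states whose scaled vote-shares pass a KS test against N(0, 1)."""
    pvalues = np.array(
        [kstest(x, "norm").pvalue for x in samples_by_state.values()]
    )
    # The null hypothesis of normality is "not rejected" when p >= alpha.
    return float(np.mean(pvalues >= alpha))

# Hypothetical scaled vote-shares for 40 states, 80 counties each.
rng = np.random.default_rng(1)
samples = {f"state_{i}": rng.standard_normal(80) for i in range(40)}

share_5 = fraction_not_rejected(samples, 0.05)  # 5% significance level
share_1 = fraction_not_rejected(samples, 0.01)  # 1% significance level
```

By construction the share at the 1% level can never be smaller than the share at the 5% level, which matches the ordering of the 98% and 95% figures quoted above.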
Fig 3. Distributions of social influence and best-fit curves, 1920–2012.
The upper and middle panels of Fig 3 show the histogram and cumulative distribution of the Social Influence index (using Eq 7), aggregated over all states and election years. The lower panel shows the spread of the Social Influence index. The broad distributions are best fitted by the log-logistic distribution, often used for analyzing skewed data. The goodness of fit of the log-logistic distribution was determined by using a Kolmogorov-Smirnov test. The fit of the model is very good (p = 0.16).
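A log-logistic fit of the kind described here can be sketched with SciPy, where the log-logistic distribution is available as `scipy.stats.fisk`. The data below are synthetic stand-ins for the Social Influence index, and fixing the location parameter at zero is an assumption of this sketch rather than something the caption states.

```python
import numpy as np
from scipy.stats import fisk, kstest

# Hypothetical positive, right-skewed index values.
rng = np.random.default_rng(2)
index = fisk.rvs(c=2.0, scale=1.5, size=500, random_state=rng)

# Maximum-likelihood fit of the log-logistic shape and scale (location fixed at 0).
c_hat, loc_hat, scale_hat = fisk.fit(index, floc=0)

# Kolmogorov-Smirnov test of the data against the fitted distribution.
ks = kstest(index, "fisk", args=(c_hat, loc_hat, scale_hat))
print(f"shape={c_hat:.3f} scale={scale_hat:.3f} KS p-value={ks.pvalue:.3f}")
```

Note that a KS p-value computed against parameters estimated from the same data tends to be optimistic, so it is best read as a rough indication of fit.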
Fig 4. Evolution of social influence, 1920–2012.
Time series of social influence averaged over states, and their normalized versions (see text), are shown for each of the nine U.S. census divisions. a-b) Division 1, New England: Maine, Massachusetts, New Hampshire and Vermont. c-d) Division 2, Middle Atlantic (solid line): New Jersey, New York and Pennsylvania. Division 4, West North Central (dashed line): Iowa, Kansas, Minnesota, Missouri, Nebraska, North Dakota and South Dakota. e-f) Division 3, East North Central (solid line): Illinois, Indiana, Michigan, Ohio and Wisconsin. Division 5, South Atlantic (dashed line): Florida, Georgia, Maryland, North Carolina, South Carolina, Virginia and West Virginia. Division 6, East South Central (dotted line): Alabama, Kentucky, Mississippi and Tennessee. Division 7, West South Central (dash-dot line): Arkansas, Louisiana, Oklahoma and Texas. Division 8, Mountain (gray solid line): Arizona, Colorado, Idaho, Montana, Nevada, New Mexico, Utah and Wyoming. Division 9, Pacific (gray dashed line): Alaska, California, Oregon and Washington. The nine U.S. census divisions are clustered according to the correlation between their respective normalized social influence profiles. The corresponding average pairwise correlations are 0.824 and 0.909 for the clusters in Fig 4c-d and Fig 4e-f, respectively. Panel 4g shows the time series of normalized social influence, averaged over all U.S. states. The Z-scores are mapped to colors from white (z = −1.1, below the mean) to black (z = 2.9, above the mean). A clear pattern of high social influence (positive or near-zero Z-scores) follows a period (1920–1980) of low social influence (negative Z-scores). The 1984 break date, which separates low from high levels of social contagion, is identified by the Mann-Whitney U-test, applied to different potential breaks within the range 1920–2012 (see S1 Fig).
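The break-date search described at the end of this caption can be sketched as a scan over candidate split points, applying the Mann-Whitney U-test to the values before and after each one. The series below is synthetic, the helper `scan_break` is hypothetical, and choosing the split with the smallest p-value is an assumption of this sketch; the caption does not say how the break is selected among candidates.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def scan_break(years, values, min_size=3):
    """Return (break_year, p_value) for the split with the smallest p-value.

    break_year is the first year of the later segment.
    """
    best = None
    for i in range(min_size, len(years) - min_size + 1):
        before, after = values[:i], values[i:]
        p = mannwhitneyu(before, after, alternative="two-sided").pvalue
        if best is None or p < best[1]:
            best = (int(years[i]), float(p))
    return best

# Hypothetical normalized social influence for the 24 elections 1920-2012:
# low before 1984 and high from 1984 onward, plus small noise.
years = np.arange(1920, 2013, 4)
rng = np.random.default_rng(3)
values = np.where(years < 1984, -1.0, 1.0) + 0.1 * rng.standard_normal(len(years))

break_year, p_value = scan_break(years, values)
print(break_year, p_value)
```

The `min_size` guard keeps both segments large enough for the test to be meaningful; its value here is arbitrary.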
Fig 5. Spatial variation of social influence and its change over time, 1920–2012.
a) Hot spot analysis of social influence for sample maps of U.S. presidential elections (see S2–S25 Figs and S1 Movie for a complete analysis). The colored areas reflect the significance (p-value) of local concentration of social influence for each state. The p-values for each state are derived from a random permutation test of local clustering using the Getis-Ord local statistic. This analysis was performed with a contiguity spatial weight matrix that indicates whether or not two states share a boundary. The variable of interest is the social influence index calculated using Eq 7 in the main text. Low p-values (p ≤ 0.1) indicate statistically significant high levels of social influence at a state and its surrounding neighbors (hot spots). High p-values (p ≥ 0.9) indicate statistically significant low levels of social influence at a state and its surrounding neighbors (cold spots). b) Heatmap of the correlation coefficients between pairs of election years characterized by their state-level social influence profiles.
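The permutation test in panel a can be illustrated with a small NumPy sketch. It assumes a Gi*-style local statistic that includes the state itself (suggested by "a state and its surrounding neighbors", but not confirmed by the caption) and a conditional permutation scheme that holds the focal state fixed while shuffling the others. The toy adjacency matrix, the data, and the helper name `local_g_pvalues` are all hypothetical.

```python
import numpy as np

def local_g_pvalues(x, W, n_perm=999, seed=0):
    """One-sided pseudo p-values for a Gi*-style local sum.

    x: values per state; W: binary contiguity matrix with zero diagonal.
    Small p suggests a hot spot, large p a cold spot.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    pvals = np.empty(n)
    for i in range(n):
        others = np.delete(x, i)
        w_others = np.delete(W[i], i)
        # Local sum over the state and its neighbors. The Getis-Ord
        # normalization is constant under permutation, so it is omitted.
        observed = x[i] + w_others @ others
        count = 0
        for _ in range(n_perm):
            permuted = rng.permutation(others)
            count += (x[i] + w_others @ permuted) >= observed
        pvals[i] = (count + 1) / (n_perm + 1)
    return pvals

# Toy example: 8 "states" on a line, each sharing a border with its neighbors.
n = 8
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
x = np.array([9.0, 10.0, 8.0, 1.0, 2.0, 1.0, 2.0, 1.0])

pvals = local_g_pvalues(x, W)
print(np.round(pvals, 3))
```

In this toy layout the high values cluster at one end of the line, so the states there receive small p-values while states in the low-valued stretch receive large ones.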
Fig 6. Stationary distributions for different values of N0 and N1.
Probability distributions of finding the network with k nodes in state 1 at equilibrium for different values of N0 and N1. The number of variable nodes is N = 500.
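Curves of this kind can be reproduced numerically under one explicit assumption: that the stationary distribution of k has a beta-binomial form with shape parameters N1 and N0. This caption does not state the formula, so the sketch below should be read as an illustration of how such curves change with N0 and N1, not as the paper's expression. The helper name `stationary_pmf` is hypothetical.

```python
import numpy as np
from scipy.special import betaln, gammaln

def stationary_pmf(N, N0, N1):
    """Beta-binomial pmf over k = 0..N (ASSUMED form, see lead-in)."""
    k = np.arange(N + 1)
    log_binom = gammaln(N + 1) - gammaln(k + 1) - gammaln(N - k + 1)
    log_pmf = log_binom + betaln(k + N1, N - k + N0) - betaln(N1, N0)
    return np.exp(log_pmf)

N = 500  # number of variable nodes, as in the caption
flat = stationary_pmf(N, 1.0, 1.0)      # N0 = N1 = 1: uniform over k
peaked = stationary_pmf(N, 5.0, 5.0)    # larger N0, N1: mass near k = N/2
u_shaped = stationary_pmf(N, 0.5, 0.5)  # smaller N0, N1: mass near k = 0 and k = N
```

Under this assumed form, small N0 and N1 push the probability mass toward the extremes, while large values concentrate it around the middle.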