Voting contagion: Modeling and analysis of a century of U.S. presidential elections.

Dan Braha^1,2, Marcus A M de Aguiar^1,3.

Abstract

Social influence plays an important role in human behavior and decisions. Sources of influence can be divided into external ones, which are independent of social context, and those originating from peers, such as family and friends. An important question is how to disentangle social contagion by peers from external influences. While a variety of experimental and observational studies have provided insight into this problem, identifying the extent of contagion from large-scale observational data with an unknown network structure remains largely unexplored. By bridging the gap between the large-scale complex systems perspective of collective human dynamics and the detailed approach of the social sciences, we present a parsimonious model of social influence and apply it to a central topic in political science: elections and voting behavior. We provide an analytical expression for the county vote-share distribution, which is in excellent agreement with almost a century of observed U.S. presidential election data. Analyzing the social influence topography over this period reveals an abrupt phase transition from low to high levels of social contagion, and robust differences among regions. These results suggest that social contagion effects are becoming more instrumental in shaping large-scale collective political behavior, with implications for democratic electoral processes and policies.

Year: 2017 PMID: 28542409 PMCID: PMC5436881 DOI: 10.1371/journal.pone.0177970

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

Introduction

The understanding of collective human dynamics in theoretical and real-life social systems gained increasing attention in recent decades [1-7]. At the core of these efforts are models that incorporate a collection of interconnected individuals that change their behavior based on micro-level processes of social influence exerted by their neighbors, but also based on individuals’ personal influences independent of social context. The macro-level characteristics of the system emerge as a product of the collective dynamics of these personal influences and micro-level social influence processes. The question of how to separate and measure the effect of social influence is therefore a major challenge for understanding collective human behavior. Although a variety of experimental [8-12] and observational [13-16] studies attempted to address this challenge, identifying the extent of social influence based on large-scale, macro-level observational data in the presence of unknown network structure remains largely unexplored. To close this gap, we present a simple and universal method for measuring social influence, taking the voter model of statistical physics as our basic dynamical system [2–4, 17–19]. We apply our model to understanding the collective dynamics of voting in US presidential elections—a topic at the core of collective political behavior. The study of electoral behavior has attracted considerable attention by political scientists. Most studies of voting behavior in the United States and other democracies view vote choices as the result of several interrelated attitudinal and social factors [20]. Attitudinal factors that reflect short‐term fluctuations in partisan division of the vote include evaluations of the candidates' personal qualities and government performance, and orientations toward issues of public policy. 
Long-term factors, which persist beyond a particular election, include partisan loyalties [21-22], ideological orientations [23], and social characteristics such as race, religion, social class, and region [20]. Recent studies have also elucidated the role of social networks in spreading voting behavior [24]. Voters embedded in social networks of friends, family members, neighbors, and co-workers [12] influence each other in terms of voter turnout [12, 25–27] or support particular candidates [24, 28]. Social networks enable bounded rational voters to limit the cost of searching for political information [23] by relying on readily available information of their peers. These peer groups can also include “opinion leaders” who can considerably influence the behavior of voters in their network of contacts by being perceived as trustworthy and highly informed on political issues [21]. The opinion leaders (also known as “zealots,” “inflexible,” “stubborn” or “frozen” voters in the sociophysics literature, see [18–19, 29–38]) are individuals who hardly change their political preferences, and influence the voting behavior of uncommitted individuals. Opinion leaders often interpret media messages and pass them on to "opinion followers" [21]. Other sources of political information that were shown to influence citizen attitudes and voting behavior are the mass media [39-43] and a variety of organized efforts at political persuasion such as campaign persuasion [44]. Thus the picture that emerges from the modern history of social science academic voting research suggests that voters are embedded in interpersonal social networks that can increase the likelihood of voting contagion and behavior change via social imitation; but are also exposed to what we might call “external influences”—social forces, which are often consistently skewed in favor of one candidate over another [24], that affect voters. 
As mentioned above, these external influences include various individual prejudiced attitudes and orientations, party identification, individual’s upbringing, religion, ideology, campaign persuasion, and exposure to the mass media, such as television and newspapers. Since in this paper our focus is on understanding the dynamics of flexible voters who are free to change their voting behavior, without loss of generality, one can consider exposure to opinion leaders (including peers, journalists, or politicians), to be an external influence, despite sometimes being a peer influence effect. The reasoning for this is that opinion leaders are ideologically inflexible and unwavering candidate supporters, and thus convey a consistent partisan bias in favor of one candidate over another [24]. Collectively, the voter’s electoral decision can be explained in terms of peer effects (via social imitation) and partisan biases conveyed by competing external influences. A pertinent question here is how to disentangle the effect of social contagion from that of exposure to external influences. This identification problem goes beyond voting. People hold opinions on a multitude of topics that inform alternative courses of action, from crime participation [45] and smoking [46] to riots and protests [47] and financial markets [31]. These opinions can be either the result of individual considerations or, when confronted with information that is difficult to acquire or process, influenced by the views of others through social interactions. In this paper we describe a general methodology for detecting behavioral contagion from large-scale observational data. We extend the basic voter model [2–4, 17–19] by taking into account the dynamic response of social networks to external influences. Our model focuses on two characterizations of voting behavior. 
The first is that of most studies of voting behavior, which consider vote choices to be driven, as outlined above, by various individual’s biases and other external pressures. The second—from complex systems science and recent observational and face-to-face studies—is that of internal self-reinforcing dynamics where voters’ opinions are changed under the influence of their peers. Incorporating both, we construct a universal representation of the largest scale system behavior when there is both external and interpersonal influence. The extended voter model is able to reproduce remarkably well statistical features and patterns of almost a century of county-level U.S. presidential election data. More importantly, our model presents a general framework for detecting social contagion from large-scale election return data, and can be applied more generally to many different systems that involve social contagion. Here, electoral votes cast for U.S. presidential candidates at the county level are analyzed, covering the period of 1920 through 2012. Counties are grouped by state, and the corresponding distributions of the fraction of votes (vote-share) in a county for the Democratic candidate in an election year are studied. Fig 1 shows the county vote-share distributions for various states and election years. The data indicates that there is a wide variation in the characteristics of voting behavior with no apparent pattern of voting dynamics in time or geographical space. Here we show that much of this observed variability of county vote-shares may be best explained by fluctuating peer influences across time and space. 
Although the study of collective voting behavior has recently been the focus of discussion in the context of identifying and modelling universal patterns of observed voting behavior [48-59], the mechanisms leading to such diverse spatiotemporal variation in voting patterns as shown in Fig 1, as well as other spatiotemporal patterns discussed in our paper (see section Empirical Results of US Presidential Elections: 1920 to 2012), are poorly understood. The model presented below provides a parsimonious quantitative framework that is capable of explaining and reproducing, among other things, the full range of empirical county vote-share distributions for all states and election years. Using the model, we develop an index of social influence that enables us to examine and reveal remarkably robust patterns of spatial and temporal variation in social influence over 92 years of US presidential elections. The statistical physics model presented in this paper is admittedly limited in that it ignores many psychological and social factors influencing the decisions of individual voters. However, this limitation should not be perceived as an oversimplification that overlooks human complexity. Indeed, as demonstrated by other models of social complexity [1-7], at times the details of a complex system do not matter if one wants to understand the large-scale behavior of the system. In this case, only the broad-brush features of the system are necessary to understand the complexity of human choices; in our case, the relative strength of external to peer influences is shown to be a plausible explanation for observed voting behavior.

Fig 1

Observed county vote-share distributions, 1920–2012.

The observed distributions of Democratic vote-share per county are presented for states with the greatest number of counties, for each of the nine census divisions established by the U.S. Census Bureau. Here plots are shown for various presidential elections from 1920 to 2012. The vote-share per county is measured as the percentage of the vote in the county received by the Democratic candidate. The figure shows the plot of the cumulative distributions. The insets show the corresponding probability density functions (for clarity the x-scale has been shifted to right and contracted). Curves are based on kernel estimation with Gaussian kernels.

Models of opinion dynamics

Models of opinion formation, which explore the dynamics of competing opinions taking into account the interactions among agents, have been extensively studied [1–7, 60–61]. In their most basic form, these models consist of voters, represented by nodes on a social network, having only two possible opinions, 0 or 1. Each voter may change her mind by using various interaction mechanisms, for example, randomly adopting the opinion of a connected neighbor (essentially a noisy majority-vote rule, see [3, 17, 58]), or by applying local majority rules [1, 3, 5]. The stochastic dynamics of these simple interaction models ultimately leads to a uniform state, either all-nodes-0 or all-nodes-1, in which all voters share the same political choice. Obviously, consensus states are not observed in real-world political elections, and thus the basic models cannot plausibly be considered realistic descriptions of empirical voting data. Accordingly, more realistic models of opinion dynamics have been proposed that incorporate, among other features, social impact theory [60-62], opinion leaders and zealots [18–19, 29–38, 62–63], external influences and fields [2, 18–19, 64–70], individual biases [71-72], contrarians [73], the individual’s own current opinion [74-75], word-of-mouth spreading [52], non-overlapping cliques [59], or noisy diffusive processes [58]. Below we further elaborate on the themes of opinion leaders and zealots, external influences, and individual biases, themes that play an important role in our model and that have been observed empirically in studies of electoral behavior (see Introduction). Opinion leaders have often been modeled by considering the presence of biased voters who favor one opinion over the other and will not change their opinion (also known as zealots [29], inflexible [30], frozen [18–19, 31], stubborn [32-35], or committed voters [36-38]). 
The problem has been originally introduced and studied with a single zealot in regular lattices [29], and has been subsequently incorporated in models that use repeated local updates of random grouping of agents in the limit where the number of voters and zealots goes to infinity [30], and with arbitrary numbers of voters and zealots in fully connected networks where complete and exact results of the stochastic dynamics have been obtained [18-19]. Further studies that explore the use of zealots in the context of the voter model include [31–34, 36–38, 76]. The role of external influences (distinct from social imitation) in opinion formation has often been modeled as an external perturbation or modulation acting on all agents in the system, or by some external field or global coupling [3]. These perturbations could account for the effects of propaganda [65], fashion waves [66-67], or the mass media [68-70]; but are also driven by individual biases and prejudices [71-72], or level of political awareness [64]. More generally, these perturbations represent the dynamic response of a complex system to an external environment [18–19, 63]. As mentioned in the Introduction, this view is deeply aligned with empirically and theoretically grounded research by political scientists, which have uncovered external forces in the form of prejudiced attitudes and orientations of individuals, party identification, individual’s upbringing, religion, ideology, exposure to campaign persuasion and the mass media (such as television and newspapers), or partisan bias conveyed by opinion leaders. While the above models show the important role of contagion spreading via social interactions in collective opinion dynamics, any further progress in understanding real-world voting phenomena needs to supplement and contrast these theoretical efforts with large sets of empirical voting data. 
Some effort has been made in this regard [52, 56, 58], particularly with the aim of explaining and reproducing the distribution of votes in bipartisan and proportional elections. Our paper is a contribution in this direction. Below we present an exactly solvable model of stochastic voter dynamics that yields, among other results, the stationary vote-share fluctuations across counties, which are in excellent agreement with almost a century of observed U.S. presidential election data at the state level, for every election year. The model is further validated by reproducing empirical temporal and spatial election patterns identified by social science research on voting.

The model

To model the dynamics of elections, we take the prototypical voter model [2–3, 17–19] as our basic dynamical system, and modify it to more closely reflect features of real-world political elections (see Introduction). We consider a network with N “free” nodes representing uncommitted voters [18–19, 29–38], and links between pairs of free nodes representing peer influences. We add to this network of uncommitted voters “fixed” nodes representing unwavering candidate supporters and opinion leaders that influence all other uncommitted voters, but are not themselves influenced by other nodes [18–19, 29–38]. The assumption that there exists a directed link from each uncommitted voter to each fixed node is motivated by the empirical fact that uncommitted voters are consistently exposed to partisan bias in favor of one of the candidates over another [20–22, 24, 39–44], conveyed by opinion leaders and other external sources (see detailed discussion below). The number of fixed nodes that are biased in favor of the first candidate (named ‘0’) is N0 and the number biased in favor of the second candidate (named ‘1’) is N1. Thus, we consider a network with N + N0 + N1 nodes. Each node has an internal state which can take only the values 0 and 1, representing whether the voter chooses the first or second candidate; or, for fixed nodes, whether the node is biased in favor of the first or second candidate. We assume a mean-field interaction model in which each uncommitted voter is equally likely to interact with another uncommitted voter [3, 17, 58, 77]. Accordingly, individuals update their contacts in a fully mixed fashion within the population, which implies a homogeneous random network for the uncommitted voters’ social ties. 
We assume that the N free voters change their internal state following the noisy majority-vote model [3, 17, 58]: At each time step a random free voter is selected and its state is updated with probability 1 − p by copying the state of one of its connected neighbors, chosen at random from all nodes; and with probability p the state remains the same. The N0 and N1 voters that are biased towards the first and second candidate, respectively, remain fixed in state 0 and state 1, respectively. Our model’s assumptions and the noisy majority-vote update rule that we use [3, 17, 58] share important features with other variants of the majority rule principle. For example, the elegant majority rule proposed in [77]—see also the excellent review in [3]—assumes that all agents in the population can communicate with each other; forming, at each iteration, a random group of agents who take the majority opinion inside the group [3]. In this model, therefore, multiple individuals’ vote choices are updated simultaneously at each time step, at variance with our noisy majority-vote update rule where a single individual’s vote choice is updated at each time step [3]. The presence of inflexible agents with opposing views in this model [30, 35] leads to a solution, in the mean field limit [3, 77], that eventually settles into a fixed value of vote-share for one candidate, depending on the initial conditions. Our generalized voter model, on the other hand, does not necessarily settle into a fixed value. Instead, our main result shows that despite fluctuations of the voting dynamics, voter choices converge in distribution. Moreover, the long-run stationary distribution of vote-shares does not depend on the initial vote choices of uncommitted voters. This result, first reported in [18] within a fully solvable model, accounts for both the finite number of voters in a population and the numerous sources that convey consistent partisan biases to uncommitted voters. 
These properties and the foregoing model assumptions essentially create the kind of characteristic vote-share fluctuations across counties recently observed in the sociophysics literature (e.g., [52, 56, 58, 78]) for various countries, and thus support the plausibility of the model and its capacity to describe real-world voting phenomena. Of course, successful matching to a variety of spatiotemporal real-world election data is the ultimate test of any theory. The parameters N0 and N1 of the fixed voters can be interpreted according to two viewpoints. We emphasize that both viewpoints are valid and useful: (1) Zealots and opinion leaders: As originally stated above, fixed voters can be viewed as unwavering candidate supporters and opinion leaders (peers, journalists, or politicians) that influence uncommitted voters, but are not themselves influenced by their neighbors' vote choices; (2) External factors: Alternatively, following our assumption that each uncommitted voter is equally likely to interact with any of the fixed voters, the parameters N0 and N1 give the “effective strength” of the consistent partisan bias conveyed by the fixed voters in favor of one of the candidates (with effective strength N0) over another (with effective strength N1). As stressed in the Introduction, the consistent partisan biases of opinion leaders are merely one instance of a broad class of consistent external factors that influence the choices of uncommitted voters. These external factors include exposure to television, newspapers, or campaign persuasion. Recognizing that no voter is a “blank slate,” these external factors also include any prejudiced beliefs, party identification, upbringing, religion, or political ideology of uncommitted voters [20-22]. 
Mathematically, this broad interpretation is achieved (see Materials and methods) by analytically extending the parameters N0 and N1 to non-integer values, which makes it possible to model an arbitrary strength of these external influences in favor of one of the candidates over another. According to this viewpoint, copying the state of a connected voter represents mutual influence among friends, neighbors, and family members via social imitation or via a consistent partisan bias acting on uncommitted voters (by opinion leaders or other external sources). External influences of opposite partisan biases do not cancel; instead, larger N0 and N1 reflect an increasing probability that consistent partisan biases determine the choices of uncommitted voters, independent of the voting choices of other uncommitted voters. Here we assume that there are many external sources of competing political information, and that over the election period in question the sources are persistent in their proportion of partisan biases regarding the two major-party candidates, though they vary in the way they influence individual voters’ choices. Election years that are consistently biased towards the first (second) party’s candidate would be represented by N0 greater (smaller) than N1.
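
The update rule described in this section can be sketched as a short simulation. This is a minimal illustration under stated assumptions, not the authors' code: it assumes a fully connected network, integer numbers of fixed nodes, and that a selected free voter copies a node drawn uniformly from all other nodes (free and fixed). The function name `simulate_vote_share` and the parameters `steps`, `burn_in`, and `seed` are hypothetical.

```python
import random


def simulate_vote_share(n_free, n0, n1, p=0.0, steps=200_000, burn_in=20_000, seed=0):
    """Simulate vote-shares of the voter model with fixed nodes (fully connected case).

    n_free: number of free (uncommitted) voters, N in the text.
    n0, n1: numbers of fixed nodes in state 0 and state 1 (integers in this sketch).
    p:      probability that the selected voter keeps its current state.
    Returns the vote-share k / n_free recorded after each post-burn-in step.
    """
    rng = random.Random(seed)
    k = n_free // 2                  # free voters currently in state 1
    others = n_free - 1 + n0 + n1    # nodes that a selected voter can copy
    samples = []
    for step in range(steps):
        # On a fully connected network only k matters: the selected free
        # voter is in state 1 with probability k / n_free.
        own = 1 if rng.random() < k / n_free else 0
        if rng.random() >= p:        # copy a random other node with probability 1 - p
            ones_among_others = k - own + n1
            new = 1 if rng.random() < ones_among_others / others else 0
            k += new - own
        if step >= burn_in:
            samples.append(k / n_free)
    return samples


if __name__ == "__main__":
    shares = simulate_vote_share(n_free=20, n0=2, n1=6)
    print("time-averaged vote-share:", sum(shares) / len(shares))
    print("N1 / (N0 + N1):          ", 6 / (2 + 6))
```

With these assumptions the time-averaged vote-share should settle near N1/(N0 + N1) regardless of the starting value of k, while individual samples keep fluctuating, which is the behavior the text describes as convergence in distribution.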

The limiting stationary distribution of votes

We have previously proposed the above model as a widely applicable theory of collective behavior of complex systems [18–19, 31, 79–80], where the generalized voter model was solved exactly for a fully connected network. The fully connected network case was also shown to be equivalent (up to simple scaling) to a homogeneous random network (see Materials and methods). More specifically, at equilibrium, the probability of finding the network in the global state of k free voters in state 1 (i.e., voting for candidate 1) is given, independently of the initial state, as follows (see derivation in Materials and methods):

$$\rho(k) = \frac{\binom{N_1 + k - 1}{k}\,\binom{N + N_0 - k - 1}{N - k}}{\binom{N + N_0 + N_1 - 1}{N}}, \qquad k = 0, 1, \ldots, N, \tag{1}$$

where N is the number of free voters, k is the number of free voters in state 1, and $\binom{a}{b}$ denotes a binomial coefficient. As mentioned above, analytically extending the parameters N0 and N1 to non-integer values makes it possible to capture not only the case of zealots and opinion leaders but also the generalized effects of external factors (see Materials and methods). In this case, the solution in Eq 1 remains the same, with the difference that factorials must be replaced by gamma functions. Indeed, as we move around in the (N0, N1)-parameter space, the stationary distribution in Eq 1 exhibits strikingly different shapes. The different shapes of the stationary distributions depend on the magnitude of the external parameters, N0 and N1, compared to the extent of social imitation within the network of uncommitted voters, and on the relative partisan bias of opinion leaders or other external influences (e.g., television and newspapers) toward the first or second candidate (i.e., N0 > N1 or vice versa). As shown in Materials and methods, these distributions vary from skewed unimodal distributions with intermediate peaks or peaks at all nodes 1 or all nodes 0, to bimodal and uniform distributions. 
Interestingly, Eq 1 remains valid for other network topologies (including random, regular lattice, scale-free and small world networks) if N0 and N1 are re-scaled according to the degree distribution (see Materials and methods). In this paper we are mostly interested in the fraction of voters (vote-share) that voted for a candidate rather than the actual number of voters. Thus, we define the vote-share for candidate 1 as the scaled variable v = k/N. The mean and variance of v can be computed from Eq 1 as follows:

$$\mu = \frac{N_1}{N_0 + N_1} \tag{2}$$

$$\sigma^2 = \frac{\mu(1-\mu)}{N} + \mu(1-\mu)\,\frac{1 - 1/N}{N_0 + N_1 + 1} \tag{3}$$

The variance of vote-shares in Eq 3 has an appealing interpretation. When peer influences (via social imitation) are very weak compared to external forces (N0, N1 → ∞), the variance of vote-shares becomes μ(1 − μ)/N. This is the variance of vote-shares that one would expect if all uncommitted voters were solely influenced, each with probability μ, by the consistent partisan biases exerted by either opinion leaders with opposing views or other external forces (e.g., mass media), independent of the voting choices of other uncommitted voters. The second term on the right side of Eq 3, which is a decreasing nonlinear function of the external influence parameters, represents the effect of social imitation and peer influence within the network of uncommitted voters. This second term, which we call the “social influence index,” provides us with a method of detecting and isolating the effect of social imitation and social contagion. We use this index extensively in this paper to explore and understand how social influence changes across states and over almost a century of county-level U.S. presidential election years.
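
The stationary distribution and its moments can be checked numerically. The sketch below assumes that Eq 1 is the product of two binomial coefficients normalized by a third (a beta-binomial law with parameters N1 and N0), which is consistent with the mean N1/(N0 + N1) quoted in the text, and that the variance splits into a binomial term plus a social-imitation term. Binomial coefficients are evaluated through log-gamma functions so that non-integer N0 and N1 are allowed. All function names are hypothetical.

```python
from math import exp, lgamma


def log_binom(a, b):
    """Log of the generalized binomial coefficient C(a, b) via gamma functions."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)


def stationary_pmf(n_free, n0, n1):
    """Probability of k free voters in state 1, k = 0..n_free (assumed form of Eq 1)."""
    log_norm = log_binom(n_free + n0 + n1 - 1, n_free)
    return [
        exp(log_binom(n1 + k - 1, k)
            + log_binom(n_free + n0 - k - 1, n_free - k)
            - log_norm)
        for k in range(n_free + 1)
    ]


def vote_share_moments(pmf):
    """Mean and variance of v = k / N computed directly from the pmf."""
    n_free = len(pmf) - 1
    mean = sum(k / n_free * q for k, q in enumerate(pmf))
    second = sum((k / n_free) ** 2 * q for k, q in enumerate(pmf))
    return mean, second - mean ** 2


def closed_form_moments(n_free, n0, n1):
    """Mean (Eq 2) and variance (assumed form of Eq 3) of the vote-share."""
    mu = n1 / (n0 + n1)
    binomial_part = mu * (1 - mu) / n_free            # external influence only
    social_part = mu * (1 - mu) * (1 - 1 / n_free) / (n0 + n1 + 1)
    return mu, binomial_part + social_part


if __name__ == "__main__":
    pmf = stationary_pmf(100, 3.5, 1.5)
    print("total probability:", sum(pmf))
    print("moments from pmf: ", vote_share_moments(pmf))
    print("closed form:      ", closed_form_moments(100, 3.5, 1.5))
```

Under this form, N0 = N1 = 1 gives a uniform distribution over k, and small values of N0 and N1 push the probability mass toward k = 0 or k = N, in line with the range of shapes described in the text.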

Estimation of external influence from large scale voting data

The U.S. presidential election data are often collected at the level of counties. These data provide, among other things, information on the vote-share in each county i (a single realization from an unobserved stationary vote-share distribution). Thus, in order to uncover the phenomenology of voting contagion in electoral voting behavior, we need to show how to estimate the external parameters of the generalized voter model from real data. The unknown external influence parameters N0 and N1 for any state in any election year can be estimated from a sample of observed vote-shares across counties as follows. Suppose a particular state has n counties, and let $v_i$ be the fraction of voters in the ith county that voted for candidate 1, and $N_i$ be the total number of votes cast for all candidates in that county. We assume that all counties of a state are influenced by the same external parameters N0 and N1. Accordingly, the voting dynamics in the ith county is governed by the generalized voter model, which applies to a subnetwork of $N_i$ free nodes and N0 and N1 fixed nodes (note that each county has a different number of free nodes). We assume that the vote-share distribution (Eq 1) in each county is in equilibrium, and that the corresponding mean $\mu_i$ and variance $\sigma_i^2$ are given by Eqs 2 and 3 with $N = N_i$. Using Eq 2, the expected value of the vote-share in county i does not depend on i, and is equal to $\mu_i = \mu$ = N1/(N0 + N1). We thus estimate μ by simply taking the sample average of vote-shares across all n counties, $\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} v_i$. For the variance of the vote-share in county i, a crude estimate based on the single observed vote-share data point $v_i$ is provided by $\hat{\sigma}_i^2 = (v_i - \hat{\mu})^2$. 
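Obviously, this estimate is imperfect, and we define the residual between the variance $\sigma_i^2$ and its estimate $\hat{\sigma}_i^2$ as

$$\varepsilon_i = \hat{\sigma}_i^2 - \sigma_i^2. \tag{4}$$

Using Eqs 3 and 4, we define a system of nonlinear estimation equations (one equation for each county) that relate $\hat{\sigma}_i^2$, the estimate of $\sigma_i^2$, to the external parameters N0 and N1:

$$\hat{\sigma}_i^2 = \frac{\mu(1-\mu)}{N_i} + \mu(1-\mu)\,\frac{1 - 1/N_i}{N_0 + N_1 + 1} + \varepsilon_i, \qquad i = 1, \ldots, n. \tag{5}$$

The estimation procedure first estimates μ on the right-hand side of Eq 5 by $\hat{\mu}$, and then selects the sum of parameters N0 + N1 that minimizes the sum of squared errors $\sum_i \varepsilon_i^2$ in Eq 5. Because Eq 5 is linear in 1/(N0 + N1 + 1), the least squares estimate is given by

$$\frac{1}{\hat{N}_0 + \hat{N}_1 + 1} = \frac{\sum_{i=1}^{n} (1 - 1/N_i)\left[\hat{\sigma}_i^2 - \hat{\mu}(1-\hat{\mu})/N_i\right]}{\hat{\mu}(1-\hat{\mu})\sum_{i=1}^{n} (1 - 1/N_i)^2}. \tag{6}$$

Eq 6 and the condition $\hat{\mu} = \hat{N}_1/(\hat{N}_0 + \hat{N}_1)$ fully determine the estimated external parameters. We can then use the estimate in Eq 6 to obtain the “social influence index” of the state as defined in Eq 3:

$$\mathrm{SI} = \hat{\mu}(1-\hat{\mu})\,\frac{1 - 1/\bar{N}}{\hat{N}_0 + \hat{N}_1 + 1}, \tag{7}$$

where $\bar{N}$ is the average number of voters per county. Eq 7 forms the basis for the statistical analysis of social imitation for all states across U.S. presidential elections (see section Empirical Results of US Presidential Elections: 1920 to 2012).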
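
The estimation steps in this section can be sketched as follows. This is an illustration, not the authors' implementation. It assumes that the county variance has the form μ(1 − μ)/N_i + μ(1 − μ)(1 − 1/N_i)/(N0 + N1 + 1), that the crude variance estimate for county i is the squared deviation of its vote-share from the sample mean, and that the least squares fit is carried out in the variable x = 1/(N0 + N1 + 1), in which the model is linear. The function names and the error handling are hypothetical.

```python
def estimate_external_parameters(vote_shares, votes_cast):
    """Estimate (N0, N1, mu) for one state and election year.

    vote_shares: observed vote-share v_i of candidate 1 in each county.
    votes_cast:  total votes N_i cast in each county.
    """
    n = len(vote_shares)
    mu_hat = sum(vote_shares) / n
    spread = mu_hat * (1 - mu_hat)
    if spread == 0:
        raise ValueError("vote-shares of exactly 0 or 1 carry no information")
    numerator = 0.0
    denominator = 0.0
    for v_i, n_i in zip(vote_shares, votes_cast):
        var_hat = (v_i - mu_hat) ** 2        # crude single-point variance estimate
        weight = 1 - 1 / n_i
        numerator += weight * (var_hat - spread / n_i)
        denominator += spread * weight ** 2
    x_hat = numerator / denominator          # least squares estimate of 1 / (N0 + N1 + 1)
    if not 0 < x_hat < 1:
        raise ValueError("data are not compatible with a finite positive N0 + N1")
    total = 1 / x_hat - 1                    # estimate of N0 + N1
    return (1 - mu_hat) * total, mu_hat * total, mu_hat


def social_influence_index(n0, n1, mean_votes_per_county):
    """Second term of the assumed variance formula, evaluated at the mean county size."""
    mu = n1 / (n0 + n1)
    return mu * (1 - mu) * (1 - 1 / mean_votes_per_county) / (n0 + n1 + 1)


if __name__ == "__main__":
    shares = [0.4, 0.6] * 5                  # toy data: ten counties
    votes = [100] * 10
    n0, n1, mu = estimate_external_parameters(shares, votes)
    print("N0, N1, mu:", n0, n1, mu)
    print("social influence index:", social_influence_index(n0, n1, sum(votes) / len(votes)))
```

Fitting in x rather than in N0 + N1 directly gives a closed-form solution and the same minimizer, since the two are related by a one-to-one change of variable.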

Derivation of the stationary vote-share distribution at the county level

For the U.S. presidential elections from 1920 to 2012, we empirically find that the external parameters satisfy N0, N1 ≫ 1 for all states and across election years. Moreover, we notice that the total number of voters $N_i$, in any county i for any given election year, is large. Thus, the voting dynamics in any county applies to a network of voters with a very large number of free and fixed nodes. Driven by these facts, we find that in the limit $N_i \to \infty$ the stationary distribution in Eq 1, which characterizes the long-run distribution of votes in the ith county, is approximately a Gaussian distribution (see Materials and methods). More specifically, the asymptotic vote-share distribution in county i is given by a Gaussian with mean $\mu_i = \mu$ = N1⁄(N0+N1) and variance $\sigma_i^2$ given by Eq 3 with $N = N_i$. We stress that this predicted Gaussian vote-share distribution (and its characteristic mean and variance) at the county level is not assumed from the outset but turns out to be the consequence of basic principles of voting behavior and the generalized voter model. We next derive the stationary vote-share fluctuations across counties.

Derivation of the stationary vote-share distribution across counties

While the stationary vote-share distribution at the county level is not observed (but predicted to be Gaussian), the availability of large sets of empirical voting data enables us to obtain, for each state in every election year, the probability distribution of observed vote-shares across all n counties in the state (see Fig 1). As mentioned above, this candidate vote-share distribution has been the focus of recent attention. A plausible model for the stationary vote-share distribution across counties is to describe it as a Gaussian scale mixture [81] with n different components (representing the n counties in the state), each distributed as a normal distribution with the same mean μ and different variances $\sigma_i^2$, as specified above. Let v denote the random variable corresponding to this Gaussian mixture (this is called the “vote-share per county” in Fig 1). This Gaussian mixture is a unimodal distribution with mode at μ, skewness value β1 = 0, kurtosis value

$$\beta_2 = \frac{3n\sum_{i=1}^{n}\sigma_i^4}{\left(\sum_{i=1}^{n}\sigma_i^2\right)^2},$$

and variance

$$\sigma_v^2 = \frac{1}{n}\sum_{i=1}^{n}\sigma_i^2.$$

Using the Pearson system, the Gaussian scale mixture can be shown to be approximately a t-distribution [82]. More specifically, let $c_0 = 2\beta_2\sigma_v^2/(5\beta_2 - 9)$ and c2 = (β2 − 3)/(5β2 − 9) be the Pearson coefficients corresponding to the Gaussian mixture, and let $\alpha = \sqrt{(1 - c_2)/c_0}$ and m = (1 − c2)/c2. Then, the scaled and shifted random variable α(v − μ) is approximately distributed as a Student's t-distribution with m degrees of freedom [82]. Notice that the parameters μ, α, and m of this Student's t-distribution can be completely specified once the external parameters N0 and N1 are estimated (as was shown above), and the number of counties in the state n and the total number of votes $N_i$ in each county are given (these data are publicly available in many countries). The above key result implies that the scaled and shifted vote-shares across counties can be described by a Student's t-distribution with m degrees of freedom. Finally, we empirically find for our comprehensive U.S. 
presidential election data that the number of degrees of freedom m ≫ 100 for all states in every election year. In this case, the Student's t-distribution with m degrees of freedom approaches the normal distribution; and thus the distribution of the scaled and shifted vote-shares across counties is predicted to match nicely the standard normal distribution. We emphasize that this predicted Gaussian vote-share distribution across counties is derived from first principles and does not involve any a priori assumption about the vote-share distribution. Successful matching to election data will be a corroboration of this theory.
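
The scaling step can be sketched in code. The block below assumes equal mixture weights 1/n, county variances of the form μ(1 − μ)/N_i + μ(1 − μ)(1 − 1/N_i)/(N0 + N1 + 1), and the standard Pearson type VII relations for a symmetric distribution, c0 = 2β2σ²/(5β2 − 9) and α = sqrt((1 − c2)/c0), where σ² is the mixture variance. Function names are hypothetical.

```python
from math import inf, sqrt


def county_variances(n0, n1, votes_cast):
    """Assumed county-level variance of the vote-share for each county size N_i."""
    mu = n1 / (n0 + n1)
    spread = mu * (1 - mu)
    return [spread / n_i + spread * (1 - 1 / n_i) / (n0 + n1 + 1) for n_i in votes_cast]


def student_t_parameters(n0, n1, votes_cast):
    """Return (mu, alpha, m) of the approximating Student's t-distribution."""
    mu = n1 / (n0 + n1)
    variances = county_variances(n0, n1, votes_cast)
    n = len(variances)
    mix_var = sum(variances) / n                       # variance of the mixture
    beta2 = 3 * (sum(s * s for s in variances) / n) / mix_var ** 2   # kurtosis
    c0 = 2 * beta2 * mix_var / (5 * beta2 - 9)         # Pearson coefficients
    c2 = (beta2 - 3) / (5 * beta2 - 9)
    alpha = sqrt((1 - c2) / c0)
    m = (1 - c2) / c2 if c2 > 0 else inf               # equal variances give the normal limit
    return mu, alpha, m


def scaled_vote_shares(vote_shares, mu, alpha):
    """Scaled and shifted vote-shares alpha * (v - mu)."""
    return [alpha * (v - mu) for v in vote_shares]


if __name__ == "__main__":
    mu, alpha, m = student_t_parameters(12, 8, [100, 1000, 10000])
    print("mu, alpha, m:", mu, alpha, m)
    print(scaled_vote_shares([0.35, 0.40, 0.45], mu, alpha))
```

A quick consistency check on these relations: the variance of α(v − μ) equals m/(m − 2), the variance of a Student's t-distribution with m degrees of freedom, and as m grows this tends to 1, the variance of the standard normal.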

Empirical results of US presidential elections: 1920 to 2012

Analyzing the county vote-share probability distributions

Our analysis is based on US presidential election data from 1920 to 2012 [83]. States with fewer than 10 counties (i.e., Connecticut, Delaware, Hawaii, and Rhode Island) and Washington D.C. were excluded from analysis. For each state, in every election year, the data include the number of counties n for which vote-share data were available, the vote-share v_i in county i, and the total number of votes cast for all candidates in county i, N_i. The external influence parameters N0 and N1, and the distribution parameters α and m, were estimated for all states and every election year. Using these parameters, we constructed the probability distribution of the scaled and shifted vote-share quantity α(v − μ), and compared it with the predicted normal distribution. Fig 2 shows that this theoretical prediction fits remarkably well for most states and election years, representing almost a century of county-level U.S. presidential election data, and is consistent with observations in other countries [78].

Fig 2

Scaled vote-share distributions and predicted curves, 1920–2012.

Without loss of generality, curves are presented for states with the greatest number of counties in each of the four census regions—Northeast, Midwest, South, and West—of the U.S. The figure shows the plot of the cumulative distributions. Observed values (dashed lines) are based on kernel estimation with Gaussian kernels. Solid lines are Gaussian distributions with mean 0 and variance 1. The scaled vote-shares are calculated from the estimated external influence parameters N0 and N1. The goodness of fit of the Gaussian relative to the empirically observed county vote-share distributions was determined by using a Kolmogorov-Smirnov test. The fit of the model is excellent—the test fails to reject the normality null hypothesis at the 5% significance level, for 95% of all states in every election year; and at the 1% significance level, for 98% of all states in every election year.
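The goodness-of-fit check named in this caption can be illustrated with a small one-sample Kolmogorov-Smirnov sketch against the standard normal. This is not the authors' code: the p-value uses the asymptotic Kolmogorov series with a commonly used small-sample correction (an assumption), and the two samples are synthetic.

```python
import math
from statistics import NormalDist

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ks_statistic(sample):
    """Largest gap between the empirical CDF of `sample` and the standard normal CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x)
        # compare the theoretical CDF with the step just after and just before x
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_pvalue(d, n, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:          # the series is numerically 1 in this range
        return 1.0
    tail = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
               for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * tail))

# Synthetic stand-ins for scaled vote-shares alpha * (v - mu)
n = 50
on_target = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
shifted = [x + 1.0 for x in on_target]
d_ok, d_bad = ks_statistic(on_target), ks_statistic(shifted)
p_ok, p_bad = ks_pvalue(d_ok, n), ks_pvalue(d_bad, n)
```

Under this sketch, a state-year would count as consistent with the Gaussian prediction at the 5% level when its p-value exceeds 0.05.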

Analyzing the evolution of social influence

We can use the index of social influence defined in Eq 7 to explore the level of social interactions across states and election years. We first examine the distribution of the social influence index, aggregated over all states and election years. The histogram in the upper panel of Fig 3 shows a right-skewed distribution. This means that while the bulk of the distribution occurs for small values of social contagion, the electorate in US presidential elections is at times highly volatile and subject to wide swings of social contagion effects higher than the typical value. This is reflected by the highly right-skewed tail of the histogram. Here we find that the log-logistic provides a slightly better fit relative to the log-normal distribution. The log-logistic is a heavy-tailed distribution similar in shape to the log-normal distribution, but with heavier tails [84].

Fig 3

Distributions of social influence and best-fit curves, 1920–2012.

The upper and middle panels of Fig 3 show the histogram and cumulative distribution of the Social Influence index (using Eq 7), aggregated over all states and election years. The lower panel shows the spread of the Social Influence index. The broad distributions are best fitted by the log-logistic distribution, often used for analyzing skewed data. The goodness of fit of the log-logistic distribution was determined by using a Kolmogorov-Smirnov test. The fit of the model is very good (p = 0.16).

To analyze the electoral dynamics, we examine the spatial and temporal variation in social influence from 1920 to 2012. First, we examine the evolution of social influence over time. Fig 4 shows the time series of the average social influence for each of the nine U.S. census divisions (panels a-f) along with the time series of (normalized) social influence averaged over all U.S. states (panel g). To enable comparison of the various time series, all data are normalized to Z-scores. Specifically, for each individual time series we express the social influence in terms of standard deviations from its mean, calculated from 1920 to 2012. We use hierarchical clustering to identify clusters of U.S. census divisions with the highest within-cluster time-series correlation and the greatest between-cluster time-series variability. The result from the hierarchical clustering suggests three clusters: two main clusters (arranged in panels d and f) with within-cluster average correlations of 0.824 and 0.909, and a relatively high between-cluster average correlation of 0.775; and a singleton cluster (New England) with a relatively low between-cluster average correlation of 0.1071.
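The Z-score normalization described above amounts to the following minimal sketch. The use of the sample standard deviation (n − 1 in the denominator) is an assumption, since the text does not say which convention was used, and the input series is made up.

```python
import math

def z_scores(series):
    """Express each value as a number of standard deviations from the series mean."""
    n = len(series)
    mean = sum(series) / n
    # assumption: sample standard deviation (n - 1 denominator)
    sd = math.sqrt(sum((x - mean) ** 2 for x in series) / (n - 1))
    return [(x - mean) / sd for x in series]

# Hypothetical social influence values for one census division
z = z_scores([1.0, 2.0, 3.0, 4.0, 5.0])
```

Because the transformation is linear within each series, it changes the scale of the curves but not the pairwise correlations between them.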

Fig 4

Evolution of social influence, 1920–2012.

Time series of social influence averaged over states, and their normalized versions (see text), are shown for each of the nine U.S. census divisions: a-b) Division 1, New England—Maine, Massachusetts, New Hampshire, and Vermont; c-d) Division 2, Middle Atlantic (Solid line): New Jersey, New York and Pennsylvania. Division 4, West North Central (Dashed line): Iowa, Kansas, Minnesota, Missouri, Nebraska, North Dakota and South Dakota. e-f) Division 3, East North Central (Solid line): Illinois, Indiana, Michigan, Ohio and Wisconsin. Division 5, South Atlantic (Dashed line): Florida, Georgia, Maryland, North Carolina, South Carolina, Virginia and West Virginia. Division 6, East South Central (dotted line): Alabama, Kentucky, Mississippi and Tennessee. Division 7, West South Central (Dash-dot line): Arkansas, Louisiana, Oklahoma and Texas. Division 8, Mountain (Gray solid line): Arizona, Colorado, Idaho, Montana, Nevada, New Mexico, Utah and Wyoming. Division 9, Pacific: Alaska, California, Oregon and Washington (Gray dashed line). The nine U.S. census divisions are clustered according to the correlation between their respective normalized social influence profiles. The corresponding average pairwise correlations are 0.824 and 0.909 for the clusters in Fig 4c-d and Fig 4e-f, respectively. Panel 4g shows the time series of normalized social influence, averaged over all U.S. states. The Z–scores are mapped to colors from white (z = −1.1, below the mean) to black (z = 2.9, above the mean). A clear pattern of high social influence (positive or near-zero Z–scores) follows a period (1920–1980) of low social influence (negative Z–scores). The 1984 break date—separating low from high levels of social contagion—is identified by the Mann-Whitney U-test, which is applied for different potential breaks within the range 1920–2012 (see S1 Fig). 
Remarkably, we find that despite variations in social influence across states and divisions, the normalized time series pertaining to the overwhelming majority of states (with the exception of the New England states analyzed here) collapse onto a very similar curve. Indeed, as can be seen, the normalized curves in panels d and f show a very similar pattern, which is also similar to the observed temporal pattern of social influence when averaged over all states (Fig 4g). That is, the pattern in Fig 4g shows a monotonic upward trend, which means that social influence increases through time (Mann-Kendall test, p < 0.001). Moreover, the period of 1984–2012 displays much higher levels of social influence than the period of 1920–1980 (Mann-Whitney U-test, p < 0.001, see S1 Fig). New England is an apparent exception to this pattern (Fig 4a and 4b). However, this exception may be explained by historical events and our model. A distinctive characteristic that sets New England apart from other political regions in America is its town meeting form of government, a local institution that did not spread to other states [85]. The town meeting is the legislative assembly of a town in which qualified voters make laws in face-to-face communal decision making [85]. Town meetings defined New England’s politics until the middle decades of the 20th century. This changed in 1962 with the Supreme Court’s “one person, one vote” decisions, which shifted the power dynamic away from most small towns, which practiced town meetings and face-to-face interactions, to cities that adopted representative politics [85]. 
Thus the relatively high levels of social interaction observed in New England prior to the 1960 election (see Fig 4b), contrary to the patterns observed in other regions, correspond to the period in which town meetings (a powerful platform of social influence via face-to-face, communal decision making) had wide legislative powers. This was followed by a sharp decline in relative social influence (see Fig 4b) after the Supreme Court’s “one person, one vote” decision, which had the effect of shifting power from face-to-face communication and social interaction to representative politics. This political transition changed not only the relative level of social interactions, and thus the variability of the vote-share distributions, but also the partisan bias, and hence the mean of the vote-share distributions, towards the Democrats [85].
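The break-date search mentioned in the Fig 4 discussion (a Mann-Whitney U-test applied to different candidate breaks) could look roughly like the sketch below. The details are assumptions: it uses the normal approximation without a tie correction, requires at least three elections on each side, and runs on invented data, so it does not reproduce S1 Fig.

```python
import math

def mann_whitney_z(a, b):
    """Normal-approximation z score of the U statistic for sample a versus sample b."""
    pooled = sorted(a + b)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2.0   # mid-rank shared by tied values
        i = j
    n1, n2 = len(a), len(b)
    u = sum(ranks[x] for x in a) - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mu) / sigma

def best_break(years, values, min_size=3):
    """Return (first year of the later period, z) for the most significant split."""
    best = None
    for i in range(min_size, len(values) - min_size + 1):
        z = mann_whitney_z(values[:i], values[i:])
        if best is None or abs(z) > abs(best[1]):
            best = (years[i], z)
    return best

# Invented series: 16 low elections (1920-1980) followed by 8 high ones (1984-2012)
years = list(range(1920, 2013, 4))
low = [[-1.0, -0.6, -1.2, -0.8][i % 4] + 0.01 * (i // 4) for i in range(16)]
high = [[1.0, 1.4, 0.8, 1.2][j % 4] + 0.01 * (j // 4) for j in range(8)]
break_year, z = best_break(years, low + high)
```

A negative z here means the earlier period ranks below the later one, which is the direction of the shift reported in the text.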

Analyzing the correlation trends of social influence

As further support for the usefulness and consistency of our model, we examine how the spatial variation of social influence across states changes over time. We can characterize each election as a vector of state-level indices of social influence (using Eq 7), and measure the similarity between each pair of elections by the corresponding correlation coefficient (Fig 5b). This type of analysis, combined with the findings in Fig 4, reveals intriguing patterns that go beyond short‐term fluctuations in partisan division of the vote. Hierarchical clustering of the elections by the social influence correlation distance shows several marked clusters of highly similar election years (see Fig 5b): 1932–1972, 1984–2012, and three smaller clusters 1920–1924, 1928, and 1976–1980. Remarkably, these clusters of social interactions correspond nicely with the partitioning of American history into distinct party systems [86].
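The election-by-election comparison described above can be sketched as a matrix of Pearson correlations between state-level profiles, with 1 − r as a distance for clustering. The choice of 1 − r as the "correlation distance" is an assumption, and the three short profiles are invented placeholders rather than real social influence values.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def correlation_distance(profiles):
    """Map each pair of election years to 1 - r of their state-level profiles."""
    years = sorted(profiles)
    return {(s, t): 1.0 - pearson(profiles[s], profiles[t])
            for s in years for t in years}

# Hypothetical profiles: one value per state, same state order in every year
profiles = {
    1932: [1.0, 2.0, 3.0, 4.0],
    1936: [2.0, 4.0, 6.0, 8.0],
    1984: [4.0, 3.0, 2.0, 1.0],
}
dist = correlation_distance(profiles)
```

Years whose profiles rise and fall across the same states end up at distance near 0, so a hierarchical clustering on this matrix groups them together.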

Fig 5

Spatial variation of social influence and its change over time, 1920–2012.

a) Hot spot analysis of social influence for sample maps of US presidential elections (see S2–S25 Figs and S1 Movie for a complete analysis). The colored areas reflect the significance (p-value) of local concentration of social influence for each state. The p-values for each state are derived from a random permutation test of local clustering using the Getis-Ord Local statistic. This analysis was performed with a contiguity spatial weight matrix that indicates whether states share a boundary or not. The variable of concern is the social influence index calculated using Eq 7 in the main text. Low p-values (p-value ≤ 0.1) indicate statistically significant high levels of social influence at a state and its surrounding neighbors (hot spots). High p-values (p-value ≥ 0.9) indicate statistically significant low levels of social influence at a state and its surrounding neighbors (cold spots). b) Heatmap of the correlation coefficients between pairs of election years characterized by their state-level social influence profiles.

There have been six party system periods in American history, separated by relatively significant changes in party loyalties [86-90]. Clustering analysis reveals that during 1932–1972, external forces (in the form of attitudes, orientations, party identification, individual’s upbringing, religion, or ideology) are strong compared to social/peer influences, indicating a stable long-term electorate phase. This result is plausibly supported by the historical account. The stock market crash of 1929 and the ensuing depression signaled the realignment of the fifth party system from a Republican to a Democratic majority with the election of 1932 and the New Deal coalition [86, 88]. The change was also influenced by demographic shifts: a growing electorate of African Americans, blue collar workers, Catholics and urban ethnics, and a shrinking Republican base of white Protestants, small town residents, farmers, and middle class businessmen [88]. 
The distinction between external and social influence stands despite some fluctuations in Republican vs. Democratic selections. The cluster 1976–1980 identified by our clustering analysis (see also Fig 5b) suggests that the elections of 1976 and 1980 formed a transition period to the post-New Deal era of weakened partisanship among the voters [88]. This transition period corresponds to the Watergate scandal and Nixon's resignation in 1974, and Democrat Jimmy Carter’s victory in the 1976 presidential election. Whereas the previous fifth party system was characterized by strong party loyalties and partisan attachments, the sixth party system (overlapping with the 1984–2012 cluster in Fig 5b) is characterized by electoral dealignment—the weakening of party loyalties among voters [88, 90–91], reduced political involvement [92], and the critical role of voters’ personal social interaction networks in determining vote choices [24]. As partisanship declined and more voters became independents [86], inter-election vote swings increased [88]. Moreover, the external influence of television and newspaper declined as the media were considerably less likely to be sources of partisan-biased information [24, 88, 93]. This led to a period of strong competition where neither Democrats nor Republicans created a true majority party, resulting in alternating control of the presidency, split-ticket voting, and divided government. These trends seem to be consistent with our model, which shows higher levels of social contagion for the 1984–2012 period (Fig 4g), relative to the 1920–1980 period, combined with the long-term stability of social influence patterns indicated by the high levels of association between the 1984–2012 elections (Fig 5b). The high levels of social interactions observed in the 1984–2012 period (Fig 4g) account for an increasing volatility and variability of the vote-share distributions (via Eq 3). 
This is seen in the historical account: The Republicans won the presidency with the victories of Ronald Reagan and George H. W. Bush in 1980, 1984, and 1988, and regained control of the Senate from 1981 to 1987 for the first time in almost 30 years. The Democrats regained control of the presidency with Bill Clinton in 1992 and 1996, whereas the Republicans won control of Congress from 1994 to 2006 for the first time in 40 years. In 2000, Republican George W. Bush defeated Democrat Al Gore in the closest election in modern U.S. history. Although Bush won reelection in 2004, Democrats won control of Congress in 2006, and Democrat Barack Obama was elected in 2008. Although it would seem that Obama's victories in 2008 and 2012 suggest a critical realignment of the party system, the Republicans regained control of the House in 2010 by their biggest landslide since 1946, and control of Congress in 2014, with the largest Republican majority in the House since 1928.

Mapping the geography of social influence

S2 Movie in Supplementary Material shows maps (excluding Alaska and Hawaii), color-coded by levels of social influence, for all election years. In order to better characterize the spatial patterns of social influence observed in S2 Movie, we apply a variety of spatial statistical data analysis methods. First, we utilized a random permutation test of spatial autocorrelation using the Moran’s I statistic [94, 95]. The random permutation tests suggest (see S1 Table) the presence of significant positive spatial correlation, for all election years, between states’ own levels of social influence and the levels of their neighbors as indicated by the level of significance (p-value) shown in the third column of S1 Table. This analysis was performed with a contiguity spatial weight matrix (row normalized) that indicates whether states share a boundary or not. While the Moran’s I statistic indicates that the spatial distribution of high and/or low values is more spatially clustered than would be expected if underlying processes were random, it does not identify unexpected spatial spikes of high or low social influence values. We thus applied random permutation tests of spatial clustering using the Getis-Ord General G* statistic [96-97]. The tests indicate (see S2 Table) that social influence is significantly concentrated in space as shown by the significance levels (p-value) in the third column of S2 Table. That is, for all election years, the observed Getis-Ord General G* is larger than the expected General G*, indicating that the spatial distribution of high social influence values is more spatially clustered than would be expected if underlying spatial processes were truly random. In order to identify where high or low values of social influence cluster spatially, we further applied a random permutation test of local clustering using the Getis-Ord Local statistic [96-97]. 
Low p-values of the random permutation test indicate statistically significant high levels of social influence at a state and its surrounding neighbors (hot spots). High p-values indicate statistically significant low levels of social influence at a state and its surrounding neighbors (cold spots). This analysis was performed with a contiguity spatial weight matrix that indicates whether states share a boundary or not. The corresponding maps of hot spot analysis, for all election years, are presented in S2–S25 Figs and S1 Movie. A sample of these maps of social influence clusters is presented in Fig 5a. The colored areas in Fig 5a reflect the significance (p-value) of local concentration of social influence for each state, derived from the random permutation test of local clustering using the Getis-Ord Local statistic. The maps shown in Fig 5a (see S2–S25 Figs and S1 Movie for a complete analysis) make it possible to identify unusual geographical concentrations of high or low values (i.e., hot or cold spots) of social influence across the United States, for each election year. More specifically, the hotspot analysis of US presidential elections from 1920 to 2012 reveals a distinctive geographical cluster of states with statistically significant low levels of social influence (cold spots). This cluster comprises states mainly in the Great Plains and West North Central regions (including, for example, Montana, Wyoming, North Dakota, South Dakota, Nebraska, Kansas and Oklahoma). In contrast, states predominantly in the Middle Atlantic region (New Jersey, Pennsylvania, and New York), for all election years, and states in the Pacific region (California and Oregon) and the Southwest (Arizona and Nevada), from 1988 to 2012, display high values of social influence (hot spots). It would be interesting to speculate on the political, economic, social, and psychological factors that drive geographic variation in voting contagion. 
Research in the geographical and psychological sciences, which examines the geographical distribution of political, economic, social, and personality traits within the United States [98-103], suggests that the Great Plains and West North Central region is characterized by individuals who are typified by conservative social values, low openness and resistance to change, and a preference for familiarity over novelty. This region comprises states with comparatively small minority populations [101], is less affluent, has fewer highly educated residents, is less innovative compared with other regions, and tends to be politically conservative and religious [100]. Individuals in this region choose to settle near family and friends and maintain intimate social relationships with them, but also tend to display low levels of social tolerance and acceptance for people who are from different cultures, are unconventional, or live alternative lifestyles [100]. Altogether, the above characteristics indicate a region where voters' choices are plausibly based upon strong ideology, party identification, orientations and attitudes rooted in religion and traditional social values, and reinforced by face‐to‐face interactions with like‐minded family members and friends. We therefore expect our model to generate a social influence index (see S2 Movie for maps of raw social influence values instead of the Getis-Ord Local statistic) that reflects external forces (e.g., in the form of party identification or ideology), which are strong compared to peer influences. Unlike the very low openness and conservative social values typical of the Great Plains and West North Central region, states in the Middle Atlantic and Southwest regions are marked by moderately to very high openness, and are wealthy, educated, culturally and ethnically diverse, and economically innovative [100, 102]. This region appears to be politically liberal, and has fewer mainline Protestants [100]. 
Residents of this region also appear to be tolerant and accepting of social and cultural differences [100]. Considering the social diversity, tolerance, openness, and open-mindedness in this region, it is plausible that people’s orientations and attitudes are influenced by the attitudes of others [100]. This is consistent with our model, which shows high levels of social influence index (see maps of raw social influence values in S2 Movie) that indicate peer influences that are strong compared to external forces in the form of attitudes or ideology. Although further research is needed to uncover the factors affecting social influence, it is plausible that economic, social, and psychological factors, as discussed above, can explain the geographical variability of social influence.
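The global spatial autocorrelation test described in this section can be illustrated with a small Moran's I routine and a random permutation test. This is a sketch under stated assumptions: the row-normalized contiguity weights follow the description in the text, but the one-sided p-value convention, the function names, and the ten-unit chain used as a toy "map" are all hypothetical.

```python
import random

def morans_i(values, neighbors):
    """Moran's I with row-normalized contiguity weights.

    `neighbors` maps each unit index to the list of units sharing a boundary with it.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    num, total_w = 0.0, 0.0
    for i, nbrs in neighbors.items():
        if not nbrs:
            continue                      # units without neighbors carry no weight
        w = 1.0 / len(nbrs)               # row normalization
        num += sum(w * dev[i] * dev[j] for j in nbrs)
        total_w += 1.0                    # each normalized row sums to 1
    return (n / total_w) * num / denom

def permutation_pvalue(values, neighbors, n_perm=999, seed=0):
    """One-sided pseudo p-value: share of shuffles with I at least as large as observed."""
    rng = random.Random(seed)
    observed = morans_i(values, neighbors)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, neighbors) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Toy map: ten units in a row, low values on one side and high values on the other
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
values = [0.0] * 5 + [1.0] * 5
observed, p = permutation_pvalue(values, chain)
```

A positive I with a small p-value indicates that neighboring units have more similar values than a random arrangement would produce. The local Getis-Ord analysis follows the same permutation logic, applied state by state.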

Discussion

Many complex systems can be viewed as comprising numerous interconnected units, each of which independently responds to external forces but is also affected by internal forces exerted by the states of its connected units. In such systems, the stationary distribution of the states of the units may change in characteristic ways depending on the strength of external influences relative to internal influences [18–19, 31, 79–80]. Therefore, a key question is how to disentangle the effect of internal influences from that of exposure to external influences, given observational data about the phenomena we are trying to explain. This identification problem is important not only to the biological and physical sciences (e.g., ecosystems, see [104]), but also in the social sciences where the importance of social interactions in forming opinions and decisions has been emphasized [12, 45–46, 105]. The U.S. presidential elections are a case in point. In such situations, voters’ candidate choices are affected by many sources that convey consistent partisan biases skewed in favor of one candidate over another. These sources are numerous and include exposure to television, newspapers, campaign persuasion, or opinion leaders (including peers, journalists, or politicians); but also include various individual prejudiced attitudes and orientations, party identification, individual’s upbringing, religion, or ideology (no voter is a ‘blank slate’). Uncommitted voters are also affected by the choices of other uncommitted voters in their own personal networks, via social imitation mechanisms. All of these empirical facts are deeply rooted in the extensive study of electoral behavior by social and political scientists (see Introduction) as well as studies of opinion dynamics in the sociophysics literature (see Models of Opinion Dynamics). 
The vote-share fluctuations across counties, and other spatiotemporal voting patterns, thus depend on the relative magnitude of the persistent partisan biases for one candidate over another. Individual voters are influenced by a variety of psychological and social factors, but taking them all into account would be not only impossible but also unnecessary for understanding the large-scale behavior of the system. This large-scale behavior can still be captured by introducing a few key parameters, as we have demonstrated in this paper. We presented a general methodology for quantifying the degree of social imitation and peer influence on the basis of given observational data. The methodology is based on an extended version of the voter model [18-19] that takes into account the effect of external forces, and is applied to a comprehensive dataset of US presidential elections from 1920 to 2012. An essential element in the model is social interaction between individual voters. The model includes two parameters that reflect the bias in favor of one of two candidates. These tunable parameters represent unwavering candidate supporters (zealots or opinion leaders) that convey a consistent partisan bias in favor of one candidate over another; alternatively, as discussed above, they can be interpreted as external factors that influence uncommitted voters’ choices. In addition to these external factors, voters are also influenced by the behavior of others via social imitation. Our model is validated in several ways. First, we derive the theoretical probability distribution of the vote-share per county, and find a remarkable fit between the theoretical result and the empirically observed county vote-share distributions. Our theoretical result is also consistent with observations in other countries [78]. To our knowledge this is the first study that provides an analytical expression of the stationary vote-share distribution across counties. 
Second, we examined the temporal dynamics of social influence by calculating the social influence index for each state and each election year. Our analysis reveals a distinct pattern of increasing social influence over 92 years (1920–2012) of US presidential elections. The 1984 election year represents the phase transition point from low (1920–1980) to high (1984–2012) levels of social contagion. The increasing levels of social influence at presidential elections suggest, in turn, the decline of bias induced by external forces (e.g., partisanship among voters), and increasing independence in voting behavior. Third, we examined how the geographic variation across states in social influence changes over time. This spatiotemporal analysis enables our model to reproduce two stable long-term periods of election years corresponding to two successive long-term periods of low and high levels of social contagion, in alignment with the 1984 phase transition finding. This suggests a new data-driven, large-scale systems approach to characterizing abrupt transitions of political events, which is based on critical realignment in the patterns of social contagion. Finally, we use the model to map the social contagion geography of the United States. Results from spatial analysis reveal robust differences among regions of the United States in terms of their social influence index. In particular, we identify two regions of ‘hot’ and ‘cold’ spots of social influence, each comprising states that are geographically close. We provided some evidence that statewide variation in social contagion may be linked to psychological, social, and economic factors. More broadly, the results suggest the growing role of social influence, contagion, and ‘herd-following’ in shaping people's behaviors, tastes, and actions in a variety of real-life situations. 
Social influence and contagion will likely become increasingly evident as our society becomes more interconnected through the information superhighway and transport infrastructure networks. If we want to truly understand macro-level collective behavior in human systems—and perhaps devise ways by which human society can increase its collective wisdom—it will be important to develop practical and effective methods for measuring and monitoring the extent of social influence.

Materials and methods

Dynamic network model of voting

Consider a network representing a county with N nonpartisan voters (variable nodes) taking only the values of 0 or 1, representing support for candidate 0 or 1, respectively (e.g., Republican or Democrat). In addition, there are N0 and N1 partisan voters (frozen nodes) in state 0 and 1, respectively. At each time step, a variable node is selected at random; with probability 1 − p the node copies the state of one of its connected neighbors, and with probability p the state remains unchanged. The partisan nodes can also be interpreted as external perturbations, representing a variety of factors that influence voters’ attitudes towards one of the two candidates (e.g., mass media, party identification, individual’s upbringing, religion, or ideology). Analytically extending N0 and N1 to be real numbers enables modeling arbitrary strengths of external perturbations. For a fully connected network the behavior of the system can be solved exactly as follows. The nodes are indistinguishable and the state of the network is fully specified by the number of nodes with internal state 1. Therefore, there are only N + 1 distinguishable global states, which we denote S_k, k = 0, 1, ⋯, N. The state S_k has k variable nodes in state 1 and N − k variable nodes in state 0. If P_t(k) is the probability of finding the network in state S_k at time t, then P_{t+1}(k) can depend only on P_t(k), P_t(k + 1) and P_t(k − 1). The probabilities P_t(k) define a vector P_t of N + 1 components. The dynamics is described by the equation

$$P_{t+1}(k) = P_t(k)\left\{ p + \frac{1-p}{N(N+N_0+N_1-1)}\left[ k(k-1+N_1) + (N-k)(N-k-1+N_0) \right] \right\} + P_t(k-1)\,\frac{(1-p)(N-k+1)(k-1+N_1)}{N(N+N_0+N_1-1)} + P_t(k+1)\,\frac{(1-p)(k+1)(N-k-1+N_0)}{N(N+N_0+N_1-1)}.$$

The term inside the first brackets gives the probability that the state S_k does not change in that time step and is divided into two contributions: the probability p that the node does not change plus the probability 1 − p that the node does change but copies another node in the same state. In the latter case, the state of the node is 1 with probability k / N, and it may copy a different node in the same state with probability (k − 1 + N1)/(N + N0 + N1 − 1). 
Also, if the state of the selected node is 0, which has probability (N − k)/N, it may copy another node in state 0 with probability (N − k − 1 + N0)/(N + N0 + N1 − 1). The other terms are obtained similarly. In terms of the probability vector P_t, the dynamics is described by the equation

$$P_{t+1} = T\,P_t \equiv \left( 1 - \frac{1-p}{N(N+N_0+N_1-1)}\,A \right) P_t ,$$

where the time evolution matrix T, and also the auxiliary matrix A, is tri-diagonal. The non-zero elements of A are independent of p and are given by

$$A_{k,k} = 2k(N-k) + N_1(N-k) + N_0 k, \qquad A_{k,k+1} = -(k+1)(N+N_0-k-1), \qquad A_{k,k-1} = -(N-k+1)(N_1+k-1).$$

The transition probability from state S_M to S_L after a time t can be written as (Eq 10)

$$P(L,t;M,0) = \sum_{r=0}^{N} \frac{a_{rL}\, b_{rM}}{\Gamma_r}\, \lambda_r^{t}, \qquad \Gamma_r = \sum_{k=0}^{N} a_{rk}\, b_{rk},$$

where a_{rL} and b_{rM} are the components of the right and left r-th eigenvectors of the evolution matrix, a_r and b_r. Thus, the dynamical problem has been reduced to finding the right and left eigenvectors and eigenvalues of the time evolution matrix T. The eigenvalues λ_r of T are given by

$$\lambda_r = 1 - \frac{(1-p)\, r\,(r-1+N_0+N_1)}{N(N+N_0+N_1-1)}, \qquad r = 0, 1, \dots, N,$$

and satisfy 0 ≤ p ≤ λ_r ≤ 1. The equation for P(L,t; M, 0) shows that the asymptotic behavior of the network is determined only by the right and left eigenvectors with unit eigenvalue, i.e., by the eigenvectors corresponding to λ0 = 1. The coefficients of the corresponding (unnormalized) left eigenvector are simply b_{0k} = 1. The coefficients a_{0k} of the right eigenvector are obtained using a generating function technique and an associated nonlinear second order differential equation [18-19]. The coefficients are then given by the Taylor expansion of the hypergeometric function F(−N, N1, 1 − N − N0, x) ≡ ∑_k a_{0k} x^k. After normalization, these coefficients give the stationary distribution (Eq 12)

$$\rho(k) = \binom{N}{k}\, \frac{(N_1+k-1)!\,(N+N_0-k-1)!\,(N_0+N_1-1)!}{(N+N_0+N_1-1)!\,(N_1-1)!\,(N_0-1)!}.$$

This is the probability of finding the network with k nodes in state 1 at equilibrium, and it is independent of the initial state. The other eigenvectors, corresponding to λ_r ≠ 1, can also be calculated, and are also related to hypergeometric functions [18-19]. Although these eigenvectors provide a complete description of the dynamics of the network (see Eq 10), they are not particularly illuminating as we are interested in understanding the asymptotic behavior of the system (λ0 = 1). 
In the thermodynamic limit N → ∞, we can define continuous variables v = k/N, n0 = N0/N and n1 = N1/N and approximate the asymptotic distribution presented in Eq 12 by a Gaussian with mean and variance

$$\bar{v} = \frac{n_1}{n_0+n_1}, \qquad \sigma^2 = \frac{n_0\,n_1\,(1+n_0+n_1)}{N\,(n_0+n_1)^3}$$

In the limit where n0, n1 ≫ 1, the width depends only on the ratio α = n0/n1 and is given by $\sigma = \sqrt{\alpha/N}\,/\,(1+\alpha)$. In particular, for n0 = n1 ≫ 1, the width tends to $1/(2\sqrt{N})$.

While the model solved above was stated in terms of non-negative integer influence parameters N0, N1, it can be generalized to a model where the external influence parameters N0, N1 are real numbers. In this case, the solution in Eq 12 remains the same, with the difference that factorials must be replaced by gamma functions. Since the numbers N0/(N + N0 + N1 − 1) and N1/(N + N0 + N1 − 1) represent the probabilities that a free node (nonpartisan voter) copies one of the frozen nodes (partisan voters), small (large) values of N0 and N1 can be interpreted as representing a weak (strong) connection between the free nodes and the external system containing the frozen nodes. The external system can be thought of as a reservoir that affects the network but is not affected by it.
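The quality of the Gaussian approximation can be checked numerically. The sketch below computes the exact mean and variance of the vote share v = k/N from the stationary distribution (gamma-function form) and compares them with the large-N expressions, taking the variance to be n0 n1 (1 + n0 + n1)/(N (n0 + n1)^3) and the limiting width to be sqrt(α/N)/(1 + α). Function names and parameter values are assumptions chosen for illustration.

```python
from math import exp, lgamma, sqrt


def stationary(N, N0, N1):
    """rho(k) for k = 0..N, in gamma-function form (N0, N1 real and positive)."""
    log_norm = (lgamma(N + 1) + lgamma(N0 + N1)
                - lgamma(N + N0 + N1) - lgamma(N0) - lgamma(N1))
    return [
        exp(log_norm + lgamma(N1 + k) + lgamma(N + N0 - k)
            - lgamma(N - k + 1) - lgamma(k + 1))
        for k in range(N + 1)
    ]


def exact_moments(N, N0, N1):
    """Mean and variance of v = k/N computed directly from rho(k)."""
    rho = stationary(N, N0, N1)
    mean = sum(k / N * r for k, r in enumerate(rho))
    var = sum((k / N - mean) ** 2 * r for k, r in enumerate(rho))
    return mean, var


def gaussian_moments(N, N0, N1):
    """Large-N mean and variance in terms of n0 = N0/N and n1 = N1/N."""
    n0, n1 = N0 / N, N1 / N
    return n1 / (n0 + n1), n0 * n1 * (1 + n0 + n1) / (N * (n0 + n1) ** 3)


def limiting_width(N, alpha):
    """Width for n0, n1 >> 1, depending only on alpha = n0/n1."""
    return sqrt(alpha / N) / (1 + alpha)


# Illustrative values only.
exact_mean, exact_var = exact_moments(500, 100.0, 150.0)
approx_mean, approx_var = gaussian_moments(500, 100.0, 150.0)
```

For these values the two variances differ by well under one percent, and the discrepancy shrinks as N0 + N1 grows.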

Model behavior

Fig 6 shows examples of the distribution ρ(k) for a network with N = 500 and various values of N0 and N1. As we move around in the (N0, N1)-parameter space, we observe different types of behavior, which is characteristic of a first-order phase transition.
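The same distributions can also be sampled directly from the update rule. The following is a minimal Monte Carlo sketch for a fully connected network, tracking only the number k of variable nodes in state 1. The function name, the seed, the starting state and the run length are assumptions made for illustration, and no burn-in period is discarded.

```python
import random


def simulate_vote_share(N, N0, N1, p, steps, seed=0):
    """Time-averaged fraction of variable nodes in state 1 (fully connected case)."""
    rng = random.Random(seed)
    k = N // 2                        # arbitrary initial state
    others = N + N0 + N1 - 1          # nodes a selected voter can copy
    total = 0.0
    for _ in range(steps):
        if rng.random() >= p:                        # node copies with prob. 1 - p
            if rng.random() < k / N:                 # selected node is in state 1
                if rng.random() < (N - k + N0) / others:
                    k -= 1                           # it copied a node in state 0
            elif rng.random() < (k + N1) / others:   # selected node is in state 0
                k += 1                               # it copied a node in state 1
        total += k / N
    return total / steps


# Illustrative run: the long-time average should sit near N1 / (N0 + N1) = 0.75.
average_share = simulate_vote_share(N=50, N0=5, N1=15, p=0.0, steps=200000, seed=1)
```

Because N0 and N1 enter only through copy probabilities, the same routine accepts non-integer values, which matches the interpretation of the frozen nodes as external perturbations of arbitrary strength.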

Fig 6

Stationary distributions for different values of N0 and N1.

Probability distributions of finding the network with k nodes in state 1 at equilibrium for different values of N0 and N1. The number of variable nodes is N = 500.

For N0 = N1 = 1, we obtain ρ(k) = 1/(N + 1) for all values of N, i.e., N0 = N1 = 1 is the critical value of this model. In this case, all states $S_k$ are equally likely and the system executes a random walk through the state space. In the limit N → ∞, N0 = N1 = 1 marks the transition between disordered and ordered states. For N0, N1 > 1, we obtain skewed unimodal distributions with peak at N1/(N0 + N1) corresponding to the fraction of voters in the network that voted for candidate 1. If N1 > N0, the majority of votes go to candidate 1, and if N0 > N1 the majority of votes go to candidate 0. We note that the estimation of the influence parameters N0, N1, based on almost a century of US presidential election data, predominantly falls within this regime. For N0, N1 ≫ 1, ρ(k) resembles a Gaussian distribution, and if N0 = N1 about half the voters vote for candidate 0 and half the voters vote for candidate 1, similarly to a magnetic material at high temperatures. For N0, N1 < 1 (the bistable, or hysteresis, region), we obtain bimodal distributions in which either of the two network phases can exist, similar to the magnetization state in the Ising model below the critical temperature. For N0, N1 ≪ 1, the distribution peaks at all nodes 0 or all nodes 1, similar to a magnetized state at low temperatures. Finally, for N1 > 1, N0 < 1 or N1 < 1, N0 > 1, we obtain unimodal distributions with peaks at all nodes 1 or all nodes 0, respectively.
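The regimes listed above can be reproduced by locating the local maxima of ρ(k). The sketch below evaluates the stationary distribution in gamma-function form and reports the positions of its peaks; the helper names and the relative tolerance are assumptions of this illustration.

```python
from math import exp, lgamma


def stationary(N, N0, N1):
    """rho(k) for k = 0..N, in gamma-function form (N0, N1 real and positive)."""
    log_norm = (lgamma(N + 1) + lgamma(N0 + N1)
                - lgamma(N + N0 + N1) - lgamma(N0) - lgamma(N1))
    return [
        exp(log_norm + lgamma(N1 + k) + lgamma(N + N0 - k)
            - lgamma(N - k + 1) - lgamma(k + 1))
        for k in range(N + 1)
    ]


def local_maxima(rho, tol=1e-9):
    """Indices k where rho[k] strictly exceeds each existing neighbour."""
    peaks = []
    last = len(rho) - 1
    for k, value in enumerate(rho):
        above_left = k == 0 or value > rho[k - 1] * (1 + tol)
        above_right = k == last or value > rho[k + 1] * (1 + tol)
        if above_left and above_right:
            peaks.append(k)
    return peaks


N = 500
peaks_by_regime = {
    "N0 = N1 = 5 (unimodal, interior peak)": local_maxima(stationary(N, 5.0, 5.0)),
    "N0 = N1 = 0.5 (bimodal)": local_maxima(stationary(N, 0.5, 0.5)),
    "N0 = 0.5, N1 = 2 (all nodes 1)": local_maxima(stationary(N, 0.5, 2.0)),
    "N0 = 2, N1 = 0.5 (all nodes 0)": local_maxima(stationary(N, 2.0, 0.5)),
}
```

At the critical point N0 = N1 = 1 the distribution is flat, so `local_maxima` returns an empty list there; flatness is better checked by comparing every entry with 1/(N + 1).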

Other network topologies

Although the stationary vote-share distribution given by Eq 12 is obtained assuming fully connected networks, it was shown in [18-19] that our exact results are excellent approximations for other networks, including random, regular lattice, scale-free, and small world networks. These approximations can be useful, for example, if our model is applied to a network constructed based on online social networks or commuting networks. For these networks, which are not fully connected, the effect of the frozen nodes is amplified and can be quantified as follows: the probability that a free node copies a frozen node is P = (N0 + N1)/(N0 + N1 + k) where k is the degree of the node. We can then define effective numbers of frozen nodes in the corresponding fully connected network, $N_0^{\mathrm{eff}}$ and $N_1^{\mathrm{eff}}$, as being the values for which

$$\frac{N_0^{\mathrm{eff}}+N_1^{\mathrm{eff}}}{N_0^{\mathrm{eff}}+N_1^{\mathrm{eff}}+N-1} = \sum_k f(k)\,\frac{N_0+N_1}{N_0+N_1+k} \tag{14}$$

where the term on the right-hand side in Eq 14 is the expectation with respect to the degree distribution f(k), and the term on the left-hand side is the probability that a free node copies a frozen node in the corresponding fully connected network. Eq 15 is the mean field boundary condition. For well-behaved distributions, $N_0^{\mathrm{eff}}$ and $N_1^{\mathrm{eff}}$ can be obtained in terms of central moments of the degree distribution by expanding the right-hand side in Eq 14 around the average degree 〈k〉 of the real network, as follows:

$$\frac{N_0^{\mathrm{eff}}+N_1^{\mathrm{eff}}}{N_0^{\mathrm{eff}}+N_1^{\mathrm{eff}}+N-1} = \frac{N_0+N_1}{N_0+N_1+\langle k\rangle}\left[1 + \sum_{n\ge 2}\frac{(-1)^n\,\mu_n}{(N_0+N_1+\langle k\rangle)^n}\right]$$

where $\mu_n = \sum_k (k-\langle k\rangle)^n f(k)$ are the central moments of the distribution f(k). For example, using only the first term in the Taylor expansion gives

$$\frac{N_0^{\mathrm{eff}}+N_1^{\mathrm{eff}}}{N_0^{\mathrm{eff}}+N_1^{\mathrm{eff}}+N-1} = \frac{N_0+N_1}{N_0+N_1+\langle k\rangle}$$

This leads to

$$N_0^{\mathrm{eff}} = f\,N_0, \qquad N_1^{\mathrm{eff}} = f\,N_1$$

where f = (N − 1)/〈k〉. Therefore, as the network acquires more internal connections and 〈k〉 increases, the effective values $N_0^{\mathrm{eff}}$ and $N_1^{\mathrm{eff}}$ decrease.
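The mapping to effective frozen nodes can be computed directly from a list of node degrees. In the sketch below, the expectation over the degree distribution is replaced by an empirical average, the matching condition is solved for the total effective number of frozen nodes, and that total is split between the two camps in proportion to N0 and N1. The proportional split and the function names are assumptions of this illustration; the condition itself fixes only the sum.

```python
def effective_frozen(degrees, N0, N1):
    """Effective (N0, N1) matching the mean copy-a-frozen-node probability.

    Assumes at least one node has positive degree, so that q < 1 below.
    """
    N = len(degrees)
    ext = N0 + N1
    # Average probability that a free node of degree k copies a frozen node.
    q = sum(ext / (ext + k) for k in degrees) / N
    # Solve x / (x + N - 1) = q for the total effective number x.
    total = q * (N - 1) / (1.0 - q)
    return total * N0 / ext, total * N1 / ext


def first_order_frozen(degrees, N0, N1):
    """Leading-order estimate: scale by f = (N - 1) / <k>."""
    N = len(degrees)
    f = (N - 1) / (sum(degrees) / N)
    return f * N0, f * N1


# Hypothetical degree sequences for illustration.
regular = [4] * 100
mixed = [2] * 50 + [10] * 50
exact_mixed = effective_frozen(mixed, 2.0, 3.0)
approx_mixed = first_order_frozen(mixed, 2.0, 3.0)
```

For a regular network the two functions agree exactly, since all central moments beyond the zeroth vanish. For a heterogeneous degree sequence the leading-order estimate falls below the full value, because the copy probability is a convex function of the degree.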

Hotspots of social contagion: 92 years of presidential elections.

S1 Movie shows colored maps that reflect the significance (p-value) of local concentration of social influence for each state. The p-values for each state are derived from a random permutation test of local clustering using the Getis-Ord Local statistic. This analysis was performed with a contiguity spatial weight matrix that indicates whether states share a boundary or not. The variable of concern is the social influence index calculated using Eq 7 in the main text. Low p-values (p-value ≤ 0.1) indicate statistically significant high levels of social influence at a state and its surrounding neighbors (hot spots). High p-values (p-value ≥ 0.9) indicate statistically significant low levels of social influence at a state and its surrounding neighbors (cold spots). (MOV)

Social influence topography of the United States: 1920–2012.

S2 Movie shows maps of social influence for all election years. The colored areas are derived from the social influence index calculated using Eq 7 in the main text. (MOV)

Testing for a break in the level of social influence using the Mann-Whitney U-test.

The Mann-Whitney U-test is a nonparametric test that assesses whether one of two random variables is stochastically larger than the other. Given a time-series of social influence from 1920 to 2012, we define for each election year, y, two samples of social influence: from 1920 to y−4, and from y to 2012. We apply the Mann-Whitney U-test for these two samples, and calculate the corresponding p-value. The optimal break date is the date that achieves the minimum p-value over all potential breaks within the range 1920–2012 (marked by a red circle in the above curve, plotted in a linear-log scale). (TIF)
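The break-date scan can be sketched as follows. This is a minimal standard-library version that uses midranks for ties and a two-sided normal approximation without tie or continuity corrections; the source does not state which variant of the test was used, so these choices, the `min_size` guard on sample sizes, and the synthetic series (with a jump placed arbitrarily at 1960) are all assumptions of this illustration.

```python
from math import erfc, sqrt


def mann_whitney_p(x, y):
    """Two-sided p-value from the normal approximation to the U statistic."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i, n = 0, len(pooled)
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0          # average of ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for t in range(i, j) if pooled[t][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    z = (u - n1 * n2 / 2.0) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return erfc(abs(z) / sqrt(2.0))


def best_break(years, values, min_size=2):
    """Year y minimising the p-value for samples [first..y-4] versus [y..last]."""
    best = None
    for i in range(min_size, len(years) - min_size + 1):
        p_value = mann_whitney_p(values[:i], values[i:])
        if best is None or p_value < best[1]:
            best = (years[i], p_value)
    return best


# Synthetic, purely illustrative series: low level up to 1956, high from 1960.
years = list(range(1920, 2016, 4))
low, high = [0.12, 0.10, 0.13, 0.11], [0.52, 0.50, 0.53, 0.51]
values = [low[i % 4] for i in range(10)] + [high[i % 4] for i in range(14)]
break_year, break_p = best_break(years, values)
```

On this synthetic input the scan recovers the year at which the jump was placed. With real data, an exact test or a tie-corrected variance may be preferable for the small samples near either end of the range.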

Hot spot analysis of social influence: 1920 US presidential election.

The colored areas reflect the significance (p-value) of local concentration of social influence for each state. The p-values for each state are derived from a random permutation test of local clustering using the Getis-Ord Local statistic (see Fig 5 in main text for details). (TIF)

Hot spot analysis of social influence: 1924 US presidential election.

Hot spot analysis of social influence: 1928 US presidential election.

Hot spot analysis of social influence: 1932 US presidential election.

Hot spot analysis of social influence: 1936 US presidential election.

Hot spot analysis of social influence: 1940 US presidential election.

Hot spot analysis of social influence: 1944 US presidential election.

Hot spot analysis of social influence: 1948 US presidential election.

Hot spot analysis of social influence: 1952 US presidential election.

Hot spot analysis of social influence: 1956 US presidential election.

Hot spot analysis of social influence: 1960 US presidential election.

Hot spot analysis of social influence: 1964 US presidential election.

Hot spot analysis of social influence: 1968 US presidential election.

Hot spot analysis of social influence: 1972 US presidential election.

Hot spot analysis of social influence: 1976 US presidential election.

Hot spot analysis of social influence: 1980 US presidential election.

Hot spot analysis of social influence: 1984 US presidential election.

Hot spot analysis of social influence: 1988 US presidential election.

Hot spot analysis of social influence: 1992 US presidential election.

Hot spot analysis of social influence: 1996 US presidential election.

Hot spot analysis of social influence: 2000 US presidential election.

Hot spot analysis of social influence: 2004 US presidential election.

Hot spot analysis of social influence: 2008 US presidential election.

Hot spot analysis of social influence: 2012 US presidential election.

Results of random permutation tests of spatial autocorrelation using Moran’s I statistic.

This analysis was performed with a contiguity spatial weight matrix (row normalized) that indicates whether states share a boundary or not. The variable of concern is the social influence index calculated using Eq 7 in the main text. The observed Moran’s I statistics are shown in the second column and the corresponding significance levels (p-values) of the tests are shown in the third column. The random permutation tests suggest the presence of significant positive spatial autocorrelation as indicated by the level of significance (p-value) shown in the third column. (TIF)
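A random permutation test of this kind can be sketched with the standard library alone. The example below computes Moran's I with row-normalized contiguity weights and an upper-tail pseudo p-value. The toy geography (ten regions in a chain), the number of permutations, the seed, and the one-sided convention are assumptions of this illustration, not details taken from the analysis reported here.

```python
import random


def row_normalize(neighbors):
    """Map region -> {neighbor: weight}, with each row summing to 1.

    Regions without neighbors get no row and so contribute no cross-products.
    """
    return {i: {j: 1.0 / len(js) for j in js} for i, js in neighbors.items() if js}


def morans_i(values, weights):
    """Moran's I for a dict of region values and a dict-of-dicts weight matrix."""
    n = len(values)
    mean = sum(values.values()) / n
    dev = {i: v - mean for i, v in values.items()}
    s0 = sum(w for row in weights.values() for w in row.values())
    cross = sum(w * dev[i] * dev[j] for i, row in weights.items() for j, w in row.items())
    return (n / s0) * cross / sum(d * d for d in dev.values())


def permutation_test(values, weights, n_perm=999, seed=0):
    """Observed I and the share of shuffles with I at least as large."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    keys = list(values)
    shuffled = [values[k] for k in keys]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(dict(zip(keys, shuffled)), weights) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical geography: ten regions in a row, each touching its neighbors.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 9] for i in range(10)}
weights = row_normalize(chain)
gradient = {i: float(i) for i in range(10)}      # smoothly varying toy variable
observed_i, p_value = permutation_test(gradient, weights)
```

A smooth gradient along the chain yields a large positive I and a small pseudo p-value, which is the signature of positive spatial autocorrelation.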

Results of random permutation tests of spatial clustering using Getis-Ord General G* statistic.

This analysis was performed with a contiguity spatial weight matrix that indicates whether states share a boundary or not. The variable of concern is the social influence index calculated using Eq 7 in the main text. The observed Getis-Ord General G* statistics and significance levels (p-values) of the tests are shown in the second and third columns, respectively. The tests indicate that social influence is significantly concentrated in space as shown by the significance levels (p-value) in the third column. For all election years, the observed Getis-Ord General G* is larger than the expected General G*, indicating that the spatial distribution of high social influence values is more spatially clustered than would be expected if underlying spatial processes were truly random. (TIF)
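The comparison between observed and expected values can be illustrated with the standard Getis-Ord General G for binary contiguity weights, G = Σ w_ij x_i x_j / Σ x_i x_j over pairs i ≠ j, whose expectation under spatial randomness is W/(n(n − 1)), with W the sum of the weights. Whether this matches the exact variant used for the supplementary table is an assumption; the toy chain geography and values are hypothetical.

```python
def general_g(values, neighbors):
    """Getis-Ord General G with binary contiguity weights (non-negative values)."""
    weighted = sum(values[i] * values[j] for i, js in neighbors.items() for j in js)
    total = sum(values.values())
    # Sum of x_i * x_j over all ordered pairs with i != j.
    all_pairs = total * total - sum(v * v for v in values.values())
    return weighted / all_pairs


def expected_general_g(neighbors):
    """Expected G under spatial randomness: W / (n (n - 1))."""
    n = len(neighbors)
    weight_sum = sum(len(js) for js in neighbors.values())
    return weight_sum / (n * (n - 1))


# Hypothetical geography: ten regions in a row; high values clustered at one end.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 9] for i in range(10)}
clustered = {i: (10.0 if i < 3 else 1.0) for i in range(10)}
observed_g = general_g(clustered, chain)
expected_g = expected_general_g(chain)
```

An observed value above the expectation indicates that high values sit next to one another more often than random placement would produce. Significance would then be assessed by permuting the values across regions, as in the tests described above.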

33 in total

1. Scaling behavior in a proportional voting process.

Authors: R N Costa Filho; M P Almeida; J S Andrade; J E Moreira
Journal: Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics Date: 1999-07

2. Does a single zealot affect an infinite group of voters?

Authors: Mauro Mobilia
Journal: Phys Rev Lett Date: 2003-07-11 Impact factor: 9.161

3. Moran model as a dynamical process on networks and its implications for neutral speciation.

Authors: Marcus A M de Aguiar; Yaneer Bar-Yam
Journal: Phys Rev E Stat Nonlin Soft Matter Phys Date: 2011-09-01

4. The spread of behavior in an online social network experiment.

Authors: Damon Centola
Journal: Science Date: 2010-09-03 Impact factor: 47.728

5. Scaling and universality in proportional elections.

Authors: Santo Fortunato; Claudio Castellano
Journal: Phys Rev Lett Date: 2007-09-25 Impact factor: 9.161

6. Nonequilibrium transition induced by mass media in a model for social influence.

Authors: J C González-Avella; M G Cosenza; K Tucci
Journal: Phys Rev E Stat Nonlin Soft Matter Phys Date: 2005-12-01

7. Divided we stand: three psychological regions of the United States and their political, economic, social, and health correlates.

Authors: Peter J Rentfrow; Samuel D Gosling; Markus Jokela; David J Stillwell; Michal Kosinski; Jeff Potter
Journal: J Pers Soc Psychol Date: 2013-10-14