Literature DB >> 26064661

Simple graph models of information spread in finite populations.

Abstract

We consider several classes of simple graphs as potential models for information diffusion in a structured population. These include biases cycles, dual circular flows, partial bipartite graphs and what we call 'single-link' graphs. In addition to fixation probabilities, we study structure parameters for these graphs, including eigenvalues of the Laplacian, conductances, communicability and expected hitting times. In several cases, values of these parameters are related, most strongly so for partial bipartite graphs. A measure of directional bias in cycles and circular flows arises from the non-zero eigenvalues of the antisymmetric part of the Laplacian and another measure is found for cycles as the value of the transition probability for which hitting times going in either direction of the cycle are equal. A generalization of circular flow graphs is used to illustrate the possibility of tuning edge weights to match pre-specified values for graph parameters; in particular, we show that generalizations of circular flows can be tuned to have fixation probabilities equal to the Moran probability for a complete graph by tuning vertex temperature profiles. Finally, single-link graphs are introduced as an example of a graph involving a bottleneck in the connection between two components and these are compared to the partial bipartite graphs.

Entities: Chemical Disease Species

Keywords: fixation probabilities; graph models of populations; graph parameters

Year: 2015 PMID： 26064661 PMCID： PMC4453248 DOI： 10.1098/rsos.150028

Source DB: PubMed Journal: R Soc Open Sci ISSN： 2054-5703 Impact factor: 2.963

Introduction

Using graphs to represent structured populations has become an important tool in approaching a variety of questions relating to the way that information spreads in non-homogeneous populations. Questions that have been addressed include the extinction or fixation of mutant genes [1-4]; epidemics [5-7]; spread of computer viruses, rumours and gossip [8-11]; development of rumour spreading algorithms for synchronization processes in parallel computation [12-14]; uptake of innovations and new ideas [15]; locating the source of rumours or viruses (natural and computer) [16,17]; and tracking terrorists [18,19]. A population is represented as a directed graph, each vertex corresponding to an individual population member, while edges are labelled with information describing the interaction between population members. In many cases, the edge labels are probabilities so that the weight w of the edge that connects vertex i to vertex j describes the probability of an effect being propagated from vertex i to vertex j. Following this, an update procedure is defined describing temporal dynamics. Here attention is restricted to discrete processes but continuous models have also been explored [20]. Some of the useful update procedures are birth–death, death–birth, voter models and probabilistic voter models [21]. In this paper, birth–death updating is used. In cases of rumour spread, there are competing narratives and birth–death processes provide a good model. The state space approach developed in Voorhees [22] and Voorhees & Murray [23] allows determination of all fixation probabilities for arbitrary initial conditions. The population is partitioned into two disjoint classes, depending on the presence or absence of a defining characteristic (e.g. a mutant gene, believing a rumour, being infected with a virus, etc.). Vertices are labelled 0 if this characteristic is not present and 1 if it is present. Thus, for a population of fixed size N the state space is the set of 2 binary vectors of length N with the all zero vector corresponding to extinction and the all one vector to fixation. A state vector =(v1,…v) is also a binary number. Taking v as the denary form of this number, , and denoting the probability that the state denoted v will go to fixation by x, a set of master equations were derived [22], yielding fixation probabilities for all initial states. where W is the edge weight matrix of the population graph, v′=1−v and ()=⋅W,()=′⋅W. This yields interesting results [22,23] but is limited in that equations (2.1) are a set of 2−2 linear equations in an equal number of unknowns so that exact solution for populations having more than a few members is not possible unless strong symmetry constraints are imposed. As far as general solutions for fixation probabilities are concerned, Moran [24] derived the single vertex fixation probability for a complete graph on N vertices as and the k-vertex fixation probability for the complete graph is A graph is fixation equivalent to a complete graph if its single vertex fixation probability is given by the Moran expression of equation (2.2). Lieberman et al. [4] proved that a graph with a probabilistic edge weight matrix is fixation equivalent to a complete graph if and only if the edge weight matrix is doubly stochastic. Broom & Rychtàr [25] derived the single vertex fixation probability for the n-star graph and also provided a means to compute this probability for a line graph. Zhang et al. [26] derived the k-vertex fixation probability for star graphs, and Voorhees & Murray [23] give the single vertex fixation probability for the complete bipartite graph K together with summation formulae allowing computation of k-vertex fixation probabilities for these graphs. The star and bipartite results are reproduced in simple and compact form in Monk et al. [27] using a Martingale formalism, while results reported in Adlam & Nowak [28] indicate that the Moran formula of equation (2.1) may be, at least asymptotically, universal in the family of random graphs. In this paper, several classes of simple graphs are considered with respect to fixation probability and characteristic measures of graph structure. Because applied interest is in large populations, a good deal of research is focused on random graphs, complete graphs and small world graphs with vertex degrees specified in terms of global averages and treated statistically [29-32]. Our goal is more modest, to provide results in some very simple cases that may point to more general applications, and to examine the extent to which structural parameters of graphs may provide information related to fixation probabilities and other questions of interest. Graph parameters considered are eigenvalues of the graph Laplacian, subset conductances, graph communicability (the Estrada parameter) and measures of vertex centralities, and random walk hitting times. Choice of graphs for study has been both pragmatic and strategic: we have chosen graphs that are simple to analyse, yet exhibit behaviours suggestive for more general cases. The graphs considered are biased cycles, dual circular flows, partial bipartite graphs and single-link coupled graphs. While cycles are perhaps the simplest graphs to consider, biased cycles display interesting properties, e.g. providing a clear case of asymmetry in the graph Laplacian that shows an analogy to rotation/angular momentum. In addition, results obtained for cycles may be useful for analysis of more complex graphs in which interaction cycles can be identified, the simplest case being dual circular flows, which can be thought of as composed of an ensemble of cycles. Dual circular flows, in turn, constitute a large family of graphs that include star graphs [4,22,25], funnel graphs [4,22,23], cascades [23] and layered networks [33]. In addition, we use them to provide an example of how the edge weights of a graph can be tuned to match specified global properties such as temperature profile. Finally, the partial bipartite and single-link graphs form the two extremes of a class of graphs with vertex set the union of two components V and V with linkages between these components through subsets S⊆V and S⊆V having, respectively, k and m elements. In terms of application, this family of graphs can provide models for polarized populations in which communication between different subpopulations takes place through subsets of representatives. While illustrated here for only a small numbers of vertices, many of our results hold for graphs of arbitrary size. In particular, theorems (4.1), (4.3), (5.1) and (6.1) giving Laplacian eigenvalues for cycles, dual circular flows and partial bipartite graphs; equation 7.1, giving the characteristic polynomial for the Laplacian of a single-link graph; conjectures (6.2) and (6.3) on hitting times and communicability for partial bipartite graphs; and formulae derived for conductances are valid for any number of vertices. Partial bipartite graphs are of particular interest. These graphs have only three non-zero Laplacian eigenvalues and which of these is smallest or largest depends on the values of the connection parameters 1−p and 1−q between the two subsets of vertices. Equality of all three eigenvalues occurs at the value of p and q for which the conductances of each graph component are equal and, although no proof is available, this appears to correspond to the condition for equality of single vertex fixation probabilities. Further, given conjecture (6.3), these parameter values give the minimum value of the communicability and are the condition for equal probability measures on paths going in either direction between graph components. The notation used is that vertices are labelled with lower case indices ranging over the number of vertices. Thus, v labels the kth vertex of a graph. The single vertex fixation probability for the kth vertex is labelled by x2 with 2 the base 10 form of the binary number associated with a single one at the kth vertex.

Measures of graph characteristics

This section briefly describes the structural measures for graphs that are considered in this paper.

Eigenvalues of the Laplacian

The graph Laplacian of a graph G with edge weight matrix W is defined in Barbosa et al. [34] as Δ=Φ1/2(I−W)Φ−1/2, where Φ is a diagonal matrix with entries given by the solution of ϕ⋅W=ϕ. The smallest eigenvalue of Δ is always 0 and the eigen-spectrum satisfies λ0=0≤λ1≤⋯≤λ≤2. The number of zero eigenvalues gives the number of disconnected components of the graph. Thus, if λ1 is non-zero the graph is connected and the value of λ1 gives a measure of the difficulty involved in cutting the graph into disconnected parts.

Subset conductances

For a graph (G, W) with vertex set V the conductance, also known as the Cheeger constant [35], is defined as For a subset of vertices S, the conductance of S is The conductance of S is inversely related to the mixing time for a Markov process starting at a vertex in S.

Communicability

This concept was introduced in 2008 by Estrada & Hatano [36] in order to describe situations ‘in which a perturbation on one node of [a] network is ‘felt’ by the rest of the nodes with different intensities.’ [37], p. 92. The communicability between vertices i and j of a graph is defined as a weighted sum of all paths starting at vertex i and ending at vertex j, with weighting that favours shorter paths. This allows definition of a variety of communicability functions with the general form , where A is the graph adjacency matrix and the coefficients c are chosen such that the series converges, gives greater weight to shorter paths (smaller k values), and yields real, non-negative values for the matrix elements of P. As sums over loops of all lengths starting and ending at a vertex, the diagonal elements of P represent a measure of vertex centrality [37]. One choice for coefficients is c=1/k!, yielding P=e, and the communicability is defined as Tr(e/N), i.e. the average vertex centrality. Since cases considered in this paper involve probabilistically weighted edges, the matrix A is replaced by the edge weight matrix W. Hence the matrix function used is V =e and the i,j element of this matrix is the weighted sum of probabilities for paths of length k starting at i and ending at j, for . We still refer to Tr(e/N) as communicability, however. In most cases, e cannot be computed directly and the sixth-order approximation e∼I+W+1/2W2+1/6W3+1/24W4+1/120W5+1/720W6 is used.

Expected hitting times

Following Kemeny & Snell [38], the matrix M of expected hitting times for a graph (G,W) is computed using the fundamental matrix of the graph, defined by Z=[I−(W−A)]−1 where W is the edge weight matrix and A is the matrix with columns equal to the stationary probability of a random walk on G. If J is the matrix of all ones and Z* is the diagonal matrix with , then [38] The h element of M is the expected value of the random variable describing the number of iterations for a random walk that starts at vertex i to first reach vertex j.

Biased cycles

This section begins with consideration of biased cycles, that is, cycles in which there is a probabilistically preferred direction. Figure 1 shows the general form of a biased cycle. If the parameter p equals one-half it becomes a balanced cycle, while p=1 yields a clockwise cycle and p=0 yields a counter-clockwise cycle. The weight matrix W is doubly stochastic; hence, by the Isothermal theorem, the single vertex fixation probability equals the Moran probability, although time to fixation will differ [39].

Figure 1.

Counter-clockwise probabilities are p and clockwise probabilities are 1−p.

Counter-clockwise probabilities are p and clockwise probabilities are 1−p. From equation (3.2), it is easy to see that the conductance of any single connected block of k vertices in a length N biased cycle equals N/k(N−k). Also, the matrix e is circulant: e=circ(V11,V12,V13,V14,V15). Since the diagonal elements of W are zero if k is odd and symmetric in p and 1−p if k is even, Tr(e) will be symmetric in p and 1−p with maximum value at p=1/2. Figure 2 shows entries of e/5 for N=5.

Figure 2.

Entries in e=Circ(V11,V12,V13,V14,V15).

Entries in e=Circ(V11,V12,V13,V14,V15). This figure is symmetric about p=1/2. V=V for all i,j, hence Tr(e/5)=V11 and V12=V15,V13=V14 at 1/2. The diagonal elements V11 are substantially greater than the off diagonal elements with the exceptions V15=V11 at p=0 and V12=V11 at p=1. Since the steady state for the matrix W is homogeneous, the Laplacian matrix for a biased cycle is Δ=I−W. Surprisingly, the characteristic polynomial for Δ is rather complicated. For the biased cycle on N vertices, Proof of theorems (4.1) and (4.3) is tedious but straight forward.

Theorem 4.1

Let C(N, p) be a biased cycle on N vertices. The characteristic polynomial for Δ(C(N, p)) is where the coefficients satisfy and

Corollary 4.2

If p=0 or p=1, equations (4.2) reduce to (λ−1)=1(N even),(λ−1)=−1(N odd) and the eigenvalues of Δ are given in terms of the Nth roots of unity: λ=1+e2 or λ=1−e2 as N is even or odd, respectively.

Theorem 4.3

Let Δ be the antisymmetric part of Δ. For a biased cycle, all eigenvalues of Δ are either zero or are imaginary and contain a factor of 1−2p. If 1 – 2p is negative then there is a clockwise bias in the cycle, whereas if it is positive there is a counter-clockwise bias. This bias shows up clearly in the hitting times. Figure 3a−c shows hitting times for the biased cycle as a function of p for one-, two- and three-step transitions in the counter-clockwise direction. The shape of these curves is related to the possibility of a k-step transition occurring either as a counter-clockwise transition of k steps or a clockwise transition of N−k steps. Figure 3d shows the one-, two-, three- and four-step hitting times in the counter-clockwise direction for the N=9 biased cycle.

Figure 3.

Hitting times for single vertex transition in counter-clockwise direction as function of p for N=5–9 (from bottom). (a) single-step, (b) double-step, (c) three-step, (d) N=9 case for one through four-step transitions (bottom to top). The value of p at which the peak hitting time occurs gives a measure of the bias of the cycle. For example, for N=5 the single-step transition in the counter-clockwise direction is favoured for , whereas for p>0.68806 the four-step transition in the clockwise direction is favoured. Because of the symmetry between p and 1−p, the peak value of p starting from the symmetrically placed vertex will be one minus the peak value starting from a given vertex. Referring to figure 1, if vertex v1 is located at the top of a five cycle then the p value of 0.68806 characterizes vertex v2 while vertex v5 will be characterized by a peak p-value of 0.31194. Likewise, the two- and three-step graphs for N=5 are mirror images reflected about p=1/2.

Dual circular flows

In Voorhees [22] and Voorhees & Murray [23], a class of graphs called circular flows is defined and in Voorhees & Murray [23] a number of results concerning circular flows are given, including entropy computations and demonstration that this family of graphs contains many cases in which the single vertex fixation probability is enhanced with respect to the Moran probability only for limited values of the fitness parameter r≥1. In these cases, graph edges were directed in a single direction and the graphs were viewed in terms of probability flow through a series of layers. Here this is generalized to dual circular flows, that is, circular flows in which directed edges exist in both directions. For simplicity, only the case of uniform right- and left-directed edge weights is considered although this condition will be dropped when we consider tuning a graph. A length k+1 dual circular flow is illustrated schematically in figure 4.

Figure 4.

Schematic of a k+1 level dual circular flow. If p=0 or 1 this reduces to the simple homogeneous circular flow defined in Voorhees [22] and Voorhees & Murray [23].

Schematic of a k+1 level dual circular flow. If p=0 or 1 this reduces to the simple homogeneous circular flow defined in Voorhees [22] and Voorhees & Murray [23]. The state space for this system is the set of vectors {(m0,m1,…,m)|0≤m≤n}, where m indicates the number of mutants in the ith subset of the graph, which contains n vertices. Using the notation of Voorhees [22], x indicates the fixation probability of the state , whereas x( represents the fixation probability for the state arising from in which the ith population of mutants has increased (+) or decreased (−) by 1. For a dual circular flow, equation (1.1) becomes (All sums taken mod(k+1).) Figure 5 shows single vertex fixation probabilities for the (n0,n1,n2)=(1,2,3) dual circular flow. The p=0 case corresponds to the (1, 2, 3) cascade graph and the p=1 case corresponds to the (1, 2, 3) funnel graph. The fixation probability for the single vertex at level 0 is x1, for each of the two vertices at level 1 it is x2, and for each of the three vertices at level 2 it is x8. The overall single vertex fixation probability is ρ=(x1+2x2+3x3)/6 and the Moran probability is denoted ρm.

Figure 5.

Fixation probabilities for (1,2,3) dual circular flow. (a) Single vertex fixation probability, (b) fixation probabilities x1 minus x2, (c) fixation probabilities x1 minus x8, (d) fixation probabilities x2 minus x8, (e) single vertex fixation probability as a function of p for specified r values and (f–h) fixation probabilities x1, x2 and x8 for specified p-values. Figure 5a shows the transition between fixation probabilities for cascade and funnel flows as p varies from 0 to 1. Figure 5b–d shows the differences between single vertex fixation probabilities from different levels (0, 1, 2). Figure 5e shows cross sections of figure 5a for specified values of r, whereas figure 5f–h shows plots of x1−ρm, x2−ρm and x8−ρm, respectively, as functions of r for specified p-values. Examination of these figures indicates that starting at the single, level zero vertex always suppresses selection, starting at one of the three level-two vertices always enhances selection, relative to the Moran value, while starting at one of the two level-one vertices will enhance or suppress selection depending on the values of p and r (for example, for p=0.4 selection is enhanced for r>1). At p=0.25, x2>x8 for r<4.0471 while for r values greater than this x8>x2. Likewise, at p=0.9, x2 This example illustrates the complex behaviour that appears even in simple dual circular flow graphs. In Voorhees & Murray [23], a number of cases are considered showing that the single vertex fixation probability for such graphs can display both enhancement and suppression of selection, depending on the particular value of the fitness r. For example, in the (1, 2, 3)-funnel graph (p=1), selection is only enhanced for 1 The following theorem relates the characteristic polynomials for the Laplacian of all dual circular flow graphs to that of a biased cycle of the same length.

Theorem 5.1

Let (n0, n1,…,n) represent a dual circular flow with . Then the characteristic polynomials for Δ and Δ are just (λ−1) multiplying the corresponding characteristic polynomials of a length k biased cycle with bias p. Outline of proof Beginning with the flow (1, 1, 2), the claim is demonstrated for this case. This is extended by induction to the flows (1, 1, n) and (1, …,1, n). With this established a further induction demonstrates the result for flows (1,…,1,n,n). Proceeding in this way, an eventual proof is obtained. One point to note is that all eigenvalues of Δ again involve a factor of 1−2p, providing a measure of directional bias in the circular flow. By equation (3.2) and inspection of figure 4, it is obvious that the conductance of level s in a dual circular flow equals N/(N−n). Computation of the communicability and the associated vertex centralities yield results like those indicated in figure 6 which shows these plotted as functions of p for the dual flows 123, 235 and 1234, with V indicating the centrality of a vertex at level i.

Figure 6.

Communicability (V , in red) and associated vertex centralities (black) for dual circular flows: (a) (1,2,3) flow (note the communicability equals the v1 vertex centrality), (b) (2,3,5) flow and (c) (1,2,3,4) flow. Inspection of this figure indicates that the centrality of a vertex in a given layer is inversely related to the number of vertices in that layer—multiplying vertex centrality by the number of vertices in a layer shows that the centrality of a layer is proportional to the number of vertices—layers with more vertices have greater centrality. This can be understood in terms of the topological structure of these graphs—there are no edges connecting vertices within a layer. This also implies that the hitting times between layers will be the same as for a biased cycle of the same length, although hitting times between specific vertices will be substantially longer. Dual circular flows also provide an example of the way that edge coefficients in a graph can be tuned to match desired global features. To address this, the graph of figure 4 is generalized, allowing different weight coefficients between levels. This is indicated in figure 7. The target parameter chosen for tuning is vertex temperature [4] at each level of the graph and the inter-level edge weights p will be selected to produce specified temperature profiles. In particular, conditions will be determined for which the single vertex fixation probability matches that of a complete graph.

Figure 7.

Schematic of a general k+1 level dual circular flow.

Schematic of a general k+1 level dual circular flow. Referring to figure 7, the temperature of a vertex at level i is Hence, if a temperature profile =(t0,…,t) is to be matched the following set of equations must be satisfied: Two examples will illustrate the factors involved in doing this.

Example 5.2

A three-level flow (n0,n1,n2) with t=1 for all i. In this case, the goal is to adjust the connection coefficients p so as to produce a graph with single vertex fixation probability equal to the Moran probability. Equation (5.3) becomes Setting p2=p and solving for p0 and p1 yields Since p0 and p1 are probabilities they must lie in the range [0, 1], and this requires For example, if (n0,n1,n2)=(1,2,3), equation (5.6) requires 1/3≤p≤2 and 2/3≤p≤2/3. From the second of these, the only value allowed for p=p2 is 2/3 and this yields p0=1, p1=0. Thus, the (1,2,3) flow with edge weights (p0,p1,p2)=(1,0,2/3) has the Moran single vertex fixation probability. If (n0,n1,n2)=(2,3,4), on the other hand, there is a larger range of possible edge weights: the conditions that must be satisfied are 1/2≤p2≤3/4 with p0=2p2−1/2 and p1=2(2p2−1). If the number of levels in the graph is even (i.e. k is odd), an additional constraint appears. Equation (5.3) separate into two sets of equations with the left sides of each set summing separately to zero. This yields the conditions For example, if k=3 and t=1 for all i, this requires n0+n2=n1+n3. In addition, each set of equations is solved in terms of a single parameter so there are two parameters in the solution. For example, in the general k=5 case, the equations are and with constraints Equations (5.8a,b) have solutions in terms of p4 and p5, respectively: For a solution to be valid, all transition probabilities must be in [0,1] imposing the additional constraints while the values of p4, p5 are also constrained to lie in [0, 1]. If edge weights exist satisfying the temperature profile t=1, 0≤i≤k, then the graph with these edge weights will have the Moran fixation probability. The next example, however, demonstrates that matching a general temperature profile does not always lead to matching fixation probabilities.

Example 5.3

Consider the circular flow (n0,n1,n2) with temperature profile (2, 3/2, 1/3). Any edge weights satisfying p0=3p2−2, p1=(3p2− 1)/2 will match this profile. If p2=1 then p0=p1=1 as well—this case is considered in Voorhees & Murray [23]. But all values of p2 in the interval [2/3, 1] yield valid solutions matching the given profile. Figure 8 shows a plot of the single vertex fixation probability minus the Moran probability for this range of p, illustrating that very different fixation probabilities result.

Figure 8.

Fixation probabilities for (1,2,3) flow matching temperature profile (2,3/2,1/3).

Partial bipartite graphs

The vertex set for a complete bipartite graph K consists of two subsets V and V containing s and n vertices, respectively, with edge set . That is, every vertex in V has a link to every vertex in V and vice versa but no pairs of edges in V are linked nor are any pairs of edges in V. Furthermore, if edges are directed and weighted, then all weights for edges from V to V are 1/n and all weights for edges from V to V are 1/s. This family of graphs includes the star graphs (s=1). In Voorhees & Murray [23], the birth–death single vertex fixation probability for the complete bipartite graph K is given as Here we consider the more general case of partial bipartite graphs, as illustrated in figure 9. This figure shows two subsets V and V of vertices with |V|=s and |V|=n. Within each subset, every vertex connects equally to every other vertex with probabilities p/(s−1) and q/(n−1), respectively. In addition, every vertex in V connects to every vertex of V with the probability (1−p)/n and every vertex of V connects to every vertex of V with probability (1−q)/s.

Figure 9.

Schematic of a partial bipartite graph.

Schematic of a partial bipartite graph. The edge weight matrix for this graph has block form: where J is the i by j matrix of all ones and I is the j by j identity matrix. The normalized steady-state solution for ⋅W= is with s entries of n(1−q) and n entries of s(1−p). The Laplacian matrix and its antisymmetric part have block form Proof of the next theorem follows directly from the form of these matrices.

Theorem 6.1

The characteristic polynomials for Δ and Δ of the N vertex partial bipartite graph with edge weight matrix W given in equation (6.2) are Thus, Δ has eigenvalues of 0, (s−1+p)/(s−1) with multiplicity s−1, (n−1+q)/(n−1) with multiplicity n−1, and 2−p−q, whereas Δ has N−2 zero eigenvalues with the remaining two eigenvalues imaginary and equal to . The total probability flow from V to V equals s(1−p) while the total flow from V to V is n(1−q). In the bipartite case, the subsets of interest are V and V. The right and left conductance of the graph are thus defined as The conductance of a set of vertices S is inversely related to the mixing time of a Markov process starting from a vertex in S. Thus, C(V)≥C(V) implies that the mixing time starting from a vertex in V will be less than that starting from a vertex in V. The condition for C(V)≥C(V) is Figure 10 shows the single vertex probability for the (2, 5) partial bipartite graph for fixed values of q with p and r as variables and for r=2 with p and q as variables.

Figure 10.

Fixation probabilities minus Moran probabilities for (2, 5) partial bipartite graph for varying q values (a–d), (e) detail of (c) (0≤p≤1/3) and (f) the x1 (vertex in V2 component) and x4 (vertex in V5 component) probabilities for r=2. Although figure 10c and 10d appear quite similar there is a significant difference. In figure 10d the p=0 line is at 0, whereas in figure 10c this is not quite the case, as indicated in the detail of figure 10e. Figure 11 shows cross sections of figure 10e. While the difference from the Moran probability is extremely small, these figures show an unusual effect in which selection is enhanced for fitness values less than one, suppressed for a limited range of values greater than one and then enhanced again. The graphs of figure 11 illustrate a transition between three distinct regimes of behaviour. In the first, selection is suppressed for 01 (pattern A, table 1). In the second, selection is enhanced for 0fitness in the range [1, ] and enhanced for fitness greater than (pattern C, table 1). In the third, selection is enhanced for 01 (pattern D, table 1). When other values of q are considered, similar sorts of transitions appear, with a fourth region in which selection is suppressed for 0fitness in the range [1, ] and suppressed for (pattern B, table 1). Table 1 shows comparative q- and p-values illustrating this effect, which will be discussed further in §7.

Figure 11.

Cross sections of figure 10e.

Table 1.

Patterns of single vertex fixation probability in (2, 5) partial bipartite graph.

q	pattern A	pattern B (p,r_max)	pattern C (p,r_max)	pattern D
0	p<0.5	(0.5,257.0019), (0.55,23.5695), (0.6,12.5060), (0.75,5.3575), (0.875,3.7144), (1,2.8845)
0.125	p<0.445	(0.445,261.1673), (0.45,91.6745), (0.55,6.2814), (0.6,4.1189), (0.75,1.3160), (0.85,1.0000)		p>0.85
0.3333	p<0.3375	(0.3375,109.1445), (0.3402,28.3141), (0.35,11.2907), (0.3667,4.9994), (0.4,2.1249), (0.5,1.0000)		p>0.5
0.5	p<0.2		(0.2075,1.0260), (0.208,1.1176), (0.2085,1.2023), (0.21,1.4381), (0.215,2.2926), (0.228,404.7369)	p>0.228
0.625	p<1/16			p>1/16
0.6667	p<0.0833		(0.0833,1.0000), (0.0999,1.7239), (0.1,2.7687), (0.1111,5.9870), (0.117,247.4142) (0.118,42.2788,175.4007)	p>0.125
0.875	p>0.585		(0.48,821.1031), (0.49,13.0399), (0.5,7.0984), (0.51,5.0663), (0.55,2.3417), (0.585,1.2061)	p<0.48

Cross sections of figure 10e. Patterns of single vertex fixation probability in (2, 5) partial bipartite graph. The q=0.6667, p=0.118 case is an anomaly in that selection is enhanced for the two regions 0 Figure 12 shows probabilities for the (3,4) partial bipartite graph as functions of p and q for r taking values 5/4, 3/2 and 2. In all cases, the ranges of p and q are [0, 7/8] in order to avoid a singular that arises if p or q is 1.

Figure 12.

Single vertex fixation probabilities for (3,4) partial bipartite graph with r=5/4, 3/2 and 2. (a) Fixation probability minus Moran probability with r=2, (b–d) fixation probabilities starting on three sides (sheet that is lower at q=0) compared to that starting on fourth side for r=5/4, 3/2, 2. Figure 12b–d plots x1 and x8, the single vertex probabilities for a vertex starting in the three vertex subset and the four vertex subset, respectively. The equation for x1−x8=0 contains over 350 terms in products of p and q with coefficients of up to 20 digits. Nevertheless, a simple solution exists and is the same for all three values of r considered: q=(1+3p)/4. This compares exactly with the condition based on conductances given in equation (6.7). Likewise, for the (2, 5) case the x1−x4=0 line is q=(3+2p)/5, which again is the same condition as given by equation (6.7). This suggests that equation (6.7) provides a sharp differentiation between fixation probabilities for vertices starting in one or the other of the two graph components V and V. If the condition of equation (6.7) is satisfied the subset V will be said to be evangelical relative to V while if this condition is not satisfied V is isolated relative to V. In the extreme case p=1, we will say that V is quarantined. Estimates of hitting times were computed for the (s,n) partial bipartite graph for a number of examples and the form of these cases suggests the following conjecture.

Conjecture 6.2

Let G be an (s, n) partial bipartite graph with h(s, s) and h(n, n) the respective hitting times between pairs of vertices in V and V, and h(s, n) the hitting time starting from a vertex in V and ending at one in V while h(n, s) is the hitting time starting from a vertex in V and ending at one in V. Then Given this conjecture, the condition for h(s,s)≥h(n,n) is If (s,n)=(3,4) then equation (6.9) becomes q≥(1+p)/(5+p). Note that the hitting times h(s,s) and h(n,n) both involve a factor of 2−p−q which is an eigenvalue of the Laplacian. If p and q are sufficiently small, 2−p−q is the largest eigenvalue. From equation (6.5), ‘sufficiently small’ requires with equality yielding p=(s−1)/(N−1), q=(n−1)/(N−1). More generally, eigenvalues of the Laplacian can be plotted as functions of p and q, as illustrated in figure 13a for (s,n)=(3,4), and the curves for equality of pairs of the non-zero eigenvalues can be computed, showing q as a function of p as in figure 13b.

Figure 13.

(a) Eigenvalues of Δ for (3,4) partial bipartite graph as functions of (p,q). λ1=(3+q)/3,λ2=(2+p)/2,λ3=2−p−q. (b) Equal value lines for eigenvalues of Δ for (3,4) partial bipartite graph.

(a) Eigenvalues of Δ for (3,4) partial bipartite graph as functions of (p,q). λ1=(3+q)/3,λ2=(2+p)/2,λ3=2−p−q. (b) Equal value lines for eigenvalues of Δ for (3,4) partial bipartite graph. The point of equality for all three eigenvalues is (p,q)=(1/3,1/2). Ranking eigenvalues according to size for the six different regions in this graph is shown in table 2.

Table 2.

Size order of eigenvalues of Δ for regions defined in figure 13b.

region in figure 13	non-zero eigenvalue order
I	λ₃, λ₁, λ₂
II	λ₃, λ₂, λ₁
III	λ₁, λ₃, λ₂
IV	λ₂, λ₃, λ₁
V	λ₁, λ₂, λ₃
VI	λ₂, λ₁, λ₃

Size order of eigenvalues of Δ for regions defined in figure 13b. In the limiting case of p=q=1, the eigenvalue 2−p−q becomes zero and the graph separates into two disconnected components. Care is required, however, to determine how h(s,s) and h(n,n) behave in this limit since the factors (2−p−q)/(1−p) and (2−p−q)/(1−q) give results that depend on how the limit (p,q)→(1,1) is approached. For example, suppose that p=q. Then 2−p−q=2−q−q=(1−q)(2+q+q2+⋯+q) and at q=1h(s,s)=(1+k)(s−1) and likewise h(n,n)=(1+k)(n−1). These differ from the hitting times for size s and size n complete graphs, which are s−1 and n−1, respectively. The way to deal with this in computing h(s,s) is to set p=1 before going to the limit (or, in computing h(n,n) setting q=1). Otherwise there is always the possibility of transitions between V and V giving additional contributions to hitting times. Exact computation of e is possible for partial bipartite graphs and examination of the cases (2, 3), (3, 4), (2, 5), (3, 5) and (4, 5) suggests the following.

Conjecture 6.3

Let G be an (s, n) partial bipartite graph, V and V the diagonal elements of e corresponding to vertices in V and V, with V and V refer to off block diagonal elements. Then In addition, Setting partial derivatives of Tr(e) with respect to p and q to zero and solving for q yields Setting these equal indicates that the minimum value of Tr(e) occurs at which is the condition for eigenvalue equality in equation (6.10). Figure 14 shows plots of V and V, V and V, and Tr(e) for (s,n)=(3,4).

Figure 14.

Plots of (a) V33 and V44, (b) V34 and V43, and (c) Tr(e) for (3,4) partial bipartite graph.

Plots of (a) V33 and V44, (b) V34 and V43, and (c) Tr(e) for (3,4) partial bipartite graph. In figure 14a, V33 is lowest at p=0, q=1 and in figure 14b, V is zero at p=1, whereas V is zero at q=1. The minimum value in figure 14c occurs at (p,q)=(1/3,1/2), which coincides with the point at which all non-zero eigenvalues of Δ are equal. The line of equality in figure 14a is difficult to express but can be approximated as the straight line q=1.0923827p−0.3800648, thus for p>(q+0.3800648)/1.0923827, V will be larger than V. In figure 14b, the line of equality is given by q=(1+3p)/4. Likewise, if q>(1+3p)/4 then V

Single-link graphs

In this section, we consider graphs consisting of two components with linkage between components occurring only through a single vertex in each component, as illustrated in figure 15.

Figure 15.

Single line graph.

Single line graph. These graphs and partial bipartite graphs are the extreme cases for graphs having two components with linkages occurring only between subsets of each component. Here the component V contains s vertices, one of which is linked to one of the n vertices in V. Each non-linked vertex in V connect to every other vertex in this component with probability 1/(s−1) while the linking vertex connects to all other vertices in V with probability p/(s−1) and connects to the linking vertex in V with probability 1−p. Likewise, all non-linked vertices in V connect to every other vertex in this component with probability 1/(n−1) while the linking vertex connects to all other vertices in V with probability q/(n−1) and connects to the linking vertex in V with probability 1−q. Single vertex fixation probabilities are suppressed relative to the Moran probability for cases considered, as shown in figure 16 for (s,n)=(3,4) for r=5/4 and r=2. This figure also shows comparisons of fixation probabilities for possible initial mutant vertices for r=5/4 and 2. Here x1 is the fixation probability for a non-linked vertex in V3, x4 is the fixation probability for the linking vertex in V3, x8 is the fixation probability for the linking vertex in V4, and x16 is the fixation probability for a non-linked vertex in V4.

Figure 16.

Fixation probabilities for (3,4) single-link graph for r=5/4 (a,c,e) and r=2 (b,d,f): (a,b) ρ−ρm, (c,d) x1 and x16, (e,f) x4 and x8.

Fixation probabilities for (3,4) single-link graph for r=5/4 (a,c,e) and r=2 (b,d,f): (a,b) ρ−ρm, (c,d) x1 and x16, (e,f) x4 and x8. In figure 16c,d, if p=0 the x16 probability is zero and the x1 probability is the Moran probability for s−1 vertices. Likewise, if q=0 the x1 probability is zero and the x16 probability is the Moran probability for n−1 vertices because fixation can only occur if the initial mutant is introduced in the {v1,v2,v3} or the {v4,v5,v6,v7} vertex sets, respectively. If p=1 (q≠1), on the other hand, the x1 probability is zero and if q=1 (p≠1), the x16 probability is zero (these are not shown in figure 16 which only shows p and q between the values of 0 and 7/8). If p=q=1 the graph has split into disconnected components, which shows up in the appearance of a second zero eigenvalue for the Laplacian as shows up in figure 17e below.

Figure 17.

Roots of cubic factor of the (3,4) single-link characteristic polynomial (a–e) correspond to p=0, 1/4, 1/2, 3/4, 1, respectively. In each figure, q varies from bottom to top with values 0, 1/4, 1/2, 3/4, 1. Taking x=λ−1, the characteristic polynomial for the Laplacian of this graph is Thus, there will be s−2 roots λ=s/(s−1), n−2 roots λ=n/(n−1). The last term in brackets factors into x+1 (yielding the 0 eigenvalue) and a cubic. Figure 17 shows plots of the cubic factor for chosen values of p and q. In the lowest curve of figure 17a, p=q=0, corresponding to an eigenvalue of 2 (since x=λ−1), whereas in the upper curve of figure 17e, p=q=1, giving a second eigenvalue of 0 (x=−1) as the graph is split into two disconnected components. With the linked vertices denoted v and v, the conductances for the sets V∖v, V∖v, v, v, V and V are given by All but the two of these are constant and the condition C(V)=C(V) is that q=p. Approximation of e and Tr(e) for the (3, 4) case yields figure 18.

Figure 18.

(a) Tr(e/7), (b) centrality of v1 and v5 vertices, (c) centrality of v3 and v4 vertices, (d) average centrality of V3 and V4 components, (e) communicability of {v1,v2} and {v5,v6,v7} vertex sets, (f) communicability of v3 and v4 vertices. The minimax values Tr(e/7) are at (p,q)∼(0.793893,0.394234). The line of equality in figure 17b is approximately given by q∼2.021513p+0.0074732—if q is greater than the above value, vertex v1 has the greater centrality. In figure 17e, the line of equality is q=p which, as in the partial bipartite case, corresponds to the condition for C(V3)=C(V4), and we conjecture that this holds for a general single-link graph. Table 3 shows the hitting times for the (3, 4) single-link graph.

Table 3.

Hitting times for (3, 4) single-linked graph.

h11=2+p+2q−5pqp(1−q),h33=2+p+2q−5pq(1−q),	h55=2+p+2q−5pqq(1−p)h44=2+p+2q−5pq(1−p)
h12=2(2+p+2q−5pq)3p(1−q)	h56=3(2+p+2q−5pq)4q(1−p)
h₁₃=2	h31=2(4−p+4q−7pq)3p(1−q)
h14=31−p	h41=(8+p+8q−5pq)p(1−q)
h15=3(2+p+2q−3pq)2q(1−p)	h51=2(4+5p+4q−7pq)3p(1−q)
h34=1+2p1−p	h43=1+3q1−q
h35=(6+3p+2q−5pq)2q(1−p)	h53=41−q
h45=3(2+p−3pq)2q(1−p)	h₅₄=4

Hitting times for (3, 4) single-linked graph. In contrast to hitting times for the partial bipartite graphs, most of the hitting times in table 5 show divergences for cases where p or q are 0 or 1. Consideration of the graph topology makes it apparent that the hitting time h will diverge whenever there is either zero probability of a transition from vertex u to vertex v, or there is a finite probability of a transition from vertex u to a vertex u* for which the probability of transition to vertex v is zero.

Table 5.

Summary of results (roman indicates general results, italic indicates limited exemplary cases).

graphs	eigenvalues	conductance	communicability, centrality	expected hitting times	fixation probabilities
biased cycles	theorems (4.1) and (4.2), corollary (4.2), equation (4.2)	N/k(N−k) for a length k block	figure 2	figure 3a–d	same as Moran
dual circular flows	theorem (5.1)	N/(N−n_s) for level s	figure 6	same as biased cycle between levels	figure 5
partial bipartite	theorem (6.1), equation (6.5)	equation (6.6)	conjecture (6.2); figure 14; equations (6.11) and (6.12);	conjecture (6.2), equations (6.8) and (6.9)	figures 10–12, tables 1 and 2
single link	figure 17, equation (6.15)	equation (6.16)	figure 18	tables 3 and 4	figure 16

In terms of symmetries there are natural comparisons between the following pairs of hitting times: (h11, h55), (h33, h44), (h13, h54), (h31, h45), (h14, h53), (h41, h35), (h12, h56), (h15, h51), (h34, h43). Table 4 shows conditions for equality between members of each pair, noting that h13 can never equal h54.

Table 4.

Conditions for equality of pairs of hitting times from table 3.

h₁₁=h₅₅, h₃₃=h₄₄	q=p
h₃₁=h₄₅, h₅₃=h₁₄	q=3p4−p
h₁₂=h₅₆	q=9p8+p
h₃₄=h₄₃	q=3p/(4−p)
h₄₁=h₃₅	q=11p2−p+8−76p4−292p3−39p2+272p+645p2+20p−16
h₁₅=h₅₁	q=−2(4p2+p+4)+73p4−184p3−192p2+320p+64p2−26p+16

Conditions for equality of pairs of hitting times from table 3.

Discussion

Construction of graphic population models involves few conceptual problems, the procedure is well established: locate each population member, characterized by one or more variable conditions, at a vertex of a graph and label edges between vertices with some measure of interaction between linked individuals. Then define an updating process relating individual interactions to the dynamics of characteristic variables. While this sort of model can become mathematically complex, the conceptual framework is simple. Equation (2.1) treats the easiest case, in which each population member exhibits a binary-valued characteristic and the edge weights are just probabilities of interaction, coupled with birth–death updating. Solutions to this equation provide birth–death fixation probabilities for any given initial state. The problem is the combinatorial explosion that arises. If the population has size N, each population member is characterized by a vector of characteristics =(c1,…,c), and each of these characteristics c can exhibit one of m possible values, then the state space contains almost members. One way of dealing with this is through methods from statistical mechanics in which graphs are constructed according to algorithms for developing random, small world or other connections between vertices. In this paper, a different direction is taken—we study properties of some very simple graphs in the hope of eventually discovering connections to the statistics of larger graphs. In analogy, a molecular gas can be studied in terms of statistical mechanics by assuming that all molecules are point particles and using concepts such as temperature, entropy, pressure, free energy and so on; but it is also possible to study the structure of the individual molecules with an eye towards eventual discovery of connections between molecular structural properties and large-scale gas behaviour. Four classes of graphs were considered: biased cycles, dual circular flows, partial bipartite graphs and single-link graphs. For biased cycles, theorem (4.1) gives the characteristic polynomial of the Laplacian matrix and the corollary to this theorem expresses eigenvalues in terms of roots of unity for the special cases p=0 and p=1. Theorem (4.3) shows the value of the term 1−2p as a measure of cycle bias. Computation of hitting times for simple cases shows the characteristic form illustrated in figure 3, with maximum hitting times coming at p-values characterizing a ‘tipping’ point between clockwise and counter-clockwise motion for a random walk. Biased cycles provide the simplest example of the dual circular flow graphs and theorem (5.1) links the characteristic polynomial of the Laplacian of a circular flow to that of the biased cycles, with the term 1−2p again playing the role of measuring directional bias. Further generalization allowed the possibility of ‘tuning’ a flow to match specified parameters, exemplified by derivation of the conditions required to match any specified vertex temperature profile. Bipartite graphs are two-level dual circular flows and, as indicated in the Introduction section, the generalization to partial bipartite graphs leads to results of particular interest, showing a tight connection between conductances, Laplacian eigenvalues, vertex centralities, expected hitting times and single vertex fixation probabilities. These graphs also showed the possibility of counterintuitive behaviour in which fitness values less than 1 can result in enhanced selection, illustrated in figure 11 and table 1. While these tables show only the (2, 5) partial bipartite case, similar behaviour has been found for the (3, 4) case, although the ranges of p for which the behaviour appears, at least in preliminary investigation, are narrower. From table 1, for fixed q there appears to be two critical p-values, p− and p+, depending on q, such that in most cases with q<1/2 pattern A appears for pp+ and pattern B for p−fitness value that distinguishes between patterns A and B or between patterns C and D depends on both the q- and p-values. In table 1, pattern C shows up for cases in which q is relatively large while p is relatively small. This corresponds to a situation in which the five vertex is strongly connected internally, with weak linkages to the two vertex subset, while this latter subset is weakly connected internally with strong links to the five vertex subset. Under these conditions, a mutant with r<1 appearing in V2 will have a low probability of extinction and, if chosen for reproduction, a high probability of reproducing in V5 so that in a sense V2 may act as a root. Much further work is required, however, to develop any detailed understanding of this behaviour, as well as extending analysis to dual circular flows in which vertices in each level can influence each other. The final case consists of a two-component graph in which the linkage between components occurs only through a single vertex in each component. Laplacian eigenvalues, hitting times and communicability were determined, and fixation probabilities were found for the (3, 4) example, allowing comparison to the corresponding partial bipartite case. As indicated, partial bipartite and single-link graphs are the two extremes of a family of graphs that provide models of two populations communicating across smaller sets of representatives. Adjustment of connection parameters provides models for different levels of polarization in the two populations. Further work is required to analyse both the partial bipartite graphs and single-link graphs, as well as to consider more general cases in which linkages between two populations are through subsets S⊆V and S⊆V. Being able to characterize polarization within a population in this way could provide an important tool for studies of social interactions. Table 5 shows a general summary of results reported in the present paper. Summary of results (roman indicates general results, italic indicates limited exemplary cases).

11 in total

Simple graph models of information spread in finite populations.

Introduction

Measures of graph characteristics

Eigenvalues of the Laplacian

Subset conductances

Communicability

Expected hitting times

Biased cycles

Theorem 4.1

Corollary 4.2

Theorem 4.3

Dual circular flows

Theorem 5.1

Example 5.2

Example 5.3

Partial bipartite graphs

Theorem 6.1

Conjecture 6.2

Conjecture 6.3

Single-link graphs

Discussion

1. The spread of innovations in social networks.

2. Evolutionary dynamics on graphs.

3. Some features of the spread of epidemics and information on a random graph.

4. Communicability in complex networks.

5. Early appraisal of the fixation probability in directed networks.

6. Locating the source of diffusion in large-scale networks.

7. The effect of population structure on the rate of evolution.

8. Universality in network dynamics.

9. Universality of fixation probabilities in randomly structured populations.

10. Exact epidemic models on graphs using graph-automorphism driven lumping.

1. Martingales and the fixation time of evolutionary graphs with arbitrary dimensionality.

2. A process of rumour scotching on finite populations.