
Indirect influence in social networks as an induced percolation phenomenon.

Jiarong Xie, Xiangrong Wang, Ling Feng, Jin-Hua Zhao, Wenyuan Liu, Yamir Moreno, Yanqing Hu.

Abstract

Percolation theory has been widely used to study phase transitions in network systems. It has also successfully explained various macroscopic spreading phenomena across different fields. Yet, the theoretical frameworks have been focusing on direct interactions among nodes, while recent empirical observations have shown that indirect interactions are common in many network systems like social and ecological networks, among others. By investigating the detailed mechanism of both direct and indirect influence on scientific collaboration networks, here we show that indirect influence can play the dominant role in behavioral influence. To address the lack of theoretical understanding of such indirect influence on the macroscopic behavior of the system, we propose a percolation mechanism of indirect interactions called induced percolation. Surprisingly, our model exhibits a unique anisotropy property. Specifically, directed networks show first-order abrupt transitions as opposed to the second-order continuous transition in the same network structure but with undirected links. A mix of directed and undirected links leads to rich hybrid phase transitions. Furthermore, a unique feature of the nonmonotonic pattern is observed in network connectivities near the critical point. We also present an analytical framework to characterize the proposed induced percolation, paving the way to further understanding network dynamics with indirect interactions.


Keywords: behavioral contagion; indirect interactions; percolation; phase transition; social network

Year: 2022 PMID: 35217599 PMCID: PMC8892329 DOI: 10.1073/pnas.2100151119

Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 12.779

Percolation theory (1) is one of the most prominent frameworks within statistical physics. Initially developed (2, 3) to explain the chemical formation of large macromolecules, it has recently been used to study various dynamical processes in complex networks (4–9). Examples include the use of bond percolation (9, 10) to study the spread of rumors over online social media and outbreaks of infectious diseases in structured populations. Site percolation (4, 5, 11) has been employed to study the cascading failures of infrastructure networks (6, 12–16) and the resilience of protein–protein interaction networks (17). Likewise, bootstrap percolation (18), k-core percolation (19–21), and linear threshold percolation (7, 22–24) have enabled the study of the spreading of behaviors over social networks. Finally, the so-called explosive percolation (25) has allowed a better characterization of systems’ structural transitions when they are growing or can adapt, whereas core percolation (26, 27) has contributed significantly to insights into nondeterministic polynomial problems. Common to all these percolation models is that they have successfully described various important dynamical phenomena by considering different direct interactions (8, 9, 28) among network nodes; in particular, they have captured the behavior of network systems as given by phase transitions (4, 8, 9, 28, 29). Our study is motivated by recent evidence that there are many systems in which indirect interactions play a major role in the spreading dynamics (30–35). Such underlying indirect interactions have important implications not only for the dynamics of the system but also for the evolution and emergence of network structures. For example, Christakis and Fowler (30, 31) found that for the spreading of many social behaviors, such as drug (36) and alcohol addictions (37) and obesity (30), an individual’s influence can extend to people up to about three degrees of separation away (a friend of a friend’s friend). 
This phenomenon is widely known as “three degrees of influence” in social science. In ecological networks, Guimarães et al. (32, 33) discovered in 2017 that indirect effects contribute strongly to trait coevolution among reciprocal species, which can alter environmental selection and promote the evolution of species. Despite the ubiquity of indirect influence in various real-world systems, few studies have examined the exact mechanisms by which indirect influence occurs, or the relative strengths of direct and indirect influences. Here, based on empirical analyses of scientific collaboration networks, we reveal that indirect influence occurs through next-nearest neighbors and can be the dominant mechanism through which research interests change; by contrast, evidence of direct (nearest-neighbor) influence is relatively weak. On the theoretical front, however, there has so far been no percolation-based model that describes the underlying mechanism of indirect influence or how its macroscopic behavior differs from that of existing percolation models. For both regular and complex networks, percolation models such as bond, site, bootstrap, k-core, linear threshold, and core percolation are always based on direct interactions (8, 9, 28) among nodes. In essence, all of these models take into account only the existence and the strength of directly connected nodes, regardless of any indirect influence of other nodes. Hence, they are not suitable for describing the indirect mechanism. Here, we propose a percolation framework called induced percolation to theoretically study the impact of such an indirect mechanism on the whole system. Our results show that indirect interactions lead to a unique macroscopic behavior, characterized by anisotropy in the phase transitions and by spreading outcomes that differ from those of direct influence mechanisms. 
Specifically, we study the most general scenario in which links can have directions and report that varying the links’ directionality could change the order of the phase transition. This is in sharp contrast to previous percolation models, for which the nature of the phase transitions is not affected by the directionality of links. Such rich phase transition behavior is further illustrated in our simulations on empirical networks. To the best of our knowledge, the phenomenon of directionality-related order of the phase transitions only exists in some special cases of core percolation (27), whereas it is shown to be a generic feature in our indirect interaction model.

Results

Empirical Indirect Influence Mechanisms.

To investigate the exact mechanism of neighboring influence and its direct/indirect nature in empirical networks, we study collaboration networks of scientists. Here the “behavior” is the research field(s) of a scientist, and the “spreading of behavior” is defined as the propensity of the scientist to stay in his/her established field or shift to an emerging field. We then study how scientists’ research fields are influenced by their direct (nearest) neighbors and indirect (next-nearest) neighbors. We choose four pairs of fields in physics that have large numbers of scientists involved: chaos vs. complex networks, phase transitions vs. complex networks, electrical properties of low-dimensional structures vs. optical properties of low-dimensional structures (hereinafter referred to as EPLDS vs. OPLDS), and carbon nanotubes vs. graphene. For each pair of fields, the latter field is the emerging (new) field that attracts scientists from the former (old, already established) field. Specifically, we analyze the datasets of articles published by the American Physical Society (APS) (38) and Web of Science (39), considered as representative data sources for the studied fields (we share all the data of this study at https://github.com/Jia-Rong-Xie). Based on articles in each pair of fields, covering in total 5 y around the emergence of a new field (see Table 1), we construct a collaboration network. The nodes are the scientists, and a link is established between any pair of scientists who have at least one joint publication. Scientists who have published multiple articles in the old field (at least two in the APS dataset and five in the Web of Science dataset; refer to the extended discussion for parameter robustness) yet have not published any articles in the new field are defined as focused scientists in the old field. They are assumed to be the influencers in the networks and are labeled as state 1. 
For every other (influenced) node in the networks, we calculate the number of direct and indirect “influencers.” The number of direct influencers of node i is simply the number of its nearest neighbors in state 1; we refer to it as the direct influence index (also called the k-core index). For the number of indirect influencers of node i, we first identify its state 1 neighbors. For each of these direct influencers (direct state 1 neighbors), we then count the number of their own state 1 neighbors, and the maximum count is defined as the number of indirect influencers m (also called the induced index). For each of the direct influencers of node i, we further count its degree and define the maximum of these degrees as the degree index d. A visual illustration of the definitions is shown in Fig. 1.
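The three indices defined above can be illustrated with a minimal Python sketch. It assumes an undirected network stored as a dictionary of neighbor sets and excludes node i itself when counting the state 1 neighbors of an influencer, as in the Fig. 1 example; the function name `influence_indices` and the toy graph are hypothetical and are not taken from the authors' code or data.

```python
def influence_indices(adj, state, i):
    """Return (k_core_index, induced_index, degree_index) for node i.

    adj   : dict mapping each node to the set of its neighbors (undirected)
    state : dict mapping each node to 0 or 1 (1 = focused scientist)
    """
    # Direct influencers: nearest neighbors of i that are in state 1.
    influencers = [j for j in adj[i] if state.get(j, 0) == 1]
    k_core_index = len(influencers)
    # Induced index m: the largest number of state 1 neighbors held by
    # any direct influencer, not counting node i itself.
    induced_index = max(
        (sum(1 for u in adj[j] if u != i and state.get(u, 0) == 1)
         for j in influencers),
        default=0,
    )
    # Degree index d: the largest degree among the direct influencers.
    degree_index = max((len(adj[j]) for j in influencers), default=0)
    return k_core_index, induced_index, degree_index


# Hypothetical toy graph: i has two state 1 neighbors, l and j,
# and j has three further state 1 neighbors (a, b, c).
toy_adj = {
    "i": {"l", "j"}, "l": {"i"}, "j": {"i", "a", "b", "c"},
    "a": {"j"}, "b": {"j"}, "c": {"j"},
}
toy_state = {"i": 0, "l": 1, "j": 1, "a": 1, "b": 1, "c": 1}
print(influence_indices(toy_adj, toy_state, "i"))
```

In this toy graph the k-core index of i is 2, its induced index is 3, and its degree index is 4 (the degree of j).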

Table 1.

Description of bibliographic datasets used in the empirical studies

Established field (PACS)	Emerging field (PACS)	No. of authors	No. of edges	Network construction period	Field emerging period	Observation period
Chaos (05.45)	Complex networks (89.75)	1,833	3,128	1999–2003	2001–2003	2004–2006
Phase transitions (64.60)	Complex networks (89.75)	1,265	2,864
EPLDS (73.20)	OPLDS (78.67)	2,069	5,900
Carbon nanotubes (39)	Graphene (39)	20,011	110,041	2009–2013	2011–2013	2014–2016

The first and second columns are names and Physics and Astronomy Classification Scheme (PACS) numbers (except carbon nanotubes and graphene) for the four pairs of research fields. The third to fifth columns are the number of authors, the number of edges, and the period used to construct collaboration networks, respectively. The sixth column is the period during which a new field emerges and “focused” scientists are specified. The seventh column is the period used to observe scientists’ behavioral change and to calculate the indicator Q.

Fig. 1.

Indirect influence mechanism in empirical collaboration networks. (A) Schematic representation for the induced index m, the k-core index, and the degree index d. In this example, “focused” scientists in the established field are denoted as state 1. Node i has an induced index m = 3 because among all direct neighbors in state 1, node j has the maximum number of neighbors in state 1 (i.e., three, excluding node i). The degree index of node i is the degree of node j. The k-core index of node i is 2, which is the number of direct neighbors in state 1 (nodes l and j). (B) Empirical evidence of indirect influence, showing a clear indirect influence mechanism in four pairs of established and emerging fields in physics that have large numbers of scientists involved. The proportion Q of publications in the established fields significantly increases with the scientists’ induced index m in all the datasets. To compare with direct influence, the orange lines in C show that the value of Q is hardly affected by the direct influence measured through the k-core index, while the scientists with a higher indirect influence index (top 50% of m values) clearly have a higher Q value than those with a lower indirect influence index (bottom 50% of m values), indicating a strong indirect influence. D highlights four sample scientists (nodes) labeled h, i, j, and l. Each orange node is a node of interest, its connected green nodes are its neighbors in state 1, and pink nodes are the green nodes’ state 1 neighbors used to calculate the induced index m. Nodes h and j, which have higher induced indices, publish a much higher proportion of old-field articles than node i, which has a lower induced index, although i’s k-core index is higher than that of h and the same as that of j. Comparing node l with i and j again indicates that the influence is stronger through the induced index m than through the k-core index. A similar comparison in E and F shows that the proportion Q is hardly affected by the degree index but clearly affected by the induced index. 
C–F show results performed on the collaboration network of carbon nanotubes vs. graphene. Note that in F we also show the state 0 nodes labeled in blue, since the calculation of degree index considers them.

Within the next 3 y (see Table 1), we count the publications of each influenced node i and calculate the proportion Q of articles in the old field as

$$Q_i = \frac{n_i^{\mathrm{old}}}{n_i^{\mathrm{total}}},$$

where $n_i^{\mathrm{old}}$ and $n_i^{\mathrm{total}}$ represent the number of papers in the old field and the total number of papers published by scientist i during the observation period, respectively. A higher Q value of the influenced node i then indicates that it receives more influence from the “influencers” in state 1, either directly or indirectly. Our results in Fig. 1 clearly show that Q increases with the indirect influence index m, yet not so much with the direct influence index (also called the k-core index; see Fig. 1 C and D), the degree index d (see Fig. 1 E and F), or the second-nearest degree index. This indicates that rather than direct influence, indirect influence plays a dominant role in the choice of research focus among scientists. Indeed, we find that nodes i and h, via node j (see Fig. 1), are more likely to coauthor publications in the old field, which means that the quantitative correlation between Q and m does mediate the collaboration relationship. 
The observed indirect influence mechanism in empirical collaboration networks is possibly due to the following two factors. First, a scientist with a high induced index collaborates with a highly active scientist (a state 1 neighbor who is in turn connected to a large number of state 1 neighbors). This active scientist could strongly influence collaborators. Second, researchers who collaborate with highly active scientists have a better chance of finding new potential collaborators through these connections than researchers who have no highly active neighbors. In other words, scientists with a high induced index can interact with researchers of the same field indirectly, through their highly active neighbors.
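As a rough illustration of how the trend of Q with the induced index m could be tabulated, the following Python sketch computes Q for each influenced scientist as the ratio of old-field papers to all papers in the observation period and averages it per value of m. The record layout `(m, n_old, n_total)`, the function names, and the numbers are hypothetical toy choices, not the authors' data or code.

```python
from collections import defaultdict


def old_field_proportion(n_old, n_total):
    """Proportion Q of old-field papers; None if nothing was published."""
    if n_total == 0:
        return None
    return n_old / n_total


def mean_q_by_induced_index(records):
    """Average Q per induced index m.

    records : iterable of (m, n_old, n_total) tuples, one per scientist.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for m, n_old, n_total in records:
        q = old_field_proportion(n_old, n_total)
        if q is None:
            continue  # scientists without publications carry no Q value
        sums[m] += q
        counts[m] += 1
    return {m: sums[m] / counts[m] for m in sorted(counts)}


# Hypothetical toy records: (induced index, old-field papers, all papers).
toy_records = [(1, 1, 4), (1, 3, 4), (3, 4, 4), (2, 0, 0)]
print(mean_q_by_induced_index(toy_records))
```

Scientists with no publications in the observation period are skipped here, which is one possible convention; the text does not say how such cases are handled.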

Induced Percolation Model.

We now define a percolation model based only on this indirect influence mechanism, characterized by the induced index m. As shown empirically above, the indirect influence increases with m. Here we present the most simplified version of this influence mechanism, which assumes a deterministic influence outcome: a node i is influenced to state 1 with probability 1 if its induced index is not smaller than a threshold m (a slightly more complicated case is also considered). More formally, induced percolation can be defined on directed networks as follows. Let us assume that the state of each node is characterized by an integer value, 0 or 1. Initially, we set the state of all nodes in the network to 1. A node i remains in state 1 if at least one of its incoming links comes from a node, say j, in state 1, and node j in turn has at least m other incoming links from nodes that are in state 1; see Fig. 2A for an illustration of the case m = 2. Otherwise, node i changes to state 0 at the next time step. The influence of these m nodes on node i defines the indirect interactions among them. Under this mechanism, certain nodes change their states from 1 to 0 at each time step until no more changes are possible; see Fig. 2 B and C for an example. Compared with bond, bootstrap, or k-core percolation, the fundamental difference of induced percolation is that the current state of a node is affected not only by its nearest neighbors but also by a number of its next-nearest neighbors. The mechanism for induced percolation through a network captures the observation that there are behaviors whose influence reaches nodes beyond the first shell.
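The pruning rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the network is given as a dictionary of incoming-neighbor sets with every node present as a key, updates all nodes synchronously, and interprets "m other incoming links" as incoming links of j from state 1 nodes other than i. The function name `induced_percolation` is hypothetical.

```python
def induced_percolation(in_nbrs, m):
    """Return the set of nodes that remain in state 1 at the steady state.

    in_nbrs : dict mapping every node to the set of nodes pointing to it
    m       : threshold on the number of active in-neighbors of a supporter
    """
    active = set(in_nbrs)  # initially every node is in state 1
    while True:
        survivors = set()
        for i in active:
            for j in in_nbrs[i]:
                if j not in active:
                    continue
                # Active in-neighbors of j, not counting i itself
                # (an assumption about the meaning of "other").
                support = sum(1 for u in in_nbrs[j] if u in active and u != i)
                if support >= m:
                    survivors.add(i)
                    break
        if survivors == active:  # no node changed state: steady state
            return active
        active = survivors


# Complete directed graph on four nodes: every node points to every other.
complete = {v: {u for u in range(4) if u != v} for v in range(4)}
print(induced_percolation(complete, 2))  # all four nodes survive
print(induced_percolation(complete, 3))  # every node is pruned
```

Because a node can only switch from 1 to 0, the set of active nodes shrinks monotonically and the loop always terminates.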

Fig. 2.

Induced percolation on directed networks. A illustrates the proposed mechanism of induced percolation for the case m = 2. In order for a node i to remain in state 1, at least one node (j) at the other end of an incoming link should be in state 1. In turn, j should also have at least m (m = 2 in the example) incoming links from neighbors that are in state 1. B shows a directed graph of eight nodes, all in state 1. C shows the GOUT at the equilibrium state when the graph in B is pruned according to the induced percolation rules. D and E illustrate the variables x and y defined in the main text. F and G show the relationship between the order parameters GSCC, GIN, and GOUT for induced percolation and typical bond percolation processes, respectively. H schematically represents the multilayer representation employed to derive the order parameter when there are directed and undirected links in the substrate network.

In network percolation theory, the giant strongly connected component (GSCC), giant in-component (GIN), and giant out-component (GOUT) are three main order parameters. In particular, GSCC refers to the largest strongly connected component whose size is comparable to the entire network. GIN is the group of nodes from which any node in GSCC can be reached, while GOUT is the group of nodes that can be reached from any node in GSCC. For various types of propagation dynamics on networks, the GOUT corresponds to the largest spreading coverage, and it serves as an indicator of network connectivity under a given propagation mechanism. The size of the GOUT in the empirical studies corresponds to the number of scientists who stay in the old field. Therefore, in induced percolation, the main quantity of interest is GOUT (8, 28, 29), and the size of GOUT is the order parameter, i.e., the macroscopic quantity that characterizes phase transitions. In addition, we also examine the size distribution of small outgoing components. 
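The three components can be extracted from a directed graph with plain breadth-first searches, as in the following Python sketch. It is a simple illustration under stated assumptions: the graph is a dictionary of outgoing-neighbor sets, the largest strongly connected component stands in for the GSCC (the notion of "giant" is only meaningful for large networks), and each strongly connected component is found by intersecting forward and backward reachability, which is easy to read but slow on large graphs. The names `bow_tie` and `_reach` are hypothetical.

```python
from collections import deque


def _reach(start_nodes, nbrs):
    """All nodes reachable from start_nodes by following nbrs (BFS)."""
    seen = set(start_nodes)
    queue = deque(start_nodes)
    while queue:
        v = queue.popleft()
        for w in nbrs.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def bow_tie(out_nbrs):
    """Return (largest SCC, in-component, out-component) of a directed graph."""
    nodes = set(out_nbrs)
    for targets in out_nbrs.values():
        nodes.update(targets)
    # Build the reversed adjacency for backward searches.
    in_nbrs = {v: set() for v in nodes}
    for v, targets in out_nbrs.items():
        for w in targets:
            in_nbrs[w].add(v)
    largest_scc, unassigned = set(), set(nodes)
    while unassigned:
        v = next(iter(unassigned))
        # SCC of v = nodes reachable from v that can also reach v.
        scc = _reach([v], out_nbrs) & _reach([v], in_nbrs)
        unassigned -= scc
        if len(scc) > len(largest_scc):
            largest_scc = scc
    gout = _reach(largest_scc, out_nbrs)  # reachable from the SCC
    gin = _reach(largest_scc, in_nbrs)    # nodes that can reach the SCC
    return largest_scc, gin, gout


# Toy graph: a -> b <-> c -> d
print(bow_tie({"a": {"b"}, "b": {"c"}, "c": {"b", "d"}, "d": set()}))
```

With this convention both the in-component and the out-component include the strongly connected component itself.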
In undirected networks, each link can be viewed as two directed links with opposite directions. Therefore, induced percolation can be studied on fully directed networks, and the methodology can then be extended either to undirected (i.e., fully bidirectional) networks or to networks in which there are both bidirectional and unidirectional links. Note that the GOUT of undirected networks is the same as the GIN and the GSCC. We schematically illustrate the proposed induced percolation mechanism on directed networks in Fig. 2, where we also show the order parameter as compared with the one typically used in bond percolation. Similar diagrams exist for undirected and mixed networks. The phase transition that characterizes the induced percolation process can be analytically studied on random networks. The class of random directed networks is constructed by independently connecting two arbitrary nodes with a directed link with a fixed probability. The network can be described by the joint degree distribution $P(k_{\mathrm{in}}, k_{\mathrm{out}})$, which is the probability that a randomly selected node has in-degree $k_{\mathrm{in}}$ and out-degree $k_{\mathrm{out}}$. For random directed networks, the size of GOUT is derived through the following recursive equations. We first define two recursive variables x and y (see Fig. 2 D and E): x represents the probability that, when selecting a directed link at random, the node at the origin of the link is active (in state 1), whereas y represents the probability that a random link enables its end node to be in an active state. According to the definitions of x and y, we have

$$x = \sum_{k_{\mathrm{in}},\,k_{\mathrm{out}}} \frac{k_{\mathrm{out}}\,P(k_{\mathrm{in}},k_{\mathrm{out}})}{\langle k\rangle}\left[1-(1-y)^{k_{\mathrm{in}}}\right],$$

where $\langle k\rangle$ is the average degree, $(1-y)^{k_{\mathrm{in}}}$ is the probability that none of the $k_{\mathrm{in}}$ incoming links can keep node j in state 1 (see Fig. 2), and $1-(1-y)^{k_{\mathrm{in}}}$ represents the probability that at least one of the incoming links can keep node j in state 1. The term $k_{\mathrm{out}}P(k_{\mathrm{in}},k_{\mathrm{out}})/\langle k\rangle$ is the excess incoming degree distribution (28, 29) for the node at the origin of an arbitrary directed link. 
This is because the likelihood of a node’s being the origin of a randomly chosen directed link is proportional to the node’s out-degree. Calculating the probability y is a little more involved. The definition of the induced percolation process implies that even if the starting node of a directed link is active (which happens with probability x), it is not guaranteed that the end node of this directed link remains active (which happens with probability y). However, if the starting node of this directed link is itself active, and at the same time at least m neighbors pointing to the starting node are active, then this directed link can keep its end node active. Conversely, if a directed link can keep the node it points to active (corresponding to y), then the starting node of this directed link must be active (corresponding to x). Therefore, $y \le x$ must hold when m > 1 ($y = x$ when m = 1, which corresponds to bond percolation). The above analysis yields the expression of y as

$$y = \sum_{k_{\mathrm{in}},\,k_{\mathrm{out}}} \frac{k_{\mathrm{out}}\,P(k_{\mathrm{in}},k_{\mathrm{out}})}{\langle k\rangle} \sum_{s=m}^{k_{\mathrm{in}}} \binom{k_{\mathrm{in}}}{s} x^{s}(1-x)^{k_{\mathrm{in}}-s}\left[1-\left(1-\frac{y}{x}\right)^{s}\right],$$

where $\binom{k_{\mathrm{in}}}{s} x^{s}(1-x)^{k_{\mathrm{in}}-s}$ gives the probability that, for a node of incoming degree $k_{\mathrm{in}}$, s out of its $k_{\mathrm{in}}$ neighbors are active. The ratio $y/x$ represents the conditional probability that a directed link keeps its end node active, given that the starting node is active. Therefore, $1-(1-y/x)^{s}$ is the probability that at least 1 out of the s active incoming neighbors keeps this node active. Finally, the order parameter, i.e., the size of GOUT (denoted $P^{\infty}$), can be calculated from x and y as follows:

$$P^{\infty} = \sum_{k_{\mathrm{in}},\,k_{\mathrm{out}}} P(k_{\mathrm{in}},k_{\mathrm{out}})\left[1-(1-y)^{k_{\mathrm{in}}}\right].$$

Here $1-(1-y)^{k_{\mathrm{in}}}$ is equivalent to the probability that a randomly chosen node has at least one incoming node to keep it active. One interesting finding worth highlighting is that the GSCC coincides with the GIN for the induced percolation process on directed networks, which is not the case for classical percolation models (see Fig. 2 F and G). The theoretical analysis of the order parameter on undirected networks is illustrated in Methods. We also note that the analysis of $P^{\infty}$ on mixed networks can be done by mapping the structure to a multilayer network; see Fig. 2H.
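To make the recursion for x and y concrete, the following Python sketch iterates it numerically for one special case. Everything specific here is an assumption for illustration: in- and out-degrees are taken to be independent Poisson variables with mean c (a directed Erdős–Rényi graph), so that the recursion reduces to x = 1 − exp(−c·y) and y = Σ_{s≥m} Pois(s; c·x)[1 − (1 − y/x)^s], with the relative size of GOUT equal to 1 − exp(−c·y). The iteration starts from x = y = 1, mirroring the initial condition in which all nodes are in state 1. The function name `gout_size_directed_er` is hypothetical.

```python
import math


def gout_size_directed_er(c, m, tol=1e-12, max_iter=100000):
    """Relative size of GOUT on a directed ER graph with mean degree c.

    Assumes independent Poisson in- and out-degrees and m >= 1.
    """
    x, y = 1.0, 1.0  # all nodes start in state 1
    for _ in range(max_iter):
        x_new = -math.expm1(-c * y)  # 1 - exp(-c*y)
        if x_new < 1e-12:
            return 0.0  # the active fraction has collapsed
        ratio = min(1.0, y / x_new)  # conditional probability y/x
        lam = c * x_new  # mean number of active in-neighbors
        s_max = int(lam + 10.0 * math.sqrt(lam) + 30)  # truncate Poisson tail
        pmf = math.exp(-lam)  # Poisson probability of s = 0
        y_new = 0.0
        for s in range(1, s_max + 1):
            pmf *= lam / s  # Poisson probability of s
            if s >= m:
                y_new += pmf * (1.0 - (1.0 - ratio) ** s)
        converged = abs(x_new - x) + abs(y_new - y) < tol
        x, y = x_new, y_new
        if converged:
            break
    return -math.expm1(-c * y)


for c in (1.0, 2.0, 10.0):
    print(c, gout_size_directed_er(c, 1), gout_size_directed_er(c, 2))
```

For m = 1 the recursion collapses to the familiar bond percolation condition S = 1 − exp(−c·S), which offers a quick sanity check of the sketch; near a discontinuous transition the fixed-point iteration may converge slowly.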

Phase Transitions of Induced Percolation.

Theoretical analyses allow us to show that the type or order of the phase transition depends on the directionality of the links for the same network connectivity pattern, i.e., the phase transition is anisotropic in nature. On directed networks, when m > 1 (m = 1 is the case of typical bond percolation), induced percolation shows discontinuous (first-order) phase transitions (see Fig. 3 A and B, and Fig. 4 for real-world networks). Yet, on undirected networks, the same percolation process always leads to continuous (second-order) phase transitions (see Fig. 3 D and E, and Fig. 4 for real-world networks). These results are in sharp contrast with previous percolation models on networks (see Table 2), for which it has never been found that the directionality of network links fundamentally alters the type of phase transition. This means that previously studied types of percolation models might have significantly underestimated the effects of asymmetry in link directions on the system’s macroscopic behavior. An important implication of this observation is that abrupt transitions in complex systems like ecological and social networks might be far more likely to occur than previously anticipated by existing percolation models.

Fig. 3.

Fig. 4.

Order parameter GOUT for induced percolation on empirical networks based on datasets in Table 1. A–D show GOUT as a function of the proportion λ of remaining links. Each point of GOUT is computed as the steady state of induced percolation (m = 3) on real-world networks after randomly removing a fraction 1 − λ of links. The collaboration network is constructed based on articles published within the first 5 y in the four pairs of fields described in Table 1. The directed part of a citation network is obtained by removing all the bidirectional links from the citation network. For the four pairs of fields studied, GOUT generally agrees well with the main findings: continuous, discontinuous, and hybrid phase transitions for undirected collaboration networks (blue diamonds), the directed part of citation networks (purple squares), and mixed citation networks (orange circles), respectively.

Table 2.

Comparison of percolation models

Percolation model	Phase transition (undirected)	Phase transition (directed)	Cluster size distribution near critical point	β	Hybrid phase transition at critical point
Induced percolation	Second	First	Nonmonotonic	1 (second), 1/2 (first)	θ = 1/3, η = 1/2
Bond percolation (5, 8, 9, 28)	Second	Second	Monotonic	1	—
Site percolation (5, 8, 9, 17, 28)	Second	Second	Monotonic	1	—
Bootstrap percolation (18)	Second/first	—	Monotonic	1 (second), 1/2 (first)	θ = 1/3, η = 1/2
k-core percolation (19)	Second/first	Second/first	—	1 (second), 1/2 (first)	—
Core percolation (26, 27)	Second	Second/first	—	1 (second), 1/2 (first)	—
Explosive percolation (40, 43, 44)	Second	—	—	0.0555	—
Articulation percolation (45)	Second/first	—	—	1 (second), 1/2 (first)	—

Dashes indicate that no related research has been found.

Order parameter GOUT for induced percolation on directed and undirected random networks. The symbols represent simulation results and the curves are the corresponding theoretical results. A and B show GOUT for induced percolation on directed scale-free (SF) and Erdős–Rényi (ER) networks as a function of the average degree ⟨k⟩. Results are compared with the behavior of the same order parameter for bond percolation (equivalent to setting m = 1). C shows the graphical solution of the self-consistent equation for induced percolation (m = 2) on directed ER graphs, marking the critical average degree at which a first-order phase transition takes place. D and E show results for undirected networks, whereas the graphical solution shown in F is derived from the corresponding equation (see Methods) for induced percolation (m = 2) on undirected ER graphs. Directed SF networks are generated by the static model (41, 42), with separate exponents for the incoming and outgoing degree distributions. Undirected SF networks are generated with a single exponent.
The anisotropy induced by the directionality of the links leads to rich and complex behavior when the network is composed of a mixture of directed and undirected links. Specifically, a hybrid phase transition emerges in the presence of a certain amount of directed links. Fig. 5 shows that, as the fraction p of directed links in the network increases, the dependence of the order parameter GOUT on the average degree evolves from a continuous transition, to a hybrid phase transition in which both continuous and discontinuous transitions exist, to a first-order transition for larger values of p. In addition, in the region where the hybrid phase transition is observed, several quantities follow a set of scaling relations with critical exponents that are in line with Landau’s mean-field theory.
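A mixed network with a fraction p of directed links could be produced from an undirected edge list along the following lines. This Python sketch makes two assumptions that are not spelled out in the text: each undirected link is converted independently with probability p (so the realized fraction only equals p on average), and a converted link receives a uniformly random orientation. Unconverted links are kept as two opposite directed links. The function name `mix_directions` is hypothetical.

```python
import random


def mix_directions(edges, p, rng=None):
    """Turn an undirected edge list into a list of directed arcs.

    edges : iterable of (u, v) pairs, one per undirected link
    p     : probability that a link becomes unidirectional
    rng   : optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    arcs = []
    for u, v in edges:
        if rng.random() < p:
            # Directed link with a random orientation (an assumption).
            arcs.append((u, v) if rng.random() < 0.5 else (v, u))
        else:
            # Undirected link represented as two opposite directed links.
            arcs.append((u, v))
            arcs.append((v, u))
    return arcs


toy_edges = [(0, 1), (1, 2), (2, 3)]
print(len(mix_directions(toy_edges, 0.0)))  # 6 arcs: all links bidirectional
print(len(mix_directions(toy_edges, 1.0)))  # 3 arcs: all links unidirectional
```

Passing a seeded `random.Random` makes a given realization reproducible, which helps when averaging over many realizations.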

Fig. 5.

Phase transitions and critical behaviors of induced percolation on mixed networks. In A, we show theoretical and numerical results for GOUT as a function of the average degree ⟨k⟩ when the fraction of directed links is varied. The point C denotes the point at which coexistence of second- and first-order phase transitions occurs for the first time. The curved dotted line represents the value of GOUT before and after the first-order phase transition. The symbols represent simulation results and the curves are the corresponding theoretical results. B shows the values of the critical points in the parameter space made up by the average degree and the percentage of directed links; the dotted line describes the critical value at which a second-order phase transition occurs, while the solid line corresponds to the first-order phase transition. Dots correspond to critical points C. C represents the types of phase transitions that can be observed in the m–p plane. Blue, purple, and green colors bound the areas in which second-order, hybrid, and first-order phase transitions exist, respectively. The red boundary lines between the blue and the purple areas correspond to the critical points C. When the parameters lie on the red line, the behavior of GOUT corresponds to the green line marked with point C in A. D shows the types of phase transitions shown in C but in another parameter plane. E presents results for the jump size of GOUT as a function of the distance to the critical point C, approached either from below or from above. F depicts the change of GOUT near the critical point as a function of the average degree, when fixing p at its critical value. The mixed network is generated by assigning a percentage p of directed links to an undirected ER network with average degree ⟨k⟩ and consists of 10^6 nodes.

We label the critical hybrid point where the hybrid transition first appears as point C in Fig. 5. 
We find a set of scaling relations connecting GOUT to other quantities near C that are predicted by Landau’s mean-field theory. Within the hybrid transition, the jump height of GOUT, , where k is the critical point at which the first-order transition occurs, follows a scaling function of with the critical exponent (Fig. 5): The same critical exponent holds for the jump height as a scaling function of as shown in the . When fixing p at and varying in the vicinity of , the size deviation of GOUT is quantified by the following scaling function of with critical exponent (Fig. 5), reached from both below and above: We note that Baxter et al. also find these two critical exponents in k-core percolation (20).

Another unexpected feature that distinguishes the percolation process formulated here from other percolation models is the cluster size distribution near criticality. Typically, for second-order phase transitions, the size distribution of small connected clusters in the vicinity of the transition point is governed by the monotonic function P(s) ∼ s^{-τ} e^{-s/s*}, where s* provides a characteristic size of the finite components (4, 10). The closer to the critical point, the larger s* will be. At the exact phase transition point, s* approaches infinity and P(s) exhibits a monotonic power-law distribution, P(s) ∼ s^{-τ}, signifying a loss of characteristic scale in the distribution. However, for induced percolation on undirected networks, we find that near the critical point P(s) exhibits a novel oscillatory-like behavior, i.e., it is no longer monotonically decreasing with s (see Fig. 6).
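To make the quantity P(s) concrete, the following minimal sketch measures the size distribution of finite clusters from a list of links. It is not the authors' procedure: the helper names are hypothetical, P(s) is taken here to be the fraction of finite clusters that have size s, and the single largest cluster is treated as the giant one and excluded, all of which are assumptions made for illustration.

```python
from collections import Counter


def component_sizes(n, links):
    """Sizes of all connected components, computed with union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in links:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    return list(Counter(find(x) for x in range(n)).values())


def finite_cluster_distribution(sizes):
    """Fraction of finite clusters of each size s.

    Assumption: the single largest cluster plays the role of the giant
    component and is left out of the statistics.
    """
    finite = sorted(sizes)[:-1]
    if not finite:
        return {}
    counts = Counter(finite)
    return {s: c / len(finite) for s, c in sorted(counts.items())}


if __name__ == "__main__":
    links = [(0, 1), (1, 2), (3, 4)]
    sizes = component_sizes(7, links)
    print(finite_cluster_distribution(sizes))
```

Checking whether the resulting values decrease monotonically with s is one simple way to distinguish the classical behavior from the oscillatory-like pattern described above.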

Fig. 6.

Size distribution, P(s), of small clusters at the critical point of induced percolation () on undirected networks. In A we show the size distribution, P(s), which exhibits a fluctuating behavior especially for small sizes. B plots the same distribution P(s) but as a function of the average degree , showing an unambiguous nonmonotonic decrease of the size distribution. C and D depict the monotonic power-law decay of the cluster size distribution in the limit of classical bond percolation. Finally, E displays the cluster size distribution at the critical point , also showing the structure of each cluster. Results are averaged over 10^3 independent realizations of undirected Erdős–Rényi networks (of size 10^6 nodes). As can be clearly seen, the critical behavior of induced percolation is different from that of the classical one.

As can be seen in Fig. 6, the observed oscillatory-like behavior of P(s) is more pronounced for small values of s and changes neither the asymptotic power-law distribution for large s nor the critical exponent of the phase transition, which is the same as in bond percolation, β = 1 (40). This behavior of P(s) is, however, clearly distinct from the classical monotonic distribution (see Fig. 6). We do not yet have a clear notion of the exact impact of this pattern on the macroscopic behavior of the system, a question to be examined in future work.

Conclusion and Discussion

Let us first mention that, in addition to our empirical results on collaboration networks, we believe that the induced percolation mechanism could play a relevant role in other examples of behavioral influence or contagion, such as the spreading of drug abuse, alcoholism, obesity, divorce, happiness, and loneliness, among others. These examples are usually cited to illustrate the “three degrees of influence” mechanism, that is, one individual’s influence can spread significantly to their friends’ friends’ friends. However, the specific spreading mechanisms behind this phenomenon remain unknown and lack a theoretical, first-principles foundation. Although our empirical work reveals only one mechanism of influence within two degrees, we believe that it can be regarded as a first step toward a specific spreading mechanism for the “three degrees of influence” and that it potentially opens new paths in the field of percolation on networked systems.

Based on our empirical discovery that indirect influence can dominate over direct influence, we have proposed an induced percolation model to characterize the dynamics and outcomes of this indirect spreading mechanism. We found that such indirect interactions lead to a plethora of percolation transitions in complex networks that are rooted in the degree of anisotropy of the connectivity pattern. Specifically, we have shown that the fraction of directed links in a network determines the order of the phase transition, which ranges from second order in networks without directed links to first order when all links are directed. In between, a rich behavior associated with hybrid phase transitions emerges, in which second- and first-order phase transitions coexist. In addition, the indirect effect makes the size distribution of small clusters near the phase transition point exhibit a nonmonotonic pattern, which has not been previously seen in other percolation models.

Our theoretical framework provides the tools to investigate the implications of different indirect influence mechanisms in a spreading phenomenon and to understand their associated dynamical processes and macroscopic spreading outcomes. For instance, we have found that indirect influence can dominate over direct influence in scientific collaboration networks. If similar mechanisms also hold for other social behaviors, such as drug abuse and alcoholism, this would imply mitigation policies very different from those based on direct influence mechanisms.

Methods

Induced Percolation on Undirected Networks.

We elaborate on the definition and the theoretical derivation of induced percolation on undirected networks. All nodes in an undirected network are initially set to state 1. A node l remains in state 1 if at least one of its undirected links leads to a node j in state 1, and this node j has at least m neighbors (excluding the node l) in state 1 (as illustrated in for the case of m = 2); otherwise, node l changes to state 0 at the next time step. We refer to a node in state 1 as an active node and to a node in state 0 as an inactive node.

To theoretically analyze the percolating probability that any node belongs to a GCC (giant connected component, equivalent to GOUT), , we start by defining six conditional probabilities as intermediate variables, whose notations are shown collectively in . Without loss of generality, we denote a randomly chosen undirected link as and deduce the probability that node l belongs to a GCC. According to the definition of induced percolation for undirected networks, the condition for node l to remain active is that there is at least one active neighbor j, and the number of active neighbors (except node l) of node j is at least m. Unlike active neighbors in directed networks, the number of active neighbors in undirected networks is closely related to the degree k of node j. Specifically, if k > m and node j is active, then node j can keep all its neighbors active. Conversely, if k ≤ m, then node j cannot keep any of its neighbors active. Hereafter, we employ the degree k instead of the number of active neighbors to derive the percolation probability. The conditional probability is the probability that node j can keep node l active, given that node l can keep j active. As per the definition of induced percolation, the event of j keeping l active implies that the degree of node j satisfies k > m. Node j simultaneously keeps all of its neighbors active.
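The update rule stated at the start of this section can be simulated directly. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the network is given as an adjacency list, all nodes are updated synchronously, a node that has switched to state 0 never returns to state 1, and the function names are hypothetical.

```python
from collections import deque


def induced_percolation_undirected(adj, m):
    """Iterate the induced-percolation rule to a fixed point.

    adj[u] lists the neighbors of node u. Returns a list of booleans
    (True = state 1). Assumptions: synchronous updates, state 0 is absorbing.
    """
    n = len(adj)
    active = [True] * n
    while True:
        # Number of active neighbors of every node under the current states.
        n_active = [sum(1 for v in adj[u] if active[v]) for u in range(n)]
        # Node l stays active if some active neighbor j has at least m
        # active neighbors other than l (l itself is active, hence the -1).
        updated = [
            active[l] and any(active[j] and n_active[j] - 1 >= m for j in adj[l])
            for l in range(n)
        ]
        if updated == active:
            return active
        active = updated


def largest_active_cluster(adj, active):
    """Size of the largest connected cluster made of active nodes only."""
    seen = [False] * len(adj)
    best = 0
    for start in range(len(adj)):
        if not active[start] or seen[start]:
            continue
        seen[start] = True
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if active[v] and not seen[v]:
                    seen[v] = True
                    queue.append(v)
        best = max(best, size)
    return best


if __name__ == "__main__":
    triangle = [[1, 2], [0, 2], [0, 1]]
    print(induced_percolation_undirected(triangle, m=1))  # all nodes survive
    print(induced_percolation_undirected(triangle, m=2))  # no node survives
```

Dividing the value returned by `largest_active_cluster` by the number of nodes gives a finite-size estimate of the order parameter that the analytical treatment in this section targets.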
The above analysis yields the following recursive equation: where represents the excess degree distribution of the end node of a randomly chosen link.

On the other hand, the conditional probability is defined as the probability that node j can keep node l active, and node l is connected to the GCC via node j, given that node l can keep j active. Again, as per the definition of induced percolation, the degree of node j satisfies k > m. Analogously, node j can keep all its neighbors active. In addition, the event that node l connects to the GCC through node j is equivalent to the event that node j connects to the GCC through at least one of its neighbors other than l. The corresponding probability is (as shown in ), where the probability accounts for the likelihood that one of the neighbors does not belong to the GCC given that node j can keep it active. Therefore, the self-consistent equation for the conditional probability can be written as

In the previous definition, we made use of the conditional probability , which is the probability that node j cannot keep node l active while node l connects to the GCC through node j, under the condition that node l maintains node j in state 1. Thus, it follows that the degree of node j satisfies k ≤ m and that node j cannot keep any of its neighbors active. Moreover, the event in which node l connects to the GCC through node j is equivalent to the event in which node j reaches the GCC through one of its neighbors other than l. The corresponding probability reads (as shown in ), where the probabilities stand for cases in which node j cannot keep any neighbors in state 1 (see below). Therefore, the conditional probability can be calculated using

Once the above probabilities have been defined, we can proceed with the derivation of the remaining three conditional probabilities, namely, , and , which are analogous to v, , and , but under the condition that node l cannot keep node j active.
The derivation of the probability is similar to , except that node j relies on at least one of its neighbors (except l) to remain active. This probability can be expressed as

The derivation of the conditional probability is similar to that of , except that one additional condition is required: of the neighbors different from l, at least one can keep j active and connected to the GCC. Assuming that there are exactly s () neighbors that can keep node j active, the probability is . The probability that node j is connected to the GCC through one of the s neighbors is . For the remaining neighbors that cannot keep node j active, the probability of j connecting to the GCC through one of them is . Therefore, the probability that node j is not connected to the GCC through any neighbor is , as shown in . Therefore, the self-consistent equation to derive the conditional probability is

Finally, the conditional probability can be obtained similarly to , with the additional consideration that, among the neighbors other than l, at least one can keep j active and that node j connects to the GCC via at least one of the neighbors. Thus, the degree of node j satisfies k > m, which also implies that j keeps all its neighbors active. Therefore, the conditional probabilities , and in Eq. are replaced by probabilities , and . This leads to the following expression for the conditional probability : where s represents the number of neighbors that can keep j active. The graphical solution of the self-consistent equation is shown in the main text, where and represents the expression on the right-hand side of Eq. . The value of is obtained by solving the self-consistent Eqs. –.

The previously defined conditional probabilities allow us to derive the order parameter, , for induced percolation on undirected networks. For an arbitrarily chosen node l to belong to the GCC, we have that 1) at least one of its neighbors should keep it active and 2) node l is attached to the GCC through at least one of its neighbors.
If the degree of node l satisfies , then the probability that node l belongs to the GCC is , whose derivation is similar to Eq. in . If the degree of node l satisfies , then the probability that node l belongs to the GCC is and the derivation is similar to in Eq. . Therefore, the order parameter can be computed, for undirected networks, as

29 in total

1. Network robustness and fragility: percolation on random graphs.

2. Error and attack tolerance of complex networks

3. Suppressing cascades of load in interdependent networks.

4. Catastrophic cascade of failures in interdependent networks.

5. k-Core organization of complex networks.

6. Inducing effect on the percolation transition in complex networks.

7. Influence maximization in complex networks through optimal percolation.

8. Detecting and modelling real percolation and phase transitions of information on social media.

9. The spread of alcohol consumption behavior in a large social network.

10. The spread of obesity in a large social network over 32 years.