Literature DB >> 27535087

Increasing returns to scale: The solution to the second-order social dilemma.

Hang Ye1,2, Shu Chen1, Jun Luo2, Fei Tan3, Yongmin Jia1, Yefeng Chen1.   

Abstract

Humans benefit from extensive cooperation; however, the existence of free-riders may cause cooperation to collapse. This is called the social dilemma. It has been shown that punishing free-riders is an effective way of resolving this problem. Because punishment is costly, this gives rise to the second-order social dilemma. Without exception, existing solutions rely on some stringent assumptions. This paper proposes, under very mild conditions, a simple model of a public goods game featuring increasing returns to scale. We find that punishers stand out and even dominate the population provided that the degree of increasing returns to scale is large enough; consequently, the second-order social dilemma dissipates. Historical evidence shows that people are more willing to cooperate with others and punish defectors when they suffer from either internal or external menaces. During the prehistoric age, the abundance of contributors was decisive in joint endeavours such as fighting floods, defending territory, and hunting. These situations serve as favourable examples of public goods games in which the degrees of increasing returns to scale are undoubtedly very large. Our findings show that natural selection has endowed human kind with a tendency to pursue justice and punish defection that deviates from social norms.

Entities:  

Mesh:

Year:  2016        PMID: 27535087      PMCID: PMC4989174          DOI: 10.1038/srep31927

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


It was only approximately 12,000 years ago, with the emergence of agricultural civilization in Mesopotamia, that individuals and families alike were able to play primary roles in economic activities1. In other words, in more than 99% of the 7 million years of human evolutionary history2, our ancestors had to rely on joint endeavours rather than individuals or families to maintain species survival. This life pattern includes joint projects such as hunting large game, fighting natural disasters, defending territory against invaders, and, most importantly, sharing the fruits of these cooperative activities345. This life pattern can be viewed as a particular mechanism for producing and distributing public goods. Public goods games (PGGs) have been widely employed to investigate the evolution of human cooperation within the framework of evolutionary game theory. In a standard PGG, the total return of a joint project is divided equally among all participants, which means that a defector can free-ride other participants and hence can fare better than others. This situation raises an evolutionary puzzle: how do cooperators rise and evolve? Because the payoff of a cooperator is strictly lower than that of a defector, each participant, in light of the hypothesis of self-interest, chooses to contribute nothing to the PGG in Nash equilibrium. As a result, any individual’s payoff will be zero. This is called the social dilemma67. It is widely believed that punishing defection serves as an effective way to resolve this problem. Assume that some players, called punishers or altruistic punishers, have a strong sense of justice and not only contribute but also punish those who deviate from social norms. If the strength of punishment exceeds the contribution cost, free-riding becomes unprofitable, and the PGG is immune from the social dilemma8910111213. Related anthropological and zoological studies show that punishing defection was not only an important way for human beings to maintain cooperation in early societies1415 but is also a vital means for many types of animals to maintain species survival1617. However, punishment is costly because it takes time and energy and invites retaliation. As a result, the payoff of a punisher is lower than that of a cooperator. In light of the hypothesis of self-interest, each participant will choose to co-operate but not punish in Nash equilibrium. These cooperators are called second-order free-riders to distinguish them from defectors. This situation presents a second evolutionary puzzle for humankind: how do punishers rise and evolve? Once punishers are invaded by cooperators, defectors can easily invade and dominate the population; thus, defection eventually prevails in the evolutionary equilibrium. This is called the second-order social dilemma8181920. From the perspective of the evolution of simulation, there are two main types of relevant studies to solve the problem of the second-order social dilemma. One type is based on the viewpoint of the social network21222324, including different mechanisms of punishment, such as adaptive punishment25, probabilistic punishment2627, and conditional punishment28. Reward is another solution to the second-order social dilemma29. The use of a combination of rewards and punishment is also an important method30313233. The evolutionary games of spatial networks not only provide an important perspective for solving the problem of the first-order social dilemma3435 but also shed light on the solution for the second-order social dilemma. The second categorization relates to the agent-based model, but none of these models is immune from some strong conditions: (i) external conditions, including the group selection effect resulting from immigration3637, indirect reciprocity that is dependent on the reputation mechanism3839, and the effect of cultural selection4041 or religious indoctrination4243; and (ii) internal conditions, including the addition of new strategies or new types of behaviour to alter the payoff matrix of the game, such as voluntary participation4445, rewards46, sympathy47, and pool punishment48 or the modification of game rules that may change the nature of the game, such as communication45, coordination49, and cooperation50 among punishers. These strong conditions inevitably narrow our scope of interpretation51. In accordance with the intrinsic properties of human co-operation, we propose a model of PGG without imposing any further assumptions (including new strategies, behaviour types, or game rules) to resolve the second-order social dilemma. It is well known that one of the most important properties of human cooperation is economies of scale5152. As an example, Fig. 1 shows collective hunting in a primitive society. Assume that both the number of the prey per unit area and the length a hunter is able to siege are given. In this case, the hunting returns will depend on the area controlled by the hunters and, ultimately, on the number of hunters. Let l denote the length a hunter is able to siege. Then, n hunters can siege an area with a circumference of nl, and they can control at best an area of (l2/4π)n2. Therefore, the return of adding one more hunter to this hunting activity grows exponentially. Such a scenario is termed “economies of scale” or “increasing returns to scale”.
Figure 1

Collective hunting in a primitive society.

According to the aforementioned discussion, we construct a general model of PGG featuring increasing returns to scale. Our analysis of the stochastic evolutionary dynamics and the corresponding computer simulations show that punishers will achieve dominating evolutionary advantages and are able to resist any invasion of second-order free-riders provided that the degree of increasing returns to scale is sufficiently large. Thus, the second-order social dilemma can be effectively resolved.

Results

Analytical results

We establish a stochastic evolutionary model of PGG in a finite population. To see the effect of an economy of scale, we calculate the relative time of cooperators (X), defectors (Y) and punishers (Z) in homogeneous states as a function of the coefficient of increasing returns to scale (denoted as α). The relative time in homogeneous states means the probability of the population being occupied entirely by one of the three strategies. The parameters in the model are population size (M), sample size (N), contribution cost(c), multiplier of return (r), strength of punishment (δ), cost of punishment (γ), selection strength (ω) and mutation rate (μ). The results are plotted in Fig. 2 below.
Figure 2

The relative time of strategies in homogenous states.

In the calculations, the parameter values are M = 100, N = 5, c = 1, r = 3, δ = 1, γ = 0.3, ω = 0.1, μ → 0.

As seen in the figure above, in the stochastic evolutionary model, both punishers and cooperators become increasingly dominant relative to defectors if the coefficient of increasing returns to scale α ≥ 1.3. Punishers and cooperators even jointly dominate the population and are unlikely to be invaded by defectors when α ≥ 1.6. In this case, defectors almost certainly become extinct. Because punishment is no longer necessary, cooperators and punishers fare the same. Consequently, the second-order social dilemma dissipates.

Simulation results

We also adopt a frequency-dependent Moran process to specify the stochastic dynamics of a PGG in a finite population and run a series of multi-agent computer simulations. Our computer simulations show that the punishing strategy cannot gain a foothold in the population if the PGG is of constant returns to scale (α = 1). Typically, a rock-paper-scissors-like evolution path governed by the alternating temporary domination of each type will emerge, so the system is unable to form stable cooperation (Fig. 3a). However, punishment becomes the only evolutionarily dominant strategy of the three types in the population provided that the degree of increasing returns to scale of the PGG is sufficiently large (for example, α = 1.8). The resulting evolution path shows that after a number of transient oscillations, punishers immediately come to dominate the population and can resist the invasion of any other strategy (Fig. 3b). This implies that the second-order social dilemma dissipates in the PGG featuring increasing returns to scale.
Figure 3

Evolution of cooperation in PGG.

In the simulations, the parameter values are M = 100, N = 5, X = 30, Y = 40, Z = 30, c = 1, r = 3, α = 1.0 or α = 1.8, δ = 1, γ = 0.3, ω = 0.5, μ = 0.001. (a)With constant returns to scale (α = 1). (b) With increasing returns to scale (α = 1.8).

Robust tests

The simulation results in Fig. 3 are robust when we extend the periods to more than 1 million or change the initial composition of the population. The simulation was repeated 20 times, and all displayed a similar montage. As a result, we randomly chose one as the representative montage of the simulation result. To further test the robustness of the results obtained, we studied how different parameter values can affect the evolution of cooperation in a PGG (Fig. 4). The strategy frequency here is the averaged proportion of different strategies in the population in 100,000 periods. All of the results are averaged over 20 times, and they are not affected by the initial composition of the population.
Figure 4

The effect of different parameter values on the evolution of cooperation in a PGG.

(a ~ c) coefficient of increasing returns to scale α in different initial composition: (a) 100% cooperators, (b) 100% defectors, (c) 100% punishers; (d) multiplier of return r; (e) contribution cost c; (f) strength of punishment δ; (g) cost of punishment γ; (h) selection strength ω; (i) mutation rate μ. The parameter values are X = 30, Y = 40, Z = 30 (except for figure a ~ c), r = 3 (except for figure d), c = 1 (except for figure e), α = 1.0 (except for figure d), α = 1.8 (except for figure a ~ c), δ = 1 (except for figure f), γ = 0.3 (except for figure g), ω = 0.5 (except for figure h), μ = 0.001 (except for figure i).

We can see that there is a threshold value of α of approximately 1.2 in which the punishers and the defectors are very similar. When α is larger than 1.2, the punishers can gradually gain an advantage against both the defectors and the cooperators. The multiplier of return r has a similar effect as α. Moreover, increasing the cost of contribution and punishment has a negative effect on the maintenance of cooperation, whereas increasing the strength of punishment, the selection strength and the mutation rate helps the punishers in defeating the defectors, which in turn has corresponding effects on the threshold value of α. Due to space limitations, other robust tests on various parameters can be found in the Supplementary Materials (Fig. S2–S12).

Discussion

The underlying mechanism of increasing returns to scale to solve the second-order social dilemma

Why do punishers fare best in the PGG with increasing returns to scale, hence resolving the second-order social dilemma? It is well known that punishers are primarily threatened by cooperators’ second-order free-riding. However, by analysing the simulation data, we find that the payoff advantage of cooperators over punishers diminishes as the degree of increasing returns to scale becomes larger. This is because the larger the degree of increasing returns to scale is, the higher the payoff each individual receives from the game. Given that the punishment cost is held fixed, the payoff difference between punishers and cooperators sharply decreases as the average payoff of all individuals increases, indicating that the evolutionary advantage of cooperators over punishers becomes very small (see Supplementary Materials, Fig. S13). When such an advantage becomes sufficiently weak, it is very likely to be offset by the randomness in evolutionary dynamics. This randomness in the biological evolutionary process is primarily due to genetic variation inside the organism and genetic drift induced by environmental factors. This suggests that biological character is not determined entirely by fitness. Rather, with small probability, it is affected by random disturbances from inside or outside the organisms. If the evolutionary advantage of a particular biological character is large enough, it can successfully resist this random disturbance. Otherwise, its evolutionary advantage will eventually be offset by the random disturbance. Of course, this randomness affects each type of player. However, our computer simulations show that only punishers, rather than cooperators, can dominate the population. This finding seems to suggest that randomness only weakens the advantage of cooperators. Further analysis of the simulation data shows that cooperators may dominate the population only for transient periods, but eventually they cannot defend their regime because defectors can easily invade and dominate the population.On the contrary, the evolutionary advantage of punishers is reinforced once they become dominant in the population. This is because the abundance of punishers effectively restrains the spread of defectors, reducing the punishment cost. As a result, punishers’ evolutionary advantage becomes even more dominant (see Supplementary Materials, Fig. S14). Therefore, as shown in our computer simulation, once the punishers establish their regime in the population, it becomes extremely difficult for other types to invade.

The historical evidence of increasing returns to scale

Modern production activities depend not only on labour but also on many other production factors, including capital and technology. However, many large-scale production activities in modern society still depend heavily on the number of cooperators. The number of participants has a great impact on the results of activities. In fact, revenue grows exponentially with the number of participants. Examples of such activities include conventional warfare, geological exploration, and rescue activities during natural calamities such as earthquakes, tsunamis, and floods. As rarely as these events may be considered in modern society, the degrees of increasing returns to scale are usually extremely large in these activities. Our model provides a reasonable interpretation of why individuals are more willing to reach a consensus to cooperate and punish defectors under these circumstances. In a primitive society in which the level of productivity is extremely low, labour becomes the most important or even the only production factor1. The number of contributors thus plays a decisive role in many joint endeavours in a primitive society, such as fighting floods, defending territory, and hunting large game. The degrees of increasing returns to scale are all very large in these activities52. Therefore, increasing returns to scale may have been a common feature in most social activities over the prehistorical age, lasting for millions of years. Ample evolutionary psychology studies have suggested that the mind, and thus the behaviour, of modern man has long been formed by the ancestral environment53 because agricultural civilization has a history of only a little more than 10,000 years and industrial civilization is less than 300 years old, whereas human society has millions of years of history. Neuroanatomy evidence also shows that the interconnections among neurons in the human brain have changed very little since the Industrial Revolution. Thus, it may not be surprising that some evolutionary psychologists claim that “our modern skulls house a stone age mind”54. These facts help us better understand why the pursuit of fairness and justice has become a common psychological state and the behavioural propensity of human beings. Our results show that natural selection has endowed humankind with a tendency to pursue justice and to punish defectors who deviate from social norms. In other words, the sense of justice is a product of long-lasting human evolution.

Methods

A model of a PGG featuring increasing returns to scale

We can apply a Cobb-Douglas production function P = cr(X + Z)Y to characterize the PGG with increasing returns to scale5556, where P denotes total payoff from the PGG, c is the contribution cost to the joint project from each contributor (including cooperators and punishers, c > 0), r the multiplier of return (r > 1), X the number of cooperators, Z is the number of punishers, Y is the number of defectors, and α and β are the contribution rate of contributors and defectors, respectively. Because cr > 0, the PGG is featured with increasing returns to scale if α + β > 1. Moreover, because defectors make no contribution, namely, β = 0, the PGG features increasing returns to scale provided that α > 1. In this case, α is called the coefficient of increasing returns to scale. Let δ and γ denote the strength and the cost of punishment, respectively. Then, the payoffs of cooperators, defectors, and punishers from each period, namely, P, P and Pfrom a PGG with increasing returns to scale are given below:

A stochastic evolutionary model of a PGG with finite population

As a useful method to analyse the stochastic evolutionary process of a finite population, the Moran process is widely applied in studies of biological evolution and evolutionary game theory, e.g., genetic replication, genetic mutation, genetic drift, or strategy learning and updating57585960. The analysis of the stochastic evolution of a finite population will be greatly simplified in the limiting case, the mutation rate μ → 0, where the population consists of two types at most. For μ = 0, any monomorphic state becomes absorbing. If the mutation rate μ is sufficiently small, a mutant either becomes extinct or spreads into fixation before the next mutant appears. Therefore, the transition between any two monomorphic states occurs only when a mutant appears and spreads into fixation. In this case, the multivariate hypergeometric sampling reduces to a hypergeometric distribution44. Next, consider a sampling process of randomly choosing N individuals from a well-mixed finite population of constant size M to participate in a PGG with increasing returns to scale. For μ → 0, this process is equivalent to an N-trial sampling without replacement from a population with m individuals of type i and m = M-m individuals of type j. The probability of selecting k individuals of type i and N-k individuals of type j is Thus, according to equation (4) and the model of the PGG (1)-(3), in any period of the game, for the X cooperators and Y = M-X defectors, the expected payoff P of cooperators competing against defectors and the expected payoff P of defectors competing against cooperators are Similarly, the expected payoffs to punishers and defectors are Finally, the expected payoffs to cooperators and punishers are The evolutionary fitness f of an individual of type i in a well-mixed population of types i and j can be calculated from its payoff P and the fitness function F = exp(ωP), which is Thus, the probability of changing the number of individuals of type i by ±1, T±, can be calculated from these transition probabilities, the fixation probability ρ of a single mutant strategy of type i in a resident population of type j can be derived (see Appendix A): Consequently, the fixation probabilities ρ define the transition probabilities between the three different homogeneous states of the population. The corresponding Markov transition matrix A is given by The normalized right eigenvector to the largest eigenvalue (which is 1) of the transposed matrix of A determines the stationary distribution; that is, it indicates the probability of finding the system in one of the three homogeneous states. It is given by (see Appendix B) The normalization factor N must be chosen such that the elements of ø sum up to one.

Simulation

When μ > 0 and is not negligible, the fixation probabilities generally continue to fluctuate because of the existence of random disturbance. However, as long as some behaviour or character is evolutionarily stable, its regime in the population will eventually withstand any random disturbance. Here, we adopt a frequency-dependent Moran process to specify the stochastic dynamics of PGG in a finite population444547 and run a series of multi-agent computer simulations with μ > 0. The computer simulation procedures are specified as below. Setting a random sample to participate in the game. Applying the Monte Carlo method606162, N individuals are randomly chosen from a well-mixed finite population of constant size M to participate in a PGG with increasing returns to scale. Calculating the payoffs from the game. Let the individuals from the sample play the game, and then use computer simulation software to calculate the payoff of each type of player at the end of each period of the game according to equations (1–3). Calculating evolutionary fitness. A basic assumption of evolutionary game theory is that individuals are more prone to imitate those with higher payoffs. This assumption implies that individuals with higher payoffs generally have higher fitness and are thus more evolutionarily advantageous. In evolutionary dynamics, a commonly used algorithm to calculate fitness is F = 1 − ω + ωP, where F denotes fitness, P is payoff, and ω is selection strength (0 < ω ≤ 1). This algorithm treats fitness as a convex combination of the “baseline fitness”, which is normalized to 1 for all players, and the payoff from the game44. A drawback of this algorithm is that it is only applicable for analysing stochastic evolutionary dynamics under weak selection because fitness may be negative for strong selection. To avoid this limitation, we adopt an exponential function F = exp(ωP) by Thaulsen et al.63 in our computer simulation, which allows us to accommodate any value of ω in its domain. Genetic replication or strategy updating. The Moran process assumes that one member of the population M is chosen to die and is replaced by a newly born individual in each generation of the evolutionary process. The type of the newly born individual is jointly determined by both the fitness and frequency of each type of individuals in the population. Usually, two algorithms are commonly used to implement the above process4459. The first is the “birth-death” process in which an individual is first chosen for reproduction with a probability proportional to its fitness, and then its clonal offspring replaces a randomly selected individual from the population M. The second is the “death-birth” process, in which a randomly selected individual is first removed from the population M and another individual is subsequently selected for reproduction with a probability proportional to its fitness and produces a clonal offspring. In addition to these algorithms, we apply a third approach called “genetic pool” in our computer simulations, in which each individual in the population M reproduces an offspring with a probability proportional to its fitness, and these newly born individuals form a “genetic pool” from which one offspring is chosen randomly to replace an individual in the population4759. Genetic variation or mutation. Genetic variation is an important factor that affects the evolutionary process. A common assumption of evolutionary dynamics is that any individual of a specific type can switch to another type with a small probability μ irrespective of its payoff. The parameter μ is called the mutation rate. This assumption implies that players will change their strategies with a very small probability without taking the potential payoffs of alternative strategies into account, which can simply be viewed as players’ tentative exploration of alternative strategies44. The aforementioned steps are executed successively and compose our multi-agent computer simulations based on the frequency-dependent Moran process (see Supplementary Materials, Fig. S1).

Appendix A: Proof of equation (13)

Let n denote the state that has n individuals of type i. From n = 0 to n = M, there are M + 1 kinds of states in total, the transitions between which compose the Markov process. The states of n = 0 and n = M are absorbing states. The corresponding Markov transition matrix is Suppose x is the probability of the state k transiting to the absorbing state M; thus, we can obtain So, if we let y = x − x, θ = t−/t+, we can obtain This means that y1 = x1, y2 = θ1 × 1, y3 = θ1θ2 × 1, and so on. We can also obtain the following identity: Substituting equation (A.2) into the identity, we can obtain Here, x1 donates the probability of the state that has a mutant of type i transiting to the state completely dominated by this type, which is exactly the fixation probability ρ we want. Substitute θ and equation (10) into equation (A.3), and we can obtain equation (13).

Appendix B: Proof of equation (15)

For a Markov transition matrix P = (P), a probability distribution {π, i ≥ 0} is called the stationary distribution of the Markov chain if it satisfies It can be rewritten as Taking the transpose of both sides, we can obtain The transposed matrix of the Markov transition matrix A is Substituting equation (B.2) into equation (B.1) and with some calculations, we can obtain the following homogeneous linear equations: Remember that {π, i ≥ 0} is a probability distribution, which means the identity Substituting the identity into equation (B.3), we can obtain equation (15).

Additional Information

How to cite this article: Ye, H. et al. Increasing returns to scale: The solution to the second-order social dilemma. Sci. Rep. 6, 31927; doi: 10.1038/srep31927 (2016).
  40 in total

1.  A new hominid from the Upper Miocene of Chad, Central Africa.

Authors:  Michel Brunet; Franck Guy; David Pilbeam; Hassane Taisso Mackaye; Andossa Likius; Djimdoumalbaye Ahounta; Alain Beauvilain; Cécile Blondel; Hervé Bocherens; Jean-Renaud Boisserie; Louis De Bonis; Yves Coppens; Jean Dejax; Christiane Denys; Philippe Duringer; Véra Eisenmann; Gongdibé Fanone; Pierre Fronty; Denis Geraads; Thomas Lehmann; Fabrice Lihoreau; Antoine Louchart; Adoum Mahamat; Gildas Merceron; Guy Mouchelin; Olga Otero; Pablo Pelaez Campomanes; Marcia Ponce De Leon; Jean-Claude Rage; Michel Sapanet; Mathieu Schuster; Jean Sudre; Pascal Tassy; Xavier Valentin; Patrick Vignaud; Laurent Viriot; Antoine Zazzo; Christoph Zollikofer
Journal:  Nature       Date:  2002-07-11       Impact factor: 49.962

2.  Social learning promotes institutions for governing the commons.

Authors:  Karl Sigmund; Hannelore De Silva; Arne Traulsen; Christoph Hauert
Journal:  Nature       Date:  2010-07-14       Impact factor: 49.962

3.  Costly punishment across human societies.

Authors:  Joseph Henrich; Richard McElreath; Abigail Barr; Jean Ensminger; Clark Barrett; Alexander Bolyanatz; Juan Camilo Cardenas; Michael Gurven; Edwins Gwako; Natalie Henrich; Carolyn Lesorogol; Frank Marlowe; David Tracer; John Ziker
Journal:  Science       Date:  2006-06-23       Impact factor: 47.728

4.  Behavior. Foresight and evolution of the human mind.

Authors:  Thomas Suddendorf
Journal:  Science       Date:  2006-05-19       Impact factor: 47.728

5.  The Monte Carlo method.

Authors:  N METROPOLIS; S ULAM
Journal:  J Am Stat Assoc       Date:  1949-09       Impact factor: 5.033

6.  Cooperation in social dilemmas: free riding may be thwarted by second-order reward rather than by punishment.

Authors:  Toko Kiyonari; Pat Barclay
Journal:  J Pers Soc Psychol       Date:  2008-10

Review 7.  Universal scaling for the dilemma strength in evolutionary games.

Authors:  Zhen Wang; Satoshi Kokubo; Marko Jusup; Jun Tanimoto
Journal:  Phys Life Rev       Date:  2015-05-05       Impact factor: 11.025

8.  Outsourcing punishment to God: beliefs in divine control reduce earthly punishment.

Authors:  Kristin Laurin; Azim F Shariff; Joseph Henrich; Aaron C Kay
Journal:  Proc Biol Sci       Date:  2012-05-23       Impact factor: 5.349

9.  Human cooperation: second-order free-riding problem solved?

Authors:  James H Fowler
Journal:  Nature       Date:  2005-09-22       Impact factor: 49.962

10.  Analytical results for individual and group selection of any intensity.

Authors:  Arne Traulsen; Noam Shoresh; Martin A Nowak
Journal:  Bull Math Biol       Date:  2008-04-02       Impact factor: 1.758

View more
  1 in total

1.  Benefits of asynchronous exclusion for the evolution of cooperation in stochastic evolutionary optional public goods games.

Authors:  Ji Quan; Junjun Zheng; Xianjia Wang; Xiukang Yang
Journal:  Sci Rep       Date:  2019-06-03       Impact factor: 4.379

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.