Endogenous rewards promote cooperation.

Chun-Lei Yang¹, Boyu Zhang², Gary Charness³, Cong Li⁴, Jaimie W Lien⁵.

Abstract

Sustaining cooperation in social dilemmas is a fundamental objective in the social and biological sciences. Although providing a punishment option to community members in the public goods game (PGG) has been shown to effectively promote cooperation, this has some serious disadvantages; these include destruction of a society's physical resources as well as its overall social capital. A more efficient approach may be to instead employ a reward mechanism. We propose an endogenous reward mechanism that taxes the gross income of each round's PGG play and assigns the amount to a fund; each player then decides how to distribute his or her share of the fund as rewards to other members of the community. Our mechanism successfully reverses the decay trend and achieves a high level of contribution with budget-balanced rewards that require no external funding, an important condition for practical implementation. Simulations based on type-specific estimations indicate that the payoff-based conditional cooperation model explains the observed treatment effects well.

Keywords: cooperation; mechanism; public goods; reward

Year: 2018; PMID: 30224497; PMCID: PMC6176598; DOI: 10.1073/pnas.1808241115

Source DB: PubMed; Journal: Proc Natl Acad Sci U S A; ISSN: 0027-8424; Impact factor: 11.205

Sustaining cooperation in social dilemmas has been a longstanding and fundamental topic in both the social and biological sciences. Several solutions have been proposed and found to be effective to varying degrees at solving this conundrum, including repeated interactions (1, 2), reputation (3, 4), and assortative matching (5). The folk theorem literature in economics (6, 7) has focused on the set of payoffs that can be sustained in equilibrium under finite and infinite horizons. In the public goods game (PGG), providing a punishment option to community members has been shown to effectively promote cooperation (8–11) and to reverse the decay in contribution that typically prevails in the standard PGG (12). It has often been suggested that society should not overly rely on punishment, particularly when viable alternatives are available. It has been shown that the range of parameter values for punishment to achieve efficiency can be quite narrow (13), and use of costly punishment can be individually disadvantageous (14). Furthermore, while punishment addresses “incorrect” behavior, it does not necessarily indicate the desired behavior; rewards can provide such targeted positive reinforcement. It may thus be more natural and effective to provide positive feedback (15). Punishment mechanisms may also enact serious social disadvantages, such as the destruction of a society’s physical resources as well as its overall social capital. Punishment carries negative psychological effects that may be deleterious to the social fabric; for example, it may lead to retaliation that potentially counteracts the well-meaning effort of those individuals who have chosen to punish free riders (16–18). Employing a reward mechanism might offer a more efficient and socially desirable approach. 
Recent studies find promise for the role of rewards in social dilemmas (19, 20); however, the exact conditions, particularly with regards to budget balance and procedures for rewarding behavior, are not yet fully understood. We propose an anonymous endogenous reward mechanism that taxes the gross income of each round’s PGG play and assigns this amount to a fund; each player then decides on how to distribute his or her share of the fund as rewards to other members of the community. The PGG with endogenous reward is a two-stage game, where the first stage is a standard PGG and the second is the reward stage. In the latter, a 20% tax is levied on each player’s income in the first stage, and the reward assignment results then determine the tax revenue redistribution. Our mechanism successfully achieves a high level of contribution (about 70%) with budget-balanced (1:1) rewards that require no external funding, which is an important condition for practical implementation of such a reward system. Endogenous rewards lead to a significant increase in cooperation compared with the control treatment, and, perhaps more importantly, the trend of cooperation increases with rewards. We find that players are much more likely to reward higher contributors than lower ones, regardless of their own contributions, and that most of the subjects in the experiments are conditional cooperators, who use the average contribution level in the group as a reference. Standard game-theoretical analysis shows that full contribution is never a Nash equilibrium (NE) under the 20% tax rate in our experiment, regardless of the reward assignment outcome in the second stage. In fact, we show, by constructing a plausibly socially responsible form of reward behavior, that sustainable contribution levels in a Nash equilibrium range from perfect free riding to some given upper bound. It turns out that the equilibrium upper bound is substantially less than the level achieved in our experiments. 
Moreover, due to the large range of sustainable equilibrium outcomes, the standard game-theoretic prediction on behavior is unclear. Accordingly, we demonstrate with simulations that, under the assumption of conditional cooperation, improvement in cooperation as well as the reversal of the stylized decay trend is more likely to emerge with an endogenous rewards mechanism than without it. Our data are consistent with the higher contribution levels under endogenous rewards being driven by two key effects: First, a strategic response to reward formation helps to avoid the no-contribution outcome. Second, our analysis shows that conditional cooperation may be a salient force in maintaining high levels of public goods contributions. The endogenous reward mechanism in the PGG yields a large and sustained improvement on contributions. Specifically, a certain level of tax is levied on each player, who then decides how to distribute the tax among the other group members, i.e., rewards are equivalent to unsubsidized 1:1 transfers. An intuitive interpretation is in the context of a primitive tribe society with an inherited tradition (perhaps the result of a common agreement at some time in tribal history) that every family should give a certain share of its daily or monthly physical yields to the public fund, and then expresses an opinion regarding how to redistribute this levied share among the other tribe members. In such contexts, a sophisticated central planner is not needed to promote cooperation. A unique feature is that our mechanism is both endogenous and budget balanced. While the magnitude of the increase in cooperation is comparable to that often found when punishment is feasible, the net social benefit is considerably higher under our mechanism, since there is no costly destruction of resources. 
By comparison, studies using punishment rarely find significant net payoff increases, and previous work showing effectiveness for punishment (or reward) mechanisms typically involves one spent unit delivering 3 times the effect on the receiving party, which may be an unrealistic feature. Peer reward systems can lead to second-order free riding, where players cooperate but do not volunteer to reward others, because rewarding others is costly. Theory (e.g., refs. 21 and 22) shows that failure is inevitable for peer rewards, unless reputation can be induced to facilitate it (19, 23). Previous experiments have consistently found that budget-balanced peer reward fails to prevent the decline of cooperation (24–26). Often, some players distribute bonuses to cooperators under this setup, and people behave more cooperatively after being rewarded. However, the number of people who reward typically decreases with the number of cooperators, so that the reward level (and contribution) drops. Note that centralized rewards with a simple preestablished rule typically fail to halt the cooperation decay both in theory and experiment (27–29) if the tax rate is not too high, such as in our setup. If the central planner has unlimited taxation power and has perfect information on the behavior of all members, then a fine-tuned Pigouvian transfer rule can implement the social optimum as the unique Nash equilibrium with perfectly rational players (30, 31). However, this type of solution may have only limited applicability in practice, since a central planner rarely has complete information and real tax rates are rarely as high as the solution requires; in other words, there is a large gap in applicability between the Falkinger−Pigou unlimited transfer setup and our design with a moderate tax rate and autonomous reward assignment decisions. Endogenous reward from a common fund bypasses these challenges without the need for external funding or detailed knowledge of individual contributions. 
Furthermore, budget-balanced reward has the distinction of always being feasible as an augmentation to the standard PGG, whereas punishment may be infeasible in practice (or even prohibited by law), and rewarding with a 3:1 benefit ratio may also be infeasible. Methodologically, we modify the set of features in previous reward mechanisms via one key dimension: providing internal funding of the mechanism. The results from the experiment are striking: Contributions increase to high levels. People reward high contributors and receive rewards, in turn, themselves when they make large contributions, cultivating a “virtuous cycle.” In addition, we find no evidence of a decrease in contributions over time; in fact, the contribution trend of our mechanism is significantly positive. This peer-directed balanced budget mechanism is thus successful in promoting public goods contributions. Empirical analysis indicates that the success of our mechanism stems from most people in the experiment behaving as conditional cooperators who respond to the observed average contributions. Note that, in our setting, players have the option to simply make equal distribution of the reward points, which would lead to convergence to the free-riding NE. It is reassuring to confirm that the “collective will” turns out to make good use of the provided reward mechanism.

Results

In the experiments, each subject interacted anonymously with the same three other players throughout. In “CR” experiments, subjects first play 10 rounds of the standard four-player PGG (CR1), followed by 10 rounds of the endogenous reward PGG (CR2); in “RC” experiments (RC1 and RC2), the order is reversed. The tax rate in the reward stage is 20%; in SI Appendix, we show evidence that this rate also emerges endogenously (from a menu of three possible rates) in another experimental treatment. The main result is that the mechanism promotes cooperation and reverses the downward trend of the standard PGG (Fig. 1). Overall, average contributions are considerably higher in the reward protocols (CR2 and RC1) than in the respective control protocols (CR1 and RC2) (Table 1), with P = 0.008 for each comparison (two-tailed signed rank test, one independent observation for each group). As is typically found in the literature, average contributions decline in the control protocols (a two-tailed signed rank test comparing the first five and last five periods gives P = 0.070 in CR1 and P = 0.008 in RC2).

Fig. 1.

Time evolution of the average contribution levels. Average contributions decline in the control protocols CR1 and RC2; the downward trend is reversed in the reward protocols CR2 and RC1.

Table 1.

Average contributions and time trends in the four experimental protocols

Protocol	Periods 1 to 10	Periods 1 to 5	Periods 6 to 10	Slope	Constant	R²	F test, P value
CR1	9.25	10.91	7.59	−0.541	12.52	0.904	<0.001
CR2	14.39	14.68	14.11	0.179	14.06	0.411	0.063
RC1	10.71	9.97	11.46	0.241	9.38	0.656	0.008
RC2	6.58	9.14	4.03	−1.034	12.23	0.995	<0.001

Analysis is based on observations at the group level. The last round was excluded in these regressions due to end-game effects. The F test is provided for the slope coefficient. Statistical results for average contributions across protocols are shown in SI Appendix, Table S1.

The current mechanism is the only budget-balanced one among comparable settings in the literature that provides a persistently high rate of contributions along with efficiency improvement. In SI Appendix, we provide a detailed discussion of how the increase in net profits in our experiments compares to findings in previous work on peer punishment and reward in the PGG. Regarding the delegation of rewards, there is a strong positive correlation between points received and relative contribution (Fig. 2); a regression shows that one will receive, on average, nearly one more point if one contributes an additional point to the public good. Had the subjects been aware of this empirical reward distribution, they would have realized the strong mitigating effect on the ex ante incentive to free ride, making it easier for leader-type conditional cooperators to maintain their contributions (Fig. 2). Detailed calculations are provided in SI Appendix.

Fig. 2.

Reward assignment behavior. (A) N1, N2, and N3 denote the other group members who contribute the most, second most, and least, respectively. Rounds 1 to 10 (11 to 20) refer to the RC1 (CR2) protocol. On average, N1 receives 15 points, N2 receives 10 points, and N3 receives 5 points (SI Appendix has statistical results). (B) One more point above the others’ mean in the contribution stage yields, on average, 0.7 more points of return in the reward stage. (C) In the controls CR1 and RC2, high contributors (i.e., averaging between 15 and 20 points) earn significantly less than the group average (two-tailed sign test, P value < 0.001), and free riders earn most. In CR2 and RC1 with rewards, however, high contributors earn only slightly (and insignificantly) less than the group average (two-tailed sign test, P value = 0.3449).

While the standard PGG with standard self-interested preferences has a unique NE with zero contributions, the equilibria under the mechanism depend crucially on the rewards formation. We prove that any contribution level between zero and a certain upper bound can be sustained in a symmetric NE with standard self-interested preferences, assuming suitable choices of reward functions. However, given the large range of equilibrium outcomes, neither NE nor a subgame perfect equilibrium that extends the NE concept for repeated settings can be viewed as possessing satisfactory predictive power without further equilibrium refinement. (Detailed proofs and discussions are in SI Appendix.) Thus, we find it useful to look for alternative behavioral motivations to explain the contribution behavior observed in the experiment.

Model of Conditional Cooperation.

The conditional cooperation model, in which a conditional cooperator (also known as a conformist) adjusts the next round’s contribution based on the previous round’s experience, has been widely used to explain observed outcomes in PGG experiments and other social dilemmas (29, 32–37). Consider a scenario in which players update their strategies according to the conditional cooperation rule, where they change their contribution in the next round in the direction of the average current-round contribution of the group. Thus, the contribution of a player in round $t+1$ has the form $x_{t+1} = a x_t + b(\bar{x}_t - x_t)$, where $x_t$ and $\bar{x}_t$ denote the contribution of the player and the average contribution of his/her three group members, respectively, in round $t$ (in this, we follow ref. 35). In this context, an individual’s behavioral pattern in a repeated PGG can be described by a three-dimensional vector $(x_0, a, b)$, where $x_0$ is the contribution in the first round and $b$ measures the effect of the behaviors of others on one’s decision-making. Let $\pi_t$ denote the player’s payoff and $\bar{\pi}_t$ the average payoff of the other three players in round $t$. In the standard PGG, all group members receive the same return from the common pool, so payoffs differ only through contributions, implying that $\pi_t - \bar{\pi}_t = \bar{x}_t - x_t$. Therefore, an alternative to the above recursion equation is $x_{t+1} = a x_t + b(\pi_t - \bar{\pi}_t)$. We call this formulation the “payoff-based” conditional cooperation. In the PGG with endogenous reward, this payoff-based conditional cooperation can capture the aggregate postredistribution effect that motivates the conditional cooperator’s adaptive changes. We show in SI Appendix that, for monotone-increasing reward-assigning functions, group compositions of $(x_0, a, b)$-type conditional cooperators exist with the group average contribution increasing (decreasing) over rounds in the reward (standard) PGG. So, while the presence of conditional cooperators is in itself insufficient to prevent the downward trend to the free-riding equilibrium, endogenous rewards greatly facilitate the escape from this decay trap, which otherwise eventually results in very low contributions.
We now demonstrate through simulations that the reward institution can be an effective means to promote cooperation in a large set of plausible profiles of estimated conditional cooperator types.
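To make the update rule concrete, the following sketch checks numerically that, in the standard PGG, a player's payoff gap to the other three members equals the contribution gap with the sign reversed, so the contribution-based and payoff-based rules coincide there. It is an illustration under stated assumptions, not the authors' code: the update form `a * x + b * gap`, the clipping to the range 0 to 20, and the pool multiplier of 1.6 are assumptions (the multiplier does not affect the equivalence), and all function names are hypothetical.

```python
ENDOWMENT = 20.0   # monetary units per round, as stated in Methods
MULTIPLIER = 1.6   # ASSUMED return on the group-average contribution

def standard_pgg_payoffs(contribs):
    """One round of the standard PGG under the assumed payoff rule."""
    avg = sum(contribs) / len(contribs)
    return [ENDOWMENT - x + MULTIPLIER * avg for x in contribs]

def mean_of_others(values, i):
    """Average of all entries except the i-th."""
    others = values[:i] + values[i + 1:]
    return sum(others) / len(others)

def next_contribution(x, gap, a, b):
    """ASSUMED update: inertia a on own contribution plus b times the gap, clipped."""
    return min(ENDOWMENT, max(0.0, a * x + b * gap))

contribs = [18.0, 10.0, 10.0, 6.0]
payoffs = standard_pgg_payoffs(contribs)

contribution_gaps = [mean_of_others(contribs, i) - x for i, x in enumerate(contribs)]
payoff_gaps = [p - mean_of_others(payoffs, i) for i, p in enumerate(payoffs)]

# Conformist-like parameters close to the Table 2 estimates (a = 0.97, b = 0.68).
updated = [next_contribution(x, g, 0.97, 0.68) for x, g in zip(contribs, payoff_gaps)]
print(contribution_gaps, payoff_gaps, updated)
```

Once rewards redistribute income, the two gaps no longer coincide, which is why the payoff-based form is the one that can register the effect of the reward stage.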

Simulations.

Note that, due to the limited number of rounds, we cannot reliably and confidently estimate the CC type for each subject individually. Therefore, as a first step, following refs. 29 and 32, we classify subjects’ actions into three categories: conforming, cooperating, or defecting (see SI Appendix for details). Based on these action characterizations, we further classify subjects based on their observed cooperative tendencies: There are people who are strongly inclined to cooperate, people who rarely cooperate, and those who cooperate primarily depending on whether others are cooperating. This pattern forms the basis for our three types. While there could also be other types (e.g., people who mix randomly), these types are less compelling and so are subsumed into the “Unclassified” category. In our baseline classification, we designate a player as a conformist (cooperator, defector) if conforming (cooperating, defecting) behavior is displayed more than 50% of the time altogether in the rounds other than the first and last in a segment (conforming behavior cannot be identified in the first period, and last-period unraveling is frequently encountered in experiments), in CR or RC. In SI Appendix, we show that our subsequent regression results are robust to tighter classification rules requiring consistent behavior by an individual for up to 75% of the rounds. Fig. 3 illustrates how the behavior and associated types are distributed across treatments.
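The baseline classification rule can be sketched as a simple majority check. This is a hedged illustration rather than the authors' procedure: it assumes each retained round has already been labeled as conforming, cooperating, or defecting (the labeling criteria are in SI Appendix and are not reproduced here), and the function name and input format are hypothetical.

```python
from collections import Counter

# Maps a per-round action label to the player type it supports.
TYPE_FOR_ACTION = {
    "conforming": "Conformist",
    "cooperating": "Cooperator",
    "defecting": "Defector",
}

def classify_player(actions, threshold=0.5):
    """Return the player type if one behavior exceeds the threshold share of rounds."""
    if not actions:
        raise ValueError("at least one labeled round is required")
    counts = Counter(actions)
    for action, player_type in TYPE_FOR_ACTION.items():
        if counts[action] / len(actions) > threshold:
            return player_type
    return "Unclassified"

# Hypothetical player observed in 16 retained rounds (rounds 2 to 9 and 12 to 19).
example_actions = ["conforming"] * 9 + ["cooperating"] * 4 + ["defecting"] * 3
print(classify_player(example_actions))                   # 9/16 exceeds 50%
print(classify_player(example_actions, threshold=0.75))   # tighter rule
```

Because the threshold is at least 50%, at most one behavior can qualify, so the order of the checks does not matter.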

Fig. 3.

Prevalence of types. (A) The average proportions of conforming, defecting, and cooperating behaviors in CR and RC excluding the first and last rounds of each segment. (B) The proportions of conforming, cooperating, defecting, and unclassified players in CR and RC. A player is categorized as unclassified if s/he adheres to none of the three behaviors more than 50% of the time. Classifications with higher percentage thresholds are considered in SI Appendix.

Table 2 shows that regressions based on this classification indeed yield consistent type-specific results, which can be interpreted as a justification for our classification criterion in hindsight. To make the best use of our limited sample size, we assume that conditionally cooperative players behave consistently across conditions in terms of underlying parameters in the conditional cooperation model, and thus we combine data from their actions across the control and treatment cases. The numbers of individuals for thus-identified types in CR and RC (i.e., Fig. 3) are shown in Table 2. In both CR and RC, the proportion of conforming individuals is about 50%, and the proportions of cooperating and defecting individuals are about 12%, while the others are unclassified. In the second step, we estimate the model parameters for each type of individual by linear regression (Table 2). In the third step, we conduct simulations with parameters motivated by Table 2, which demonstrate concretely that rewards can be effective via the channel of conditional cooperation.

Table 2.

Classification and characterization of CC types

Type	CR	RC	x₀	a	b	Adjusted R²	F test, P value
Cooperator	4	4	18	1.03 (±0.07)	0.41 (±0.18)	0.9175	<0.001
Conformist	15	12	10.3	0.97 (±0.03)	0.68 (±0.09)	0.9021	<0.001
Defector	3	4	6	0.84 (±0.09)	0.23 (±0.13)	0.7860	<0.001
Unclassified	10	12	12.6	0.93 (±0.05)	0.50 (±0.12)	0.8388	<0.001

We define a player as a conformist (cooperator, defector) if the frequency with which he or she displays conforming (cooperating, defecting) behavior (in rounds other than the first and last rounds of a segment) is greater than 50%. A player is unclassified if s/he exhibits none of the three behaviors more than 50% of the time. Columns 2 and 3 show the numbers of different types of individuals in CR and RC, respectively. Columns 4 to 6 are (x0, a, b) for different types of individuals, where (±) is the 95% confidence interval. Columns 7 and 8 are coefficients of determination and P values of the linear regression, respectively. Sample range includes rounds 2 to 9 and 12 to 19. Robustness checks incorporating player fixed effects and clustering SEs at the group level give results of similar relative magnitude and significance as the baseline specification. Details are provided in SI Appendix, section 3 and Tables S5 and S6. The (x0, a, b) for types of individuals under other classification rules are shown in SI Appendix, Table S8. Overall, these parameters are not sensitive to the exact percentage threshold for classification rules.

From Table 2, we observe that cooperators typically have high a and low b, defectors have low a and low b, and conformists have an a slightly lower than that of cooperators, but with a high b. With a > 1, cooperators could be seen as leaders whose behaviors are tracked by conformists with approximately a = 1 (see SI Appendix for additional robustness checks). An additional question of interest is for which typical group compositions of the three types we expect an increase or decline of the cooperation level, and whether the reward and control conditions induce different trends.
We postulate that there are large sets of type profiles where (i) the reward condition dominates the control regarding the cooperative trend, i.e., there is a bifurcation in dynamic convergence, and (ii) the reward condition leads to an increase in cooperation while the control leads to decay, i.e., a strict bifurcation. Due to the complexity of the task, a general algebraic solution is not available. However, we conducted a series of numerical simulations that provides support for these postulates, as shown in Fig. 4. For the simulation samples presented in Fig. 4, we start with a set of stylized parameter specifications for cooperators, conformists, and defectors close to the estimations in Table 2. We use the (a, b) parameters that are within the 95% confidence interval of the estimated parameters in Table 2.

Fig. 4.

Agent-based simulations. Columns differ in group type composition: CCFD refers to a group with two cooperators, one conformist, and one defector; CFFD and CFDD have (1, 2, 1) and (1, 1, 2) of (cooperators, conformists, defectors), respectively. (A–C) Time evolution of the group average contribution. Parameters (x0, a, b) for cooperators, conformists, and defectors are fixed at (15, 1.05, 0.4), (10, 1, 0.7), and (5, 0.9, 0.2), respectively, and their r in reward assignment functions are taken as 2.5, 2, and 1.5, respectively. (D–I) Robustness tests for the effectiveness of the reward mechanism, in which the cooperator’s and defector’s parameters are varied. In the green regions, a trend bifurcation similar to A–C occurs between the reward and control mechanisms. In the yellow regions, the decay trend in contribution is reversed, displaying strict bifurcation. In the blue regions, either both reward and control are trending up or both converge to zero. Robustness tests with different parameter values and group compositions are provided in SI Appendix.

The composition of types in a simulation group consists of at least one each of the three types, with the fourth player varying between these types in different groups. For a variety of group compositions and parameters a and b, the group average contribution increases (decreases) over rounds in the reward (standard) PGG (see the yellow regions in Fig. 4). In the green regions, a bifurcation occurs between the reward and control mechanisms, where the contribution level in the reward condition is higher than in the control condition. Finally, the blue regions indicate situations in which either both reward and control conditions are trending up or both converge to zero. This occurs, for example, when the lowest contributors’ parameter a is close to 1 (Fig. 4). When even the lowest contributors in the group are unwilling to unconditionally decrease their contributions, it is no surprise that this group’s average contribution easily goes up even in the control.
In Fig. 4, on the other hand, we see that bifurcations are more likely to occur when the ratio between the two varied parameters is within some range.
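An agent-based simulation of this kind can be sketched in a few lines. The code below is a minimal illustration and not the authors' simulation: the type parameters (x0, a, b) and r values are those given in the Fig. 4 caption for a CFFD group, but the payoff-based update `a * x + b * (own payoff - others' average payoff)`, the pool multiplier of 1.6, the power-form reward rule (shares proportional to contribution raised to the power r), and the clipping of contributions to the range 0 to 20 are all assumptions made for the sketch.

```python
def simulate(types, rounds=10, reward=True, tax_rate=0.2,
             multiplier=1.6, endowment=20.0):
    """Group-average contribution per round; types is a list of (x0, a, b, r)."""
    n = len(types)
    contribs = [t[0] for t in types]
    history = []
    for _round in range(rounds):
        avg = sum(contribs) / n
        history.append(avg)
        # Stage 1: standard PGG gross income (ASSUMED payoff rule).
        payoffs = [endowment - x + multiplier * avg for x in contribs]
        if reward:
            # Stage 2: tax gross income; each player allocates 1/n of the fund.
            fund = tax_rate * sum(payoffs)
            fund_share = [0.0] * n
            for j in range(n):
                r = types[j][3]
                others = [i for i in range(n) if i != j]
                weights = [contribs[i] ** r for i in others]  # ASSUMED power rule
                total = sum(weights)
                for i, w in zip(others, weights):
                    share = w / total if total > 0 else 1.0 / len(others)
                    fund_share[i] += share / n
            payoffs = [(1 - tax_rate) * p + fund * s
                       for p, s in zip(payoffs, fund_share)]
        # Payoff-based conditional cooperation update (ASSUMED form), clipped.
        updated = []
        for i, (_x0, a, b, _r) in enumerate(types):
            others_avg = (sum(payoffs) - payoffs[i]) / (n - 1)
            x = a * contribs[i] + b * (payoffs[i] - others_avg)
            updated.append(min(endowment, max(0.0, x)))
        contribs = updated
    return history

COOPERATOR = (15.0, 1.05, 0.4, 2.5)
CONFORMIST = (10.0, 1.0, 0.7, 2.0)
DEFECTOR = (5.0, 0.9, 0.2, 1.5)
cffd_group = [COOPERATOR, CONFORMIST, CONFORMIST, DEFECTOR]

reward_path = simulate(cffd_group, reward=True)
control_path = simulate(cffd_group, reward=False)
print([round(v, 2) for v in reward_path])
print([round(v, 2) for v in control_path])
```

Both paths start from the same first-round average of 10; under these assumptions the reward path moves up in the second round while the control path moves down, which is the kind of divergence the text calls a bifurcation. The sketch should not be expected to reproduce the published figure exactly.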

Discussion

The conditional cooperation model is relatively simple but nevertheless captures a salient behavioral feature. One important question about the model’s fit as an explanation of behavior is whether subjects respond directly to histories beyond the previous period, rather than only to the previous period as the conditional cooperation model proposes. We therefore also consider a two-lag extension to the standard conditional cooperation model, which adds second lags of the inertial component and of the payoff gap. Regressions show that the second lag of the payoff gap is generally insignificant [but is positive and marginally significant (P = 0.08) for conformists] and thus not the predominant influence (SI Appendix). However, regressions show that the second lag of the inertial component of behavior is significant for conformists and unclassified players (P < 0.001 and 0.01, respectively). This may have the effect of mitigating high swings in a player’s action dynamics, which may be caused by high b. The simulation results are similar with or without the second lag (SI Appendix). In addition, actual conditionally cooperative behavior is more complicated than the formulae capture. For example, the willingness to be a leading cooperator with a > 1 may diminish with the expected remaining number of periods in the game, since it may be a strategic investment for the collective with a sufficiently long joint future ahead. Individuals may be asymmetric in their adaptive change toward the mean; “nobler” conditional cooperators might raise their contribution level faster than they lower it, as if having an aversion to taking advantage of the group; less forgiving conditional cooperators might display the opposite trend. Since our main concern is to demonstrate the effectiveness of our endogenous reward mechanism, we leave extensions to more-complete models of conditional cooperation to future studies. Our study has demonstrated that endogenous reward in a budget-balanced setup can be highly effective in promoting public goods contributions.
This positive result would be further strengthened if it were shown that real players voluntarily choose such a reward mechanism over other budget-balanced alternatives. The pilot experiments of our follow-up research suggest that the reward mechanism proposed here is indeed sustainable in a larger sense by allowing for a vote on the tax rate regime among group members (a preview of the basic result is provided in SI Appendix), although it is not a foregone conclusion that society as a whole benefits from having members self-sort into different tax regimes at an early stage. This suggests a potential future research program in understanding the willingness of the members of society to endogenously set up institutions that effectively promote the public good (see refs. 11 and 38–40 for work in this direction). In addition, future research can consider the sustainability of such endogenous mechanisms under asymmetric players (41, 42), which is a key condition for the real-world implementation of such endogenous policies. Finally, an open question remains: What happens if both reward and punishment options are simultaneously available each time after a round of the PGG? Would we observe both incentives being selected in the same group dynamics, and, if so, which incentive system would dominate? Through the feature of an institutional tax, our tested mechanism avoids the funding concern of reward (and punishment) and therefore offers a level playing field for this methodological contest.

Methods

Experimental Design.

The Institutional Review Board at the School of Mathematical Sciences, Beijing Normal University, reviewed and approved this research, and informed consent was obtained from subjects before participation. A total of 64 students from Beijing Normal University participated voluntarily in our PGG experiments at the School of Mathematical Sciences Computer Lab, Beijing Normal University. We developed a computer program that is similar in function to the software z-Tree. Subjects interacted anonymously via computer screens for 20 rounds of the repeated game among the same four players. In CR experiments (32 subjects, eight groups, one session), subjects first play 10 rounds of the standard four-player PGG (CR1), and then play 10 rounds of the PGG with endogenous reward (CR2); in RC experiments (32 subjects, eight groups, one session), the order is reversed: Subjects first play 10 rounds of the PGG with endogenous reward (RC1), and then 10 rounds of the standard PGG (RC2). Before starting each experimental protocol, we explained (on the computer) the game to all participants, including the rules of the game and the feedback that they would receive about the history of play (10 min). Subjects were told that they would interact with the same three people for the whole experiment, and they knew that the number of rounds per protocol was fixed beforehand at 10. All players in a protocol were given the same instructions (in Chinese). The translation of the instructions can be found in SI Appendix. To help ensure that all participants fully understood the game, we implemented comprehension exercises (10 min) before starting each experimental protocol. There was no time limit for decision-making in the formal experiment, and, in practice, the whole experiment lasted about 60 min. After the experiment, the total monetary units each subject obtained in the experiment were converted to Chinese yuan at a ratio of 100:7.
This pay plus the 20-yuan show-up payment was the subject’s final income. The average income was 56.6 renminbi (minimum 43.7, maximum 65.2). In the standard PGG, each player has 20 monetary units, and an individual’s single-round expected payoff is when he/she contributes monetary units to the common pool and the average contribution of the group is . Assume a tax rate is levied on first-stage gross income. Then, is the total tax revenue to be redistributed in the second stage of endogenous reward. For our study, we set = 20%. In the reward stage, each player has 30 points, and decides how to distribute these points among the other three group members. The value of each point is . Thus, in the PGG with endogenous reward, player i’s expected payoff is if he/she received total points from other players.
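As a rough illustration of the two-stage payoff described above, the sketch below computes one round of the PGG with endogenous reward. The endowment (20), tax rate (20%), points per player (30), and the 100:7 conversion with a 20-yuan show-up payment come from the text. Two things are assumptions: the public-good `multiplier` is left as a parameter because the text as it stands does not state it (1.6 in the usage example is only illustrative), and each point is assumed to be worth the tax revenue divided by all points in circulation, which keeps the scheme budget balanced. All function names are hypothetical.

```python
ENDOWMENT = 20          # monetary units per player per round (from the text)
TAX_RATE = 0.20         # tax on first-stage gross income (from the text)
POINTS_PER_PLAYER = 30  # reward points each player hands out (from the text)


def gross_incomes(contributions, multiplier):
    """First-stage incomes in a standard PGG.

    `multiplier` scales the group's average contribution; its value is an
    ASSUMPTION supplied by the caller, not a number taken from the text.
    """
    avg = sum(contributions) / len(contributions)
    return [ENDOWMENT - c + multiplier * avg for c in contributions]


def reward_round_payoffs(contributions, allocations, multiplier,
                         tax_rate=TAX_RATE):
    """Payoffs after taxation and endogenous reward.

    allocations[j][i] is the number of points player j gives to player i.
    ASSUMPTIONS: every player hands out all points to the others, and one
    point is worth (total tax revenue) / (total points).
    """
    n = len(contributions)
    for j, row in enumerate(allocations):
        if row[j] != 0 or min(row) < 0 or sum(row) != POINTS_PER_PLAYER:
            raise ValueError(
                f"player {j} must give exactly {POINTS_PER_PLAYER} "
                "non-negative points to the other players"
            )
    gross = gross_incomes(contributions, multiplier)
    revenue = tax_rate * sum(gross)
    point_value = revenue / (n * POINTS_PER_PLAYER)
    received = [sum(allocations[j][i] for j in range(n)) for i in range(n)]
    return [(1 - tax_rate) * g + point_value * p
            for g, p in zip(gross, received)]


def final_income_yuan(total_units, show_up_fee=20.0):
    """Convert experimental units to yuan at 100:7 and add the show-up fee."""
    return total_units * 7 / 100 + show_up_fee


if __name__ == "__main__":
    contributions = [20, 10, 10, 0]
    allocations = [
        [0, 15, 15, 0],   # player 0 rewards the two middle contributors
        [30, 0, 0, 0],    # players 1-3 mostly reward the top contributor
        [30, 0, 0, 0],
        [20, 5, 5, 0],
    ]
    payoffs = reward_round_payoffs(contributions, allocations, multiplier=1.6)
    print([round(p, 2) for p in payoffs])
```

Under these assumptions the payoffs always sum to the group's gross income, so the reward stage only redistributes and needs no outside funding.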

Equilibrium Analysis.

It is well known that the standard PGG has a unique NE in which no player contributes. In the PGG with endogenous reward, the NE depends crucially on reward formation. Suppose that player j allocates points to player i, so that player i’s total reward points are the sum of what the other three players allocate to him/her. We can show that, in any symmetric NE, the individual contribution level is bounded above, with a maximum of 11.1 at the 20% tax rate used here. In the supplementary information, we prove the following folk-theorem-type result.
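Before the formal statement, the 11.1 bound mentioned above can be checked with a short deviation argument. This is only a sketch under assumptions that the text here does not confirm: the notation (contribution x, tax rate t, gross income π) is ours, the first-stage gross income is taken to be 20 minus the own contribution plus 1.6 times the group average, each point is worth the tax revenue divided by the 120 points in circulation, and a player who deviates to contributing nothing receives no reward points. It is not necessarily the authors' own proof.

```latex
\begin{align*}
\text{gross income (assumed):}\quad
  & \pi_i = 20 - x_i + 1.6\,\bar{x} \\
\text{all contribute } x\text{:}\quad
  & (1-t)(20 + 0.6x) + t(20 + 0.6x) = 20 + 0.6x \\
\text{deviate to } 0 \text{, no reward:}\quad
  & (1-t)\left(20 + 1.6\cdot\tfrac{3x}{4}\right) = (1-t)(20 + 1.2x) \\
\text{no profitable deviation:}\quad
  & 20 + 0.6x \ge (1-t)(20 + 1.2x)
    \iff x \le \frac{20t}{0.6 - 1.2t} \quad (t < 0.5) \\
t = 0.2\text{:}\quad
  & x \le \frac{4}{0.36} \approx 11.1
\end{align*}
```

With these assumed parameters the bound reproduces the 11.1 figure, which suggests the assumptions are at least consistent with the reported value.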

Proposition S1.

In the PGG with endogenous reward, there exists an anonymous and monotone reward function such that it constitutes a Nash equilibrium if every person contributes the prescribed amount in the first stage and assigns reward points according to this function in the second stage.

Note that any N-period sequence of NE of the base game constitutes a subgame perfect equilibrium (SPE) in the N-period finitely repeated game. The multiplicity of base-game NE can generate additional SPE, where the range of sustainable contribution sequences depends on . In fact, except for the last period where equilibrium action is bound by , everything can be consistent with SPE in our setup. The supplementary information has further discussion on SPE.

Now, for the purpose of data fitting, we restrict our further attention to a redistribution rule parameterized by r. For illustration, we consider the following three special cases: (i) the extreme case of equal shares for all other players, independent of their contribution; (ii) rewarding the highest contributor only; and (iii) rewarding proportionally to the contributions made. In the supplementary information, we calculate the pure strategy symmetric Nash equilibrium under rules i and iii and show that none exists under rule ii. We also estimate r in the reward assignment function for each player based on the least squares rule. The overall average is r = 2.33, with R² = 0.695. Note that the existence of a symmetric pure strategy NE is impossible for any . Furthermore, for the three types of player classification, cooperators are the most prosocial, with r = 2.50, followed by r = 2.07 and r = 1.36 for conformists and defectors, respectively. (Unclassified players have r = 2.90.)
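To make the three special cases and the least squares estimate of r more concrete, here is a minimal sketch. It assumes a power-form rule in which a player splits his/her 30 points among the others in proportion to their contributions raised to the power r; that functional form is an assumption chosen because it yields equal shares at r = 0, proportional shares at r = 1, and winner-takes-all as r grows without bound, and it may differ from the authors' exact specification. The grid search is a simple stand-in for whatever least squares routine was actually used, and the names `allocate_points` and `fit_r` are hypothetical.

```python
import math


def allocate_points(contribs_others, r, points=30.0):
    """Split `points` among the other players in proportion to c ** r.

    ASSUMPTION: power-form rule with r >= 0; r = math.inf rewards only the
    highest contributor(s), splitting equally in case of a tie.
    """
    if r < 0:
        raise ValueError("r must be non-negative")
    k = len(contribs_others)
    if math.isinf(r):
        top = max(contribs_others)
        winners = [c == top for c in contribs_others]
        return [points / sum(winners) if w else 0.0 for w in winners]
    weights = [c ** r for c in contribs_others]
    total = sum(weights)
    if total == 0:  # nobody contributed: fall back to equal shares
        return [points / k] * k
    return [points * w / total for w in weights]


def fit_r(observations, grid=None):
    """Least squares estimate of r by grid search.

    observations: list of (contribs_others, observed_points) pairs for one
    player. Returns the grid value minimising the sum of squared errors.
    """
    if grid is None:
        grid = [step / 100 for step in range(0, 501)]  # 0.00, 0.01, ..., 5.00

    def sse(r):
        return sum(
            (pred - obs) ** 2
            for contribs, observed in observations
            for pred, obs in zip(allocate_points(contribs, r), observed)
        )

    return min(grid, key=sse)


if __name__ == "__main__":
    others = [5, 10, 15]
    for r in (0, 1, 2.33, math.inf):
        print(r, [round(p, 2) for p in allocate_points(others, r)])
```

A larger fitted r means a player concentrates rewards more heavily on high contributors, which is how the type-specific estimates above can be read.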

24 in total

1. Retaliation and antisocial punishment are overlooked in many theoretical models as well as behavioral experiments.

Authors: Anna Dreber; David G Rand
Journal: Behav Brain Sci Date: 2012-02 Impact factor: 12.579

2. Incentives and opportunism: from the carrot to the stick.

Authors: Christian Hilbe; Karl Sigmund
Journal: Proc Biol Sci Date: 2010-04-07 Impact factor: 5.349

3. Conditional cooperation and costly monitoring explain success in forest commons management.

Authors: Devesh Rustagi; Stefanie Engel; Michael Kosfeld
Journal: Science Date: 2010-11-12 Impact factor: 47.728

Review 4. Evolution of indirect reciprocity.

Authors: Martin A Nowak; Karl Sigmund
Journal: Nature Date: 2005-10-27 Impact factor: 49.962

Review 5. On the interaction of the stick and the carrot in social dilemmas.

Authors: Manfred Milinski; Bettina Rockenbach
Journal: J Theor Biol Date: 2011-03-31 Impact factor: 2.691

6. Cooperating with the future.

Authors: Oliver P Hauser; David G Rand; Alexander Peysakhovich; Martin A Nowak
Journal: Nature Date: 2014-06-25 Impact factor: 49.962

7. Third-party punishment increases cooperation in children through (misaligned) expectations and conditional cooperation.

Authors: Philipp Lergetporer; Silvia Angerer; Daniela Glätzle-Rützler; Matthias Sutter
Journal: Proc Natl Acad Sci U S A Date: 2014-04-28 Impact factor: 11.205
