
Existence of equilibria in repeated games with long-run payoffs.

Galit Ashkenazi-Golan1, János Flesch2, Arkadi Predtetchinski3, Eilon Solan4.   

Abstract

Significance

Nash equilibrium, of central importance in strategic game theory, exists in all finite games. Here we prove that it exists also in all infinitely repeated games, with a finite or countably infinite set of players, in which the payoff function is bounded and measurable and the payoff depends only on what is played in the long run, i.e., not on what is played in any fixed finite number of stages. To this end we combine techniques from stochastic games with techniques from alternating-move games with Borel-measurable payoffs.

Keywords:  Nash equilibrium; countably many players; repeated games; tail-measurable payoffs

Year:  2022        PMID: 35259010      PMCID: PMC8931206          DOI: 10.1073/pnas.2105867119

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   12.779


Introduction

Nash equilibrium, of central importance in strategic game theory and its applications [Holt and Roth (1)], exists in all finite games [Nash (2)]. It exists also in some infinitely repeated (IR) games [Shapley (3) and Sorin (4)]; in others, only ε-equilibria (see the end of section 2.1) exist [Vieille (5, 6)]; and in others still, even these do not [Gale and Stewart (7)]. Here we prove that ε-equilibria exist in all IR games in which the payoff depends only on what happens in the long run, more precisely, whenever action streams that differ in only finitely many stages have the same payoff (technically, we require Borel measurability and tail measurability; see section 2.2). Moreover, we allow countably infinitely many players [Peleg (8) and Voorneveld (9)]. Thus, interestingly, even though one-shot games with countably infinitely many players and finite action sets may fail to have an ε-equilibrium, IR games with countably infinitely many players, finite action sets, and tail-measurable payoffs do have an ε-equilibrium. We remark that the existence of ε-equilibrium was not known even when the number of players in the game is finite.

The world is finite; infinite models capture those aspects of the finite world that are difficult to capture with finite models. For example, IR games capture the idea of a long repetition when it is not clear, or too costly to compute, how long the repetition is. Models with countably infinitely many players are used to capture interactions with many players, for example, models in which long-run players coexist with a sequence of short-run ones [Mailath and Samuelson (10)], overlapping generations models [Kandori and Obayashi (11)], large networks [Jackson (12)], and models with time-inconsistent decision makers [O'Donoghue and Rabin (13)].
IR games with countably infinitely many players model situations in which these two phenomena (long duration and many players) interact, for example, models in which a certain phenomenon occurring in a local area of a network propagates over time to the distant reaches of that network.

To prove the result we introduce a technique to the study of IR games with general payoff functions, which combines Martin's method (14) for Borel games and Blackwell games with a method developed by Solan and Vieille (15) for stochastic games.

The structure of the paper is as follows. In section 2, we introduce the model, state the existence result, and discuss an illustrative example. In section 3, we prove the main result. In section 4, we make some concluding remarks.

An IR game with countably many players is a triplet Γ = (I, (A_i)_{i∈I}, (f_i)_{i∈I}), where I is a countable set of players; A_i is a nonempty and finite action set for player i, for each i ∈ I (let A = ∏_{i∈I} A_i denote the set of action profiles, and let A^ℕ denote the set of plays, i.e., infinite streams of action profiles); and f_i : A^ℕ → ℝ is player i's payoff function, for each i ∈ I.

Model and Main Result

The Model.

The game is played in discrete time as follows. In each stage t ∈ ℕ = {0, 1, 2, …}, each player i ∈ I selects an action a_i^t ∈ A_i, simultaneously with the other players. This induces the action profile a^t = (a_i^t)_{i∈I} in stage t, which is observed by all players. Given the realized play p = (a^0, a^1, …), the payoff to each player i ∈ I is f_i(p).
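The stage-by-stage dynamics can be sketched in code. The snippet below is an illustration only (a finite truncation with two players and pure strategies; all names are hypothetical), showing how each stage's profile is appended to the publicly observed history.

```python
from typing import Callable, List, Tuple

History = Tuple[Tuple[int, ...], ...]   # a finite sequence of action profiles
Strategy = Callable[[History], int]     # a pure strategy: history -> own action

def simulate(strategies: List[Strategy], stages: int) -> History:
    """Play the first `stages` stages of the repeated game: at each stage all
    players move simultaneously, and the realized profile becomes public."""
    history: History = ()
    for _ in range(stages):
        profile = tuple(s(history) for s in strategies)
        history += (profile,)
    return history

# Two players: one always plays 0; the other copies her opponent's last action.
always_zero: Strategy = lambda h: 0
copy_cat: Strategy = lambda h: h[-1][0] if h else 1
play = simulate([always_zero, copy_cat], 4)   # ((0, 1), (0, 0), (0, 0), (0, 0))
```

In the actual model strategies may be mixed and the player set countably infinite; the truncation only illustrates the information structure.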

Histories.

A history is a finite sequence of elements of A, including the empty sequence ∅. The set of histories is denoted by H = ⋃_{t∈ℕ} A^t, where A^t is the t-fold product of A. The current stage (or length) of a history h is denoted by t(h). For a play p = (a^0, a^1, …) ∈ A^ℕ, write p^t = (a^0, …, a^t) to denote the prefix of p of length t + 1. For two histories h, g ∈ H and a play p ∈ A^ℕ, we denote the concatenation of h and g by hg and that of h and p by hp. Each history h induces a subgame of Γ. Given h ∈ H, the subgame that starts at h is the game Γ_h = (I, (A_i)_{i∈I}, (f_{i,h})_{i∈I}), where f_{i,h} : A^ℕ → ℝ is given by f_{i,h}(p) = f_i(hp).

Measure theoretic setup.

We endow the set A of action profiles, the sets A^t for t ∈ ℕ, and the set A^ℕ of plays with their product topologies, where the action set A_i of each player has its natural discrete topology. We denote the corresponding Borel sigma-algebras by F(A), F(A^t), and F(A^ℕ). A function that is measurable with respect to one of these sigma-algebras is called product-measurable.

Strategies.

A mixed action for player i is a probability measure x_i on A_i. The set of mixed actions for player i is denoted by X_i. A mixed action profile is a vector x = (x_i)_{i∈I} of mixed actions, one for each player. The set of mixed action profiles is denoted by X = ∏_{i∈I} X_i. A (behavior) strategy of player i is a function σ_i : H → X_i that satisfies the following measurability condition: for each stage t ∈ ℕ and each action a_i ∈ A_i, the mapping from A^t to [0, 1] given by h ↦ σ_i(h)(a_i) is product-measurable. We denote by Σ_i the set of strategies of player i. A strategy profile is a vector σ = (σ_i)_{i∈I} of strategies, one for each player. We denote by Σ = ∏_{i∈I} Σ_i the set of strategy profiles.

Strategy profiles with finite support.

A mixed action profile x is said to have finite support if only finitely many players randomize in x, i.e., all players except finitely many play a pure action with probability 1. We denote by X^fin the set of mixed action profiles with finite support. A strategy profile σ is said to have finite support (at each stage) if σ(h) ∈ X^fin for every history h ∈ H. The set of players who randomize may be history dependent, and its size need not be bounded. We denote by Σ^fin the set of strategy profiles with finite support.

Definitions for the opponents of a player.

For i ∈ I, let −i denote the set of her opponents, −i = I \ {i}. When restricting attention to the players in −i, we can similarly define mixed action profiles x_{−i}, strategy profiles σ_{−i}, mixed action profiles with finite support, and strategy profiles with finite support. The corresponding sets are denoted by X_{−i}, Σ_{−i}, X^fin_{−i}, and Σ^fin_{−i}, respectively.

Expected payoffs.

For each history h ∈ H, the mixed action profile σ(h) induces a unique probability measure on (A, F(A)) by Kolmogorov's extension theorem, and then σ induces a unique probability measure P_σ on (A^ℕ, F(A^ℕ)) by the Ionescu–Tulcea theorem. If player i's payoff function f_i is bounded and product-measurable, then player i's expected payoff under σ is E_σ[f_i]. In that case, player i's expected payoff under σ in the subgame starting at a history h is E_{σ_h}[f_{i,h}], where σ_h is the strategy profile defined by σ_h(g) = σ(hg) for each history g ∈ H.

Equilibrium.

Consider a game with countably many players and bounded and product-measurable payoffs. Let ε = (ε_i)_{i∈I}, where ε_i ≥ 0 for each player i ∈ I. A strategy profile σ* is called an ε-equilibrium if, for every player i ∈ I and every strategy σ_i ∈ Σ_i, we have E_{σ*}[f_i] ≥ E_{(σ_i, σ*_{−i})}[f_i] − ε_i.

Tail Measurability.

A set of plays B ⊆ A^ℕ is called tail if whenever p = (a^0, a^1, …) ∈ B and q = (b^0, b^1, …) ∈ A^ℕ satisfy a^t = b^t for every t sufficiently large, we have q ∈ B. Intuitively, a set is tail if changing finitely many coordinates does not change the membership relation for B. The tail sets of A^ℕ form a sigma-algebra, denoted by T. The sigma-algebras F(A^ℕ) and T are generally not related: neither of them includes the other [Rosenthal (16) and Blackwell and Diaconis (17)]. A function from A^ℕ to ℝ is called tail-measurable if it is measurable with respect to T. Various important evaluation functions in the literature on IR games are tail-measurable, for example, the limsup and the liminf of the average stage payoffs [e.g., Sorin (4), p. 73]. Various classical winning conditions in the computer science literature, such as the Büchi, co-Büchi, parity, Streett, and Müller conditions [e.g., Horn and Gimbert (18) and Chatterjee and Henzinger (19)], are also tail-measurable. The discounted payoff [e.g., Shapley (3)] is not tail-measurable.
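The tail property of the long-run average can be illustrated numerically: overwriting finitely many coordinates of a play leaves the limit of the averages unchanged. A minimal sketch on finite prefixes (the streams and horizon are arbitrary choices of this illustration):

```python
import itertools

def average_of_prefix(stream, t):
    """Average of the first t entries of an infinite 0/1 stream of stage payoffs."""
    return sum(itertools.islice(stream, t)) / t

# A play whose long-run average payoff is 1/3:
p = itertools.cycle([1, 0, 0])
avg_p = average_of_prefix(p, 30_000)

# The "same" play with its first five coordinates overwritten: the tail is
# untouched, so the long-run average (a tail-measurable quantity) is unchanged.
tail = itertools.islice(itertools.cycle([1, 0, 0]), 5, None)
q = itertools.chain([1, 1, 1, 1, 1], tail)
avg_q = average_of_prefix(q, 30_000)
# At this finite horizon the two averages differ only by the transient 3/30000.
```

The discounted payoff, by contrast, weights early stages heavily, which is why it is not tail-measurable.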

The Main Result.

We now present the main result of the paper.

Theorem 2.1: Consider an IR game with countably many players. If each player's payoff function is bounded, product-measurable, and tail-measurable, then the game admits an ε-equilibrium for each ε = (ε_i)_{i∈I} with ε_i > 0 for each i ∈ I.

We show the existence of an ε-equilibrium with a simple structure: The players are supposed to follow a play; that is, each player is supposed to play a fixed sequence of actions. If a player deviates, her opponents switch to a strategy profile with finite support to punish the deviator. Thus, it never happens that, on or off the equilibrium play, infinitely many players randomize at the same stage.

A 0-equilibrium does not always exist under the conditions of Theorem 2.1, not even in games with only one player (the other players can be treated as dummies). Indeed, suppose that this player has two actions, 0 and 1. For each play p = (a^0, a^1, …), let r(p) denote the limsup frequency of action 1: r(p) = limsup_{t→∞} (a^0 + ⋯ + a^{t−1})/t. Define the payoff to be r(p) if r(p) < 1 and 0 if r(p) = 1. This payoff function is tail-measurable, and the player can obtain a payoff arbitrarily close to 1, but the player cannot obtain 1. The example in section 2.4 illustrates the main ideas of the construction of ε-equilibria in the proof of Theorem 2.1.
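The one-player counterexample can be probed numerically on finite prefixes (an illustration; the periodic plays and horizon are arbitrary choices of this sketch):

```python
import itertools

def payoff(r):
    """Payoff of the one-player example: r if the limsup frequency r is
    strictly below 1, and 0 if r equals 1."""
    return r if r < 1 else 0.0

def frequency_of_ones(prefix):
    return sum(prefix) / len(prefix)

def periodic_prefix(ones, period, t):
    """First t actions of the periodic play with `ones` 1s per `period` stages."""
    pattern = [1] * ones + [0] * (period - ones)
    return list(itertools.islice(itertools.cycle(pattern), t))

freq = frequency_of_ones(periodic_prefix(9, 10, 10_000))   # frequency 9/10
# Frequencies arbitrarily close to 1 give payoffs arbitrarily close to 1,
# but the all-ones play has frequency exactly 1 and payoff 0.
```

The supremum payoff 1 is thus approachable but not attainable, which is exactly why only ε-equilibria, and not 0-equilibria, can be guaranteed.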

An Illustrative Example.

Voorneveld (9) studied the following one-shot game with countably infinitely many players, which is a variation of the example in Peleg (8) and in which each player would like to choose the action that is chosen by the minority of the players. The set of players is I = ℕ. The action set of each player is {0, 1}. Let B be the set of action profiles in which infinitely many players choose action 1:

B = {a ∈ A : a_j = 1 for infinitely many j ∈ I}.

The payoff r_i(a) of each player i is 1 if she chooses the minority action, namely, if either a ∈ B and a_i = 0 or a ∉ B and a_i = 1, and the payoff is 0 otherwise. Voorneveld (9) showed that this one-shot game has no ε-equilibrium when the values ε_i are not too large, for example, when the sum ∑_{i∈I} ε_i is sufficiently small. Indeed, suppose by way of contradiction that x = (x_i)_{i∈I} is an ε-equilibrium. The set B of action profiles is tail. Hence, by Kolmogorov's 0–1 law, either P_x(B) = 0 or P_x(B) = 1. In the former case, only finitely many players choose action 1 almost surely; then choosing action 0 gives payoff 0, while choosing action 1 gives payoff 1, and therefore the strategy of each player i must choose action 1 with probability close to 1. But then infinitely many players choose action 1 almost surely, which contradicts P_x(B) = 0. The argument is analogous for the latter case. Hence, x cannot be an ε-equilibrium.

Consider now the following repeated version Γ of Voorneveld's game: The set of players is I = ℕ. The action set of each player is {0, 1}. The payoff of each player i is equal to 1 if r_i(a^t) = 1 for all sufficiently large stages t, and it is equal to 0 otherwise, where r_i is player i's payoff in the one-shot Voorneveld game above. Thus, player i wins in Γ if she wins the one-shot Voorneveld game in all but finitely many stages. The payoff function of this game is product-measurable and tail-measurable, and hence the game satisfies the conditions of Theorem 2.1. Even though the one-shot Voorneveld game has no ε-equilibrium when the values ε_i are small, according to Theorem 2.1 the IR game Γ does have an ε-equilibrium for all such ε. In fact, the following strategy profile is a pure 0-equilibrium in the game Γ: fix an arbitrary default action d_j ∈ {0, 1} for each player j; let each player i play her default action in all stages t < i, and in each stage t ≥ i, let her play action 0 if the profile of default actions (d_j)_{j∈I} is an element of B and action 1 otherwise. Under this strategy profile, whether the stage-t action profile lies in B is determined by the default actions of the players j > t, and it lies in B exactly when (d_j)_{j∈I} does; hence each player i plays the minority action and wins the one-shot Voorneveld game in all stages t ≥ i, i.e., r_i(a^t) = 1 for each t ≥ i. Hence, in the game Γ each player's payoff is 1.
In the above 0-equilibrium, the players are supposed to follow a specific play of the game. Since each player's payoff is maximal under this play, no player has an incentive to deviate from it. The general construction shares one crucial feature with the example above: the ε-equilibrium play, denoted p, is pure. The key property of the play p is that it yields each player i at least her minmax value, possibly up to ε_i. The minmax value of a player is the highest expected payoff that the player can defend against any strategy profile of her opponents; see section 3.1 for a formal definition. If a player deviates from the play p, the other players start punishing the deviator at her minmax value. Showing that a play p with the desired property exists is the main challenge of the construction. It is to tackle this problem that Martin's (14) techniques and tools from stochastic games are called for. To apply the results of Martin (14), we cannot allow countably infinitely many players to randomize at the same stage. We therefore do not use the classical minmax value of player i, in which all opponents may randomize their actions. Rather, we define the concept of the finitistic minmax value of player i, which restricts the opponents of player i to strategy profiles with finite support.
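The equilibrium construction in the repeated Voorneveld game can be checked mechanically. The sketch below fixes all default actions to 1 (an assumption of this illustration, so that the default profile lies in B: infinitely many defaults equal 1) and verifies that every player i plays the minority action at every stage t ≥ i.

```python
def stage_action(i, t):
    """Action of player i at stage t under the proposed pure profile:
    the default action 1 before stage i, then the rule 'play 0 when the
    default profile lies in B'."""
    default_profile_in_B = True   # all defaults are 1, so infinitely many are 1
    if t < i:
        return 1                  # default action (fixed to 1 in this sketch)
    return 0 if default_profile_in_B else 1

def wins_stage(i, t):
    """At stage t the players j > t still play their default 1, so infinitely
    many players play 1; the minority action is therefore 0."""
    return stage_action(i, t) == 0

# Every player wins the stage game at every stage from her own index onward,
# hence receives payoff 1 in the repeated game.
ok = all(wins_stage(i, t) for i in range(50) for t in range(i, 200))
```

The reasoning about "infinitely many players" lives in the comments; the code only records its finite consequences for the active players.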

The Proof

In section 3.1 we introduce the notion of the finitistic minmax value, designed specifically for games with countably infinitely many players. In section 3.2 we establish a relation between the finitistic minmax value of the IR game and the finitistic minmax values of a collection of one-shot games, where each one-shot game is associated with a history in the IR game. In section 3.3 we use these one-shot games to derive the existence of a play in which each player's payoff is at least her finitistic minmax value, up to some error term, and complete the proof of Theorem 2.1.

The Finitistic Minmax Value.

In IR games it is customary to use threats of punishment at the minmax value to discipline players. In the presence of countably infinitely many players, when the payoff function is not product-measurable, the expected payoff and therefore the minmax value are not always well defined. In our proof we will use one-shot games where the payoff functions are not necessarily product-measurable; hence, we introduce a suitable related notion, the so-called finitistic minmax value. This notion imposes that all but finitely many of a player’s opponents play a pure strategy. As a consequence, it is well defined regardless of whether the payoff function is product-measurable or not.

The finitistic minmax value in one-shot games.

Consider a one-shot game G = (I, (A_i)_{i∈I}, (r_i)_{i∈I}), where I is a countable set of players and, for each i ∈ I, A_i is a nonempty and finite action set of player i and r_i : A → ℝ is her payoff function. Consider a player i ∈ I. If r_i is bounded and F(A)-measurable, then each mixed action profile x ∈ X induces an expected payoff E_x[r_i] for player i. In the game G, the (classical) minmax value of player i is defined as

val_i(G) = inf_{x_{−i}∈X_{−i}} sup_{x_i∈X_i} E_{(x_i,x_{−i})}[r_i].

The minmax value can be interpreted as the highest payoff that player i can defend against any mixed action profile of her opponents. Recall that X^fin_{−i} denotes the set of mixed action profiles x_{−i} such that x_j is pure for all but finitely many j ∈ −i. Each mixed action profile (x_i, x_{−i}) ∈ X_i × X^fin_{−i} induces a well-defined expectation E_{(x_i,x_{−i})}[r_i] for any bounded function r_i. The finitistic minmax value of player i in G is

val^fin_i(G) = inf_{x_{−i}∈X^fin_{−i}} sup_{x_i∈X_i} E_{(x_i,x_{−i})}[r_i].

Since A_i is finite, we can replace sup_{x_i∈X_i} by max_{a_i∈A_i}. The classical and the finitistic versions of the minmax value coincide for any function r_i that is bounded and F(A)-measurable. We note that in these two definitions we cannot replace the infimum by a minimum. Indeed, suppose that i = 0, and player 0 has a single action, while all other players have two actions: a and b. Define r_0 to be 1/n if player n > 0 is the only opponent who plays action b and to be 1 otherwise. Then player 0's minmax value and finitistic minmax value are equal to 0, but player 0's payoff is never 0.
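The non-attained infimum can be demonstrated directly (the payoff value 1/n on the relevant profiles is an illustrative reconstruction, consistent with the claim that the infimum is 0 while every realized payoff is positive):

```python
from fractions import Fraction

def r0(opponents_playing_b):
    """Player 0's payoff: 1/n when player n is the only opponent playing b,
    and 1 otherwise (the value 1/n is an assumption of this sketch)."""
    if len(opponents_playing_b) == 1:
        (n,) = opponents_playing_b
        return Fraction(1, n)
    return Fraction(1)

# Pure profiles in which exactly player n plays b drive player 0's payoff to 0 ...
values = [r0({n}) for n in range(1, 10_001)]
# ... yet the payoff is strictly positive at every single profile:
smallest = min(values)   # 1/10000 here; the infimum over all profiles is 0
```

Exact rational arithmetic (`Fraction`) makes the "positive but arbitrarily small" point unambiguous.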

The finitistic minmax value in IR games.

Let Γ be an IR game with countably many players. Assume that player i's payoff function f_i is bounded and product-measurable. Define the finitistic minmax value of player i in the IR game Γ as

w^fin_i(Γ) = inf_{σ_{−i}∈Σ^fin_{−i}} sup_{σ_i∈Σ_i} E_σ[f_i].

Note the distinction between val^fin and w^fin; the former is used for the finitistic minmax value in one-shot games, and the latter is used for the finitistic minmax value in IR games. Player i's finitistic minmax value in the subgame starting at a history h is w^fin_i(Γ_h). When a player's payoff function is tail-measurable, her payoff is independent of the initial segment of the play. As a result, her finitistic minmax value is independent of the history: let Γ be an IR game with countably many players, and assume that some player i's payoff function f_i is bounded, product-measurable, and tail-measurable. Then w^fin_i(Γ_h) = w^fin_i(Γ) for every history h ∈ H.

Defending the Finitistic Minmax Value.

We start with an auxiliary result. Let Γ = (I, (A_i)_{i∈I}, (f_i)_{i∈I}) be an IR game with countably many players. Fix a player i ∈ I. For a function d : H → ℝ and a history h ∈ H, we write d_h to denote the function a ↦ d(ha) from A to ℝ. This way, given a function d, we can associate the following one-shot game with each history h ∈ H: the action set of each player j ∈ I is A_j, and the payoff of player i under each action profile a ∈ A is d_h(a) = d(ha). The payoff functions of the other players are irrelevant.

Lemma: Let Γ be an IR game with countably many players. Consider a player i ∈ I and assume that her payoff function f_i is bounded and product-measurable. Suppose moreover that w < w^fin_i(Γ). Then there exists a function d : H → ℝ satisfying the following conditions: (a) at the empty history, d(∅) ≥ w; (b) for every history h ∈ H and every x_{−i} ∈ X^fin_{−i}, there exists x_i ∈ X_i such that E_{(x_i, x_{−i})}[d_h] ≥ d(h); (c) f_i(p) ≥ limsup_{t→∞} d(p^t) for every play p ∈ A^ℕ.

Intuitively, the function d has the following properties. Condition (a) is the initialization. Condition (b) requires that at any history h, if the opponents of player i use a mixed action profile with finite support, then player i has a response x_i such that the expectation of the function d_h under the resulting mixed action profile is at least d(h). In other words, player i is able to defend the payoff d(h) in the one-shot game at h against finitistic mixed action profiles of the other players. Condition (c) states that the actual payoff is at least the limsup of the sequence of values under d. Before proving the lemma, we provide two examples.

Consider the IR game in section 2.4 and fix a player i. Define d recursively: let d(∅) = 1, and for a history h with d(h) defined, set d(ha) = d(h) for each a ∈ A such that t(h) < i or r_i(a) = 1, and set d(ha) = 0 for each a ∈ A such that t(h) ≥ i and r_i(a) = 0. The function d satisfies (b) of the lemma: when only finitely many opponents randomize, whether infinitely many players choose action 1 is determined by the pure actions, so player i can respond with the minority action and win the stage. It satisfies (a) since d(∅) = 1 = w^fin_i(Γ) ≥ w. To see that it satisfies (c), suppose that f_i(p) = 0 for a play p. Then r_i(a^t) = 0 for some stage t ≥ i, and therefore d(p^s) = 0 for all stages s ≥ t + 1. ◊

Let Γ be the following two-player zero-sum IR game: The action set is {T, B} for player 1 and {L, R} for player 2 (and all other players are dummies). The payoff function f_1 assigns payoff 1 to each play in which (T, L) or (B, R) is played infinitely many times and assigns payoff 0 to all other plays.
One can think of Γ as a repeated matching pennies game in which player 1 wins if she receives payoff 1 infinitely often. The function f_1 is tail-measurable, and w^fin_1(Γ_h) = 1 for each history h. Moreover, one strategy of player 1 that guarantees the payoff 1 is to play the mixed action ½T + ½B at each stage. Let α and β be the functions α(x) = x(2 − x) and β(x) = x². Define d recursively: fix w < 1 and let d(∅) = w. If d(h) has been defined, let d(ha) = α(d(h)) if a ∈ {(T, L), (B, R)} and d(ha) = β(d(h)) otherwise. Since α(x) + β(x) = 2x, the value of the one-shot game at h under the mixed action ½T + ½B is exactly d(h). Also, if p is a play in which both (T, L) and (B, R) are played only finitely many times, then d(p^t) is monotonically decreasing after some stage and approaches 0. Thus, the function d satisfies the conditions of the lemma for i = 1. ◊

The proof of the lemma is largely based on the arguments in Martin (14) and Maitra and Sudderth (20). These arguments involve an auxiliary zero-sum perfect-information game in which a winning strategy of the first player produces the desired function d. We provide a sketch of the argument, noting which modifications to Martin's and Maitra and Sudderth's proofs are required, but omit the details. Assume, without loss of generality by an affine transformation, that f_i takes values in [0, 1]. Given a game Γ, a player i satisfying the conditions of the lemma, and a number w < w^fin_i(Γ), consider the following perfect-information game, denoted G^i(w): Player I chooses a one-shot payoff function r_0 : A → [0, 1] such that val^fin_i(r_0) ≥ w. Player II chooses an action profile a_0 ∈ A such that r_0(a_0) ≥ w. Player I chooses a one-shot payoff function r_1 such that val^fin_i(r_1) ≥ r_0(a_0). Player II chooses an action profile a_1 such that r_1(a_1) ≥ r_0(a_0). Player I chooses a one-shot payoff function r_2 such that val^fin_i(r_2) ≥ r_1(a_1), and so on. Here val^fin_i(r) denotes the finitistic minmax value of player i in the one-shot game with payoff function r. To distinguish between histories and plays in Γ and in G^i(w), we call the latter positions and runs. The outcome of G^i(w) is a run (r_0, a_0, r_1, a_1, …). Player I wins if f_i(a_0, a_1, …) ≥ limsup_{t→∞} r_t(a_t). Note that player I has a legal move at any stage of the game, for instance, the function r such that r(a) = 1 for all a ∈ A. By induction one can show that player II always has a legal move as well. Let T be the set of all positions in the game G^i(w). Then T is a pruned tree on the set A ∪ R, where R denotes the set of functions r : A → [0, 1].
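The recursion in the matching-pennies example can be realized, for instance, by the update maps α(x) = x(2 − x) after (T, L) or (B, R) and β(x) = x² after the other two profiles; this is one concrete choice satisfying the stated properties (the maps are assumptions of this sketch). A quick numerical check:

```python
def alpha(x):
    """Update after (T, L) or (B, R): a stage 'won' by player 1."""
    return x * (2 - x)

def beta(x):
    """Update after (T, R) or (B, L): a stage 'lost' by player 1."""
    return x * x

# The half-half mix of player 1 defends the current value exactly,
# since the average of the two updates is the identity.
grid = [k / 100 for k in range(101)]
defends = all(abs((alpha(x) + beta(x)) / 2 - x) < 1e-12 for x in grid)

# After losses only, the value decreases monotonically toward 0 (from w = 0.9),
# matching condition (c): on plays with finitely many wins, limsup d = 0.
v, trace = 0.9, []
for _ in range(30):
    v = beta(v)
    trace.append(v)
```

Any pair of maps averaging to the identity, bounded by 1, and with vanishing loss-iterates would serve equally well.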
One can check that player I's winning set in G^i(w) is a Borel subset of the set of runs, where the set of runs is endowed with the product topology and A ∪ R is endowed with the discrete topology. By Martin (21), G^i(w) is determined: either player I has a winning strategy in the game G^i(w) or player II does. We next establish the following fact: player I has a winning strategy in G^i(w) whenever w < w^fin_i(Γ).

Fix w such that w < w^fin_i(Γ) and suppose that player I has no winning strategy in G^i(w). Then player II does. Fix a winning strategy τ of player II and a number δ > 0 with w + δ < w^fin_i(Γ). We spell out the key definitions and give a sketch of the steps of the proof, which follows closely the arguments of Maitra and Sudderth (ref. 20, lemmata 4.3 and 4.4). One defines a notion of consistency (with respect to τ and δ) of a history of the IR game Γ and, simultaneously, a function ψ mapping each consistent history of Γ to a position of player I in G^i(w) of the same length. The definition is recursive in the length of the history. The empty history of Γ is consistent; we define ψ(∅) to be the empty position of G^i(w). Take an action profile a ∈ A and a consistent history h for which player I's position ψ(h) has been defined, and consider the history ha in Γ. Let R(h, a) be the set of player I's legal moves r at position ψ(h) in G^i(w) to which τ responds with a. If R(h, a) is not empty, ha is defined to be consistent. In this case, ψ(ha) is defined by extending ψ(h) with a move r(h, a) ∈ R(h, a), chosen so that the value r(h, a)(a) is within a small margin (controlled by δ and the length of h) of the supremum of r(a) over r ∈ R(h, a), followed by player II's reply a. One then shows that the values r(h, a)(a) along consistent histories stay below w plus an error term controlled by δ. The proof of this claim is by contradiction: assuming that it fails at some consistent history, we construct a corresponding legal move for player I, argue that the move is legal, consider player II's reaction a as prescribed by τ, and obtain a contradiction to the near-optimal choice of the moves r(h, ·).

The final step is to obtain a contradiction by showing that player i cannot defend more than w + δ, even though w + δ < w^fin_i(Γ). To this end, define a process W on A^ℕ as follows: let W_0 = w, and for each t ≥ 1, let W_t be the value r(h, a)(a) attached to the stage-t prefix ha of the play whenever that prefix is consistent, and W_t = W_{t−1} otherwise. This defines a process W = (W_t)_{t∈ℕ} adapted to the filtration (F(A^t))_{t∈ℕ}. From the claim above and the fact that τ is winning, one deduces that f_i < limsup_{t→∞} W_t everywhere on the set of plays all of whose prefixes are consistent. Finally, define a strategy profile σ_{−i} with finite support for i's opponents in Γ.
For each history h of Γ, the mixed action profile σ_{−i}(h) is chosen with finite support so that, against every action of player i, the conditional expectation of the next value of W exceeds the current value by at most a small error term. At this point, a delicate issue is the measurability property of strategies. Recall that for each stage t, player j, and action a_j, the mapping h ↦ σ_j(h)(a_j) should be product-measurable. Since we have chosen σ_{−i}(h) to be finitely supported, the number of histories that occur with positive probability at each stage under σ_{−i} is finite. Since the definition of the strategy at histories that occur with probability 0 is irrelevant, we can set the action chosen at these histories to be an arbitrary fixed action, thus ensuring that the measurability property is satisfied. The process W is then a supermartingale with respect to the measure P_{(σ_i, σ_{−i})} for each strategy σ_i ∈ Σ_i of player i. By the martingale convergence theorem, it follows that E_{(σ_i, σ_{−i})}[f_i] is at most roughly w + δ for each σ_i ∈ Σ_i. This inequality contradicts the choice of δ, by which w + δ < w^fin_i(Γ). □

Now, using this fact, we complete the proof of the lemma. Take w < w^fin_i(Γ) and let τ* be a winning strategy of player I in G^i(w). The strategy τ* induces the function d with the desired properties, as follows. Define d(∅) = w. Let C denote the set of positions in the game G^i(w) of even length (i.e., player I's positions) that are consistent with τ*. Let H* denote the set of histories of Γ consisting of the empty history ∅ and the nonempty histories a_0 ⋯ a_t for which there is a sequence r_0, …, r_t of elements of R such that (r_0, a_0, …, r_t, a_t) is an element of C. If h ∉ H*, define d(ha) = 0 for each a ∈ A. If h ∈ H*, let r be player I's move under τ* at the position corresponding to h, and let d(ha) = r(a) for each a ∈ A. Since r is determined by the position, d is uniquely determined by τ*. The reader can verify that d satisfies conditions (a)–(c). □

Using this lemma, we derive the existence of a function with further desirable properties: Let Γ be an IR game with countably many players. Consider a player i ∈ I and assume that her payoff function f_i is bounded and product-measurable. Then, for every ε > 0, there exists a function d : H → ℝ satisfying the following conditions. (M.1) For every h ∈ H, the finitistic minmax value of player i in the one-shot game with payoff d_h is at least w^fin_i(Γ_h) − ε. (M.2) For every h ∈ H and every x_{−i} ∈ X^fin_{−i}, there exists x_i ∈ X_i such that E_{(x_i, x_{−i})}[d_h] ≥ d(h). (M.3) For every strategy profile σ with finite support such that E_{σ(h)}[d_h] ≥ d(h) for every h ∈ H, we have E_{σ_h}[f_{i,h}] ≥ d(h) for every h ∈ H. Condition M.2 is identical to condition (b) of the lemma above.
Condition M.1 requires that at each history h, the finitistic minmax value of the one-shot game with payoff d_h is not too low: it should be at least w^fin_i(Γ_h) − ε. Condition M.3 links the local conditions on the one-shot games with a global condition on the IR game: under any strategy profile with finite support, if in each one-shot game player i's expected payoff under the mixed actions played at h is at least d(h), then in the IR game, for each history h, player i's expected payoff in the subgame at h is at least d(h) as well.

The function d defined in the first example above violates M.1, since there are histories h at which the one-shot payoff function d_h is identically zero, whereas w^fin_i(Γ_h) = w^fin_i(Γ) = 1 for each history h. Consider instead, in the game of section 2.4, the function d defined by d(∅) = 1 and d(ha) = r_i(a) for each h ∈ H and a ∈ A. Thus, d_h = r_i for every h, and so the finitistic minmax value of the one-shot game with payoff d_h equals val^fin_i(r_i) = 1 for each h. Conditions M.1, M.2, and M.3 are all satisfied. To verify M.3, consider a strategy profile σ satisfying the hypothesis of M.3. Then, for each h with d(h) = 1, the mixed action profile σ(h) is supported by the set of action profiles a such that r_i(a) = 1. This implies that P_{σ_h} is supported by the set of plays in which player i wins every stage. The function d does not, however, satisfy condition (c) of the lemma. Consider, for instance, the play p in which at each stage all the players in −i play 0, while player i plays 0 at even stages and 1 at odd stages. Then i's reward is 0 at even stages and 1 at odd stages, implying that limsup_t d(p^t) = 1, while f_i(p) = 0. In fact, one can show that in this example there is no function d that simultaneously satisfies conditions M.2 and M.3 and condition (c) of the lemma. ◊

The function d constructed in the lemma can in turn fail to satisfy M.1: we have seen, in the matching pennies example, that the values d(h) can be arbitrarily close to zero. Fix ε > 0, and consider instead the function d̃ defined recursively as follows. Let d̃(∅) = 1 − ε. Take h ∈ H such that d̃(h) has been defined. If d̃(h) ≥ 1 − ε, define d̃_h by the recursion above, with base d̃(h); if d̃(h) < 1 − ε, define d̃_h by restarting the recursion from the base 1 − ε/2. Note that in the first case the one-shot value of d̃_h is d̃(h) ≥ 1 − ε, and in the second case it is 1 − ε/2 ≥ 1 − ε. This shows that both conditions M.1 and M.2 are satisfied. To verify M.3, let σ satisfy the hypothesis of M.3. Then, at each history h with d̃(h) ≥ 1 − ε, the measure σ(h) places probability at least ½ on the set {(T, L), (B, R)}, and such histories occur infinitely often along every play. This implies that (T, L) or (B, R) occurs at infinitely many stages almost surely with respect to the measure P_σ. Thus, E_{σ_h}[f_{1,h}] = 1 ≥ d̃(h). ◊

Fix a player i ∈ I and ε > 0.

The function d^h.

Consider an arbitrary history h ∈ H. Let H_h denote the set of all histories that extend h (including h itself). We define a function d^h : H_h → ℝ as follows. Suppose first that w^fin_i(Γ) − ε > 0. Apply the lemma of section 3.2 to the subgame Γ_h and to ε. This yields a function d^h with the following properties: at the history h, d^h(h) ≥ w^fin_i(Γ_h) − ε; for every g ∈ H_h and every x_{−i} ∈ X^fin_{−i}, there exists x_i ∈ X_i such that E_{(x_i, x_{−i})}[d^h_g] ≥ d^h(g), where the symbol d^h_g stands for the function d^h as defined in section 3.2, with Γ_h in place of Γ; and the analogue of condition M.3 holds for every play p that extends h. If w^fin_i(Γ) − ε ≤ 0, then we let d^h be the constant zero function. This definition satisfies the first two properties, whereas the third has to be replaced by its analogue with 0 in place of d^h(g).

The function d.

We recursively define a function d : H → ℝ and, simultaneously, an auxiliary function φ, which assigns to each history h a prefix of h. At the empty history, set φ(∅) = ∅ and d(∅) = d^∅(∅). Consider a history h ≠ ∅, and suppose that φ(g) and d(g) are already defined for each strict prefix g of h. Let h− denote the prefix of h satisfying t(h−) = t(h) − 1. Set d(h) = d^{φ(h−)}(h). If d(h) ≥ w^fin_i(Γ) − 2ε, set φ(h) = φ(h−). Otherwise, set φ(h) = h; in this case, we say that d reinitiates at h. The intuition behind the definition of the function d is the following. We start at ∅ with the function d^∅ and stick with it until we encounter a history h_1 at which d^∅(h_1) drops below w^fin_i(Γ) − 2ε. From this point on, the function d^∅ is no longer useful, and reinitiation occurs, which takes effect from the next stage, t(h_1) + 1. As of that stage, d follows the function d^{h_1}, until encountering a history h_2 at which d^{h_1}(h_2) drops below w^fin_i(Γ) − 2ε. As of stage t(h_2) + 1, d follows d^{h_2}, and so on. We show that d satisfies conditions M.1–M.3.
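The reinitiation bookkeeping can be sketched as follows. The per-history functions d^g are mocked by exogenous increments, and the trigger threshold w^fin − 2ε is an assumption of this sketch; only the anchor-and-reset logic mirrors the construction.

```python
W_FIN = 0.9   # finitistic minmax value of every subgame (history independent)
EPS = 0.1     # reinitiate when the value drops below W_FIN - 2*EPS (assumed rule)

def mock_d(anchor, t, increments):
    """Mocked value at stage t of the function anchored at stage `anchor`:
    it starts at W_FIN - EPS and then follows the exogenous increments."""
    v = W_FIN - EPS
    for s in range(anchor + 1, t + 1):
        v += increments[s]
    return v

def run(increments):
    """Follow d along one play; reset the anchor whenever d reinitiates."""
    anchor, trace = 0, []
    for t in range(len(increments)):
        v = mock_d(anchor, t, increments)
        reinitiates = v < W_FIN - 2 * EPS
        trace.append((t, round(v, 3), reinitiates))
        if reinitiates:
            anchor = t   # from the next stage on, d follows the fresh function
    return trace

trace = run([0.0, -0.05, -0.08, 0.0])
```

After a drop below the threshold, the fresh function restarts near W_FIN − EPS, which is the upward jump that makes the expected number of reinitiations bounded.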

Verifying Conditions M.1 and M.2.

For each history h ∈ H and each action profile a ∈ A, we have d(ha) = d^{φ(h)}(ha), so the one-shot game at h with payoff d_h coincides with the one-shot game with payoff d^{φ(h)}_h. For the empty history ∅, the properties of d^∅ give d(∅) = d^∅(∅) ≥ w^fin_i(Γ) − ε, which proves conditions M.1 and M.2 for ∅. Consider now a history h at which d does not reinitiate. Since φ(h) = φ(h−) and d(h) = d^{φ(h)}(h) ≥ w^fin_i(Γ) − 2ε, the properties of d^{φ(h)} prove conditions M.1 and M.2 for h (with 2ε in place of ε). Finally, consider a history h at which d reinitiates. Then φ(h) = h and d_h = d^h_h, so by the properties of d^h, and since w^fin_i(Γ_h) = w^fin_i(Γ), player i can defend d^h(h) ≥ w^fin_i(Γ) − ε > d(h) at h, which proves conditions M.1 and M.2 for h.

Verifying Condition M.3.

Assume that a strategy profile σ with finite support satisfies the hypothesis of M.3. The hypothesis states that E_{σ(h)}[d_h] ≥ d(h) for each history h ∈ H, and thus the process (d(p^t))_{t∈ℕ} is a bounded submartingale under σ (with respect to the usual filtration on A^ℕ). Hence, in particular, d(p^t) converges with probability 1 under σ. Notice also that if d reinitiates at a history h, then d(h) < w^fin_i(Γ) − 2ε, while the continuation values under d^h are at least w^fin_i(Γ) − ε; thus each reinitiation is followed by an expected increase of d of at least ε. Since a bounded submartingale admits only a bounded expected number of upward jumps of a fixed size, under σ the expected number of reinitiations is bounded in any subgame. In particular, with probability 1 under σ, only finitely many reinitiations occur in the subgame at every history g.

Fix a history g and ρ > 0. We will show that E_{σ_g}[f_{i,g}] ≥ d(g) − ρ. Let T ≥ t(g) be a sufficiently large stage such that, with probability at least 1 − ρ under σ_g, no reinitiation occurs after stage T. On that event, from stage T on, d follows a single function d^h, and the analogue of condition M.3 for the subgame applies. Hence, E_{σ_g}[f_{i,g}] ≥ d(g) − ρ, up to a term vanishing with ρ, where the last inequality follows from the submartingale property of d. Since ρ is arbitrary, condition M.3 holds as well.

In the proof, the function d is updated whenever its value drops too much, that is, whenever its continuation can no longer serve to show that the value of the current subgame is not far below w^fin_i(Γ). This idea is reminiscent of, e.g., Rosenberg and Vieille (22), where, to construct an optimal strategy, one follows a discounted optimal strategy until a subgame is reached where the discounted value is far below the undiscounted value and then switches to a discounted optimal strategy with a higher discount factor, for which the discounted value is closer to the undiscounted value.

Proof of Theorem 2.1.

In this section we complete the proof of Theorem 2.1. By applying a proper affine transformation to the payoff function f_i of each player i and adjusting ε_i accordingly, we can assume without loss of generality that f_i takes values in [0, 1]. The gist of the argument is as follows. For each player i ∈ I, let d^i be a function satisfying conditions M.1–M.3 of section 3.2, with a sufficiently small ε compared to ε_i, and fix arbitrarily some default action â_i ∈ A_i. We define, for each history h ∈ H, a one-shot game G(h). This is a game with finitely many players: if t(h) = t, then in G(h) only the players i ≤ t are active; the other players are assumed to be playing their respective default actions. The payoff to an active player i is given by the function d^i_h. Nash equilibria of the games G(h) satisfy the hypothesis of condition M.3 for every player i ∈ I. By conditions M.2 and M.3, the corresponding strategy profile in the IR game Γ gives each player i, almost surely, a payoff of at least w^fin_i(Γ) − ε_i. This in turn implies that there exists a play p with a payoff of at least w^fin_i(Γ) − ε_i for each player i ∈ I. Such a play yields an ε-equilibrium in Γ: if a player i deviates from this play, then she gets punished; i.e., her opponents switch to a strategy profile with finite support which ensures that player i's payoff is not much above w^fin_i(Γ).

The one-shot game G(h).

Fix a history h, and denote t = t(h). Let G(h) be the one-shot game in which the set of players is I(h) = {i ∈ I : i ≤ t}; the action set of each player i ∈ I(h) is A_i; and the payoff function of each player i ∈ I(h) is the function a ↦ d^i(ha*), where the profile a* agrees with a on the active players and assigns the default action â_j to each inactive player j ∉ I(h). By Nash (2), the one-shot game G(h) has an equilibrium x(h) in mixed actions. Since each player i ∈ I(h) is active in G(h) and only finitely many of her opponents randomize, the equilibrium payoff E_{x(h)}[d^i_h] is at least the finitistic minmax value of player i in the one-shot game with payoff d^i_h.
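Existence for the finite games G(h) is Nash's theorem. For a 2 × 2 zero-sum illustration the equilibrium mix can even be computed in closed form from the indifference condition (the matrix below is an arbitrary example of this sketch, not one of the games G(h)):

```python
# Matching pennies: payoffs of the row player; the column player gets the negative.
M = [[1.0, -1.0],
     [-1.0, 1.0]]

def equalizing_mix(M):
    """Probability q on the first column making the row player indifferent:
    q*M[0][0] + (1-q)*M[0][1] = q*M[1][0] + (1-q)*M[1][1]."""
    num = M[1][1] - M[0][1]
    den = M[0][0] - M[0][1] - M[1][0] + M[1][1]
    return num / den

q = equalizing_mix(M)                       # 0.5: the column player mixes evenly
row_top = q * M[0][0] + (1 - q) * M[0][1]   # expected payoff of the top row
row_bot = q * M[1][0] + (1 - q) * M[1][1]   # expected payoff of the bottom row
```

In general finite games no closed form exists, but Nash (2) guarantees an equilibrium in mixed actions, which is all the construction needs.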

The play p.

For t ∈ ℕ, define the set H_t ⊆ A^t as follows. Let H_0 consist of the empty history ∅, and for t ≥ 1, let H_t be the set of histories h = (a^0, …, a^{t−1}) ∈ A^t such that, for each stage s < t, the action profile a^s assigns the default action â_j to every player j who is inactive at stage s. This is the set of histories that may result if the inactive players stick to their default actions. The set H_t is finite for each t ∈ ℕ. For each player i ∈ I, let σ*_i be the following strategy (compare with equation 9 in ref. 15): for each history h, let σ*_i(h) be the equilibrium mixed action x_i(h) of G(h) if h ∈ H_{t(h)} and player i is active at h, and let σ*_i(h) be the default action â_i otherwise. The function σ*_i thus defined is a strategy; i.e., the function h ↦ σ*_i(h)(a_i) is F(A^t)-measurable for every t ∈ ℕ and every a_i ∈ A_i. Indeed, this function coincides with h ↦ x_i(h)(a_i) on the finite set of histories H_t, and it is constant on A^t \ H_t. The resulting strategy profile σ* is an element of Σ^fin. Consider any player i ∈ I. For every history h ∈ H_t with t ≥ i, we have, by the equilibrium property of x(h) and condition M.2, E_{σ*(h)}[d^i_h] ≥ d^i(h). By conditions M.1 and M.3 we deduce that, for every history h ∈ H_t with t ≥ i, E_{σ*_h}[f_{i,h}] ≥ d^i(h) ≥ w^fin_i(Γ) − ε_i. Since this holds for all histories beyond stage i, Lévy's 0–1 law implies that f_i(p) ≥ w^fin_i(Γ) − ε_i almost surely under P_{σ*}. Since the set I of players is countable, it follows that there exists a play p = (a^0, a^1, …) such that f_i(p) ≥ w^fin_i(Γ) − ε_i for every player i ∈ I.

The -equilibrium in Γ.

Based on the play p, the following grim-trigger strategy profile is an ε-equilibrium: The players follow the play p. If some player i deviates from p at stage t, then from stage t + 1 on, all other players punish player i at her finitistic minmax value; that is, they switch to a strategy profile with finite support that brings player i's payoff in the subgame down to approximately w^fin_i(Γ). Such a punishing profile exists by the definition of the finitistic minmax value, and since the play p yields player i at least w^fin_i(Γ) − ε_i, no deviation is profitable by more than ε_i.
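The grim-trigger shape of the equilibrium can be sketched as follows (an illustration: the punishment is mocked by a fixed action rather than a genuine minmax strategy profile, and the plan is an arbitrary finite prefix of a pure play):

```python
def make_grim_trigger(plan, punish_action, player):
    """Strategy of `player`: follow the pure play `plan` profile by profile;
    after the first observed deviation from `plan`, play `punish_action` forever."""
    def strategy(history):
        for t, profile in enumerate(history):
            if profile != plan[t]:
                return punish_action   # punishment phase, from stage t+1 on
        return plan[len(history)][player]
    return strategy

plan = [(0, 0), (1, 1), (0, 0), (1, 1)]
s0 = make_grim_trigger(plan, punish_action=9, player=0)
s1 = make_grim_trigger(plan, punish_action=9, player=1)

on_path = s1([(0, 0)])            # no deviation so far: follow the plan
after_dev = s0([(0, 0), (1, 0)])  # deviation at stage 1 triggers punishment
```

Because deviations are detected from the publicly observed profiles, the trigger needs no communication among the punishing players.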

Discussion

In our proof, the tail measurability of the payoff functions was used in two arguments: first, in section 3.1, to conclude that the finitistic minmax value of a player in a subgame is independent of the history, and second, in the proof of Theorem 2.1, to ensure that each player i can play arbitrarily during the first i stages without affecting her overall payoff. When the payoffs are not tail-measurable, an ε-equilibrium need not exist; Voorneveld's one-shot game, when viewed as an IR game in which only the stage 0 action profile matters, provides an illustration.

Games with Finitely Many Players.

Our model encompasses games with finitely many players, as we can assume that only finitely many players have more than one action. The existence of ε-equilibrium was not known even in this setup. Note that even in this case, the tail sigma-algebra T is unrelated to the Borel sigma-algebra F(A^ℕ).

Ongoing Interaction.

Conceptually, the current framework differs from the traditional framework of repeated games [Sorin (4)]. In the latter, one specific game—for example, the prisoner's dilemma [see, e.g., Fudenberg and Tirole (23), p. 9]—is played repeatedly, and the players are paid off at each stage. The long-run payoff is then defined as some kind of average of the stage payoffs. Here, on the other hand, stage payoffs are not explicitly defined; instead, the payoffs depend on the entire sequence of action profiles, as in Gale and Stewart (7). One could still think of stages as representing stage games. These stage games may differ from each other and may evolve over time in a complex, history-dependent fashion; perhaps a prisoner's dilemma at one stage, a stag hunt (see, e.g., ref. 23, p. 20) at the next, a battle of the sexes (see, e.g., ref. 23, p. 18) after that, and so on. The stage payoffs, which are not explicit, may be thought of as being implicit in the long-run payoffs. The long-run payoff may capture players' goals that cannot be described by an average of stage payoffs; for example, a player may care about the long-run average payoff as well as the long-run average stage-to-stage fluctuation. The theory of repeated games yields insights into phenomena such as altruism, cooperation, trust, loyalty, threats, and revenge [Sorin (4) and Aumann (24)], showing how they arise from the ongoing nature of the interaction. It takes a leap of the imagination to extend these insights to the real world, where people interact in different ways over time and their goals are complex. The current work formalizes this leap.

1.  Equilibrium Points in N-Person Games.

Authors:  J F Nash
Journal:  Proc Natl Acad Sci U S A       Date:  1950-01       Impact factor: 11.205

2.  War and peace.

Authors:  Robert J Aumann
Journal:  Proc Natl Acad Sci U S A       Date:  2006-11-08       Impact factor: 11.205

3.  Labor union members play an OLG repeated game.

Authors:  Michihiro Kandori; Shinya Obayashi
Journal:  Proc Natl Acad Sci U S A       Date:  2014-07-14       Impact factor: 11.205

4.  Stochastic Games.

Authors:  L S Shapley
Journal:  Proc Natl Acad Sci U S A       Date:  1953-10       Impact factor: 11.205

5.  The Nash equilibrium: a perspective.

Authors:  Charles A Holt; Alvin E Roth
Journal:  Proc Natl Acad Sci U S A       Date:  2004-03-15       Impact factor: 12.779

