## Notes

1 This argument makes up Chapter 6 of Gauthier, David, Morals by Agreement (Oxford: Clarendon Press, 1986). Hereafter *MA*; page references are to this text.

2 For criticism on this point, see MacIntosh, Duncan, “Two Gauthiers?” Dialogue, 28, 1 (January 1989): 43–61.

3 See Sobel, Jordan Howard, “Straight versus Constrained Maximization,” Canadian Journal of Philosophy, 23, 1 (March 1993): 25–54.

4 See Smith, Holly, “Deriving Morality from Rationality,” and Danielson, Peter, “Closing the Compliance Dilemma: How It's Possible to Be Moral in a Lamarckian World,” both in Vallentyne, Peter, ed., Contractarianism and Rational Choice (Cambridge: Cambridge University Press, 1991).

5 This point was made by Mendola, Joseph, “Gauthier's Morals by Agreement and Two Kinds of Rationality,” Ethics, 97, 4 (July 1987): 765–74, and Binmore, Ken, “Bargaining and Morality,” in Gauthier, David and Sugden, Robert, eds., Rationality, Justice and the Social Contract (Hemel Hempstead: Harvester Wheatsheaf, 1993).

6 While these problems are not decisive, they should serve as a warning against the very widespread tendency to overestimate the flexibility of game theoretic models. It is precisely because of this tendency that I consider it necessary to insist upon fairly close adherence to standard game theoretic formalizations whenever possible.

7 In order to keep the various lines of argument straight, I will use the term “straightforward maximizer” to refer to Gauthier's characterization of the standard rational choice agent, and “standard maximizer” to refer to my own.

8 See Gauthier, David, “Bargaining and Justice,” in his Moral Dealing (Ithaca, NY: Cornell University Press, 1990), pp. 187–206, and also *MA*, chap. 5. It should be noted that my arguments in no way diminish Gauthier's claim that bargaining theory can provide an adequate account of justice reasoning.

9 Gauthier incorporates the translucency assumption into argument 2 as follows: “Since the constrained maximizer has in some circumstances some probability of being able to enter into, and carry out, an agreement, whereas the straightforward maximizer has no such probability, the expected utility of the constrained maximizer is greater” (Gauthier, David, “Reason and Maximization,” in his *Moral Dealing*, p. 229). This is incorrect as it stands, since the constrained maximizer must have not just some chance of successful co-operation, but a good enough chance that it outweighs the expected disutility of being periodically exploited. Thus if P_{d} is the probability of the constrained maximizer detecting the straightforward maximizer, and being exploited is worth 0, then constrained maximization will only be advantageous if P_{d}u_{c} > u_{d}.
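The inequality in note 9 can be illustrated with a small numeric sketch. The payoff values below are hypothetical, chosen only for illustration: u_c is the value of successful co-operation, u_d the value of the straightforward maximizer's non-co-operative baseline, and exploitation is normalized to 0, as in the note.

```python
# Illustrative check of the condition P_d * u_c > u_d from note 9.
# All payoff values are hypothetical; exploitation is normalized to 0.

def constrained_maximization_pays(p_d: float, u_c: float, u_d: float) -> bool:
    """Expected utility of constrained maximization: succeed with
    probability p_d (worth u_c), otherwise be exploited (worth 0).
    Advantageous iff this exceeds the baseline u_d."""
    expected_cm = p_d * u_c + (1 - p_d) * 0
    return expected_cm > u_d

# With u_c = 3 and u_d = 1, detection must succeed more than 1/3 of the time:
print(constrained_maximization_pays(0.5, 3, 1))   # True  (1.5 > 1)
print(constrained_maximization_pays(0.25, 3, 1))  # False (0.75 < 1)
```

The sketch makes the note's point concrete: "some probability" of co-operation is not enough; P_d must clear the threshold u_d / u_c.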

10 Note that perfect knowledge is not an assumption of standard rational choice theory. This reflects the simple fact that it is possible to be ignorant without being irrational.

11 The condition is in fact far more demanding than Gauthier suggests. For the choice problem to have a solution, each player must not only have the ability to detect dispositions, but the effectiveness of each player's ability must be common knowledge among all players. The reason for this is clear once the game is translated into its Bayes-equivalent form, as in Figure 3 below.

12 Gauthier, David, “The Incompleat Egoist,” in his *Moral Dealing*, pp. 265–66.

13 It is noteworthy that many of Gauthier's interpreters explicitly read the multi-stage assumption into the translucency argument (e.g., see Sayre-McCord, Geoffrey, “Deception and Reasons to be Moral,” in Vallentyne, *Contractarianism and Rational Choice*, p. 192).

14 This style of graph is due to Schelling, Thomas, Micromotives and Macrobehavior (New York: W. W. Norton, 1978).

15 My presentation of repeated games follows Myerson, Roger, Game Theory: Analysis of Conflict (Cambridge, MA: Harvard University Press, 1991).

16 For simplicity we say that the player always receives a signal informing her of her own previous move, and thus we treat only games of “perfect recall.”

17 Note how different the game would be if the information structure were changed (e.g., if money were distributed in sealed envelopes, and players were told only the total sum paid out).

18 This “one-stage deviation principle” may not be obvious. For proof, see Fudenberg, Drew and Tirole, Jean, Game Theory (Cambridge, MA: MIT Press, 1991), p. 110.

19 It is well known that co-operation cannot be sustained in like manner in finitely repeated PDs of *known duration*. But the significance of this result should not be overstated, particularly since it does not generalize to games that have more than one suboptimal equilibrium.

20 Friedman, James W., “A Noncooperative Equilibrium for Supergames,” Review of Economic Studies, 38, 1 (January 1971): 1–12. Although this construction is loosely analogous to Robert Axelrod's “tit-for-tat” strategy (The Evolution of Cooperation [New York: Basic Books, 1984]), it is important to maintain a clear distinction between evolutionary and rationality-based game theory, since the two models operate with very different action-theoretic assumptions.

21 Gauthier's claim that straightforward maximizers would take the probability of others co-operating with them to be fixed independently at some value P would make them more or less equivalent to players in this game who ignored the future impact of their actions (i.e., had very low discount rates). It is not unnatural to suppose that a group of such extremely short-sighted agents would not be able to sustain co-operation. But this does not limit the model presented here if we grant, as a realistic empirical assumption, that discount rates tend to be fairly high.
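The threshold logic behind note 21 can be sketched using the textbook grim-trigger (Nash reversion) condition for a repeated Prisoner's Dilemma. The payoff labels are the standard ones (T > R > P > S), not taken from the paper; "delta" is the usual discount factor, with higher values corresponding to the more future-regarding agents the note has in mind.

```python
# Sketch of when Nash reversion sustains co-operation in a repeated PD.
# Standard textbook payoffs (hypothetical values, not from the paper):
#   R: mutual co-operation, T: temptation, P: mutual defection.
# Deviating yields T once, then P forever; co-operating yields R forever.
# Deviation is unprofitable iff R/(1-delta) >= T + delta*P/(1-delta),
# which rearranges to delta >= (T - R) / (T - P).

def cooperation_sustainable(delta: float, T: float, R: float, P: float) -> bool:
    """Grim-trigger equilibrium condition for the repeated PD."""
    return delta >= (T - R) / (T - P)

# With T=5, R=3, P=1, the threshold is (5-3)/(5-1) = 0.5.
print(cooperation_sustainable(0.9, 5, 3, 1))  # True: future-regarding agent
print(cooperation_sustainable(0.1, 5, 3, 1))  # False: extremely short-sighted
```

This matches the note's observation: agents who ignore the future impact of their actions fall below the threshold and cannot sustain co-operation, while sufficiently future-regarding agents can.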

22 This is Schelling's classic solution; see Schelling, Thomas, The Strategy of Conflict (Cambridge, MA: Harvard University Press, 1960), pp. 89 ff.

23 Selten, Reinhard, “Reexamination of the Perfectness Concept for Equilibrium Points in Extensive Games,” International Journal of Game Theory, 4, 1 (1975): 25–55. His actual purpose here is to introduce the notion of “trembling-hand” perfect equilibrium, which restricts the equilibrium set to that of a perturbed game in which every node is reached with positive probability. But this refinement is not needed for repeated games, since every stage begins a proper subgame.

24 See Fudenberg and Tirole, *Game Theory*, pp. 153–54.

25 Fudenberg, Drew and Maskin, Eric, “The Folk Theorem in Repeated Games with Discounting or with Incomplete Information,” Econometrica, 54, 3 (May 1986): 533–54. See also Fudenberg and Tirole, *Game Theory*, pp. 150–60.

26 For an overview of decentralized repeated games, see Myerson, *Game Theory: Analysis of Conflict*, pp. 349–52.

27 Smith, “Deriving Morality from Rationality,” p. 239.

28 Danielson, “Closing the Compliance Dilemma: How It's Possible to Be Moral in a Lamarckian World,” p. 300.

29 Another way of putting this is to say that instead of having philosophers worrying about the population profile, the players can do it themselves.

30 Harsanyi, John, “Games with Incomplete Information Played by ‘Bayesian’ Players,” 3 parts, Management Science, 14 (1967–68): 159–82, 320–34, 486–502. For a general introduction to Bayesian games, see Fudenberg and Tirole, *Game Theory*, pp. 209–42.

31 Following David M. Kreps and Robert Wilson (see their “Sequential Equilibria,” *Econometrica*, 50, 4 [July 1982]: 863–87), I represent an equilibrium as a profile of assessments, composed of a strategy *and a consistent system of beliefs*. The beliefs support the equilibrium in the following way: when β = 1/3, player 2 is indifferent between c1 and d1, expecting a pay-off of 2/3 for either, so when β > 1/3, she prefers c1. Similarly, when β = 1/2 and player 2 is playing c1 on the left, d2 on the right, player 1 is indifferent between C and D, expecting a pay-off of 1 for either.
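The belief calculation in note 31 can be illustrated with a small sketch. The paper's Figure 3 payoffs are not reproduced here; the numbers below are invented solely so that the indifference point falls at β = 1/3 with expected pay-off 2/3, as the note reports.

```python
# Hypothetical illustration of belief-supported indifference (note 31).
# Payoff values are invented for illustration, not taken from Figure 3.

def expected_payoff(beta: float, payoff_left: float, payoff_right: float) -> float:
    """Expected pay-off when player 2 assigns probability beta to the
    left node of her information set."""
    return beta * payoff_left + (1 - beta) * payoff_right

# Suppose c1 pays 2 at the left node and 0 at the right, while d1 pays
# 0 at the left and 1 at the right (hypothetical values). At beta = 1/3
# both actions yield 2/3, so player 2 is indifferent:
beta = 1 / 3
print(expected_payoff(beta, 2, 0))  # c1: 0.666...
print(expected_payoff(beta, 0, 1))  # d1: 0.666...

# Above the indifference point, c1 is strictly preferred:
print(expected_payoff(0.5, 2, 0) > expected_payoff(0.5, 0, 1))  # True
```

The point of the construction is only that the belief β, not the strategy alone, is what makes each action a best response at the information set.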

32 This criticism is developed by Sobel, “Straight versus Constrained Maximization,” p. 41.

33 For a complete model of belief-formation in multi-stage games of incomplete information, see Fudenberg and Tirole, *Game Theory*, pp. 331–33.

34 Since all knowledge is being modelled through the signalling system, including players' “recall” of their own moves, the assumption is that even players who participated in the game cannot recall how anyone played.

35 Recall that argument 1 takes the probability of others acting co-operatively to be fixed at P. Argument 2, on the other hand, assumes that P is simply a function of the agent's disposition. But when we assign P a dynamic role in the model, we can use these terms to define our standard maximizer as a conditional co-operator who defects only when doing so will not lower the value of P.

36 Note that Gauthier argues incorrectly that any form of co-operation based on external constraint will be suboptimal because of “the costs needed to enforce adherence to agreements” (*MA*, p. 164). But both effective and empty threats are costless among rational agents, since in neither case will the punishment sequence ever be carried out. Gauthier probably has in mind feeding the standing armies of some Leviathan. But the threat at work in a Nash reversion strategy is more of the form: “as soon as anyone breaks the law, all of society reverts to the state of nature.” This requires no special enforcement mechanism.

37 I would like to thank Jordan Howard Sobel for helpful comments and criticisms, along with the Social Sciences and Humanities Research Council of Canada for financial support during the period in which this paper was written.