Rewarding in International Law

Abstract: Why states comply with international law has long been at the forefront of international law and international relations scholarship. The compliance discussion has largely focused on negative incentives. We argue that there is another, undertheorized mechanism: rewarding. We provide a typology and illustrations of how rewards can be applied. Furthermore, we explore the rationale, potential, and limitations of rewarding, drawing on rationalist and psychological approaches. Both approaches provide ample justifications for making greater use of rewarding in international law.

the time of entering the commitment; compliance rewards benefit the country when it complies with the commitment.
The rationale for rewarding is that cooperation can be encouraged through rewards offered by one party to offset the benefits that the other country draws from noncompliance. 9 In other words, compliance can be achieved if a reward outweighs the benefits from breaching international law. This rationale is linked to the Coase theorem, which states that if transaction costs are sufficiently low, it does not matter to which party one initially assigns a right. 10 The other party may (and if rational should and would) pay the right holder to relinquish it, so long as both would be better off. Thus, anyone considering penalties should be interested in rewards, as both a practical and a theoretical matter. From a rational choice perspective, with few exceptions, rewards are Pareto efficient because they make one country better off and no country worse off, while penalties are not Pareto efficient since they always make the target country worse off.
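The rationale can be stated in stylized form. The following is only a sketch under simplifying assumptions (two unitary states, negligible transaction costs, payoffs known to both sides); the symbols are introduced here for illustration and do not appear in the literature cited.

```latex
% Illustrative notation (an assumption of this sketch, not the Article's own):
%   b_N = target state's payoff from noncompliance
%   b_C = target state's payoff from compliance, before any inducement
%   r   = reward paid by the sender if the target complies
%   p   = penalty imposed by the sender if the target does not comply
%   g   = sender's gain from the target's compliance
\begin{align*}
\text{A reward induces compliance if}\quad & b_C + r \ge b_N
  \iff r \ge b_N - b_C, \\
\text{A penalty induces compliance if}\quad & b_C \ge b_N - p
  \iff p \ge b_N - b_C, \\
\text{The sender is willing to pay if}\quad & g \ge r .
\end{align*}
% A mutually beneficial reward exists whenever g \ge b_N - b_C:
% any r with b_N - b_C \le r \le g leaves neither state worse off
% than under noncompliance.
```

For instance, with hypothetical values b_N = 10, b_C = 6, and g = 7, any reward between 4 and 7 makes compliance worthwhile for both states. A penalty of at least 4 would also induce compliance, but it would leave the target with 6 instead of the 10 it obtained before the penalty was threatened.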
While the Coase theorem shows that we should pay attention to rewards, it incorrectly suggests a type of equivalence between rewards and penalties. Psychological research, including in IR scholarship, shows distinct differences in how individuals and states perceive and respond to rewards and penalties. 11 Perceived losses and gains provoke different behavior in many respects. The baseline matters because actors evaluate gains and losses from a reference point. 12 This Article thus goes beyond the Coasean insight by arguing that rewards can produce better results than penalties, on both efficiency grounds (Pareto optimality) and psychological grounds (behavioral differences).
It is commonly noted that states commit to and comply with treaties because of the benefits (rewards) they receive from doing so. The exchange of benefits in treaty negotiation is routine and widely documented. Indeed, much of the compliance literature has focused on what we call internal rewards, namely the assumption that enjoying the benefits of cooperation is often a sufficient incentive to comply with a treaty. 13 The threat of withdrawal of benefits is conceptualized as outcasting. 14 Thus, some types of rewards are addressed in the compliance literature, but rewards have not been present.
Yet, this distinction has never been watertight. 20 Compliance mechanisms are closely related to treaty design since, when designing treaties, states must already have some ideas about how compliance with the commitments can be enhanced (backward induction). Although those questions should be conceptually distinguished (law is either on the left or right side of the equation), they are not independent of each other if the whole life cycle of international law is to be analyzed. 21 This Article is a contribution to both the treaty design and compliance literatures, but with a focus on the latter (or on the subset of rational design literature dealing with compliance), which in turn feeds into treaty design and thus the possibilities of cooperation. 22
We proceed as follows. Part II surveys compliance mechanisms as commonly discussed in the literature. We then introduce the definition and typology of rewarding, connecting it to and differentiating it from the literature discussed. Part III then illustrates our typology of rewarding with examples. In Part IV, we discuss the differences between rewarding and penalties on a rationalist basis. We then turn in Part V to a behavioral analysis of rewarding. Part VI discusses the limits of rewarding and the conditions under which it can be expected to work, including its combination with penalties. Part VII concludes.

II. COMPLIANCE MECHANISMS IN THE LITERATURE
Compliance theories in international law rely on a variety of mechanisms, including norm spirals, 23 focal points, expressive law theories, 24 and international courts. 25 They either assume a unitary state or break up the black box of the state to explain compliance, for example through national political processes 26 or national courts. 27 We acknowledge the merits of these theories, but our focus is on rationalist theories that predominantly use a unitary actor model, simply to make our point clearer.
The advantage of rational choice theory is that it provides falsifiable explanations about when countries will choose to violate international law. 28 This theory reduces the complexity of the real world into clear parameters: states are assumed to be rational, self-interested, and utility-maximizing. 29 In principle, all rational choice approaches to international law are based on the comparison of (objectively viewed) benefits and costs. From that perspective, states enter a treaty when the benefits arising from the treaty are higher than the costs of entering. 30 Likewise, states later comply with the treaty when the benefits of compliance are higher than the costs of compliance (where the costs of compliance also include the forgone benefits from noncompliance). In the literature, several mechanisms are discussed.
20 BARBARA KOREMENOS, THE CONTINENT OF INTERNATIONAL LAW (2016) (although situating herself in the design literature (at 2), she includes a chapter on penalty provisions where she also briefly discusses rewards (chapter 8)). Chayes & Chayes, supra note 1, at 183 ("[I]f the agreement is well-designed . . . compliance problems and enforcement issues are likely to be manageable.").
Most importantly, reciprocity is viewed as a basic mechanism for compliance. It exists in two forms: reciprocity as a practice of exchanging things with others for mutual benefit and reciprocity as the benefit from the act of compliance. 31 Reciprocity is deemed to be a crucial building block of human societies 32 as well as of international law 33 and is also reflected in Article 60(1-3) of the Vienna Convention on the Law of Treaties (VCLT). 34 It can be positive or negative. Positive reciprocity defines the benefit from the practice of exchanging things (e.g., rights, gains, and privileges), while negative reciprocity is defined by the withdrawal of beneficial exchanges. Reciprocal benefits are usually understood to be benefits from the treaty obtained through the compliance of the other party (or parties).
Often, states comply with treaties when they fear that their own noncompliance can trigger reciprocal noncompliance by other states. This works best when mutual cooperation is preferred to mutual violation. 35 If mutual cooperation is preferred, reciprocity has the capacity to induce compliance (as defection might otherwise end future cooperation).
A violation by one side is likely to provoke reciprocal withdrawal by the other side, at least in bilateral arrangements. But negative reciprocity, and the threat thereof, may be inefficient in treaties with public good aspects, since it can provoke a complete breakdown of cooperation, e.g., in arms control treaties. 36 Negative reciprocity is also unlawful under VCLT Article 60(5) for provisions relating to the protection of the human person contained in treaties of a humanitarian character. Notwithstanding the legal norm, violating human rights obligations by one state will not induce another country to treat its citizens in the same way.
29 We recognize that there are various forms of rationality, and that these have different implications for law. See, e.g., JON ELSTER, ULYSSES AND THE SIRENS: STUDIES IN RATIONALITY AND IRRATIONALITY (1979). If states were perfectly rational and treaties were well-designed, then maybe there would be less need for compliance-enhancing mechanisms like penalties, as states would voluntarily comply to advance their own long-term interests. But since they are often imperfectly rational (or captured by special interests), we need compliance-inducing tools.
36 In multilateral constellations, reciprocity may not work well or is undesirable, but outcasting may. Cf. GUZMAN, HOW INTERNATIONAL LAW WORKS, supra note 1, at 174-76; KOREMENOS, supra note 20, at 233. Those treaties may therefore contain other penalty provisions or rewarding mechanisms.
THE AMERICAN JOURNAL OF INTERNATIONAL LAW
Scholars have also highlighted the importance of reputation. States comply because they want to be able to make credible commitments in the future. By complying with promises, each country enhances its reputation as a state that honors its commitments and, therefore, its ability to reap cooperative benefits in the future. This allows a state to find more partners and extract more generous concessions. 37 Reputational benefits can ensue when entering a treaty (reputation for normative commitments) as well as when a state decides to comply with its international obligations (reputation for being a reliable partner). Reputation, positive or negative, can operate alongside the other compliance mechanisms or can stand alone if the others do not work or are undesirable (e.g., in human rights treaties). Reputation as a compliance mechanism has been extensively discussed and reviewed, but many open questions remain. 38 It clearly depends on perception, may differ between audiences, 39 and depends on the availability of information about behavior and its salience. Is the reputation attributable to the state or to the government? 40 Can reputation be compartmentalized, that is, does it matter whether a state (or government) has, for example, a good human rights record but a bad one in regard to investment?
A further compliance mechanism is retaliation. Retaliation represents a mostly costly action by one or more states with the intention to punish another state for violation of a commitment. 41 Retaliatory actions include retorsions and countermeasures, such as economic, diplomatic, or military sanctions outside the base treaty. Retaliation is rational when it influences the future action of the violating state, i.e., when retaliation is used to persuade the violator to comply in order to avoid further sanctions. However, retaliation is costly for both the noncomplying state and the retaliating state. This costliness encourages sanctions free-riding, which creates a sanctioning dilemma (a second-order prisoners' dilemma). 42
Yet another compliance mechanism discussed is nonviolent outcasting, defined as the use of techniques to deny noncompliant states the benefits of social cooperation and membership or use of markets. 43 Outcasting penalizes by shutting the violating state or its economic operators out of the "club" or suspending them temporarily, depriving them of the benefits of cooperation, which has damaging consequences. 44 The use of exclusion as penalty for noncooperation or violation of international law converts public goods to excludable nonrivalrous goods in terms of consumption, that is, club goods. 45 Even more important, enforcement can also be carried out by nonstate actors, such as private banks in the Financial Action Task Force (FATF) mechanism. Outcasting is not a particular form of retaliation since it occurs solely within the treaty framework and is usually rather cheap for the states that outcast. It is instead a form of collective negative reciprocity: complying members collectively withdraw their promises to the violating member. 46
Outcasting as a penalty is pervasive in treaties (e.g., in Article 4 of the Montreal Protocol banning the import of the substances listed in the Annexes from nonparties, 47 the Convention on International Trade in Endangered Species of Wild Fauna and Flora, 48 or the Basel Convention 49 ) and in soft law (e.g., FATF for money laundering and terrorism financing or the Kimberley Process on conflict diamonds). Many regional organizations like the African Union, the Organization of American States, the Council of Europe, and the EU have some sort of outcasting device for members breaking their rules or principles (e.g., by revoking voting rights).
The managerial approach to compliance by Chayes and Chayes, which may be closest to our approach, drew attention to how the provision of incentives, positive assistance, and constructive engagement work to bring about compliance with international law, arguing that this approach is often more effective than sanctions or costly coercive enforcement. 50 They also called attention to the problem of incapacity as one reason for noncompliance with a treaty, a point we take up below. 51 In this Article, we supplement valuable insights from the managerial school, most notably the problem of incapacity, with an overall analytical treatment and typology of positive inducements. This treatment helps us to understand when compliance can be expected and how different forms of rewards can overcome incapacity issues. We also go beyond the managerial school by explaining the underlying mechanisms of rewarding through a behavioral approach.
51 Previous scholarship has expressed some concerns with the managerial school. See GUZMAN, HOW INTERNATIONAL LAW WORKS, supra note 1, at 16 (arguing that it "does not offer any underlying theory or explanation of why states prefer to comply with international law. Nor does it help us to understand when this preference for compliance will trump other concerns and when it will not prevail.").
The first three mechanisms (reciprocity, reputation, retaliation), in their different forms, are what have been historically utilized to "promote compliance with international legal rules." 52 We submit that even in their positive form (positive reciprocity and reputation), these mechanisms do not in fact capture the universe of compliance mechanisms, and the discussion fails to distinguish this universe analytically.

III. REWARDING: THE PHENOMENON
We argue that rewarding is indispensably linked to compliance theory but is undertheorized. Reward and penalty (or carrot and stick) are often seen as the opposite sides of the same coin and thus rewards have not received special attention.
The focus of our analysis is on treaty law, although it can be applied to all sources of international law. We do not exclude soft law agreements since they may contain compliance mechanisms similar to treaty law. 53 Bilateral constellations as well as multilateral constellations are considered and we cover all stages of the life cycle of international law. After developing a typology of rewarding, we turn to illustrations based on that typology, contributing to a better understanding of compliance mechanisms within those examples.

A. Typology of Rewarding
To distinguish rewards from penalties, one must establish the target's baseline of expectations at the time the sender's influence attempt begins. 54 The baseline depends on rational expectations about the future, including the likelihood of reward/penalty. Rewards are improvements in a target's value position relative to its baseline of expectations; penalties are deprivations relative to the same baseline. For instance, giving a hundred-dollar bonus to a man who expected a bonus of two hundred dollars may take the form of a reward but it is not perceived as such; similarly, cutting the salary by a hundred dollars while the man expected a two-hundred-dollar fine may take the form of a penalty but it is not perceived as such. 55 A conditional commitment not to reward if the target fails to comply is not necessarily a threat if the target had no prior expectation of receiving the reward. 56 Withholding a reward is only considered a punishment when the reward was expected. The baseline of expectations can shift over time. Once a penalty has become part of the baseline, then the removal of the penalty would have the effect of a reward. The converse situation also applies to rewards: the removal of a reward can be a penalty. To summarize, the reward/penalty distinction depends on the baseline, which in turn depends on expectations about the future, including the likelihood of reward/penalty. This means that defining the baseline and the framing of expectations is critically important for the effectiveness of either mechanism. 57
The most important distinction to be drawn is between the benefits of the agreed upon bargain of a treaty (e.g., the exchange of goods or the provision of public goods) and rewarding outside the (base) treaty to be complied with (e.g., the payment of money that was not part of the treaty). In other words, the distinction is made between rewards as cooperative benefits accruing to a party of a treaty (internal rewarding) and rewards external to or "on top" of the cooperative benefit (external rewarding). We acknowledge that this distinction is made for analytical purposes, and there are examples, e.g., the Montreal Protocol discussed below, which show that the distinction depends on how the treaty is drafted and how it has been handled in practice. 58 But the distinction is crucial from a legal perspective, even if less so from a social science perspective. Another important distinction between internal and external rewards is their predictability and flexibility. Given that internal rewards are within the treaty, they are commonly more predictable but less flexible, whereas with external rewards it is the opposite.
The second distinction concerns the point in time and is made between rewards-for-entering and rewards-for-complying with a treaty. This distinction is necessary since a state's incentives at the treaty-negotiating stage may be different from those it faces when the time for compliance arrives. The entry reward is often the cooperative benefit of the respective treaty (internal reward, positive reciprocity) but can go further than that: there may be rewards given on top of the cooperative benefit, such as external rewards (e.g., promises of tariff concessions if International Labour Organization Conventions are ratified) and positive reputation. Any concessions given during treaty negotiations can be considered as internal rewards if they are included in the treaty text. If the bargain of the treaty is insufficient or inclusion of the reward in the treaty would be inappropriate, states may resort to external rewards at the negotiation stage. 59
We can thus define a typology of rewards and penalties, which can be tangible or intangible. First, external penalties, e.g., retaliation, are a form of sanctions outside the base treaty. Similarly, fines can be considered as external penalties, as they have to be paid by the violating state "on top" of the treaty (the state is not subject to outcasting or to withdrawals of benefits of the base treaty). 60 Terminating a linked treaty is a form of external penalty. Reputation is widely considered relevant for the future and not the base treaty, and thus we qualify a bad reputation as an external penalty.
57 Note that in a multilateral context, the baseline can also be influenced by how other states in similar situations were or will be treated.
58 See note 47 supra, et seq. and corresponding text.
59 Both types of rewards, internal and external, contribute to expanding the zone of potential agreement (ZOPA) in treaty negotiations and relate to the cost-benefit calculus in treaty making.
Second, there are internal penalties, in the form of withholding cooperative or other benefits within the treaty from treaty parties that do not perform their obligations or from states that do not enter the treaty in the first place; this is negative reciprocity as well as outcasting. Internal penalty is the withdrawal and nonenjoyment of the benefit of cooperation of the base treaty by the violator. The formation of club goods and outcasting can be explained on a rationalist basis, and experiments confirm their effectiveness. Experimentally, it has been shown that excluding defectors is a cheap and powerful sanctioning device. 61
Third, there are internal rewards, allowing states to gain (when acceding to a treaty) or retain the cooperative benefits of the (base) treaty when complying. Reciprocity is the practice of exchanging things with others for mutual benefit. In other words, (positive) reciprocity is the practice of rewarding cooperation through the exchange of rights, gains, privileges, and assistance within a treaty. Here, we also situate readmission or redemption after outcasting. Where exclusion is reversible, i.e., inclusion after an individual has been excluded ("redemption" in experimental terms), it is possible to achieve even larger contributions to the public good, possibly due to the endowment effect. 62
Fourth, there are external rewards that are reputational as well as direct benefits (e.g., side payments, other advantages through linked treaties, or intangible rewards like state visits) that follow entry into or compliance with a treaty and are not captured by the cooperative benefit of the base treaty itself. A good reputation is an external reward in our framework, since it is a benefit outside the bargain of the base treaty. To be precise, while naming and praising is the rewarding act implemented by the sender, a good reputation is the result of this act for the receiver. We simplify reputation as rewarding here since the described differentiation does not matter for our argument. Reputation can accrue from joining a treaty as well as from complying with international legal obligations. The former is a reputation for normative commitment whereas the latter relates to being a trustworthy partner.
External rewards can become necessary when the benefit of the bargain or treaty (i.e., the internal reward) is insufficient to induce accession or compliance (for some states and/or at a certain point in time). 63  are concluded in the first place? The answer is yes, if the treaty provides a global public good, like environmental treaties or has third party beneficiaries, like human rights treaties. We illustrate these types of rewarding in the following sections, since we submit that all of those mechanisms are used in international law to change states' behavior. The classical compliance framework is too narrow to understand how states behave. Therefore, we try to assess what drives states to interact cooperatively, which is part of a broader governance discussion. Showing the array of possibilities also informs states about rational design and how to achieve regulatory goals at different stages (treaty making as well as compliance).

B. Illustrations of Rewarding
Although international law already uses rewarding, formal rewarding in the compliance stage is rather uncommon. 64 Although many compliance mechanisms are designed within a treaty, states can also add external rewards (which are mostly intangible and reputational, but need not be, as will be shown below). Often, treaties contain all four kinds of rewards: internal and external rewards as well as entry rewards and compliance rewards. A combination of rewards is especially common in treaties providing global public goods, such as environmental or disarmament treaties. Understanding how rewards are used in different issue areas of international law helps to shed light on the different tools applied (or that could be applied) to effectuate international law. We provide examples to illustrate the different constellations of internal rewards, followed by examples for external rewards.

Illustrations of Internal Rewarding
Internal rewards are derived from membership. Benefits from participating in economic, political, and legal ties with one another can generate the necessary incentives to enter and comply with a commitment. Benefits may be conditioned upon compliance either with (monetary) contributions or other substantive norms of the treaty. For instance, member states of the World Health Organization (WHO) benefit from a vast array of international public health programs (internal reward at entry stage), but voting privileges and services to which a member is entitled may be lost if the member state does not comply with its mandatory budget contributions (internal penalty at the compliance stage). 65 Not fulfilling the mandatory contributions may also lead to a loss in reputation and exiting the treaty during a pandemic may spoil the reputation of a country even more. 66 This external penalty enhances the internal reward mechanism.
The prospect of redemption or readmission for previously excluded countries is the reverse of outcasting and another constellation of internal rewards at the compliance stage that is found in many treaties. 67 Thus, members can regain (voting) rights once they fulfill their obligations. Readmission has been experimentally shown to enhance cooperation even more. 68 Enjoying benefits creates the so-called endowment effect, a well-researched behavioral bias finding that people are more likely to retain an object they own than acquire that same object when they do not own it. 69 It is connected to Prospect Theory and loss aversion. 70 Thus, from a behavioral perspective, after entering a treaty and enjoying the benefits, treaty compliance might be improved if those benefits can be lost by outcasting.
In cases of incapacity, 71 a penalty (internal or external) is unable to deter a country from violating international law because of the violator's inability to comply. 72 Lack of the necessary financial, administrative, or technological resources is perceived as one important reason why states do not follow international law, especially multilateral environmental agreements (MEAs). 73
behavior in the direction requested. 74 Negative reputation is not likely to play a part in the case of incapacity and negative reciprocity is undesirable in public good treaties. If an agreement has been acceded to but cannot be complied with because of incapacity, internal rewards help to overcome (material) constraints by providing the prospective violator with the necessary resources. 75 This is usually employed when the bargain of the treaty is a global public good and broad membership is desired even if noncompliance due to incapacity is likely when acceding to the treaty (e.g., the Paris Agreement on Climate Change 76 ).
70 Prospect Theory is a psychological theory that describes how people make decisions when presented with alternatives that involve risk, probability, and uncertainty. It holds that people make decisions based on perceived losses or gains and are thus reference point dependent. People are usually averse to the possibility of losing, such that they would rather avoid a loss than take a risk to make an equivalent gain. See note 12 supra.
71 Chayes & Chayes, supra note 1, at 188.
72 In cases of impracticability, there is a question as to who should bear the burden of the high costs of performance, given that the obligee contracted to carry out its obligations.
73 RONALD B. MITCHELL, INTERNATIONAL POLITICS AND THE ENVIRONMENT 162 (2010) ("When developing countries fail to meet their environmental commitments, it often reflects a lack of adequate or appropriate resources (and the presence of more pressing concerns) rather than a calculation that non-compliance better fits their interests."). Capacity building is needed: Lothar Gündling, Compliance Assistance in International
Recognizing how much noncompliance arises from incapacity, MEAs shifted from penalizing violations to facilitating compliance. 77 Potential donors have incentives to contribute to internal rewards, because if not, the target actor will continue engaging in the environmentally harmful behavior. States that view themselves as sufficiently harmed by another state's activity should be more willing to offer internal rewards large enough to convince the targeted actor to discontinue that behavior.
A very prominent example of an environmental treaty providing global public goods (or preventing public bads) using a mix of internal and external rewards at the compliance stage is the Montreal Protocol on Substances that Deplete the Ozone Layer. 78 Applying our typology shows the different ways in which rewards are used for that treaty. The Protocol aims to ban the global production and use of ozone-damaging chemicals including chlorofluorocarbons (CFCs), hydrochlorofluorocarbons (HCFCs), and halon as used in air conditioning and refrigeration systems, placing limits on the amount of ozone-depleting substances each member state may produce and consume. From a rationalist perspective, public goods are expected to be underprovided. Yet the Protocol has been dubbed "one of the most successful and effective environmental treaties ever negotiated and implemented." 79
What were the causes of its success? One element that encouraged countries to ratify the Montreal Protocol was the trade provisions. These limited member states to trade only with other member states, thus creating a club good. Once the main producing countries signed up, it was only a matter of time before all countries had to sign up or risk not having access to increasingly limited supplies of CFCs and other ozone-depleting substances (ODS). The treaty thus achieved universal ratification. In return for agreeing to observe these limits of production, states parties receive access to trading privileges denied to nonparties; they are thus rewarded. Because the reward is part of the bargain, this is an example of a classical entry internal reward, the baseline being nonmembership. 80
How was compliance achieved? The above-mentioned mutual gains from trade within the club make compliance with the Protocol attractive to target states. Given that negative reciprocity is inefficient since the treaty provides a global public good, the treaty creates the possibility of withholding certain rights and benefits that a party receives from the treaty. It thus uses (partial) outcasting as an internal penalty and redemption as an internal reward as compliance mechanisms. Yet, additional internal rewards are considered crucial for the success of this treaty: the Multilateral Fund (MF) of Article 10 provides incremental funding for developing countries (so-called Article 5 countries) to help them meet their compliance targets and thus deals with incapacity as described above. 81 A crucial element for the success was the recognition that nonreporting countries 82 were overwhelmingly developing countries unable to comply without technical assistance. 83 By recognizing this, the "Montreal Protocol [became] the first treaty under which the parties undertake to provide significant financial assistance to defray the incremental costs of compliance for developing countries." 84 Internal rewards in the form of technical and financial assistance helped to overcome material constraints of compliance. Significantly, the treaty has also provided institutional support. This helped countries to build capacity within their governments to implement phase-out activities and establish regional networks, so they can share experiences and learn from each other. As of December 2019, the contributions received by the MF from developed countries, or non-Article 5 countries, totaled over US$ 4.07 billion. The MF has also received additional voluntary contributions amounting to US$ 25.5 million from a group of donor countries to finance fast-start activities for the implementation of the HFC phase-down. To facilitate phase-out in Article 5 countries, the Executive Committee has approved 144 country programs and 144 HCFC phase-out management plans and has funded the establishment and the operating costs of ozone offices in 145 Article 5 countries. 85
The drafting of Article 5 also provides an interesting example of the (analytical) distinction between internal and external rewards. The Soviet Union became a party to the Protocol in 1988. In the mid-1990s, some former Soviet republics faced potential noncompliance and asked for a grace period to meet the Protocol's provisions. Some states received funding from the MF as Article 5 states. But Russia was not an Article 5 state, and thus was not eligible for this funding. Eventually, Russia received funding from the Global Environment Facility (GEF; a fund helping states to meet the objectives of MEAs), the U.S. and Danish governments, and the World Bank. The GEF required that continued funding be subject to the Protocol processes for noncompliance, and payment depended on favorable reports by the Protocol's implementation committee. Thus, the funds for Russia came from "outside" the treaty and from bodies with no formal role in the treaty system, yet treaty bodies continued to play a major role in reviewing progress and addressing any compliance issues that arose. 86 Legally, the funds given to Russia were an external reward, but they would have been internal rewards if Russia had been included as an Article 5 state in the first place, a historical contingency.
81 Article 5 of the Montreal Protocol entitles developing countries to assistance from developed countries under the Multilateral Fund for the Implementation of the Montreal Protocol. Assistance takes the form of grants or concessional loans. See Multilateral Fund Secretariat, at http://www.multilateralfund.org/aboutmlf/fundsecretariat/default.aspx.
82 One requirement of the treaty was that member states had to report annual CFC consumption.
The noncompliance procedure 87 was designed as a nonpunitive and advisory procedure. 88 It prioritized helping countries back into compliance and is a "flexible means to ensure some degree of implementation without suggesting the automatic blameworthiness of all nonperformance." 89 To clarify the range of outcomes that parties may expect from the noncompliance procedure, the parties have adopted an "Indicative List of Measures that Might Be Taken by a Meeting of the Parties in Respect of Non-compliance with the Protocol." 90 These measures are a mix of, on the one hand, positive inducements such as appropriate assistance (including assistance for the collection and reporting of data), technical assistance, technology transfer, financial assistance, information transfer, and training. On the other hand, they include the suspension of benefits of the protocol, with or without time limits, including those concerned with industrial rationalization, production, consumption, trade, transfer of technology, financial mechanisms, and institutional arrangements. 91 Those benefits can be suspended and reinstated (redemption). This mix of different rewards has been successful. It is telling that all 142 developing countries were able to meet the 100 percent phase-out mark for CFCs, halons, and other ODS in 2010. 92 The Montreal Protocol reflects a shift in tone in international law toward an encouragement-based approach that uses rewards. It has served as a model for other systems, 93 such as the United Nations Framework Convention on Climate Change (UNFCCC) and its protocols. 94 The Marrakesh Accords, which adopted the compliance mechanism for the Kyoto Protocol, established, for example, a Compliance Committee consisting of a Facilitative and an Enforcement Branch, thus accounting for the reasons of noncompliance.
95 The Facilitative Branch included advice, financial and technical assistance, and capacity building to achieve the objective of the base treaty (internal rewards at the compliance stage).

("The 'positive and conciliatory aspects' of the Non-Compliance Procedure were stressed by developing countries who were afraid that they would be the first to become objects of such measures.").

We now turn to an illustration of reciprocity. Commonly, positive reciprocity is discussed in the entry phase (as the bargain of the treaty). Negative reciprocity is used as a means for compliance. If the state does not comply with the treaty, it loses the benefits of the treaty. But positive reciprocity can also be used in the compliance stage, and this framing matters. Such a change of narrative toward positive reciprocity can be found in international humanitarian law (IHL), where it is meant to induce more compliance; the behavior of IHL-relevant actors is thus "reframed." Because IHL is often violated, its significance is contested. This "credibility gap" 97 caused by the sole reliance on reports of violations triggers "the perception that . . . IHL is always violated and therefore useless." 98 This "negative and dismissive discourse renders violations banal and risks creating an environment where they may become more acceptable" 99 and, through negative reciprocity, induces a spiral of violations ("The other side does not respect it, so why should we?" 100). Therefore, the International Committee of the Red Cross (ICRC) is currently undertaking the so-called "changing the narrative" project to reaffirm the relevance of IHL in contemporary armed conflicts by giving concrete examples of compliance with IHL.
Instead of only focusing on violations of IHL, the ICRC wants to change the narrative to good practices, and thus break the perceived dominance of violations, turning this negative spiral of reciprocity into a positive one. Furthermore, mentioning the compliance of actors who adhere to IHL may enhance their reputation, thus additionally producing an external reward. One initiative is the recently launched "IHL in Action: Respect for the Law on the Battlefield" database. 101 It is a collection of case studies documenting compliance with IHL in modern warfare. Based on publicly available information, these cases have been assessed by academics as demonstrating positive application of IHL. They demonstrate the importance of complying with that body of law in order to minimize human suffering in armed conflicts. Recent psychological studies highlight that focusing on unfavorable outcomes is not an effective way to change attitudes, while pointing out desired behaviors is more likely to generate change and set in motion a positive spiral of reciprocity. 102 The ICRC believes that a more positive focus in IHL reporting can engender further compliance with the law.

Illustrations of External Rewarding
External rewards are benefits outside the base treaty, i.e., "on top" of the treaty. They may be needed to induce entry/compliance if the cooperative gain of the treaty is insufficient or suffers from social dilemma problems, which may be especially the case when global public goods are at stake. If the treaty itself lacks incentives for state B to comply, that is, diverse performance reciprocity is insufficient, only an external reward makes compliance attractive for B. External rewards can be found in linkage constellations, enhanced reputation, side payments, and rewards given within the treaty but on top of the bargain of the base treaty itself.

i. Reward via Linkage
A classical example of external rewards is linkage of treaties, be it at the entry stage or the compliance stage. 107 EU accession was used as an entry reward for cooperation with the International Criminal Tribunal for the former Yugoslavia (ICTY). 108 The cooperation and extradition requests by the ICTY were the base norms to be complied with and EU (and NATO) membership was the reward. The Republic of Croatia is a classic example. Croatia formally applied for EU membership on February 21, 2003. The European Council decided that accession negotiations with Croatia would open on March 17, 2005, provided that Croatia cooperated fully with the ICTY, in addition to the classical criteria for membership. 109 This meant that it had to take all necessary steps to ensure that the last remaining indictee was located and transferred to The Hague as soon as possible. 110 Negotiations started only in October 2005, when the ICTY prosecution confirmed that Croatia was cooperating with the ICTY. 111 But even the arrest of General Ante Gotovina two months later did not fully improve relations with the ICTY, as ICTY Prosecutor Serge Brammertz stated that some requested documents were still missing. 112 It was thus only after the closing arguments of the Gotovina trial were held in September 2010 that tensions between the ICTY and Croatia ended. Croatia then became a member of the EU on July 1, 2013.
There are many more examples. Often treaties offer new commercial ties or the reduction of existing barriers to trade to create incentives for entering and complying with a (base) treaty. The Generalized System of Preferences (GSP) under the World Trade Organization (WTO) Agreements 113 serves as an example of external rewards to ensure entry into and compliance with other international law, given that GSPs are not part of the reciprocity package of the WTO's bargain but remain unilateral and voluntary measures. 114 In the EU, the arrangement removes import duties from most products coming into the common market from designated GSP+ beneficiary countries. The GSP+ helps developing countries to alleviate poverty and create jobs, but at the same time requires the countries to respect core principles of an array of treaties on labor, human rights, and anti-corruption (the base treaties). The entry reward may be insufficient to induce compliance and thus the EU continuously monitors the beneficiary countries' effective implementation of their obligations and publishes a report on the implementation every two years. Both rewarding and rewarded countries can be better off compared to the status quo ante (seeing each country as a whole, notwithstanding winners and losers of trade within the country or third country effects). Here, cooperative benefits are the result of the respective linked treaty and accrue within that treaty, but they secure entry and compliance with a base treaty.
ii. Reputational Rewards
Moreover, external rewards can ease entry and compliance through a reputation channel. In our definition, a good reputation is an external reward for entering (normative commitment) or complying (reliability). A good reputation will allow a state to find more partners and to extract more generous concessions in future transactions. It is especially important if internal rewards do not work. Rewarding can stimulate feelings of pride and positive self-image. 115 Social approval makes individuals happy and proud while disapproval causes embarrassment and shame and makes people unhappy. 116 When praise for compliance, be it informally or in annual reports by international organizations, is attached to the base treaty as an instrument, 117 reputation is the benefit conveyed by it.

https://ustr.gov/issue-areas/trade-development/preference-programs/generalized-system-preference-gsp; for Switzerland: Federal Customs Administration, Developing Countries GSP (Generalized System of Preferences), at https://www.ezv.admin.ch/ezv/en/home/information-companies/exemptions-reliefs-preferential-tariffs-andexport-contributio/importation-into-switzerland/developing-countries-gsp-generalized-system-of-preferences-.html; Switzerland's GSP has no conditions and thus is not an external reward.
114 See Kevin C. Kennedy, The Generalized System of Preferences After Four Decades, 20 MICH. ST. INT'L L. REV. 521, 528 (2012) ("As beneficiary countries cannot count on availability of preferences, the consequent uncertainty of market access is a major concern to the countries affected."). As alluded to before, rewards outside of treaties have more flexibility but less predictability.
While much of the literature has concentrated on naming and shaming, 118 literature on naming and praising in international law is rather scarce, 119 although it has been discussed (as negative and positive reputation) in the realm of Global Performance Indicators (GPIs), such as the World Bank's Ease of Doing Business Index and Transparency International's Corruption Perceptions Index. 120 A decline in a GPI ranking may damage reputation with various audiences (citizens, business, NGOs, other states), and a climb may improve it.
Since reciprocity, retaliation, and outcasting do not work in the human rights sphere, reputation (next to assistance) is left as an international compliance mechanism. Shaming and its effect on reputation can also set in motion national processes through the activation of civil society and thus be one means of fostering compliance. 121 But focusing on bad reputation is not the only way to foster compliance; focusing on positive reputation may do so as well.
UN human rights treaty bodies adopt nonlegally binding "concluding observations" after consideration of periodic reports of states parties to the respective convention. They may include acknowledgment of positive steps taken by the state to achieve its obligations, identification of problematic areas that require further action by the state, discussion of practical steps that the state can take in order to improve its implementation of human rights standards, and follow-up on implementation of the concluding observations. These concluding observations thus already include "good practices" or positive aspects before turning to areas of concern and recommendations. For example, the concluding observations on the initial report of Greece on the Convention on the Rights of Persons with Disabilities 122 state that the Committee "values the State party's measures to render public transport in Athens and other major cities accessible and the preservation of the nominal level of disability allowances during the economic and financial crisis." 123 But UN treaty bodies could go further in using naming and praising as rewards. For instance, UN human rights treaty bodies could, under the current legal norms, reward countries for compliance with UN human rights treaties by flagging the best performers in their annual reports, in addition to naming the worst performers as they do now. Article 24 of the Convention against Torture (CAT), 124 for example, gives considerable leeway on how to write the annual report. 125 The Subcommittee on Prevention of Torture and Other Cruel, Inhuman or Degrading Treatment or Punishment decided to identify those states parties whose establishment of their national preventive mechanism was substantially overdue and to record them on a public list. 126 This amounts to naming and shaming.
It could also name the best performers in submitting timely state reports in the last five or ten years (in addition to those states that have not submitted their due reports) and report on their best practices. This could be backed up with support provided, for example, through the Special Fund established under Article 26(1) of the Optional Protocol to the Convention Against Torture (OPCAT), which is directed toward projects aimed at establishing or strengthening national preventive mechanisms. Similar mechanisms could be established for other UN human rights treaties or regional human rights treaties, adding internal rewards to the external one of reputation.
Another example of rewarding in the human rights sphere can be derived from the Global Alliance of National Human Rights Institutions (GANHRI). 127 Its Subcommittee on Accreditation (SCA) 128 gives letter grades to individual National Human Rights Institutions (NHRIs) indicating compliance with the Paris Principles. 129 The SCA is unique within UN structures, serving as the gatekeeper of these international standards, independent of UN member states. More specifically, NHRIs are given a score of "A" (full compliance), "B" (partial compliance), "C" (noncompliance), and zero if the NHRI was suspended or accreditation was revoked. This puts countries in the spotlight. Next to the reputational mechanism, NHRIs with an "A" are rewarded intangibly by being allowed to fully participate in the international and regional work, enter meetings of national institutions as voting members, and hold office in the Bureau of the International Coordinating Committee or any subcommittee the Bureau establishes. They can also participate in sessions of the Human Rights Council, take the floor for any agenda item, submit documentation, and take up separate seating. States with "B" status institutions may participate as observers in the international and regional work and meetings of the national human rights institutions. They cannot vote or hold office with the Bureau or its subcommittees. They are not given NHRI badges, nor may they take the floor for agenda items, or submit documentation to the Human Rights Council. 130 This system shows how reputation and gradual rewarding by intangible benefits works, and it has been deemed very effective. 131

122 Convention on the Rights of Persons with Disabilities, CRPD/C/GRC/CO/1 of Oct. 29, 2019 (Advance Unedited Version).

iii. Side Payments
Side payments on top of the treaty are another (classical) example of external rewards. 132 They can be given for entry or compliance.
Side payments can be considered as a reservation price paid to a target country to make it willing to enter a treaty and/or comply. 133 Side payments are commonly used to enhance international cooperation and can further expand the zone of potential agreement. 134 For example, the United States offered substantial economic and military aid to Egypt and Israel to sign a peace treaty. 135 In another instance, in 1990, the Soviet Union agreed to withdraw its troops from East Germany in return for economic aid. 136 A further example, described by Chayes and Chayes, involved funding provided by the United States to assure that successor states to the Soviet Union which contained nuclear weapons within their territories (Ukraine, Belarus, and Kazakhstan) could meet the requirements of the Strategic Arms Reduction Talks (START) to which the former Soviet Union had agreed and join the Non-Proliferation Treaty. 137

130 GANHRI Sub-Committee on Accreditation (SCA), at https://nhri.ohchr.org/EN/AboutUs/GANHRIAccreditation/Pages/default.aspx.
131 Katerina Linos & Tom Pegram, What Works in Human Rights Institutions?, 111 AJIL 628 (2017).
132 Whether those payments are made inside or outside the treaty can be a result of historical accident or of conscious design, especially in bilateral treaties. For analytical and legal purposes, the distinction still is of importance.
133 A reservation price is the highest price acceptable to a buyer to pay for a good (in our case, the highest reward acceptable to an enforcer to give to the target to induce entry/compliance) and the lowest price acceptable to a seller to sell the good (the lowest reward acceptable to a target to enter/comply).

Treaties dealing with public goods can provide rewards, like assistance, within the objectives of the treaty. They can also provide rewards outside the scope of the objective of the treaty but regulated within the treaty. This gives additional incentives for states to enter and comply with the treaty. For instance, in arms control, low-capacity countries have claimed their willingness to engage in new arms control commitments in return for assistance by wealthier countries. 138 Countries already contributing to the public good of arms control have incentives to reward countries for entering the commitment, since otherwise countries continue to engage in their harmful behavior. One example is the International Atomic Energy Agency (IAEA) aiming to "accelerate and enlarge the contribution of atomic energy to peace, health and prosperity." 139 The IAEA verifies through its inspection system that states comply with their commitments to use nuclear facilities only for peaceful purposes. 140 States agree to grant the IAEA access to peaceful nuclear facilities and to allow it to employ various verification systems. It offers its member states assistance in the planning and generation of electricity and facilitates the transfer of technology and knowledge (internal rewards). But it also promotes the achievement of the participating states' development goals concerning issues such as poverty, hunger, health, clean water and energy, and climate change (external rewards) by providing assistance in nuclear science and technology.
141 Another example is the declaration of a Marine Protected Area (MPA) as an important means for improving biodiversity and fish resources by protecting certain species for which that specific area is especially important. 142 Research furthermore shows that strategically expanding the existing global MPA network to protect an additional 5 percent of the ocean could increase future catch by at least 20 percent via spillover. 143 This amounts to internal rewarding within the Biodiversity Convention, but given the severity of the problem and the gains from mitigating it, external rewarding could be added, since strategically creating a network of MPAs yields considerable spillover benefits to other countries from increased fish stocks. 144 In both cases, those rewards could be given in the form of positive transfers to, or naming and praising of, the countries declaring the MPAs by the Biodiversity Secretariat or the Conference of state parties of the Biodiversity Convention.
In summary, the typology provides possibilities for the usage of rewards under different circumstances and underlying problem structures (game theoretically speaking) and their combination. Its toolbox also illuminates how and which rewards can be used if penalizing mechanisms do not work by themselves. In global or regional public good constellations (like MEAs, arms control treaties, and human rights treaties) coupled with incapacity, assistance as a reward, internal or external, may be needed and sometimes even topped by side payments. Negative reciprocity and complete outcasting would be undesirable in these constellations. In order to prevent negative reciprocity spirals, a shift toward positive reframing of compliance can be achieved by changing narratives of reciprocity. This connects to another way of rewarding: the use of positive reputation highlighting compliance instead of violations. If needed, this can be topped up with other external rewards, such as linkage of treaties or assistance on top of the objective of the treaty.

138 Bernauer & Ruloff, supra note 106, at 7.

IV. REWARDS VERSUS PENALTIES: A RATIONALIST ANALYSIS
In a rational choice framework, a target country does not comply if noncompliance is more beneficial than compliance. A rational enforcer can induce the target country to comply by penalizing the target country for its noncompliance or by rewarding the target country conditional on its compliance. The enforcer will penalize/reward if the benefit expected from the target's compliance is higher than the cost of penalizing/rewarding. To induce compliance, the penalty/reward has to offset the target's gains from noncompliance.
In principle, either sticks or carrots should equally lead to the target's compliance because both means produce the same opportunity costs (costs of noncompliance) to the target country. For instance, if the target country receives a benefit from noncompliance of twenty dollars, then a penalty of twenty-one dollars or a reward of twenty-one dollars should make the target equally compliant. If the enforcer applies a penalty, the target country will suffer a penalty of twenty-one dollars if it does not comply (a net loss of one dollar). If the enforcer applies a reward, the target country will suffer a forgone reward of twenty-one dollars if it does not comply (a net loss of one dollar).
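The arithmetic above can be checked with a short sketch. It is only an illustration under simplifying assumptions that are not part of the original analysis: the function names are hypothetical, the target's payoff from compliance without any incentive is normalized to zero, and the penalty and reward are applied with certainty.

```python
def net_gain_from_noncompliance(benefit, penalty=0, reward=0):
    """Target's payoff from noncompliance minus its payoff from compliance.

    benefit: gain the target draws from noncompliance
    penalty: amount imposed only if the target does not comply
    reward:  amount paid only if the target complies
    (Hypothetical helper; compliance without incentives is assumed to pay 0.)
    """
    payoff_noncompliance = benefit - penalty
    payoff_compliance = reward
    return payoff_noncompliance - payoff_compliance


def target_complies(benefit, penalty=0, reward=0):
    # A rational target complies when noncompliance leaves it worse off.
    return net_gain_from_noncompliance(benefit, penalty, reward) < 0


# Figures from the text: a benefit of 20 against a penalty or reward of 21.
print(net_gain_from_noncompliance(20, penalty=21))  # -1
print(net_gain_from_noncompliance(20, reward=21))   # -1
print(target_complies(20))                          # False
```

In both cases noncompliance leaves the target one dollar worse off than compliance, which is the symmetry the rationalist account relies on. The sketch says nothing about who bears the cost of the incentive, which is where the two instruments begin to differ.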
The question then is, when does an enforcer use a reward and when a penalty? A rational enforcer will choose the less costly measure. Both rewards and penalties produce costs. Some scholars have claimed that penalties are always superior to rewards when they are credible, because a credible threat does not need to be carried out and is therefore costless. We dispute this assertion, as threats produce costs as well: not only are building up the capacity to make threats and maintaining the threat costly, but acquiring a reputation for punishing violators is also costly. Because penalties can be very costly (we will discuss those costs in the following sections), the rational enforcer may consider offering a reward. That rewards are a means to elicit cooperation can be traced back to the Coase theorem. The Coase theorem stresses that, absent transaction costs, it does not matter to whom one initially assigns a right. Another party may (and if rational should and would) pay the right holder to relinquish it, so long as both would be better off.
Since rewards and penalties are conceptually symmetric under rationalist assumptions, less effort has been undertaken to analyze their differences, assuming that all or most generalizations about penalties are applicable to rewards. 145 In this Part, we consider some differences between rewarding and penalizing that we deem important for the analysis of rewarding in international law.

A. Costs
Rewards and penalties differ in the costs they produce; we will look at the costs that occur from applying penalties before turning to the costs produced by rewards. 146 Costs are mostly discussed in relation to retaliation, such as sanctions. Economic sanctions as described by former UN Secretary-General Kofi Annan "represent more than just verbal condemnation and less than the use of armed force." 147 They are often considered as a "milder" substitute for military confrontation. 148 Substantial research paints a bleak picture of the costs and effectiveness of sanctions. 149 Many studies show that economic sanctions increase poverty and income inequality in the target country, 150 negatively affect the availability of food and clean water, 151 and adversely affect life expectancy (especially for women) 152 and infant mortality. 153 Other scholars point out that economic sanctions worsen the targeted government's respect for human rights 154 and have a detrimental impact on the level of democracy. 155 They are also economically painful for the receiving country. For instance, Neuenkirch and Neumeier find that sanctions imposed by the United Nations and the United States reduce the target state's GDP by 25 percent and 13 percent, respectively. 156 A similar picture emerges for so-called targeted or smart sanctions. 157 These also have unintended consequences, including increases in corruption and criminality, strengthening of authoritarian rule, burdens on neighboring states, strengthening of political factions, resource diversion, and humanitarian impacts. 158 Furthermore, penalties are also costly to the imposing country. Economic sanctions significantly reduce the volume of bilateral trade between the imposing and the target state.
159 The imposition of economic sanctions may interrupt trade and financial contacts of domestic firms with the sanctioned counterpart, generating deadweight losses because import substitutes have to be produced at home at higher costs and the demand abroad shrinks, making economic operators in the sanctioning country economically worse off. 160 Thus, there is a broad consensus among scholars that sanctions produce substantial costs and in many cases fail. 161 The same holds true for targeted sanctions. 162 The costs of penalties are not only a question of how many potential violators there are and what a specific act of enforcement costs, but also of how costly it is to maintain the threat of the sanctioning mechanism. One may argue that the threat of negative reciprocity is inefficient only if it is unsuccessful; if it induces compliance then it is efficient since it ensures a compliance equilibrium. But the sanctioner can already incur costs at the threatening stage. For example, threats of sanctions may trigger uncertainties concerning trade and investment policies. 163 It is clearly costly to maintain a large military for contingencies. This aspect of the costs of penalties is often neglected.

In comparison, rewards may not have such negative consequences on the target country's humanitarian situation, but rewards must be paid, and this requires the necessary capacity of the sending state. 164 In other words, the credibility of a prospective reward depends on whether the rewarding entity has sufficient resources to provide it. The wealth of the sender is an upper constraint on the use of rewards as the sender must be able to pay the reward (similarly, the wealth of the receiver is the upper constraint on the use of some penalties). 165 One problem arises from how many countries can be rewarded. That depends on the type of reward. Internal rewards can be used for a large number of countries and may even produce "economies of scale." 166 This is different for external rewards such as payouts and assistance since funds are especially scarce on the international level. Those can be costly when many states have to be incentivized to enter and comply but are feasible when only few countries have to be incentivized, e.g., to provide a public good. 167 However, even if the number of states that require rewards appears limited, that number may not be static, and could multiply. If a state can reap not only the intrinsic rewards of participating in a treaty, but also extract a side payment for full compliance, presumably more states will insist on the side payment as a necessary part of the bargain. 168 Not only may they feel fully justified in doing so, as with treaties that assist lower income states, but the rewards may after some period of time also be seen as an entitlement. 169 Intangible rewards such as praising, in contrast, provide another type of reward that can be extended to more compliant states at lower costs.

157 Smart sanctions have increasingly replaced conventional sanctions. They include actions such as asset freezes, restrictions on luxury goods sales, travel and financial restrictions that should directly affect parties that have some leverage on the regime (e.g., political elites). In theory, targeted sanctions should be more effective as they pressure political elites and decrease the negative effects in the civil population. However, current literature suggests that comprehensive and targeted sanctions both result in civilian pain while political elites remain unharmed. See Peksen, supra note 149, at 280 (reviewing some of the most up-to-date empirical research on sanctions and stating "that both comprehensive and targeted sanctions remain morally impermissible tools due to their substantial negative externalities and low success rate").
However, it should be noted that it is difficult to measure success, as one component is the deterrence of future noncompliance and this cannot be observed.
162 BIERSTECKER, ET AL., supra note 158. One can safely assume that UN targeted sanctions should be more effective than unilateral (or regional) ones, given the number of states implementing them even if considering the practice of secondary sanctions. Assessing their effectiveness is complex, since those sanctions have different goals: they may be designed to (1) change behavior (to coerce), (2) constrain behavior (to constrain), and/or (3) send a signal (to signal). Based on an analysis of twenty-two UN targeted sanctions regimes, the authors conclude that UN targeted sanctions are effective in achieving at least one of the three purposes of sanctions 22% of the time with a higher rate for constraining and signaling but effective in coercing only about 10% of the time.
163 The IMF Global Uncertainty Index measures this. It is deemed relevant since the "index is associated with greater economic policy uncertainty, stock market volatility, risk, and lower GDP growth." See Hites Ahir, Nicholas Bloom & Davide Furceri, 60 Years of Uncertainty, 57 FIN. DEV. 58, 59 (2020); Clayton Webb, Re-examining the Costs of Sanctions and Sanctions Threats Using Stock Market Data, 46 INT'L INTERACTIONS 749, 771 (2020) (Webb concludes that "sanctions threats are not costless. The results show that sanctions threats create stock volatility, even when sanctions have not been imposed. This volatility imposes costs on firms.").
164 The question of initial endowments and capacity is connected to the Coase theorem which is generally overlooked in scholarship. In order to be able to bargain, a minimal capacity is necessary. This is not the same as transaction costs. This problem also applies to penalties since imposing the penalty requires resources and capacity. In addition, if we consider monetary fines, the receiver has to be able to pay the fine, thus penalties may incur capacity problems on both sides, receiver and sender, which is worse in comparison with rewards facing capacity problems only on the sender side.
165 See Dari-Mattiacci & De Geest, supra note 145, at 448.
166 Economies of scale describe the lower production costs of goods with increasing production output. In our case, additional members reduce the average costs of providing the (public) good. At the same time, the larger the number of participants in a treaty the higher is the collective benefit of the treaty.
167 See, e.g., Pamela Oliver, Rewards and Punishments as Selective Incentives for Collective Action: Theoretical Investigations, 85 AM. J. SOC. 1356 (1980) (pointing out that the relative costs of using rewards or punishments to produce a public good depend on the fraction of cooperators out of potential cooperators required to produce that good).

B. Rewarders' and Volunteers' Dilemmas
Another problem connected to costs arises from how to incentivize countries to contribute to the reward (rewarders' dilemma) and/or to volunteer to provide a public good (volunteers' dilemma). The rewarders' dilemma is that states would rather free ride on other countries providing the reward. The rewarders' dilemma is mitigated if rewarding generates gains for the giver(s), e.g., when states have a strong preference for the public good to be provided. Benefits vary according to the nature of the treaty or the global public good dealt with in the treaty. Rewarding may generate a positive reputation at the enforcement level (as we will see in the next Section) that may alleviate the rewarders' dilemma. Especially costly rewards are prone to the rewarders' dilemma; intangible rewards can be less costly and therefore less subject to the dilemma. Even with this dilemma, experiments demonstrate that individuals typically choose to reward, regardless of cost, and that rewards increase other individuals' contributions to public goods. 170 The volunteers' dilemma captures the expectation that one prefers to free ride on the effort of other volunteers. 171 A classic example involves bystanders observing a person in danger. One bystander is necessary to help the person but if no bystander volunteers the person suffers harm. The optimal solution to the volunteers' dilemma does not require each individual to fully volunteer; rather, coordination may be needed for assigning the volunteer. Leshem and Tabbach show that for nearly all numbers of volunteers, rewards are more efficient than penalties. 172 There are many instances where a single country can produce alone or contribute to a (global) public good, i.e., volunteer. While tangible rewards can be costly when they have to be multiplied for many countries, they may be especially useful for incentivizing only one volunteering country. 173 Penalties are hard to imagine in those instances. 
Costs are always smaller with rewarding than with penalizing with regard to the provision of a single-shot global public good. 174 For instance, saving the planet from an asteroid does not require all states to contribute to the public good. Another example would be rewarding (tangibly or intangibly) the declaration of an MPA as an important means for improving biodiversity in an exclusive economic zone to protect certain species to which that area is especially important or if that area is of special importance for a network of MPAs. 175 The compliance stage of CITES could be yet another example, 176 if African states were rewarded for keeping the numbers of endangered species at a sustainable rate. Given that ever more private-public partnerships are set up for the management of protected parks 177 and these generate income from tourism, intangible rewards from praising 178 by states would be followed by tangible ones from third parties (like income from tourism). 179

170 See, e.g., David G. Rand, Anna Dreber, Tore Ellingsen, Drew Fudenberg & Martin A. Nowak, Positive Interactions Promote Public Cooperation, 325 SCI. 1272 (2009) ("We show that reward is as effective as punishment for maintaining public cooperation and leads to higher total earnings. Moreover, when both options are available, reward leads to increased contributions and payoff, whereas punishment has no effect on contributions and leads to lower payoff. We conclude that reward outperforms punishment in repeated public goods games and that human cooperation in such repeated settings is best supported by positive interactions with others.").
can be superior to penalties when the lawmaker requires higher efforts from some citizens than from others, for instance, when only some families need to sacrifice land for a highway project).

Both dilemmas are captured in Table 2. If we have one possible rewarding state and one possible rewarded state, neither of the two dilemmas occurs.
When there is one possible rewarding state but N states that could be rewarded, the question arises of who should volunteer and thus be rewarded (volunteers' dilemma). If there are N prospective rewarding states and one prospective state to be rewarded, the question arises of who will provide the reward (rewarders' dilemma). When there are N prospective rewarding and N prospective rewarded states, both questions arise: who should be rewarded and who is rewarding.

C. Pareto Efficiency
When the enforcing country expects a higher loss from the target's noncompliance than the target country gains, there is room for Coasean bargaining. The enforcing country can structure the reward to be conditional upon the target country's compliance. Two assumptions are crucial for the pareto efficiency of rewards: expectation and conditionality.
A rational enforcer will only offer a reward if it expects a gain from the target's compliance that is higher than the cost of rewarding. How high must the reward be? Recall that in a rational choice framework, the target country would not comply if noncompliance is more beneficial than compliance. Thus, the reward has to offset the target's gains from noncompliance to induce compliance. In other words, a reward has to compensate the target for its compliance. A reward therefore has to make the target country better off compared to the status quo wherein the target does not comply and rewards are not considered. Rewards have a built-in compensation mechanism, which means that rewards always allow the target country to opt for the status quo. 180
Another important assumption for pareto efficiency is conditionality, which makes provision of the reward dependent upon compliance by the target country. If a reward is paid before the target complies, it can lead to opportunistic behavior, that is, after receiving the reward the target country may decide not to comply. In that case, the rewarding country is worse off: the target country did not comply while the enforcer paid the reward, and thus the reward is not pareto efficient. This problem is alleviated when rewards are paid conditional upon the target's compliance. If the target fails to comply, the enforcer will not pay the reward and the enforcer is not worse off. Note that in a repeated game with reputation, it can be expected that the target does not behave opportunistically when receiving the reward ex ante. A reputation for complying makes the target country likely to receive further rewards. In that case, even though rewards are given ex ante they still lead to compliance and thus remain pareto efficient.
In summary, when rewards are not conditional they can lead to pareto inefficiency but if the (prepaid) reward is followed by compliance the reward is pareto efficient. When a reward is beneficial in expectation and conditional it is always pareto efficient because it makes no country worse off while at least one country is better off.
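The two assumptions can be stated as a simple pair of inequalities. The following is only an illustrative sketch: the symbols g (the target's gain from noncompliance), c (the target's effort cost of compliance), L (the enforcer's expected loss from the target's noncompliance), and r (the conditional reward) are our own notation, introduced here as assumptions, and all payoffs are measured against the status quo of noncompliance without a reward.

```latex
% Status quo (noncompliance, no reward):
%   target payoff u_T^0 = g,   enforcer payoff u_E^0 = -L
% Conditional reward r, paid only upon compliance:
%   target payoff u_T = r - c, enforcer payoff u_E = -r
\begin{align*}
  \text{target not worse off:}   \quad & r - c \ge g  &&\Longleftrightarrow\quad r \ge g + c, \\
  \text{enforcer not worse off:} \quad & -r \ge -L    &&\Longleftrightarrow\quad r \le L, \\
  \text{hence:}                  \quad & g + c \;\le\; r \;\le\; L .
\end{align*}
% A pareto improvement additionally needs at least one strict inequality,
% so a suitable reward exists whenever L > g + c.
```

Under these assumptions, the range of feasible rewards is nonempty exactly when the enforcer's expected loss exceeds what the target gives up by complying, which is the room for Coasean bargaining described at the start of this Section.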
An enforcing country can also penalize the violator for noncompliance. In a rational choice framework, one difference between rewards and penalties is that penalties do not lead to pareto efficiency. Why is that the case? In the status quo of no penalty, the target country complies if the gain from compliance is higher than the gain from noncompliance and there would thus be no need for a penalty because the target would comply anyway. The pareto inefficiency results from the fact that penalties always make the target country worse off compared to the status quo in which the target country does not comply and does not receive a penalty. 181 If the target complies, it forgoes gains from noncompliance (gains under the status quo) and it may also incur some effort costs of compliance (e.g., destroying weapons requires effort and thus is costly). Note that this also applies to threats that are deterrent. 182 Because the target country is always worse off compared to the status quo of noncompliance and no penalty, penalties cannot meet the requirement of pareto efficiency. Penalties can at best be Kaldor-Hicks efficient if the overall benefit of the punishing countries outweighs the loss of the punished states. 183
For instance, in arms control, if country A provides a reward to country B for giving up weapons it would acquire otherwise, then the acceptance of the reward by country B increases the welfare on both sides, given that the reward is conditioned and beneficial in expectation to the enforcing country. If, in contrast, country B is subject to penalties for acquiring weapons, then country B either suffers from the effort costs of giving up certain weapons or it suffers from the penalty if it does not give up the weapons. Therefore, under a penalty mechanism, the target country is always worse off compared to the status quo.
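The comparison between rewards and penalties in this Section can be illustrated with a small numerical sketch. It is not part of the underlying literature: the class `Scenario`, the function names, and all numbers are hypothetical and chosen only for illustration, and the model assumes a one-shot interaction in which threatening a penalty is costless for the enforcer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical payoffs; all names and values are illustrative assumptions."""
    gain_noncompliance: float  # target's gain from not complying
    effort_cost: float         # target's cost of complying (e.g., destroying weapons)
    enforcer_loss: float       # enforcer's loss when the target does not comply


def status_quo(s):
    """Payoffs (target, enforcer) with noncompliance and no reward or penalty."""
    return (s.gain_noncompliance, -s.enforcer_loss)


def conditional_reward(s, reward):
    """Reward is paid only upon compliance; otherwise the status quo persists."""
    if reward - s.effort_cost >= s.gain_noncompliance:
        return (reward - s.effort_cost, -reward)
    return status_quo(s)


def penalty_threat(s, penalty, imposition_cost=0.0):
    """Target picks the less bad option: comply, or defy and bear the penalty."""
    comply = -s.effort_cost
    defy = s.gain_noncompliance - penalty
    if comply >= defy:
        return (comply, 0.0)
    return (defy, -s.enforcer_loss - imposition_cost)


def is_pareto_improvement(new, old):
    """True if no party is worse off and at least one party is better off."""
    no_one_worse = all(n >= o for n, o in zip(new, old))
    someone_better = any(n > o for n, o in zip(new, old))
    return no_one_worse and someone_better


if __name__ == "__main__":
    s = Scenario(gain_noncompliance=4.0, effort_cost=1.0, enforcer_loss=10.0)
    baseline = status_quo(s)
    print("reward 6:", is_pareto_improvement(conditional_reward(s, 6.0), baseline))
    print("penalty 8:", is_pareto_improvement(penalty_threat(s, 8.0), baseline))
    print("penalty 2:", is_pareto_improvement(penalty_threat(s, 2.0), baseline))
```

In this sketch the conditional reward leaves both parties better off than the status quo, while the penalty leaves the target worse off whether it complies (it forgoes its gain and bears the effort cost) or defies (it bears the penalty). A penalty could therefore at best be Kaldor-Hicks efficient, as noted above.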

D. Reputation of the Enforcing Country
Reputation acts negatively and positively and on two levels. The first level concerns the receiving country and how reputation affects its decision to enter or comply with a base treaty. 184 The second level, which is the focus here, regards the enforcement level, namely the reputation of the enforcing (rewarding/penalizing) country. The sanctioning dilemma has been at the forefront of discussion on compliance, leading to the prognosis that costly retaliation will seldom take place. However, states may still refer to external penalties in order to build up a reputation for penalizing violators, so that other states will be less likely to breach their obligations. 185 Imposing external penalties may hurt ties with the receiving country and is prone to generate feelings of hostility. 186 Penalties may also decrease the willingness of other countries (third countries that are not subject to the penalty) to cooperate with the penalizing country, especially when the penalty is perceived as unfair. States then could be deterred from entering a treaty with the penalizing country in the first place.
Similar to penalties but different in direction, rewards can generate a reputation for appreciating those who honor their obligations. The reliance on rewards to appreciate countries' compliance may generate a reputation of goodwill. 187 A reputation of goodwill in turn facilitates future cooperation with the target and other countries. 188 Additionally, rewards may be used to keep a reputation of good intentions, especially in relations considered as friendly. Furthermore, the more dependent the relationship, the more important obtaining the approval of the other nation will be and the higher the incentive to grant (and reciprocate) rewards. 189 Thus, even though costly today, states may refer to rewards to build up a reputation of good intentions that in turn eases (future) cooperation. However, a reputation for not penalizing may signal a permissive attitude and allow more violations to take place. Rewarding in a relationship that is considered as adversarial may not be accepted by the sender's citizens; instead penalties are used to demonstrate firmness. 190 At the same time, penalties fulfill an important signaling function: disapproval. Nevertheless, penalties that incorporate some rewarding may absorb some of its positive characteristics (e.g., reduced hostility, good will) and may be a more efficient incentive compared to pure penalizing. 191

184 See Section III.B.2.ii supra.
185 See Sykes & Guzman, supra note 35, at 443 ("The decision to bear the costs of retaliation, then, is justified by a desire to persuade others that a state will punish violations. If a state is successful in building this reputation for punishing violators, other states will be less likely to breach their obligations.").
186 See also Section V.B infra, dealing with difference in perception.
187 DAVID CORTRIGHT, THE PRICE OF PEACE: INCENTIVES AND INTERNATIONAL CONFLICT PREVENTION 10-11 (1997) ("Perhaps the greatest difference between sanctions and incentives lies in their impact on human behavior. Drawing on the insights of behavioral psychology, Baldwin has identified key distinctions between the two approaches. Incentives foster cooperation and goodwill, while sanctions create hostility and separation."). And further, at 11 ("Punitive measures may be effective in expressing disapproval of a particular policy, but they are not conducive to constructive dialogue. Where sanctions generate communications gridlock, incentives open the door to greater interaction and understanding.").
188 See also Section V.D infra, dealing with difference in stability.
189 MARTIN PATCHEN, RESOLVING DISPUTES BETWEEN NATIONS: COERCION OR CONCILIATION? 266 (1988).
190 See, e.g., Ben-Shahar & Bradford, supra note 4, at 385 ("Rewarding rogue regimes could be particularly difficult to justify to the domestic audience that contests the moral rationale for bribing belligerent countries."). Similarly, in arms control, see Bernauer & Ruloff, supra note 106, at 6 ("The strong focus on negative incentives may also be attributable to the fact that many academics and practitioners tend to dislike the idea of rewarding hold-out or laggard countries for not collaborating voluntarily in arms control. They would rather bully even bomb reticent countries into line than bribe them.").

E. Monitoring
Differences between penalties and rewards are further reflected in the monitoring of compliance. The problem of monitoring is that the target state has incentives to hide or misrepresent information to avoid the prospect of a penalty or to attract a reward by falsely claiming compliance. 192 There is an array of monitoring and verification mechanisms within international law that parties are able to "cheat." 193 For instance, a test-ban treaty (that forbids nuclear weapons testing) cannot be perfectly monitored because nuclear tests are detected by observing seismic activity (e.g., an earthquake). 194 This incentivizes states to cheat by exploding nuclear devices while claiming that the outcome was caused by seismic activity. 195
The question thus is which of the two mechanisms better incentivizes states to reveal truthful information. Actors with favorable information usually disclose it whereas parties with unfavorable information keep silent. 196 With rewarding, the incentive to provide information by the relevant state is higher, especially "to provide information on problems they encounter in implementing international commitments-if a country has no such problems-it will not receive assistance." 197 This effect is enhanced if the rewarded state is the one with the duty to provide the information in order to receive the reward. With threats, the target's common strategy is to act as if it did not hear or understand the threat. 198 The target's incentives with rewards are different: It is likely to make every effort to show it has heard the source and is behaving accordingly in order to reap the reward. Threats promote deceptive behavior on the part of targets and distrust on the part of the source. In contrast, promises can promote open and honest action on the part of targets and may promote greater trust in the overall relationship. 199 While the misrepresentation of information may be enough to avoid a penalty when noncompliance cannot be directly observed, it may not be sufficient to receive a reward when the burden of proof of compliance is on the recipient's side and demands more information sharing.
Monitoring is often hampered by technical as well as political problems. Rewards in the form of technical support, for instance, can make monitoring compliance cheaper. Regarding the political problems, monitoring can interfere with sovereign rights of a country, such as onsite inspection of military installations in arms control, or of prisons under IHL or international human rights law. Inspections may be perceived as more legitimate if rewards instead of penalties are promised and thus states will allow inspections to take place, easing monitoring. Rewards may also reduce the inclination to cheat by fostering cooperative relations. Lazear points out that rewards are superior when it is not clear what the maximum level of performance is, whereas penalties are superior when the minimum level of performance is unclear. 200

V. REWARDING: THE (BEHAVIORAL) DIFFERENCE IT MAKES
In the previous Part, we looked at differences between rewards and penalties from a rational choice perspective. In this Part, we deal with behavioral differences between rewards and penalties. As Baldwin expresses it, "When B reacts one way to a promise of $100 if he will do X, and another way to a threat to deprive him of $100 if he fails to do X, the concept of opportunity costs makes it difficult to explain why." 201 The importance of studying compliance theory from a psychological perspective arises from the different effects rewards and penalties have on human behavior. In the field of psychology, where rewards have been studied for a long time, 202 the literature generally seems to be more favorable toward rewards than toward penalties with respect to human behavior. 203 In international politics, two psychologists, Milburn and Christie, summarize rewarding as "an alternative without the major disadvantages of threat with its potential implications for instability, distrust, and mutual dislike." 204 That is, psychologists assume asymmetrical effects between rewards and penalties. Perhaps the single most important insight of cognitive psychology derives from Prospect Theory. 205 Prospect Theory questions the validity of the rationalist Coase theorem, the latter neutralizing the psychological contexts of human interactions whereas the former stresses the importance of the difference of perceived gains and losses for behavior. But can those insights be applied to states? We discuss this issue before turning to the psychological differences of rewards and penalties. As for the differences, penalties and rewards first differ in the receiver's perception: penalties are likely to be perceived as negative, rewards are evaluated positively. Second, penalties and rewards differ in the receiver's response: penalties are more likely to cause resistances or even counter-threats, rewards are more likely to be reciprocated.
Third, rewards and penalties differ in the impact on international relations, and thus their stability: penalties are likely to increase conflicts, rewards to decrease them.

A. Applying Behavioral Insights to States
The rational choice paradigm as employed in economics and IR theory informing international law has been challenged since the 1970s by psychological experimental research, with a revolutionary impact for economics and law and economics. This research shows that in contrast to the expected utility model, actors are only boundedly rational, and systematically have other-regarding preferences (both positive and negative). But is this even relevant for international law given that states are complex organizations? Political psychology in international relations has a long and strong tradition using psychological insights 206 and international law scholars have been taking up those insights more recently; we are thus not in uncharted waters. 207 There are two major challenges to applying psychological insights, especially when based on experiments, to international law.
The first challenge is the relevant unit of analysis. Whose behavior is at issue? Is it the state as a "black box" or is it individual actors, such as judges, political leaders, military commanders, trade negotiators, or other individuals, whose actions and decisions are attributable to the state under international law? Are we concerned with "elite" decisionmakers, experts, or the public? There is no methodological challenge if individual behavior is attributed to the state under international law. Or is it small decision-making groups, acknowledging that many decisions regarding international law-related conduct are made by such groups, and that group psychology is often different from individual decision making? If so, the research becomes more complex but we also know that group behavior deviates from rational choice assumptions: groups do not necessarily make a decision more rational. 208 Or do we take the state as unit of analysis? States are multi-sectoral, multi-agent entities and as such are complex organizations. There are three possible approaches to this problem. The first views the state as an organization. Psychology and behavioral economics are already being successfully applied to organizations, albeit mainly business organizations. 209 The second approach looks at the relationship between citizens and politicians. Whereas political economy has long explored domestic political processes and interactions between national and international politics (the "two level game"), 210 behavioral political economy is still in its early stages, but is gaining ground. 211 The third approach simply attributes nonstandard preferences, beliefs, and decision making directly to states (or the individuals acting on their behalf); it is this approach we follow here for simplicity.
This can be defended since much of international law decision making is in essence made by individuals or small decision-making groups; the very term "state conduct" implies that states are regularly assimilated to individual actors. The rational choice approach is no different in this: in order to reduce complexity, the same behavioral assumptions are used on the individual and the state level.
The second challenge derives from the experimental basis of much of behavioral research (but not all psychological research) used in our context. 212 Applying experimental psychology and its methods to international law is feasible. 213 Some experimental results with intuitive appeal were confirmed by field studies (e.g., in the realm of commons) 214 and were empirically tested in the context of international law using the state as unit of analysis. 215 Applying experimental insights to individual decision makers whose acts are in turn attributed to the state (e.g., treaty negotiators, diplomats, or state officials) or to international judges is no major problem, since the unit of analysis is the individual (as in most experiments). One limit of external validity is that most experiments are conducted with students. But experiments that are conducted with experts mostly show that experts exhibit similar deviations from rationality. 216 Applying experimental insights to the state directly is more problematic since aggregation problems arise. But rational choice theory faces the same criticism when applied to the state as such, which brings us back to the "unit of analysis" problem. The rationalist approach also needs to justify why and how states act rationally given that individual actors evidently show bounded rationality, since there is currently a disconnect between behavioral insights for individuals and states in the rationalist approach. Of course, material interests and strategic interaction remain of cardinal importance, as posited by rational choice theory.
But psychological realities have been underappreciated, even though psychological experiments yield more factors to consider (more, and perhaps better, tools in the toolbox) that may enable sustained cooperation in the international realm.

B. The Difference in Perception
Scholars of international relations have long understood that threats and sanctions in the international arena may generate feelings of hostility toward the source country. They may generate perceptions of "out-and-in-groups" and may result in stigmatizing effects, where the threatening party positions itself as complying with norms and stigmatizes the other country as the deviant. 217 The target's interpretation of the sender's intentions matters. Threats and sanctions are often exploited by the target government to generate an image of a hostile foreigner that holds malevolent intentions and is blamed for the economic difficulties faced by the receiving country. Positive inducements make it difficult for the regime to stigmatize the foreign country's behavior as hostile, and as a result undermine the regime's ability to mobilize support. 218 Rewards are also considered to be less confrontational as compared to penalties, and therefore lead to fewer problems related to sovereignty issues and interference in domestic politics. 219
Perceptions may differ depending on who the sender is. Threats and promises coming from adversaries are most likely perceived differently than ones from an ally. 220 Rewarding in a relation considered as friendly is more likely to be perceived as appropriate and received with sympathy, while rewards in adversarial relations are probably perceived as suspect. 221 Penalties in close relations, e.g., relations between the United States and Canada, or France and Germany, might be considered as inappropriate. But also when dealing with adversaries, negative incentives might be less effective in extracting meaningful concessions. 222 Scholars have highlighted the expectation of conflicts: when a conflict is expected, concessions to a threat will only weaken the bargaining strength, but accepting a reward would strengthen the target's position; thus rewards are more likely to be accepted. 223
Moreover, penalties may be perceived as unfair, e.g., when a country unsuccessfully tries to explain why it could not avoid breaching the agreement or when a penalty falls on low-income countries with capacity restrictions. 224 Penalties may also be perceived as illegitimate when they undermine sovereign rights of states. Rewards can help international agreements to be perceived as fair. For instance, agreements that reward poor countries for their compliance may be perceived as more appropriate and fair than agreements that rely on penalties. 225 International climate policy, for example, tackles fairness considerations by relying on rewards in the form of funds and attaching a stronger burden on developed countries. 226 Scholars of international law have pointed out the importance of fairness perception to effectively implement agreed measures. 227 Rewards that support fairness perceptions can increase compliance.

COOPERATION (1996).
219 Bernauer & Ruloff, supra note 106, at 21.
220 DREZNER, supra note 39.
221 See Milburn & Christie, supra note 4, at 631 ("The arms race could be seen as a perceptual dilemma in which each side professes to desire near parity of forces or disarmament but believes that the other side secretly harbors a motive for superiority.").
222 Drezner, supra note 39, at 201.
223 Id. (This leads to the following paradox: "In the case of incentives, receivers will be the most eager to accept a carrot when they anticipate frequent conflicts with the sender, which is precisely the situation where senders are the most reluctant to proffer the carrot. . . . Thus, senders will prefer to use sanctions over inducements against adversaries because they anticipate frequent conflicts, but those expectations also make sanctions less effective and inducements more so.").
224 Bernauer & Ruloff, supra note 106, at 5.
225 For an analogous argument in national law, see

C. The Difference in Response
There are several reasons why responses to rewards and penalties may differ. One reason is linked to the emotions produced. Threats trigger negative emotions such as fear, anxiety, or anger, and cause a subject to feel stress. 228 Stress is supposed to reduce cognitive abilities of decision making and may result in irrational evaluations of the benefits and costs of compliance versus noncompliance. 229 Threats may also provoke a perception of conflict. People tend to take hawkish decisions in conflict situations, including those described by Prospect Theory. 230 The term "hawkish" denotes a propensity for suspicion, hostility, and aggression as well as for less cooperation and less trust in the resolution of the conflict. 231 Actors who are susceptible to hawkish biases are not only more likely to see threats as more severe than an objective observer would perceive them, but are also likely to act in a way that will lead to unnecessary conflict. Thus, the response may be to resort to noncompliant behavior.
Another reason why threats may be less effective than rewards is linked to psychological costs. 232 When threats are perceived as hostile or even as insulting by the leader (or by an audience that has some leverage over the leader, e.g., elites, citizens, foreign allies), noncompliance is motivated by the desire to avoid looking weak (or losing the approval of the group). Compliance under threats is then considered as damaging to a government's reputation of firmness. 233 As penalties are almost always public (and they should be to achieve deterrence), they make it much harder for the government, if it complies with demands, to maintain that it did so voluntarily. 234 The fear of losing face therefore can result in escalation. 235 Rewards can produce a more neutral setting, e.g., by highlighting mutual benefits. 236 Adding a reward might, for example, be very effective when a government considers cooperating due to the pressure from sanctions but would not do so for fear of humiliation. A substantial reward might allow a government to sell its eventual concession as a mutually beneficial deal.
Another reason for differences in responses is linked to reciprocity. Receivers may respond to penalties with noncompliance and to rewards with compliance because reciprocity calls for returning bad for bad as well as good for good. 237 As mentioned, reciprocity has long been known to be a crucial building block of international law. 238 Laboratory experiments show that a significant number of subjects are willing to reward cooperative behavior, referred to as "strong positive reciprocity," and to punish the uncooperative behavior of opponents, referred to as "strong negative reciprocity." 239 This finding even holds true in one-shot interactions where reciprocity is costly and does not maximize the tangible payoff. More recently, in an experiment, Chilton, Milner, and Tingley examined how reciprocity influences public opposition to foreign direct investment. 240 They showed that individuals care about rewarding or penalizing foreign countries for their policies. When a foreign firm's home country restricts investments from the respondents' country, the respondents are more likely to oppose potential transactions. Other empirical findings confirm that nations reciprocate each other's behavior. 241
Penalties and rewards change individual perception of others' expected conduct. 242 Sanctioning increases the salience of unlawful behavior that may harm the general perception. 243 Sanctions might be seen as a cue that others are not cooperating and this can in turn trigger negative reciprocal behavior. Rewards incorporate an important signaling function that increases the perception of respecting law in the international arena. For instance, research in tax compliance has shown that the perception of other individuals' tax compliance is crucial for one's own tax compliance. 244 As rewards increase the salience of law-abiding behavior, they can encourage conditional cooperators to invest trust, complying if they expect enough others to do so, too. In other words, individuals are willing to contribute, trusting or knowing that others are contributing as well. The expectation is that if there are enough players "in" and there is a reasonable expectation that other states will comply, those that are conditional cooperators and who are willing to invest trust will indeed cooperate. Most individuals are conditional cooperators and the same has been diagnosed for states in climate change law. 245 The illustration in IHL on positive reciprocity from above is another intriguing example. 246 Rewarding thus has a third-party effect, especially in multilateral treaties.

233 Id. at 180-81 ("The target of a threat may defy the threatener not because the immediate tangible costs of compliance are too high but, rather, because he views compliance as humiliating or as damaging to his long-term relationships with adversaries by creating an impression of weakness under pressure.").
234 Id.
235 DAVIS, supra note 6, at 22.
236 Milburn & Christie, supra note 4, at 633.
237 PATCHEN, supra note 189, at 264 ("When leaders of one nation receive a reward or concession from another or a promise of such reward, they may believe that it is right and appropriate to reciprocate."). See also DAVIS, supra note 6, at 19 ("To the extent that promises of shared rewards promote the norm of reciprocity between actors, they hold greater prospect for transforming relations from conflictual to cooperative over a range of issues.").
Yet another reason for differences in responses is linked to Prospect Theory. Prospect Theory revealed that people are very sensitive to changes in their endowment and that choice is driven by an overwhelming psychological desire to avoid loss. 247 In our analysis of compliance with international law, Prospect Theory leads to an important hypothesis: in the domain of loss, 248 rewarding is more effective than penalizing. 249 Not all noncompliance is motivated by the desire to make gains. Sometimes it is fear that drives behavior. If noncompliance is motivated by the fear of loss, threatening more loss only enhances the motivation that gave rise to the problem in the first place. Actors in the domain of loss are risk seeking and thus, ceteris paribus, less sensitive to the risk associated with escalation. 250 For instance, not all states have an interest in complying with human rights (even when entering human rights treaties). If a leader fears losing political power when complying with human rights, threats of further losses (e.g., sanctions) may not have a deterrent effect. 251 Rather, the avoidance of losses may lead to higher degrees of repression. However, according to Prospect Theory, decision makers should be highly receptive to promised rewards when noncompliance is motivated by the fear of loss. Hafner-Burton shows that when compliance with human rights treaties is tied to tangible benefits (in the case examined, the benefits were captured via preferential trade agreements), states' human rights records improve. 252

D. The Difference in Stability
The need to analyze the behavioral differences between rewards and penalties in the field of international law further arises from their effects on interstate relations and thus on political stability: "[P]romises can transform relations among adversaries in a way that threats cannot." 253 Rewards are more likely to please the receiver and tend to invite future cooperation. Threats are more likely to reduce the receiver's willingness to have any future contact with the penalizing state. As noted by Baldwin, "If A uses positive sanctions today, B will tend to be more willing to cooperate with A in the future, but if A uses negative sanctions today, B will tend to be less willing to cooperate with A in the future." 254 Rewards are likely to spill over to the target's willingness to cooperate on other issues, while penalties are likely to impede such cooperation. 255 For instance, sanctions by the United States against Cuba, Iran, and North Korea since the Cold War era have also hindered those countries' willingness to cooperate on other foreign policy issues. 256 Rewards in the form of (economic) integration support cooperation and communication, while penalties lead to isolation. For instance, Hellquist argues that one possible reason why the EU favors sanctions abroad but not at home is that sanctions are instruments of exclusion and ostracism, a tool that does not fit at home, where disagreements are resolved through dialogue. 257 Integration is considered to be essential in promoting compliance with international law. 258 There is strong evidence from experimental research that communication and personal contacts between players increase cooperation. A meta-analysis of social-dilemma experiments concludes that discussion has an extremely positive effect on subjects' willingness to cooperate. 259 The tendency of penalties to spawn more penalties may escalate into conflicts, sometimes referred to as "conflict spirals." 260 What has been less of a focus is that rewards can produce de-escalatory behavior. 261 This insight is again linked to Prospect Theory. People generally place a higher value (usually twice as much) on what they stand to lose than on what they may gain of objectively equivalent size. The use of penalties increases the value of winning because the receiver of penalties is more likely to face a loss with respect to the status quo. Fighting in a dispute that grows in magnitude makes winning more important than it was initially. Each side becomes less willing to concede and more willing to suffer costs and take the risk of escalating the fight. In contrast, rewards reduce the value of winning in a dispute, and the question of who will win becomes less salient. At the same time, the prospect of cooperation and mutual benefits increases in salience.
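The loss-aversion step in this argument can be made concrete with a simple value function. What follows is a minimal sketch under stated assumptions, not a formalization taken from this Article: the symbols v, x, s, and λ are introduced here purely for illustration, the function is deliberately piecewise linear, and the value λ ≈ 2 merely restates the observation above that losses usually weigh about twice as much as gains of equivalent size.

```latex
% Illustrative only: v, x, s, and \lambda are assumed notation, not the Article's.
% x is a change measured from the receiver's reference point (the status quo).
% The cases environment requires the amsmath package.
\[
v(x) =
\begin{cases}
x, & x \ge 0 \quad \text{(gain, e.g., a reward)} \\
\lambda x, & x < 0 \quad \text{(loss, e.g., a penalty)}
\end{cases}
\qquad \lambda \approx 2 .
\]
% A reward and a penalty of the same objective size s > 0 are then valued as
\[
v(s) = s, \qquad v(-s) = -\lambda s \approx -2s .
\]
```

Under these assumptions, a penalty of objective size s is experienced as roughly twice as large as a reward of the same size, which is one way to see why a receiver facing penalties may have more at stake in winning a dispute than a receiver offered rewards.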
Another difference is built on trust. It was only relatively recently that IR scholars began to probe what trust really is, how it can be studied, and how it affects state relations, be it from the rationalist perspective 262 or from a more constructivist or psychological perspective. 263 Rewards are more likely than penalties to create trust, be it inter-personal trust 264 or strategic trust. 265 Intangible rewards, e.g., visits, social approval, and praise, are important instruments for building trust. There is agreement in the literature that signals of uncooperative behavior or sanctioning do not help to develop trust. 266 Missing trust, in turn, affects the stability of international relations and impacts the behavior of conditional cooperators.

E. Summary
Contrary to rational choice assumptions, rewards and penalties are not equally incentivizing means: they are not two sides of the same coin but are in fact two different currencies. Penalties and rewards communicate two different principles: the bad of breaking law versus the good of complying; disapproval versus approval; the willingness to punish noncooperative behavior versus the willingness to reward cooperative behavior. All other things being equal, these two principles do not have the same impact on behavior. Table 3 summarizes our main findings regarding the behavioral differences between rewards and sanctions.

VI. WHEN WILL REWARDS WORK BEST?
In this Part, we describe the limitations to rewarding as well as the conditions under which rewards (internal and external) are likely to be successful, also taking into account rewards' behavioral impact on a state's decision to comply.
self-esteem. 278 Rewards in the form of development aid may act as one example of reduced self-determination in the international arena. While developed countries may subjectively define development aid as rewarding, developing countries may perceive it as controlling and dependency enhancing. This dependency increases as receiving countries become subject to threats of permanent aid cutoffs. Nevertheless, external interventions crowd in intrinsic motivation if they are perceived as supportive.
Intangible rewards such as praise or approval, in contrast, have been proven to crowd in motivation. 279 In that case, self-esteem is fostered and individuals feel they are given more freedom to act, thus enlarging self-determination. If these insights can be translated to the international arena, social approval, recognition, praise, and inclusion could foster motivation to comply. This crowding-in effect is important to the analysis of international law if it is assumed that governments also comply out of intrinsic motivation.

B. Conditions Conducive to Rewarding
Several conditions are conducive to the effectiveness of rewarding. 280 First, for a reward to be effective, it needs to match the receiver's perception of what is rewarding and what is valued. 281 While the giver may subjectively define the action as rewarding, the receiver may not: "A may perceive himself as employing carrots, while B may perceive A as using sticks." 282 Promises are simply ineffective in securing concessions if the rewards promised are perceived as inappropriate or even insulting. Rewards are then perceived as coercive and operate similarly to penalties. With coercive rewards, the psychological cost of giving in increases. 283 Rewards may be considered imposing and seen as undermining the integrity of the state/political community, especially if they are conditioned, e.g., EU foreign assistance and aid in some states such as Turkey. Therefore, the value of a reward will depend on the receiver's need and on the objectives it pursues. While for some national leaders tangible rewards are of great value, e.g., grants or loans of money to their nation, for others symbols of high status are more important, e.g., social approval, inclusion in international conferences, and invitations to state visits.
Furthermore, the clearer and more specific a promised reward is, the more conducive it is to compliance. Leng showed that national leaders were more likely to respond with compliance to clearly specified, rather than unspecified, promises. 284 Along the same lines, Snyder and Diesing found that clear, explicit offers of concessions were more likely than vague offers to facilitate settlements of serious disputes between nations. 285 Deutsch suggests that vagueness about the rewards may lead the target to conclude that the promisor has little power to deliver. 286 Thus, the explicit communication of reward contingency is essential for its success; this argues for explicit internal rewards in treaty design.
The timing of delivery is important as well. Rewards are expected to be more successful if the promised reward is delivered promptly after compliance by the receiver. As Deutsch states, promises that are to be fulfilled far in the future will have a lesser effect on compliance than promises of rewards closer in time. 287 This holds even more for short-sighted leaders who value the present more than the future. 288 One example that underlines the importance of timely delivery of rewards is the revelation of North Korea's secret uranium enrichment program seven years after the so-called Agreed Framework was signed between the United States and North Korea on October 21, 1994. North Korea complained that the United States fell short in fulfilling its promises by failing to lift economic sanctions and failing to provide it with promised light-water reactors. 289 Invitations to bid for construction of reactors were not issued before 1998, and the construction of the first reactor began only in 2002. 290 North Korea's revelation was allegedly based on a failure by the United States to deliver on its promises. 291
Furthermore, to be effective, a reward must be credible. The target should perceive that the promise is within the enforcer's control. 292 Promises, like threats, according to Deutsch, will be more credible when the rewarder is perceived as determined to influence and has the capability to implement the promise. 293 Rewards that are seen as excessive may lack credibility. 294 Thus, any promises made should be clearly within the rewarder's capability to fulfill and should not exceed its budget capacity. Public statements as commitment devices can help increase the credibility of promises, as can explicit provisions for trust funds in treaties. The credibility of promises will be affected by the past behavior of the rewarder as well, i.e., how well it has kept its past promises.
Establishing a record of not exploiting others and of keeping past promises makes current promises more credible. 295
Rewards may also be more effective toward countries that already suffer from substantial penalties. 296 The marginal impact of an added penalty diminishes as penalties accumulate, and countries' sensitivity toward penalties decreases. Lastly, the effectiveness of rewards is influenced by the value the receiver attaches to future cooperation with the rewarder. 297 Considerable economic or political interdependencies increase the value the receiver puts on future cooperation and thus make a positive response more likely. This also applies to relations perceived as generally friendly. By complying, the receiver will be viewed more favorably, while rejecting an offer of a reward would cause the receiving state to be perceived less favorably. Whether a leader will comply in response to a reward thus depends on the advantage of maintaining the goodwill of the rewarding state.
Thus, rewards work best when: (1) the reward matches the receiver's perception of what a reward is; (2) the receiver values the reward; (3) the receiver depends on the rewarder to provide the reward; (4) the reward is not combined with coercive measures that would increase the psychological costs of accepting the reward (unless to deter moral hazard); (5) the reward is clearly specified; (6) the reward is delivered in a timely manner; (7) the reward is credible; (8) the rewarder has the capability to implement the reward; (9) the rewarder has a record of keeping promises; (10) the reward follows substantial penalties in grave cases of violation; and (11) the receiver appreciates the cooperation with the rewarder and depends on its goodwill.

VII. CONCLUSION
Rewarding is an important mechanism for compliance with international law. Although the problem of compliance has loomed large in the IR literature, IR scholars have largely focused on penalties (sticks). Only recently have they started to discuss rewards (carrots) as well as the interaction between the two. Mainly, they focus on security constellations and stay on the diplomatic policy level. The rationalist international law scholarship on compliance has also been more focused on penalties, and rewarding has been undertheorized in that discussion. In fact, we are not aware of a single international law article dealing with rewarding in international law more generally, let alone elaborating a typology. 298 This is surprising given that rewarding is, as we have shown, inherent in compliance mechanisms like reciprocity, reputation, and outcasting. By defining and elaborating the "classical" mechanisms more precisely and providing a toolbox of options, we hope to illuminate the compliance discussion, and, with it, the rational design literature as well. External penalties, like countermeasures and retorsions (retaliation), have advantages and disadvantages that differ from internal penalties and can be employed in other constellations. The same applies to internal and external rewards. Under a rational choice approach, the selection between rewarding and penalizing is made according to the benefits and costs of each means. Their differences, even if both means are considered to be symmetric, are manifold. Even though rewards can be costly when applied to many countries, rewards in general are pareto efficient and bring about mutual benefits. Rewards can generate a reputation of goodwill that can increase cooperation.
295 See Section VI.D (rewards can be used to generate a reputation of good will).
296 See Ben-Shahar & Bradford, supra note 4, at 427 ("Sanctions are similarly ineffective in situations where the Sender has already employed them unsuccessfully against a Target in the past and where further sanctions can only inflict marginal additional pain on the Target. . . . In such situations, the promise of lifting existing sanctions is often the most attractive reward for the Target."). In contrast, because rewards increase the target's wealth, they can lead to a saturation effect: at some wealth level a reward no longer incentivizes the target. See Dari-Mattiacci & De Geest, supra note 145, at 440. In that case, a penalty may be more efficient.
Monitoring and fact-finding, an approach that is gaining ground in international law, could be more successful when using rewards rather than penalties. 299 Penalties, when applied frequently, also create substantial costs to the penalizing country. However, when a state builds a reputation for penalizing countries that do not comply with their commitments, that reputation can deter countries from misbehavior. The insights we provide into compliance mechanisms help not only to better understand the mechanisms as already discussed in the literature but also to shift the focus from a penalty-oriented system to governance mechanisms between states. A focus on rewards can reframe the compliance debate toward positive inducements that have often been overlooked. We illuminate where rewards are already used (but were not discussed as such) and where they can be used in a more targeted fashion. Rewards can be used not only in international diplomacy (which may end up with treaties) but in treaties more generally. They are thus of practical relevance. They can be used in bilateral as well as in multilateral constellations. Rewarding can be applied in treaties dealing with global public goods and commons just as in reciprocal treaties. Rewards can also be used in soft law. We also show the limits to rewarding and the conditions under which rewards can best be used.
Moreover, theorizing only from a rationalist perspective, in which rewards and penalties are two sides of the same coin, may overlook important differences between penalties and rewards. Psychological literature (and IR scholarship) has long emphasized these differences and has reported empirical evidence on the individual and the state level. Psychological insights show that rewarding and penalizing differ significantly in their effect on an individual's behavior. Laboratory experiments show that, even when transaction costs are low, the Coasean equivalence (assuming that penalties and rewards lead to the same result) does not hold true in reality. The behavioral analysis is therefore an important addition to the study of rewards in international law. The behavioral perspective allows penalties and rewards to be evaluated from a perception, a reaction, and a stability perspective. This leads to additional arguments for why theorizing rewarding can fill a gap in the literature and in practice.
Although their costliness and ineffectiveness have been thoroughly discussed, penalties remain at the forefront of academic discussions and policy. With this Article, we submit that compliance theory needs reframing in order to realize the array of positive inducements already existing and their potential in international law. The framework we elaborate can be used for more doctrinal research on specific treaties or issue areas of international law as well as comparatively between them de lege lata and de lege ferenda.