Samsons or Methuselahs? The (Potential) Virtue of Article II Treaties

Why does Wüsthof sell a fancy kitchen knife for US$2000, but mass-produce something similar for US$100? Why do some of us mail holiday cards, while sending anything similar by email? Why does the American Journal of International Law print its journal, when interested readers—and there should be many—can read articles like Julian Nyarko's “Giving the Treaty a Purpose: Comparing the Durability of Treaties and Executive Agreements” online? Come to think of it, why bother with Article II treaties, when they too have a near substitute, more easily produced, in congressional-executive agreements? On this last question, Nyarko's article offers an interesting approach and an intriguing finding: if we measure the commitment strength of agreements in terms of duration, treaties are measurably longer and, perhaps, stronger. Having spent several years working on treaty issues for the Restatement (Fourth) of the Foreign Relations Law of the United States, I am acutely (and perhaps embarrassingly) interested in finding out why they matter. In this essay, I note some misgivings about how the article reckons the substitutability of agreements and about treating their age as a proxy for strength—perhaps Methuselah rivaled Samson's might at some point, but that was not how he distinguished himself—before closing by trying to imagine rival inferences that might be consistent with Nyarko's valuable insights.


The Puzzle of Substitutability (and Reliability)
For over seventy-five years, U.S. foreign relations law has wrestled with the relationship between treaties and executive agreements. 4 As the relationship has shifted, the questions have evolved. At the beginning, the vital question was whether the United States could use executive agreements, and for what subjects, particularly as to areas previously dominated by treaties. Nowadays, the question is slightly more academic. Why are treaties used, scholars ask, when they appear to accomplish the same things as executive agreements, but seem so much harder to get approved? This question became more pointed as treaty approvals fell, and in light of recent failed attempts to use Article II, as with the Disabilities Convention and the Law of the Sea Convention. If one regards treaties as inferior in some regard, one might go further and press for using executive agreements, particularly congressional-executive agreements, whenever possible, rather than enduring pointless ordeals. 5 Nyarko highlights these and related questions, then distills the discussion to focus on legal substitutability. That concept seems to be narrowly conceived. Thus, when Nyarko reports that treaties and congressional-executive agreements "act as legal substitutes under domestic law for the vast majority of agreements," 6 he seems to draw a parallel with the substitutability of treaties and executive agreements on the international plane; substitutability under domestic law is likewise achieved when a mechanism permits forging internationally binding agreements. 7 A fuller notion of substitutability, however, might also address domestic legal efficacy: for example, accounting for the view that congressional-executive agreements are only effective as means to establish U.S. law for matters that would fall within federal authority absent an agreement. 8 Substitutability in the eyes of any decision-maker would be yet broader. Executive branch lawyers might regard an executive agreement as second best in any area that is even arguably reserved for treaties. And they would certainly regard a treaty as a poor substitute if standing legislation already authorized a congressional-executive agreement, such that no fresh consent was needed. These more nuanced distinctions, though critical, may be difficult to capture in any quantitative approach.
Nyarko's own analysis devotes little attention to such parameters, at least beyond assuming international legal fungibility; the article instead focuses on potential "different outcomes." 9 But only the apparent fungibility of the instruments excites interest in comparing duration or any other outcomes. Perhaps data indicate that handcrafted knives last longer, or that paper holiday cards are more likely to be saved. But we would want to know if choices were plausibly being made on that basis (as opposed to, for example, if people purchased high-end knives mainly because they were pricey enough to be respectable wedding presents, or if households mailed cards to avoid real-time interactions over the holidays) and, more generally, whether the choice was demand-driven or instead driven by the supply side. Rather than exploring analogous questions, Nyarko uses some kind of "quality of the promise itself" to distinguish hypotheses concerning core differences between treaties and executive agreements from mere "orthogonal" differences, although most involved would regard themselves as addressing matters directly relating to quality. Particularly when one turns to examining the "motivations" of primary actors in "the choice to use the treaty," 10 it pays to confirm whether outcomes are decisionally salient.
As Nyarko's own, rich reading of the literature suggests, existing studies allow rival predictions on the quality of agreement types. Some, like Oona Hathaway, 11 question the relative utility of treaties, noting that the implementing legislation necessary to make a treaty domestically effective is far from guaranteed, while also contending that treaties are easier for a president to withdraw from unilaterally. Others, like Lisa Martin, 12 suggest that treaties are more reliable, because they are the costlier of the two paths for the president to pursue (in terms of mustering legislative support) and so, when selected, signal the president's commitment to the agreement. Yet others focus less on reliability than on other explanations, such as that the Senate focuses selectively on major agreements, 13 or that the president uses treaties when Senate support is high and evades the treaty process when it is not. 14 As the competing claims about reliability, at least, bear closely on Nyarko's own findings, it may be valuable to consider whether the underlying inquiries are congruent. On the one hand, the agreement types considered in the competing studies differ substantially. Hathaway focuses on comparing non-self-executing treaties with ex post (and rara avis) congressional-executive agreements, while Martin compares treaties with all executive agreements, even though sole executive agreements require no direct legislative support. Nyarko mostly follows Martin, but he does attempt to measure whether differentiating within executive agreements matters. 15 On the other hand, reliability (and durability) hypotheses tend to regard the president as the party who is motivated or choosing; this is natural, given the president's front-line role, but in tension with perceptions that the president has been compelled to use Article II at the Senate's insistence. 16 This is material. We would not ignore whether it was Wüsthof's CEO, rather than its marketing department, who insisted on producing a showpiece knife the old-fashioned way, or if one parent made posing for the traditional card a precondition for presents. Just so, as Nyarko acknowledges, we should remain attuned to whether an agreement path is chosen for domestic reasons, including because the Senate (sometimes) insists that treaties be used.

The Puzzle of Durability
As Nyarko observes, existing studies tend to examine the circumstances in which an agreement type is chosen and then extrapolate patterns that help to predict future uses, with the hope of understanding what motivates choices between treaties and executive agreements. This requires strong assumptions, and surely risks finding anecdotal support and overlooking rival explanations. 17 Nyarko's data offer additional insight, for example by suggesting only weak support for the idea that treaties are used according to historical convention. 18 Of course, totals tend to count important and unimportant agreements alike. One might accept, for example, that human rights agreements are not usually done in the form of a treaty, while at the same time recognizing that actors might insist on treating more prominent human rights conventions that way.
Anyway, if looking at outcomes, what should we measure? Most would agree that compliance is hard to assess; an agreement's resistance to shocks is also complex and subjective. Nyarko turns to what he calls "commitment strength," measured in terms of "durability," for which he uses survival-time analysis, based on data gleaned from the Treaties in Force (TIF) series. Nyarko ably explains his methodology, and I will not focus on that here, as my survival-time in discussing regressions is short indeed.
Future researchers should be cautious, though, of using "durability" as a proxy for "a party's reliability," as Nyarko suggests at one point, albeit without fully adopting that view. 19 The TIF does not detail why any particular agreement goes out of force for the United States. It might be because the United States lawfully withdrew, or because the agreement ended according to its terms or by party consent, or (if bilateral) terminated following a party's denunciation, or was superseded by another agreement. Nyarko

terms as implicating reliability, but that is not the only curiosity. 20 An agreement that falls out of force for the United States because of U.S. withdrawal surely bears on U.S. reliability. But an agreement not in force because another party (or parties) withdrew, or because the parties agreed to terminate or to a superseding arrangement, has much less to do with how parties participating in those decisions would evaluate risks. Duration, in these circumstances, really is just duration, not durability in any strategic sense.
At the same time, agreements may remain in TIF despite being completely unreliable. Some, like Martin, regard violating an agreement or interpreting it unreasonably as exhibiting unreliability. But such agreements might well remain in TIF along with others that are simply irrelevant, zombie pacts that fed once and for all time on diplomatic brains. TIF, then, will undercount as well as overcount in reckoning reliability.

Implications and Applications
Bracketing these issues, Nyarko's finding that "treaties last significantly longer than executive agreements" is potentially significant. 21 He is guarded, however, about what this indicates for instrument choice. His analysis suggests, by using controls and fixed effects, that variables associated with some theories (such as that choice is driven by subject-matter norms, or by senatorial attention only to important agreements) do not predict variation. At the same time, he does not dismiss their possible influence, even on duration, and he wisely cautions about causal inference. 22 Mainstream arguments about why one agreement type or the other might be perceived as superior are likely to be unaffected. For example, those advocating the use of congressional-executive agreements might surely maintain that if treaties actually allow presidents greater capacity to withdraw, or to renege on domestic implementation, that could cloud reliability in any particular situation even if neither seemed to affect average duration. 23 Still, among the theories he discusses, Nyarko's discussion remains relatively amenable to signaling theory and its notion that presidents choose treaties to indicate the seriousness of U.S. commitment. 24 One can see how using treaties might distinguish reliable presidents, if treaties are too costly for unreliable ones, as they need to overcome greater domestic opposition, offer greater concessions to secure votes, and the like. And surely trial-by-Senate might appeal to another state, if it cares less about whether it achieved any agreement than whether it achieved a particular kind. 25 Is the longer duration of treaties possibly support for signaling? The theory's basic premise that other states ascertain, prior to committing themselves, which path the U.S. president will choose, seems unlikely in the general case, and even more unlikely in the case of multilateral conventions.
Whether the theory's mechanisms respect real-world results, or reflect valid inferences, is also unclear. Presidents and foreign states might well regard average duration data as a poor way of determining how the United States is likely to conduct itself under a given agreement. Either may also wonder whether extrinsic factors, such as the fidelity of other states, influenced the duration of prior agreements, and another state might pause over whether it even valued extra duration in the first place. Learning that a holiday card might survive your attempts to throw it away would not be especially heartening. The data might be supplemented by case studies, but here researchers should look beyond the examples that are usually cited to demonstrate that other states insist on Article II treaties. Nyarko cites three instances. 26 For the second Strategic Arms Limitation Treaty (SALT II), the Senate, not the Soviet Union, compelled the switch from a congressional-executive agreement (the form taken by SALT I) to a treaty. 27 (Soviet officials did object to segregating some commitments into a short-term protocol, not a congressional-executive agreement, and demanded that they be incorporated into the treaty that the parties were concurrently developing. 28) For the Strategic Offensive Reductions Treaty, the United States proposed parallel, unilateral nonbinding pledges before acquiescing in Russian demands for binding commitments, but it was again U.S. senators who demanded the treaty form. 29 Finally, the Philippines sought a treaty to govern U.S. bases there, according to its view that a fresh agreement, rather than another periodic extension by a ministerial-level memorandum, was required. The Philippines also took the view that under its law, this required consideration both by the Philippine Senate and the U.S. Senate; the United States resisted the latter, and the former proved fatal in any event. 30 Instances in which other states insisted on U.S. Senate approval, as against the congressional-executive form, seem rare, and they need not in any event have much to do with whether a treaty stays on the books for a few more years.

20 As he says, an expiration date might be partly determined by unreliability. Nyarko, supra note 3, at 68. But how much is unclear, nor will it be clear whether doubts center on the other party's reliability.
21 Id. at 77, 81-82.
22 Id. at 82-84.
23 Indeed, whether a non-self-executing treaty has been implemented by statute has no necessary impact on whether it stays in force. Hope is always around the corner.
24 For suggestive passages, see, for example, Nyarko, supra note 3, at 72, 84.
25 The difference between treaties and congressional-executive agreements might even be accentuated when ex ante authorization of the latter was older and less reliable as a signal of domestic support.
I expect that further work might well suggest decision trees of mixed methodology. Convention likely establishes soft defaults. Perhaps on occasion, presidents reasonably confident of approval may opt for Article II, particularly if another route would court needless controversy over Senate prerogatives. On other occasions, legislators may demand it, either with the aim of undermining an agreement's prospects or preserving prerogatives; a president may agree to run the Senate gauntlet if she is more optimistic on a treaty's (relative) odds or fears that evasion will prove costly domestically or internationally.
In such circumstances, duration might be a happy byproduct. At least sometimes, the party driving the choice of a treaty (the president, to curry institutional favor when support is high, and legislators, to raise a fuss) may be more likely to do so for agreements they perceive will confer substantial benefits or costs, which may correlate with duration. Multilateral agreements seem like a prime candidate, and it is notable that Nyarko finds that twenty percent of multilateral agreements are treaties, "far exceeding the share in any bilateral relationship." 31 In contrast, congressional-executive agreements premised on ex ante authorization (as they overwhelmingly are) require a shorter horizon in order to realize positive gains, and may even be renegotiated, and go out of force, without troubling Capitol Hill.
Regardless, Nyarko's study gives us data that new or synthetic theories have to explain, and it is a bright spot in a field prone to anecdote and guesswork. It also invites those in the field, and future researchers, to engage in greater introspection as to what we are comparing, and what it means for an international agreement to succeed.