Introduction
Experiments in the Laboratory Country
At 4:00 pm on September 25, 2019, one of the most significant meetings in recent Chilean history took place in a nondescript office in downtown Santiago. No senior officials or members of the parliament participated, nor was it mentioned in the press or on social media. On paper, it was just the routine quarterly meeting of the Public Transport’s Expert Panel (PTEP), a group of three experts – academics in transport economics – whose purpose was simply, as the meeting’s minutes describe, the “application of the fare indexer of the … Ministry of Transport and Telecommunications” (MTT 2019, p. 1).
The fare indexer was a function that sought to determine the need for a fare increase in Transantiago, Santiago’s public transportation system, based on recent variations in the values of a series of key system inputs, such as the price of oil or inflation. Its mathematical formulation was quite simple, constituting in practice little more than “a simple Excel spreadsheet with data provided by the [National Institute of Statistics,] INE and other government agencies,” in the words of Juan Enrique Coeymans (2019), a professor in the Department of Transportation Engineering at the Catholic University of Chile and president of the PTEP at the time.
The meeting began with a presentation by technicians from the MTT regarding some recent changes in the factors included in the indexer. Based on this information, “the panel determines that it is appropriate to adjust 10 pesos in the adult fares … and also increase by 2.857% all current adult fares, equivalent to 20 pesos” (MTT 2019, p. 3), establishing that the overall increase of 30 pesos (roughly 0.03 US dollars) would come into effect on October 6, 2019. After briefly discussing other matters, the members of the panel left at 6:15 pm, “without having more topics to discuss,” as the minutes conclude.
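The arithmetic behind the panel's decision can be checked with a few lines of Python. This is only an illustrative sketch built from the figures quoted in the minutes; the function names are hypothetical, and the base fare it derives is an inference (the fare at which a 2.857% increase equals 20 pesos), not a value stated in the quoted passage.

```python
# Figures quoted from the panel minutes (MTT 2019, p. 3).
FLAT_ADJUSTMENT_PESOS = 10      # fixed adjustment to adult fares
PERCENT_INCREASE = 2.857        # proportional increase, in percent
PERCENT_INCREASE_PESOS = 20     # what the minutes say that percentage equals


def implied_base_fare(percent: float, pesos: float) -> float:
    """Return the fare at which a `percent` increase amounts to `pesos`.

    Hypothetical helper: the base fare is inferred, not quoted in the minutes.
    """
    return pesos / (percent / 100)


def total_increase(flat: float, proportional: float) -> float:
    """Add the flat adjustment and the proportional adjustment, in pesos."""
    return flat + proportional


base_fare = implied_base_fare(PERCENT_INCREASE, PERCENT_INCREASE_PESOS)
increase = total_increase(FLAT_ADJUSTMENT_PESOS, PERCENT_INCREASE_PESOS)

print(f"Implied base fare: about {round(base_fare)} pesos")
print(f"Total increase: {increase} pesos")
print(f"Total increase relative to implied base: {100 * increase / base_fare:.2f}%")
```

Under these assumptions, the two components add up to the 30 pesos that later gave the protests their slogan.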
It was quickly evident that there were “more topics” to discuss regarding this fare increase. A few days after it took effect on October 6, groups of high school students began protesting the measure by jumping over the payment booths at the city center’s subway stations. The authorities asked the police to intervene, arresting protesters, usually in a violent manner, while vilifying the movement in the media. This reaction only gave greater strength to the movement, with evasions now spreading to the entire city’s subway network. Finally, on the afternoon of Friday, October 18, the government decreed the closure of the whole metro network, leaving millions of people with no way to return home. This measure generated chaos in the city, leading to the development of massive street protests. The protestors were violently repressed by the police, only generating an escalation in the clashes that led the government to declare a constitutional State of Exception a few hours later. The movement quickly spread to other cities in the country, becoming a generalized protest against the low quality of public services, including health, education, and pensions.
As the hours passed, this unrest turned into the most significant social movement the country has experienced since the return to democracy in 1990, including rallies with millions of participants, the massive destruction of infrastructure, and more than thirty deaths and hundreds of injuries, mainly due to excessive police repression. The country was paralyzed for several weeks, forcing the authorities to make a series of concessions, most notably holding a plebiscite to determine a possible change in Chile’s political constitution (something unthinkable just a few days before). Only the beginning of confinement due to the COVID-19 pandemic at the end of March 2020 was able to pause the “Estallido Social” (social outbreak), as the movement was known.
Although the phrase “No son 30 pesos, son 30 años” (it’s not 30 pesos, it’s 30 years) – indicating that the causes of the unrest were much deeper than the fare increase – quickly became one of the most popular slogans used by the protesters, there was no doubt that the increase had acted as a trigger for the movement. As a political commentator pointed out, “no one could deny that thirty pesos … were at the origin of everything” (Mayol 2020, p. 22).
The fare indexer and the PTEP had emerged in the shadow of the massive controversy that followed the launch, in February 2007, of Transantiago, a radical reorganization of Santiago’s public transportation system. Starting from a diagnosis regarding the historically poor quality of the public transport system, Transantiago sought to provide the city with a world-class system. To achieve this, it was primarily inspired by the innovative Bus Rapid Transit (BRT) model of public transportation. Conceptually developed in the United States and England in the 1970s (Wood 2015), BRT promised to address various critical issues of mass public transportation systems rapidly and cost-effectively, offering an alternative to expensive subway systems.
Although perceived as highly successful, as we will explore in Section 1, BRT implementations have only been partial to date, typically limited to a few lines in central areas of cities. Nothing of the sort happened in Transantiago. From one day to the next, the system underwent a radical transformation, affecting almost every aspect of the public transportation system for its millions of users. To make matters more complex, the system also included a series of other innovations, from a new payment system to an alternative bus design. All was done with quite limited funding, resting on the promises of BRT to solve multiple public transport ailments in a highly cost-effective manner, and after a planning period of merely five years.
As we have explored in a previous book (Ureta 2015), this combination of factors was the perfect recipe for a disaster. From the very early hours of the opening day, almost every aspect of the new system seemed to be behaving in unusual ways or simply failing. Because users had also received virtually no training on how to navigate the new system, Transantiago rapidly became a fierce public controversy. Not only was the government criticized for its incompetence, but city dwellers also experienced a significant decline in their quality of life.
Until then, the Chilean State was widely regarded as a leader in policy implementation in the Global South. Celebrated with pompous terms such as “Tigers of Latin America,” Chile was perceived as having successfully sorted out many of the governance malaises that have historically affected its neighbors, implementing effective policy reforms that put the highly coveted prize of finally being considered a “developed country” within reach. Transantiago certainly broke this trajectory, starting a cycle of decay in the effectiveness and public standing of policymaking that finally led to the Estallido Social of 2019.
Given such an outcome, it is valid to ask “Why?”. Why was such a perilous policy implemented in the first place, especially taking into consideration that public transport reform was not an urgent matter of public concern? Chief among the possible answers to this question, we must put Chilean policymakers’ unparalleled love for risky policy experiments. In this unmatched affection, they were not alone.
Toward Experimental Societies
Experimentation is in fashion nowadays. From movements within the arts that call artists to “become experimenters” (Schwab 2013) to the notion that politics should aim to carry out “democratic experiments” (Laurent 2011), the concept has become a key heuristic for understanding several dynamics of contemporary societies. Derived from the high valuation of innovation and creativity, our societies are increasingly becoming “regimes of collective experimentation” (Felt & Wynne 2007, p. 26) focused on the deliberate creation of situations “which allow to try out things and to learn from them, i.e., experimentation. Society becomes a laboratory, one could say” (Felt & Wynne 2007, p. 26).
As the last part of the quote reveals, the notion of experiment has evolved beyond its origins in laboratory practices in the natural sciences to become a “polymorphous concept” (Schwarz 2015, p. 112) that can be applied to a wide range of social situations. In the process, the notion has progressively lost its connection with the testing of hypotheses in controlled settings, becoming something fuzzier, akin to John Dewey’s (1938) notion of the experiment as a universal form of inquiry connected with innovative kinds of problem-solving in societal affairs. In this general sense, we are all experimenters.
There is a second, more specific, way in which it could be said that ours is an experimental society. A key development associated with the emergence of a knowledge society has been the progressive “extension of research practices to sites outside the institutional framework of science” (Schwarz 2015, p. 113). Increasingly, societal actors are opting for the application of scientific principles and methods in addressing aspects of social concern, often adopting the form of “in-vivo experiments” (Callon 2009). Some of these experiments are conducted with the specific aim of producing scientific knowledge. Still, many others are conceived from the outset as novel approaches to addressing matters of public concern. In this regard, a knowledge society becomes mostly “a society that builds its existence on certain kinds of experiments, practiced outside the special domain of science” (Gross & Krohn 2005, p. 77).
In the policy domain, several authors have discussed the emergence of a broad “experimentalist turn” (Huitema et al. 2018). Mainly related to the growing popularity of Randomized Controlled Trials (RCTs) in policy evaluation, discourses and methods labelled as experimental have carried out a “silent revolution” (Ansell & Bartenberger 2016, p. 64) in policymaking globally. Within a few years, experiments have gone from being a marginal and mainly academic tool to one of the most favored approaches to policy design and evaluation in multiple areas, forming the practical basis of today’s “evidence-based policy” movement (Heinrich 2007).
Despite its popularity, there is little clarity as to what exactly an experiment is in the policy field. While, in some cases, such as in RCTs, the systematic use of methods and standards echoing those of the natural sciences represents a defining element, in many others, the term refers simply to “anything that deviates from normality” (Huitema et al. 2018, p. 145). As a result, experimentation is freely used to describe any practice that deviates from traditional ways of policymaking, typically associated with what is pejoratively referred to as a “bureaucratic logic” (Hasmath et al. 2019; Tepe & Prokop 2018). Beyond this basic understanding, different actors “do not necessarily refer to the same thing when using the term” (Ansell & Bartenberger 2016, p. 64), which generates no little confusion when trying to arrive at a more operational definition.
A View from Elsewhere
This Element aims to contribute to clarifying these issues by developing a novel approach to policy experiments. This perspective is innovative in two primary senses. First, it aims to make policy analysis denser by incorporating theories and concepts from Science and Technology Studies (STS). For more than sixty years, STS has been developing a highly innovative understanding of the role of science and technology in our societies (for an introduction, see Sismondo 2004). Given this focus, STS appears especially fruitful to offer novel understandings of policy interventions that regularly borrow elements from science in their justification, implementation, and evaluation. Second, the Element is innovative because it diverges from the usual focus on industrialized countries in most literature. Recognizing that policy experimentation is a global phenomenon, the Element aims to develop a theory specifically focused on the Global South by examining Chile, a country where radical policy experiments have become the norm rather than the exception.
Seen from elsewhere, a policy experiment will be understood as an intervention into issues of public concern that uses innovative policy instruments to obtain knowledge about their effectiveness. In Section 1, we will ground this definition on a deeper understanding of experiments, the field, and experimentation in the Global South. We will then devote the rest of the Element to portraying contemporary policy experiments as consisting of four interrelated phases. These experiments typically begin with an encounter with a charismatic policy instrument of foreign origin, an instrument that generates diverse imaginaries and emotional attachments (Section 2). These imaginaries and attachments motivate a new problematization of some pre-existing issue, which presents the new instrument at the center of its possible resolution (Section 3). The acceptance of this problematization leads to testing the instrument in mesocosms formed by multiple components of the issue, which are usually purified and reordered following the frames proposed by the instrument (Section 4). These tests yield different types of results, primarily transformations in the experiment’s objects, various types of situated inscriptions, and emotional (dis)attachments (Section 5). Finally, evaluations of the experiment’s effectiveness are conducted, yielding contradictory evidence and lessons learned.
The growing popularity of experimental policymaking poses not only practical but also ethical challenges. A central element of policy experiments is that their primary focus is on testing an instrument, rather than necessarily improving the issue at hand. Such tests could have nefarious consequences for the societies in which the experiments are conducted, as was the case with the Transantiago project. Furthermore, conventional accountability mechanisms often fail to effectively address innovative and complex policy experiments. Given their diffuse and tentative nature, responsibility for their design and implementation is typically widely distributed, with the result that no one feels compelled (or is forced) to assume responsibility when things go wrong. The sections of this Element, therefore, not only propose a new vocabulary for analyzing contemporary policymaking but also aim to explore alternatives to the dilemmas that policy experiments generate.
1 Which Policies, Whose Experiments
The 2019 Nobel Prize in Economics, awarded to Abhijit Banerjee, Esther Duflo, and Michael Kremer for their work in development economics, seemed to enshrine the primacy of experimental methodologies in evaluating public policies. These researchers are among the most prominent contemporary representatives of the current approach, which has sought to utilize RCTs for assessing and selecting poverty reduction policies, an approach that has revolutionized development economics. In recent decades, this method has been applied to an increasing number of areas of public action, ranging from education to housing.
Starting from the pioneering work of Campbell (1969), RCTs are based on
Creating two groups through random assignment. One group is exposed to the treatment intervention, and the other group (the control or comparison group) is exposed to an alternative intervention or no intervention. Both groups are then followed for a specific period, and the results of interest are observed.
Based on this formulation, its proponents identify the key advantage of these experiments as the easy establishment of causal relationships between the components of a policy, with the intention of determining the effectiveness of a particular intervention. The transparency of its design appears to ensure its objectivity. As Banerjee has enthusiastically concluded (2007, p. 115), “the beauty of randomized evaluations is that the results are what they are.”
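The logic of this design can be made concrete with a small simulation. The sketch below is purely illustrative and not taken from any of the studies cited: the function name, sample size, outcome distribution, and effect size are all assumptions. It randomly assigns simulated participants to a treatment or a control group and estimates the effect as the difference in mean outcomes.

```python
import random
import statistics


def run_simulated_rct(n_participants=2000, true_effect=2.0, seed=0):
    """Simulate a two-arm randomized trial (hypothetical, illustrative only).

    Returns the estimated effect (difference in group means) and the
    sizes of the treatment and control groups.
    """
    rng = random.Random(seed)

    # Random assignment: shuffle participant ids and split them in half.
    ids = list(range(n_participants))
    rng.shuffle(ids)
    treatment_ids = set(ids[: n_participants // 2])

    treated_outcomes, control_outcomes = [], []
    for pid in range(n_participants):
        # Assumed outcome model: noisy baseline, plus a constant effect if treated.
        baseline = rng.gauss(50.0, 10.0)
        if pid in treatment_ids:
            treated_outcomes.append(baseline + true_effect)
        else:
            control_outcomes.append(baseline)

    estimate = statistics.mean(treated_outcomes) - statistics.mean(control_outcomes)
    return estimate, len(treated_outcomes), len(control_outcomes)


if __name__ == "__main__":
    estimate, n_treated, n_control = run_simulated_rct()
    print(f"Treated: {n_treated}, control: {n_control}")
    print(f"Estimated effect: {estimate:.2f} (simulated true effect: 2.0)")
```

Because assignment is random, the two groups differ on average only in exposure to the intervention, which is what licenses reading the difference in means as a causal effect. The simulation also shows what the design leaves out: everything about the setting is reduced to a single noise term.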
Popular in the 1960s and 1970s but declining afterwards (Shadish & Cook 2009), RCTs have experienced a strong resurgence in recent years, being considered the “gold standard” (Torgerson et al. 2015) for evaluating policies across multiple areas. This approach has become especially popular in the Global South, where RCTs are usually seen as “objective, unbiased, independent of any subjective assumptions, and therefore academically rigorous and trustworthy, as well as legitimate from the point of view of actors in the field of development aid” (de Souza Leão & Eyal 2019, p. 387). Such an assumption has transformed several locations in the Global South into laboratories for various RCT-inspired policy experiments, ranging from poverty reduction (Tyagi & Webber 2021) to technological innovation (Fejerskov 2017). In the process, RCT practitioners became increasingly intolerant of other modes of policy experimentation (Harrison 2013), seeing randomized trials as the only serious form of policy experimentation.
The progressive universalization of RCTs as the “gold standard” for policy experimentation in the Global South represents a problematic development in at least three primary senses. First, it rests on a radical simplification of the policy process in contemporary societies. Second, it misunderstands several critical elements of field experiments in policy. Third, it overlooks the numerous forms of policy experimentation already underway in the Global South, most of which are not remotely connected to RCTs. The rest of this section will explore these three objections.
Policies as Assemblages
Emerging in the immediate aftermath of World War II, policy analysis had to confront firsthand both the bureaucratic horror of the Holocaust and the ethical dilemmas derived from the Manhattan Project. For this reason, from the beginning, it was imagined as an analytical and moral endeavor. As Harold Lasswell (1951, p. 3) claimed in one of the founding texts of the discipline:
A policy orientation has been developed that cuts across the existing specializations. The orientation is two-fold. In part, it is directed toward the policy process, and in part toward the intelligence needs of policy. The first task, which involves the development of a science of policy formation and execution, employs methods from social and psychological inquiry. The second task, which consists of improving the concrete content of the information and the interpretations available to policymakers, typically extends beyond the boundaries of social science and psychology.
What was sought was not only to understand the formulation and effects of policy but also to provide recommendations for their better design, to avoid repeating processes such as those experienced in the first half of the twentieth century.
As a basic strategy to achieve both goals, strict adherence to scientific knowledge production models was demanded, to “replace the ‘ordinary and local’ knowledge of politicians and citizens on the formulation of policies” (Hoppe 1999, p. 204), which was explicitly blamed for the excesses of the past. This “scientific” approach would not only produce better analyses but also contribute to “the reorganization of society through policy measures based on a scientific problem-solving rationality” (Webb & Gulson 2015, p. 164). The scientization of policy analysis was seen as a prerequisite for making increased contributions to actual policy design, thereby transforming the very issues motivating such interventions.
Following this “scientific” approach, policy design was usually understood as “the logical way of determining the means available to achieve a given end” (Alexander 2000, p. 245). This logic rests on a realist framework that assumes that “the data and observations that form the inputs for policy analytical techniques are unproblematic” (Hajer & Wagenaar 2003, p. 16). From this perspective, policy design assumes a linear and quasi-automatic character, as described by Colebatch (2006, p. 4): “a problem is identified, data are collected, the problem is analyzed and recommendations are given to the policy implementer, who makes a decision that is implemented.” Such a process can be divided into easily distinguishable and self-contained phases, each one with its components and characteristics.
A key consequence of this approach, as Béland and Howlett (2016, p. 394) note, is the notion that in policy design, “problems precede solutions.” Well before any policy is sketched, let alone implemented, there are problems of public concern that demand attention. These problems are self-evident to most relevant actors, becoming the key drivers of policy design. Given such demand, for these authors, policy design becomes mostly “the elimination or refinement of options based on criteria such as efficiency, effectiveness, legitimacy or applicability” (p. 394). It is usually accepted that policymakers have preestablished preferences for specific policy designs, but these preferences “only follow the consideration and definition of the problems, not the other way round” (p. 395). From this perspective, the policies finally implemented appear as “technical, rational, action-oriented instruments that decision-makers use to solve problems and effect change” (Shore & Wright 1997, p. 5).
This approach has given rise to a plethora of empirical research that seeks to describe and analyze the well-defined stages of policy design in specific contexts. Policy design appears as a concatenation of logical and coherent steps, all focused on addressing the original problem, and in which different human actors occupy leading roles. In this endeavor, primary research materials have been the narratives developed by the actors involved, either verbally (primarily through interviews) or through the “paper trail” of documents they produce. The overall emphasis has been on studying policies that can be classified as either “success stories” or “failures” to extract “good practices” or “lessons” that can be applied to policies being implemented elsewhere.
Motivated by developments in critical and poststructuralist theory, and adopting genealogical and ethnographic methods, a growing number of policy researchers have challenged the linear model of policy design since the 1990s. What they found when applying these alternative conceptual and methodological tools differed from the usual stories of policy success and failure, as well as the seemingly endless quest for snippets of “good practice.” As Radin (2000, p. 221) concludes in one of the pioneering examples of this approach, from this perspective, it rapidly became evident that “there was a disconnect between the analysts’ perception of self-value, usually based on models of rational action, and the actual contribution that individuals make to the weaving of the policy formation process.” Put another way, there was an apparent mismatch between the neat narratives of policy practitioners (and many policy analysts) and the messy realities of policy implementation.
Seeking to escape this dead end, these authors have sought to expand the analytical tools of policy studies. Inspired by the “practical turn” in the social sciences (Reckwitz 2002; Schatzki et al. 2001), they have argued for the need to move the focus to the analysis of the concrete practices through which policies emerge and are implemented. As Colebatch and co-authors (2010) argued in an influential publication, policy design and implementation are a form of “work,” or a sum of practices that must be executed. Therefore,
The main concern of the scientific study of what policy professionals do should not be the methodological rules of self-justification, but observation and immersion in what public policy workers are doing most of the time and what they achieve with their activities. Just as science is what scientists do, public policy is what those who professionally engage in public policy do, in other words, how public policy practices are made and how they evolve
From this perspective, the study of policy comes to be understood as the systematic (even forensic) analysis of the routine behaviors of the multiple actors involved in the process of generating and administering a policy.
This approach made it rapidly evident that, although human actors and institutions always play central roles in policy design, they are accompanied by a plethora of nonhuman entities, ranging from signs to infrastructures. These material entities are not merely inert tools, backdrops for the projection of human agencies and institutions. They are conditions of possibility for the practices from which policies emerge: “artefacts and practices […] are linked to each other, they constitute each other” (Freeman et al. 2011, p. 129). To make sense of policy, analysts must “study the objects in practice and pay as much attention to the material or nonhuman part of policy as to the immaterial or human part” (Mellaard & Meijl 2017, p. 332).
This growing attention to technical practices and artifacts has moved policy studies closer to the interdisciplinary field of STS. Since its inception in the 1960s, STS has understood science, technology, and society as entities so deeply interrelated that they must be understood as co-constructed (Jasanoff 2006). Not only do processes of scientific knowledge production and the production of technologies appear to be crisscrossed by social, cultural, and political matters; processes traditionally understood as purely social (cultures, power, values, affects, etc.) are also seen as dependent on countless artifacts and devices of techno-scientific origin, without which they could not exist.
In recent years, an STS sensibility has been fruitfully applied to understand multiple political problems and themes (Barry 2001, 2013; Bennett 2010; Braun & Whatmore 2010a; Gomart & Hajer 2003; Marres & Lezaun 2011). The common objective of these authors has been the development of a “materialist political theory” (Braun & Whatmore 2010b, p. x), or the recognition that you cannot understand political processes without considering the “missing masses” (Latour 1992) of multiple material entities at the core of them. This sensibility has led to the recognition that “material objects can no longer be conceived as a stable terrain on which the instabilities generated by disputes between human actors develop; rather, they should be understood as integral elements in the evolution of these controversies” (Barry 2013, p. 12).
The consideration of objects as political actors does not imply returning to traditional forms of technological determinism – or the approach that understands technology as the primary driver of history (Smith & Marx 1994) – but simply being sensitive to the basic fact that acting in the world involves acting-with-others, others who are both human and nonhuman. In these complex and evolving patterns of relationships, material objects produce effects not through a unidirectional determination of human relationships, but in variable, open, and multidimensional ways. In some cases, the impact of these materials on our forms of political organization may be minimal, serving only as support for human decisions and actions. In many other cases, however, “governance problems are caused both by the behavior of materials (their resistance and disturbance) and by the behavior of humans” (Barry 2011, p. 10). Technical artefacts should be seen as political actors. Although lacking self-consciousness, they could well originate and intervene in multiple ways in our modes of political organization, as we saw in the case of Transantiago’s fare indexer in the introduction.
This revaluation of the material components of politics has led to a rethinking of the very nature of policy. From an STS perspective, policies appear as distributed and non-localizable entities; much closer to the notion of an “assemblage” than to well-defined processes of implementation. Following Deleuze and Guattari (1988), a previous work argued that “assemblages are never totally stable and well-defined entities; these do not have an essence but exist in a state of continuous transformation and emergence” (Ureta 2015, p. 34).
Understanding policies as assemblages “upends the often implicit assumption that policies emerge fully formed in a particular place and then sometimes move, entire and unchanged, across space” (McCann & Ward 2012, p. 328). On the contrary, policies as assemblages are always embedded in continuous processes of change and redefinition, gaining and losing elements and capacities. Policies as assemblages are constantly being “made and unmade, intersected and transformed: creating territories and then undoing them” (Wise 2005, p. 86). This fluid character makes it almost impossible to empirically distinguish between different stages of the process, such as problems, designs, and implementation (Gill et al. 2017, p. 8). In practice, multiple versions of a policy often coexist, each with its objects, processes, and purposes. What is called “policy,” therefore, is often little more than “the glue that holds together, or seeks to hold together, a mosaic of humans and nonhumans in a given problem space” (Mellaard & Meijl 2017, p. 335).
From this perspective, policies become assemblages of multiple entities, both human and nonhuman. These assemblages are open and undefined, extending in multiple directions at once. This tentacular character implies, as we will see in the rest of the Element, that different stages can coexist, from initial ideas to actual implementation, often causing considerable friction and controversy. Critically, problems do not always precede solutions; the two are often co-created, experiencing many changes along the way.
In these processes, material entities always occupy central places. Sometimes they act as necessary counterparts to human actions, serving as the scaffolding that supports ideas and values in time and space. In many others, especially in highly technological societies such as ours, they become the leading operators of policy processes, acting in ways that could be perplexing to the very human beings that created them.
In practice, policy design and implementation bear little resemblance to the neat narratives of success/failure of early policy analyses. More than a deployment of a previously established rationality, they appear as widely open processes, populated by a diverse array of human and nonhuman entities, whose paths and effects are rarely evident from the outset. This openness is especially evident when policies are assembled as experiments, as we will see in the next section.
The “Field” of Field Experiments
RCTs are commonly referred to as “field experiments.” According to their technical literature, a direct connection with the field allows practitioners “to yield compelling evidence of causal effects on real-world behaviors” (Baldassarri & Abascal 2017, p. 42). In contrast with the perceived artificiality of a laboratory, the field provides “the ability to replicate simple, transparent, and credible designs … [allowing] to cumulate knowledge across a set of related experimental studies” (Dunning 2016, p. s1). These advantages, however, come with a caveat. Since the pioneering work of Campbell (1969), it has been recognized that field experiments, in contrast with laboratory practice, are constantly menaced by the interference of multiple external factors. The field presents a complex experimental setting, featuring combinations of events and agents that make it challenging to replicate results, thereby jeopardizing the key aim of proving causality.
To counter such consequences, RCT practitioners have introduced a series of methodological demands, ranging from the random allocation of participants to the suspension of the researcher’s theoretical background. In a process that Duflo (2020, p. 1966) has compared with plumbing, the researcher is constantly tinkering with experimental entities and spaces with the overall aim of “making the field vanish: … field experiments devoid of a messy and complex environment” (Favereau 2021, p. 344). While in discourse, “the field” is frequently mobilized by RCT practitioners to enhance the validity of their results, in practice, it is actively resisted, if not proscribed.
Such an aim, however, is fatuous. As the literature on scientific fields has explored (for an overview, see Kohler & Vetter 2016), this term has been used since its inception to describe science practiced in “uncontrolled settings” (Kuklick & Kohler 1996, p. 13). Emerging as a counterpart to modern notions about the laboratory, the field has always referred to “the messy, complex world outside the lab” (Vetter 2022, p. 458).
In contrast with the aseptic character of labs, fields are always populated. Especially in the social sciences, research fields are places of inhabitation and habits, “and the people within them – like all humans – are constantly changing, responding, and adapting” (Dent 2022, p. 138). This populated character forces field scientists to negotiate constantly with the many other uses and meanings of the experimental space. It also means that “scientists are seldom the most important or the most powerful people at their field research sites” (Rees 2009, p. 8). This relative weakness not only restricts scientists’ capacity to conduct experiments but may also turn other inhabitants of these spaces, even those without formal education, into producers of scientific knowledge (Kuklick & Kohler 1996, p. 5). Well before the current hype about citizen science, scientific fields were already places where scientists were frequently asked, and sometimes even compelled, to allow others – human and nonhuman – to participate in the production of scientific knowledge. More than a particular physical location, the field should be seen as a place of many unforeseen encounters, a “relational category” (Kohler & Vetter 2016, p. 284).
This populated and relational character has critical consequences for the knowledge being produced at field sites. First, field data is always located. No matter how much work is put into purifying research materials and inserting frames that limit their behaviors (as we will see in Section 4), there are always local configurations of people and things that remain unaccounted for at the onset of an experiment and can significantly impact research results. As Kohler and Vetter (2016, p. 290) concluded, “all the field is somewhere.”
Second, field data is experiential. Although technical equipment always plays a role, most field data is acquired “through [the] interactional involvement” between the researchers and the objects of research (Bogusz 2022, p. 17). In these encounters, it is not only the multiple entities populating field spaces that have central roles. The researchers themselves, with their values, judgments, and affects, are also embedded in the knowledge being produced.
Third, field data is negotiated. Given the presence of multiple agents with diverging aims, field-based data production is always the result of negotiation processes, in which several points of view must be taken into consideration. Such negotiation is inescapable and places “constraints on what, if anything, researchers could do in particular places” (Rees 2009, p. 8). Given the inescapability of negotiation, there is always a certain degree of politics in field data, as different actors attempt to mobilize other aims and projects through field data.
Finally, field data is unexpected. The field is a place of constant improvisation and transformation, with scientists continually tinkering with their experiments to adapt to the demands of its multiple inhabitants. Constant adaptations mean that many outcomes are largely unexpected, openly defying the researchers’ initial expectations. Such surprises, however, are not necessarily a problem, as they “might interrupt research expectations in promising ways” (Gieryn 2006, p. 6).
The located, experiential, negotiated, and unexpected character of field data makes its replication challenging. As noted by Candea (2013, p. 242), “you never step twice into the same fieldsite.” Given their populated and complex character, fields are always fleeting and in a state of transformation. Hence, reproducing the very same combinations that allowed someone to produce specific data is highly unlikely. Even if researchers placed the same entities under the same circumstances, the mere passage of time would produce differences in response. Field data in the sciences is therefore never final, as “interpretations can always be contested and attributed to uncontrolled variability of place” (Kohler & Vetter 2016, p. 288).
All these aspects are especially relevant when considering field data related to policy experiments. As noted by Cartwright (2012, p. 977), the situatedness of data rapidly derails the replicability claims common to many RCTs.
[Experimental data] are local because they arise from an underlying arrangement of individual preferences, habits, and technology and are tied to these arrangements. … These principles are fragile [because] … when governments try to manipulate the causes in them to bring about the effects expected, they are likely to alter the underlying arrangements responsible for those principles in the first place, so the principles no longer obtain.
Policy experiments always take place in a specific location, one that endows them with particularities that are difficult to find elsewhere. Moreover, the exact configuration of these particularities is constantly shifting, making it extremely difficult to expect the same results when implementing them in different contexts and/or at different times.
A common reaction has been to try to control as many of these particularities as possible, producing spaces that can be replicated elsewhere. However, in policy practice, “implementing control strategies often fails, as even the experimenters themselves sometimes admit” (Schickore 2024, p. 3). There are, simply, too many aspects to consider to achieve any form of successful control. Even when control is feasible, too much of it can be harmful, as “an over-emphasis on control might mislead by obscuring the very object of investigation or by preventing fruitful discoveries” (Desjardins et al. 2023). As a consequence, “you cannot just take a causal principle that applies here, no matter how sure you are of it, and suppose it will apply there” (Cartwright 2012, p. 978).
The relevance of field experiments extends well beyond their capacity to demonstrate causal relations in policy implementation. In line with a general turn toward practice in social theory (Schatzki et al. 2001), there is a growing recognition that what is interesting in an experiment is not its results but the processes through which these results are established. As STS has extensively explored, experimental practice is not only the systematic observation of phenomena, but the creation of multiple “inscriptions” (Latour & Woolgar 1986) through which phenomena are rendered observable. Critically, different experimental practices will yield different inscriptions, thereby altering the very results of the experiment and the theory that can be modeled after them. Experimental practices are not merely the repetition of standardized analytic procedures, but the very space in which the realities of science and society are being created and recreated.
In policy, experimental practices encompass a range of elements. As already shown, there are various theories and hypotheses. Some adhere to the tenets of conventional economic theory, as seen in RCTs, but others can take starkly contrasting directions. Typically, theory can only provide researchers with a general understanding of a particular field and its inhabitants. The phenomenon under experimentation remains poorly understood by theory (otherwise, it would make no sense to experiment with it); hence, theory offers no guidance for experimental practices (Rheinberger 1997). In other words, “theories are rarely specific enough about the minutiae of engaging the world” (Gooding 2017, p. 117). In many policy experiments, there is no well-defined hypothesis preceding the intervention, as they are commonly run “to explore a new domain, without having any systematic high-level theory to guide their design and implementation” (Arabatzis 2013, p. 163).
Then there are the experimenters themselves, whose skillful movements and embodied knowledge are crucial to achieving results. These experimenters are typically more than just epistemic actors. In parallel, they tend to exhibit significant degrees of personal commitment to their methods, even becoming actively intolerant of other approaches (Harrison 2013). Such zeal manifests in multiple ways, from arguing constantly about the practical and moral benefits of their approach (Ravallion 2020) to engaging in concrete practices and procedures aiming at making others see the world the way they see it (Goodwin 1994).
Experimental practice in the policy field is also populated by multiple “epistemic objects” (Rheinberger 1997), which are the entities and/or processes that scientists wish to study. From sought-after policy outcomes (such as a decrease in armed violence) to specific identities (such as welfare recipients), policy epistemic objects are, by nature, ill-defined; hence, experimental practice focuses on subjecting them to different kinds of manipulations and obtaining specific knowledge by analyzing their reactions. Such a process is rarely straightforward, as epistemic objects commonly respond in ways that can be baffling and/or unintelligible to the experimenters, easily becoming “strange things” (Ureta 2015).
Mediating the relationship between theories, experimenters, and epistemic objects, field experiments comprise multiple policy instruments. Lascoumes and Le Galès (2007, p. 4) have defined them as “ … an artifact that is both technical and social, which organizes specific social relations between the State and its recipients, according to the representations and meanings it entails.” From this perspective, instruments are not singular entities but complex and heterogeneous sociomaterial networks, including elements such as “aligned practices, trained bodies, specifically configured tools, supporting data and organizational infrastructures” (Voß 2016, p. 144). In addition to material elements, these instruments are also “carriers of values, fed by an interpretation of the social and by precise notions of a particular mode of regulation” (Lascoumes & Le Galès 2007, p. 4). Policy instruments operate on multiple scales and have varying levels of complexity. Some are singular and self-contained, having well-defined and obvious boundaries. Others include multiple components within themselves, each of which can operate independently as a separate instrument. The latter function in practice as meta-instruments, or instruments of instruments, although they are also frequently treated as singular entities, especially in the early stages of their development.
A notable aspect of contemporary policy practice is that these instruments are increasingly effective “in the absence of an agreement on exactly how they do so” (Radder 2003, p. 5). Policy instruments are embedded with variable degrees of “thing power” (Bennett 2010). Sometimes they are rigid, setting quite clear limits and constraining action (Baird 2004); at other times, they are looser and open to negotiation and interpretation (de Boer et al. 2018). In all cases, policy instruments are at the very core of contemporary experimental practice.
Rather than a set of individual and self-contained practices for establishing causal relationships, field experiments have to be seen as forming part of broad and complex “systems of experimentation” (Rheinberger 1997), in which a wide range of actors – human and nonhuman – actively participate and produce multiple types of results, not just “evidence.” In these processes, the experiments often present a “life of their own” (Hacking 1983), commonly generating innovations and surprises, even causing significant transformations in the entities and realities under manipulation, as well as in the scientists themselves (Latour 1993).
Southern Experiments
The image of a Western technocrat arriving in an out-of-the-way country to implement sweeping changes has become deeply ingrained in the public discussion about policy experiments in the Global South (for a poignant example, see Wilson 2014). Although associated with the global spread of neoliberalism since the 1990s, these movements have a longer genealogy, being structural components of multiple “colonial governmentalities” (Scott 2005) emerging since the seventeenth century. From Africa (Tilley 2011) to Latin America (Medina et al. 2014), multiple territories – including entire countries – were frequently transformed into laboratories for radical policy experimentation by colonial powers, ranging from governments to corporations.
The magnitude of these interventions, particularly when resulting in nefarious consequences for local populations, has obscured a parallel phenomenon that has become increasingly prominent in the last few decades: policy experiments carried out by local powers in these countries. This omission begins with the overall northern bias in a significant portion of the literature on policy innovation, a trend that has “significantly underrepresented” local interventions (Goyal et al. 2025, p. 223). As argued by Kemmerling and Makszin (2023, p. 540), such a bias results in a situation in which,
Global South countries sometimes do not even get the credit (negative or positive) for what they have done. Furthermore, even if they do, the odds are considerable that repackaging leads to a lop-sided, contorted, or biased form of policy translation into the national context. The irony then becomes that any learning or even adoption can easily turn into failure, even if the initial model is quite successful.
The recognition of such a bias has led to the development of a proper “southern perspective” in policy analysis (Nedley 2004), especially regarding highly innovative experiments. Besides multiple case studies of radical concrete experiments – from China’s one-child policy (Greenhalgh 2008) to Kenya’s mobile money transfer service (Kingiri & Fu 2020) and Ecuador and Bolivia’s Buen Vivir alternative development model (Acosta 2012) – this aim has translated into an emerging field studying the mobilization of policy innovations within and beyond the Global South.
Leaving aside the usual north–south mode of policy learning, significant efforts are being made to highlight trajectories that grant greater agency to actors and institutions in the Global South. First, this aim translates into studying south–south mobilities (Waisbich et al. 2021), when policy innovations travel between countries in the Global South. For instance, a body of literature has explored the role of particular Latin American countries, such as Brazil and Colombia, as policy exporters within the continent (Osorio Gonnet 2021) and also to Africa (Oliveira & Milani 2022). More recently, a second branch of literature has emerged, studying south–north policy mobilities (Kemmerling 2023), which refers to the adoption of southern policy innovations by industrialized countries. Cases such as Brazil’s participatory budgeting (Lehtonen 2022) and Mexico’s Conditional Cash Transfers (Parker & Todd 2017) show that such movements are more frequent than expected but go unnoticed due to a perceived lack of authority about their point of origin (Kemmerling 2023).
Chile is a prominent example of these trends. As mentioned in the introduction, the country’s significant development in the last few decades has not been achieved solely by successfully implementing policies already tested elsewhere, especially in the North. Since the 1960s, the Chilean State has also been a leader in the successful implementation of highly experimental policies. Such experimentation was initially influenced by foreign agencies, which viewed the country as a “laboratory” where novel policy ideas could be tested (Dezalay & Garth 2005, p. 223), with an early relevant example being an extensive, US-motivated, agrarian reform policy started in 1962 (Bellisario 2007).
The rise to power of the socialist Salvador Allende in 1970 intensified this trend toward experimentation with novel policy ideas, from radical social policies (Zúñiga 1983) to the creation of the world’s first computerized productivity network (Medina 2011). Although it marked a radical swing to the right, the military’s arrival in power only intensified this experimental trend. Under the auspices of the “Chicago boys” (Valdes 1995) – a cadre of young technocrats trained in neoclassical economics at the University of Chicago – the dictatorship began implementing, from 1978 onward, a set of radical policy experiments that set the tone for the neoliberal governance revolution that would sweep the world in the following decades (Rupprecht 2020). This rapid switch between such contrasting forms of government “inserted Chile as never before into world politics. … this arena made the comings and goings of this small Latin American nation be considered as a world laboratory where diverse social models were tested by trial and error” (Prieto Larraín 2011, p. 99).
The return of democracy in 1990 did not temper this appetite for experimentation, with Chile becoming a leader in the development of policies that have tried (commonly, not very successfully) to put a “human face” on the neoliberal project (Navia 2009). The Estallido Social of 2019 could be read as a massive rejection of this kind of policy experimentation, with one of the more common slogans among protesters being: “El neoliberalismo nace y muere en Chile” (Neoliberalism was born and will die in Chile). Such a rejection, however, opened the door to further experimentation, this time with radically democratic formats of citizen participation in a failed attempt at changing the national constitution (Tschorne 2023).
As a consequence of all this experimentation, the image of Chile as a “country-laboratory” (Gaudichaud 2016) has become entrenched in thinking about national affairs, a place where radical policy experimentation is not only considered valid but also expected. As the case studies in this Element’s sections reveal, such experimentation yielded mixed results. The fact that local ideas and actors occupied central roles in their implementation did not automatically grant them higher rates of success.
Studying Policy Experiments in Chile
This Element entirely agrees with Campbell’s assertion that policies always involve a certain degree of experimentation (as noted by Dehue 2001, p. 284). However, causal experiments of the RCT type are only one manifestation of the growing relevance of experimental approaches in contemporary policy practice. Furthermore, they are not even the best way to describe field experiments in policy implementation. On the one hand, causal approaches appear ill-suited to deal with policies that emerge from open assemblages of highly heterogeneous entities, in which multiple stages and processes coexist and interact in nonlinear ways. On the other hand, deterministic approaches bear little resemblance to the wide variety of contemporary experimental practice in the Global South: exercises not primarily guided by testing the effectiveness of a particular measure, but rather interested in more open learning processes derived from the reactions of epistemic objects under novel circumstances.
Experiments in policy take many forms, ranging from the strict evaluation of an intervention’s effects under laboratory-like conditions to openly speculative exercises that explore what happens when certain conditions are altered. This Element examines a specific, prominent form of such experiments occurring in the Global South, namely, attempts to test novel policy instruments. Using policy experiments conducted in Chile over the last two decades as case studies, the Element will show that the primary aim of these experiments is not to establish the existence of causal relationships between variables. Nor are they based on random selection of subjects or comparisons with control groups. Additionally, these experiments are not intended to be merely analytical exercises. They do not seek to observe reality as it is, but to alter it through these novel instruments.
These interventions become experimental because they are guided by the hypothesis that introducing a new policy instrument will lead to changes in the issue at hand. Beyond the concrete nature of these changes, which are often initially assumed to be positive, it is the centrality of this speculation that lends these policies their experimental character. Even if there is ample evidence of the instrument’s effectiveness in other contexts, its local effectiveness remains a hypothesis to be tested.
2 Charismatic Foreigners
Most of the stories of policy experiments in Chile that we have been analyzing for the last fifteen years include a journey in their origin: an engineer who travels to do a master’s degree in the United States (Ureta 2014a), a civil servant attending a seminar in Brazil (Ureta 2016), or different groups of politicians and experts on technical visits to Colombia (Ureta 2015). In some cases, the main objective of the trip is to learn about and explore the possibility of exporting a particular policy process. In others, the encounter is almost casual, as if the foreign policy instrument had been waiting to be discovered by them. These encounters are part of a much broader process of global policy diffusion. Derived from growing international mobility, but also from developments in communication technologies, “never before have state officials had such immediate and expansive access to the policy outcomes and experiences of overseas counterparts” (Legrand 2020, p. xi).
Among the multiple experiences that accompany these trips, one is especially relevant to this Element: Travelers learn about policy instruments that have been developed and, in some cases, applied to address a specific aspect of public concern. These instruments vary widely, ranging from untested theoretical notions found in academic publications to the guiding principles of massive infrastructural systems. Beyond such diversity, two key elements unify these encounters: First, the perception of these instruments as self-contained tools that can be easily mobilized internationally. Second, the initial motivation to mobilize them is not solely derived from a careful evaluation of their potential, but also from an emotional embrace of their charisma.
In seeing instruments as movable tools, these actors are not alone. One of the key assumptions of contemporary processes of policy transfer is that “almost anything can be transferred from one political system to another” (Dolowitz & Marsh 2000, p. 12). This perception has spurred in recent years a lively market based on the export and import of policies, which McCann and Ward (2012, p. 45) ironically describe as follows:
‘Solutions-starved’ actors, often under pressure to ‘deliver’ successfully, quickly, and at low cost, ‘scan’ globally for pre-tested policy models that have been anointed as ‘best’ in one way or another, with the idea of ‘importing’ them. To learn more about these ‘off-the-shelf’ policies, they develop relationships with the places with which the policies are associated. This is done either directly through examples of ‘policy tourism’ (Ward, 2011) or via mediating expert consultants who offer knowledge in easily consumable, sellable, and moveable packages.
Although it is recognized that this process often generates numerous problems (Dussauge-Laguna 2013), there is a consensus that when the transfer is conducted with a thorough assessment of the target contexts and the necessary adaptations, a reasonable level of success can be expected.
As noted above by McCann and Ward, a key component of this perception of policies as inherently mobile is the capacity of their proponents to present them as “easily consumable, sellable, and moveable packages” (McCann & Ward 2012, p. 45). In this task, they have been significantly helped by a long tradition of conceiving policy instruments simply as “tools,” or relatively self-contained entities that can be easily “detached from their association with particular policy programs” (Sidney 2007, p. 82).
This capacity for detachment is beneficial for the transnational transfer of policies. Unlike institutions and people, which are often bound by location and national regulations, instruments-as-tools are represented as inherently mobile, stateless, and adaptable, emerging as the primary means through which the dense transnational interconnection and mobility that characterize contemporary policymaking take place. This emphasis on well-defined tools makes the whole process appear relatively straightforward. If all the due processes are followed, the ultimate expectation is that “policies that have been successful in one country will be successful in another” (Dolowitz & Marsh 2000, p. 17).
This aura of success is especially relevant when the transfer process happens between countries with variable degrees of development. Especially in the model of a “north-south” policy transfer, the publicity gained from seemingly successful policy instruments has become a key tool of transnational “soft power” strategies (Dummer Scheel et al. 2024), a way to reassert the international standing of specific projects and/or political models. Sometimes these instruments are offered in a more subtle form of international development assistance. In other cases, the process takes a more violent turn, even becoming tools of neo-colonialism, as when international bodies impose “proven” policy instruments on countries in need of assistance (Wilson 2014). Reversing this trend, the growing south–south and south–north mobilization of policy instruments with a proven track record of success has also become a leading way for developing countries to challenge traditional international inequalities (Kemmerling 2023), as illustrated in this section’s case study.
This perception of sustained success is not merely anecdotal; it forms the backbone of the initial enthusiasm among the traveling actors for the instrument. Well before any systematic assessment of the advantages and/or feasibility of mobilizing the instrument is made (if ever), policy transfer is a matter of belief. Against conventional notions that see policymaking as a process “driven by cognitive decisions” (Howlett et al. 2018, p. 267), emotions commonly occupy the most prominent positions in early evaluations of a policy. This centrality of emotional aspects is not only a characteristic of encounters with foreign policy instruments. As a growing literature on the topic has shown, emotional components always play central roles in policy processes (Borén et al. 2021; Maor & Capelos 2023). Especially during the early phases of the process, emotions such as enthusiasm are central drivers in policy transfer (Pierce 2021).
In generating such a response, the literature has assigned a key role to the figure of the policy entrepreneur, defined as “an energetic actor who engages in collaborative efforts in and around government to promote policy innovations” (Mintrom 2019, p. 307). Especially in the early stages of the process, policy entrepreneurs are distinguished by being “charismatic communicators with strong rhetorical skills” (Gunn 2017, p. 268), actively promoting a particular policy instrument to a broader audience. This figure of the (usually male and white) charismatic policy entrepreneur looms large in the analysis of emotions in the policy design process, typically assigning him a substantial role in the early acceptance of a policy design.
However, charisma is not a monopoly of human beings (Antonakis et al. 2016). Nonhuman entities, such as particular species of animals (Fordahl 2024) or concepts like “globalization” (Tsing 2000), also possess considerable charisma. Through an analysis of the famous Massachusetts Institute of Technology (MIT) “One Laptop per Child” program, Ames concludes that there is also a technological charisma that,
derives its power experientially and symbolically through the possibility or promise of action: what is important is not what the object is but what it promises to do. Thus, the material form of charismatic technology is less important than how it evokes the imagination
Adapting this notion, it could be said that specific policy instruments are also endowed with technical charisma. Intimately associated with an aura of success and sophistication, such charisma explains, to no small degree, the decision by certain actors to start mobilizing them from one country to another, kickstarting a policy experiment. In line with Ames, what is central in such a process is not the actual performance of the instrument but its promise, the shiny futures of success that would emerge in its wake.
To explore this issue, a particularly poignant case will be examined: How the charisma of Transmilenio – a BRT system launched in Bogotá, Colombia, in 2000 – was leveraged to lend some much-needed attractiveness to the faltering Transantiago in the early 2000s.
Late in 2000, a group of transportation experts delivered to the Chilean government a document entitled “Plan de Transporte Urbano para Santiago” (Plan for Urban Transport for Santiago, PTUS), marking the beginning of the policy implementation that would become Transantiago in February 2007. The PTUS proposed a radical transformation of Santiago’s public transport system. In place of the underregulated (and often chaotic) surface bus-based system, the whole urban area was to be divided into zones served exclusively by local lines and connected through trunk lines of buses and metro. In drafting the proposal, the experts’ main inspiration was the pioneering BRT system implemented in the Brazilian city of Curitiba since the 1990s, a project that they had known and admired for several years.
From the moment of its publication, the PTUS lived under constant threat. A change this radical was always going to provoke considerable opposition, especially from the powerful actors whom the new system would displace (above all, the owners of bus companies). Besides, implementing the proposed changes required significant efforts and resources at a time when the government was facing substantial fiscal constraints. Although the low quality of public transport was a constant complaint among city dwellers, multiple other issues demanded more urgent intervention from the government. Consequently, the proposal was on the verge of being scrapped several times during this early period.
Given these multiple threats, the project’s proponents were forced to defend their proposal continually, producing a near-constant stream of arguments about the need for such radical change. Techno-economic evaluations of its potential benefits were always used, but they did not suffice to bring many critics and neutrals into the project’s fold. Something more substantial was needed, an image that could embolden the proposal, providing it with an aura of (almost) undeniable future success. Here enters Transmilenio.
After its launch in 2000, Transmilenio rapidly became a star in global forums on transport reform. Although its initial scale was modest, it was soon presented as a resounding success, managing with little investment to improve not only daily commuting in Bogotá but urban life as a whole. In becoming a shining example of cost-efficient policymaking from the Global South, Transmilenio was aided by the tireless work of an extensive and efficient coalition of “persuasive practitioners” (Montero 2017), led by Bogotá’s then-mayor, Enrique Peñalosa.
Transmilenio was presented as not only highly successful but also, and centrally, as easily movable. Such discourses of movability began by setting aside the substantive processes of urban and political reform that Bogotá had implemented over the previous decade. These processes had made the policy possible in the first place (Montero 2015). Through such an omission, Transmilenio was repackaged as an ingenious and cost-effective public transportation system that could be easily implemented elsewhere, leading to various positive transformations, especially in other cities of the Global South (Vecchio 2023). From a dense network of highly idiosyncratic local transformations, the policy was turned into a self-contained package that could be moved anywhere (Valderrama 2009). Given its charisma and movable character, cities dealing with public transport issues found it hard to resist Transmilenio’s charms. Santiago was no exception to this trend.
Transmilenio had not played a significant role in the early development of the PTUS. As one of its leading designers answered vehemently when asked, “It is false [that Transmilenio was an influence]! We did not know about Transmilenio when we were preparing this proposal” (Garrido 2015, p. 26). However, in contrast to the relative obscurity of Curitiba’s BRT (Montero 2015, pp. 88–90), during the critical period when any support for the PTUS was welcomed, Transmilenio had become a highly charismatic example of successful public transportation reform.
Notably, the canonization of Transmilenio occurred not only at the international level but also in Chile. First, the UN’s Economic Commission for Latin America and the Caribbean (ECLAC), whose Santiago headquarters have historically given it prominence in local policy, was an early promoter of the reform through publications (Chamorro 2002) and events. In parallel, a constant stream of Chilean experts and policymakers conducted “study tours” to Bogotá to gain firsthand knowledge of the system (Montero 2015). Reflecting this situation, in February 2002, an MTT authority, along with two prominent economists, published a highly influential paper that claimed that “after one year of operation, it can be affirmed that the Transmilenio system has been a success” (Diaz et al. 2002, p. 15). Such a success reaffirmed for the authors the “political feasibility of [carrying out] a radical reform of [Santiago’s] transport system” (p. 1).
The highlight of this process of borrowing charisma was Peñalosa’s visit to Santiago in August 2003. Organized by a local NGO with funding from the World Bank, it consisted of a whole week of public activities, including a book launch, cycle rides, and multiple encounters with local stakeholders, the media, and the general public (Sagaris 2019, p. 194). Its culmination was a widely attended talk by Peñalosa at ECLAC’s main auditorium. On that occasion, he offered a stirring vision of future “equality, happiness, and competitiveness” for cities willing to follow Bogotá’s path. Aptly combining cherished mottos from the left (equality) and the right (competitiveness) with the general aim of seeking happiness, Peñalosa offered a vision of Transmilenio as an almost magical solution to many urban ills.
The local effects of such efforts were rapidly evident. Even before being officially christened, the PTUS was nicknamed in the local press as “Plan Bogotá” (Garrido 2015). This connection with Transmilenio and its valuable charisma was again expressed when it was time to select a proper brand for the PTUS. Initially, the actors in charge of the implementation process had selected a brand (Welén, the indigenous name for a hill located in the center of the city) that had no connection to Transmilenio, explicitly aiming to carve out a path of its own for the Chilean system. Other members of the government strongly resisted such a move. The upper echelons of the MTT, in particular, wanted a brand that made the connection between both systems explicit. As one of these actors explained, “Transmilenio in Bogotá has been sold quite well and has positioned Bogotá as an example of transport innovation. Well, then, instead of giving [our system] a fancy name, let us associate it with the city because this was a much bigger project” (Ureta 2015, p. 62). This contrasting approach sparked a massive controversy that was only settled with the resignation of the entire team in charge of the PTUS’s implementation. One of the first measures taken by the new team appointed by the authorities was to rebrand the system as Transantiago, explicitly highlighting its connection with its prestigious Colombian counterpart.
As seen in the above case, perceptions regarding the inherent mobility of these instruments and their technical charisma play central roles in the initial phases of these experiments. On the one hand, from the very beginning, these instruments were widely perceived as easily movable, as detached entities that can travel easily to different places. Echoing long-held technological determinist positions (Mumford 1934), policy instruments were viewed as devices whose effects would be replicated regardless of where they were introduced. In doing so, all local aspects that made a policy successful in the first place – social, political, cultural, environmental, and even physical – were either overlooked or actively dismissed. Regardless of its location, the instrument would produce the same effects.
On the other hand, the emotions generated by the charismatic capacities of these novel instruments also played a central role. Soon after these actors learned about them, the instruments became objects of interest, even fascination. Derived from a series of positive attributes associated with these instruments – effectiveness, sustainability, democracy, and modernity – the strength of this fascination hindered most systematic in-situ assessments of the instrument, especially regarding its limitations.
The combination of detachment and emotion ultimately produces powerful “sociotechnical imaginaries” (Jasanoff & Kim 2013) surrounding these instruments. These imaginaries refer to the set of ideas and projections associated with the instrument, particularly regarding its potential usability in the country of origin of the traveler. These imaginaries constitute the first outline of what is expected from the future experiment, its founding hypotheses. At this stage, they tend to be quite diffuse, summing up a disparate number of ideas and affects. Nonetheless, their development is central, as these imaginaries rapidly become the background against which the instrument will be introduced to new audiences back home. The development of these imaginaries, in addition, gives policy processes a speculative component, which helps them outline possible alternative futures and ways to materialize them.
3 (Re)Problematizing an Issue
A powerful sociotechnical imaginary, however, is never enough. No matter how evocative the ideas or strong the emotional attachment that a new and exciting policy instrument generates, several other things are needed to turn them into the focus of an experiment. Policy experiments tend to be demanding affairs, involving a wide array of human and nonhuman elements that must be mobilized and reorganized in comprehensive ways. Especially in a social context in which a certain semblance of rationality is compulsory, emotional attachment to an instrument must be (essentially) replaced by different forms of argumentation stating the need for the experiment in convincing ways. Such a need is primarily enacted by establishing a firm connection between the novel instrument and an issue of public concern.
Issues could be simply defined as quite “open, undefined situations that put into question existing systems of control and government” (Ureta 2015, p. 14). Issues can be of any type and magnitude, ranging from massive and unexpected disasters, such as earthquakes, to long-standing negative perceptions, such as the poor quality of public schools. Our worlds are filled with multiple kinds of issues. However, as noted long ago by Dewey (1927), no matter their scale, issues never exist in isolation. Nothing is an issue by its sheer existence, but becomes an issue when a public recognizes it as such. Each issue has its public, the people who evaluate a particular situation or entity as erroneous, mistaken, or perverse. Some publics are quite limited, comprising only a handful of experts or concerned citizens who are focused on a specific topic or situation. Others are massive, spanning the whole world. No matter their scale, no issue exists without its public.
A key task for the proponents of an experiment is to connect the instrument they wish to test with an issue and, hence, the public behind it. This connection is established through a process known as problematization. As explored in previous publications (Ossandón & Ureta 2019; Ureta 2014b), problematizations can be defined simply as the process through which an issue is represented as a matter demanding public action. Concerned groups are continually problematizing their issues of concern. It could be said that their main reason for existence is to carry out such problematizations. Therefore, our world is full of problematizations. However, only a handful of them manage to prompt public intervention in the form of a policy process.
According to conventional models (Peters 2005), problematizations that lead to a public policy are those that successfully manage to enlarge the public surrounding an issue to a degree that its resolution becomes a priority. Following the well-known scheme of Weiss (1989), concerned groups first agree that the time is ripe for their issue to be presented to a broader public, especially the authorities, arguing about its seriousness and demanding public intervention. Then, they develop an agenda-setting campaign, focusing on gaining increasing space in the public sphere for presenting their issue (especially the media and, nowadays, social networks). This process of public pressure, when sustained and effective, finally forces authorities to intervene, formally starting the policy process. This is a fairly linear model, according to which “problems precede solutions and adaptation processes address issues such as the elimination or refinement of options based on criteria such as efficiency, effectiveness, legitimacy or viability” (Béland & Howlett 2016, p. 394).
Problematizations focused on carrying out experiments with novel policy instruments do not follow this path. Typically, the issues at the center of their problematizations are well-known, have remained relatively unchanged in recent times, and/or have already been problematized on multiple occasions. In some cases, earlier problematizations of the issue have garnered sufficient support to become matters of policy. However, the effectiveness of such an intervention has been limited, as the issue remains unresolved for certain members of the public. The motivation for a new problematization, therefore, is not primarily related to a sudden change in the issue. The difference this time is due to the emergence of the instrument within the realm of possibilities for public action.
A novel kind of public introduces this instrument. Rather than being focused on an issue, as publics traditionally are, this public is held together mainly by a common interest in the instrument and its potential. Hence, this “instrument coalition” (Simons & Voß 2018) emerges with the primary objective of transforming a foreign instrument into a tool for local public action. The emergence of this coalition and its interest in testing the instrument shows us how “deliberations on policy tools, their composition and requirements, can and do proceed in the absence of any specific problem and often are only linked up to specific problems later” (Béland & Howlett 2016, p. 404). The initial interest of the actors involved in this coalition focuses on the instrument itself, as well as the exploration, often motivated by a combination of intellectual curiosity and emotional attachment, of its potential benefits.
Instrument coalitions can take many forms. In some cases, they are little more than a loose group of individuals with little connection among themselves beyond their shared interest in the instrument. In others, they can form a tight-knit community, bonded by long-lasting ties of collaboration and affinity. In contemporary policymaking, a particularly relevant shape that coalitions take is the consultancy group.
Since the rise of the new public management movement in the late 1970s – and its demands for efficiency and the reduction of the state apparatus – consulting has become a privileged space for the articulation of public action, to the extent that it is said that our society is a consultocracy (Gunter et al. 2015). Literature on the topic has identified many problems related to this prominence, such as “the monopolization and privatization of knowledge, and the consequent dependencies between public contractors and private service providers, the erosion of tacit knowledge within government agencies, the weakening of accountability and the strengthening of instrumental rationality” (Ylönen & Kuusela 2019, pp. 242–243). Several of these problems stem from the typical positioning of consultants as actors who have broad powers to propose and/or modify policies, but who, unlike public officials, rarely face the scrutiny of their peers or supervisory institutions regarding the consequences of their actions.
In policy experiments, the figure of the consultant has become a prominent way through which coalition members can push for testing an instrument. Supported by their academic prestige and/or previous experience with the instrument at international levels, coalition members who assume the role of consultant can obtain significant power and influence in the process without facing much scrutiny or external evaluation (a point to which we will return in the conclusions).
To kickstart a test exploring the hypothetical benefits of these instruments, the problematization proposed by this coalition cannot be based solely on a theoretical discussion about the benefits of the instrument. Not even an argument about its successful implementation somewhere else will suffice. No one is going to invest the time, effort, and costs that tests usually entail based solely on concepts or foreign success stories, at least not on the scale necessary to generate substantive results. For this reason, coalition members need to generate a powerful new problematization of the local issue, one in which the instrument occupies a central position in its resolution.
The selection of the issue to problematize (or, more usually, re-problematize) can follow various trajectories. In some cases, it corresponds to an issue with which coalition members have already been involved in the past, but without much success. In other cases, the issue may be entirely new to coalition members, having been chosen solely for pragmatic reasons. Commonly, there tends to be a diffuse and latent discomfort with the issue and its solutions up to that point, especially at the instrumental level. In any case, it is relatively clear that one of the central characteristics of these experiments is a situation in which “the problem follows the solution” (Béland & Howlett 2016), rather than the other way around. At first, there is the instrument, its characteristics, and its many promises (as seen in section 2). Only afterwards, when a test based on the instrument starts to be considered, is the issue mobilized.
Typically taking the form of a document, ranging from a research output to a manifesto, the new problematization tends to include a relatively stable set of elements. First, it introduces a group of human beings (although they can also be nonhuman, such as ecosystems or species) who are significantly affected by the issue, usually adopting the model of a “crisis.” As Koselleck (2006) explained, the “crisis” model is especially effective in problematizations, given that it presents this adverse situation as openly intolerable, thereby demanding urgent action. The existence of this crisis is argued in various ways, with quantitative data occupying a special place given their high degrees of credibility in our societies (Espeland & Stevens 1998). Later on, this population in crisis will work as an ad-hoc control group (Dehue 2005), a largely imaginary group against which the effectiveness of the novel instrument will be contrasted.
Using the “crisis” model is a double-edged sword, though. On the one hand, turning an issue into a crisis is extremely useful for highlighting the urgency of intervening in novel ways. On the other hand, a rhetorical crisis can easily become a real crisis. As we saw in the case of Transantiago, it is not uncommon for an issue that was half-forgotten or not really a priority to suddenly become urgent due to the unexpected effects of a policy. This outcome can not only have harmful effects on those responsible for the policy but can also end up worsening the situation of those affected by the issue. It is always advisable to problematize responsibly (although it is rarely done).
Second, it is subsequently argued that the solutions applied to address this issue up to that point, especially their policy instruments, have not been effective for various reasons, often presenting technical arguments to support this claim. In doing so, criticism is also leveled against the institutions and, even, the individuals behind these failed policies, to clear the ground for the novel technocracy that will take charge of the experiment.
Third, a new action program is presented, incorporating new elements and processes. The novel instrument features prominently in such a proposal, being endowed with many responsibilities in resolving the issue. The introduction of the instrument to a new audience is not straightforward. As explored in the previous section, in most cases, the instrument has only been tested (if at all) in a foreign country, usually in circumstances quite dissimilar to the ones found locally. Despite all the usual arguments about its inherently mobile nature, in practice, the transnational movement of instruments entails no small challenges.
As the literature on how concepts and technologies travel between places and times has shown, devices commonly “can only be considered ‘the same thing’ as long as they remain in the same ‘place,’ that is, as long as they are in a place where they are made to refer to each other” (de Laet 2000, p. 163). Moving instruments from one place to another is never a simple process of diffusion. It always involves “a heterogeneous array of problems that range across equipment, work practices, features of the landscape, and local ecology that had to be negotiated or managed to make it work” (Shepherd & Gibbs 2006, p. 675). The adoption of foreign instruments always involves a certain degree of transmutation, a process in which the traveling instrument becomes slightly (or manifestly) different.
This process of transformation is made more complex by the fact that the instrument is not only introduced as a technical tool, but also as a cultural one. To attract the broadest possible support, the instrument is also accompanied by a series of heterogeneous attributes extracted from the sociotechnical imaginaries that coalition members have been developing around it. Problematizations not only aim to reorganize a wide range of relationships between components of an issue. Through this reorganization, they also promise to address the issue more effectively, alleviate the discomfort it causes to the affected public, and, ideally, resolve the crisis.
To explore this issue further, we will analyze a problematization process involving another foreign policy instrument tested in Chile: a market-based instrument introduced in the early 1990s to address air pollution in the city of Santiago.
A result of the city’s unregulated growth and industrialization, air pollution in Santiago was for a long time the ultimate environmental issue in Chile (Riveros 1997). Since the 1950s, such prominence has motivated multiple problematizations, including different kinds of environmental policy instruments. Most of these instruments aligned with the command-and-control model predominant in environmental regulation at the time, in which the authority set fixed emission limits and expected (or forced) all the polluting sources to comply. By the 1970s, these regulatory attempts had yielded no substantive results in alleviating the issue, primarily because the State lacked the technical capacity to enforce these limits.
This situation began to change in the late 1970s when the Office of National Planning (ODEPLAN) became involved in the issue. At the time, ODEPLAN was the main hideout of the “Chicago Boys,” serving as their institutional headquarters for implementing their aim of radical deregulation and privatization. Regarding environmental issues, ODEPLAN actors argued for the need to “frame this problem inside a realist vision of national development, avoiding copying extreme positions of environmentalism and anti-pollution characteristic of developed countries” (ODEPLAN 1979, p. 1). Such a realist framing was, in practice, a challenge to the environmental governance model that had existed in Chile up to that point, which was based on the prominence of health and engineering expertise and primarily expressed in command-and-control regulations. In contrast, ODEPLAN actors looked to “economize” (Çalışkan & Callon 2009) air pollution and its regulation.
After a first attempt to introduce pollution taxes failed, they had a lucky break. In 1983, a young ODEPLAN engineer who had been in the US pursuing a Master’s degree in Environmental Management returned with several “wonderful books” – in the words of a future coalition member – on up-to-date economic theory in environmental management (especially Baumol & Oates 1975). Mentioned in the books was one instrument that rapidly caught the imagination of several young technicians associated with ODEPLAN: Emissions Trading Schemes (ETS). Although relatively well-known nowadays, at the time, ETS had only a few empirical applications.² In contrast with extensively tested command-and-control instruments and taxes, it was still mostly “viewed as an intriguing, but somewhat eccentric and uninspiring alternative” (Meidinger 1985, p. 455, quoted by Simons & Voß 2011, p. 5).
This openly experimental character, however, was not a problem in Chile at this time. Under the guidance of ODEPLAN, most public offices had embarked on implementing different kinds of highly experimental policy instruments, with the only major precondition being that they must be market-based (Foxley 1983). Given this support, a stable instrument coalition formed by young engineers and economists emerged with the sole focus of testing an ETS for environmental governance in Chile. Given its prominence in public discussion, it seemed logical to start with air pollution in Santiago.
Their starting point was to re-problematize the very definition of air pollution and its sources, as could be seen in a presentation that Ricardo Plaza, one of the leaders of the coalition, made in 1985 in a seminar about air pollution in the city:
The environment has a capacity for self-purification that has been exceeded, and therefore, it has become a scarce resource over which various activities compete and dispute rights. In summary, the environment’s assimilation capacity is limited, and every time we exceed it, we produce pollution. The question is, then, who should use this capacity and to what ends? … pollution is a subproduct of activities that provide benefits to society and a decrease in such subproducts typically carries costs that imply that society can consume other goods less, therefore society decides to determine how much it wants each good and what level of environmental quality it needs. … Currently, an analysis of what is called emission rights is starting to be made to decide how to divide the assimilation capacity of the atmosphere.
Instead of the usual approach of attributing pollution to the emissions of individual sources, Plaza enacts it as a general problem derived from the limited capacity of Santiago’s atmosphere for self-purification. Therefore, the main issue is not how to control the externalities of each source but how to assign correct rights over units of the atmosphere, balancing a certain level of environmental protection with the demands for the production and consumption of goods. In this task, the implementation of a textbook market of tradable emission rights is presented as a solution that is under “analysis.”
However, it was soon evident that if coalition members wanted to implement such an instrument, they would need much more than this conceptual development. What was urgently needed was to establish an alternative “metrological regime” (Barry 2005) in which both pollution control science and neoliberal environmental economics could be combined, allowing them to quantify industrial air pollution in Santiago as controllable through an ETS.
Such new problematization was constituted through a series of research projects commissioned by the Intendencia de la Region Metropolitana (Santiago’s local government), projects in which several members of the coalition participated as consultants. The first two studies had the overall objective of setting up an air pollution control and management system for Santiago. Although such a system, typically associated with command-and-control policies, may appear at odds with a market-based instrument like an ETS, in practice, it proved crucial for its development for three main reasons. First, it enabled the translation of emissions from several different industrial sources into a standard metric. Second, it presented the city’s atmosphere as a delimited geographical space in which all individual emissions, regardless of their point of origin, would be summed up in a commensurable whole. Third, it provided a model in which emissions of different kinds could be compared and weighted with the accomplishment of air quality standards set by authorities.
With the data from these studies, a third research project was conducted by a consortium comprising several coalition members, with the explicit aim of developing a highly sophisticated understanding of air pollution in the city, for which an ETS emerged as its almost natural solution. Using all the available information, coalition members designed a scheme in which an emissions quota was assigned to each significant industrial source, paired with an operational license. This operational license was only temporary for sources that were already polluting above their quota. In the case of industries that were polluting below their quotas, the authority issued certificates of emission reduction, over which they had property rights and which could be traded.
When such attribution of rights was made into law in 1991, it created a market in which the owners of certificates could trade them with two types of buyers: Existing industries that needed to compensate for their emissions above their assigned quotas and new industries that wanted to start operating in the urban area of Santiago. In all, this consultancy project enabled coalition members to radically re-problematize the issue, shifting from a public health problem that needed to be eliminated to an economic externality that could be controlled by merely assigning property rights and letting the market do its work.
Problematizations are complex affairs. Not only do they involve a highly heterogeneous array of actors; they also require significant effort, comprising multiple interconnected stages. In the process, the actors involved, as well as the issue itself, change. Setbacks are regular, forcing coalition members to stretch their resources in their quest to gain enough support for the experiment.
No matter how sophisticated the problematization, its transformation into an actual policy experiment is never automatic. Once it becomes public, a lengthy process of negotiation and adjustment begins. First, the new problematization must be imposed on other problematizations of the same issue. This process can be tortuous, given that it usually involves numerous changes in actors and paradigms. On its way to becoming the central component of a proper test, the instrument acquires multiple new elements and relationships, especially with components of previous problematizations. The fundamental thing is that no matter how many new elements are incorporated or extracted in further rounds of problematization, the new instrument must continue to occupy a central place in the future test and the effects it hopes to generate.
Rather than a momentous occasion, the final approval of these problematizations tends to be an underwhelming affair. After considerable effort has been invested in packaging the instrument of choice as the ultimate solution to a pressing issue, in many cases, there is not even a single moment of approval. Instead of a grand public event signaling its definitive acceptance, there is commonly a series of smaller agreements that are chained together, leading from a problematization to the green light for testing the instrument. Some of these instances occur in public – at fairs and public conferences – while many others happen in private; in some cases, no one is even aware of them happening at all (as is becoming more common with highly automated policy processes). This entire process is surrounded by high degrees of uncertainty and involves a significant degree of serendipity. At some point, all necessary agreements have been made (or the opposition has been subdued), and the experiment can proceed to a new phase: Conducting an actual test of the instrument.
4 Testing Mesocosms
Tests are curious affairs. Although they seem, in principle, somewhat technical and dull, in practice, they reveal a much more vivid trajectory. In STS, tests are usually understood as “a set of activities … carried out in a circumscribed environment that is designed to produce an outcome that gives us information as to the operation of … [a] technology” (Pinch 1993, p. 28). However, there is usually much more at stake in a test than probing the virtues or problems of a particular device. As an overview of the theme by Marres and Stark (2020, p. 425) recognizes, “tests should be studied not based on what they resolve but by what they generate. Tests are generative, they stimulate further testing, … involving diverse modalities (…) of knowing, valuing, and acting.” Hence, tests are usually open and creative spaces in which some capabilities (but not all of them) of the device are explored. This process also involves multiple reconfigurations of other elements.
Centrally, testing involves the emergence of certain “users” (Oudshoorn & Pinch 2003) or human beings who become embedded into different sets of relationships with novel devices. Users are initially a projection, a more or less formalized set of ideas about future human users, ideas that significantly influence the overall design of the test and its devices. In a classic piece, Akrich (1992, p. 208) called these ideas scripts, seeing them as something akin to guidelines that “define actors with specific tastes, competences, motives, aspirations, political prejudices and the rest … [defining] a framework of action together with the actors and the spaces in which they are supposed to act.” As explored in a previous publication (Ureta 2015), scripts are a key device of contemporary power, establishing the frames and affordances for social action in multiple fields. Those who can set scripts – from envisioning users of new AI chatbots to beneficiaries of a new pension scheme – can usually substantially influence social processes.
Scripts, however, are not monolithic. First, there is no way a script, no matter how sophisticated, can account for the complexity of real human beings and social processes; hence, scripts are always partial and in a constant state of redefinition. In parallel, the concrete human beings invited (or forced) to embody such scripts when using the device or system rarely comply dutifully. More commonly, they try to establish their own ways of dealing with the device, challenging their assigned scripts in various ways.
This is why tests are so important, as they are the instances in which human beings face scripts of devices or processes for the first time. Most of the time, this encounter is not peaceful but gives rise to all kinds of friction and conflict. Scripts, in particular, are challenged in various ways, sometimes resulting in substantive redefinitions of the instruments or processes under analysis. Tests submit the innovation to a first barrage of reality; sometimes, they even make it collapse altogether.
In the policy field, testing is never only about a novel policy instrument, or even about its future beneficiaries. Given that the ultimate goal is to influence social affairs, testing is also about creating novel configurations of the social, in which specific issues, human beings, and a multitude of other entities are rearranged to explore new forms of coexistence. Owing to the increasingly experimental nature of contemporary societies, policy tests commonly “are not just in society but are tests of society” (Marres & Stark 2020, p. 425). For this reason, policy tests can rarely be conducted in conventional experimental spaces such as laboratories.
In the natural sciences, laboratories have an unmatched centrality as spaces for running experiments. Through the establishment of a “set of differences between the experimental space and the world” (Millo & Lezaun 2006, p. 181), laboratories allow the entities under study to be singularized, establishing clear demarcations with other entities and phenomena. Their reactions to hypothesis-based alterations can therefore be observed, systematized, and analyzed, producing knowledge deemed objective.
In policy, this secluded nature of laboratories is problematic. Their distance from the spaces and publics that are affected by the issue at hand opens the door to constant “disputes about the validity or applicability of the data generated” (Millo & Lezaun 2006, p. 181). No matter how much laboratories try to replicate reality, they are not reality itself. There is always a degree of artificiality in their composition, which directly weakens the public validity of their results. The aseptic nature of a laboratory appears as a barrier to achieving one of the central objectives of policy experiments: Not only producing results in the form of inscriptions but also generating emotional attachments to the instrument in a wide range of (still) skeptical actors (as we will explore in Section 5).
To reduce this risk, policy experiments aim to avoid entirely secluded laboratories and opt for more open experimental settings, spaces resembling what ecologist Eugene Odum called mesocosms. Mesocosms are “middle-sized worlds falling between laboratory microcosms and the large, complex, real world macrocosms” (Odum 1984, p. 558). In these spaces, experimental entities encounter a significant number of real variables, but in a controlled manner. Besides facing individual variables, an advantage of mesocosms is that they allow for the “simultaneous study of parts and wholes inherent in the mesocosm level of organization” (p. 561). Both individual and systemic effects can be accounted for in mesocosms, adding up to a far more complex environment than laboratory settings.
In field sciences such as biology and climatology, “mesocosm experimental infrastructures allow scientists to learn from ecosystem dynamics through immersion within the realities and frictions of a specific site” (Jacobs & Utting 2024, p. 239). The great advantage of mesocosms is that “they are located in the field, but they are not completely from the field; … [They are] designed to absorb certain features of their environment without being overwhelmed by them” (Kelly & Lezaun 2017, p. 370). Mesocosms appear as spaces in which a balance can be made between laboratory conditions (which allow the systematic observation of phenomena) and field conditions (which allow the production of transformations in the issue).
Mesocosms have become privileged spaces for policy experiments because they “promise a better understanding of the processes occurring in the … real world, being in themselves at the same time an artifact, … an artificially designed object, an analog model … ” (Schwarz 2015, p. 107). However, in their manufactured character also reside some dangers, as they have been criticized for their “technocratic optimism” (Taylor 1988), or their implicit belief that all relevant factors of an environment can be accounted for in controlled ways, as was mentioned in Section 1 regarding the non-fields RCTs strive for (Favereau 2021). In practice, control in mesocosms is always limited. No matter how much care is taken to control every aspect, the entities populating mesocosms always have enough degrees of freedom to cause surprises, including the total collapse of the experiment. In this hybrid character, their existence between the laboratory and reality, lies much of the power of mesocosms, as well as their frailties, opening up fruitful paths to rethink policy practice.
In the natural sciences, the preferred form that mesocosms take is that of experimental spaces located exactly where the phenomenon to be studied occurs, such as experimental stations in biology (Kohler 2002). In the case of policy experiments, mesocosms take on multiple forms, ranging from tests conducted in the exact locations where the problems being studied occur to the creation of controlled-access spaces that nonetheless bear important similarities to the reality being analyzed.
Typically, an experiment progresses through various mesocosms during its development, each comprising distinct sets of experimental entities and their relationships, resulting in different types of outcomes. These mesocosms acquire increasing scales and degrees of complexity, including a growing number of entities affected by the issue. Especially in cases where issues affect large populations, mesocosms can result in “in-vivo” experiments (Muniesa & Callon 2007), which are conducted directly in the affected population’s daily living spaces.
Mesocosms are never just a physical space, as the conventional image of a laboratory could lead us to believe. More than a rigid spatiality, mesocosms should be seen as a simplified sociotechnical model – a circumscribed number of relationships between multiple entities, human and nonhuman, that represent a far more complex network existing beyond them. The creation of this simplified version comprises two interrelated processes: The purification of experimental entities and the establishment of novel frames of relationship.
As laboratory studies in STS have explored (Doing 2008), experimental entities (ranging from humans to artifacts) are never simply mobilized from elsewhere and introduced unchanged into experimental spaces. Most experimental spaces cannot process, let alone accommodate, entities in their regular existence; they are too complex, too large, and too multifaceted. Many of them are so deeply entangled with other factors in their respective contexts that individual analyses are often impossible, without even considering the issue of scale. Therefore, “before being subjected to laboratory manipulations, the materials to be used […] are prepared for that use; Substances are purified and objects standardized, even reinforced” (Knorr Cetina 1983, p. 160). The implementation of an experiment always implies “an operation of transformation and reduction” (Muniesa & Callon 2007, p. 170). This purification could mean that the objects that end up participating in an experiment resemble their original versions in little more than name.
The policy instrument at the center of an experiment cannot escape this purification process. More than a mere “transfer” from its location abroad, as the literature on the topic suggests, its mobilization to a new mesocosm implies significant reconfigurations. Materials, actors, and organizations come and go, producing transformations in agency and power. In practice, instruments “rarely travel as complete ‘packages’; they move in pieces, as selective discourses, incipient ideas, and synthesized models, and therefore ‘arrive’ not as replicas but as policies that are already in transformation” (Peck & Theodore 2010, p. 170). The purified versions of these instruments typically include elements and relationships that were not even considered in their original versions.
The inescapable degree of endemism of any policy instrument, however, does not imply that there are no elements that travel, either physically or virtually. Quite the contrary, many things travel – from “promoters” of specific instruments (Voß & Simons 2014) to material artifacts – but in doing so are reconfigured in fundamental ways. Through the establishment of new relationships and identities, traveling elements cause changes in processes and agents in the areas and territories that receive them. From this perspective, the mobilization of instruments “has to be understood as relational and territorial, simultaneously moving and fixed” (Cochrane & Ward 2012, p. 6).
Upon arrival in the mesocosms, these entities are substantially reconfigured through their introduction into a series of novel “framings” (Callon 1998) that seek to redefine how they can behave and/or relate to one another. Initially guided by theoretical insights and/or previous experiments (whether local or international), these novel frames of relationship aim to test the hypotheses or ideas that motivate the experiment. In scientific practice, the reactions of experimental objects under these novel framings are systematically recorded, constituting the data from which the validity of the hypotheses will be judged.
In policy experiments, the new instrument occupies a central role in this framing. We could say that instruments become framings. More than material entities, policy instruments can be seen as conceptual heuristics, sets of rules that propose novel ways to connect things. These frames can take various forms, from the establishment of new standards of qualification and commensuration to the introduction of barriers and physical connections between the entities participating in the experiment. In particular, these frames seek to introduce what Callon (1984) has called “compulsory passage points” in the structure of relationships, or spaces in which “actors are forced to converge around the dominant frame and then engage in specific negotiations in the context of that frame” (Rydin 2013, p. 26). This series of dispositions aims to substantively transform the objects included in the experiment, leading to the emergence of new identities and relational nodes. From this reconfiguration, it is expected that a new architecture of relationships will emerge, fundamentally redefining the issue.
To explore the complexities entailed in this double process of purifying and reframing, we will present a case study based on a policy experiment conducted in 2003, which centered on testing a novel procedure for public participation in science policy.
On the afternoon of Friday, October 3, 2003, a group of sixteen people met in a public building in downtown Santiago. They had never been in the same room before; indeed, they had strikingly different backgrounds. There was a trendy young designer from the port city of Valparaíso, a homemaker from an upscale area of Santiago, and a peasant from the rural area of Doñihue. They had almost nothing in common but one thing: They had all been selected to participate in an event entitled “El Manejo de mi Ficha Clínica” (The Management of my Health Record). Upon arrival, they were greeted by two people from the organizing committee and then swiftly put into a minivan, heading in the direction of an undisclosed location on the outskirts of the city. This trip was the final stage of a purification process that had begun several months prior.
Reacting to a global enthusiasm for novel methodologies for citizen participation in technoscience, in 2002 the Pan-American Health Organization (PAHO) agreed to fund the first experimental implementation of an instrument known as the Citizen Consensus Conference (CCC) in Latin America. After other countries desisted, a coalition of foreign and local experts managed to interest several Chilean public offices, especially the Ministry of Health (MINSAL), in running such a test. In the formal agreement, the CCC was understood as “a pilot experience for Chile and the region” whose aim was to “test the methodology and transfer it to other organisms and sectors interested in using it to evaluate technologies with social impact.”
Developed by the Danish Board of Technology (DBT) in the late 1980s, the CCC aimed to enhance citizen participation in policymaking through direct and informed discussions with experts on specific issues (Grundahl 1995). The guidelines for the event prepared by experts from the DBT (OPS 2002) affirm that running a successful CCC depends on the enactment of three entities: a relevant technoscientific issue that is a matter of public debate at the time, a group of experts who can provide different insights on it, and a group of citizens who are willing to discuss it with the aim of reaching a consensual position. As in most experiments, these entities could not enter the experimental space immediately but needed to be substantially purified beforehand.
Regarding the technoscientific issue, in early 2003, the organizing committee, formed by employees from MINSAL and other Chilean public offices, with technical assistance from PAHO and the DBT, met several times to explore possible topics. Several issues that were a matter of public debate at the time in Chile were considered, such as transgenic food, air pollution, and emergency contraceptives. After several rounds of discussion, they selected the modernization of the Patients’ Health Records (PHR) as the project. As affirmed by the organizing committee’s proceedings, this topic was selected because it was quite concrete, required definitions from the authority, and had an important technoscientific component. Centrally, there was a need to know the public’s attitudes about it, given that it was one of the topics being considered in ongoing parliamentary discussions about a patients’ rights law. In all, the PHR appeared quite close to the “ideal” issue identified by the DBT guidelines (p. 17).
There was a key aspect in which the PHR was not so close to the guidelines’ ideal issue: There was no social controversy surrounding it. This divergence was not considered problematic, as recalled by Project Manager Javiera Lozano.
The truth is that it [the PHR] was never too conflictive, I mean, there were tensions, but it was not the same as with, to say something, abortion or euthanasia. When we discussed the issue, the thing that we tried to do was to find an issue that was of relevance for the citizenry but that did not cause too much strain, because what interested us was to test the method, to see if it could work [in Chile], do you understand? Then, if we started talking about abortion, the group will end up exploding in the air, along with the method, the PHR was, between brackets, more aseptic [laughs] from a health perspective.
The PHR was selected above all as a means to test the device’s functionality, rather than for its relevance as part of an ongoing sociotechnical controversy, which is quite unusual in CCC implementations in countries other than Denmark (Bogner 2012; Goven 2003; Seifert 2006). For the organizing committee, the most important thing was not the existence of public controversy regarding the PHR but the fact that the issue was “at hand” (Seifert 2006) and carried an important degree of “asepticism” that appeared to increase the experiment’s chances of being successfully run.
Once this issue was approved, the next task was to select the group of experts who would provide technical inputs for the citizens’ deliberations. The organizing committee proposed “a list of experts with a multidisciplinary focus” covering five key areas of the PHR: legal, ethical, managerial, medical, and computing. Following the advice from MINSAL experts, several people were invited to participate. The group was finally composed of four doctors covering differing medical and managerial aspects of the PHR, a lawyer from the Doctors’ Guild, and an IT expert. In all, they formed a group of “scientific” experts, using the definition from the guidelines (OPS 2002, p. 23), or people with technical expertise in the issue at hand. The guidelines also recognized the relevance of considering a second kind of expertise in the form of “opinion-setting people” or people who are “representatives of stakeholder organizations, prominent in the arts, etc.” (OPS 2002, p. 23). Given the historical prominence of science-based forms of expertise in Chilean public action (Silva 2008), anything that diverges from this pattern is not considered expertise at all. Hence, this second kind of expertise was entirely ignored by the committee.
The experts were asked to submit input stating their positions about the PHR to be sent to the citizen panel in advance of the first meeting. The most remarkable aspect of this document is the similarity between the experts’ positions on the issue. All agreed about the need to reform the existing PHR due to its multiple problems and that the new PHR should be unified, standardized, and have national coverage. A group of “scientific” experts, naturally, produced a highly purified form of technical input.
Regarding the selection of the citizen panel, the committee initially struggled to apply the guidelines. First, due to financial constraints, only people from three central regions of the country were invited to participate, against the recommendation of national coverage. Second, the guidelines established that all citizens should have the opportunity to participate, especially those who had no special connection with the issue and did not usually participate in these kinds of events. The strategy initially pursued by the committee was at odds with this mandate, as it distributed the call only to individuals with a particular connection to the health system. When the DBT representative challenged this option, they agreed to also distribute the call through other means (especially social workers at the borough level), finally receiving applications from almost 500 people.
The sixteen members of the citizen panel were selected based on their place of residence (used as a proxy for their socioeconomic status), occupation, and demographic variables, including gender and age. The group finally selected gave the committee the impression of successfully representing the true “diversity” of the population of the three regions considered. Once these individuals were contacted and agreed to participate, the CCC experiment could finally be enacted.
The choice of place for the first meeting of the experiment was not accidental. The San José Retreat House, located in the small town of Malloco (35 km southeast of Santiago’s downtown area), was built in the 1950s to serve as a seminary for future priests of the Order of Piarist Fathers. Open to the community since the 1990s, it is currently promoted as “a place to share, to reflect, to meet, to pray,” according to its website. This capacity is enhanced by its location in a quiet rural environment, offering extensive lawns and stunning mountain views to visitors. The members of the citizen panel were quickly drawn into this environment. As later recalled by Juanita Lara, one of its members, “the place was very cozy, quiet, pleasant, huge, and the gardens are very well kept.”
From their first arrival, the retreat house offered new frames for the participants to locate themselves during the conference. The “cozy” and beautiful environment aimed to help them leave aside the feelings of nervousness and anxiety that accompanied their first encounter. Such an effect was not accidental. The location of the meeting at the retreat house aimed physically and mentally at “disentangling actors from the attachments of their everyday, material lives in order to produce a purified, stand-alone public” (Marres & Lezaun 2011, p. 12). This relocation marked their first entrance into the CCC mesocosm, from which they were expected to emerge as deliberative citizens, dutifully embarked on the task of providing substantial advice about possible futures for the PHR.
Besides the professional facilitator and the project manager, a representative from DBT was also present at the meeting, with the explicit role of ensuring the “correct” application of the guidelines and blocking any unexpected deviations that could jeopardize the aseptic nature of the mesocosm. For instance, she demanded that the convenors “must have a technical restraint, impartiality, … and if the participants do not remember, they just do not remember; it is their problem, in the Citizen Consensus Conferences the citizens are the ones who talk.” This experimental character was enhanced by the presence of observers from MINSAL and PAHO, with the explicit aim of taking notes on the application of the instrument.
This meeting focused on providing the panel with general information about the issue, enabling them to discuss and identify a preliminary set of key subthemes that require in-depth exploration, forming the basis for a future consensual document. With this aim in mind, the first activity was for citizens to introduce themselves and discuss their reasons for applying to the conference. Following the guidelines, these “discussions and brainstorming sessions form the starting point for the discussion of key ideas about the issue” (OPS 2002, p. 20). In the Chilean experiment, the situation was quite different, as the DBT representative commented afterwards.
It is normal for citizens to be expectant and have many questions about the process during the first weekend. However, this group of citizens had some special characteristics. When expressing what motivated them to apply to the conference, the overwhelming majority discussed their desire to participate in decision-making and make their voices heard. They referred to the health record very little or not at all. They showed a great need to be heard and taken into consideration. This concern exists in all societies; citizens generally feel alienated from the decision-making process. However, this need was very intense in the case of Chile and possibly other countries of the region.
Despite previous purification processes, the kind of citizen that first emerged in the Chilean CCC experiment notably diverged from the one “scripted” (Akrich 1992) in the DBT guidelines. Instead of being people who “voluntarily manifest an interest in the issue” (OPS 2002, p. 12), most of them found the PHR largely irrelevant; it could have been any other issue, and they would have applied anyway. What they cared about was making their voices heard because of the promise made during the call about the opportunity to participate actively in policymaking.
In parallel, participants had difficulty acting as a group, as recalled by Lozano:
I think this group, the citizen panel, is a group that does not exist, that is fictitious, and hence you have to work all the aspects related to their consolidation as a group of citizens that have a common objective, … to generate spaces of trust, a bond between them, and this means not only to make consultations but also to support all the internal work of the citizen panel, something that I think is a more Latin characteristic, not so … [Danish], isn’t? Moreover, I think that they needed this space during meals and parties to be face-to-face, to generate a cathartic space, so to speak, even including some conflicts between them as well.
Purification and framing can only take you so far. As already noted, mesocosms are spaces where several environmental factors are present in a more naturalistic manner, making them more realistic than laboratories. Nevertheless, the greater degrees of freedom expose the event to multiple risks, especially when human beings challenge the scripts assigned to them, as happened in Malloco when the members of the citizen panel not only did not care about the PHR, but also did not perceive themselves as being a part of a “Danish” democratic citizen assembly.
This unexpected development forced the directing team to adjust their weekend activities plan. Instead of focusing solely on identifying the relevant subthemes of the issue, they conducted several group games with the dual aim of helping participants start acting as a group and developing their capacities to reach consensus. Such games helped participants build trust and start seeing themselves as a group. In doing so, the games also transformed the tone of the meeting from a reflexive exercise expected in the guidelines to a more social-bonding, even self-help, experience. For the citizens, the experiment was never a matter of solely discussing the problems of the PHR or testing an innovative instrument for deliberation, but also a source of strong emotions, an aspect that is usually forgotten when analyzing the work of mechanisms for citizen participation (Harvey 2009; Hoggett & Thompson 2002).
Late in October, a second preparatory meeting was held in the same location. Its focus was on elaborating key questions about the PHR that the panel would pose to the experts at the final CCC event. Accordingly, the activities centered on learning to reach consensus and formulate informed questions about the issue. However, running this event proved to be much more challenging than expected.
First, the members of the citizen panel were dissatisfied with the experts selected to present on the PHR subthemes. As recalled by the facilitator and main organizer in a later publication, the citizens “questioned the mechanism used to select the experts” (Pino & Elizalde 2004), given that not all relevant perspectives about the PHR were introduced by them. This was especially relevant regarding the thorny issue of making the content of their PHR open to patients, something that the citizen panel ardently favored but that the experts openly resisted. The absence of a spokesperson for this popular position was a direct consequence of the way experts were selected, which invited only those with technical credentials and excluded “opinion-setting” experts who could represent the views of citizens.
A further difficult situation arose on the second night, as Lozano recalled.
We had a problem, but it was manageable; besides, they weren’t in prison! They went out to buy something to drink [alcohol] and started dancing; we slept in a place located further [from them], because we didn’t want to intervene all the time. They should have some freedom during the weekends. It was a minor incident. We told them that it was an excess of sociability, that we had to work the next day, as the schedules were respected, at 8:00 AM. We talked along with the facilitator and we were able to manage this, because it was the weekend in which we were alone, we managed it quite well, luckily we had no further problems, we said “this can go out of our hands” because some of them were angry the next day because of what we have said.
The CCC guidelines state that a moderate amount of goodwill between the panel members is welcomed, as it will facilitate their move toward consensus. However, an “excess of sociability,” such as a proper party, was a breach of these framings and had to be stopped, as it threatened to fracture the group when it was essential for them to advance rapidly toward consensus.
Finally, a third issue was the difficulty in reducing the number of subthemes about the PHR. After failing to do so in a communal meeting, the organizers chose a different approach, as recalled by Hernandez.
[On the second day] each subgroup was asked to choose a representative; … [and] these people reached a consensus about a way to regroup the number of subthemes, something that eased the work made by the group the rest of the day. … A deepening of the key questions per subthemes was made, asking that only two affirmations per subtheme be selected, developing in this way [the citizens’] capacity of synthesis and [their ability] to highlight the relevant above the secondary.
This quote shows the considerable amount of work devoted by organizers to making model “citizens” out of the participants. Given the initial failure to reach an agreement, the convenors selected representatives and assigned them to work on the reduction. This strategy proved to be successful, but it also had its costs: It replaced what Horst and Irwin (Reference Horst and Irwin2010, p. 107) have called the “consensusing” expectation at the heart of the CCC, or “the active process of seeking and expecting societal consensus,” with an oligarchic arrangement in which a sample of the citizens was selected to decide for the rest of them.
Finally, the panel started sketching the questions to be asked of the experts during the final CCC event. However, in this last respect, the results were not very satisfactory for the organizers. As the DBT representative recalled when summarizing this second weekend, “the group of citizens seemed to have some difficulties in written expression. … The questions were not well composed nor structured.”
The final meeting of the CCC was carried out late in November 2003. Without losing its mesocosm-based experimental character, the guidelines required that a broader audience be incorporated into the scheme, allowing it to produce effects beyond the event. In this regard, its location was moved to the former building of the National Congress in downtown Santiago. Attendance was open to any citizen who wished to participate, and personal invitations were sent to all 500 people who had applied to the original call. The media and authorities were personally invited to attend. These measures aimed to establish connections between the panel’s conclusions and key actors surrounding the PHR, particularly politicians, so that its results would be genuinely taken into consideration in policymaking.
The first two days consisted mainly of the citizen panel listening to the expert panel and then asking questions. The contribution of this activity to the overall result of the CCC, which was key to the guidelines, was weakened by two developments. First, as noted by Pellegrini and Zurita (Reference Pellegrini and Zurita2004, p. 356), “the experts’ talks were more general than they usually were [in Danish implementations].” Second, afterwards, the citizens posed “diffuse” questions, in the words of the DBT representative, which contributed little to generating a proper debate.
On the afternoon of the second day, the citizens were taken to a conference center in southern Santiago and put to work drafting a document summarizing their main recommendations about the PHR. This process lasted until the early hours of the morning. At the closing ceremony the next day, this document was presented by a member of the citizen panel to an audience comprising authorities from MINSAL, parliamentarians, experts, media outlets, and the public.
For the instrument coalition, this ceremony was the fulfillment of a long-held expectation. The CCC was finally delivering on its promise; it did work as an instrument for the enactment of deliberative democracy. As was concluded in a paper published later by two members of the coalition,
Despite its shortcomings and the criticisms pointed out throughout this article, the First Citizens’ Consensus Conference held in Chile fully achieved its objectives and constitutes an excellent model for similar conferences in other countries of the region. … The final document is well-written; it addresses the various aspects of the topic discussed with great propriety and depth, and contains concrete recommendations for the development of the envisaged policies. … Public participation during the conference was extensive and active. There was excellent participation by experts and extensive dialogue between experts and lay citizens as part of an interdisciplinary forum for dialogue (Pellegrini & Zurita Reference Pellegrini and Zurita2004, pp. 354−355).
Through purification and framing, the CCC had produced a deliberative mesocosm of high complexity, demonstrating that a policy instrument can travel thousands of miles, be tested in a radically different social context, and still produce a similar effect: The democratization of technical expertise. All the problems the instrument faced could be forgotten as anecdotes.
Such experimental success, however, did not translate into any change in the issue. Not only was the citizen panel’s document never mentioned in the subsequent public discussions about the PHR; more critically, the CCC was never used again in Chile or any other Latin American country. As explored in depth in a previous publication (Ureta Reference Ureta2016), this lack of impact was primarily due to the organizing committee’s overall “aseptic” approach to the experiment. They were so obsessed with making the experiment a “success” that they actively sabotaged any chance for it to become politically relevant.
As this case study reveals, purification and framing are always incomplete. No matter how much effort is put into trying to control every variable, there are always some factors that escape such control. This is especially true when dealing with human beings. As Callon (Reference Callon, MacKenzie, Muniesa and Leung-Sea2007, p. 347) recognizes, “humans in their somatic envelope, made of neurons, genes, proteins, and stem cells, are constantly overflowing. A total, unambiguous configuration is impossible. There is always a remainder, something that hasn’t been taken into account.” Against usual narratives about experiments as a tightly controlled instance, in practice, frames are continually challenged by the “inner life” (Hacking Reference Hacking1983) of experimental entities.
Experimental mesocosms are perennially beset with multiple “overflows” (Callon Reference Callon1998), ranging from materials that fail to exhibit expected properties to human beings who behave in confounding ways. This phenomenon is especially prominent in the policy field (Plehwe Reference Plehwe2023) due to the need to incorporate multiple actors with diverging agendas and capacities. Such overflows are not only temporary disruptions, but could ultimately generate fully fledged “strange things” (Ureta Reference Ureta2015), or entirely new experimental entities. Most of the time, mesocosms manage to continue functioning in the presence of these overflows, with even unusual phenomena becoming prized assets in experimental practice. There is commonly some room for reaction regarding overflows, a capacity to reaccommodate unruly entities to keep the experiment running.
As seen in the case of the CCC, this capacity to control may ultimately produce a paradox: a policy experiment that is formally successful but fails to address its issue of concern. This lack of proper transformations on the issue, however, might not be a problem for the final assessment of the policy experiment, as will be explored in the next section.
5 The Multiple Shades of Success
In policy narratives, it is seldom the case that the trajectory of an instrument ends after just one experiment. Rather than being tied to a specific location, policy instruments are portrayed as having trajectories that vastly exceed those of the issues to which they were locally applied. Each implementation is just one stage in a fluid journey, with new and attractive destinations always appearing on the horizon. However, these new destinations and the distinction they would bring to their coalition members will only start to materialize if local experiments with the instrument obtain accreditation as a “success.”
This issue is often where some of the most significant differences between scientific and policy experiments lie. The raison d’être of most scientific experiments is that unexpected things happen. Failure to meet initial expectations, then, is not seen as problematic. More than aiming to traverse a safe path, experiments are “deliberately organized to generate surprises” (Gross Reference Gross2010, p. 6). As a consequence, “experiments that fail can be called successful experiments” (Gross Reference Gross2016, p. 618), because something new has been learned from them even if such a failure can lead to the complete falsification of the hypotheses and theories supporting the study.Footnote 4
In policy, the status of surprising results is quite different. In theory, it is understood that “experiments introduce a moment of openness and indeterminacy in the policy process” (Millo & Lezaun Reference Millo and Lezaun2006, p. 180). Acknowledging this openness is not the same as welcoming it, in any case. Most of the time, the value given to openness goes little beyond seeing it as a way to tune up the instrument. The possibility that this indeterminacy could cause an overflow, even a complete derailment of the experiment, is usually not considered, let alone discussed (Clare Reference Clare2019). Despite being relatively common (Hudson et al. Reference Hudson, Hunter and Peckham2019), failures in policy experiments are seen as problematic. This is especially true when such failure implies that the tested instrument did not deliver on its promises. This resistance originates mainly from the fact that these experiments seek much more than solving an issue. From the perspective of coalition members, any result that threatens the transferability of the instrument should be avoided, even if (paradoxically) these surprises contribute more substantively to moderating the adverse effects of the issue than a properly functioning instrument.Footnote 5
Coalition members will always try to present the experiment as a success of some kind, an argument in which they will be helped by the fact that success (like failure) is a polyvalent concept. Given specific frameworks, success and failure can be easily interchangeable, opening multiple avenues for a positive evaluation. Besides, authoritative assessments are complex regarding policy experiments for at least two reasons. First, most experiments do not emerge in isolation but are part of an “ecology of testing” (Marres & Stark Reference Marres and Stark2020, p. 430), involving multiple stages and components. Therefore, it is challenging to evaluate each experiment alone. In parallel, most (if not all) policy experiments do not have a control group, as is usual in many scientific experiments – hence it is difficult to determine clearly whether unwanted results are a consequence of the new framing proposed by the instrument or depend on something else.
Commonly, coalition members have wide margins to debate the effectiveness of their instruments, mainly because, in the policy field, there remains an overall ambiguity about what such effectiveness entails in concrete terms (Mattocks Reference Mattocks2025). The trouble for them is that they are no longer alone in doing so. Depending on the experiment’s scale and publicity, there are usually multiple other actors offering visions about it, visions that tend to be much more detached and, frequently, critical. Coalition members, then, not only need to argue about the many successes of the experiment. They need to do so in ways that are more persuasive than critical voices.
The starting point of this process is to determine what constitutes the results of an experiment. From a field experiment perspective, these results would be considered exclusively data, the much-revered evidence of the instrument’s effectiveness. From the broader definition of policy experiments we have been building in this Element, data is never enough. Like any other complex intervention, policy experiments yield multiple types of results, many of which are unexpected from the outset. To explore their contours, these results will be divided into three main categories: transformations, inscriptions, and attachments.
Being a policy that seeks to intervene (even in a limited way) in reality, the first results of an experiment are different types of transformations (or the absence thereof) in the entities that have been brought into the mesocosm. These transformations are diverse and can substantially modify the entities involved. Some of them may confirm the images and hopes expressed in the various sociotechnical imaginaries that accompanied the instrument’s trajectory, providing useful snippets to argue about its effectiveness. Many other transformations, however, emerge closer to the figure of an overflow, unexpected effects that the actors in charge of the system must learn to understand and (when possible) control.
The observation of these transformations, both systematic and merely anecdotal, generates a second type of result in the form of multiple types of inscriptions. Latour and Woolgar (Reference Latour and Woolgar1986, p. 51) defined inscriptions as “any item of apparatus or particular configuration of such items which can transform a material substance into a figure or diagram which is directly usable by one of the members of the office space.” The production of inscriptions is the cornerstone of modern science, the primary vehicle through which observations of the phenomena under study are registered, systematized, and mobilized. The laboratory and other spaces of science, such as the mesocosms seen here, are seen as merely “inscription devices,” or technologies whose central function is the production of inscriptions. From handwritten notes on logs to highly sophisticated 3D representations, inscriptions bring science to life.
Inscriptions derived from policy experiments are of many kinds. Given the prominence of different forms of quantification in our societies, data – numeric entries that can be later organized into datasets and visually displayed as graphics – usually hold a central position. These data are never “given” (Gitelman Reference Gitelman2013), that is, derived naturally from the transformations caused by the experiments. As the social study of quantification has demonstrated, data must be produced through concrete epistemic and material practices, which ultimately transform the very results they aim to convey. The process provides visibility into specific results of the experiment, but also provokes multiple forms of strategic ignorance (McGoey Reference McGoey2012), which is the active process through which results that are resisted for some reason are made invisible.
Data is not the only kind of inscription being produced. Besides quantifications, an experiment produces multiple other kinds of inscriptions, especially of a more qualitative kind. Some of these inscriptions could also result from systematic analyses of the experiment, which rely on an in-depth description of procedures and actors (as is often the case when using genealogical or ethnographic methods). Others are more open and less systematic, ranging from anecdotes of those involved to general appraisals by the media. Even in exercises where the provision of quantified data is prominent, qualitative inscriptions are key, providing critical insights to contextualize the data.
While the generation of knowledge is always at the heart of these experiments, inscriptions are never purely descriptive or analytical. The results sought through a policy experiment always have a practical, applied component. Against the usual demand for objectivity in science, in policy experiments, inscriptions explicitly seek to be “situated” (Haraway Reference Haraway1988), to directly refer to the local issues of concern the instrument is meant to ameliorate. In this sense, these inscriptions usually operate in a similar way to the “evidence” produced by causal experiments such as RCTs, seeking to shed light on the empirical effectiveness (or lack of effectiveness) of the instrument. Here, the experimental component of the policy is evident, as it seeks to “produce a continuous and reciprocal readjustment of ends and means through the comparison of different approaches to promote common general objectives” (Sabel & Zeitlin Reference Sabel, Zeitlin and Levi-Faur2011, p. 5).
Along with transformations and inscriptions, policy experiments generate a third type of result: multiple forms of emotional attachment to the instrument, the experiment, and its results. As repeatedly mentioned, experiments focused on novel instruments never look solely to carry out transformations on an issue and/or produce detailed inscriptions. In parallel, these experiments aim to persuade multiple audiences – ranging from high-level state actors to the general public – about the positive results of the experiment and the need to approve new experimental applications and/or extend them to the broader society. This conviction is never solely intellectual, based on the systematic analysis of data, but also sensible or, more accurately, emotional. As seen in the case of Transmilenio, successful policy experiments are seductive. They invite or cry out to be followed, to be mobilized, to be extended. Moreover, coalition members will always try to use this charisma to their advantage.
To explore how these three kinds of results interact, we will develop a final case study: The global mobilization of the “Chilean Model” for pension system privatization in the 1990s and 2000s.
In June 2004, Nigeria enacted the Pension Reform Act (PRA), replacing a pay-as-you-go pension scheme that only covered a minimal portion of the population and offered meager pensions. In its place, the PRA established a mandatory Contributory Pension Scheme for both public and private sector employees. This scheme was to be funded with money coming from a percentage of the employee’s monthly salary. To manage such money, upon entering the workforce, the employee must choose a Pension Fund Administrator. These private enterprises were allowed to invest these savings to increase their value, obtaining profits in the process. The only role assigned to the State was a regulatory one, through an entity known as the National Pension Commission. If no complaints were presented, the system would be run entirely by private entities.
Although it is not mentioned in the PRA, and scarcely in subsequent official documents, this radical transformation is almost a carbon copy of a pension reform scheme enacted in Chile in 1981. Also replacing a seemingly malfunctioning pay-as-you-go system, this reform created the fully privatized Administradoras de Fondos de Pensiones (AFPs, or Pension Fund Administrators), intending to manage, for a profit, the monthly pension savings of millions of workers. The role of the State, again, was limited to overseeing the system from a distance to ensure its proper functioning.
The fact that this instrument could travel almost intact between two countries located on different continents, with almost no political or economic connections, and separated by decades, demonstrates the strength of the “Chilean Model,” as this instrument has been known since then. Seen as an utter rarity when it was enacted, in a matter of fifteen years, the Chilean Model of pension reform rose to be seen as “one of the most successful policy innovations from the Global South, … and one of the best examples of transnational policy circulation” (Kemmerling & Makszin Reference Kemmerling and Makszin2023, p. 532).
However, this aura of success was not due to the Model’s local success. One of the most striking aspects of the Chilean Model’s history is that the system did not have sufficient time to prove its actual effectiveness before being lauded as a success. The people who contributed to it for most of their working lives, and who were thus due to receive a proper AFP pension, only started retiring around 2020, decades after the system was presented as a success worldwide. Furthermore, when these pensions finally started to arrive, they were not that good at all. As a recent assessment of the system by the IMF has concluded, the system “is now delivering low replacement rates relative to OECD peers, … while informality persists in the labor market. In the absence of reforms, the system’s inability to deliver adequate outcomes for a large share of participants will continue to magnify” (Evans & Pienknagura Reference Evans and Pienknagura2024, p. 50). Since the 2000s, the recognition of this issue has led to ongoing attempts to modify the system, with only partial success (Larrañaga Reference Larrañaga2024). If the aura of “success” surrounding the Chilean Model cannot be attributed to positive transformations for its beneficiaries, its roots must be found elsewhere.
Part of the inscriptions out of which the success of the Chilean Model was built preceded its actual implementation in 1981. The reform was a central component of the comprehensive neoliberal reform program being implemented during Pinochet’s dictatorship by the “Chicago Boys.” Neoliberal economic theory and its supporting coalition have long viewed pension privatization as a key pillar of any substantive program of public deregulation (Orenstein Reference Orenstein2008, p. 74). For this reason, from its first drafts, the Chilean Model “drew much interest from policy experts given that it was seen as a radical experiment that should test some of the assumptions of liberal, and specifically monetarist, economic thinking” (Kemmerling & Makszin Reference Kemmerling and Makszin2023, p. 532). For many observers, “it was only after the Chilean precedent that pension privatization turned from a theoretical concept into political reality” (Müller Reference Müller, Holzmann, Orenstein and Rutkowski2003, p. 60).
Chilean authorities and companies rapidly saw the potential of the reform’s good reception to challenge the overtly hostile international narrative about the Pinochet regime, a narrative mainly due to its terrible legacy of human rights violations (Avery Reference Avery, González and Prem2025). Aiming to make a counterargument, government and businesses (prominently the AFPs) embarked from the mid-1980s on several international marketing campaigns to present to the world Chile’s economic achievements through liberalization, hoping that it would be critical in “helping to improve the country’s poor [international] reputation” (Prieto Larraín Reference Prieto Larraín2011, p. 135). In most cases, the new pension system played a prominent role in these efforts (Orenstein Reference Orenstein2008, p. 75). This process gained new impetus in 1990, when the return of democracy gave (somewhat) less prominence to human rights issues in the country.
In this task, they were helped by the emergence of a highly sophisticated coalition surrounding the instrument. From the late 1980s, several Chilean technicians who had been involved in the system’s inception became active promoters of its adoption in other countries, starting with those in Latin America. In promoting the many benefits of the reform, technical descriptions of its social and economic strengths were never alone. As highlighted by Weyland (Reference Weyland2006, p. 23), “their missionary zeal, which often made them push beyond the specific consulting tasks for which they had been hired, provided the spark for this reform project to catch on in several countries.”
Chief among these Chilean Model “missionaries” was José Piñera, under whose stewardship as Minister of Labor (1980−1982) the reform was implemented. Through his technical credentials – such as a PhD in Economics from Harvard University – and personal charisma, Piñera rapidly became “a fantastically successful spokesman” for the Chilean Model (Orenstein Reference Orenstein2008, p. 75), constantly presenting the model to international audiences, meeting in the process with many world leaders from George W. Bush to Vladimir Putin.
In arguing for pension reform, Piñera regularly employed a rich array of rhetorical and emotional motifs. For instance, in a highly influential book on the reform, he called the former pension system “a daily drama … [a] tragedy that continued to spread and each year—in silence, amid the indifference of the Chilean society” (Piñera Reference Piñera1991, p. 3). Applying a “reverse design” strategy (Morgan Reference Morgan2013), he created a control group for the Chilean pension experiment, portraying a dramatic scenario of “thousands and thousands of pensioners condemned to poverty” (Piñera Reference Piñera1991, p. 3). In stark contrast, he claimed that the beneficiaries of the Chilean Model were about to receive pensions “already 50 to 100 percent higher … than they were in the pay-as-you-go system” (Piñera Reference Piñera1997, p. 3).
The strategy of constructing the “success” of the Chilean Model through a mixture of technical inscriptions and passionate spokespersons proved to be highly effective, and by the mid-1990s coalition members had “induced specialists in several [Latin American] countries to … elaborate radical privatization projects” (Weyland Reference Weyland2006, p. 23). In all cases, the Chilean Model was the main template supporting these policies (Orenstein Reference Orenstein2005, p. 189).
In parallel, coalition members began to engage with representatives of International Financial Institutions (IFIs). Already embarked on an extensive process of neoliberalization (Babb & Kentikelenis Reference Babb, Kentikelenis, Cooper, Konings, Cahill and Primrose2018), IFIs proved to be receptive listeners to the wonders of the Chilean Model and quite quickly started “advocating pension reforms involving the introduction of Chilean-like systems” to other countries (Casey & Dostal Reference Casey and Dostal2008, p. 239). A critical milestone in this regard was the World Bank’s (1994) highly influential report “Averting the Old Age Crisis.” While enthusiastically endorsing pension privatization policies, the report recognizes that “because Chile is the only fully implemented decentralized scheme at this point, the discussion here draws heavily on its experience” (p. 204). After the World Bank, other IFIs such as the Inter-American Development Bank (IDB), the Organisation for Economic Co-operation and Development (OECD), and the United States Agency for International Development (USAID) also endorsed these measures (Orenstein Reference Orenstein, Kott and Droux2013, p. 285). Given such support, it is not surprising that, from the late 1990s, a wave of other countries adopted variations of the Model, especially in Eastern Europe (Müller Reference Müller2001). Even the US government made a highly publicized, but ultimately failed, attempt at enacting a reform inspired by the Chilean Model in 2005 (Kemmerling & Makszin Reference Kemmerling and Makszin2023).
Given such antecedents, it is not surprising that Nigeria would fall under the spell of the Chilean Model. In 1997, when the international hype about the Chilean Model was at its zenith, the Nigerian government began studying ways to replace its malfunctioning pay-as-you-go system. This influence was evident in the final report of the committee studying the issue, which affirmed that,
Countries that have implemented the right policies and undertaken the necessary reforms, such as Chile, have reaped substantial economic benefits, often exceeding the expectations of the initiators. Chile … is today a completely transformed economy and the envy of other South American countries. Chile’s rapid economic growth was mostly financed by long-term savings primarily from pension funds; channeled to the real sector through the capital market. … Nigeria desires a quantum leap in her economic output just as Chile in the early 1980s. If the reformed pension system facilitated Chile’s economic renaissance, adapting Nigeria’s system to some of the good attributes is only natural and sensible (Pension Subcommittee 1997, 47−48 quoted by Casey & Dostal Reference Casey and Dostal2008, p. 241)
This paragraph appears to be directly taken from the playbook of the Chilean Model coalition, mixing technical descriptions with emotional snippets to promise all kinds of wonders if Nigeria would follow Chile’s path. Given such an evaluation from a technical committee, it is not surprising at all that the PRA of 2004 was heavily based on the Chilean Model.
The continued prominence of the Chilean Model in the international sphere is somewhat perplexing given that at the time, the system was experiencing serious problems in Chile, leading to calls for its radical transformation toward a more solidarity-based system (Miranda Reference Miranda2023). More than its promised benefits, these problematic issues were the ones that were rapidly exported to Nigeria (Ubhenin Reference Ubhenin2012), forcing the government to make important reforms to the original scheme. In this regard, again, Nigerians were not alone. All over the world, adoptions of the Chilean Model “turned out to be short-lived, often leading to more or less complete reversals” (Kemmerling & Makszin Reference Kemmerling and Makszin2023, p. 529). As fast as it rose, the Chilean Model seemed to be falling out of policy fashion.
One of the primary lessons from laboratory studies in STS is that almost anything can be produced within an experimental space. If they put enough care into it, experimenters can build a point-by-point replica of reality and then reframe it according to the instrument’s mandates. However, the real challenge is commonly to “raise a world” (Latour Reference Latour, Cetina and Mulkay1983), to create a new reality in which such experimental reframing is deemed accurate and existent.
In policy experiments, raising a world entails creating a reality beyond the mesocosms where instruments are seen as having functioned smoothly, opening the ground for further mobilizations and testing. To create such a world is highly challenging, especially regarding issues with long histories, whose components have become embedded in many social and material layers. Only dense and long-lasting interrelations of transformations, inscriptions, and attachments can produce such an effect. This is especially true in the Global South, where multiple restrictions and frictions are evident due to long histories of colonialism and damage.
The Chilean Model – in a similar way to Transmilenio – was highly successful in raising such a world. This allowed the instrument to travel widely throughout the world, conducting multiple new experiments and transforming entire countries into mesocosms for testing neoliberal deregulatory schemes. The fact that most of these experiments did not materialize into the promised transformations in terms of pension improvements was largely irrelevant – inscriptions and attachments alone were more than enough to keep the instrument on the move. Only after more than a decade and dozens of tests did the lack of actual benefits in terms of better pensions for workers finally restrain (somewhat) the global appeal of the Chilean Model.
Careful Experiments
One of the key antecedents to the “Estallido Social” of October 2019 was a massive protest carried out on July 24, 2016. Organized by multiple NGOs, unions, and neighborhood associations, it successfully mobilized more than 750,000 people for a full day of rallies in cities throughout the country, all demanding reform of the pension system toward a more solidarity-based model (Miranda 2023). Its slogan, “NO mas AFPs” (No more AFPs), neatly encapsulated the widespread rage that many retiring workers and their families felt when, after saving their whole working lives, their actual pensions did not even last until the end of the month. Although the Chilean Model was still being praised elsewhere, Chileans had had enough of it.
With less fanfare, the effects of the other policy experiments seen in this Element on their issues were largely disappointing for their coalitions and the public in general. Public transport in Santiago, although significantly improved, remains years away from achieving the “world-class” status promised by Transantiago. Air pollution in the city has also improved, but not because there is a vibrant market for tradable emission permits. Finally, citizen involvement in technoscientific policy remains almost nonexistent. In some cases, these experiments evolved into new issues, even more “wicked” than before, as seen with the case of the fare indexer that opened this Element. As the millions of people protesting on the streets of Chile in October 2019 remind us, even seemingly humble malfunctioning policy instruments can trigger massive overflows.
This disappointing outcome has many causes. Concerning their initial discovery, the charismatic instruments presented here generated high levels of attraction among those who initially encountered them, especially regarding their seemingly inherently mobile character and their promises of effectiveness in addressing long-standing issues. This attraction, however, was not usually accompanied by a consistent analysis of their original contexts and/or the assumptions that animated them. The instruments seemed to “shine with their own light,” and that was more than enough to form a coalition pushing for their testing. In pursuing an experiment, coalition members usually took the role of consultants, acquiring broad powers to structure the experiment, but without the ties implied by being subjects of the issues (such as public officials and/or affected populations) or having the supervision of critical peers (as in scientific practice).
This lack of critical perspectives was again evident in the consequent (re)problematization of a local issue, where sociotechnical imaginaries were constructed, presenting various scenarios of future improvement thanks to the instrument. In the process, few mentions were made of the costs involved or possible setbacks. In parallel, existing instruments were quickly dismissed as failed, without taking the time to evaluate their achievements or the challenges their implementation entailed. Also, the adherence to problematization models of the type “the solution precedes the problem” implied that the selected issues tended to be tailored to the terms imposed by the instrument, usually leaving aside several essential characteristics.
The mesocosms in which experimental practice took place tended to be quite restricted in terms of access, especially at the beginning. In most cases, the reconfiguration of experimental entities took the form of “boundary work” (Gieryn 1983), centered on building a mesocosm within which the instrument could function rather than constituting a scaled-down version of the issue. Most entities that were difficult to locate and/or actively resisted the framing proposed by the instrument were excluded or disciplined. These mesocosms ultimately encompassed many entities but were also characterized by multiple forms of ignorance and exclusion.
Such strict purification and framing resulted in a significant division. Within the frames set by the experiment, the instrument appeared to be functioning relatively well. Outside these frames, the experiment appeared to be causing numerous overflows, resulting in a range of unexpected outcomes. Unlike in scientific experiments, these anomalous results were taken as problematic, especially when they led different actors to judge that the instrument had not worked as expected. The risks of these critical evaluations led coalition members to present these results in ways that allowed them to argue for some partial “success” of the experiment. This obsession with demonstrating the effectiveness of the instrument meant that many transformations and inscriptions were not thoroughly analyzed, resulting in the loss of valuable information to understand the real effects of the experiment and to sketch further forms of intervention on the issue.
Faced with these disappointing results, especially when they ultimately worsened the situation of the groups affected by the issue, no one took responsibility. The dependence on consulting as a basis for the creation of most instrument coalitions implied a partial and temporary involvement of the participants, usually without a greater commitment to assume responsibilities beyond (some) individual professional ethics. This meant that coalitions quickly faded away once the implementation phase ended, especially if the results had not been as good as expected, leaving no one to assume responsibility or try to correct the new issues generated by experimental practice. In all, we observed in these actors a careless approach to policy experiments, considering them little more than a means to earn recognition or advance in their careers.
A common thread that runs through all these different problematic aspects is an unwavering obsession with success. From the very first contact with the foreign instrument and its charisma to the attempts at scraping together some degree of accomplishment even in the face of uncontrolled overflows, coalitions heavily invested in producing some form of success for the instrument. More than a mere assessment, success operated as something akin to a currency, its accumulation and strategic trading allowing the experiment to move from stage to stage.
The relentless focus on success is not unique to these policy experiments. An overemphasis on the successful implementation of policy instruments constitutes one of the leading characteristics of contemporary policymaking. This focus does not only mean that policymakers constantly emphasize the bright side of policies (while actively ignoring or obscuring their darker aspects); it also shapes how failure itself is analyzed. As noted by Mica and coauthors (2023, p. 6), for most analysts, “failure is an element in a broader engagement towards ensuring policy responses and governance mechanisms of effectiveness and resilience.” Although common, failure is often viewed as a malaise of the policy process, something that could be averted if sufficient attention is given to policy implementation. Ultimately, “success, efficiency, learning, and resilience … [could be] achieved through steering failure, planning it, and redefining it as an element of policymaking and knowledge production” (Mica et al. 2023, p. 6).
Albeit laudable, and somewhat pragmatic (who will invest in a policy that does not promise success?), this overemphasis on success, especially in analytic terms, is not only misleading but wrong, because failure is everywhere. This is not only true for policy but for all aspects of life on Earth. Despite the highly sophisticated forms of organization that exist in the world, from bacteria to global capitalism, all of them are beset by massive forces of disorganization. As the second law of thermodynamics states, “the total entropy of a system either increases or remains constant in any spontaneous process; it never decreases.” Entropy – or the pull of entities toward disorganization – never ceases; it is only postponed. As Serres recognized, all that exists are “islands of negentropy [or negative entropy] in a sea of entropy” (Serres 1981 [1977], p. 263, quoted by Kroth 2024, p. 27).
This principle also applies to policy. Matters of policy are always entropic; hence, issues can never be solved once and for all, and success is perennially out of reach. This is especially true for contemporary societies, in which multiple systems and institutions – from global biodiversity to liberal democracy – appear to be embedded in sustained processes of entropic decay (Stiegler 2018). Given this context, policy proposals can only be seen as speculative attempts to manage decay.
Policy experiments have the potential to help us explore new ways of addressing decay. To reach such potential, however, these experiments should set aside their current focus on carrying out successful experiments with successful instruments. In their place, new kinds of experiments should be devised, ones that begin by acknowledging the impossibility of escaping entropy, both physical and social. Ditching pompous arguments about finally solving issues, policy experiments must focus on the humbler attempts at managing decay in better, kinder ways. More than a matter of control, policymaking hence becomes a “matter of care” (Puig de la Bellacasa 2011, emphasis added).
As Tronto’s influential definition states (1993, p. 103), care is “everything that we do to maintain, continue and repair ‘our world’ so that we can live in it as well as possible.” Caring is not about ultimate triumph, but about the recurrent maintenance necessary to keep sustaining lives in challenging circumstances, to cheat failure so we can continue to live in our world “as well as possible.” Caring is often temporary, recurring, and ultimately unsuccessful. It always includes a recognition of the inescapability of failure. Instead of aiming for triumph, caring concentrates on the humbler attempt at “holding together” objects that are constantly crumbling and falling apart (Puig de la Bellacasa 2011, p. 90).
The inescapability of decay does not mean that caring is nihilistic. Caring is never solely an affect, as it is usually denigrated. It is “a mode, a style, a way of working” (Mol et al. 2010, p. 7). Caring can also be thought of as a mode of experimenting with novel arrangements for holding things together in the face of universal decay. These arrangements might be (ideally) better than the ones that exist, but they are never final. All the issues seen in this Element, from urban air pollution to democratic deficits in technoscience, could become objects of careful policy experiments.
Careful experiments should begin by dismissing the notion that instruments are ready-made solutions to local problems, especially if they have been created for foreign issues. It has been noted that caring “is not a transaction in which something is exchanged (a product against a price), but an interaction in which the action goes back and forth (in an ongoing process)” (Mol 2008a, p. 18). In this process, instrument charisma is not necessarily a virtue. Indeed, it can be a problem, as it tends to solidify instruments, surrounding them with a thick layer of “success” that makes introducing necessary adaptations difficult. What we need, rather, is for these instruments to change, to adapt. Although some foreign principles and ideas could be central to producing these experiments, instruments must always be understood as entities to be transformed, made local, through multiple adaptations and additions. Instead of being thought of as “immutable mobiles” (Latour 1987), good policy instruments would be the ones that are not “too rigorously bounded, that do not impose [themselves] but try to serve, that are adaptable, flexible and responsive” (De Laet & Mol 2000, p. 226). Policy instruments, in themselves, become fragile and tentative, requiring constant adjustments to remain effective.
In carefully conducted experiments, fragile and fluid policy instruments never engage with issues in a merely instrumental manner. As much as the instrument is cared for, the issue must be cared for. This implies working with issues in which the experimenters have a stake, issues that affect them and many others. A careful approach never fails to give prominence to actors and entities, human and nonhuman, that are affected by these issues but whose voices are regularly ignored or excluded from instrument coalitions, such as local communities and fragile nonhuman beings.
Given this framing, problematizations cease to be an excuse to present the many virtues of the instrument. They become dense aggregation exercises, in which the many facets of an issue are presented. In particular, “attention and worry [are expressed] for those who can be harmed by an [policy] assemblage but whose voices are less valued, as are their concerns and need for care” (Puig de la Bellacasa 2011, p. 92). Especially when running experiments in the Global South (or in marginalized areas of industrialized countries), careful policy experiments must take into consideration all the precariousness surrounding them, derived from long histories of colonialism and damage (Klein 2024). Policy experiments do not happen in a void but are marked by (and contribute to) long-lasting processes of structural violence. Additionally, critical attention is given to the instrument itself, carefully dissecting its assumptions and components, to identify elements that need correction and potential downsides of their local implementation.
Careful experimenters recognize themselves as vulnerable subjects. Within a logic of care, it is impossible to exert a detached gaze, to carry out the traditional “god trick” of western science (Haraway 1988), analyzing the experiment without being affected by it. As recognized by Mol (2008b, p. 93), “instead of dreaming that we are outsiders, it might be better to realise that we act, and to try to improve, from the inside.” Like any other assemblage, careful experiments have no externality. The only way to become engaged with them is to be affected by them. Such affection permeates all stages of the experimenters’ involvement in the process, from genuine interest in the issue at hand to assuming responsibilities when things do not turn out as expected. Instead of merely being (or failing to be) accountable for failed expectations or unexpected overflows afterwards, careful experimenters continually look for trouble, for novel forms of failure, and try to introduce temporary fixes, because overflows are felt in their very flesh.
These experiments generate fluid mesocosms, in which the demands for a workable experimental space are always balanced by the need to be inclusive and open to being surprised. Purification is partial; hence, experimental entities retain many capacities for alliance and contestation. More than guided by theory or hypothesis, frames are essentially inductive, emerging from the novel relationships that result from placing experimental entities in a common space. Some degree of manipulation is inevitable, but it is also preceded by careful analysis of its pros and cons. Commonly, experimenters cede important degrees of control to experimental entities, even becoming subjects of the novel frames they propose.
In contrast to the usual approach of public participation theory, these mesocosms do not aim for consensus. Careful experiments explicitly seek to generate conflictive situations, in the sense of facilitating the encounter between differences, even radical ones. These experiments can entail multiple forms of conflict and mistakes, but mistakes that are productive in allowing us to recognize the existence of insurmountable differences, more ontological than epistemological (Viveiros de Castro 2004), which force us to accept that there are no final consensuses regarding how to intervene in the issue. Careful experiments force us to accept that, frequently, a well-negotiated dissent is much more convenient than a supposed consensus that only functions as a facade for the imposition of some (few) over others (many).
Careful experimentation also reveals that, paradoxically, a formally successful mesocosm should not be viewed as the ultimate goal of an experiment. Mesocosms can be successful in terms of the usability of the instrument, as occurred with the CCC, and yet fail to generate any effects on their issues of concern. On the other hand, instruments may fail miserably to produce the promised results but ultimately lead to significant transformations in the issue, as was the case with ETS. Instruments would undoubtedly remain central in these careful experiments. Nevertheless, their importance derives not from their promises of success but from their ability to summon multiple actors to think in novel ways about the issues that motivate the implementation. Sometimes, the mesocosm is the experiment.
Finally, careful experiments cannot be merely evaluated afterwards, balancing judgment between the poles of success or failure. A careful experiment “is not something to pass a judgment on, in general terms and from the outside, but something to do, in practice, as care goes on” (Mol et al. 2010, p. 28). These experiments are reflexive processes from the very start, integrating different forms of judgment and evaluation throughout the entire process. Such constant evaluation covers not only the instrument’s performance but also the issue, its public, and even the experimenters themselves.
A key consequence of this reflexive stance is that commonly used instruments are not inherently good or bad, but rather are cared for sufficiently (or not) to become suitable for their particular purposes. The downsides presented in many of the policy experiments seen in this Element were not derived from inherent or structural contradictions between these instruments and the local issues they were meant to address. They mainly derived from the lack of care with which these experiments were designed, implemented, and evaluated. Instruments such as CCC or BRT can work in various ways, producing multiple types of results, but only if experimenters invest sufficient care in nurturing them.
Given their complexity, careful policy experiments can never be seen as silver bullets that would deliver automatic improvements on an issue. They can fail, and often do. Even if they do not fail, their “success” is never like the one promised by conventional forms of policy experimentation, the radical amelioration (even suppression) of an issue. A careful approach can only provide partial victories, reduce side effects, and buy us some time. Hence, they could never adequately deal with an issue, let alone manage one.
The recognition of their ultimate failure does not mean that policy experiments are useless. On the contrary, in a context of growing social unrest regarding conventional forms of public action, the development of careful forms of public experimentation is of utmost urgency. Nevertheless, these experiments cannot become the thoughtless importation of charismatic instruments, naively believing that their aura of success and sophistication will allow us to deal with issues in expeditious ways. This approach only leads to innocuous mesocosms that produce some inscriptions and attachments, but few substantive transformations.
Given that our world is immersed in some of the most intense processes of decay of the last centuries, some of them even menacing the continuity of life on Earth (such as the socioecological crisis), it is critical to generate systems of careful policy experimentation that can allow us to explore novel ways of ameliorating the consequences of the crisis, especially for already vulnerable populations and processes. If what we seek through these experiments is simply to produce and test new instruments, we will surely be disappointed. On the contrary, if we see in these experiments a possibility of redefining our ways of doing politics and testing new ways of thinking about issues of common interest, they appear pregnant with hope. It is precisely through logics of careful experimentation, in mesocosms formed by heterogeneous coalitions and plagued by tensions and divergences, that we can start exploring novel ways to assemble increasingly plural, sustainable, and just forms of living in our fragile worlds.
Glossary
- AFP
Administradoras de Fondos de Pensiones (Pension fund administrators)
- BRT
Bus Rapid Transit
- CCC
Citizen Consensus Conference
- DBT
Danish Board of Technology
- ECLAC
Economic Commission for Latin America and the Caribbean
- ETS
Emissions Trading Schemes
- IDB
Inter-American Development Bank
- IFI
International Financial Institutions
- IMF
International Monetary Fund
- INE
National Institute of Statistics
- MINSAL
Ministry of Health
- MTT
Ministry of Transport and Telecommunications
- ODEPLAN
Office of National Planning
- OECD
Organization for Economic Co-operation and Development
- PAHO
Pan-American Health Organization
- PHR
Public Health Record
- PTEP
Public Transport’s Expert Panel
- PTUS
Plan for Urban Transport for Santiago
- PRA
Pensions Reform Act
- RCT
Randomized Controlled Trials
- STS
Science and Technology Studies
- USAID
United States Agency for International Development
- WB
World Bank
M. Ramesh
National University of Singapore (NUS)
M. Ramesh is UNESCO Chair on Social Policy Design at the Lee Kuan Yew School of Public Policy, NUS. His research focuses on governance and social policy in East and Southeast Asia, in addition to public policy institutions and processes. He has published extensively in reputed international journals. He is co-editor of Policy and Society and Policy Design and Practice.
Michael Howlett
Simon Fraser University, British Columbia
Michael Howlett is Burnaby Mountain Professor and Canada Research Chair (Tier1) in the Department of Political Science, Simon Fraser University. He specialises in public policy analysis, and resource and environmental policy. He is currently editor-in-chief of Policy Sciences and co-editor of the Journal of Comparative Policy Analysis, Policy and Society and Policy Design and Practice.
Xun WU
Hong Kong University of Science and Technology (Guangzhou)
Xun WU is currently a Professor at the Innovation, Policy and Entrepreneurship Thrust at the Society Hub of Hong Kong University of Science and Technology (Guangzhou). He is a policy scientist with a strong interest in the linkage between policy analysis and public management. Trained in engineering, economics, public administration, and policy analysis, his research seeks to contribute to the design of effective public policies for dealing with emerging policy challenges across Asian countries.
Judith Clifton
University of Cantabria
Judith Clifton is Professor of Economics at the University of Cantabria, Spain, and Editor-in-Chief of Journal of Economic Policy Reform. Her research interests include the determinants and consequences of public policy across a wide range of public services, from infrastructure to health, particularly in Europe and Latin America, as well as public banks, especially, the European Investment Bank. Most recently, she is principal investigator on the Horizon Europe Project GREENPATHS (www.greenpaths.info) on the just green transition.
Eduardo Araral
National University of Singapore (NUS)
Eduardo Araral specializes in the study of the causes and consequences of institutions for collective action and the governance of the commons. He is widely published in various journals and books and has presented at more than ninety conferences. Ed was a 2021–22 Fellow at the Center for Advanced Study of Behavioral Sciences, Stanford University. He has received more than US$6.6 million in external research grants as the lead or co-PI for public agencies and corporations. He currently serves as a Special Issue Editor (collective action, commons, institutions, governance) for World Development and is a member of the editorial boards of Water Economics and Policy, World Development Sustainability, Water Alternatives and the International Journal of the Commons.
About the Series
Elements in Public Policy is a concise and authoritative collection of assessments of the state of the art and future research directions in public policy research, as well as substantive new research on key topics. Edited by leading scholars in the field, the series is an ideal medium for reflecting on and advancing the understanding of critical issues in the public sphere. Collectively, the series provides a forum for broad and diverse coverage of all major topics in the field while integrating different disciplinary and methodological approaches.
