State-Building in the City: An Experiment in Civilian Alternatives to Policing

CHRISTOPHER BLATTMAN; GUSTAVO DUNCAN; BENJAMIN LESSING; SANTIAGO TOBÓN

doi:10.1017/S0003055426101555

Published online by Cambridge University Press: 06 April 2026

CHRISTOPHER BLATTMAN*, University of Chicago, United States
GUSTAVO DUNCAN*, EAFIT University, Colombia
BENJAMIN LESSING*, University of Chicago, United States
SANTIAGO TOBÓN*, EAFIT University, Colombia
* Corresponding author: Christopher Blattman, Professor, Harris School of Public Policy, University of Chicago, United States, blattman@uchicago.edu.
Gustavo Duncan, Professor, School of Finance, Economics, and Government, EAFIT University, Colombia, gduncan@eafit.edu.co.
Benjamin Lessing, Associate Professor, Department of Political Science, University of Chicago, United States, blessing@uchicago.edu.
Santiago Tobón, Professor of Economics, School of Finance, Economics, and Government, EAFIT University, Colombia, stobonz@eafit.edu.co.

Abstract

We helped design and evaluate a statebuilding intervention in Medellín, Colombia. The municipal government dramatically intensified nonpolice state presence in 40 neighborhoods over 20 months. On average, perceptions of security and legitimacy changed negligibly, suggesting that returns to statebuilding investments are generally low, at least within electoral cycles. Prespecified heterogeneity analysis, however, reveals significant increases in security and legitimacy where state governance began relatively higher, while impacts were null or possibly negative where it began lower. This suggests increasing rather than diminishing returns to statebuilding. The divergence apparently resulted from city officials under-delivering in initially lower-governance sectors. One reason might be “start-up costs” in statebuilding. Alternatively, both initial state penetration and incentives to implement new programs might depend on neighborhoods’ ability to hold agencies accountable. Whatever their source, increasing returns could drive persistent “neglect traps”—channeling political attention and investment to areas where state penetration is already robust, reinforcing existing disparities.

Information

Type: Research Article
Information: American Political Science Review, First View, pp. 1–19

DOI: https://doi.org/10.1017/S0003055426101555
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2026. Published by Cambridge University Press on behalf of American Political Science Association

INTRODUCTION

Even within wealthy, well-governed cities, poor and informal neighborhoods are often islands of low state penetration. Indeed, many are partially governed by local criminal groups, sometimes competing with the state for residents’ loyalty (e.g., Arias 2017; Leeds 1996; Melnikov, Schmidt-Padilla, and Sviatschi 2025; Uribe et al. 2025). Such within-city variation in state penetration is puzzling. Governance should be both easier and more rewarding in densely populated areas close to the center of state power than in distant, sparse rural areas (Scott 2009; Tilly 1990). Urban peripheries should be the low-hanging fruit of statebuilding (Herbst 2006).

Why do states fail to establish Weberian monopolies throughout their urban cores? This study suggests some answers. One possibility is that improving state penetration and effectiveness is difficult, with low returns to investments in state presence and activity. Alternatively, success may depend on initial conditions. In particular, statebuilding efforts may exhibit increasing returns—low or even negative impacts in areas with little state penetration, but higher in areas that begin with moderate government capacity and more of a monopoly on violence. Such increasing returns could help explain path-dependent “neglect traps,” where political attention and investment flow to places where state penetration is already relatively robust.

These hypotheses flow from the unexpected results of a city-wide statebuilding experiment in Medellín, Colombia. The city wanted to test whether dramatically expanding nonpolice state presence could improve perceptions of state legitimacy and security provision, and crowd out governance by local gangs. The intervention was guided by the concept of convivencia (“coexistence”), common across Colombia, where mayors have been experimenting with nonpolice security strategies for decades. Convivencia efforts often involve street-level bureaucrats and municipal agencies working to improve everyday order, foster communication between communities and authorities, strengthen community organizations and their ability to solve local problems, and connect residents with state agencies that can solve disputes. Such efforts, especially nonpolice ones, have rarely if ever been evaluated experimentally.¹

We worked with the Alcaldía (the Mayor’s office) to identify an experimental sample of 80 “sectors”—informal areas of up to 10 city blocks with 1,000–3,000 residents each—in low- and middle-income residential neighborhoods. Like most neighborhoods in Medellín, these sectors had local gangs known as combos that provide differing degrees of local governance. In 40 sectors, the city dramatically intensified its convivencia activities.

Operación Convivencia began in early 2018 and lasted 20 months. It represented at least a 10-fold increase in central and street-level state attention in treated sectors.

There is little data on the legitimacy or governance activities of city governments, let alone gangs. Thus, to design and evaluate this intervention, we developed and prespecified two new survey-based measures: Relative state–gang legitimacy and Relative state–gang security provision. Our relative legitimacy measure captured perceived trust, satisfaction, and fairness of the state versus the local gang. Our relative security measure captured the perceived responsiveness of each actor to local disorder and disputes. We collected these data in a representative citywide survey of nearly 5,000 residents and businesses, plus an additional 2,400 in the 80 experimental sectors. We also explore impacts on administrative measures of security—emergency hotline calls and reported crimes.

Anticipating that impacts might differ depending on initial conditions, we also prespecified a heterogeneity analysis based on a proxy for baseline state penetration: Relative state–gang governance. To measure this, we interviewed three community leaders per sector, asking the degree to which the state or the gang dominated 14 forms of everyday governing activities, both security-related and non-security-related.

Despite the intensity of the intervention, we see no impact on our primary outcomes—at least on average. Residents barely noticed this huge increase in state presence, and it did not raise their perceptions of state legitimacy or security provision (in relative or absolute terms). If anything, treated sectors perceived a small decline in state presence and security.

This is surprising for several reasons. Like many middle- and high-income cities, Medellín has a relatively high-capacity bureaucracy. It dramatically increased nonpolice state activities and presence in underserved communities for almost two years. Perhaps changing citizen perceptions of effectiveness and legitimacy was more difficult than we or the city anticipated. This would challenge current enthusiasm for civilian alternatives to policing, and suggest that the returns to urban statebuilding may be generally lower than widely believed.

Our prespecified heterogeneity analysis, however, suggests an alternative interpretation—increasing returns to statebuilding. We initially expected diminishing returns, with the largest impacts where state penetration began relatively low and criminal rule relatively high. In fact, the opposite occurred. In relatively high-state/low-gang sectors, residents did notice large increases in services and attention; the program raised state legitimacy by roughly 10 percent; and reported crimes and emergency calls related to fights and public disorder decreased by 40 percent. An index of all outcomes improved by a huge margin—0.67 standard deviations. Conversely, residents in low-state/high-gang sectors did not notice improvements in state presence, perceptions of the state weakly worsened, and there was no change in reported crime or street-level disorder.

One likely cause of these heterogeneous effects was weaker implementation where state penetration began lower. Our data suggest that city officials were less likely to fulfill their obligations and promises in such sectors. This suggests that service provision can increase state legitimacy and improve local security—provided the state actually delivers. Raising and then disappointing expectations, by contrast, is a recipe for null or even negative impacts.

Why didn’t the state deliver in low-state/high-gang neighborhoods? One possibility is that gangs prevented or undermined effective state delivery. While we cannot rule this out, implementers reported little interference from gangs, and our interviews with gang members suggest that they did not find municipal staff threatening and are mainly concerned with police. Two other explanations seem more likely. First, some minimum level of penetration may be required to deliver at all, generating entry barriers to statebuilding. Second, low-state/high-gang communities may have less social cohesion and organization, and hence less ability to hold state agencies accountable, undermining incentives to deliver.

Any one of these mechanisms could produce increasing returns. These, in turn, have potentially wide-ranging and critical implications. Increasing returns help explain economic and demographic clustering at all scales (Krugman 1991), and drive path dependence in general (Pierson 2000), which can produce suboptimal but entrenched outcomes.

Increasing returns to urban statebuilding, if present, would fit these patterns. They could create conditions for what we call “neglect traps,” a political analog to the development traps arising from increasing returns to investment (e.g., Duflo and Banerjee 2011; Weil 2008). Politicians and bureaucrats need to achieve observable results with limited resources in short time frames. If returns are highest in well-governed areas, their incentives are to exacerbate rather than redress disparities. Neglect traps, and the increasing returns that drive them, could help explain the persistence of uneven state penetration within urban cores even as overall state capacity grows, with zones of Weberian monopoly and robust security abutting areas of relative state absence, disorder, and criminal governance.

Caution is warranted. The evidence for increasing returns is rooted in a single experiment and rests on heterogeneity analysis in a modest sample. Yet the potential importance merits further experimentation and research.

CONTEXT

The State

Medellín is Colombia’s second-largest city, with a metropolitan population of more than four million. It is an industrial and commercial center, with a well-organized bureaucracy, high tax revenues, and public services.

The metropolitan police force has about 2.7 officers per 1,000 people, slightly higher than the U.S. average and comparable to Los Angeles. Medellín is divided into 16 comunas. Most have their own police jurisdiction with a commander and station. In Colombia, however, the police are a national institution, a branch of the Defense Ministry. Although the constitution designates mayors as local police authorities, this only gives them influence over tactics and broad policy. Decisions about the number of officers, their wages, and their training are made by the central government, not mayors.

Police autonomy is one reason why Colombian cities have been experimenting with civilian security measures for decades. Most cities have a large municipal agency, the Secretariat of Security, that directs a diverse array of activities. Medellín’s secretariat has roughly one staff member per 1,000 residents, giving it roughly a third as many personnel as the police. Its budget and personnel have grown enormously over time, from spending as little as 2 USD per capita in 1985, to 40 in 2007, and well over 50 in recent years (Appendix A of the Supplementary Material). It is this combination of centralized coordination plus street-level staff that Operación Convivencia augmented.

Street Gangs

In Medellín, gangs and criminal governance are also important features of everyday life. Virtually every low- and middle-income neighborhood is home to a combo with a local monopoly on illicit activities. Combo territories—often no more than 10–25 blocks—are well demarcated, known to residents, and have been largely stable since at least the early 2010s. There are roughly 400 combos in the metropolitan area.

Since 2016, we have conducted semi-structured qualitative interviews with 178 gang leaders and members across 80 groups. Obviously, this is a convenience sample of actors who agreed to speak. Almost half took place in one of Medellín’s three major prisons, from which leaders typically direct street operations. Appendix B of the Supplementary Material discusses human subjects protections and ethical considerations.

We found that Medellín’s combos are generally small, well-organized firms whose profits come primarily from local drug retailing, supplemented by protection fees (described below). Combos typically have 15–50 salaried members, most aged 16–35, and each member typically has a well-defined position in one of the combo’s business lines. Many combos also sell private protection services and other governance services in return for fees and loyalty (Blattman et al. 2025). The fees are often called a pago por la vigilancia (surveillance fee) or, more colloquially, a vacuna (vaccine).

CONCEPTS, MEASUREMENT, AND DATA

Concepts

State capacity, legitimacy, presence, and governance are contentious concepts, hard to define and measure, so we start by clarifying our use of terms. The intervention sought, at a minimum, to increase nonpolice state presence—the visible presence and activity of local bureaucrats and services. Of course, the city also tried to incentivize its agents to be engaged and effective in helping to govern everyday life—what we call state penetration. If successful, we predicted that greater presence and penetration could result in greater state legitimacy and, potentially, neighborhood security.

The hypothesis that state presence and penetration could improve its legitimacy is straightforward. Presence could directly foster personal relationships between residents and local staff (Karim 2020), while more effective provision of everyday services could win hearts and minds. States have many ways to build trust and the right to rule, but effective local service and security provision have strong track records in producing stability and popular support for governments (Beath, Christia, and Enikolopov 2012; Berman et al. 2013; Carter 2013; Gurr 1970; Krasner and Risse 2014; Levi, Sacks, and Tyler 2009).

The connection between our intervention and neighborhood security is less obvious, since it did not directly affect policing. As we elaborate below, any impact of Operación Convivencia would mainly be indirect, via community mobilization and education, deterrence through tackling everyday problems, or signaling state presence to disorderly actors.

Measurement and Data

Measuring Baseline State Penetration

We predicted that the intervention would have the largest impacts in neighborhoods where state presence and penetration began low. We operationalized this with a measure we call baseline Relative state–gang governance. We interviewed approximately three government or community leaders in each experimental sector and asked them “Who solves [problem] in this sector?” or “Who gives permission for [activity] in this sector?” for 14 different types of governance. This included 10 order- and security-related items (such as dispute resolution, assaults, thefts, murders, infrastructure maintenance, and poverty relief) and four forms of regulation (noise, sports fields, political rallies, and becoming a community leader). We relied on leader reports largely because the rapid pace of the city’s intervention gave us only weeks to collect pre-intervention data.

These baseline measures were relative, reflecting a conventional view of state vs. gang governance as zero-sum. The response options we offered leaders were “Mostly the combo,” “Mostly the state and official community leaders,” or “Both in equal proportion.” Table C.1 in the Supplementary Material reports responses on a 0–1 scale, with 1 representing mostly the state, and 0 mostly the combo. Averages range from 0.44 for dealing with disorderly drug users and 0.49 for solving thefts, to 0.87 for poverty alleviation and 0.93 for local infrastructure. Most other measures to do with crime and regulating daily life are in the 0.6–0.8 range. To index these, we standardize and average the 14 measures and transform them into a 0–1 scale. For robustness, we also consider alternative measures of baseline state penetration. Appendix C.1 of the Supplementary Material elaborates.
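The index construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the 0.5 coding for "Both in equal proportion," the averaging of leader answers within a sector, the min-max rescaling to 0–1, and all function names are assumptions made for the example, and the toy data are hypothetical.

```python
from statistics import mean, pstdev

# Numeric coding of the three response options. The endpoints (0 = combo,
# 1 = state) follow the text; 0.5 for "both" is an assumption.
RESPONSE_CODES = {
    "Mostly the combo": 0.0,
    "Both in equal proportion": 0.5,
    "Mostly the state and official community leaders": 1.0,
}


def item_score(leader_answers):
    """Average the coded answers that a sector's leaders gave for one item."""
    return mean([RESPONSE_CODES[a] for a in leader_answers])


def standardize(values):
    """Return z-scores across sectors; a constant item maps to zeros."""
    mu, sd = mean(values), pstdev(values)
    return [0.0 if sd == 0 else (v - mu) / sd for v in values]


def governance_index(sector_items):
    """Map {sector: [item scores]} to {sector: index in [0, 1]}."""
    sectors = list(sector_items)
    n_items = len(sector_items[sectors[0]])
    # Standardize each governance item across sectors.
    z_by_item = [
        standardize([sector_items[s][j] for s in sectors])
        for j in range(n_items)
    ]
    # Average the standardized items within each sector.
    raw = [
        mean([z_by_item[j][i] for j in range(n_items)])
        for i in range(len(sectors))
    ]
    lo, hi = min(raw), max(raw)
    # Min-max rescaling is an assumed choice for the 0-1 transformation.
    return {
        s: 0.5 if hi == lo else (r - lo) / (hi - lo)
        for s, r in zip(sectors, raw)
    }


# Toy data: three hypothetical sectors, two items each (the study uses 14).
toy = {"sector_a": [0.0, 0.25], "sector_b": [0.5, 0.5], "sector_c": [1.0, 1.0]}
index = governance_index(toy)
```

Standardizing before averaging keeps items with wide variation (such as theft resolution) from dominating items where nearly all sectors report state control (such as infrastructure).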

Measuring Primary Outcomes

Nearly two years later, in December 2019, we ran a representative survey of residents and businesses in Medellín’s 223 low- and middle-income barrios. Years of qualitative fieldwork and question-piloting improved the quality and scope of our measures, which now included legitimacy and taxes/extortion. We also learned that state–gang governance was not zero sum, so we measured perceptions of the state and the gang separately, giving us both absolute and relative measures.

Citywide, we randomly sampled 1,900 blocks, stratified by barrio, and sought to interview two households and one business per block. In addition to this representative sample, we selected six additional blocks in each of the 80 experimental sectors and set out to interview four residents and one business per block (about 29 respondents per sector). This represents roughly 400 additional blocks. Altogether, this amounts to 2,347 unique blocks surveyed. Appendix C.2 of the Supplementary Material elaborates.

To operationalize legitimacy, we asked residents how much they trusted each actor, whether each behaves fairly, how satisfied they were with each, whether they thought their neighbors trust each, and how they thought their neighbors would rate each. Each question used a 4-point Likert scale, which we rescaled so that 0 = nothing, 0.33 = a little, 0.66 = somewhat, and 1 = very. Table 1 reports average responses, as well as an index for each actor that averages all five legitimacy measures (pooling the police and Alcaldía responses into a state measure). Relative state–gang legitimacy is the simple difference between the absolute state and combo indexes. It ranges from −1 to 1, where positive values imply the police and Alcaldía are seen as more trusted/fair/effective than the combo.
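As a rough illustration of the scoring just described, the sketch below rescales Likert answers, averages them into a per-actor index, and takes the state minus combo difference. The rescaled values come from the text; pooling police and Alcaldía by a simple average, and the function names, are assumptions for the example.

```python
# Rescaled Likert values as given in the text.
LIKERT = {"nothing": 0.0, "a little": 0.33, "somewhat": 0.66, "very": 1.0}


def actor_index(answers):
    """Average one respondent's rescaled answers about a single actor."""
    return sum(LIKERT[a] for a in answers) / len(answers)


def relative_legitimacy(police, alcaldia, combo):
    """State index minus combo index, ranging from -1 to 1."""
    # Assumption: police and Alcaldia are pooled by a simple average.
    state = (actor_index(police) + actor_index(alcaldia)) / 2
    return state - actor_index(combo)


# Hypothetical respondent: five legitimacy questions per actor.
example = relative_legitimacy(
    police=["somewhat"] * 5,
    alcaldia=["a little"] * 5,
    combo=["a little"] * 5,
)
```

A positive value means the respondent rates the state above the combo; the same pattern would apply to the security provision items, with the frequency labels in place of the ones used here.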

Table 1. State and Combo Legitimacy and Security Provision, Barrio Survey Averages, 2019

Note: HH indicates a household question and Biz indicates a business question. Columns 1–5 present averages from the city-wide representative survey ($N = 4{,}598$). Column 6 reports averages for the 40 control sectors of the experimental sample ($N = 1{,}193$). Legitimacy questions are asked only of households ($N = 2{,}958$ for columns 1–5 and $N = 958$ for column 6).

To measure security provision, the survey asked how often state and combo each intervened in 17 everyday forms of disorder.² These included neighbor disputes, street fights, debt collection, robberies, and other crimes. For each of the 17, we asked how often each actor intervened: never, occasionally, frequently, or always. Table 1 reports average responses. Each question used a 1–4 Likert scale, rescaled so that: 0 = never, 0.33 = occasionally, 0.66 = frequently, and 1 = always. We averaged these 17 items into 0–1 indexes of state and gang security provision. We also calculate a Relative state–gang security measure ranging from −1 to 1.

A broader governance measure could have included the maintenance of public space, state access and communication, the quality of collective decision-making and coordination, and provision of welfare and additional public goods. Survey length restrictions forced us to make difficult choices. We focused on security-related governance because our qualitative work suggested that this is where gang and state governance most commonly overlapped, and because we were interested in whether greater nonpolice presence could improve security conditions.

State/Combo Performance Outside the Experimental Sample

Here, we describe findings from the 2019 citywide representative sample, both for context and to motivate the intervention and our predictions.

State

The average low- and middle-income resident has moderately positive views of the police and Alcaldía. The state legitimacy index has a mean of 0.57, which corresponds to being “somewhat” fair or trustworthy on the Likert scale. When it comes to security provision, the mean is 0.41—slightly better than “a little” responsive to disorder and disputes. Nonetheless, there is widespread variation in perceptions across neighborhoods. State legitimacy varies from 0.28 at the 10th percentile to 0.83 at the 90th, while security provision varies from 0.06 to 0.8. Not surprisingly, the two indexes are also positively correlated. Appendix C.3 of the Supplementary Material elaborates.

Combos

After the state, combos are the most common organization that residents use to settle household and business disputes, collect debts, stop fights, prevent thefts, and manage the homeless and drug addicts. Gang security provision has a mean of 0.33—80 percent the level of the state. Average combo legitimacy is 0.43—75 percent of the state’s. Combos vary widely in the extent to which they provide governance and security. Every low- and middle-income neighborhood has a combo, but not all govern. Again comparing 10th–90th percentiles, security provision ranges from 0 to 0.76, legitimacy from 0.07 to 0.8.

Relative State–Gang Governance

The state is by far the most important governing authority in all neighborhoods, as the combo does not provide infrastructure or solve local coordination problems. Yet with respect to the 17 forms of security provision, respondents ranked the combo as more responsive than the state in 31 percent of neighborhoods, as illustrated in Figure 1.

Figure 1. Relative State–Gang Security Provision by Barrio, and Location of Experimental Sectors

Note: We average the 4,598 responses in the 2019 survey by barrio. Red indicates that the combo responds more to disputes and disorder, and blue indicates the state.
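The barrio-level comparison behind Figure 1 can be approximated as follows. This is a hedged sketch: it assumes each respondent already has a state and a combo security index, that barrios are compared by the simple mean of respondent-level differences, and that "combo more responsive" means a negative barrio mean. The function name and toy records are hypothetical.

```python
from collections import defaultdict


def share_combo_more_responsive(responses):
    """Aggregate (barrio, state_index, combo_index) records by barrio.

    Returns the barrio-level relative measure and the share of barrios
    where the combo is rated as more responsive than the state.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for barrio, state, combo in responses:
        acc = sums[barrio]
        acc[0] += state
        acc[1] += combo
        acc[2] += 1
    # Barrio mean of (state - combo); negative values favor the combo.
    relative = {b: (s - c) / n for b, (s, c, n) in sums.items()}
    negative = sum(1 for r in relative.values() if r < 0)
    return relative, negative / len(relative)


# Hypothetical respondent-level records for three barrios.
toy_responses = [
    ("b1", 0.5, 0.2), ("b1", 0.7, 0.4),
    ("b2", 0.1, 0.6),
    ("b3", 0.4, 0.4), ("b3", 0.6, 0.2),
]
relative_by_barrio, combo_share = share_combo_more_responsive(toy_responses)
```

In the toy data only one of three barrios has a negative relative measure, so the share is one third; the survey figure reported in the text is 31 percent.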

Importantly, state and gang security provision are not strongly negatively correlated, in contrast to the conventional zero-sum strategic-substitutes view. Many communities report that both the state and combo provide order. These areas are typically wealthier and may have a higher willingness to pay for both actors. As we show in a companion study, combos seem to govern partly in order to protect drug-retailing rents and respond to police presence near drug corners by governing more (Blattman et al. 2025). Appendix C.3 of the Supplementary Material elaborates.

INTERVENTION

Like many city governments, Medellín’s Alcaldía was interested in statebuilding in the sense of increasing public safety, improving community relationships, and increasing its perceived legitimacy. Ideally, this might lead citizens to increasingly turn to police and the Alcaldía for security and dispute resolution rather than local gangs.

There are many civilian-led security approaches, from alternative dispute resolution, to community violence interruption, to increased social services—many of which exist in Medellín. Here, the Alcaldía wanted to assess the returns to intensifying the broad array of everyday municipal services and improving the functioning and governance of small neighborhoods. Appendix D of the Supplementary Material elaborates.

Experimental Sample

Working with the Alcaldía, we determined that we could best evaluate the returns to city services with a highly intensive intervention in 40 “sectors.” Each sector is an informal neighborhood, far smaller than an official barrio. The government identified 80 sectors that were broadly representative of the city’s low- and middle-income barrios, mapped in Figure 1. They ranged in size from 200 to 600 households (1,000–3,000 residents), typically covering up to 10 medium-density blocks.³

The intervention focused on sectors of no more than 10 blocks, not because the approach demanded such intensity or because this was the most important margin to evaluate. Rather, intervening intensively in a small number of sectors had three advantages: maximizing statistical power through treatment intensity, minimizing the chance of spillovers (by ensuring sectors were at least 250 meters distant from one another), and working within the city’s budgetary constraint.

Activities

Operación Convivencia began in April 2018 and ran for 20 months, until December 2019, when the mayor’s term ended. The city designed the intervention to be delivered similarly across all sectors, at a cost of approximately $27,500 per sector (Appendix D of the Supplementary Material). In each sector, the intervention had three main components.

Central Task Force

The Alcaldía created an interagency task force to respond to local concerns. This could include normal services—for example, poor trash pickup or broken playground equipment. The task force also tried to respond to security concerns, including attention from the city’s dispute-resolution officers and family services. Concerns reached the task force via liaisons, other city staff, or the general hotline Linea 123.

Community–Alcaldía Events

The city also sought to improve communications and relationships with residents by intensifying its Consejos de Convivencia. These community meetings, which local police commanders and Alcaldía staff are asked to attend, are an opportunity for residents and officials to identify specific problems and agree on mutual responsibilities and commitments. Normally, there is one Consejo per comuna per year (for 150,000 residents). Treated sectors held two sector-specific Consejos per year, representing a 50-fold increase in attention. Additionally, the Alcaldía organized large, one-time, sector-specific events called Caravanas de la Convivencia: a weekend-long street festival in each sector where, in addition to music, food, and entertainment, representatives from each agency were on hand to explain their services in detail and identify residents in need of assistance.

Street-Level Liaisons

The city also assigned a full-time street-level bureaucrat—a “liaison”—to each treated sector. Normally, the city has one liaison for each of the 16 comunas—roughly 1 per 540 blocks. For this intervention, the city hired 40 new liaisons as contractors. Thus treated sectors had one liaison per nine blocks—a roughly 60-fold increase in street-level staffing.
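The staffing ratio quoted above is simple arithmetic, shown here as a short check. The function name is hypothetical; the two inputs are the figures given in the text.

```python
def fold_increase(baseline_blocks_per_liaison, treated_blocks_per_liaison):
    """Ratio of blocks covered per liaison before versus during treatment."""
    return baseline_blocks_per_liaison / treated_blocks_per_liaison


# One liaison per roughly 540 blocks normally, one per nine blocks in treated sectors.
staffing_ratio = fold_increase(540, 9)
print(staffing_ratio)  # 60.0
```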

Liaisons were expected to spend 3–6 days per week in their assigned sector, and otherwise work in the Alcaldía offices. They were given a high level of autonomy to engage and mobilize the sector as they saw fit. Still, liaisons had weekly targets and quotas for neighborhood events and resident referrals. Liaisons had multiple roles, including:

• collect and formally register community concerns to the interagency task force;
• organize community events and meetings;
• help community organizations coordinate local collective action;
• provide training to community leaders and organizations in dispute resolution and related skills, and encourage them to take an active role in resolving local issues;
• proactively identify individual and neighborhood problems and refer them to the relevant city agency for assistance (e.g., connecting residents with interpersonal conflicts to the comuna’s inspecciones for dispute resolution or comisarías for family problems);
• work with police officers to better inform community members of the “police code”—the country’s legal guidelines for dealing with and correctly reporting nuisances, misdemeanors, and crimes, versus duties of the Secretariat of Security.

Like roughly two-thirds of municipal staff, liaisons were employed on a contract basis through a nongovernmental organization with extensive experience providing neighborhood outreach. They had a manager in the Security Secretariat who trained them, monitored their activities, and controlled quality.

Liaisons were not residents of their assigned sector. Rather, liaisons were professional hires with profiles similar to the city’s existing cadre of liaisons: university-educated (often in the social sciences, psychology, or social work), ages 25–35, and balanced across gender. Nonetheless, all came from low- and middle-income communities in Medellín, most of which would have had a combo and a degree of criminal governance.

Additional details are in Appendix D of the Supplementary Material, where we elaborate how the Alcaldía took steps to minimize changes in services to other neighborhoods (including control sectors).

Predicted Impacts

We had theoretical and empirical reasons to believe that Operación Convivencia could increase state legitimacy, as well as real and perceived efficacy at delivering security.

Legitimacy

As noted above, scholars commonly connect state presence and effective service delivery to its legitimacy. Our qualitative work bolstered this view. In community interviews, the speed and quality of services seemed to be a primary driver of confidence in the state. Our observations suggested that residents rewarded extra state attention with trust and collaboration. For instance, one liaison remarked, “Community members expressed things like: ‘We have never been this close to anyone in authority before’ […] They were very grateful for it. They welcomed us warmly into the community. It was an opportunity to show them different ways of doing things that they were completely unaware of.”

Security

Of the civilian actors involved in Operación Convivencia, only a small set (the dispute resolution and family services offices) directly intervened in disputes. Thus, the intervention’s effects on security were likely to be indirect.

One channel is through community leaders and organizations, with whom liaisons worked to build conflict-resolution and problem-solving skills. Liaisons also tried to foster collective beliefs about appropriate behaviors, rules, and fora for resolving disputes. Similar skill and norm changes underlie most alternative dispute-resolution programs (Blattman, Hartman, and Blair 2014; Mnookin 1998).

Second, disorderly people may avoid or change their behavior in neighborhoods with more visible state presence or more active community organizations. Tackling minor problems and disorder could also avert escalation into larger and more violent disputes.

Third, the consejos and liaisons were intended to educate the public and manage expectations. For instance, town halls with officials could improve communication, create realistic expectations, and increase perceived effectiveness of government. Liaisons also taught residents about the responsibilities of officers and the limits on their authority, as well as the role of civilian agencies. “Some people didn’t know what the ‘Casa de Justicia’ is,” one liaison explained, “or what the ‘Comisaria de Familia’ does, or that there’s the possibility of free conciliation in a Conciliation Center. So, when they learn about these services and use them, it generates more trust.”

Expected Impacts on Gang Rule

We initially hypothesized that the intervention might reduce gang security-provision and legitimacy. We viewed the intervention, in part, through the lens of duopolistic competition, with the state and the combo offering residents distinct but substitutable services. Should the state exogenously increase its supply, its relative “market share” should rise. A simple formal model illustrates this in Appendix E of the Supplementary Material. This logic echoes a literature that attributes the emergence of organized crime and criminal governance to a power vacuum left by weak states (Gambetta 1996; Skaperdas 2001; Skarbek 2011), and a literature on rebel governance that sees state and nonstate actors as competitors (Arjona 2016; Blair and Kalmanovitz 2016; Kalyvas 2006; Mampilly 2012; Staniland 2012). Over the course of the intervention, we moderated our view. First, qualitatively we observed an excess demand for governance that neither the state nor combo could fill. The 2019 data bore this out, as both state and combo governance measures are well below 0.5 in Table 1. Second, during the intervention we observed that combos generally regarded the liaisons and regular city services as benign. Combos’ main concern was the police. Finally, subsequent data and analysis suggested that gangs may even increase governance in response to increased state presence, especially when there are drug rents to protect (Blattman et al. 2025).

Heterogeneity by Initial State Penetration

Being in the relatively high-state/low-gang subgroup likely reflects a mix of highly visible state presence, low gang visibility, and a decision by the gang to charge protection fees for security. As noted above, we anticipated diminishing returns to investments in state presence and activity, and prespecified heterogeneity analysis using a proxy for initial state penetration.

EXPERIMENTAL DESIGN

We preregistered our design, outcomes, estimation, and heterogeneity analysis. Appendix F.1 of the Supplementary Material summarizes minor deviations and nomenclature changes.

Outcomes

“First-Stage”: State Presence and Activity

To confirm whether the intervention actually resulted in more state presence, the survey measured six “first stage” indicators designed to assess implementation and compliance. In order to be relevant for both treatment and control sectors, these questions were deliberately general—whether residents saw municipal employees in the neighborhood, or knew about and attended city-organized events. Because of the brevity of the survey, these questions do not capture all aspects of state presence, nor do they capture efficacy or overall state penetration.

Primary Outcomes

As discussed above, we prespecified two primary outcomes: perceptions of Relative state–gang legitimacy and Relative state–gang security provision. Endline sample means for the 40 control sectors are reported in Column 6 of Table 1. We also consider absolute levels of legitimacy and security provision. Arguably, absolute levels are more direct and appropriate outcomes for our core hypotheses, even if they were not prespecified.

Additional Security Outcomes

Survey-based measures are valuable but subjective. We supplement them with two sources of administrative data on security: crime reports and emergency calls. First, we created a Sentence-weighted crime index that aggregates all crimes reported in the 20-month intervention period within a 125-meter radius of each sector.⁴ The index ranges from 0 to 1, with crimes weighted by their severity (proxied by sentence length guidelines for each crime). Note that these are formally reported crimes only. In Colombia, reporting requires either traveling up to a kilometer to a station to fill out forms or completing a long form online. Serious violence and major thefts are generally reported, but a majority of petty crimes go unreported (Blattman et al. 2021).
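To make the construction of such an index concrete, the following is a minimal sketch, not the authors' code. The sentence-length weights, the crime categories, and the min-max rescaling to the 0–1 range are all illustrative assumptions; the text states only that crimes are weighted by sentence-length guidelines and that the index ranges from 0 to 1.

```python
# Hypothetical sentence-length weights (in months). These are illustrative
# placeholders, not the sentencing guidelines used in the study.
SENTENCE_MONTHS = {"homicide": 300, "assault": 48, "robbery": 72, "vehicle_theft": 60}


def weighted_crime_score(counts, weights=SENTENCE_MONTHS):
    """Sum of reported crimes in one sector, each weighted by sentence length."""
    return sum(weights[crime] * n for crime, n in counts.items())


def crime_index(sector_counts, weights=SENTENCE_MONTHS):
    """Rescale raw weighted scores to the 0-1 range.

    Min-max scaling is an assumption made for this sketch; the text only
    says that the index ranges from 0 to 1.
    """
    raw = {s: weighted_crime_score(c, weights) for s, c in sector_counts.items()}
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:  # all sectors identical: define the index as 0 everywhere
        return {s: 0.0 for s in raw}
    return {s: (v - lo) / (hi - lo) for s, v in raw.items()}


# Made-up counts of reported crimes for three sectors.
sectors = {
    "A": {"homicide": 1, "assault": 4, "robbery": 2, "vehicle_theft": 0},
    "B": {"homicide": 0, "assault": 2, "robbery": 1, "vehicle_theft": 1},
    "C": {"homicide": 0, "assault": 0, "robbery": 0, "vehicle_theft": 0},
}
index = crime_index(sectors)
```

Because the weights scale with severity, a single homicide in this sketch moves the index more than several assaults, which is the intended property of a severity-weighted measure.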

We also count the total number of Security-related emergency calls to the city’s hotline per sector. The vast majority of calls report a street fight, a case of domestic abuse, concerns about a drug seller, or (more commonly) drug users causing a public disturbance. Thus, these calls are both a measure of disorder as well as a measure of a person’s likelihood of reaching out to the city. All calls are logged and geolocated to an address when the Secretariat of Security or police respond. Virtually all receive a response.

Note that both measures are affected by the probability a resident reports a crime or calls. If the intervention reduced crime and disorder on the street, but also increased the likelihood that people collaborate with the state and report incidents when they do happen, then our estimates will understate the true improvement in order.
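The direction of this bias can be illustrated with a small sketch. It assumes, for illustration only, that reported incidents equal true incidents multiplied by the probability of reporting, so that proportional changes compound; the numbers are hypothetical and not estimates from the study.

```python
def observed_change(true_change, reporting_change):
    """Proportional change in *reported* incidents.

    Assumes reported = true incidents x probability of reporting, so the
    two proportional changes compound multiplicatively.
    """
    return (1 + true_change) * (1 + reporting_change) - 1


# Hypothetical numbers: true disorder falls 30%, reporting propensity rises 20%.
obs = observed_change(-0.30, 0.20)
```

Under these made-up numbers, reports fall by only about 16 percent even though true disorder fell by 30 percent, so the measured effect understates the true improvement.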

Randomization and Estimation

To randomize, we used a matched-pair design, which shows balance along most covariates. Appendix F.2 of the Supplementary Material describes the construction of matched pairs and tests of randomization balance.

We estimate intent-to-treat effects via the simple OLS regression:

$$ Y_{isb}=\beta T_s+\gamma X_s+\alpha_b+\varepsilon_{isb}, \tag{1} $$

where Y is the outcome from survey respondent i in sector s and matched pair b; T is an indicator for random assignment to treatment; X is a vector of the four main baseline indexes; and $ {\alpha}_b $ is a vector of matched pair fixed effects.

There are three main threats to identification. The first is the moderate number of clusters—80, in 40 matched pairs. One problem with clustered robust standard errors (CRSEs) is that they tend to over-reject the null of no effect when the clusters are few in number, especially 30 or fewer (MacKinnon, Nielsen, and Webb 2023). Our sample is above this rule-of-thumb level, but for transparency we report randomization inference (RI) p-values from 10,000 placebo treatments alongside the CRSEs (Gerber and Green 2012). Using RI, the experiment is powered to detect changes roughly 4–10 percent greater than the control mean (Appendix F.3 of the Supplementary Material).
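The logic of randomization inference in a matched-pair design can be sketched as follows. This is a simplified illustration, not the authors' procedure: it works with sector-level outcomes and the mean within-pair difference rather than re-estimating the full regression, and the data are simulated. Function names and the effect size are hypothetical.

```python
import random


def pair_diff_estimate(diffs):
    """Mean within-pair difference (treated minus control)."""
    return sum(diffs) / len(diffs)


def ri_p_value(diffs, n_draws=10_000, seed=0):
    """Two-sided randomization-inference p-value for a matched-pair design.

    Under the sharp null of no effect, swapping which sector in a pair is
    treated only flips the sign of that pair's difference, so each placebo
    assignment is a random pattern of sign flips.
    """
    rng = random.Random(seed)
    observed = abs(pair_diff_estimate(diffs))
    extreme = 0
    for _ in range(n_draws):
        placebo = [d if rng.random() < 0.5 else -d for d in diffs]
        # Small tolerance guards against floating-point ties.
        if abs(pair_diff_estimate(placebo)) >= observed - 1e-12:
            extreme += 1
    return extreme / n_draws


# Simulated data: 40 pairs with a hypothetical effect of 0.05 plus noise.
sim = random.Random(42)
diffs_effect = [0.05 + sim.gauss(0, 0.05) for _ in range(40)]
p_effect = ri_p_value(diffs_effect)
```

The p-value is the share of placebo assignments that produce an estimate at least as large in absolute value as the observed one. It relies only on the known assignment mechanism, not on large-sample approximations, which is why it is a useful complement to clustered standard errors when clusters are few.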

A second threat is potential interference between units, such as spillovers. We designed the intensity and spread of experimental sectors to minimize this risk and find no evidence of such spatial interference. Using our representative city data, we see no evidence that the intervention affected blocks outside the experimental sectors, including control sectors (Appendix F.4 of the Supplementary Material). We are unable to assess non-Euclidean spillovers, but the intervention was conducted in less than 2.5 percent of city blocks, so any effect on general service provision is likely to be small.

Finally, since our primary outcomes come from surveys, we also need to be concerned that citizens under-report gang activities, attenuating estimated treatment effects. Appendix F.5 of the Supplementary Material discusses measurement error, and why it is unlikely to influence our results. We designed a survey experiment and find no evidence of response bias. It confirms our qualitative findings—that combos are a part of everyday life and not systematically stigmatized.

Estimating Heterogeneity

We use the median of each block-pair’s baseline relative governance index to calculate impacts in two subgroups: high-state/low-gang and low-state/high-gang. As discussed below, we examine robustness to a number of alternative proxies for state penetration. We originally committed to split the sample into quartiles, which we discuss briefly below and report in Table G.1 in the Supplementary Material for transparency. With just 10 sectors each in control and treatment, that analysis is under-powered, so we focus here on two subgroups: above- and below-median.

We estimate the OLS regression:

$$ Y_{isb}=\beta T_s+\delta (T_s\times \mathit{Low}_{sb})+\lambda \mathit{Low}_{sb}+\gamma X_s+\alpha_b+\varepsilon_{isb}. \tag{2} $$

Here, Low is an indicator for block-pairs with below-median baseline relative state–gang governance. Thus, $ \beta $ now estimates the program impact in relatively high-state/low-gang rule sectors, $ \delta $ estimates the difference between high- and low-state sectors, and $ \beta +\delta $ is the impact in low-state/high-gang sectors.⁵ We are powered to detect changes roughly 6–15 percent greater than the control mean (Appendix F.3 of the Supplementary Material).
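How the three quantities relate can be seen in a stripped-down sketch. It is an illustration under simplifying assumptions, not the authors' estimator: it uses one outcome per sector, omits the baseline covariates, and estimates each subgroup effect as the mean within-pair difference. The dictionary keys and the data are hypothetical, and pairs exactly at the median are assigned to the high group by assumption.

```python
from statistics import mean, median


def subgroup_effects(pairs):
    """Estimate beta, delta, and beta + delta from matched pairs.

    Each pair is a dict with hypothetical keys:
      'baseline' - the pair's baseline relative state-gang governance
      'treated'  - endline outcome in the treated sector
      'control'  - endline outcome in the control sector
    """
    cutoff = median(p["baseline"] for p in pairs)
    high = [p["treated"] - p["control"] for p in pairs if p["baseline"] >= cutoff]
    low = [p["treated"] - p["control"] for p in pairs if p["baseline"] < cutoff]
    beta = mean(high)            # effect in high-state/low-gang pairs
    beta_plus_delta = mean(low)  # effect in low-state/high-gang pairs
    delta = beta_plus_delta - beta
    return {"beta": beta, "delta": delta, "beta_plus_delta": beta_plus_delta}


# Illustrative, made-up pairs (not the study's data).
pairs = [
    {"baseline": 0.8, "treated": 0.62, "control": 0.55},
    {"baseline": 0.7, "treated": 0.60, "control": 0.57},
    {"baseline": 0.3, "treated": 0.50, "control": 0.52},
    {"baseline": 0.2, "treated": 0.48, "control": 0.50},
]
effects = subgroup_effects(pairs)
```

In this toy example the high-state effect is 0.05, the low-state effect is -0.02, and the interaction term is their difference, -0.07.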

There are two caveats. First, city and community leaders’ reports could be inaccurate or biased. Those politically aligned with the municipal government might speak more favorably of the state’s presence at baseline than the average sector household.

A second and more general limitation of any heterogeneity analysis is the assumption of conditional unconfoundedness. Is it baseline relative state–gang governance itself that drives differing results in the two subgroups, or is it some other sector characteristic that is simply correlated with relative state–gang governance? The state and gangs might choose to rule more in denser or richer neighborhoods, for example. The data suggest that the baseline relative state–gang governance is relatively uncorrelated with most other baseline variables (Appendix C.1 of the Supplementary Material).

This reduces but does not eliminate the risk of confounding. For instance, initial low-state penetration might result from low levels of social organization and political power, reducing a neighborhood’s ability to hold government accountable. Below, we argue that this is better thought of as one mechanism by which low-state penetration and high criminal governance reduce the effectiveness of state interventions, rather than a confounder per se.

RESULTS

“First Stage”: Implementation Quality and Compliance

We collected four forms of quality and compliance data: qualitative observation of sector activities, endline survey data on residents’ perceptions of state presence, administrative data on major activities logged by liaisons, and a post-intervention survey of liaisons. While we find little variation in major events logged, both resident perceptions and liaison reports suggest that day-to-day program activities and broader municipal engagement were poorer in places with low baseline relative state–gang governance.

Qualitative Observation

Our research staff conducted spot visits and qualitative field observations during the first two months of the intervention. Overall, our impression was one of resourceful, enthusiastic, hardworking efforts by skilled young professionals, and a good-faith effort by the city to implement the intervention. Spot visits suggest that liaisons spent several days or evenings per week in their sector, held regular community events, and attempted to meet referral quotas, all within the few blocks they were assigned to. The Secretariat, consistent with the professionalism of the program, monitored liaison performance and replaced under-performers.⁶ We did not observe any pattern between the quality of the liaisons and the types of neighborhoods where they were assigned.

Survey-Based Resident Perceptions

On average, we see no evidence that residents noticed the increase in Alcaldía activity, nor that they attended more events. Column 2 of Table 2 reports average treatment effects on each question as well as a family index, to reduce the number of hypotheses tested. The average change in the overall index is 0.01, roughly 3 percent of the control mean. Only one of the six components is positive and statistically significant using RI p-values—seeing Alcaldía staff in the sector, which rose roughly 8 percent relative to the control mean. Interacting with these staff also rose, by 11 percent relative to the control mean, with a CRSE p-value of 0.083 but an RI p-value of 0.143. These signs are promising, because everyday liaison street presence is probably what citizens should have noticed most.

Table 2. Did Citizens Notice and Participate in Increased State Presence and Activities? Average Treatment Effects and Heterogeneity by Baseline Relative State–Gang Governance

Note: This table reports answers to six Yes/No questions in the survey regarding whether residents noticed municipal employees and events or attended them ($ N=1{,}910 $). Each row is a different dependent variable. Column 1 reports control sector means. Column 2 reports ITT estimates. Columns 3–5 report treatment heterogeneity in sectors above and below the median level of baseline relative state–gang governance, and the difference between the two groups. We report p-values from cluster robust standard error estimation (CRSE) in parentheses and from randomization inference (RI) in brackets. *p < 0.1; **p < 0.05; ***p < .001.

However, the heterogeneity analysis reveals significant variation. In relatively high-state sectors, residents and businesses were dramatically more likely to notice and interact with municipal staff and be aware of and attend community events. Columns 3 and 4 report ITT estimates and randomization inference p-values for each subgroup, and Column 5 reports the difference. In high-state/low-gang governance sectors, residents reported a 16 percent increase in municipal activities and participation; in low-state/high-gang governance sectors, they reported a roughly 12 percent decline. The divergence in the index between the two subgroups is 0.09, equivalent to 27 percent of the control mean. We see this divergence in every component of the index (Column 5).⁷

Administrative Data

The sole administrative data are liaisons’ logs of large-scale and long-term events and activities—a requirement introduced by the city roughly halfway through the intervention. Over 10 months, liaisons recorded roughly 10 major activities per sector per month. (To reduce paperwork burdens, liaisons were not required to log everyday facilitation, referrals, and other activities.) Most of the logged activities fit into three categories: organizing large public events, organizing meetings with police and other public officials and agencies, and helping to resolve major disputes.⁸ In contrast to the resident surveys, liaison reports of major events do not vary by baseline relative state–gang governance. Panel a of Figure 2 plots logged activities against this heterogeneity measure. This suggests that liaisons’ organization of major events was even across sectors.⁹

If municipal staff were less active or visible in the neighborhoods with lower initial state governance, this could be because the liaisons performed fewer unlogged activities, or because other municipal staff did not serve the community adequately or up to expectations. The latter is consistent with our liaison surveys.

Figure 2. How Treatment Experiences Varied by Baseline Relative State–Gang Governance (Treated Sectors Only)

Note: Panel a reports the number of activities liaisons logged and assigned to the relevant sector by levels of baseline relative state–gang governance. Panel b reports the frequency with which liaisons reported that the wider state apparatus failed to deliver on promises. Panel c captures the degree to which the liaison reported that the combo interfered with activities.

Liaison Surveys

Once the intervention ended, we interviewed all liaisons, asking a mix of open-ended and structured questions. We asked them to rate the central government’s follow-through on promised service-delivery, on a 0–1 scale from full compliance to complete failure to deliver. Panel b of Figure 2 plots these responses against our main heterogeneity variable. On average, liaisons rated the Alcaldía’s compliance at roughly 0.33, meaning the state “sometimes” failed to deliver on the requested support, but liaisons in sectors with relatively low-state governance reported failures twice as often.

Our qualitative interviews suggest that the most common problems were failures to deliver on community requests and meet expectations. For example, playgrounds and public architecture often went unrepaired, despite requests. Or, as one liaison explained, “I managed to gather more than 60 people for the Consejo de Convivencia, but no one from the city showed up.” Another liaison reported that the dispute-resolution officer “never came to [this sector] during all the time I was there. And he never gave us an answer to why he did not.”

Several liaisons also reported facing difficulties due to low police quality or responsiveness. “The police have very little credibility,” said one, “I had a police station near my territory and, honestly, I rarely saw patrols come in here.” Another described how they had publicized the new police code—including official guidelines for when citizens should call the police versus civilian security and services agencies—but the residents were frustrated because the police did not follow it reliably. Such responses suggest that heterogeneous “first-stage” survey results could be driven by prior experience of state governance rather than uneven treatment compliance.

Combo Reactions

Finally, we monitored combos’ reactions to and interference with the intervention. Combos customarily monitor newcomers to the neighborhood. Therefore, as expected, almost all liaisons described having to explain their presence to the combo. However, we see no evidence that combo reactions shaped program impacts, with most liaisons reporting combo indifference to their activities. Two-thirds of liaisons reported no interference over the 20 months. The other third mostly said that the local combo was merely watchful, such as observing public events and meetings from a distance.¹⁰ None of the liaisons reported extended harassment, violence, extortion, or bribes.

There is also no evidence of heterogeneous combo response, as illustrated in Panel c of Figure 2. The “combo capture” index aggregates several measures: a scale for the frequency and difficulties of interaction with local gangs, an indicator for whether the gang ever took credit for the intervention, and a set of binary variables for activities by which the gang helped the liaison. Values closer to 1 represent higher involvement from the gang. In general, the average level is low (close to 0.1 on a unit scale), and there is little systematic relationship with initial relative state–gang governance.

This is consistent with what we learned of combos in interviews during and after the intervention. Combos are principally concerned with the police, who interfere with their drug sales and conduct raids and arrests. Generally speaking, the gangs seldom impede—and sometimes welcome—the activities of civilian officials, community leaders, and regular residents.

Treatment Effects on Legitimacy and Security Provision

Table 3 reports the average treatment effects and heterogeneity analysis on our primary outcomes. Overall, the experimental results echo the first-stage results: null effects on average, but positive effects in high-state areas.

Table 3. Program Impacts on Legitimacy and Security Provision: Average Treatment Effects and Heterogeneity by Baseline Governance Quality

Note: The table reports ITT estimates of program impacts and treatment heterogeneity. Each row is a different dependent variable. We report p-values from cluster robust standard error estimation (CRSE) in parentheses and from randomization inference (RI) in brackets. Both households and businesses were surveyed on governance levels ($ N=2{,}379 $), but only households were surveyed on legitimacy and hence there are fewer observations ($ N=1{,}910 $). *p < 0.1; **p < 0.05; ***p < 0.01.

Column 2 reports average treatment effects. We see no signs of significant improvement in legitimacy, and we actually observe a 0.025 decrease in relative state security provision that is weakly significant ($ p=0.068 $) when using CRSEs, though not so when using the more conservative RI p-values ($ p=0.124 $).

Turning to the heterogeneity analysis in Columns 3–5, relative state–gang legitimacy rose by 0.05 in high-state sectors ($ p=0.036 $ using CRSEs and 0.103 using RI). This is equal to 40 percent of the state–combo difference in legitimacy (Column 1) and 9 percent of the average level of absolute state legitimacy (0.57). There is also a small, nonsignificant decrease in legitimacy in low-state sectors. As a result, the difference between the two subgroups is even larger—0.071, equivalent to 12 percent of the city-wide average. We see the same pattern with absolute state legitimacy, with more robust effects.¹¹ These results persist when splitting the sample into quartiles, with some evidence that the backlash is concentrated in the lowest quartile, consistent with the increasing-returns interpretation (Table G.1 in the Supplementary Material).
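As a rough check on these magnitudes, the ratios can be written out directly. This sketch assumes that the “city-wide average” refers to the same 0.57 level of absolute state legitimacy; the implied state–combo gap is back-calculated from the 40 percent figure rather than read from Table 1.

$$ \frac{0.05}{0.57}\approx 0.09,\qquad \frac{0.071}{0.57}\approx 0.12,\qquad \frac{0.05}{0.40}=0.125. $$

The last ratio is the state–combo legitimacy difference that would make a 0.05 increase equal to 40 percent of it.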

For security provision, we see weaker evidence of heterogeneous effects. There is no indication the program increased the frequency with which respondents said the state responded to the 17 forms of disorder—overall or in the high-state subgroup. Perceptions of relative and absolute state security provision decline slightly in all sectors, with no statistically significant difference between high- and low-state sectors.¹²

Finally, Table 3 shows no evidence of program impacts on combo legitimacy or security provision, even in high-state sectors where the state elicited the largest improvements in state legitimacy and decreases in disorder. This is consistent with our qualitative interviews with combo leaders and city officials. Alternatively, it could be that the 20-month intervention was too short or weak to provoke a combo response.

Robustness of Average Treatment Effects

Results are robust to alternative estimation approaches: the omission of both control variables and block-pair dummies, the omission of control variables only, and the addition of demographic traits of survey respondents (Table G.5 in the Supplementary Material). Appendix F.5 of the Supplementary Material also reports formal tests of measurement error, including a survey experiment that uses randomized response techniques to elicit whether sensitive combo-related questions are underreported, and if there is any correlation with treatment status. We find no evidence of bias.

Robustness of Heterogeneous Effects

Our proxy for initial state penetration, baseline relative state–gang governance, is imperfect, so we tested three alternative ways to split the sample. Our results are robust to two of them.

First, we see qualitatively similar results and statistical significance if we divide the sample according to a predicted measure of baseline state security provision (Table G.6 in the Supplementary Material). To construct this, we use extensive baseline data (including administrative data and leader surveys) to train a LASSO algorithm to predict endline absolute state security provision in the 80 experimental sectors. The resulting prediction is a weighted average of all baseline variables that attempts to proxy for absolute rather than relative baseline state governance. Community leader responses are eligible for the LASSO, and many are selected, but a large number of baseline geographic, administrative, and crime variables from our baseline data are predictive as well.

Second, we consider an alternative proxy for initial state penetration: community leaders’ reports of the visibility of police and Mayoral staff on the streets. These baseline questions were not part of our prespecified heterogeneity measure. The heterogeneity results are not robust to this more limited measure (Table G.7 in the Supplementary Material). One possible factor is that, based on just two questions, it has much lower variance than the other heterogeneity measures. In contrast, the predicted measure above includes these two police and Alcaldía measures in the LASSO training set and output, making it, in our view, a more reasonable robustness check.

Finally, we obtain nearly identical estimates if we divide the sample into above/below median governance using an index of baseline relative state–gang governance measures weighted by their principal component loadings rather than the prespecified evenly weighted index (Table G.8 in the Supplementary Material).

Impacts on Crime and Emergency Calls

Tables 4 and 5 report treatment effects on security-related administrative measures. Impacts are again concentrated in initially well-governed sectors.

Table 4. Program Impacts on Crime Index Components: Average Treatment Effects and Heterogeneity by Baseline Governance Quality

Note: The table reports summary statistics and treatment effects for the sentence-weighted crime index in Table 3 and its four main components. The index is standardized to have zero mean and unit standard deviation. We report p-values from cluster robust standard error estimation (CRSE) in parentheses and from randomization inference (RI) in brackets. *p < 0.1; **p < 0.05; ***p < 0.01.

Table 5. Impacts of Treatment on Security-Related Emergency Calls

Note: This table reports the total number of resident calls to the police emergency line over 20 months, including all calls made within each sector plus a 125-meter buffer zone around the sector. We report p-values from cluster robust standard error estimation (CRSE) in parentheses and from randomization inference (RI) in brackets. *p < 0.1; **p < 0.05; ***p < 0.01.

For reported crime, our index falls by 0.14 in initially high-state sectors—a 40 percent decline, significant at the 5 percent level. The divergence between above- and below-median baseline relative state governance sectors is even greater, a 0.16 decline. Proportionally, these are large declines for most crime types—vehicle thefts, other thefts and robbery, and assault. Curiously, however, we see a rise in homicides overall in treated areas. We must treat all index component analyses as suggestive, and we have not adjusted standard errors for multiple hypothesis tests.

Note that these reductions in crime reports are not likely due to measurement error correlated with treatment (e.g., lower reporting of crime in treated communities). Residents in initially well-governed treated sectors view the state as more legitimate, and so if anything should be more willing to report crimes to the state. Moreover, the intervention explicitly educated communities on the police code and facilitated semi-annual meetings between the community and the local police commander, thus making them more familiar with reporting requirements. Indeed, these factors could have increased crime reporting rates in treated sectors, leading to understated treatment effects.

The evidence from security-related calls further suggests that, in high-state sectors, municipal staff or the community itself is either dealing with everyday street disorder without the police or successfully preventing forms of disorder. In high-state treated sectors, calls fall by 63 relative to a control mean of 136—a 45 percent decline, significant at the 5 percent level. There is no evidence of improvement in the low-state sectors. We see this differential across every category of call, except for the very small number of firearm-related altercations (see Column 5). The largest decline (and the only statistically significant component) is in calls regarding unarmed street fights and domestic abuse.

Impacts on a Summary Index of Outcomes

To avoid concerns of multiple hypothesis testing and non-prespecified outcomes, we construct a family index of all four measures—relative legitimacy, relative governance, security-related calls, and reported crime—and test for average and heterogeneous treatment effects (Table G.5 in the Supplementary Material). The results are largely consistent with the patterns discussed above: no evidence of an average treatment effect, but robust evidence of improvements in the high-state areas.
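One common way to build such a family index is to standardize each component against the control group, orient the components so that higher values mean better outcomes, and average them. The sketch below follows that approach as an assumption; the text does not specify the exact construction, and the component names, signs, and data here are hypothetical.

```python
from statistics import mean, pstdev


def family_index(rows, control_rows, signs):
    """Average of control-standardized components, oriented so higher = better.

    rows, control_rows: lists of dicts mapping component name -> value.
    signs: +1 if higher values are better, -1 if lower values are better.
    Assumes every component varies within the control group (nonzero SD).
    """
    stats = {}
    for name in signs:
        values = [r[name] for r in control_rows]
        stats[name] = (mean(values), pstdev(values))
    index = []
    for r in rows:
        z = [signs[n] * (r[n] - stats[n][0]) / stats[n][1] for n in signs]
        index.append(mean(z))
    return index


# Assumed orientation: fewer calls and less reported crime count as improvement.
signs = {"rel_legitimacy": 1, "rel_security": 1, "calls": -1, "crime": -1}

# Made-up sector-level values (not the study's data).
control = [
    {"rel_legitimacy": 0.10, "rel_security": 0.20, "calls": 100, "crime": 0.2},
    {"rel_legitimacy": 0.30, "rel_security": 0.40, "calls": 140, "crime": 0.4},
]
treated = [{"rel_legitimacy": 0.30, "rel_security": 0.30, "calls": 100, "crime": 0.3}]

treated_index = family_index(treated, control, signs)
control_index = family_index(control, control, signs)
```

Collapsing four outcomes into one index means a single hypothesis test replaces four, at the cost of obscuring which component drives any effect.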

DISCUSSION AND CONCLUSIONS

Cities around the world constitute patchworks of high-state penetration and legitimacy abutting areas where the state is weak and mistrusted. In Latin America, the issue takes on special importance, as tens if not hundreds of millions live with some form of criminal governance in addition to that of the state (Uribe et al. 2025).

Why these disparities persist is a puzzle, as is what to do about them. The most common response has been to expand the coercive power of the state—professionalizing police in some cases, while more frequently pursuing militarized and heavy-handed mano dura approaches (Blattman 2024; Flores-Macias and Zarkin 2021). This article examines one city’s attempt to tackle the problem noncoercively. Overall, our results suggest that improving perceptions of security and state legitimacy may be challenging. If a 10-fold increase in municipal attention cannot reliably change the average citizen’s trust and satisfaction in the state within two years, then perhaps generally low investment in urban statebuilding is unsurprising.

Yet our intervention does appear to have been effective where the state was already present and providing relatively good governance compared to criminal organizations. This was offset by null or negative impacts in initially low-state penetration sectors. One likely reason was uneven delivery. Despite a costly and good-faith effort to apply a uniform “dose” of enhanced state presence, Medellín’s street-level and central bureaucrats struggled to deliver in communities where initial penetration was relatively low. If the state raised expectations it then failed to meet, small wonder that residents’ opinions of it plateaued or worsened.

Still, this outcome surprised us; we anticipated that state efforts to enhance presence and service-provision might generate more awareness, impact, and gratitude where it began less present and effective. Across settings, such diminishing returns are all too common: each additional unit of labor, capital, or technology that is applied to, say, production, a public service, or a political campaign has a smaller effect on outcomes.

Instead, our results are consistent with increasing returns—with each additional unit having a larger effect. Increasing returns characterize phenomena ranging from the adoption of specific technologies to broad institutional arrangements (North 1990). Indeed, Pierson (2000) defines path dependence as a product of increasing returns, since the marginal cost of continuing down the current path falls relative to that of switching to alternatives. Increasing returns are thus what make critical junctures critical, shaping historical and political trajectories of all sorts (e.g., Collier and Munck 2022; Robinson 2022). Particularly relevant here, increasing returns help explain the persistent unevenness of economic geography at different scales (Krugman 1991). Moreover, since increasing returns and path dependence can entrench inefficiencies and suboptimal outcomes, it is worth investigating whether and why they exist in statebuilding.

Here, we consider three potential sources for increasing returns to urban statebuilding. The first is start-up costs and barriers to entry, a canonical source of increasing returns in firms and economic development (Arthur 1994). In statebuilding, there would be increasing returns if a minimal level of penetration or capacity were required. For example, low initial penetration could make effective service delivery difficult. There might also be a threshold for resident awareness: the state must achieve some minimum level of presence and efficacy before the average resident reliably notices the activity and changes their beliefs. Conversely, levels of state effort below some threshold could draw attention to inequities and failures, and harm the state’s reputation. This echoes other experimental findings. Gottlieb (2016) argues that a civics education program in Mali raised citizen expectations of politicians and led to greater willingness to sanction leaders. In Liberia, Blair, Karim, and Morse (2019) and Karim (2020) find that policing interventions can lower state and police legitimacy when they raise expectations beyond the capacity to deliver.
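The threshold logic can be made concrete with a minimal sketch. Everything in it is a hypothetical illustration: the function name, the linear response above the threshold, the `disappointment` penalty, and all numeric values are assumptions chosen for exposition, not estimates or methods from the study.

```python
def perceived_change(baseline: float, dose: float, threshold: float,
                     disappointment: float = 0.0) -> float:
    """Stylized change in residents' perception of the state after a dose of presence.

    All arguments are hypothetical illustration values, not study estimates.
    """
    def noticed(presence: float) -> float:
        # Presence below the threshold goes unnoticed; above it, perception rises linearly.
        return max(0.0, presence - threshold)

    after = baseline + dose
    change = noticed(after) - noticed(baseline)
    if after < threshold:
        # Effort that still falls short of the threshold may raise unmet expectations.
        change -= disappointment * dose
    return change


# The same dose applied to a low-baseline and a high-baseline sector (assumed values).
low = perceived_change(baseline=0.2, dose=0.3, threshold=0.6, disappointment=0.5)
high = perceived_change(baseline=0.5, dose=0.3, threshold=0.6, disappointment=0.5)
print(f"low-baseline sector:  {low:+.2f}")
print(f"high-baseline sector: {high:+.2f}")
```

With these assumed values, an identical dose moves perceptions upward only in the sector that starts close enough to the threshold, and moves them downward in the sector that remains below it.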

Second, our measure of low initial relative state–gang governance might reflect neighborhood characteristics (such as low social cohesion or marginalized ethnic or social background) that reduce communities’ capacity for collective action, mobilization, and hence holding officials accountable (Moncada Forthcoming). In this scenario, implementation efforts fail not because bureaucrats and agents are unable to deliver, but because they lack the incentives to do so. When state agents are tasked with delivering services across multiple communities, the more marginalized, disorganized, or low-social-capital areas may fall to the bottom of the priority list. This logic underlies accounts of gun violence in Chicago: among many poor and marginalized neighborhoods, those that receive violence-prevention services and enjoy calm are those with greater social capital and political connections (Vargas 2016).

Finally, criminal governance might crowd out state governance, making statebuilding efforts ineffective until some threshold of relative state–gang governance is reached. Indeed, prior to the intervention, we were concerned that gangs might thwart even nonpolice state activities in “their” territory. The potential need to neutralize or displace an incumbent governing authority (a special case of start-up costs, if you like) is not unique to statebuilding in gang-ruled urban peripheries. Traditional authorities can weaken or deter attempts at governance by states, criminal groups, and rebels alike (e.g., Arjona 2016; Ley, Mattiace, and Trejo 2019). In Medellín, however, we saw no evidence that gangs responded to the Alcaldía’s intervention. This is consistent with studies finding that traditional and non-state authority can be complementary to state governance (Blattman, Hartman, and Blair 2014; Cammett and MacLean 2014; Henn 2021; Van der Windt et al. 2019). While our qualitative evidence suggests that the nonpolice nature of our intervention helped reduce crowding out by gangs, we cannot exclude the possibility here, much less with respect to statebuilding efforts in general.

Whatever their source, increasing returns could help explain a salient feature of contemporary city life: large, persistent disparities in state penetration across neighborhoods, often adjoining ones, even as overall state capacity grows. Increasing returns to localized statebuilding could produce perverse political incentives, leading bureaucrats and elected officials to focus effort and resources in the areas where visible, short-term progress is easiest to achieve. Breaking out of such political “neglect traps” might require larger and more sustained efforts than any one mayor, governor, or president is willing to undertake.

Granted, these are big conjectures—well beyond what a single experiment in 80 neighborhoods of one city can demonstrate. The first community-level randomized evaluation of a nonpolice security intervention will naturally have its limitations. We highlight these hypotheses to motivate future research and large-scale experimentation.

In that light, Operación Convivencia illustrates the viability of community-level experimentation and rigorous evaluation of alternative statebuilding strategies, and its potential contribution to both theory and practice. Its results forced us to reject our working hypotheses and question the theoretical assumptions behind them. This is an essential part of the scientific method, generating new theory and hypotheses for future testing. As the world becomes increasingly urban, uneven state penetration and the potential neglect traps that might drive it are critical avenues for research and policy experimentation.

SUPPLEMENTARY MATERIAL

To view supplementary material for this article, please visit https://doi.org/10.1017/S0003055426101555.

DATA AVAILABILITY STATEMENT

Research documentation and data that support the findings of this study are openly available at the American Political Science Review Dataverse: https://doi.org/10.7910/DVN/G2QBDY.

ACKNOWLEDGEMENTS

For comments, we thank Thomas Abt, Oriana Bandiera, Eli Berman, Robert Blair, Jennifer Doleac, Leopoldo Fergusson, Sara Heller, Max Kapustin, Zoë Gorman, Macartan Humphreys, Jakub Lonski, Raul Sánchez de la Sierra, Jacob Shapiro, Carlos Schmidt-Padilla, Paolo Pinotti, Daniel Ramos-Menchelli, Ernesto Schargrodsky, Maria Micaela Sviatschi, Juan F. Vargas, Elisabeth Wood, and participants at several seminars and conferences. Innovations for Poverty Action coordinated all research activities. For research assistance, we thank Verónica Abril, Bruno Aravena, David Cerero, Peter Deffebach, Felipe Fajardo, Sebastián Hernández, Sofía Jaramillo, Juan F. Martínez, Nelson Matta-Colorado, Juan Pablo Mesa-Mejía, Angie Mondragón, Helena Montoya, Hussein Moussa, José Miguel Pascual, Andrés Preciado, Arantxa Rodríguez-Uribe, Zachary Tausanovitch, Martín Vanegas-Arias, México Vergara, and Saskia Wright. We thank the Secretariat of Security of Medellín for their cooperation, especially the former Secretary of Security Andrés Tobón, as well as Lina Calle and Ana María Corpas.

FUNDING STATEMENT

This research was funded by the U.S. National Science Foundation (NSF) grant 1851543; the U.K. Foreign, Commonwealth & Development Office through the Peace and Recovery Program at Innovations for Poverty Action (IPA) and the Crime and Violence Initiative at J-PAL; the Economic Development and Institutions Programme (EDI) funded with U.K. aid from the U.K. Government, working in partnership with Oxford Policy Management Limited, University of Namur, Paris School of Economics and Aide à la Décision Économique; the Centro de Estudios sobre Seguridad y Drogas (CESED) at Universidad de los Andes; and the PROANTIOQUIA foundation.

CONFLICT OF INTEREST

The authors declare no ethical issues or conflicts of interest in this research.

ETHICAL STANDARDS

The authors declare that the human subjects research in this article was reviewed and approved by The University of Chicago Institutional Review Board under protocols IRB17-1780 and IRB22-0341. The authors affirm that this article adheres to the principles concerning research with human participants laid out in APSA’s Principles and Guidance on Human Subject Research (2020).

Footnotes

¹ Two exceptions are experiments in rural Pakistan (Acemoglu et al. 2020) and Liberia (Blattman, Hartman, and Blair 2014; Hartman, Blair, and Blattman 2021). They show that improvements in courts and dispute resolution increased state legitimacy and security.

² Unlike legitimacy questions, the 17 security questions pooled the police and Alcaldía into one entity—the state—to condense survey length.

³ Columns 5 and 6 of Table 1 compare governance and legitimacy levels in the city and experimental samples. They are similar.

⁴ We chose 125 meters because of our requirement that every sector be at least 250 meters from every other sector (ensuring no overlap). Patterns are qualitatively similar for other radii.

⁵ Table F.2 in the Supplementary Material reports treatment-control balance within the subgroups. Note that the preanalysis plan specified splitting the sample by initial level of criminal governance; this was a misnomer, as we only ever had the relative measure available.

⁶ Within the first eight months, the Secretariat had replaced half of the liaisons with more able personnel.

⁷ Why might we observe a decrease in relatively low-state sectors? The effects are concentrated in community meetings rather than staff visibility. One possibility is that, over 20 months, residents in poorly-governed sectors became aware that events were held, but too late to attend. Alternatively, they may have perceived meetings as less useful and attended fewer of them. Finally, survey responses are inherently subjective, and may simply be capturing residents’ general sentiments toward the liaisons and activities—if expectations went unmet, these sentiments are likely to be negative.

⁸ Appendix D.4 of the Supplementary Material maps activities by treatment and control sectors.

⁹ Appendix D.4 of the Supplementary Material confirms there is no subgroup difference.

¹⁰ The few exceptions mostly affected the early weeks of the intervention. For example, in two sectors, combos initially prevented liaisons from entering; after 2–3 weeks, once the liaisons were able to explain their job and role, they were permitted entry and performed their jobs without interference.

¹¹ Table G.2 in the Supplementary Material breaks down legitimacy into its five component questions and by police and Alcaldía. In high-state sectors, both mayoral and police legitimacy increase.

¹² These patterns hold if we break the governance index into its 17 components and into more and less police-related actions, as reported in Table G.3 in the Supplementary Material. We did not ask about the police and Alcaldía separately, but we can classify the 17 forms of disorder into 8 that are more likely to elicit a police response and 9 that are commonly solved by city actors. The estimates for these groupings are not statistically significantly different from one another. The survey also included a number of supplementary measures of efficacy, including the speed of response, ease of accessing services, and the value placed on the actor. In Table G.4 in the Supplementary Material, we see no evidence that residents perceived an improvement.

REFERENCES

Acemoglu, Daron, Cheema, Ali, Khwaja, Asim I., and Robinson, James A. 2020. “Trust in State and Nonstate Actors: Evidence from Dispute Resolution in Pakistan.” Journal of Political Economy 128 (8): 3090–147. https://doi.org/10.1086/707765.

Arias, Enrique Desmond. 2017. Criminal Enterprises and Governance in Latin America and the Caribbean. New York: Cambridge University Press. https://doi.org/10.1017/9781316650073.

Arjona, Ana. 2016. Rebelocracy. New York: Cambridge University Press. https://doi.org/10.1017/9781316421925.

Arthur, W. Brian. 1994. Increasing Returns and Path Dependence in the Economy. Ann Arbor: University of Michigan Press. https://doi.org/10.3998/mpub.10029.

Beath, Andrew, Christia, Fotini, and Enikolopov, Ruben. 2012. “Winning Hearts and Minds through Development? Evidence from a Field Experiment in Afghanistan.” Working Paper. https://doi.org/10.1596/1813-9450-6129.

Berman, Eli, Felter, Joseph H., Shapiro, Jacob N., and Troland, Erin. 2013. “Modest, Secure, and Informed: Successful Development in Conflict Zones.” American Economic Review 103 (3): 512–17. https://doi.org/10.1257/aer.103.3.512.

Blair, Robert A., and Kalmanovitz, Pablo. 2016. “On the Rights of Warlords: Legitimate Authority and Basic Protection in War-Torn Societies.” American Political Science Review 110 (3): 428–40. https://doi.org/10.1017/S0003055416000423.

Blair, Robert A., Karim, Sabrina M., and Morse, Benjamin S. 2019. “Establishing the Rule of Law in Weak and War-Torn States: Evidence from a Field Experiment with the Liberian National Police.” American Political Science Review 113 (3): 641–57. https://doi.org/10.1017/S0003055419000121.

Blattman, Christopher. 2024. “Bad Medicine: Why Different Systems of Organized Crime Demand Different Solutions.” Working Paper. https://ideas.repec.org/p/osf/socarx/ghcpj.html.

Blattman, Christopher, Duncan, Gustavo, Lessing, Benjamin, and Tobón, Santiago. 2025. “Gang Rule: Understanding and Countering Criminal Governance.” Review of Economic Studies 92 (3): 1497–531. https://doi.org/10.1093/restud/rdae079.

Blattman, Christopher, Green, Donald, Ortega, Daniel, and Tobón, Santiago. 2021. “Place-Based Interventions at Scale: The Direct and Spillover Effects of Policing and City Services on Crime.” Journal of the European Economic Association 19 (4): 2022–51. https://doi.org/10.1093/jeea/jvab002.

Blattman, Christopher, Hartman, Alexandra C., and Blair, Robert A. 2014. “How to Promote Order and Property Rights under Weak Rule of Law? An Experiment in Changing Dispute Resolution Behavior through Community Education.” American Political Science Review 108 (1): 100–20. https://doi.org/10.1017/S0003055413000543.

Blattman, Christopher, Duncan, Gustavo, Lessing, Benjamin, and Tobón, Santiago. 2026. “Replication Data for: State-Building in the City: An Experiment in Civilian Alternatives to Policing.” Harvard Dataverse. Dataset. https://doi.org/10.7910/DVN/G2QBDY.

Cammett, Melani, and MacLean, Lauren M. 2014. The Politics of Non-State Social Welfare. Ithaca, NY: Cornell University Press. https://doi.org/10.7591/9780801470349.

Carter, Danielle. 2013. Non-State Security, State Legitimacy and Political Participation in South Africa. East Lansing: Michigan State University.

Collier, David, and Munck, Gerardo L. 2022. Critical Junctures and Historical Legacies: Insights and Methods for Comparative Social Science. Lanham, MD: Rowman & Littlefield. https://doi.org/10.5040/9798216402695.

Duflo, Esther, and Banerjee, Abhijit. 2011. Poor Economics. New York: Public Affairs.

Flores-Macias, Gustavo A., and Zarkin, Jessica. 2021. “The Militarization of Law Enforcement: Evidence from Latin America.” Perspectives on Politics 19 (2): 519–38. https://doi.org/10.1017/S1537592719003906.

Gambetta, Diego. 1996. The Sicilian Mafia: The Business of Private Protection. Cambridge, MA: Harvard University Press.

Gerber, Alan S., and Green, Donald P. 2012. Field Experiments: Design, Analysis, and Interpretation. New York: W.W. Norton.

Gottlieb, Jessica. 2016. “Greater Expectations: A Field Experiment to Improve Accountability in Mali.” American Journal of Political Science 60 (1): 143–57. https://doi.org/10.1111/ajps.12186.

Gurr, Ted R. 1970. Why Men Rebel. Princeton, NJ: Princeton University Press.

Hartman, Alexandra C., Blair, Robert A., and Blattman, Christopher. 2021. “Engineering Informal Institutions: Long-Run Impacts of Alternative Dispute Resolution on Violence and Property Rights in Liberia.” The Journal of Politics 83 (1): 381–9. https://doi.org/10.1086/709431.

Henn, Soeren J. 2021. “Complements or Substitutes? How Institutional Arrangements Bind Chiefs and the State in Africa.” Working Paper. http://soerenhenn.com/files/Henn_Chiefs.pdf. https://doi.org/10.1017/S0003055422001137.

Herbst, Jeffrey. 2006. “Population Change, Urbanization, and Political Consolidation.” In Oxford Handbook of Contextual Political Analysis, eds. Goodin, Robert and Tilly, Charles, 649–63. New York: Oxford University Press.

Kalyvas, Stathis N. 2006. The Logic of Violence in Civil War. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511818462.

Karim, Sabrina. 2020. “Relational State Building in Areas of Limited Statehood: Experimental Evidence on the Attitudes of the Police.” American Political Science Review 114 (2): 536–51. https://doi.org/10.1017/S0003055419000716.

Krasner, Stephen D., and Risse, Thomas. 2014. “External Actors, State-Building, and Service Provision in Areas of Limited Statehood.” In Domestic Politics and Norm Diffusion in International Relations, ed. Thomas Risse, 197–218. London: Routledge. https://doi.org/10.4324/9781315623665-10.

Krugman, Paul. 1991. “History and Industry Location: The Case of the Manufacturing Belt.” American Economic Review 81 (2): 80–3.

Leeds, Elizabeth. 1996. “Cocaine and Parallel Polities in the Brazilian Urban Periphery: Constraints on Local-Level Democratization.” Latin American Research Review 31 (3): 47–83. https://doi.org/10.1017/S0023879100018136.

Levi, Margaret, Sacks, Audrey, and Tyler, Tom. 2009. “Conceptualizing Legitimacy, Measuring Legitimating Beliefs.” American Behavioral Scientist 53 (3): 354–75. https://doi.org/10.1177/0002764209338797.

Ley, Sandra, Mattiace, Shannan, and Trejo, Guillermo. 2019. “Indigenous Resistance to Criminal Governance.” Latin American Research Review 54 (1): 181–200. https://doi.org/10.25222/larr.377.

MacKinnon, James G., Nielsen, Morten Ørregaard, and Webb, Matthew D. 2023. “Cluster-Robust Inference: A Guide to Empirical Practice.” Journal of Econometrics 232 (2): 272–99. https://doi.org/10.1016/j.jeconom.2022.04.001.

Mampilly, Zachariah Cherian. 2012. Rebel Rulers: Insurgent Governance and Civilian Life during War. Ithaca, NY: Cornell University Press.

Melnikov, Nikita, Schmidt-Padilla, Carlos, and Sviatschi, Maria Micaela. 2025. “Gangs, Labor Mobility, and Development.” Econometrica 93 (6): 2083–121. https://doi.org/10.3982/ECTA21305.

Mnookin, Robert H. 1998. “Alternative Dispute Resolution.” Working Paper.

Moncada, Eduardo. Forthcoming. Citizens, Criminals, and Claim-Making in Latin America. Cambridge: Cambridge University Press.

North, Douglass C. 1990. Institutions, Institutional Change, and Economic Performance. New York: Cambridge University Press. https://doi.org/10.1017/CBO9780511808678.

Pierson, Paul. 2000. “Increasing Returns, Path Dependence, and the Study of Politics.” American Political Science Review 94 (2): 251–67. https://doi.org/10.2307/2586011.

Robinson, James A. 2022. “Critical Junctures and Developmental Paths: Colonialism and Long-Term Economic Prosperity.” In Critical Junctures and Historical Legacies, eds. Collier, David and Munck, Gerardo L., 53–66. Lanham, MD: Rowman & Littlefield.

Scott, James C. 2009. The Art of Not Being Governed: An Anarchist History of Upland Southeast Asia. New Haven, CT: Yale University Press.

Skaperdas, Stergios. 2001. “The Political Economy of Organized Crime: Providing Protection When the State Does Not.” Economics of Governance 2 (3): 173–202. https://doi.org/10.1007/PL00011026.

Skarbek, David. 2011. “Governance and Prison Gangs.” American Political Science Review 105 (4): 702–16. https://doi.org/10.1017/S0003055411000335.

Staniland, Paul. 2012. “States, Insurgents, and Wartime Political Orders.” Perspectives on Politics 10 (2): 243–64. https://doi.org/10.1017/S1537592712000655.

Tilly, Charles. 1990. Coercion, Capital, and European States, AD 990-1990. Oxford: Blackwell.

Uribe, Andres, Lessing, Benjamin, Schouela, Noah, and Stecher, Elayne. 2025. “Criminal Governance in Latin America: Prevalence and Correlates.” Perspectives on Politics: 1–19. https://doi.org/10.1017/S1537592725101849.

Van der Windt, Peter, Humphreys, Macartan, Medina, Lily, Timmons, Jeffrey F., and Voors, Maarten. 2019. “Citizen Attitudes toward Traditional and State Authorities: Substitutes or Complements?” Comparative Political Studies 52 (12): 1810–40. https://doi.org/10.1177/0010414018806529.

Vargas, Robert. 2016. Wounded City: Violent Turf Wars in a Chicago Barrio. New York: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780190245900.001.0001.

Weil, David N. 2008. Economic Growth, 2nd edition. Boston, MA: Addison-Wesley.

Table 1. State and Combo Legitimacy and Security Provision, Barrio Survey Averages, 2019

Figure 1. Relative State–Gang Security Provision by Barrio, and Location of Experimental Sectors. Note: We average the 4,598 responses in the 2019 survey by barrio. Red indicates that the combo responds more to disputes and disorder, and blue indicates the state.

Table 2. Did Citizens Notice and Participate in Increased State Presence and Activities? Average Treatment Effects and Heterogeneity by Baseline Relative State–Gang Governance

Figure 2. How Treatment Experiences Varied by Baseline Relative State–Gang Governance (Treated Sectors Only). Note: Panel a reports the number of activities they logged and assigned to the relevant sector by levels of baseline relative state–gang governance. Panel b reports the frequency with which liaisons reported that the wider state apparatus failed to deliver on promises. Panel c captures the degree to which the liaison reported that the combo interfered with activities.

Table 3. Program Impacts on Legitimacy and Security Provision: Average Treatment Effects and Heterogeneity by Baseline Governance Quality

Table 4. Program Impacts on Crime Index Components: Average Treatment Effects and Heterogeneity by Baseline Governance Quality

Table 5. Impacts of Treatment on Security-Related Emergency Calls
