Shadows of Integrity in Public Administration: Survey Experiments on Ethics and Corruption

Luis Garrido-Vergara

doi:10.1017/9781009606554

1 Introduction

In this Element, shadows of integrity is used as a conceptual device to capture the micro-level conditions under which formally codified ethical norms lose their practical force in everyday bureaucratic decision-making. Rather than treating corruption as isolated deviant behaviour, the concept foregrounds the situational configurations – discretion, ambiguity, organizational signals, and incentive structures – through which ethical reasoning becomes contingent. This framing provides a unifying lens for the experimental protocols developed in the remainder of the Element, which are designed to systematically vary precisely these conditions.

Modern public administration is normatively anchored in an ideal-type Weberian ethos that emphasizes legality, meritocracy, impartiality, and discipline in the exercise of public authority (Bartels, 2009; Ferreira and Serpa, 2019; Udy, 1959). Bureaucratic organizations are expected to operate according to formal rules and professional standards that distinguish public service from private interest and sustain expectations of integrity, fairness, and accountability (Byrkjeflot, 2018; Garrido-Vergara and Cienfuegos, 2025; Graycar, 2020). Yet practice persistently differs from this normative ideal. Public servants routinely work in environments characterized by discretion, time pressures, informational asymmetries, and often competing organizational and political incentives (Halachmi and Holzer, 1993). In this context, while discretion is an essential component of administrative action – enabling bureaucrats to adapt formal rules to complex and evolving contexts – it simultaneously opens spaces in which ethical standards may be applied unevenly or interpreted selectively (Pliscoff-Varas and Lagos-Machuca, 2021).

Informational asymmetries between public officials and citizens, as well as between different levels of the administrative hierarchy, further complicate accountability and monitoring, reducing the visibility of decisions and increasing opportunities for misconduct (Lubk, 2017). Under such conditions, compliance with formal rules and ethical codes is frequently mediated by situational judgements, rather than governed by consistently enforced norms (Geuras and Garofalo, 2010; Perlman et al., 2023). It is in these moments – when formal norms collide with situational constraints and moral ambiguity – that integrity becomes fragile and the conditions for unethical or corrupt behaviour emerge. Faced with ambiguous rules, conflicting expectations, or perceived organizational tolerance, public servants may come to rationalize actions that deviate from ethical standards, particularly when personal, professional, or organizational incentives appear to outweigh the anticipated costs of misconduct (Garrido-Vergara, 2024).

These dynamics do not imply a wholesale rejection of public service values; rather, they illuminate what this Element conceptualizes as the shadows of integrity in public administration – the contingent and context-dependent spaces in which ethical behaviour becomes fragile within bureaucratic settings. Understanding corruption in public administration therefore requires close attention to the micro-level conditions under which ethical commitments are strained, negotiated, or ultimately overridden in everyday administrative practice.

Conceiving integrity in this way shifts the analytical focus away from isolated acts of wrongdoing and towards the conditions under which ethical reasoning and behaviour are shaped within public organizations. If integrity failures emerge in the shadows created by discretion, ambiguity, and competing incentives, then understanding corruption requires examining how bureaucrats perceive ethical norms, the extent to which they are motivated by public values within their organization, and how they assess risks and rewards and interpret organizational signals in concrete decision-making contexts. This perspective foregrounds the role of beliefs, expectations, and attitudes as key mediating mechanisms between institutions, norms (formal and informal), and individual conduct (Garrido-Vergara and Quijada Donaire, 2025), raising a central empirical challenge for public administration research: how to observe and explain these micro-level dynamics systematically and rigorously.

Does access to information about ethical behaviour in the state influence the beliefs/attitudes of people working in public administration? A significant body of recent work has shown that corruption in the public sector is a complex, multifaceted phenomenon that is very difficult to prevent (Acar, 2016; Arellano-Gault, 2019; Graycar, 2020; Mugellini and Markwalder, 2022; Sulitzeanu-Kenan et al., 2022; Torsello, 2016). Among other reasons, when those who perform public functions decide to act unethically, the potential benefits they perceive are, in most cases, tied to contingent situations that shape their expectations about such behaviour. In other words, the opportunity to profit through an action that provides a personal benefit – ‘additional’ or ‘abnormal’ compared to what is strictly set out in contractual obligations – can serve as a motivating driver for individuals to engage in behaviour that undermines the ethos of public service.

While corruption is often examined in political science as a phenomenon involving political parties, elected officials, and/or high-level policymakers (Heywood, 1997; Monteduro et al., 2016; Rose-Ackerman and Truex, 2012), this Element focuses specifically on administrative corruption, defined as unethical and/or corrupt practices occurring within public bureaucracies in the exercise of administrative authority (Garrido-Vergara, 2024; Graycar, 2020; Mugellini and Markwalder, 2022). More broadly, political corruption encompasses practices such as campaign finance irregularities, vote-buying, legislative capture, or executive misconduct and is commonly analysed through theoretical lenses emphasizing electoral competition (Birch, 2011), political accountability (Philp, 2001), and elite incentives (Rose-Ackerman, 2015). By contrast, corruption in the field of public administration unfolds within organizational settings governed by formal rules, bureaucratic norms, standards of administrative responsibility, and hierarchical authority, where discretion, public service motivation, organizational behaviour, organizational culture, and decision-making play a central role (Graycar, 2020).

Since public servants face ethical dilemmas that differ systematically from those encountered by political actors, it is analytically essential to distinguish between administrative and political corruption. The mechanisms, incentives, motivations, expected roles, and organizational contexts shaping bureaucratic behaviour operate in ways different from those governing political decision-making, warranting separate analytical and methodological treatment (Garrido-Vergara, 2024; Weißmüller and Zuber, 2023). This Element therefore concentrates on corruption as an administrative phenomenon, paying particular attention to the beliefs, attitudes, motivations, incentives, and contextual conditions that shape ethical behaviour within public organizations.

Why, then, use experiments to study corruption in public administration? Because administrative corruption manifests itself in individual decisions shaped by routines, discretion, beliefs, motivations, and organizational patterns, survey experiments offer a particularly suitable methodological approach. By systematically manipulating specific contextual cues while holding other factors constant, survey experiments allow researchers to isolate the causal mechanisms that shape ethical judgements and behaviour among public servants. In addition, experimental designs reduce social desirability bias and permit the study of ethically sensitive scenarios in controlled yet realistic environments. This approach allows researchers to capture the micro-level dynamics of ethical behaviour in bureaucratic settings more accurately.
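The logic of random assignment described above can be illustrated with a minimal Python sketch. It assumes a hypothetical two-arm design (a control vignette and the same vignette with an added audit cue) and a hypothetical outcome scale on which higher scores mean greater acceptance of the unethical act. The function names, condition labels, and data are illustrative assumptions, not part of any protocol in this Element.

```python
import random
import statistics


def assign_conditions(respondent_ids, conditions, seed=0):
    """Randomly assign each respondent to exactly one vignette condition."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    return {rid: rng.choice(conditions) for rid in respondent_ids}


def difference_in_means(outcomes, assignment, treated, control):
    """Mean outcome under `treated` minus mean outcome under `control`."""
    treated_scores = [outcomes[r] for r, c in assignment.items() if c == treated]
    control_scores = [outcomes[r] for r, c in assignment.items() if c == control]
    return statistics.mean(treated_scores) - statistics.mean(control_scores)


# Hypothetical responses (1 = unacceptable, 5 = acceptable) from four respondents,
# with a hand-written assignment so the arithmetic is easy to follow.
demo_outcomes = {1: 4, 2: 2, 3: 5, 4: 3}
demo_assignment = {1: "control", 2: "audit_cue", 3: "control", 4: "audit_cue"}
demo_effect = difference_in_means(demo_outcomes, demo_assignment, "audit_cue", "control")
print(demo_effect)  # -2.0
```

Because the cue is assigned at random, the two groups are comparable in expectation, so the difference in mean responses can be read as the effect of the cue. A real study would add an uncertainty estimate, which this sketch omits.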

Corruption in public administration involves the illegal use of public power, management, and state resources for personal benefit or for the benefit of groups seeking to act against the state’s interests (Sulitzeanu-Kenan et al., 2022). This is why such acts can be observed in states regardless of their institutional design, developmental capacities, power, or level of economic growth (Graycar and Monaghan, 2015). In this context, why speak of the ‘shadows of integrity’ in public administration? The main reason is that bureaucrats are constantly exposed to situations that generate dilemmas between their normative commitments and their practical attitudes (Garrido-Vergara and Quijada Donaire, 2025). Such tensions may, at times, lead to behaviours that erode ethical standards or even amount to overt instances of corruption within the public sector.

Corruption in the public sector is highly complex and has multiple causes and determinants. Among the micro-foundations of this type of behaviour, three typologies can be identified: (i) the importance of ethical awareness and formal normative knowledge, (ii) the consideration of behavioural beliefs and determinants of individual self-control, and (iii) the study of individuals’ socio-normative beliefs and the factors that condition control of human behaviour within the organization (Garrido-Vergara, 2024).

In public administration, corruption involves acts and behaviours that harm the state’s actions by abusing powers inherent in public authority for private gain (Pozsgai-Alvarez, 2020; Sampford et al., 2016). In this context, the relationship between ethics and public sector management is expressed in how codes of conduct – shaped by principles and values inherent to the nature of state action – are formulated and implemented, and how these regulations determine both collective and individual behaviours related to corruption and integrity in public service (Arellano-Gault, 2019; Pliscoff-Varas, 2019). This is closely related to factors affecting public service motivation (Christensen and Wright, 2018; Wright et al., 2013; Wright and Christensen, 2010), a key dimension for analysing how prone individuals are to breaching ethical standards in public administration. It is worth noting here that, despite the growing literature on state corruption (Graycar, 2020), field research contributions and case studies remain scarce (Torsello, 2016), particularly in Latin America.

In the public sector, studies of corruption and unethical behaviour (Frederickson, 1993; Graycar, 2020) have shown that these issues most frequently arise in administrative and management systems characterized by high levels of non-transparency, weak legal enforcement, low salary incentives, and the absence of ethics code promotion (Pliscoff-Varas and Lagos-Machuca, 2021), as well as by cultural factors expressed in norms or social values that tolerate or even encourage such behaviour and practices (Torsello, 2016).

Corruption has gained notable importance as a subject of study in the field of public administration due to its persistence over time and its negative effects on the state’s role in society. Corruption in the public sector adversely affects economic growth and generates significant social costs (Garrido-Vergara and Cienfuegos, 2025), particularly regarding trust in the public sector. It also weakens democracy and leads to political instability (Sulitzeanu-Kenan et al., 2022) by significantly undermining political and institutional trust (Arellano-Gault, 2019; Rotberg, 2018).

Moreover, corruption and unethical behaviour affect motivation in the public sector and its capacity for agency (Christensen and Wright, 2018; Wright and Christensen, 2010), altering individual behaviours by increasing the likelihood of dishonest (Olsen et al., 2019) and individualistic actions at odds with collective objectives at the organizational and institutional levels (Sulitzeanu-Kenan et al., 2022). In addition, dishonest behaviours in the public sector reduce social capital by heightening distrust and the likelihood of empathy-deficient actions, thereby affecting the creation of collective bonds (Kumasey and Hossain, 2020).

One of the major challenges in public administration relates to how public servants conceive and apply what they believe is ‘right’ in their roles. This clash between reality and the rhetoric underlying public service represents a key analytical problem for understanding and confronting corruption and unethical behaviour (Ackerman and Coogan, 2013; Søreide and Rose-Ackerman, 2018). It challenges the traditional view that sees such issues merely as matters requiring reactive or punitive interventions, rather than focusing on the development of institutional, behavioural, and practical capacities to guide the actions of public servants.

Legal systems impose penalties – an important measure to combat corruption – but do not necessarily foster preventive capacities. People are seldom deterred from committing an offence unless a powerful disincentive makes the consequences of their actions salient. This point is even more relevant in the public sector, where, in many cases, the gains associated with unethical behaviour can seem highly attractive, ultimately overshadowing a person’s ethical standards. In other words, even when a public servant has some awareness that an act violates ethical standards and constitutes corruption, incentives can ultimately lead them to act corruptly. Most studies on this topic conclude that a root cause of this issue emerges when individual benefit is placed above the public good. Among other consequences, this generates complex problems for trust and the functioning of public services, eventually weakening both governance and the quality of democracy.

It is on this premise that this Element focuses on one of the most recent challenges faced by studies of corruption in the public sector: how can we develop survey experiments in the public sector to analyse the causal factors that help us understand this problem? While recent studies have addressed this emerging research agenda in public administration (Christensen and Wright, 2018; Clifford et al., 2021; James et al., 2017), most agree that the use of surveys still poses significant challenges, particularly regarding the drivers that encourage or inhibit unethical behaviour. Further complicating matters, administrative cultures differ across countries, raising the question of how to address barriers that significantly affect the feasibility of conducting such studies.

Nevertheless, the implementation of rigorous methodological designs and advanced econometric techniques in data analysis can help overcome many of these obstacles, bringing us closer to understanding the root causes of this phenomenon. This Element aims to do just that: to analyse recent disciplinary contributions and propose ways to design and implement experiments that facilitate such studies in the public sector. In doing so, it seeks to provide readers with tools to study the factors that lead to corrupt behaviours such as bribery, nepotism, fraud, and other types of conduct involving co-optation or violation of the fundamental ethics of public service (Garrido-Vergara, 2024).

A central motivation for advancing experimental approaches in public administration research stems from the methodological limitations inherent in interview-based studies. While interviews have long provided valuable insights into bureaucratic behaviour and institutional processes, they often suffer from a significant source of bias: interviewees tend to interpret the interview setting as an opportunity for organizational self-portrayal and self-assurance. Rather than revealing latent dilemmas, contradictions, or deviant practices, participants frequently reproduce normative expectations and idealized accounts of public service (Grohs et al., 2016, p. 157). This tendency can obscure the informal rules, behavioural inconsistencies, and ethical trade-offs that shape administrative practice. Experimental designs, by contrast, allow researchers to observe actual behaviour in controlled or naturalistic settings, minimizing social desirability bias and enhancing the validity of inferences about how public officials make decisions under real-world constraints.

Despite growing interest in the study of corruption and behavioural ethics in public administration, the field continues to face significant methodological limitations, most notably the scarcity of data and research designs that permit robust causal inference (James et al., 2017; Margetts, 2011). In this context, experimental approaches provide a reliable methodological alternative for studying corruption in public administration, precisely because corrupt and unethical behaviour is difficult to describe, observe, and measure in real-world settings due to its illicit and socially sensitive nature (Anechiarico and Jacobs, 1994). Individuals are often unwilling to acknowledge unethical behaviour and, in some cases, may not even recognize that their actions deviate from normative standards (Belle and Cantarelli, 2017).

Experimental designs help address these challenges by allowing researchers to control for social desirability bias and to systematically examine how individuals respond to specific incentives and ethically sensitive scenarios (Christensen and Wright, 2018; Margetts, 2011). By eliciting reactions under controlled conditions, experiments make it possible to assess whether and how public servants would behave ethically in situations that mirror real administrative dilemmas. In this sense, experimental studies complement existing research based on descriptive evidence, correlational analyses, or qualitative accounts which, while valuable, offer limited leverage for identifying the causal mechanisms underlying ethical and unethical behaviour.

By combining experimental control with high internal validity, survey experiments in the study of corruption in public administration allow researchers to manipulate information and construct scenarios that permit the identification of causal effects, while remaining attentive to the organizational and institutional realities of public administration. This balance between realism and control makes survey experiments particularly well suited to the study of corruption in bureaucratic settings, where field experimentation is often impractical and observational data offer limited causal leverage.

The remainder of this Element is structured as follows. Section 2 reviews experimental approaches in public administration and situates survey experiments within the broader theoretical literature on corruption, ethics, and integrity. Section 3 develops research protocols for the design and implementation of survey experiments in public sector contexts, with particular attention to ethical considerations and feasibility constraints. Section 4 outlines future research avenues and discusses the implications of experimental approaches for the study of integrity in public administration. The concluding section synthesizes the main insights and reflects on the contribution of survey experiments to advancing theory and evidence in corruption research.

2 Experiments in Public Administration

2.1 Between Rules and Discretion: The Literature on Corruption in Public Administration

Corruption in public administration is a multidimensional phenomenon that challenges the foundational principles of legality, impartiality, and meritocracy in the state. It is commonly understood as the abuse of public office for private gain – a definition widely adopted by global institutions such as the World Bank, the OECD, Transparency International, and the United Nations Office on Drugs and Crime (Heidenheimer and Johnston, 2017; Marquette, 2001; Rose-Ackerman, 1997). As a global governance concern, corruption is deeply linked to problems of inequality, state capacity, institutional trust, and democratic legitimacy (Mauro, 1995; Rothstein and Uslaner, 2005).

Importantly, corruption does not only manifest as individual wrongdoing; it may also be systemic, emerging from entrenched institutional practices, informal rules, and organizational cultures that normalize misconduct and weaken accountability (Johnston, 2005; Rothstein, 2011). In such contexts, corruption becomes embedded in the day-to-day operation of public administration, enabling state capture and eroding the quality and fairness of public service delivery (Hellman et al., 2000; Lipsky, 2010; Schedler, 2006).

It is also necessary to distinguish between unethical and corrupt behaviour. While all corruption is unethical, not all unethical conduct necessarily qualifies as corruption in the legal sense (Lindgreen, 2004; Menzel, 2015). Practices like nepotism, conflict of interest, or administrative inefficiency may fall into ethically grey areas that defy simple categorization. As Heidenheimer and Johnston (2017) argue, perceptions of corruption vary culturally and institutionally, and can be differentiated into three types:

Black: Practices widely condemned both legally and ethically (e.g., bribery, extortion)
Grey: Ambiguous practices that may be legal but are ethically contested (e.g., nepotism, lobbying)
White: Practices that are tolerated or normalized despite ethical concerns (e.g., use of public resources for minor personal benefit)

This typology underscores the importance of considering social perceptions, norms, and organizational consensus when analysing corruption. It also reinforces the need to move beyond purely legalistic or punitive definitions towards frameworks that take into account the contextual, institutional, and behavioural dimensions of misconduct in the public sector.

In this context, three dominant theoretical approaches have emerged as particularly influential in public administration scholarship: the principal–agent model, collective action theory, and behavioural ethics. Each addresses the corruption phenomenon from a distinct analytical vantage point, focusing respectively on formal incentives, shared expectations, and psychological mechanisms. Together they provide a comprehensive framework for understanding integrity failures in bureaucratic settings. These three were selected not only for their prominence in the academic literature but also because they offer testable propositions and practical implications that lend themselves to survey-based experimental designs, which are the focus of this Element.

The following section presents these approaches in detail, identifying their core assumptions and relevance for the study of corruption, and highlighting how each can inform the design and interpretation of experiments in public administration.

To understand corruption within the discipline of public administration, it is necessary to take into account a range of factors related to conceptual ambiguities and the broad array of actions that may constitute behaviour affecting ethics in public service (Garrido-Vergara, 2024; Graycar, 2020; Meyer-Sahling et al., 2019; Schuster et al., 2024). In the public sector, bureaucrats are constantly exposed to different dilemmas, particularly in light of Weber’s classic principle of the iron cage, where rationalization and bureaucratization subject them to systems based on efficiency, rational calculation, and control, thus limiting freedom and creativity (Mitzman and Coser, 2002). At this fundamental level, dilemmas arise regarding beliefs versus the attitudes that bureaucrats may develop in specific situations. It is in these dilemmas that the shadows of integrity take shape, and with them the conditions for the abuse of public office for private gain.

Corruption is a complex phenomenon that depends on specific contexts and situations, and its manifestation varies according to different types of organizational culture, institutional design, and political regime (Méndez and Sepúlveda, 2006). Some practices considered unethical or corrupt in one system may be ‘culturally’ normalized in another administrative setting, while being formally regulated and sanctioned elsewhere. This is the case, for example, of the use of bribes or informal favours to influence decisions or administrative procedures (Caputo et al., 2025; Graycar and Jancsics, 2017). For this reason, it is often difficult to establish clearly the legal and normative boundaries that define when and where corruption or ethical misconduct occur in the state. Tackling corruption therefore requires not only definitional clarity but also a robust theoretical understanding of its drivers and institutional dynamics (Graycar, 2020).

Despite these challenges, the past three decades have seen significant advances in the study of corruption in public administration (see Table 1). Scholars have developed a range of conceptual frameworks to explain why corruption occurs in bureaucratic settings and how it might be reduced (Mugellini and Markwalder, 2022; Søreide and Rose-Ackerman, 2018). Research on corruption in public administration has been shaped by three main frameworks: principal–agent models (Groenendijk, 1997), collective action perspectives (Persson et al., 2013), and behavioural or normative approaches (Cerrillo i Martínez and Ponce, 2017). Each framework advances distinct assumptions about the origins of corruption and the mechanisms shaping unethical behaviour. Clarifying these assumptions is essential for both theoretical coherence and experimental inquiry.

Table 1 Corruption Theories

Theory	Core idea	Key authors	Assumptions	Implications for policy and research in public administration
Principal–agent	Corruption occurs when agents (bureaucrats) pursue self-interest instead of acting on behalf of the principal (citizens or elected officials).	Klitgaard, 1991; Rose-Ackerman, 1986	Agents are opportunistic and rational; corruption is more likely when monitoring is weak and sanctions are low.	Emphasizes institutional control: improve oversight, increase transparency, reduce information asymmetries, and raise detection probabilities.
Collective action	Corruption is a coordination problem where unethical behaviour is sustained because everyone expects others to be corrupt.	Mungiu-Pippidi, 2023; Persson et al., 2013; Rothstein, 2011	Norms of corruption become self-reinforcing; individual integrity is insufficient if systemic incentives remain unchanged.	Focus on changing norms and expectations: build broad coalitions for reform, address systemic trust deficits, and promote civic engagement.
Behavioural ethics	Corruption stems from bounded ethical reasoning, cognitive biases, and context-dependent moral behaviour.	Christensen and Wright, 2018; Gino et al., 2009; Tenbrunsel and Messick, 2004	Individuals often fail to recognize ethical dilemmas, or they rationalize misconduct; behaviour is shaped by framing, cues, and identity.	Use behavioural interventions: ethical nudges, professional identity priming, training, and framing effects in communication.

Source: Compiled by the author.

Principal–agent models conceptualize corruption in public administration as a problem that emerges from delegation, discretionary practices, asymmetric information, and misaligned incentives between principals (i.e., authorities, citizens) and agents, namely public servants (Lane, 2020). Principal–agent approaches typically assume that bureaucratic behaviour is driven by rational self-interest and responsiveness to incentives, and that the effectiveness of monitoring and sanctioning mechanisms is a key determinant of action. From this perspective, corruption arises when agents leverage informational asymmetries and insufficient oversight to advance private interests in ways that undermine public objectives (Rose-Ackerman, 2008; Søreide and Rose-Ackerman, 2018). This logic provides a strong justification for studying corruption through survey experiments. By simulating realistic administrative scenarios and systematically controlling specific contextual factors, survey experiments allow researchers to isolate and examine how variations in incentive structures shape ethical judgements and behavioural intentions among public servants under controlled conditions.

This approach is particularly well suited to testing mechanisms related to incentives, detection probabilities, and anticipated sanctions, all of which lend themselves naturally to survey experimental designs. In such designs, researchers systematically manipulate information about monitoring, enforcement, or expected consequences and compare responses across otherwise identical scenarios.
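One way to operationalize this kind of manipulation is a factorial vignette, in which each respondent reads a scenario assembled from one level of every manipulated factor. The Python sketch below is a minimal illustration under stated assumptions: the two factors (monitoring and sanction), their wording, and the function names are hypothetical and are not drawn from the protocols in this Element.

```python
import itertools
import random

# Hypothetical factor levels; the wording is illustrative only.
FACTORS = {
    "monitoring": [
        "Decisions in this unit are rarely audited.",
        "Every decision in this unit is audited.",
    ],
    "sanction": [
        "Breaches usually lead to a written warning.",
        "Breaches usually lead to dismissal.",
    ],
}


def build_vignettes(factors):
    """Return one vignette text per combination of factor levels (full factorial)."""
    names = list(factors)
    level_ranges = [range(len(factors[name])) for name in names]
    vignettes = {}
    for cell in itertools.product(*level_ranges):
        # `cell` holds one level index per factor, e.g. (0, 1).
        vignettes[cell] = " ".join(
            factors[name][level] for name, level in zip(names, cell)
        )
    return vignettes


def balanced_assignment(respondent_ids, cells, seed=0):
    """Shuffle respondents, then deal them to cells in turn.

    Cell sizes differ by at most one respondent.
    """
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)
    return {rid: cells[i % len(cells)] for i, rid in enumerate(ids)}


vignettes = build_vignettes(FACTORS)
assignment = balanced_assignment(range(10), list(vignettes), seed=3)
```

With two factors of two levels each, the sketch yields four scenarios that differ only in the manipulated sentences, so differences in responses across cells can be attributed to monitoring, to sanctions, or to their combination.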

In contrast to principal–agent approaches, which conceptualize corruption primarily as an individual deviation, collective action perspectives emphasize the relational and systemic nature of corruption in public administration. Rather than attributing unethical behaviour to opportunistic individuals, collective action theory views corruption as a self-reinforcing equilibrium sustained by social norms, shared expectations, social structures, and mutual beliefs about others’ behaviour (Persson et al., Reference Persson, Rothstein and Teorell2013).

According to this approach, bureaucrats may engage in – and in some cases legitimize – corrupt or unethical practices based on the perception that such behaviour is widespread within the organization and therefore rational or unavoidable. In other words, if ‘everyone does it’, individuals may conclude that there is little reason to act differently. Under these conditions, ethical behaviour can come to be perceived as naïve, professionally risky, or even negatively evaluated by peers, reinforcing the persistence of corrupt practices through shared expectations rather than individual preference. The key theoretical implication is that unethical behaviour and corrupt practices can persist even in the presence of strong laws and sanctions if collective expectations remain unchanged. For survey experimental research, this framework is especially relevant since it highlights norm perceptions, beliefs, and expectations about peers’ behaviour as central causal mechanisms. Survey experiments are well suited to testing these mechanisms by manipulating information about social norms, prevalence cues, or institutional trust, thereby allowing researchers to examine how changes in perceived collective behaviour shape ethical attitudes and compliance intentions among public servants.

In contrast to principal–agent approaches, which emphasize incentives, monitoring, and sanctions, and collective action perspectives, which focus on shared expectations and social norms, behavioural and normative approaches locate the origins of unethical behaviour in the public sector in the moral, cognitive, and psychological processes underlying decision-making. Rather than assuming fully rational and utility-maximizing actors, these perspectives highlight bounded rationality, moral disengagement, self-control, and the internalization of professional values as key determinants of corruption (Gino et al., Reference Gino, Gu and Zhong2009; Olsen et al., Reference Olsen, Hjorth, Harmon and Barfort2019).

From this standpoint, unethical conduct may arise not only from strategic calculation or individual and/or collective expectations but also from motivated reasoning, ethical fading, or failures of moral awareness in situationally complex bureaucratic environments. Therefore, ethical behaviour is understood as context-dependent, shaped by emotions, moral identity, and framing effects, rather than solely by external incentives or perceptions of others’ behaviour. The central theoretical implication is that corruption can occur even in settings with strong formal rules and enforcement when cognitive biases or normative ambiguity weaken ethical judgement. This framework is also particularly well suited to survey experimental research, as experiments can systematically manipulate moral cues, identity priming, or situational framing through vignettes, allowing researchers to identify how subtle contextual variations influence public servants’ ethical reasoning and behavioural intentions. Table 1 synthesizes the principal theoretical approaches to corruption in public administration, illustrating how each framework conceptualizes the sources of unethical behaviour and the corresponding implications for research design and policy intervention.

These approaches are not mutually exclusive. Rather, they reflect different levels of analysis: from institutional arrangements (principal–agent) to social coordination (collective action) and individual cognition (behavioural ethics). Together, they offer a multidimensional understanding of corruption as both a rule-breaking behaviour and a normative failure shaped by institutional design, organizational context, and individual psychology.

The principal–agent model remains influential in anti-corruption strategies led by international organizations, which emphasize enforcement, auditing, and incentive alignment. However, critics argue that this approach underestimates the power of normative legitimacy, informal institutions, and collective expectations in shaping behaviour. As Rothstein (Reference Rothstein2011) argues, strengthening the rule of law in isolation is insufficient if public officials and citizens alike believe that corruption is how the system works.

The collective action framework, in contrast, highlights the importance of shared beliefs and expectations. If individuals believe others will act corruptly, ethical behaviour may be seen as naïve or even professionally risky. Under this view, reform must tip the balance of expectations through credible, visible, and sustained efforts to achieve changes that generate new behavioural equilibria.

Finally, the behavioural ethics approach introduces critical insights from social psychology and experimental research. Corruption does not always stem from explicit calculation or systemic cynicism; it may reflect ethical fading, motivated reasoning, or framing effects. This literature underscores the importance of understanding how public servants perceive ethical dilemmas, how identity cues (e.g., ‘public service professional’) influence decisions, and how experimental interventions can shift behaviour even in the absence of structural reform.

In Latin America and other contexts marked by institutional distrust and weak enforcement, these theoretical lenses help explain the coexistence of strong formal rules with persistent informal practices. As Garrido-Vergara (Reference Garrido-Vergara2024) notes, many bureaucrats operate in ‘grey zones’ where formal norms are regularly suspended or reinterpreted in favour of clientelism, politicization, or private gain. This highlights the importance of combining institutional design with behavioural insight and cultural analysis to study and address corruption effectively.

In sum, understanding corruption in public administration calls for a pluralistic theoretical approach that captures both the incentives and interpretations that shape behaviour. Survey experiments are uniquely suited to test propositions from each of these frameworks. For example, they can assess how monitoring cues affect self-reported intentions (principal–agent), evaluate norm-framing interventions (collective action), and explore the impact of professional identity or ethical nudges (behavioural ethics).

The decision to focus on these three frameworks rests on their centrality and complementarity in the contemporary study of corruption in public administration. Each offers a distinct but interrelated lens through which to understand the incentives, expectations, and psychological processes that shape ethical or unethical conduct in bureaucracies. Together, they capture the institutional, cultural, and individual dimensions of corruption, providing a robust foundation for experimental research. While other theories – such as institutional isomorphism, clientelism, or political economy models – may also shed light on corruption, the chosen approaches are the most directly relevant for designing, testing, and interpreting survey experiments that aim to reveal causal mechanisms behind integrity-related behaviour. They offer not only explanatory power but also practical implications for the development of interventions that can be ethically and empirically evaluated in public service contexts. These insights will guide the experimental protocols and research agenda developed in the following sections.

Importantly, each of these frameworks points to distinct mechanisms that are particularly amenable to investigation through survey experimental designs. Incentive structures, detection probabilities, and sanction cues derived from principal–agent theory can be directly manipulated through experimental treatments; collective action dynamics can be examined by varying information about peer behaviour, social norms, or institutional trust; and behavioural approaches lend themselves to experimental manipulations of moral salience, identity priming, and contextual framing. By making these theoretical mechanisms explicit, this Element establishes a clear bridge between dominant explanatory frameworks in the corruption literature and the experimental strategies discussed in the sections that follow.

This explicit mapping of theoretical assumptions to experimentally testable mechanisms provides the conceptual foundation for the survey experimental protocols developed in Section 3.

2.2 Experimental Studies of Behavioural Ethics and Corruption in Public Management Journals

Building on the preceding discussion of theoretical approaches to corruption in public administration, this subsection reviews the empirical patterns emerging from experimental research in the field, looking particularly at how ethical behaviour and corruption have been operationalized and studied.

Over the past two decades, experimental research on corruption and behavioural ethics in public administration has experienced a notable expansion, driven by increased access to experimental tools and a renewed focus on evidence-based governance. Nonetheless, systematic knowledge about how these studies are designed, what they test, and how they inform theory development remains fragmented.

This section provides a structured review of experimental studies on corruption and behavioural ethics in public administration, based on a transparent search and screening process to identify peer-reviewed experimental and quasi-experimental research published in leading public administration and public management journals such as Journal of Public Administration Research and Theory, Public Administration Review, Governance, International Public Management Journal, and Regulation & Governance. Major academic databases, including Web of Science and Scopus, were searched, using combinations of keywords related to corruption, unethical behaviour, behavioural ethics, public administration, and experimental methods. Titles and abstracts were screened for relevance, followed by full-text assessment based on substantive focus and methodological criteria. The final sample consisted of approximately sixty experimental studies published between 2005 and 2024.

Table 2 summarizes the main empirical patterns identified in this review. Rather than offering an exhaustive catalogue of individual studies, it synthesizes the recurring design choices, theoretical frameworks, substantive foci, and empirical limitations that characterize this growing body of research. This overview provides a structured basis for assessing how experimental approaches have been used to study corruption in public administration and identifying persistent gaps that motivate the methodological and protocol-oriented discussion developed in subsequent sections.

Table 2 Experimental Studies on Corruption and Behavioural Ethics in Public Administration: Summary of Findings
Dimension	Main patterns identified
Type of experimental design	Predominantly survey experiments; fewer field and laboratory experiments
Primary outcomes	Ethical judgements, behavioural intentions, norm perceptions; limited use of observed behaviour
Theoretical frameworks	Principal–agent models, behavioural ethics, collective action perspectives
Substantive focus	Corruption tolerance, compliance behaviour, norm perceptions, ethical decision-making
Geographic focus	Strong concentration in OECD countries; limited evidence from Global South contexts
Key limitations	Reliance on self-reported measures; limited differentiation between beliefs and behaviour

Source: Compiled by the author based on the systematic review.

Table 2 reveals several consistent patterns. First, survey experiments clearly dominate the field, reflecting both the ethical sensitivity of corruption research and the practical constraints associated with observing illicit behaviour in real administrative settings. Field and laboratory experiments remain comparatively rare, particularly in studies involving public servants.

Second, the literature relies predominantly on outcome measures that capture ethical judgements, attitudes, and behavioural intentions, with relatively limited use of observed behaviour. While these measures offer valuable insights into decision-making processes, they also underscore a persistent gap between beliefs about corruption and actual behaviour under concrete institutional constraints.

Third, most experimental studies draw on principal–agent and behavioural ethics frameworks, whereas collective action perspectives are less frequently operationalized experimentally, despite their theoretical relevance. This imbalance suggests that certain mechanisms, such as shared expectations and norm perceptions, remain underexplored in experimental designs.

Fourth, the geographic distribution of studies is heavily concentrated in OECD and high-income contexts, with limited experimental evidence from the Global South and emerging administrative systems. This pattern raises important questions about the external validity and generalizability of existing findings across diverse institutional environments.

Finally, the review highlights common methodological limitations, including heavy reliance on hypothetical scenarios, limited differentiation between beliefs, attitudes, and behaviour, and insufficient attention to issues of treatment intensity, power, and transparency. Taken together, these patterns underscore both the rapid expansion of experimental research on corruption in public administration and the need for more theoretically integrated, methodologically explicit, and context-sensitive survey experimental designs, precisely the gap addressed by the research protocol proposed in the following sections.
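The concern with statistical power can be made concrete with a rough sample-size calculation. The sketch below assumes a simple two-arm comparison of means with equal arm sizes and uses the standard normal approximation; the function name `n_per_arm` and the default values for significance level and power are illustrative assumptions, not recommendations drawn from the reviewed studies.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(effect_size_d, alpha=0.05, power=0.80):
    """Approximate respondents needed per arm for a two-sided comparison
    of two means, using the normal approximation:

        n = 2 * (z_{1 - alpha/2} + z_{power})**2 / d**2

    where d is the standardized effect size (difference in means divided
    by the common standard deviation)."""
    if effect_size_d <= 0:
        raise ValueError("effect_size_d must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size_d ** 2)
```

Because the required sample grows with the inverse square of the effect size, halving the expected effect roughly quadruples the number of respondents needed per arm. This is one reason why weak treatments in vignette designs can leave studies underpowered.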

The review of the experimental literature on corruption and behavioural ethics in public administration reveals several consistent empirical patterns. First, there has been a marked shift from purely observational or survey-based studies towards experimental designs that probe causal mechanisms behind unethical behaviour. A growing number of articles use randomized survey experiments to test how exposure to institutional messages, organizational culture, or social norms affects ethical decision-making by civil servants (Christensen and Wright, Reference Christensen and Wright2018; Clifford et al., Reference Clifford, Sheagley and Piston2021; Sulitzeanu-Kenan et al., Reference Sulitzeanu-Kenan, Tepe and Yair2022).

Second, studies increasingly incorporate field experiments in partnership with public agencies. These collaborations enable researchers to embed interventions in actual bureaucratic routines – ranging from anti-corruption training to transparency nudges – and measure real behavioural outcomes. For example, Falisse and Leszczynska (Reference Falisse and Leszczynska2022) tested the effects of different messaging strategies on bribe-taking among civil servants in Burundi, while Bellé (Reference Bellé2014) and Linos (Reference Linos2018) demonstrate how field experiments can examine performance motivation, equity, and responsiveness in public service delivery.

Third, the experimental literature has diversified in terms of its geographical scope. While much of the early work focused on high-income OECD countries, recent studies increasingly analyse corruption and integrity in Latin American countries, such as Chile and Mexico, where survey experiments have begun to inform the design of training programmes and policy evaluations in ministries and municipal governments. This shift has illuminated the importance of administrative culture, path dependency, and informal institutions in shaping how corruption is perceived, justified, or resisted (Garrido‐Vergara and Cienfuegos, Reference Garrido‐Vergara and Cienfuegos2025; Graycar and Jancsics, Reference Graycar and Jancsics2017).

Despite these promising developments, key gaps persist. Most notably, relatively few experimental studies rigorously distinguish between beliefs about corruption and attitudes or behaviour under specific institutional constraints, an issue this Element explicitly addresses. Moreover, many studies still rely on hypothetical scenarios and self-reported intentions, which, while valuable, may not always predict real-world conduct.

To advance the field, future systematic reviews should map not only experimental designs but also their contribution to theory testing, policy relevance, and cross-contextual validity. Meta-analyses disaggregated by design type (e.g., lab, field, survey), outcome domain (e.g., bribery, nepotism, favouritism), and regional context would be particularly helpful in assessing generalizability and methodological rigour across studies.
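As an illustration of what a disaggregated meta-analysis involves, the sketch below pools effect estimates separately for each level of a grouping variable, using fixed-effect inverse-variance weights. It is a simplified sketch under stated assumptions: a real synthesis would typically also consider random-effects models and heterogeneity statistics, and the toy values are placeholders rather than results from any study.

```python
from collections import defaultdict
from math import sqrt


def pooled_by_group(studies, key):
    """Fixed-effect inverse-variance pooling within each level of `key`.

    Each study is a dict with an 'effect', its standard error 'se',
    and the grouping field named by `key`."""
    groups = defaultdict(list)
    for study in studies:
        groups[study[key]].append(study)

    summary = {}
    for level, members in groups.items():
        weights = [1 / s["se"] ** 2 for s in members]
        total = sum(weights)
        estimate = sum(w * s["effect"] for w, s in zip(weights, members)) / total
        summary[level] = {
            "k": len(members),          # number of studies in the subgroup
            "estimate": estimate,       # precision-weighted mean effect
            "se": sqrt(1 / total),      # standard error of the pooled estimate
        }
    return summary


# Placeholder values for illustration only; not drawn from any study.
toy_studies = [
    {"design": "survey", "effect": 0.10, "se": 0.05},
    {"design": "survey", "effect": 0.30, "se": 0.05},
    {"design": "field", "effect": 0.20, "se": 0.10},
]
summary = pooled_by_group(toy_studies, "design")
```

The same function can be called with a different `key` (for instance, a hypothetical `"outcome"` or `"region"` field) to disaggregate by outcome domain or regional context.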

In sum, the experimental shift in corruption research has opened the way to more precise testing of integrity-related theories and interventions. However, considerable room remains for more comparative designs, protocol standardization, and triangulation with administrative data and qualitative evidence to deepen insights into ethical behaviour in the public sector (Christensen and Wright, Reference Christensen and Wright2018).

Taken together, these remaining challenges point to a critical analytical distinction that is particularly consequential for the design of survey experiments in public administration (Schuster et al., Reference Schuster, Fuenzalida, Mikkelsen and Meyer‐Sahling2024; Søreide and Rose-Ackerman, Reference Søreide, Rose-Ackerman and Arlen2018). Beliefs about the prevalence of corruption capture actors’ perceptions of collective behaviour, whereas attitudes reflect normative evaluations, and behavioural intentions are shaped by the specific constraints under which decisions are made. When these dimensions are not analytically separated, experimental findings risk overstating the link between ethical reasoning and actual conduct. Survey experiments can address this limitation by explicitly manipulating contextual features – such as monitoring intensity, sanction severity, peer behaviour, or organizational signals – while measuring beliefs, attitudes, and intended actions as distinct outcomes.

Such designs make it possible to assess whether changes in perceived norms alter behaviour independently of moral attitudes or whether institutional constraints condition the translation of ethical judgements into action. By disentangling these dimensions, future survey experiments could more precisely identify the mechanisms through which institutional environments shape unethical behaviour, thereby strengthening both causal inference and policy relevance (Didier and Araya‐Orellana, Reference Didier and Araya‐Orellana2026; Garrido-Vergara, Reference Garrido-Vergara2024; Garrido‐Vergara and Cienfuegos, Reference Garrido‐Vergara and Cienfuegos2025).
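One way to read this logic is as a comparison of treatment effects across outcomes and across institutional conditions. The sketch below assumes a hypothetical 2×2 design in which a norm cue and a monitoring cue are each switched on or off, and in which beliefs, attitudes, and intentions are recorded as separate outcomes. The field names and the toy values are placeholders chosen to make the arithmetic easy to follow; they are not empirical results.

```python
from statistics import mean


def cell_means(records, outcome):
    """Mean of `outcome` in each (norm_cue, monitoring) cell."""
    cells = {}
    for r in records:
        cells.setdefault((r["norm_cue"], r["monitoring"]), []).append(r[outcome])
    return {cell: mean(values) for cell, values in cells.items()}


def interaction_contrast(means):
    """Return the norm-cue effect under low monitoring, the norm-cue effect
    under high monitoring, and the difference between the two."""
    effect_low = means[(1, 0)] - means[(0, 0)]
    effect_high = means[(1, 1)] - means[(0, 1)]
    return effect_low, effect_high, effect_high - effect_low


# Placeholder responses: (norm_cue, monitoring, belief, attitude, intention)
fields = ("norm_cue", "monitoring", "belief", "attitude", "intention")
toy_rows = [
    (0, 0, 3, 2, 3), (0, 0, 3, 2, 3),
    (1, 0, 5, 2, 5), (1, 0, 5, 2, 3),
    (0, 1, 3, 2, 2), (0, 1, 3, 2, 2),
    (1, 1, 5, 2, 2), (1, 1, 5, 2, 2),
]
records = [dict(zip(fields, row)) for row in toy_rows]

# One set of contrasts per outcome, so the three constructs stay separate.
results = {
    outcome: interaction_contrast(cell_means(records, outcome))
    for outcome in ("belief", "attitude", "intention")
}
```

In this toy pattern the norm cue shifts beliefs in both monitoring conditions, leaves attitudes unchanged, and shifts intentions only when monitoring is low. Keeping the three outcomes apart is what makes such a pattern visible; a single composite index would blur it.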

To complement the general patterns identified in the experimental literature, a smaller but growing body of research has examined corruption and unethical behaviour in Latin American public administration. This work is particularly relevant given the region’s distinctive institutional characteristics, including higher perceived corruption, heterogeneous state capacity, and varying enforcement regimes. Experimental studies in this context have relied primarily on survey experiments, often motivated by the ethical, political, and practical constraints associated with conducting field or laboratory experiments involving public officials.

A recurrent feature of this literature is its emphasis on contextualized treatments that reflect region-specific institutional realities, such as weak monitoring, politicized bureaucracies, and uneven rule enforcement (Grindle, Reference Grindle2004; Meyer-Sahling et al., Reference Meyer-Sahling, Mikkelsen and Schuster2019). While this contextual sensitivity enhances realism, it has also contributed to substantial variation in experimental designs, outcome measures, and sampling strategies, limiting cross-study comparability. Similar to the broader literature, many Latin American studies rely on hypothetical scenarios and self-reported intentions, underscoring persistent challenges in distinguishing between beliefs, attitudes, and behaviour under concrete institutional constraints.

At the same time, experimental research in Latin America has generated important insights into the interaction between institutional signals, social norms, and enforcement expectations. Several studies suggest that perceived impunity and collective expectations about corruption play a central role in shaping ethical decision-making, lending empirical support to collective action and normative approaches (Aydin, Reference Aydin2025). However, comparative evidence – both within the region and across regions – remains limited, constraining the generalizability of these findings.

Taken together, the Latin American literature reinforces the methodological and conceptual issues identified in the broader experimental research on corruption in public administration. In particular, it highlights the need for more standardized yet context-sensitive survey experimental designs, clearer separation of analytical constructs, and sampling strategies aligned with the intended scope of inference (Garrido-Vergara and Quijada Donaire, Reference Garrido-Vergara and Quijada Donaire2025; Klitgaard, Reference Klitgaard1991; Linos, Reference Linos2018; Rotberg, Reference Rotberg2018; Rubin, Reference Rubin2005). These regional patterns thus serve not as a separate research agenda but as a contextual illustration of the challenges that motivate the protocol-oriented guidance developed in the subsequent sections of this Element.

The dominant theoretical approaches to corruption in public administration are shaped not only by individual incentives and beliefs but also by the institutional and normative environments in which public officials operate. In Latin America, these environments are characterized by heterogeneous ethics infrastructures, encompassing formal rules, oversight bodies, integrity policies, and informal norms governing administrative behaviour. This context provides a particularly relevant setting for examining how institutional design and normative expectations interact to shape unethical conduct.

From a principal–agent perspective, the region exhibits substantial variation in monitoring capacity, sanction credibility, and enforcement consistency, affecting the extent to which formal rules constrain behaviour. Collective action approaches are equally salient, as perceptions of impunity and shared expectations about corruption often weaken the credibility of individual ethical compliance. Behavioural and normative perspectives further highlight the role of moral framing, professional identity, and organizational culture in settings where formal controls coexist with entrenched informal practices.

Empirically, these features pose both challenges and opportunities for experimental research. On the one hand, reliance on formal ethics frameworks alone provides limited leverage for explaining behaviour when enforcement is uneven and norms are contested. On the other hand, the diversity of institutional arrangements across countries, administrative levels, and organizations offers fertile ground for experimentally testing how variations in rules, norms, and oversight mechanisms condition ethical judgements and behavioural intentions.

Integrating the ethics infrastructure of Latin American public administrations into the theoretical discussion thus serves a dual purpose. Conceptually, it illustrates how dominant approaches to corruption operate under conditions of institutional fragility and normative ambiguity. Methodologically, it underscores the value of survey experiments as tools for systematically introducing variations in institutional signals, enforcement expectations, and normative cues, while holding broader contextual factors constant. In this sense, the Latin American context constitutes a theoretically informed setting that highlights the importance of context-sensitive experimental designs in the study of corruption in public administration.

2.3 Why Is It Important to Carry Out Experiments in Public Administration?

The primary advantage of experimental approaches in public administration lies in their capacity to support credible causal inference (Bouwman and Grimmelikhuijsen, Reference Bouwman and Grimmelikhuijsen2016). Unlike observational or purely descriptive methods, experiments enable researchers to isolate the effects of specific institutional, organizational, or behavioural factors by systematically manipulating a focal element of the decision-making environment while holding other conditions constant. This is particularly valuable in the study of corruption and unethical behaviour, where empirical analysis is often constrained by limited data availability, social desirability bias, and the illicit or hidden nature of the outcomes of interest (Caputo et al., Reference Caputo, Ligorio and Venturelli2025). Although experimental research is often costly in terms of design, implementation, and access to participants, these costs are justified by the capacity of experiments to identify causal mechanisms rather than mere correlations. In this sense, experimentation offers a distinctive analytical leverage for understanding how incentives, norms, and contextual cues shape ethical judgements and behavioural intentions among public servants.
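The causal logic described here rests on random assignment: when respondents are randomly allocated to a treatment or a control condition, the difference in mean outcomes between the two groups estimates the average effect of the manipulated factor. A minimal sketch of that estimator follows, assuming a numeric outcome such as a rating scale and using an unpooled standard error with a normal-approximation confidence interval; the function name is an assumption made for this example.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def difference_in_means(treated, control, level=0.95):
    """Estimate the average treatment effect under random assignment.

    Returns the difference in means, its unpooled (Welch-type) standard
    error, and a normal-approximation confidence interval."""
    estimate = mean(treated) - mean(control)
    se = sqrt(variance(treated) / len(treated)
              + variance(control) / len(control))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate, se, (estimate - z * se, estimate + z * se)
```

With small samples, a t-based interval or a randomization test would be more appropriate than the normal approximation used in this sketch.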

Public administration plays a fundamental role in countries’ development, as it is one of the key spheres through which the state engages with its citizens (Weber, Reference Weber1919). In this context, those working in public administration constantly face dilemmas related to their function, particularly given the complexities of governmental tasks and the challenges arising from civil society’s demands. For these reasons, public administrators are expected to make informed decisions based on empirical evidence as a way to reduce the probability of making mistakes and minimize the potential socio-political costs associated with the nature of their role.

One of the most effective ways to generate this evidence is through experiments. Experimental research in public administration allows policymakers to test interventions, measure impacts, and refine policies based on data rather than assumptions. Table 3 provides a detailed overview of the advantages of conducting experiments in the field of public administration. A key benefit of these studies is the ability to understand the micro-foundations of why an individual might come to engage in unethical or corrupt behaviour while performing public duties. In this regard, conducting experiments to study corruption in public administration has significant benefits for society.

Table 3 Advantages of Experiments in Public Administration

Advantage	Description
Reducing corruption and enhancing transparency	Experimental designs explain factors influencing unethical attitudes among public officials and examine how governance structures affect corruption levels. Studies test whether transparency policies reduce bribery and opportunistic behaviour.
Evidence-based policymaking	Public policy often relies on political considerations, historical data, or expert judgement. Experiments provide a systematic way to evaluate policy outcomes before large-scale implementation. Randomized controlled trials assess effectiveness across sectors such as healthcare, education, social welfare, political participation, and security.
Improving efficiency and effectiveness in government action	Experiments identify cost-effective solutions for delivering public services and test administrative processes to optimize efficiency.
Understanding citizen behaviour and compliance	Behavioural experiments predict citizen responses to policies and incentives. Examples include testing whether tax reminders improve compliance or how message framing influences public health behaviour.
Encouraging innovation in public sector management	Experiments allow controlled testing of innovative solutions before scaling. Governments can trial digital platforms, alternative governance models, and public engagement strategies without immediate full implementation.
Minimizing unintended consequences	Experiments help identify and mitigate unintended effects before programme rollout at national, regional, or local levels. For example, criminal justice experiments can evaluate whether alternative sentencing reduces recidivism without increasing crime rates.

Source: Compiled by the author based on James et al., 2017.

Over the past decade, experimental approaches have become increasingly prominent in public administration research. Scholars have expanded the use of field experiments, randomized controlled trials (RCTs), and survey experiments to study core administrative phenomena such as public service motivation, compliance, performance management, accountability, and ethical behaviour (Bertelli and Riccucci, Reference Bertelli and Riccucci2022; Christensen and Wright, Reference Christensen and Wright2018; James et al., Reference James, Jilke and Van Ryzin2017). This methodological shift has been closely associated with the rise of Behavioral Public Administration (BPA), which integrates insights from psychology and behavioural economics to examine how cognitive biases, moral reasoning, and contextual cues shape decision-making in bureaucratic settings. In this literature, experiments are valued for their capacity not only to support causal inference but also to capture micro-level behavioural mechanisms that are difficult to observe through traditional observational or qualitative methods. The growing adoption of experimental designs thus reflects a broader shift in the field towards evidence-based, mechanism-oriented explanations of administrative behaviour, providing a natural intellectual foundation for the use of survey experiments in the study of corruption and integrity.

From a methodological perspective, experimental approaches offer several distinctive advantages over observational designs. First, experimental studies permit the establishment of cause-and-effect relationships, helping to identify the factors that increase or reduce corruption. Experiments offer a higher level of precision than observational studies, as they can control external variables and isolate the effects of specific interventions.

Second, this type of research can significantly contribute to understanding human behaviour within organizations, particularly in explaining how organizational factors can influence individual attitudes towards corruption (Furnham, Reference Furnham2012). Corruption is not only an institutional problem but also a psychological phenomenon embedded in social systems, and laboratory and field experiments can reveal how organizational culture, social norms, or the perception of impunity influence public officials’ decision-making. This is highly relevant because evidence of this type helps optimize the development of preventive rather than reactive strategies to combat corruption and strengthen ethical public behaviour. Ultimately, this would lead to the optimization of public spending because a reduction in corruption ensures a more efficient use of public resources, translating into better services for the population, such as education, healthcare, security, and infrastructure.

In addition, experiments serve as a reliable and rigorous method for evaluating anti-corruption policies and strategies. They can test which mechanisms, incentives, or regulations are most effective. For example, experiments can be designed to assess whether transparency, citizen oversight, or changes in public officials’ salaries influence their ethical behaviour in relation to the ethos established by regulations and the public sector’s institutional culture (Margetts, Reference Margetts2011).

This is crucial because such research generates robust evidence for designing and implementing public policies that foster stronger ethical standards in public service. This, in turn, yields substantial benefits by enhancing citizen trust in government institutions, an essential element for democratic governance (Jensen and Piatak, Reference Jensen and Piatak2024). Furthermore, experimental designs can be effectively adapted to specific contexts. Experiments can be tailored to study corruption at different levels of government or in specific sectors (such as healthcare, education, or public procurement), facilitating the implementation of strategies that are better suited to each particular setting (see Table 3).

In conclusion, experiments in public administration enable the development of evidence-based policies, improve transparency and efficiency in the public sector, and contribute to building a more just and equitable society.

However, a number of complexities must be addressed when using experimental designs to study issues related to human behaviour in public administration. In general, survey-based social experiments can be costly, both financially and in the effort needed to secure participant engagement, particularly when the surveys are self-administered. There are also challenges associated with the most commonly used methods in these designs. Table 4 presents the experimental methods most frequently used in the study of public administration.

Table 4 Experimental Methods in Public Administration
Category	Type of experiment	Core characteristics	Typical applications in public administration
Natural experiments	Policy shocks, institutional rules, government-led A/B testing	Treatment assignment is determined by external events or administrative decisions; no direct researcher control; causal inference relies on quasi-experimental strategies (e.g., DiD, RDD, IV).	Introduction of freedom of information (FOI) laws; procurement reforms; administrative rule changes; government-implemented digital A/B testing.
Researcher-led experiments	Field experiments (including RCTs)	Researchers deliberately assign treatments, often through randomization, in real-world administrative settings; high policy relevance with contextual realism.	Performance incentives; monitoring interventions; transparency initiatives implemented in public organizations.
Researcher-led experiments	Laboratory experiments	High level of experimental control in artificial or simulated settings; strong internal validity but limited ecological realism.	Ethics games; corruption simulations; studies of moral decision-making under controlled conditions.
Researcher-led experiments	Survey experiments	Experimental manipulations embedded in surveys (e.g., vignettes, framing, conjoint designs); balance between control, feasibility, and contextual realism; well suited for sensitive topics.	Ethical dilemmas; corruption scenarios; norm framing; identity priming among public servants or citizens.

Source: Compiled by the author based on Hansen and Tummers, 2020.

To avoid conceptual ambiguity, this Element adopts a classification of experimental approaches based on the degree of researcher involvement in treatment assignment. First, natural experiments refer to situations in which exposure to a treatment is determined by external events, institutional rules, or policy changes, rather than by the researcher (Rosenzweig and Wolpin, Reference Rosenzweig and Wolpin2000). From an analytical perspective, government-led A/B testing initiatives fall into this category, as researchers typically observe rather than control treatment assignment (Berliner Senderey et al., Reference Berliner Senderey, Kornitzer, Lawrence, Zysman, Hallak, Ariely and Balicer2020). Second, researcher-led experiments involve the deliberate manipulation of one or more variables by the researcher and include field experiments, laboratory experiments, and survey experiments. Within this category, RCTs are best understood not as a distinct type of experiment but as a design feature that can be implemented across different settings, particularly in field and survey-based studies (Stolberg et al., Reference Stolberg, Norman and Trop2004).

Survey experiments constitute one of the most frequently used and methodologically versatile experimental approaches in contemporary public administration research. They involve the randomized manipulation of information, frames, or scenarios within survey instruments – often through vignettes, conjoint designs, or factorial treatments – allowing researchers to combine experimental control with contextual realism. In contrast to laboratory experiments, survey experiments can be administered to large and diverse samples at relatively low cost, while avoiding the artificiality of highly controlled experimental settings. Compared to field experiments, they do not require direct intervention in organizational processes, which makes them particularly suitable for studying sensitive phenomena such as corruption and unethical behaviour, where access, ethical constraints, and implementation risks often limit the feasibility of field-based designs. Although survey experiments typically rely on self-reported judgements or behavioural intentions rather than observed behaviour, they offer a powerful compromise between internal validity, feasibility, and ecological relevance. For these reasons, survey experiments have become a central methodological tool within experimental and behavioural public administration studies, especially for examining the micro-level mechanisms, such as beliefs, norms, incentives, and moral reasoning, that underpin ethical decision-making in bureaucratic contexts.
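The randomized manipulation of scenarios described above can be sketched in a few lines of Python. This is a minimal illustration, not a protocol from this Element: the two vignette factors (`discretion`, `monitoring`), their levels, and the function names are hypothetical, and the blocked assignment scheme is one common choice among several.

```python
import itertools
import random
from collections import Counter

# Hypothetical vignette factors, chosen only for illustration.
FACTORS = {
    "discretion": ["low", "high"],
    "monitoring": ["absent", "present"],
}

def build_conditions(factors):
    """Return every combination of factor levels (a full factorial design)."""
    names = list(factors)
    return [dict(zip(names, levels))
            for levels in itertools.product(*(factors[n] for n in names))]

def assign_conditions(respondent_ids, conditions, seed=0):
    """Assign respondents to conditions in randomly ordered blocks, so that
    cell sizes differ by at most one respondent."""
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)
    assignment = {}
    for start in range(0, len(ids), len(conditions)):
        block = ids[start:start + len(conditions)]
        order = list(range(len(conditions)))
        rng.shuffle(order)  # random order of conditions within each block
        for rid, idx in zip(block, order):
            assignment[rid] = idx
    return assignment

conditions = build_conditions(FACTORS)            # 2 x 2 = 4 vignette versions
assignment = assign_conditions(range(200), conditions, seed=42)
cell_sizes = Counter(assignment.values())         # respondents per condition
```

Blocking keeps the cells balanced, which simple coin-flip assignment does not guarantee in small samples. Fixing the seed makes the allocation reproducible for pre-registration or audit.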

Experimental designs based on RCTs are particularly useful for making causal inferences (Stolberg et al., Reference Stolberg, Norman and Trop2004). They are used to assess the causal impact of an intervention or treatment and are characterized by the random assignment of subjects to two or more groups. A treatment group receives the intervention or treatment under evaluation, while a control group does not receive the intervention or receives a placebo or alternative condition.

In RCTs, randomization helps reduce potential selection biases and makes it more likely that observed differences between groups are due to the intervention rather than other factors (Bertelli and Riccucci, Reference Bertelli and Riccucci2022, p. 181). This type of experiment also facilitates comparability because its random distribution of individual characteristics across groups implies that any differences in outcomes can be more confidently attributed to the intervention (Jensen, Reference Jensen2020). Because of their high internal validity, RCTs are often considered an efficient technique for establishing causal relationships (James et al., Reference James, Jilke and Van Ryzin2017) and are widely used in disciplines such as medicine, economics, and public policy. In public administration research, they are generally used to evaluate reforms or innovations in government management.
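As a rough illustration of this logic, the sketch below randomly splits units into two groups and estimates the treatment effect as a difference in mean outcomes. The function names and the outcome values are made up for the example, and the standard error uses the usual unequal-variance formula, which is one of several defensible choices.

```python
import math
import random
from statistics import mean, variance

def randomize(unit_ids, seed=0):
    """Randomly split units into treatment and control groups of (near) equal size."""
    rng = random.Random(seed)
    ids = list(unit_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]

def difference_in_means(treated_outcomes, control_outcomes):
    """Estimate the average treatment effect and its standard error
    (unequal-variance formula)."""
    estimate = mean(treated_outcomes) - mean(control_outcomes)
    se = math.sqrt(variance(treated_outcomes) / len(treated_outcomes)
                   + variance(control_outcomes) / len(control_outcomes))
    return estimate, se

treated_ids, control_ids = randomize(range(10), seed=1)

# Made-up outcome values, for illustration only.
estimate, se = difference_in_means([3, 5, 7], [1, 2, 3])
```

With these illustrative values the estimated effect is 3, with a standard error of roughly 1.29. In a real study the outcomes would be measured after assignment for the units listed in `treated_ids` and `control_ids`.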

A classic example of RCT use in public administration is the evaluation of conditional cash transfer (CCT) programmes (Baird et al., Reference Baird, Ferreira, Özler and Woolcock2013) to assess, for instance, whether providing cash transfers to low-income families – conditional on meeting certain requirements (e.g., ensuring their children attend school and receive medical care) – could improve access to education and child health. In this context, an experimental design based on RCTs could be structured around a target population of communities with high levels of poverty. Once these communities have been identified, some are randomly assigned to the treatment group (receiving the conditional transfer) and the others to the control group (not receiving the transfer during the study period). Throughout the experiment, relevant indicators associated with the programme are monitored in both groups (e.g., school attendance, academic performance, and medical visits), and an impact analysis is conducted by comparing results in the two groups.

If the programme is effective, the treatment group should exhibit higher school attendance rates and greater use of healthcare services compared to the control group. Given that an RCT is used, any difference between the two groups can be more confidently attributed to the intervention rather than external factors. In applied research in the field of public administration, RCTs significantly contribute to the development of evidence-based research, which, in turn, helps ensure that public policies are anchored in rigorous data rather than assumptions. Regarding potential selection biases, RCTs “generate an experimental control group constituted of subjects who would have participated but were randomly excluded from the program” (Bertelli and Riccucci, Reference Bertelli and Riccucci2022, p. 181). Studies of this type favour the optimization of resources by increasing the likelihood that public funds will be allocated to programmes with a genuine impact. Moreover, they permit scalability, implying that a successful programme can be expanded with greater certainty that it will function effectively on a larger scale. However, the use of RCTs provides information about average treatment effects (ATEs) but not about atypical cases (Heckman and Smith, Reference Heckman and Smith1995).

It is important to recognize that scalability constitutes a persistent challenge for all experimental approaches in public administration. Even well-designed field experiments and RCTs may exhibit attenuated effects when interventions are implemented at scale, a phenomenon widely discussed in the experimental literature as the “voltage effect” (List, Reference List2022). Differences in implementation capacity, institutional context, and behavioural adaptation can substantially weaken treatment effects outside the original experimental setting, highlighting the limits of direct policy extrapolation from experimental findings.

Although often regarded as the benchmark for causal inference, RCTs are not without methodological challenges. The credibility of RCT-based findings depends on rigorous implementation, including successful randomization, adequate treatment take-up, and high levels of implementation fidelity. In practice, issues such as contamination between treatment and control groups, differential attrition, or non-compliance can introduce imbalances that compromise causal identification. These risks are particularly salient in public administration settings, where organizational complexity, political constraints, and adaptive behaviour by participants may interfere with experimental protocols. Acknowledging these limitations underscores the importance of careful design, monitoring, and reporting practices across all experimental approaches, rather than privileging any single design as methodologically flawless.

A second experimental method in the study of public administration corresponds to natural experiments. These experiments permit the identification of causal relationships in real-world contexts. Unlike controlled experiments, they examine events, reforms, or policy changes that generate exogenous variation in the variables of interest. This is particularly appealing in the study of public administration given the complexities underlying the design and implementation of plans, programmes, and public policies (Bouwman and Grimmelikhuijsen, Reference Bouwman and Grimmelikhuijsen2016).

A natural experiment occurs when an external event or policy variation generates differences between groups in a quasi-random manner. Key characteristics of such experiments include exogeneity, as the intervention is not determined by the study subjects; comparability, as treatment and control groups can be identified; and relevance for public policy, given the application to real-world scenarios (Cárdenas and Ramírez De La Cruz, Reference Cárdenas and Ramírez De La Cruz2017). The use of natural experiments offers several advantages. Since they occur in real contexts, they permit causal inferences without the need for traditional experimental designs. Moreover, experiments of this type can be applied to large-scale scenarios in public administration and help avoid ethical issues associated with the arbitrary assignment of interventions (Galope et al., Reference Galope, Bilyk and Woldeab2024).

However, such experimental designs also present challenges. For example, identifying exogenous variables can be complex, as many variables used in natural experiments are not entirely exogenous but may be correlated with unobserved factors that also affect the dependent variable. In this sense, it is difficult to ensure that the events or institutional changes used as ‘treatments’ are truly random and not the result of endogenous processes.

Furthermore, simultaneity or feedback loops may confound these experiments: the relationship between the treatment variable and the outcome may be bidirectional, with outcomes influencing the very variable that is assumed to be exogenous, which complicates causal identification (Dunning, Reference Dunning2015).

Other potential problems may also affect these experiments at the selection and treatment levels. People may react strategically to the intervention, affecting the results in a non-random way, and heterogeneity in treatment response may make it difficult to estimate average effects. For example, in the allocation of subsidies or CCTs in public policies, such problems may arise if some households attempt to strategically adjust their reported income to qualify for the subsidy, temporarily reducing their formal income (e.g., by working fewer hours or reporting a lower income on tax returns). This behaviour introduces selection bias since beneficiaries are not randomly assigned but are conditioned by their ability to adapt strategically to the programme’s rules.

Moreover, even if selection bias could be controlled, households are still likely to react heterogeneously to the subsidy. For instance, some might use it to improve their quality of life, while others might allocate it to less productive uses (such as immediate consumption of non-durable goods). This heterogeneity in responses makes it difficult to estimate an ATE that is representative of the entire population (Mossberger and Wolman, Reference Mossberger and Wolman2003).

In addition, in experiments of this type, the exogenous variable’s effects may extend beyond the treated group, affecting control units and biasing estimates. Suppose a country significantly increases the minimum wage in certain regions/cities while keeping it constant in others. A study could consider this variation a natural experiment, comparing employment in affected (treated) and unaffected (control) regions/cities. However, if companies in control regions adjust their wages in response to the increase elsewhere (to retain workers or due to union pressures), the control group ceases to be a valid comparison because it is also indirectly affected by the treatment. This type of contamination of the control group would bias the estimation of the minimum wage’s impact since its effects are not confined to the treated units but also extend to those that should serve as an unaffected reference (Reeves et al., Reference Reeves, McKee, Mackenbach, Whitehead and Stuckler2017). These experiments therefore call for strong assumptions and a clear identification of the counterfactual to establish causal relationships, whether through difference-in-differences (DiD) analysis, regression discontinuity design (RDD), or instrumental variables (IV). Each strategy fails when its assumptions are violated; in RDD, for instance, manipulation of the treatment threshold can invalidate causal identification.
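To make the DiD comparison in the minimum wage example concrete, the following sketch computes the estimate from group means before and after the policy change. The function name and the employment figures are invented for illustration; the calculation is valid only under the parallel-trends assumption, that is, if treated and control regions would have followed the same trend in the absence of the reform.

```python
from statistics import mean

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences: the change in treated units minus the
    change in control units. Valid only under parallel trends."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Made-up regional employment rates (percent), for illustration only.
effect = did_estimate(
    treated_pre=[60.0, 62.0], treated_post=[59.0, 61.0],
    control_pre=[58.0, 60.0], control_post=[59.0, 61.0],
)
```

Here employment falls by one point in treated regions and rises by one point in control regions, giving an estimate of -2.0. If control regions are themselves affected by the reform, `control_change` no longer measures the counterfactual trend and the estimate is biased, which is the contamination problem described above.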

In terms of generalization and external validity, a natural experiment can provide credible estimates in a specific context, but its applicability to other cases is limited. Replicability is challenging because natural experiments depend on unique events. To address these issues, it is crucial to conduct robustness tests, use multiple identification strategies, and complement quantitative analyses with qualitative evidence when possible.

Several examples of natural experiments can be found in the field of public administration. A classic one is the assessment of how administrative reforms affect government efficiency: changes in public procurement systems, for instance, provide an opportunity to analyse their effects on transparency and corruption. Similarly, some decentralization reforms occur unevenly across regions, permitting comparison of the impact of fiscal and administrative autonomy on public service delivery. Finally, the implementation of FOI laws at different times and in different jurisdictions has permitted analyses of their effects on perceptions of corruption and trust in institutions.

A third experimental method is A/B testing in digital governance. Also known as split testing or online randomized experimentation, A/B testing is a benchmarking method frequently used on digital platforms (Quin et al., Reference Quin, Weyns, Galster and Silva2024). Its main objective is to measure the impact of different versions of the same interface, policy, or message on the behaviour of a platform’s users. In the field of digital governance, governments and public administrations have begun to use A/B testing to optimize the delivery of online services, improve communication with citizens, and increase the efficiency of digital public policies (Malik et al., Reference Malik, Mittal, Mavaluru, Narapureddy, Goyal, Martin, Srinivasan and Mittal2023).

A/B testing involves randomly dividing users into two or more groups to evaluate how they respond to different versions of a given variable. For example, if the goal is to analyse the impact of a communication change on user behaviour on a website, a control group could receive the standard or current version of the website, while a treatment group receives an alternative version with a specific change (design, wording, format, structure, and so on). The performance of each version is measured based on key metrics such as conversion rates, user interaction, response time, and user satisfaction (Siroker and Koomen, Reference Siroker and Koomen2013).
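A comparison of conversion rates between the two versions can be sketched as a two-proportion z-test. This is an illustrative calculation under simplifying assumptions (independent users, large samples, a single pre-specified metric); the function name and the visitor counts are hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of version A (control) and version B (treatment).
    Returns the difference in rates, the z statistic, and a two-sided p-value."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value

# Made-up counts: 120 of 1,000 users complete a procedure on the current
# page, and 150 of 1,000 on the redesigned page.
diff, z, p_value = two_proportion_z_test(120, 1000, 150, 1000)
```

With these invented counts the redesigned page raises completion by three percentage points, with a z statistic of about 1.96 and a p-value just under 0.05. Testing several metrics or checking results repeatedly during the test would call for corrections that this sketch omits.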

This type of experimental technique has become increasingly common for studying the implementation of digital governance strategies (Polonioli et al., Reference Polonioli, Ghioni, Greco, Juneja, Tagliabue, Watson and Floridi2023). Specific applications of A/B testing can be identified. First, in the case of the optimization of digital public services, governments use A/B testing to improve the usability and accessibility of digital platforms, such as government portals, by evaluating different menu designs, information layouts, and calls to action (CTAs) on official websites. The technique has also been employed to assess the efficiency of tax systems or administrative procedures on online platforms by, for instance, comparing different interfaces to determine which version better facilitates the intended actions (e.g., tax payment or the completion rate of administrative procedures) (Groth and Haslwanter, Reference Groth and Haslwanter2016).

A/B testing can also be valuable for improving governmental communication strategies through its use to evaluate how different forms of communication affect citizens’ responses (whether through emails, text messages, or notifications on government platforms). It can also be useful for enhancing transparency and citizen participation policies involving, for example, the design of public opinion surveys (by testing different question formats to achieve higher response rates) and citizen consultation platforms (by comparing simpler or more interactive interfaces to encourage participation) (Polonioli et al., Reference Polonioli, Ghioni, Greco, Juneja, Tagliabue, Watson and Floridi2023).

The benefits of using A/B testing in digital governance lie in its evidence-based nature, allowing digital government decisions to be based on actual data rather than assumptions; its lower cost and greater efficiency, helping to avoid costly changes without prior testing; its ability to enhance the user experience by optimizing how citizens interact with online services; and its contribution to reducing the digital divide, as it supports the development of more inclusive and accessible interfaces (Widiarso et al., Reference Widiarso, Muthohar, Tombe and Novianti2023). However, the use of this technique entails challenges and ethical considerations. In the case of privacy and data protection, it is essential to ensure that citizens’ data are anonymized and safeguarded. Secondly, to avoid biases in group assignment, randomization must be rigorous so as not to favour certain segments of the population. Finally, to uphold ethical standards in governmental experimentation, testing must be transparent and must not infringe citizens’ fundamental rights.

A fourth experimental method is field experiments, which have become one of the most effective and externally valid strategies for identifying causal relationships in real-world public administration settings (Hansen and Tummers, Reference Hansen and Tummers2020). In contrast to laboratory or survey-based experiments, field experiments take place in actual policy environments and involve real public officials, institutions, or citizens engaged in authentic decision-making processes. This realism allows researchers to test interventions, policies, or behavioural mechanisms in the context of the government agencies, municipal offices, regulatory bodies, or public service delivery settings where they naturally take place.

Field experiments share many core features with RCTs since they also involve deliberate manipulation of a treatment, random assignment to treatment and control groups, and measurement of outcomes. However, their distinguishing characteristic is that they are implemented in ‘the field’ or, in other words, real institutional environments rather than artificial or simulated contexts. This makes field experiments particularly valuable for studying the behaviour of public servants, the implementation of policy reforms, and the effectiveness of public sector interventions (Grohs et al., Reference Grohs, Adam and Knill2016).

A field experiment is a type of randomized controlled experiment in which manipulation of the independent variable occurs in a naturalistic setting, rather than a laboratory or survey context. The researcher introduces a treatment (e.g., a policy message, an incentive structure, or a procedural change) to some participants while others serve as controls. Importantly, subjects are usually unaware they are part of an experiment because the treatment is embedded in normal organizational routines. This allows researchers to observe genuine behaviour rather than hypothetical responses.

Field experiments differ from other types of experiments in terms of fieldness (Eden, Reference Eden2017), according to seven criteria (Hansen and Tummers, Reference Hansen and Tummers2020, p. 922), which include: (a) the intervention is realistic, (b) participants encounter the treatment in the real world, (c) the context is natural, and (d) outcome measures mirror the outcome of interest. The authors incorporate three further criteria considering Czibor et al. (Reference Czibor, Jimenez‐Gomez and List2019): (e) whether the population consists of students or the population of interest, (f) whether the environment is artefactual or natural, and (g) whether the treatment is overt or covert. Field experiments are especially well suited to public administration because they can be implemented through collaboration with government agencies, civil service departments, or public institutions, using existing processes as platforms for intervention.

In the discipline of public administration, field experiments typically exhibit several key characteristics (Linos, Reference Linos2018). First, regarding institutional embeddedness, they take place within public institutions (ministries, local governments, courts, schools, hospitals, or regulatory agencies). This setting ensures that the treatment is evaluated within the constraints, routines, and power dynamics of actual public management. Second, in contrast to laboratory settings, the participants in the experiment are real stakeholders (actual public servants, citizens, or decision-makers), with a direct stake in the outcomes, and their behaviour affects real policies or service provision. Third, random assignment occurs in natural contexts, where treatments may mimic or modify actual government communication or behaviour. Randomization remains crucial for causal identification but occurs in ways that preserve the everyday logic of public institutions (Bruhn and McKenzie, Reference Bruhn and McKenzie2009; Van Es and Van Es, Reference Van Es and Van Es1993).

Conducting these experiments in the field strengthens their external validity: findings are more likely to generalize to other public settings than those from hypothetical or laboratory scenarios. Recent studies in Latin America – such as Mikkelsen et al. (Reference Mikkelsen, Schuster, Meyer‐Sahling and Wettig2022), who analysed the impact of professionalization training among over 3,000 Chilean civil servants – demonstrate how field experiments can capture variations in ethical awareness and bureaucratic behaviour across different institutional settings.

Complementing this, field experiments are designed to capture observed behaviour, not just attitudes or perceptions, which is relevant for studying the differences between the beliefs and attitudes/conduct of bureaucrats facing challenging scenarios. These experiments are therefore powerful for studying phenomena such as corruption (Armantier and Boly, Reference Armantier and Boly2011), public service motivation and bureaucratic performance (Bellé, Reference Bellé2014; Christensen and Wright, Reference Christensen and Wright2018; Rasul and Rogger, Reference Rasul and Rogger2018), citizen-state interactions (Chaudhry, Reference Chaudhry2023), the impact of training and professionalization programmes on public personnel (Jakobsen et al., Reference Jakobsen, Jacobsen and Serritzlew2019; Mikkelsen et al., Reference Mikkelsen, Schuster, Meyer‐Sahling and Wettig2022), and patterns of implicit bias (discrimination vs. fairness) in how bureaucrats respond to inquiries or allocate resources (Prendergast, Reference Prendergast2003).

In another important use, field studies have been applied to anti-corruption interventions. For instance, a recent study by Falisse and Leszczynska (Reference Falisse and Leszczynska2022) in Burundi examined how anti-corruption sensitization messages influence public servants’ behaviour in both service delivery and bribe-taking. In the study, 527 officials were tasked with allocating limited resources among citizens, some of whom offered bribes. The experiment tested the impact of brief messages invoking either good governance or professional identity. Exposure to the professional identity message led to a fairer and more equitable allocation, suggesting that activating ethical commitments can raise moral costs and improve service outcomes. However, neither message had a significant effect on bribe-taking. These findings highlight a distinction between fairness in public service provision and individual susceptibility to corruption, offering important insights for the design of targeted integrity interventions.

Another stream of work examines discrimination and fairness in service delivery (Cárdenas et al., Reference Cárdenas, Candelo, Gaviria, Polania and Sethi2008). By systematically varying the characteristics of hypothetical citizens, such as names, ethnicity, or gender, in official requests, researchers have uncovered patterns of implicit bias in bureaucratic responsiveness (Costa, Reference Costa2017). Such studies highlight enduring equity challenges in public administration and help identify mechanisms of exclusion and differential treatment.

Finally, a recent growing area of studies involves training and professionalization programmes (Dorssom, Reference Dorssom and Bennion2024). Field experiments have been used to test the effectiveness of ethics training, internal communications strategies, or nudges aimed at improving bureaucratic conduct. In some cases, public officials receive different versions of weekly motivational emails, allowing researchers to measure subsequent impacts on service quality or citizen satisfaction.

Designing field experiments in public administration requires close collaboration with public institutions and careful attention to ethical and operational considerations. Key elements include identifying a manipulable intervention, selecting the appropriate unit of analysis, ensuring rigorous randomization, and defining valid outcome measures (Hansen and Tummers, Reference Hansen and Tummers2020). These experiments have particular policy relevance because they test interventions in real-world settings, capturing actual behaviour rather than stated preferences, and offer scalable insights. Moreover, they foster collaboration between scholars and practitioners. However, field experiments also entail significant challenges, including ethical concerns around consent and risk, logistical hurdles in implementation, potential treatment contamination, vulnerability to external shocks, and limits on generalizability. Despite these constraints, well-designed field experiments are a valuable tool for generating actionable evidence in public sector governance.

Field experiments are a cornerstone of experimental public administration studies. They combine methodological rigour with real-world relevance, enabling scholars and practitioners to understand not only whether policies work, but also how and why. When implemented carefully and ethically, field experiments can yield robust causal evidence that informs better governance, enhances public accountability, and improves administrative performance. As the field of public administration continues to evolve, such experiments will play an increasingly central role in bridging research and practice in public service.

Despite their growing appeal, experiments in public administration are not without significant challenges. Ethical considerations include the need to ensure that no group is disproportionately burdened or denied essential services, while political sensitivities can deter decision-makers from engaging in initiatives that risk exposing policy shortcomings. Logistical complexity, especially in large-scale designs involving multiple agencies, can hinder implementation, and public scepticism may arise when experiments are perceived as intrusive or technocratic (Margetts, Reference Margetts2011). Nonetheless, when designed and conducted with care, experiments offer a rigorous and transparent means of evaluating public policies. They provide actionable insights, promote accountability, and contribute to evidence-based governance. As such, the experimental approach represents not only a methodological innovation but also a normative commitment to more responsive and effective public administration.

2.4 Causal Inference and Analysis of Experiments in Public Management Research

Causal inference lies at the heart of experimental public administration studies (Hansen and Tummers, Reference Hansen and Tummers2020). Whether implemented as RCTs, natural experiments, or survey-based manipulations, experimental studies aim to identify the effect of specific treatments or interventions on outcomes related to ethics, behaviour, or institutional performance. However, drawing valid causal conclusions from experiments calls for attention to design logic, statistical estimation, and potential biases.

At the core of causal inference is the counterfactual logic: what would have happened to a subject had they not received the treatment? In randomized experiments, random assignment ensures that the treatment and control groups are statistically equivalent at baseline, allowing any differences in outcomes to be attributed to the treatment with high internal validity (Bertelli and Riccucci, Reference Bertelli and Riccucci2022). In public administration, this principle is essential for understanding whether a specific message, norm, or policy change genuinely causes shifts in ethical behaviour or perception.

However, challenges emerge when moving beyond ideal conditions. In natural experiments, for example, assignment to treatment is determined by external shocks or policy variations, which may not be fully random. Researchers must then demonstrate as-if random assignment and often rely on quasi-experimental estimators such as DiD, RDD, or IV. These techniques, which attempt to mimic randomization, require strong assumptions about comparability, functional form, and exclusion restrictions (Dunning, Reference Dunning2015).

Survey experiments, although randomized, face different challenges. They include the salience and credibility of treatments, ceiling or floor effects in outcome measures, and differential attrition or nonresponse. Moreover, the measurement of latent constructs, such as corruption tolerance, ethical sensitivity, or willingness to report misconduct, calls for well-validated instruments and attention to social desirability bias (Kim and Kim, Reference Kim and Kim2016).

When these challenges are addressed at the design stage, researchers can apply robust statistical techniques to estimate treatment effects reliably (Bouwman and Grimmelikhuijsen, Reference Bouwman and Grimmelikhuijsen2016). These include:

Ordinary least squares (OLS) and logistic regression, with covariate adjustment to improve precision
Randomization inference, particularly in small samples or blocked designs
Bayesian methods, which can incorporate prior knowledge and account for uncertainty more flexibly
Multi-level models, which are particularly useful when treatments are clustered at institutional or geographical levels
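Randomization inference, listed above, can be illustrated with a minimal sketch. It assumes a two-arm design with complete random assignment and tests the sharp null hypothesis of no effect for any respondent. The function name `randomization_p_value` and the toy corruption-tolerance scores are hypothetical and serve only as illustration.

```python
import random
from statistics import mean


def randomization_p_value(outcomes, treated, n_draws=5000, seed=0):
    """Two-sided randomization-inference p-value for a difference in means.

    Under the sharp null of no effect for any unit, outcomes are fixed and
    only the treatment labels vary, so the labels are re-shuffled many times.
    """
    def diff_in_means(labels):
        t = [y for y, d in zip(outcomes, labels) if d == 1]
        c = [y for y, d in zip(outcomes, labels) if d == 0]
        return mean(t) - mean(c)

    observed = diff_in_means(treated)
    rng = random.Random(seed)
    labels = list(treated)
    extreme = 0
    for _ in range(n_draws):
        rng.shuffle(labels)  # keeps the number of treated units constant
        # Small tolerance guards against floating-point ties.
        if abs(diff_in_means(labels)) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_draws


# Hypothetical data: corruption-tolerance scores (1-7) for 12 respondents.
scores = [2, 3, 2, 1, 3, 2, 5, 4, 6, 5, 4, 5]
assignment = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
effect, p_value = randomization_p_value(scores, assignment)
```

Because the reference distribution is generated by the assignment procedure itself, this approach does not rely on large-sample approximations, which is why it is attractive in small samples. In blocked designs the shuffle would need to be carried out within blocks, which this sketch does not do.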

It is important to emphasize that these challenges are primarily matters of research design rather than statistical estimation. While robust statistical techniques are essential for analysing experimental data, they cannot compensate for shortcomings related to treatment construction, outcome measurement, or the distinction between beliefs, attitudes, and behaviour. The role of statistical estimators in experimental research is therefore conditional: when randomization is properly implemented and outcomes are conceptually well defined, estimators such as difference-in-means or regression-based adjustments provide unbiased and efficient estimates of causal effects. Conversely, when design choices conflate conceptual dimensions or rely on weak or purely hypothetical treatments, no estimator can recover the underlying causal mechanism. In this sense, statistical analysis should be understood as complementing, rather than substituting, careful experimental design.

Under proper random assignment, the ATE is identified by a simple difference in means between treatment and control groups, which can be tested with a t-test or ANOVA, without the need for regression adjustment. Regression-based approaches are primarily used to improve precision, incorporate covariates, or account for clustered data structures.

Equally important is the inclusion of manipulation checks, pre-analysis plans, and robustness tests. These enhance the credibility of results and reduce the risk of post hoc rationalization or multiple testing. When feasible, researchers should also combine experimental results with administrative records, behavioural logs, or qualitative interviews to validate findings and explore mechanisms.

The distinction between ATEs and heterogeneous treatment effects (HTEs) is another frontier in experimental public administration and policy studies (Chen et al., Reference Chen, Sridhar and Mittal2021). Increasingly, scholars recognize that interventions may not work uniformly across all bureaucrats or agencies. For instance, exposure to codes of ethics may deter corruption among junior officials but be ineffective or even counterproductive among entrenched elites. Estimating moderation effects (e.g., by organizational culture, tenure, or political alignment) can sharpen policy recommendations and enhance external validity.

Several of the statistical and methodological concepts referenced in this section warrant brief clarification. Ceiling and floor effects refer to situations in which outcome measures cluster at their upper or lower bounds, limiting the ability to detect treatment effects. Differential attrition arises when dropout rates differ systematically across treatment conditions, potentially biasing estimates if not addressed. Covariate adjustment and multi-level models are commonly used to improve precision and account for hierarchical data structures typical of public organizations, while Bayesian methods offer a probabilistic framework for estimation that can be particularly useful in small samples or complex designs. Manipulation checks are employed to verify whether experimental treatments were perceived as intended, and corrections for multiple testing are necessary to avoid inflated false-positive rates when researchers are estimating several effects simultaneously.
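To make the point about multiple testing concrete, the following sketch implements the Holm step-down correction, one of several standard procedures (Bonferroni and false-discovery-rate methods are common alternatives). It is offered as an illustration rather than as the procedure used in this Element; the function name and the three p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values for three outcomes measured in the same experiment.
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
```

In this toy case only the first outcome remains below the conventional 0.05 threshold after adjustment, although all three raw p-values were below it.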

With respect to causal estimands, it is important to distinguish between the ATE, which captures the mean effect of an intervention across all units; treatment effects on the treated (TOT or ATT), which focus on those who actually receive the treatment; and intention-to-treat (ITT) effects, which estimate the effect of assignment to treatment regardless of compliance. Although these estimands address different inferential questions, their interpretation depends fundamentally on research design choices rather than statistical estimation alone.
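The difference between ITT and treatment-on-the-treated effects can be shown with a small numerical sketch. It assumes one-sided noncompliance (no one assigned to control receives the treatment) and that assignment affects outcomes only through receipt of the treatment. Under those assumptions the effect on the treated equals the ITT divided by the take-up rate. The function name and data are hypothetical.

```python
from statistics import mean


def itt_and_tot(outcomes, assigned, received):
    """Return (ITT, take-up rate, TOT) for a two-arm experiment.

    Assumes one-sided noncompliance and that assignment influences the
    outcome only through actual receipt of the treatment.
    """
    y_assigned = [y for y, z in zip(outcomes, assigned) if z == 1]
    y_control = [y for y, z in zip(outcomes, assigned) if z == 0]
    itt = mean(y_assigned) - mean(y_control)
    # Share of the assigned group that actually received the treatment.
    take_up = mean([d for d, z in zip(received, assigned) if z == 1])
    tot = itt / take_up  # ITT rescaled by the take-up rate
    return itt, take_up, tot


# Hypothetical data: 8 officials invited to ethics training, 4 attend;
# the outcome is willingness to report misconduct on a 0-10 scale.
assigned = [1] * 8 + [0] * 8
received = [1, 1, 1, 1, 0, 0, 0, 0] + [0] * 8
outcomes = [8, 8, 8, 8, 4, 4, 4, 4] + [4] * 8
itt, take_up, tot = itt_and_tot(outcomes, assigned, received)
```

Here the ITT is 2 points while the effect among those trained is 4 points, because only half of the invited officials attended. Which estimand matters depends on whether the policy question concerns offering the intervention or receiving it.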

Finally, causal inference in corruption research must contend with ethical limitations. It is neither feasible nor desirable to induce actual corrupt acts in experimental settings (Fontaine et al., Reference Fontaine, Milán, Hernández-Luis, Peters and Fontaine2022). Thus, researchers must rely on indirect measures – vignettes, hypothetical scenarios, and behavioural proxies – which, although insightful, limit the strength of causal claims. Creative research designs, such as list experiments, endorsement experiments, and incentivized behavioural games, can help mitigate this limitation while preserving ethical integrity.

In conclusion, the strength of experimental research in public administration depends not only on design but also on careful and transparent inference. Advances in statistical methods and experimental ethics now enable public administration scholars to make credible claims about the causes of corruption and unethical behaviour and their possible solutions. However, this requires methodological humility, transparency in reporting, and sustained dialogue between theory, evidence, and practice.

Readers seeking accessible introductions to the statistical and inferential concepts referenced in this section may consult standard methodological texts widely used in experimental and applied research. Gerber and Green (Reference Gerber and Green2012), Angrist and Pischke (Reference Angrist and Pischke2009), and Imbens and Rubin (Reference Imbens and Rubin2015) provide clear and authoritative discussions of causal inference, experimental design, and estimation strategies that are applicable across disciplines. For perspectives more closely aligned with behavioural and public administration research, Margetts (Reference Margetts2011) and Olsen et al. (Reference Olsen, Hjorth, Harmon and Barfort2019) offer accessible treatments of experimental methods and behavioural approaches in public sector contexts.

2.5 Mathematical Models for Causal Inference in Survey Experiments

The growing use of experimental methods in public administration – as outlined in Section 2.1 – rests on solid mathematical and statistical foundations that allow scholars to identify causal effects with transparency and rigour. Whether researchers deploy RCTs, natural experiments, A/B testing, or field experiments, all share a common formal logic rooted in the Rubin Causal Model (RCM), also known as the Potential Outcomes Framework (Imbens and Rubin, Reference Imbens and Rubin2008; Rubin, Reference Rubin2005).

At the core of the Potential Outcomes Framework is the fundamental “missing data problem” of causal inference: for any given unit, it is impossible to observe both the outcome under treatment and the outcome under control at the same time. Causal effects are therefore defined as comparisons between potential outcomes that are, by definition, partially unobservable. Experimental designs address this problem through random assignment, which ensures that – on average – the units assigned to the treatment and control conditions are statistically equivalent prior to the intervention. As a result, observed outcomes in the control group provide a valid counterfactual for what would have happened to treated units in the absence of treatment, and vice versa. In this sense, experiments do not eliminate the missing data problem at the individual level, but they solve it at the group level by permitting unbiased estimation of average causal effects.

Formally, causal effects are comparisons between potential outcomes: what would happen to an individual or administrative unit under treatment versus control. Let $Y_{i}(1)$ and $Y_{i}(0)$ represent the potential outcomes for unit $i$ under treatment and control, respectively. The individual-level treatment effect is:

$τ_{i} = Y_{i}(1) - Y_{i}(0)$

Since both outcomes are never simultaneously observable, researchers aim to estimate the ATE:

$ATE = E[Y_{i}(1) - Y_{i}(0)]$

This causal estimand is identifiable under specific conditions, particularly when treatment is assigned randomly. This is the case in RCTs and survey experiments, which offer the highest internal validity. In these designs, causal inference is straightforward:

$\hat{τ} = E[Y_{i} \mid D_{i} = 1] - E[Y_{i} \mid D_{i} = 0]$

where $D_{i} \in \{0, 1\}$ is the treatment indicator. This difference in means can also be estimated through a simple regression model:

$Y_{i} = α + τ D_{i} + ε_{i}$

where $τ$ captures the causal effect of the treatment on the outcome of interest (e.g., ethical sensitivity, willingness to report misconduct, or corruption tolerance), and $ε_{i}$ is an idiosyncratic error term.
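The equivalence between the difference in means and the regression coefficient on a binary treatment indicator can be checked numerically. The sketch below uses only the closed-form OLS solution for a single regressor; the function name and the eight outcome values are hypothetical.

```python
from statistics import mean


def ols_intercept_slope(y, d):
    """Closed-form OLS fit of y = alpha + tau * d for one regressor d."""
    y_bar, d_bar = mean(y), mean(d)
    cov = sum((di - d_bar) * (yi - y_bar) for yi, di in zip(y, d))
    var = sum((di - d_bar) ** 2 for di in d)
    tau = cov / var
    alpha = y_bar - tau * d_bar
    return alpha, tau


# Hypothetical outcomes (e.g., willingness to report misconduct, 0-10).
y = [3, 4, 5, 4, 6, 7, 8, 7]
d = [0, 0, 0, 0, 1, 1, 1, 1]

alpha_hat, tau_hat = ols_intercept_slope(y, d)
treated_mean = mean([yi for yi, di in zip(y, d) if di == 1])
control_mean = mean([yi for yi, di in zip(y, d) if di == 0])
diff_in_means = treated_mean - control_mean
```

With a binary $D_{i}$, the estimated intercept equals the control-group mean and the estimated slope equals the difference in means, so the two approaches return the same point estimate.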

This framework directly underpins the survey experiments and RCTs described in Section 2.1. For example, in a randomized survey experiment where public servants receive different vignettes about peer behaviour, the treatment assignment $D_{i}$ is exogenous by design. As long as random assignment and the Stable Unit Treatment Value Assumption (SUTVA), which rules out interference between units, hold, $τ$ can be interpreted as a credible causal effect (Boesche, Reference Boesche2022).

2.5.1 Linking to Experimental Types

Each experimental method described in Section 2.1 has implications for mathematical specification and identification:

Randomized controlled trials (RCTs): These are the gold standard, with randomization ensuring independence between treatment status and potential outcomes. Linear regression or logistic models are commonly used, and covariate adjustment can improve precision.
Natural experiments: While the Rubin framework still applies, identification hinges on as-if random exposure to treatment (e.g., institutional reform, policy shock). Estimation may require quasi-experimental techniques such as:
- Difference-in-differences (DiD):
  $Δ_{DiD} = (Y_{T,\text{post}} - Y_{T,\text{pre}}) - (Y_{C,\text{post}} - Y_{C,\text{pre}})$
- Instrumental variables (IV) and RDD, which call for strong assumptions for valid counterfactual identification.
A/B testing: This is mathematically identical to RCTs but applied in digital environments (e.g., different messages on a government platform). The logic remains the same. Often analysed using online experimentation platforms, A/B tests require attention to sample size, randomization integrity, and outcome metrics (e.g., click-through rate, completion rate).
Field experiments: These also follow the RCM framework but are implemented in real-world bureaucratic contexts. They often involve cluster randomization, where the unit of analysis is not the individual but the organization (e.g., department, agency). In such cases, multi-level modelling is necessary:
$Y_{i j} = α_{j} + τ D_{i j} + ε_{i j}$, with $α_{j} \sim N(μ, σ^{2})$

This specification accounts for intra-cluster correlation, which is critical to avoid overestimating precision in public administration settings with nested organizational structures (Esarey and Menger, Reference Esarey and Menger2019).
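The cost of ignoring intra-cluster correlation can be approximated with the standard design-effect formula, $1 + (m - 1)ρ$, where $m$ is the cluster size and $ρ$ the intra-cluster correlation. The sketch below assumes equal cluster sizes; the function names and the figures (40 agencies of 25 officials, $ρ = 0.10$) are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return n_total / design_effect(cluster_size, icc)


# Hypothetical field experiment: 40 agencies with 25 officials each.
n_agencies, officials_per_agency, icc = 40, 25, 0.10
deff = design_effect(officials_per_agency, icc)
n_eff = effective_sample_size(n_agencies * officials_per_agency,
                              officials_per_agency, icc)
```

In this illustration 1,000 surveyed officials carry roughly the information of 294 independent respondents, which shows why treating clustered observations as independent overstates precision.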

2.5.2 Moderation, Heterogeneity, and Multi-level Dynamics

Public administration scholars are increasingly interested not only in estimating average effects but also in understanding HTEs or, in other words, how different types of bureaucrats (e.g., by seniority, agency culture, or sector) respond differently to the same intervention (Doberstein, Reference Doberstein2017):

$Y_{i} = α + τ D_{i} + γ X_{i} + δ (D_{i} \times X_{i}) + ε_{i}$

Here, $X_{i}$ is a moderator (e.g., years of service) and $δ$ captures differential responsiveness to treatment. Such models are critical in administrative systems where discretion and politicization vary across units.
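For the special case of a binary moderator, the coefficients of the interaction model can be recovered directly from the four cell means, which makes the interpretation of $δ$ transparent. The sketch below assumes $X_{i} = 1$ for senior officials and $X_{i} = 0$ for junior officials; the function name and data are hypothetical. A continuous moderator such as years of service would require a regression routine instead.

```python
from statistics import mean


def saturated_interaction(y, d, x):
    """Coefficients of y = a + t*d + g*x + delta*(d*x) for binary d and x.

    With two binary regressors the model is saturated, so OLS reproduces
    the four cell means exactly.
    """
    cells = {}
    for yi, di, xi in zip(y, d, x):
        cells.setdefault((di, xi), []).append(yi)
    m = {key: mean(values) for key, values in cells.items()}
    alpha = m[(0, 0)]                        # control mean, junior officials
    tau = m[(1, 0)] - m[(0, 0)]              # treatment effect among juniors
    gamma = m[(0, 1)] - m[(0, 0)]            # senior-junior gap under control
    delta = (m[(1, 1)] - m[(0, 1)]) - tau    # extra effect among seniors
    return alpha, tau, gamma, delta


# Hypothetical corruption-tolerance scores after exposure to an ethics code.
y = [5, 5, 3, 3, 6, 6, 6, 6]
d = [0, 0, 1, 1, 0, 0, 1, 1]
x = [0, 0, 0, 0, 1, 1, 1, 1]
alpha, tau, gamma, delta = saturated_interaction(y, d, x)
```

In this toy example the treatment lowers tolerance by 2 points among junior officials ($τ = -2$) and has no effect among senior officials, so $δ = 2$ offsets the baseline effect entirely.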

In field or clustered designs, random intercept or random slope models permit partial pooling of information across groups, improving inference in underpowered samples while accounting for institutional hierarchy.

2.5.3 Implications for Theory and Policy

Mathematical formalization enhances transparency, replicability, and cumulative theory-building in public administration. It enables researchers to clarify assumptions, define precise estimands, and connect theory with data. In policy terms, it allows governments to test interventions (e.g., ethics training, transparency nudges) in pilot phases prior to full-scale implementation, improving evidence-informed reform.

Importantly, each experimental design outlined in Section 2.1 maps onto specific mathematical strategies:

RCTs and survey experiments → Randomization inference, regression
Natural experiments → DiD, IV, RDD
A/B testing → Binary outcome models with online metrics
Field experiments → Multi-level models, ITT, and covariate adjustment

By anchoring these methods in the Rubin framework, this Element contributes to more causally oriented, experimental, and analytically rigorous studies of public administration, a field where complexity must be met with both ethical sensitivity and methodological precision.

The relationship between experimental designs and statistical estimation strategies should be understood as conditional rather than deterministic. Experimental designs – such as survey experiments, field experiments, laboratory experiments, or policy-based A/B tests – define how treatment is assigned and what sources of variation are available. By contrast, mathematical models and estimators correspond to the causal estimand of interest and the structure of the data, rather than to the experimental format per se. Simple difference-in-means estimators, regression adjustment, multi-level models, DiD, or IV may all be appropriate within the same experimental design, depending on whether researchers seek ATEs, conditional effects, ITT effects, or leverage pre-treatment measures.

For example, A/B tests are not analytically distinct from other randomized experiments in terms of estimators, nor are field experiments inherently associated with multi-level models or ITT estimands. Likewise, quasi-experimental strategies such as DiD or RDDs reflect identification strategies rather than experimental ‘types’ and may be implemented using linear regression frameworks across a wide range of designs. In this sense, experimental designs constrain but do not dictate the choice of mathematical strategy. Estimation should therefore be guided by the causal question, the available variation, and the data structure, rather than by a one-to-one mapping between experimental formats and statistical models.
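The point that DiD can be implemented within a linear regression framework can be stated explicitly. As a sketch for the simplest case of two groups and two periods, with $T_{i}$ indicating membership of the treated group and $Post_{t}$ the post-intervention period (both indicator names are illustrative), the DiD estimate is the coefficient on the interaction term:

```latex
Y_{it} = \alpha + \beta\, T_{i} + \lambda\, Post_{t}
       + \Delta_{DiD}\,(T_{i} \times Post_{t}) + \varepsilon_{it}
```

In this two-by-two case the OLS estimate of $\Delta_{DiD}$ equals the difference between the pre-post change in the treated group and the pre-post change in the comparison group. Its causal interpretation still rests on the parallel-trends assumption, which is a matter of design rather than estimation.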

3 Research Protocols for Different Survey Experiments

3.1 Survey Experiments for Public Administration

Survey experiments are a powerful methodological tool for examining attitudes, beliefs, and behavioural intentions in public administration. They combine the benefits of controlled experimental manipulation with the ability to reach large and diverse populations through survey instruments. This hybrid approach is particularly useful when studying sensitive topics such as corruption, where direct observation or administrative data may be unavailable or ethically problematic.

Survey experiments encompass a range of design types that differ in complexity, inferential leverage, and suitability for specific research questions. Making these design choices explicit is particularly important in public administration research, where ethical sensitivity, organizational realism, and respondent burden must be carefully balanced. This section therefore distinguishes between several commonly used survey experimental designs – single-factor experiments, factorial designs, conjoint analysis, and list experiments – and discusses their respective advantages and limitations for studying corruption and unethical behaviour in bureaucratic settings.

As Table 5 illustrates, survey experiments are not a monolithic approach but encompass a diverse set of designs that vary in complexity, inferential leverage, and suitability for different research objectives. Single-factor experiments offer analytical clarity and low respondent burden, making them well suited for isolating specific causal mechanisms, while factorial designs permit the examination of interaction effects that are central to many theoretical frameworks in corruption research. Conjoint experiments provide greater contextual realism by capturing multidimensional decision-making processes, particularly in settings where bureaucrats face complex trade-offs. List experiments, in turn, address one of the most persistent challenges in corruption research – social desirability bias – by permitting indirect measurement of sensitive attitudes or behaviours, albeit at the cost of reduced statistical power and limited insight into individual-level mechanisms. Taken together, these design options underscore the importance of aligning theoretical expectations, ethical considerations, and practical constraints when selecting survey experimental strategies to study integrity and corruption in public administration.

Table 5 Survey Experimental Designs in Public Administration

Design type	Core characteristics	Advantages	Limitations	Typical applications in corruption research
Single-factor experiments	One treatment dimension manipulated across conditions (e.g., presence vs. absence of monitoring)	Simple to implement; easy to interpret; low cognitive burden for respondents	Limited ability to test interactions; may oversimplify complex decision contexts	Testing the effect of a single incentive, sanction, or ethical cue
Factorial designs (e.g., 2×2)	Multiple treatment dimensions manipulated simultaneously, permitting estimation of interaction effects	Permits testing of conditional effects and theoretical interactions; efficient use of samples	Increased design complexity; higher respondent burden; interpretation more demanding	Examining how incentives interact with norms or monitoring with organizational culture
Conjoint experiments	Respondents evaluate multidimensional profiles composed of randomly varied attributes	High contextual realism; well suited for complex trade-offs; estimates marginal effects of attributes	Cognitively demanding; less suitable for causal narratives or process tracing	Studying trade-offs in recruitment, promotion, procurement, or ethical dilemmas involving multiple attributes
List experiments (item count technique)	Indirect measurement using item counts to mask a sensitive item in the treatment group	Reduces social desirability bias; suitable for illicit or stigmatized behaviours	Lower statistical power; strong assumptions about comprehension and design; limited insight into individual-level mechanisms	Estimating prevalence of tolerance for bribery, rule violations, or other unethical practices

Source: Compiled by the author.
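The estimator behind the list experiment in Table 5 is a difference in mean item counts. The sketch below assumes that adding the sensitive item does not change how respondents answer the baseline items and that respondents answer the sensitive item truthfully under the protection of the count format. The function name and the counts are hypothetical.

```python
from statistics import mean


def list_experiment_prevalence(control_counts, treatment_counts):
    """Estimated share endorsing the sensitive item.

    The control group sees J baseline items and the treatment group sees
    the same J items plus the sensitive one, so the gap in mean counts
    estimates the prevalence of the sensitive attitude or behaviour.
    """
    return mean(treatment_counts) - mean(control_counts)


# Hypothetical item counts: 3 baseline items (control) vs. 3 + 1 (treatment).
control_counts = [1, 2, 1, 2, 1, 2, 2, 1]
treatment_counts = [2, 2, 1, 2, 2, 2, 2, 1]
prevalence = list_experiment_prevalence(control_counts, treatment_counts)
```

The estimate (here 0.25) is defined only at the group level, which is why list experiments offer limited insight into individual-level mechanisms. Because the variance of the baseline items enters the estimate, larger samples are needed than for a direct question.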

In the context of public administration, survey experiments allow researchers to assess how bureaucrats and citizens respond to specific cues, information treatments, or hypothetical scenarios that mirror real-life dilemmas. By randomly assigning respondents to different conditions and measuring their responses to these manipulations, scholars can isolate the causal effects of particular frames, incentives, norms, or contextual variables on outcomes of interest such as ethical decision-making, perceptions of integrity, or willingness to report misconduct.

A growing body of literature has demonstrated the usefulness of survey experiments in this field. For instance, vignette-based experiments can explore how bureaucrats respond to ambiguous ethical dilemmas in varying institutional or cultural contexts (Christensen and Wright, Reference Christensen and Wright2018; Clifford et al., Reference Clifford, Sheagley and Piston2021). Other designs focus on the role of normative cues – such as exposure to codes of conduct, transparency messages, or peer behaviour – to test whether these interventions shift respondents’ likelihood of condoning or condemning corruption.

The appeal of survey experiments lies in their flexibility and scalability. Researchers can test theoretical mechanisms with large samples at a relatively low cost, tailor instruments to specific administrative contexts, and replicate studies across different regions or bureaucratic levels. Furthermore, survey experiments are particularly well suited for probing belief-attitude gaps or, in other words, the distinction between what individuals claim to believe and how they might behave when confronted with an ethically complex situation.

Nevertheless, the application of survey experiments in public administration research poses several challenges. These include issues of internal validity (e.g., whether the manipulation is salient or credible), external validity (e.g., whether responses reflect real-world behaviour), and measurement (e.g., distinguishing between normative judgements and behavioural intentions). Moreover, bureaucratic respondents may provide socially desirable responses, especially in contexts where institutional trust is fragile or professional norms are ambiguous.

To address these concerns, best practices in survey experiment design emphasize the need to ensure treatment realism and clarity, including manipulation checks to confirm treatment uptake, pre-testing vignettes with target populations, and combining survey data with administrative or behavioural indicators where possible. When implemented rigorously, survey experiments provide valuable insights into the cognitive and normative underpinnings of ethical behaviour in the public sector and offer a robust platform for theory testing and policy design.

While experimental approaches share a common commitment to causal inference, they differ systematically in terms of feasibility, cost, internal and external validity, and contextual realism. Clarifying these trade-offs is particularly important for public administration research, where ethical constraints, organizational access, and resource limitations often shape methodological choices. Table 6 provides a comparative overview of survey experiments, field experiments (including RCTs), and laboratory experiments, highlighting their respective strengths and limitations for studying corruption and unethical behaviour in public administration.

Table 6 Comparison of Experimental Approaches in Public Administration

Dimension	Survey experiments	Field experiments (incl. RCTs)	Laboratory experiments
Practical feasibility	High; can be implemented remotely with minimal organizational disruption	Moderate to low; require access to organizations and cooperation from authorities	High; conducted in controlled research settings
Cost	Relatively low; scalable to large samples	High; costly in terms of coordination, time, and implementation	Moderate; costs mainly related to facilities and recruitment
Internal validity	High; randomization and controlled manipulations	High when implementation fidelity is maintained	Very high due to tight experimental control
External validity	Moderate; depends on sample and scenario realism	Potentially high but sensitive to context and scalability	Often limited due to artificial settings
Ecological validity (contextual realism)	Moderate to high; realistic vignettes can approximate administrative decision contexts	High; interventions occur in real-world settings	Low to moderate; simplified or abstract tasks
Ethical and access constraints	Relatively low; suitable for sensitive topics	High; ethical, legal, and political constraints are common	Low; fewer ethical constraints but limited realism
Typical outcomes measured	Attitudes, judgements, behavioural intentions	Observed behaviour and policy-relevant outcomes	Behaviour under controlled conditions
Suitability for studying corruption	High; well suited for sensitive and illicit behaviours	Limited; direct observation often infeasible	Moderate; useful for theory testing but abstract

Source: Compiled by the author.

3.2 Design of Protocols for Experiments in Public Administration

The research protocol proposed in this section is intended as a structured, high-level framework rather than a rigid step-by-step manual. Its purpose is to guide researchers through the key design decisions involved in conducting survey experiments in public administration, particularly when studying sensitive topics such as corruption and unethical behaviour. Rather than prescribing uniform solutions, the protocol highlights critical decision points, the questions researchers should ask at each stage, and the trade-offs associated with alternative methodological choices. In doing so, it aims to support transparent, theoretically informed, and context-sensitive experimental design.

Developing robust protocols for survey experiments in public administration requires a careful balance between methodological rigour and sensitivity to institutional and political contexts. The credibility of survey experiments rests on transparent design, random assignment, treatment integrity, and ethical safeguards, particularly when addressing topics such as corruption or professional misconduct (Hansen and Tummers, Reference Hansen and Tummers2020; Walker et al., Reference Walker, Brewer, Lee, Petrovsky and Van Witteloostuijn2019).

The choice among different survey experimental designs has direct implications for research protocols in public administration. Single-factor experiments are particularly suitable when the objective is to isolate a clearly defined causal mechanism under minimal cognitive load. Factorial designs are preferable when theoretical frameworks posit interaction effects between incentives, norms, and contextual constraints. Conjoint experiments, by contrast, are most appropriate when decision-making involves multidimensional trade-offs that closely resemble real administrative choices. Explicitly aligning design selection with theoretical expectations and practical constraints is therefore a central component of rigorous survey experimental protocols.

A typical protocol for a survey experiment in public administration includes the following key components:

Research Question and Hypotheses
Protocols must begin with a well-defined causal question grounded in theory: for example, Does exposure to a professional ethics code reduce the acceptability of nepotism among civil servants? Hypotheses should specify the expected direction and mechanism of the treatment effect.
Sample and Recruitment Strategy
Identifying the appropriate sample is crucial. In public administration research, this often involves civil servants, public managers, or frontline bureaucrats. Depending on the study, researchers may recruit participants via institutional partnerships (e.g., government agencies), professional networks, civil service databases, or online platforms using screening criteria. Stratified sampling may be necessary to ensure representativeness across bureaucratic levels, regions, or policy sectors.
Experimental Treatments
Treatments in survey experiments typically take the form of vignettes, normative primes, or information cues. For example, participants might read a scenario where a public official is offered a gift and be randomly assigned to a version of the scenario that includes or excludes institutional consequences or peer norms. Treatments should be designed to vary only in the dimension of theoretical interest and must be pre-tested for clarity, realism, and salience.
Outcome Measures
Outcome variables should align with the causal mechanism of interest. These can include attitudinal outcomes (e.g., support for sanctioning corruption), behavioural intentions (e.g., willingness to report misconduct), and cognitive assessments (e.g., perceived seriousness of an ethical violation). Where possible, researchers should incorporate behavioural proxies (e.g., the anonymized opportunity to donate to an anti-corruption fund) or link responses to administrative behaviour over time.
Randomization and Control
Treatment assignment must be random and documented. Most commonly, this is achieved through computerized randomization algorithms embedded in online survey software (e.g., Qualtrics, SurveyMonkey). Blocking and stratification may be used to ensure balance across relevant covariates such as gender, tenure, or agency type.
Ethical and Institutional Review
Given the sensitive nature of corruption research, protocols should undergo thorough ethical review, including informed consent procedures, anonymity guarantees, and debriefing strategies. Collaborations with public institutions should be formalized through data-sharing agreements, confidentiality protocols, and mechanisms to avoid any coercion of respondents.
Pilot Testing and Validity Checks
Before full deployment, all experimental instruments should be piloted with a small subsample to assess treatment comprehension, survey flow, and potential response biases. Manipulation checks should be included to ensure that treatments are cognitively processed as intended. Attention checks can also improve data quality by identifying inattentive respondents.
Analysis Plan and Causal Inference
A clear analytic strategy should be specified in advance, ideally through a pre-analysis plan or registration (e.g., AEA RCT Registry, OSF). Estimands should be defined (e.g., the average treatment effect, ATE), and appropriate estimation and inference strategies (e.g., OLS, logistic regression, covariate adjustment, randomization inference) identified. Researchers should also plan for robustness checks, multiple hypothesis corrections, and subgroup analysis where relevant.
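To make the estimand and inference step concrete, the sketch below estimates an average treatment effect as a difference in means and obtains a two-sided p-value by randomization inference, that is, by reshuffling the treatment labels under the sharp null of no effect for any respondent. The data are hypothetical 1–7 seriousness ratings, and the function names are illustrative rather than drawn from any particular package.

```python
import random
from statistics import mean


def diff_in_means(outcomes, treated):
    """Difference between mean outcomes of treated and control units."""
    treated_y = [y for y, d in zip(outcomes, treated) if d]
    control_y = [y for y, d in zip(outcomes, treated) if not d]
    return mean(treated_y) - mean(control_y)


def randomization_inference(outcomes, treated, n_perm=5000, seed=1):
    """Return the observed estimate and a two-sided permutation p-value."""
    rng = random.Random(seed)
    observed = diff_in_means(outcomes, treated)
    labels = list(treated)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # re-draw an assignment with the same arm sizes
        simulated = diff_in_means(outcomes, labels)
        # Small tolerance so that exact ties are counted despite float rounding.
        if abs(simulated) >= abs(observed) - 1e-12:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_perm


# Hypothetical data: six treated and six control respondents.
treated = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
outcomes = [6, 5, 7, 6, 5, 6, 4, 5, 3, 4, 5, 4]
ate_hat, p_value = randomization_inference(outcomes, treated)
```

The reshuffling here assumes simple complete randomization; if assignment was blocked, the labels should be permuted within blocks so that the reference distribution mirrors the actual design.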

Survey experimental protocols can be customized to reflect the unique challenges of the public sector (Bozeman and Scott, 1992). For instance, institutional culture, legal frameworks, and civil service hierarchies may affect both treatment perception and survey response. Hence, protocol development should be iterative and dialogical, incorporating feedback from practitioners, pre-testing in institutional settings, and adjustment for context-specific constraints.

Taken together, the protocol summarized in Table 7 is intended to function as a decision-oriented framework rather than a prescriptive, step-by-step guide. Each component of the protocol highlights a set of critical design choices that researchers must address when conducting survey experiments in public administration, particularly in the study of corruption and unethical behaviour. Rather than offering uniform solutions, the protocol emphasizes the trade-offs associated with alternative methodological options such as simplicity versus representativeness in sampling, feasibility versus contextual realism in recruitment strategies, or experimental control versus ecological validity in treatment design. By organizing these choices around key questions and their implications, the protocol encourages researchers to align theoretical expectations, ethical considerations, and practical constraints in a transparent and reflexive manner. In this sense, Table 7 should be read not as a checklist to be mechanically followed, but as an analytical tool that supports deliberate and context-sensitive experimental design.

Table 7 Research Protocol for Survey Experiments in Public Administration

Protocol component	Key design questions	Main options	Trade-offs and implications
Target population & sampling	Who is the population of interest? Is heterogeneity theoretically relevant?	Simple random sampling; stratified sampling	Simplicity vs. representativeness; statistical efficiency vs. design complexity
Sample size & statistical power	What effect sizes are theoretically meaningful? Is the study sufficiently powered?	A priori power analysis; minimum detectable effects	Precision vs. feasibility; treatment complexity vs. statistical power
Recruitment strategy	How will participants be accessed? What institutional or ethical constraints exist?	Online platforms; institutional partnerships	Cost and speed vs. contextual realism and external validity
Randomization procedure	Are balance concerns salient? Is sample size limited?	Simple randomization; block/stratified randomization	Ease of implementation vs. improved balance and statistical precision
Experimental design choice	Is the focus on main effects or interactions? Are decisions multidimensional?	Single-factor, factorial, conjoint, list experiments	Interpretability vs. realism; power vs. respondent burden
Control group design	Should the control condition reflect absence of treatment or status quo practice?	Placebo control; business-as-usual control	Internal validity vs. realism; interpretability vs. ethical transparency
Treatment intensity & realism	How strong should the manipulation be? How realistic should the scenario appear?	Subtle vs. explicit treatments; vignette complexity	Detectability vs. realism; demand effects vs. weak treatments
Outcome measurement	Are attitudes, intentions, or behaviours being measured?	Self-reported judgements; indirect measures	Feasibility vs. behavioural validity; sensitivity vs. measurement error
Ethical safeguards	How are risks, sensitivities, and confidentiality managed?	Anonymity; indirect questioning; informed consent	Data quality vs. participant protection
Pre-registration	What elements should be specified ex ante?	Hypotheses; power calculations; measurement strategies	Analytical flexibility vs. credibility and research integrity
Data management & reporting	How will transparency and replicability be ensured?	Data/code disclosure; deviation reporting	Openness vs. confidentiality and legal constraints

Source: Compiled by the author.

In addition to core design decisions, the protocol explicitly incorporates components that are central to the credibility and cumulative value of survey experimental research. Considerations of sample size and statistical power are treated as integral to design rather than as post hoc technical checks, underscoring the importance of a priori power analysis when determining feasible effect sizes and treatment complexity. The protocol further distinguishes between placebo and business-as-usual control conditions, highlighting their implications for causal interpretation and ethical transparency. Decisions regarding treatment intensity are framed as a balance between detectability and realism, which is particularly salient in sensitive research domains such as corruption. Moreover, the protocol treats pre-registration as a normative standard for rigorous research, emphasizing the specification of hypotheses, power calculations, and measurement strategies prior to data collection. Finally, data management and reporting practices – including data and code disclosure, documentation of design decisions, and transparent reporting of deviations from pre-registered plans – are incorporated as essential components of responsible and replicable survey experimental research in public administration.
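The a priori power analysis mentioned above can be sketched with a standard normal approximation. The example assumes a two-arm comparison of means, equal arm sizes, a two-sided test, and an effect expressed in standard deviation units; the function names and default values are illustrative choices, not recommendations from the protocol.

```python
import math
from statistics import NormalDist


def n_per_arm(effect_size_d, alpha=0.05, power=0.80):
    """Approximate respondents needed per arm to detect a standardized effect."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size_d) ** 2)


def minimum_detectable_effect(n_arm, alpha=0.05, power=0.80):
    """Smallest standardized effect detectable with n_arm respondents per arm."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * math.sqrt(2 / n_arm)


small_effect_n = n_per_arm(0.2)    # a small effect of 0.2 standard deviations
medium_effect_n = n_per_arm(0.5)   # a medium effect of 0.5 standard deviations
mde_at_250 = minimum_detectable_effect(250)
```

Under these assumptions a small effect of 0.2 standard deviations requires roughly 393 respondents per arm, against 63 for an effect of 0.5. Because the requirement applies to every arm, each additional treatment condition adds to the total sample needed, which is the trade-off between treatment complexity and statistical power noted in Table 7.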

Ultimately, well-designed survey experiments can play a pivotal role in advancing experimental research in public administration. They enable researchers to generate credible evidence about how bureaucrats respond to integrity dilemmas, how institutional reforms shape ethical reasoning, and how citizens interpret and react to public service behaviour. As part of a broader experimental toolkit, survey-based designs deepen our understanding of the moral and strategic calculations that underpin governance and help bridge the gap between administrative theory and policy practice (Peters and Guedes-Neto, 2020).

To illustrate how the protocol outlined earlier can be operationalized in practice, the Appendix provides an example of a survey experimental design structured according to these components. The Appendix is intended as a pedagogical illustration of best practices in experimental public administration research rather than as a pre-registration for a specific empirical study.

4 Research Agendas for Experiments Using Surveys on Corruption in the Public Sector

4.1 Systematic Issues and Implications in Experiments on Corruption in Public Administration

As the use of experiments in public administration gains momentum, several systematic issues continue to shape the reliability, ethical robustness, and scientific contribution of survey-based studies of corruption (Clifford et al., 2021). While experimental methods promise significant advantages for causal inference, their implementation in the public sector calls for attention to structural, contextual, and epistemological complexities (Barr et al., 2009).

This section builds directly on the empirical gaps identified in the preceding literature review and translates them into a set of systematic challenges for experimental research on corruption in public administration. While Section 2.2 highlighted recurring limitations in existing studies – such as reliance on hypothetical scenarios, conflation of beliefs, attitudes, and behaviour, and a limited use of comparative and context-sensitive designs – the issues discussed here address why these patterns persist and how they constrain cumulative knowledge-building. Importantly, these systematic challenges are not merely descriptive shortcomings but design-relevant problems that motivate the protocol-oriented guidance developed in Section 3.2. By explicitly linking empirical gaps to concrete design choices – such as sampling strategies, treatment construction, outcome measurement, and transparency practices – this section clarifies how future survey experiments can move beyond ad hoc solutions towards more theoretically integrated and methodologically robust research agendas. Several of the issues discussed below correspond directly to specific protocol components outlined in Section 3.2, underscoring the importance of treating experimental design choices as cumulative and theory-driven rather than study-specific decisions.

First, a recurring issue is the gap between perceived and actual behaviour (Garrido-Vergara, 2024; Garrido-Vergara and Quijada Donaire, 2025). Survey experiments often rely on hypothetical or attitudinal responses to vignettes, which may not fully capture how public officials behave in real-world conditions (Hansen and Tummers, 2020; Peters and Guedes-Neto, 2020). This problem is exacerbated in contexts where social desirability bias is high, legal enforcement is weak, or professional norms are poorly institutionalized. While self-administered surveys can reduce some forms of bias, they cannot eliminate it altogether. Future studies should therefore seek to integrate survey experiments with behavioural validation techniques such as audit studies, administrative data linkage, or embedded decision tasks.
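One indirect-questioning option listed in Table 7, the list experiment, offers a partial response to social desirability bias. Control respondents report how many of several innocuous items apply to them, treated respondents see the same list plus one sensitive item, and the difference in mean counts estimates the prevalence of the sensitive item. The sketch below uses hypothetical counts and assumes that adding the sensitive item does not change answers to the baseline items and that respondents answer truthfully about the item count.

```python
import math
from statistics import mean, variance


def list_experiment_estimate(control_counts, treatment_counts):
    """Estimate sensitive-item prevalence as a difference in mean item counts."""
    estimate = mean(treatment_counts) - mean(control_counts)
    # Standard error for a difference in means with unequal variances.
    standard_error = math.sqrt(
        variance(treatment_counts) / len(treatment_counts)
        + variance(control_counts) / len(control_counts)
    )
    return estimate, standard_error


# Hypothetical data: number of items each respondent says applies to them.
control_counts = [1, 2, 2, 3, 1, 2, 2, 1, 3, 2]    # four baseline items
treatment_counts = [2, 2, 3, 3, 1, 3, 2, 2, 3, 2]  # baseline items plus the sensitive item
prevalence, standard_error = list_experiment_estimate(control_counts, treatment_counts)
```

Because the estimate is a difference between two noisy means, list experiments typically need larger samples than direct questions, which bears on the power considerations in the protocol.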

Second, ethical challenges loom large in the experimental study of corruption. Researchers face dilemmas about how to ethically simulate unethical behaviour without inducing actual harm, stigmatizing participants, or violating institutional protocols (Monnery and Chirat, 2024). This calls for the careful design of vignettes and treatment frames, the use of anonymized and low-risk outcome measures, and close collaboration with ethical review boards. Furthermore, in politically sensitive environments, survey experiments may risk triggering distrust or resistance among participants, especially when framed around misconduct or institutional failure.

Third, there are contextual limitations to external validity. The effects of experimental treatments often depend on local administrative cultures, historical legacies, and the broader institutional environment (Hansen and Tummers, 2020; Meyer-Sahling et al., 2019). An integrity nudge that works in one country or agency may not yield similar effects elsewhere. Comparative and cross-national designs are thus essential to develop generalizable insights. However, they also raise practical issues regarding linguistic equivalence, the cultural adaptation of scenarios, and the harmonization of outcome measures.

Fourth, public administration experiments face logistical constraints, particularly in low- and middle-income countries (Bertelli et al., 2020). These include limited access to sampling frames, the reluctance of bureaucracies to authorize interventions, and digital or literacy barriers that limit online survey deployment. Building institutional trust, establishing long-term collaborations with public agencies, and employing mixed-methods strategies can help mitigate these challenges.

Fifth, it is necessary to strengthen replication and transparency in the field (Walker et al., 2019). Pre-registration of hypotheses and analysis plans, open sharing of instruments and datasets, and collaborative replication initiatives remain relatively underdeveloped in public administration compared to fields like economics or political science. This limits cumulative knowledge and increases the risk of publication bias.

Taken together, the issues discussed in this section point to a set of concrete priorities for future experimental research on corruption in public administration. In particular, survey experiments should be designed to differentiate clearly between beliefs about corruption, normative attitudes, and behavioural intentions, and to situate these outcomes within explicitly defined institutional constraints. Greater attention is also required in the construction and calibration of treatments so that experimental manipulations capture realistic organizational conditions rather than abstract or purely hypothetical scenarios. In addition, sampling and recruitment strategies should be carefully aligned with the theoretical scope of the research question, especially in comparative and cross-institutional contexts. Finally, transparency practices – such as pre-registration, systematic reporting of design choices, and explicit documentation of limitations – should be treated as integral components of experimental research rather than optional enhancements. These priorities outline a practical pathway towards more cumulative, theory-driven, and policy-relevant survey experimental research in public administration.

4.2 Towards an Experimental Public Administration: Relevant Research Topics

A future-oriented research agenda in experimental public administration research, particularly on corruption, must focus on both refining methodological tools and expanding the substantive questions addressed by survey experiments. Below are five priority areas for future investigation:

1. The Ethics-Performance Nexus in Bureaucracies (Garrido-Vergara, 2024; Garrido-Vergara and Quijada Donaire, 2025; Mendy, 2023; Pliscoff-Varas, 2019)
How does exposure to ethical norms or anti-corruption training influence public servants’ performance, motivation, or organizational loyalty? Experimental studies could test whether appeals to integrity crowd in or crowd out intrinsic motivation and how these effects vary across professional roles, tenure, and institutional trust.
2. Social Norms, Peer Effects, and Integrity (De Graaf, 2010; Isaeva et al., 2025; Menzel, 2015)
Understanding how bureaucrats respond to cues about peer behaviour is central to designing effective interventions. Future experiments should explore the impact of descriptive norms (what others do) versus injunctive norms (what others approve of) on ethical decision-making, and whether these effects are moderated by hierarchy, proximity, or sector.
3. Political Interference and Discretionary Abuse (Blaesser, 1994; Mashaw, 1995; Shabangu et al., 2023)
Survey experiments can probe how political influence affects ethical reasoning at different levels of public administration. For instance, do public officials alter their willingness to report misconduct when told that the perpetrator is politically connected? Such studies could help distinguish between individual-level ethics and systemic constraints on integrity.
4. Citizen-Bureaucrat Interactions and Trust (De Boer, 2023; Pepinsky et al., 2017; Wang et al., 2024)
Experiments involving both citizens and bureaucrats can reveal the reciprocal dynamics of corruption perception and behaviour. How do citizens respond when told that most public servants act with integrity? Conversely, how do public officials react to citizen distrust or demands for bribes in hypothetical interactions? This line of inquiry connects behavioural ethics with democratic legitimacy.
5. Administrative Reforms and Organizational Change (Brunsson and Olsen, 2018; Durant, 2008; Greve et al., 2016)
Finally, survey experiments can evaluate the causal effects of structural reforms, such as merit-based hiring, performance audits, and transparency laws, on bureaucratic norms and beliefs. By embedding hypothetical reform scenarios in vignettes, researchers can assess whether such policies are perceived as credible, fair, and effective deterrents to misconduct.

Beyond these topics, experimental public administration studies should embrace intersectional approaches, examining how gender, race, social origin, and bureaucratic rank influence susceptibility to unethical behaviour and responsiveness to interventions. They should also invest in multi-method triangulation, blending survey experiments with ethnographic observation, qualitative interviews, and process tracing (Brower et al., 2000).

Lastly, a forward-looking agenda should focus not only on diagnosing integrity deficits but also on testing solutions. What interventions work, for whom, and under what conditions? The goal is to move from normatively loaded critiques of corruption to empirically grounded, context-sensitive, and politically feasible reform strategies.

Table 8 illustrates how the priority research topics identified in this section can be operationalized through concrete survey experimental designs that directly address the systematic issues highlighted earlier in this Element.

Table 8 Illustrative Survey Experimental Designs for Future Research on Corruption in Public Administration

Research topic	Core causal question	Illustrative experimental design	Systematic issues addressed
Institutional signals and integrity	Do integrity-oriented organizational messages influence ethical behaviour independently of enforcement?	Randomized survey experiment varying exposure to organizational messages (integrity vs. performance vs. compliance), combined with fixed monitoring conditions; outcomes measured separately as beliefs, attitudes, and behavioural intentions	Conflation of beliefs and behaviour; low treatment realism
Social norms and collective expectations	How do perceived peer behaviour and enforcement interact to shape unethical conduct?	Factorial survey experiment manipulating information about peer corruption prevalence and audit probability; separate measurement of descriptive norms and intended behaviour	Lack of norm–behaviour distinction; reliance on abstract scenarios
Discretion, rules, and accountability	How does administrative discretion condition ethical decision-making under accountability constraints?	Vignette-based experiment varying rule clarity (high/low) and accountability mechanisms (internal audit vs. external oversight); behavioural intentions measured under specified constraints	Oversimplified treatments; limited institutional specificity
Hierarchy and authority	Does the source of authority affect compliance with unethical directives?	Survey experiment varying the hierarchical source of an instruction (peer, supervisor, senior management) and detection risk; outcomes disaggregated by beliefs and behavioural intentions	Insufficient attention to organizational context; weak treatment calibration
Comparative and cross-institutional designs	Do institutional constraints operate similarly across administrative contexts?	Harmonized survey experiment implemented across countries or administrative levels with identical treatments but context-specific constraints; comparative estimation of treatment effects	Lack of comparative evidence; misalignment between sampling and theory

Source: Compiled by the author.

Across the examples, a common feature is the explicit alignment between causal questions, experimental manipulations, and outcome measurement, ensuring that each design targets a specific mechanism rather than relying on abstract or purely hypothetical scenarios.

First, the proposed designs emphasize the analytical separation between beliefs, normative attitudes, and behavioural intentions. By measuring these outcomes independently and embedding them within clearly specified institutional constraints, the illustrative experiments respond directly to the tendency in existing research to conflate ethical reasoning with anticipated behaviour. This distinction is particularly evident in designs focusing on institutional signals and social norms, where exposure to organizational messages or peer behaviour information is varied while enforcement conditions are held constant or systematically manipulated.

Second, the examples demonstrate how treatment construction and intensity can be calibrated to reflect realistic organizational settings. Rather than presenting generic ethical dilemmas, the designs incorporate concrete features of public sector environments, such as audit probabilities, hierarchical authority, and accountability mechanisms, thereby enhancing ecological validity while preserving experimental control. This approach addresses one of the core methodological limitations identified in the literature review: over-reliance on simplified vignettes detached from institutional context.

Third, the table highlights the potential of comparative and cross-institutional survey experiments to advance cumulative knowledge. By implementing harmonized experimental designs across administrative levels or national contexts, researchers can assess whether causal mechanisms operate similarly under different institutional arrangements while aligning sampling strategies with the theoretical scope of inference. Such designs directly respond to the lack of comparative evidence noted in the experimental literature on corruption in public administration.

Taken together, the examples summarized in Table 8 demonstrate how future survey experiments can be designed to both test clearly specified causal relationships and overcome persistent methodological challenges. Rather than prescribing a single research strategy, the table illustrates a flexible, protocol-consistent approach that links substantive research questions to explicit design choices, thereby supporting more theory-driven, comparable, and policy-relevant experimental research in public administration.
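As a minimal sketch of how one of the Table 8 designs might be set up, the example below builds the cells of a 2×2 factorial experiment on peer corruption prevalence and audit probability, assigns respondents to cells in equal numbers, and computes the interaction as a difference-in-differences of cell means. The factor levels, the seed, and the cell means are hypothetical and serve only to show the mechanics.

```python
import random
from itertools import product

# Hypothetical factor levels for the social norms design in Table 8.
FACTORS = {
    "peer_corruption": ("rare", "common"),
    "audit_probability": ("low", "high"),
}


def assign_cells(respondent_ids, factors=FACTORS, seed=7):
    """Randomly assign respondents to factorial cells with balanced cell sizes."""
    rng = random.Random(seed)
    cells = list(product(*factors.values()))  # every combination of levels
    ids = list(respondent_ids)
    rng.shuffle(ids)
    return {rid: cells[i % len(cells)] for i, rid in enumerate(ids)}


def interaction_contrast(cell_means):
    """Effect of high audit probability when peer corruption is common,
    minus the same effect when peer corruption is rare."""
    effect_when_common = cell_means[("common", "high")] - cell_means[("common", "low")]
    effect_when_rare = cell_means[("rare", "high")] - cell_means[("rare", "low")]
    return effect_when_common - effect_when_rare


assignment = assign_cells(range(40))

# Hypothetical mean intention to engage in misconduct (1-5 scale) per cell.
hypothetical_means = {
    ("rare", "low"): 2.0,
    ("rare", "high"): 1.8,
    ("common", "low"): 3.5,
    ("common", "high"): 2.4,
}
interaction = interaction_contrast(hypothetical_means)
```

In this invented example the contrast is negative, meaning that audits reduce intended misconduct more when peer corruption is described as common. Detecting such interactions generally requires larger samples than detecting main effects, which is one reason to settle the number of factors at the design stage.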

5 Conclusion

This Element has explored the challenges, methods, and opportunities involved in studying corruption and ethical behaviour in public administration through the lens of survey experiments. Its empirical and theoretical insights seek to advance both the methodological rigour and practical relevance of experimental approaches in the public sector.

5.1 General Conclusions

Corruption in public administration is a deeply embedded, multifaceted phenomenon that erodes trust, weakens institutions, and undermines democratic governance. While legal and institutional reforms remain crucial, this Element argues that a more behavioural and causal understanding of corruption is necessary to complement these efforts. Experimental methods and, particularly, survey experiments offer a promising strategy to illuminate the micro-foundations of unethical behaviour, belief systems, and organizational dynamics.

Survey experiments are especially suited to public administration research because they allow scholars to examine the subtle and often concealed processes through which individuals in bureaucratic settings interpret norms, respond to incentives, and navigate ethical dilemmas. Through the randomized manipulation of information, cues, and scenarios, survey experiments permit robust causal inference while respecting the ethical and logistical constraints of studying misconduct in real-world institutions.

Nonetheless, this Element has also emphasized that experimental research in the public sector is not without challenges. Issues of social desirability bias, limited external validity, institutional sensitivities, and ethical dilemmas must be carefully addressed through rigorous protocols, interdisciplinary collaboration, and context-aware research designs. Importantly, experiments must not be seen as technical fixes but as tools that form part of a broader commitment to reflective, ethical, and accountable scholarship.

5.2 Specific Findings and Contributions

Taken together, this Element demonstrates that survey experiments offer a distinctive and underutilized capacity to advance the study of corruption and unethical behaviour in public administration, provided they are designed with more theoretical and methodological precision than is common in existing research. Rather than merely expanding the use of experiments, the analysis clarifies how and why survey experimental designs matter for identifying causal mechanisms that remain opaque in observational and descriptive approaches.

Substantively, the Element shows that much of the experimental literature has focused on ethical judgements and stated intentions, often without clearly distinguishing between beliefs about corruption, normative attitudes, and behavioural responses under concrete institutional constraints. This conflation limits both causal inference and policy relevance. By contrast, survey experiments that explicitly separate these dimensions – and embed them within experimentally varied enforcement regimes, organizational signals, or social norms – are better suited to testing the mechanisms emphasized by principal–agent, collective action, and behavioural approaches to corruption.

Methodologically, the Element clarifies that the credibility of experimental findings hinges less on the sophistication of statistical estimators than on upstream design choices. Issues such as treatment realism, intensity, sampling alignment, and transparency practices fundamentally shape what can be learned from survey experiments. Robust estimation strategies are valuable only insofar as they are grounded in coherent experimental designs that reflect realistic administrative contexts and clearly defined causal estimands.

Importantly, the analysis also demonstrates that institutional context is not peripheral to experimental research. The discussion of ethics infrastructure and administrative environments – illustrated through the Latin American case – shows how variation in enforcement capacity, norms, and organizational cultures conditions ethical decision-making and must therefore be incorporated into experimental design rather than treated as background noise. Survey experiments are particularly well suited to this task because they allow researchers to systematically vary institutional signals while holding broader contexts constant.

Overall, the reader should take away three central insights. First, survey experiments are most informative when they are theory-driven and explicitly designed to test specific causal mechanisms rather than broad ethical dispositions. Second, addressing long-standing methodological limitations in the literature implies prioritizing design choices – such as outcome differentiation, treatment calibration, and sampling strategy – over increasingly complex statistical techniques. Third, a protocol-oriented approach to survey experimentation provides a practical pathway for producing more cumulative, comparable, and policy-relevant knowledge about corruption in public administration.

5.3 Implications for Practice and Scholarship

For practitioners, the experimental lens offers a means to pilot and evaluate anti-corruption strategies in a low-risk, high-evidence environment. It enables governments and public organizations to test what works before scaling up reforms and to understand which actors are most responsive to interventions.

For scholars, experiments provide a disciplined method of theory testing and refinement. They challenge researchers to think more clearly about mechanisms, heterogeneity, and context, whilst also creating opportunities for collaboration across disciplines, institutions, and regions.

Ultimately, the contribution of experimental methods to public administration research lies not only in their technical power but also in their ability to foreground behavioural realities, normative complexity, and the contingent nature of institutional reform. As corruption continues to pose a major challenge to the legitimacy and effectiveness of public institutions worldwide, survey experiments offer a timely and tractable pathway towards understanding – and changing – the ethical foundations of the state.

5.3.1 For Practitioners: Designing and Testing Real-World Reforms

One of the most direct implications of experimental public administration research is its potential to inform the design and evaluation of integrity-enhancing reforms. Public institutions often implement new codes of conduct, reporting mechanisms, or training programmes with limited evidence about their effectiveness. Survey and field experiments allow these interventions to be piloted in controlled conditions before scaling. This reduces risk, improves cost-effectiveness, and allows reformers to identify which strategies resonate most with specific bureaucratic audiences.

For instance, public sector managers can test whether ethics training is more impactful when framed around public service values or around legal compliance. Similarly, they can assess whether messages emphasizing peer norms (“Most of your colleagues would report this situation”) are more effective than those focused on deterrence or punishment. A/B testing can be deployed in digital settings – for example, municipal websites or internal platforms – to test different communications strategies for whistleblowing procedures or procurement guidelines.
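The A/B testing logic can be sketched as a comparison of two proportions, for instance the share of staff who open a whistleblowing guidance page under a compliance-framed message versus a peer-norm message. The counts below are invented, and the pooled two-proportion z-test is one common large-sample choice rather than the only defensible analysis.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Difference in proportions (B minus A), z statistic, and two-sided p-value."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)  # shared rate under the null
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, z, p_value


# Hypothetical results: version A (compliance framing) vs. version B (peer-norm framing).
difference, z_stat, p_value = two_proportion_z_test(120, 1000, 156, 1000)
```

As with any experiment, the comparison is only interpretable if staff were randomly allocated to the two versions and the outcome and sample size were fixed before the data were examined.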

Importantly, experiments need not be limited to academic research. Public agencies can institutionalize experimental thinking by:

Creating internal ‘policy labs’ or experimental units to design and test interventions
Including small-scale randomized trials in strategic planning cycles
Partnering with universities and civil society organizations to co-develop experiments
Establishing ethical review processes that enable innovation while protecting public trust.

These initiatives can democratize experimentation, making it a tool for learning rather than an elite academic exercise. When aligned with organizational learning systems, experiments can serve as a catalyst for cultural change, not just policy improvement.

5.3.2 For Scholars: Theory-Building, Methodological Pluralism, and Field Embeddedness

For scholars of public administration, experiments are not simply tools for evaluation; they also serve as engines for theory-building. Survey experiments, in particular, offer a bridge between abstract concepts (e.g., public service motivation, ethical fading, social norms) and concrete behavioural indicators. They allow researchers to test causal mechanisms and refine theoretical models in response to observed variance across individuals, institutions, and cultures.

This Element has emphasized the importance of three core theoretical frameworks – principal–agent theory, collective action theory, and behavioural ethics – and shown how survey experiments can generate evidence that either supports or challenges their predictions. For example, findings from Latin America suggest that integrity interventions based solely on enforcement may underperform when norms of cynicism or fatalism prevail. In such contexts, behavioural cues (e.g., identity priming, peer exemplars) may offer more effective paths to reform. These insights help move the field from generic ‘best practices’ to context-specific models of integrity.

At the same time, scholars must recognize the limitations of experimental designs. Not all ethical dilemmas can be simulated in a vignette, and not all behavioural outcomes can be observed through surveys. Methodological pluralism – combining experiments with ethnography, administrative records, or process tracing – can help triangulate findings and improve external validity. Embedding experiments in real bureaucratic settings, for example through field designs, enhances realism and helps researchers stay attuned to the lived constraints of public service.

Scholars should also engage in reflexivity about the ethics of experimentation. Working with vulnerable bureaucracies or politically sensitive topics (e.g., favouritism, politicization, resource diversion) calls for care in framing, anonymization, and interpretation. Consent procedures must be robust, and findings must be communicated in ways that avoid blame or reputational harm. Responsible experimentation is not neutral: it must be guided by values of transparency, humility, and public accountability.

5.3.3 Bridging Research and Policy: The Role of Intermediaries

A third implication of experimental public administration research lies in the opportunity to bridge the long-standing gap between academic research and policy practice. Too often, research remains disconnected from the needs of reformers, and policy innovations proceed without evidence. Survey experiments, when co-developed with practitioners, can serve as a shared language for these domains, offering clear, testable hypotheses and actionable findings.

Universities, think tanks, and international organizations (such as the OECD, UNDP, or Transparency International) can play a key intermediary role. They can help:

Translate research findings into policy briefs, dashboards, or decision-making tools
Organize workshops that train public officials in experimental methods
Develop comparative databases of experimental results across countries or sectors
Facilitate multi-actor coalitions that promote learning from evidence across jurisdictions.

At the same time, scholars can be more intentional about the design of their research outputs. Pre-registration, open data, and accessible writing increase credibility and relevance. Collaborative authorship with practitioners helps ground theoretical insights in administrative realities. Journals and academic publishers – including the Cambridge Elements series – can foster this connection by promoting translational formats that speak to both communities.

5.3.4 Experimental Ethics and the Normative Commitments of the Field

Finally, the growing use of experiments in public administration invites renewed reflection on the discipline’s normative foundations. Experiments are not just technical tools; they are also expressions of epistemological and ethical commitments. They assume that governance problems can be studied through systematic inquiry, that public servants are capable of reasoning and learning, and that institutions can evolve through evidence-informed reform.

This perspective contrasts with more cynical or deterministic accounts of corruption and public failure. Instead of viewing unethical behaviour as inevitable, experimental public administration studies treat it as a testable and, potentially, modifiable phenomenon. The goal is not simply to diagnose dysfunction but to support reformers in building better institutions.

However, this normative orientation also entails responsibility. Scholars must be transparent about the limitations of their methods, inclusive in their engagement with diverse administrative settings, and cautious about over-generalization. Practitioners must avoid instrumentalizing experiments for symbolic legitimacy or political gain. The field as a whole must develop ethical standards that protect human dignity, institutional autonomy, and democratic integrity.

Survey experiments offer more than just methodological innovation; they offer a paradigm shift in how we understand and improve public administration. For practitioners, they provide a rigorous tool for testing reforms and strengthening ethics infrastructure; for scholars, they open new frontiers for theory development, causal inference, and real-world engagement; and, for both, they offer a space for constructive dialogue where evidence, values, and practice converge.

As the field moves forward, the challenge is not only to produce more experiments but to produce better ones: experiments that are contextually grounded, ethically responsible, theoretically meaningful, and practically useful. In doing so, experimental public administration research can live up to its promise: not just as a scientific endeavour, but as a contribution to more just, effective, and accountable governance.

Appendix

Appendix A: Sample Survey Experiment Protocol

This appendix presents an illustrative example of how the research protocol proposed in Section 3.2 can be implemented in a concrete survey experimental design. The purpose of this example is pedagogical: to demonstrate how core protocol components – such as sampling decisions, treatment construction, outcome measurement, and transparency practices – can be systematically aligned in applied research on corruption and unethical behaviour in public administration. The appendix does not describe a pre-registered or implemented study but rather serves as a stylized template that researchers may adapt to different institutional and substantive contexts.

Title

Professional Identity and Ethical Sensitivity among Municipal Officials: A Survey Experiment in Chile

Overview

This appendix presents a stylized protocol for a vignette-based survey experiment designed to examine the impact of professional identity priming on ethical sensitivity among municipal civil servants in Chile. The experiment seeks to test whether subtle cues invoking public service values influence how bureaucrats ethically evaluate ambiguous situations such as nepotism or clientelistic behaviour.

The protocol illustrates best practices in experimental public administration research, with careful attention paid to treatment construction, random assignment, ethical safeguards, and outcome measurement.

1 Research Question and Hypotheses

Research Question:

Does invoking a public servant’s professional identity increase ethical sensitivity in evaluating ambiguous administrative dilemmas?

Hypothesis (H1):

Respondents exposed to a professional identity prime will be more likely to identify unethical behaviour in a vignette involving nepotism compared to those who receive no such prime.

2 Sample and Recruitment Strategy

The target population consists of municipal officials employed in mid-sized cities (50,000–300,000 residents) in Chile. The sample includes frontline staff and mid-level managers from departments such as social development, public works, and finance.

Respondents are recruited via official institutional emails, following agreements with municipal HR offices and approval by the national civil service authority. Participation is voluntary and anonymous, with informed consent obtained at the beginning of the survey.

The target sample size is N = 600, powered to detect a minimum treatment effect of 8 percentage points on the primary outcome with 80% power at α = 0.05.
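A power statement of this kind can be checked with a short calculation. The sketch below assumes, purely for illustration, that the primary outcome is dichotomized (for example, the share of respondents rating the behaviour at or above some threshold on the 7-point scale), that the two arms are of equal size (300 each), and that a normal approximation is adequate. The baseline rates in the loop are hypothetical placeholders, since the protocol does not state one.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, effect, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in proportions.

    Uses the normal approximation and ignores the negligible probability of
    rejecting in the direction opposite to the true effect.
    """
    p_treat = p_control + effect
    # Standard error of the difference under the alternative (unpooled).
    se = sqrt(p_control * (1 - p_control) / n_per_arm
              + p_treat * (1 - p_treat) / n_per_arm)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return 1 - NormalDist().cdf(z_crit - effect / se)


# Hypothetical baseline rates in the control group (assumptions, not data).
for baseline in (0.10, 0.30, 0.50):
    pw = power_two_proportions(baseline, effect=0.08, n_per_arm=300)
    print(f"baseline = {baseline:.2f}: approximate power = {pw:.2f}")
```

Under these assumptions, the power achieved for a fixed 8-point effect varies considerably with the baseline rate and is lowest when the baseline is near 50 per cent, so an applied protocol would need to state the assumed baseline (or, for the Likert outcome, the assumed standard deviation) alongside the target sample size.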

3 Experimental Design and Treatment Conditions

The experiment follows a between-subjects design with two conditions:

Treatment Group (Professional Identity Prime): Participants receive a short paragraph emphasizing their role as public servants committed to integrity, impartiality, and the collective good. The text is adapted from real language used in Chilean public service ethics codes.

As a public servant, your actions have a direct impact on the lives of citizens. Upholding principles of transparency, fairness, and accountability is essential to maintaining public trust. Every day, your work contributes to the common good and strengthens democratic institutions.

Control Group (Neutral Prime): Participants receive a short paragraph about their work routines, containing no normative or ethical framing.

Municipal employees work across a wide range of services, including infrastructure, social development, and urban management. Their tasks vary depending on department, seniority, and location.

After exposure to the prime, all participants read the same vignette.

4 Vignette: Nepotism Scenario

Imagine you are reviewing a list of candidates for a short-term administrative position. One candidate is the nephew of a senior official in your department. While his qualifications are adequate, he does not clearly stand out among the other applicants. During an informal conversation, the senior official expresses the hope that their nephew will be given ‘a fair chance’. You must now decide how to proceed with the recruitment process.

5 Outcome Measures

Primary Outcome:

Ethical judgement: How inappropriate is it to give the nephew preferential treatment?
(7-point Likert scale: 1 = Not at all inappropriate, 7 = Extremely inappropriate)

Secondary Outcomes:

Intended action: What would you do in this situation?
(Multiple choice: Proceed with the process neutrally; Flag the situation to HR; Reassign the selection to another staff member; Seek advice from a supervisor.)
Perceived norm: How common do you think such situations are in your department?
(5-point scale: Never to Very Often)
Trust in colleagues: I trust that my colleagues would act ethically in similar situations.
(7-point agreement scale)

6 Randomization and Implementation

Random assignment is performed via embedded software logic in the survey platform (Qualtrics), stratified by municipality and respondent seniority to ensure balance across units. Participants are unaware of the experimental manipulation and are debriefed at the end of the survey.
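The logic of stratified assignment can be sketched outside any particular survey platform. The example below is a generic illustration, not a description of how Qualtrics implements randomization: the field names (`municipality`, `seniority`), the seed, and the respondent records are hypothetical.

```python
import random
from collections import defaultdict


def stratified_assign(respondents, seed=2024):
    """Assign treatment within each (municipality, seniority) stratum.

    Within a stratum, respondents are shuffled and then assigned to arms
    alternately, so arm sizes differ by at most one per stratum.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for r in respondents:
        strata[(r["municipality"], r["seniority"])].append(r["id"])

    assignment = {}
    for key in sorted(strata):
        ids = strata[key]
        rng.shuffle(ids)
        # Randomize which arm receives the extra unit in odd-sized strata.
        offset = rng.randint(0, 1)
        for position, rid in enumerate(ids):
            assignment[rid] = (position + offset) % 2  # 1 = identity prime
    return assignment


# Hypothetical respondents: 3 municipalities x 2 seniority levels x 5 people.
respondents = [
    {"id": f"{m}-{s}-{k}", "municipality": m, "seniority": s}
    for m in ("A", "B", "C")
    for s in ("frontline", "managerial")
    for k in range(5)
]
assignment = stratified_assign(respondents)
print(sum(assignment.values()), "of", len(assignment), "assigned to the prime")
```

Fixing the seed makes the assignment reproducible, which supports the transparency practices the protocol emphasizes; in a live survey, where respondents arrive one at a time, the platform's own block randomizer would play this role.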

7 Ethical Considerations

Approval: The study is reviewed and approved by the university’s Research Ethics Committee and complies with Chilean data protection regulations.
Consent: A detailed consent form precedes the survey, explaining its purpose, voluntary nature, anonymity, and the right to withdraw at any time.
Debriefing: After the final question, respondents are informed of the study’s purpose and offered a summary of the research objectives and contact information for follow-up.
Sensitivity: The vignette avoids naming real institutions or individuals. Responses are anonymized and stored on encrypted servers.

8 Analysis Plan

The primary analysis involves estimating the average treatment effect (ATE) on ethical judgement scores:

$$Y_{i} = \alpha + \tau D_{i} + \varepsilon_{i}$$

where:

$Y_{i}$ = the ethical judgement score for respondent $i$
$D_{i}$ = the binary treatment indicator (1 = professional identity prime)
$\tau$ = the estimated treatment effect.
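With a single binary regressor, the ordinary least squares estimate of the treatment coefficient equals the difference in mean outcomes between the two conditions. The sketch below computes that difference with an unequal-variance (Welch) standard error and an approximate 95% confidence interval. The scores are invented for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def estimate_ate(scores, treated):
    """Difference in means with a Welch standard error and approximate 95% CI.

    With one binary treatment indicator, this point estimate is identical
    to the OLS coefficient on that indicator.
    """
    y1 = [y for y, d in zip(scores, treated) if d == 1]
    y0 = [y for y, d in zip(scores, treated) if d == 0]
    tau_hat = mean(y1) - mean(y0)
    se = sqrt(variance(y1) / len(y1) + variance(y0) / len(y0))
    return tau_hat, se, (tau_hat - 1.96 * se, tau_hat + 1.96 * se)


# Hypothetical 7-point ethical judgement scores (illustrative only).
scores = [6, 7, 5, 6, 7, 6, 4, 5, 5, 6, 4, 5]
treated = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
tau_hat, se, ci = estimate_ate(scores, treated)
print(f"ATE estimate = {tau_hat:.2f}, SE = {se:.2f}, "
      f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Because assignment in the protocol is stratified, an applied analysis might also include stratum indicators; the sketch omits them for brevity.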

Robustness checks include covariate adjustment for age, gender, tenure, and education. Pre-registered subgroup analyses will examine whether treatment effects differ by seniority (frontline vs. managerial staff) and by municipality type (urban vs. peri-urban).
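A subgroup analysis of this kind can be sketched as a contrast between two subgroup-specific treatment effects, which is equivalent to the interaction term in a regression that fully interacts treatment with the subgroup indicator. The example below uses invented data and hypothetical field names (`score`, `treated`, `seniority`); covariate adjustment is not shown.

```python
from math import sqrt
from statistics import mean, variance


def diff_in_means(rows):
    """Return the treatment-control difference in means and its variance."""
    y1 = [r["score"] for r in rows if r["treated"] == 1]
    y0 = [r["score"] for r in rows if r["treated"] == 0]
    est = mean(y1) - mean(y0)
    var = variance(y1) / len(y1) + variance(y0) / len(y0)
    return est, var


def subgroup_contrast(rows, group_key, group_a, group_b):
    """Estimate the effect in each subgroup and the difference between them."""
    est_a, var_a = diff_in_means([r for r in rows if r[group_key] == group_a])
    est_b, var_b = diff_in_means([r for r in rows if r[group_key] == group_b])
    # The subgroups are disjoint samples, so the variances add.
    return est_a, est_b, est_a - est_b, sqrt(var_a + var_b)


def make_rows(seniority, treated, scores):
    return [{"seniority": seniority, "treated": treated, "score": s}
            for s in scores]


# Hypothetical responses (illustrative only).
rows = (make_rows("frontline", 1, [6, 7, 6])
        + make_rows("frontline", 0, [4, 5, 4])
        + make_rows("managerial", 1, [6, 5, 6])
        + make_rows("managerial", 0, [5, 6, 5]))

est_front, est_manag, contrast, se = subgroup_contrast(
    rows, "seniority", "frontline", "managerial")
print(f"frontline = {est_front:.2f}, managerial = {est_manag:.2f}, "
      f"difference = {contrast:.2f} (SE {se:.2f})")
```

Reporting the contrast and its standard error, rather than only the two subgroup estimates, guards against reading a difference into effects that are merely significant in one subgroup and not in the other.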

9 Limitations and Extensions

The use of self-reported measures may introduce social desirability bias. Randomization mitigates this concern by balancing respondents' baseline tendency to give socially desirable answers across conditions, although it cannot rule out that the prime itself heightens that tendency. Future extensions could include longitudinal follow-up or integration with administrative behaviour (e.g., complaint data, audit responses). Replication in other countries or sectors (e.g., health, education) could enhance external validity.

This protocol illustrates how survey experiments can test realistic ethical dilemmas in public administration using randomized design. By simulating everyday integrity challenges and measuring responses across treatment conditions, such experiments help identify which normative cues or professional identities shape ethical sensitivity and under what conditions. The protocol is adaptable for other Latin American contexts and represents a concrete tool for both researchers and reformers seeking to strengthen integrity in the public sector.

Public and Nonprofit Administration

Robert Christensen
Brigham Young University
Robert Christensen is the George W. Romney Professor of Public and Nonprofit Management at Brigham Young University.

Jaclyn Piatak
University of North Carolina at Charlotte
Jaclyn Piatak is co-editor of NVSQ and Professor of Political Science and Public Administration at the University of North Carolina at Charlotte.

Rosemary O’Leary
University of Kansas
Rosemary O’Leary is the Edwin O. Stene Distinguished Professor Emerita of Public Administration at the University of Kansas.

About the Series

The foundations of this series are cutting-edge contributions on emerging topics and definitive reviews of keystone topics in public and nonprofit administration, especially those that lack longer treatment in textbook or other formats. Among keystone topics of interest for scholars and practitioners of public and nonprofit administration, the series covers public management, public budgeting and finance, nonprofit studies, and the interstitial space between the public and nonprofit sectors, along with theoretical and methodological contributions, including quantitative, qualitative and mixed-methods pieces.

The Public Management Research Association

The Public Management Research Association improves public governance by advancing research on public organizations, strengthening links among interdisciplinary scholars, and furthering professional and academic opportunities in public management.


Accessibility standard: WCAG 2.0 A
