
Inaccurate forecasting of a randomized controlled trial

Published online by Cambridge University Press:  22 November 2023

Mats Ahrenshop
Affiliation:
Department of Politics & International Relations, University of Oxford, Oxford, UK
Miriam Golden
Affiliation:
Department of Political and Social Sciences, European University Institute, Florence, Italy
Saad Gulzar*
Affiliation:
Department of Politics and School of Public and International Affairs, Princeton University, Princeton, NJ, USA
Luke Sonnet
Affiliation:
Independent Researcher, Redwood City, CA, USA
*
Corresponding author: Saad Gulzar; Email: gulzar@princeton.edu

Abstract

We report the results of a forecasting experiment about a randomized controlled trial (RCT) conducted in the field. The experiment asks Ph.D. students, faculty, and policy practitioners to forecast (1) compliance rates for the RCT and (2) treatment effects of the intervention. The forecasting experiment randomizes both the order of the questions about compliance and treatment effects and the provision of information that a pilot experiment had been conducted and had produced null results. Forecasters were excessively optimistic about treatment effects and unresponsive both to item order and to information about the pilot. Those who declare themselves expert in the area relevant to the intervention are particularly resistant to new information that the treatment is ineffective. We interpret our results as suggesting caution when undertaking expert forecasting, since experts may have unrealistic expectations and may be inflexible in altering them even when provided new information.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of American Political Science Association

Figure 1. Experimental design. Forecast compliance before/after ITT refers to question ordering about when respondents forecast compliance rates and when they forecast treatment effects. Compliance percentages were requested for answering the MPA’s call and for answering the MPA’s question. Treatment effects were elicited in standard deviation units. Pilot results refer to the presentation of null treatment effects for three outcomes of interest in the pilot. Prime about pilot refers to a short informational vignette embedded in the survey where respondents read the following text: “Pilot and Scale-up: This program was designed through a pilot that we conducted with one MPA in [district name redacted] in 2016 (shown in red on the map). This pilot was conducted with 1,200 households. The scale-up project was implemented in the blue areas in this map.”


Table 1. Factorial experiment to elicit forecasts


Table 2. Estimands of interest


Table 3. Descriptive statistics of outcomes of interest in the pilot, RCT, and forecasts


Table 4. Experimental effects for ITT forecasts (average across indices)


Table 5. Experimental effects for compliance forecasts


Table 6. Effect heterogeneity for compliance forecasts


Figure 2. Heterogeneous treatment effects according to familiarity. We plot predicted compliance forecasts for different treatment groups, according to whether or not respondents received pilot results before making compliance forecasts, separately for subgroups defined by familiarity with the use of technology in improving governance. To measure familiarity, we asked: “How familiar are you with research on the use of information technology to improve governance?” [1 = very familiar; 4 = very unfamiliar]. We dichotomized this variable by collapsing categories (1,2) into “familiar” and (3,4) into “unfamiliar.”


Figure 3. Published RCTs in political science and those reporting null results, 2012–2021. The left-hand panel shows the proportion of articles reporting results from an RCT out of all articles published in the respective journal. Over the period, the APSR published a total of 519 articles, the AJPS 564, and the JOP 793. The number of articles with RCTs is shown at the top of each bar. The right-hand panel looks only at articles that reported results from an RCT and distinguishes between those that reported null results on the main treatment effect of the intervention and those that reported a significant treatment effect. Relevant numbers are shown in each bar portion. Data collection procedures and coding principles are detailed in online Appendix B.


Table 7. Distribution of subjects professing expertise by professional status


Table 8. Distribution of subjects professing expertise by site

Supplementary material

Ahrenshop et al. Dataset (Link)

Ahrenshop et al. supplementary material: Appendices (PDF, 348.4 KB)