
The Causal Effect of Polls on Turnout Intention: A Local Randomization Regression Discontinuity Approach

Published online by Cambridge University Press:  15 February 2021

Pablo Brugarolas
Pompeu Fabra University, Department of Political & Social Sciences, C/ Ramon Trias Fargas 25-27, 08005 Barcelona, Spain.
Luis Miller
Spanish National Research Council (IPP-CSIC), C/ Albasanz 26, 28037 Madrid, Spain.


This letter reports the results of a study that combined a unique natural experiment and a local randomization regression discontinuity approach to estimate the effect of polls on turnout intention. We found that the release of a poll increases turnout intention by 5%. This effect is robust to a number of falsification tests of predetermined covariates, placebo outcomes, and changes in the time window selected to estimate the effect. The letter discusses the advantages of the local randomization approach over the standard continuity-based design for studying important cases in political science where the running variable is discrete, a method that may expand the range of empirical topics that can be analyzed using regression discontinuity methods.

© The Author(s) 2021. Published by Cambridge University Press on behalf of the Society for Political Methodology

1 Introduction

The effect of public opinion polls on voter turnout is a long-standing question in political science. Answering this question has always been difficult because of the lack of good data from relevant naturally occurring elections. Researchers have either used observational data and relied on aggregate election results (Blais 2000; Morton et al. 2015) or conducted experiments where information can be manipulated effectively (Großer and Schram 2010; Gerber et al. 2020). In this letter, we offer new evidence on this important question using a unique natural experiment on turnout intention and a novel analytical framework. This is the first study to use the local randomization approach to regression discontinuity (RD) (Cattaneo, Frandsen, and Titiunik 2015; Cattaneo, Idrobo, and Titiunik 2020a, 2020b; Cattaneo, Titiunik, and Vazquez-Bare 2020) to study voter turnout intention. This randomization approach may expand the range of topics that can be analyzed using RD methods in political science, especially those cases where the running variable is discrete (e.g., days, years, or federal states).

Our identification strategy relies on a natural experiment that took place in Spain before the April 2019 national elections (footnote 1). In the spring of 2019, the Spanish Sociological Research Center (CIS, by its Spanish acronym) conducted two large pre-election polls on representative samples of the Spanish population. A total of 16,194 and 17,641 people were interviewed about their voting preferences in the national elections in the first half of March ($Poll_{1}$) and between 21 March and 23 April ($Poll_{2}$), respectively. The predictions for the national elections ($Poll_{1}$) were released on 9 April at 12:30 p.m. and were reported by all media channels immediately afterwards. This occurred while $Poll_{2}$ was being fielded. As a result of this overlap between the release of $Poll_{1}$ and the $Poll_{2}$ fieldwork, 4,125 people were interviewed about their preferences in the national elections after the results of the national election poll were released. We use the $Poll_{1}$ release as the experimental treatment and define the groups of respondents who completed the questionnaire before and after the release as the control and treatment groups, respectively. We rely on the same type of survey data employed recently by Balcells and Torrats-Espinosa (2018) and Muñoz, Falco-Gimeno, and Hernandez (2020) to study the effect of unexpected terrorist attacks on surveys. Still, the selection of the control and treatment groups was not random, and differences in the composition of these two groups may bias the results. To deal with this potential bias, we use a local randomization approach to regression discontinuity (Cattaneo, Frandsen, and Titiunik 2015; Cattaneo, Idrobo, and Titiunik 2020a, 2020b; Cattaneo, Titiunik, and Vazquez-Bare 2020) and compare the responses of people who participated in $Poll_{2}$ just before and after $Poll_{1}$ was released.

Our outcome variable is turnout intention, defined as the proportion of people who said they would vote in the national elections. We address two limitations of previous RD analyses. First, we perform a randomization-based falsification test for the set of predetermined covariates used by the CIS to define its random sample of the Spanish adult population, and assess whether our data can be treated as if the treatment were randomized. We work with a random sample that is representative of the Spanish population and was generated through randomly chosen sampling points (random routes) and gender, age, and town size quotas. We use these three predetermined covariates as the basis for our falsification tests. Our control and treatment groups are therefore subgroups of a random sample of a large population that are balanced in terms of predetermined covariates. Second, we study the sensitivity of the main result to different time windows around the treatment and perform placebo estimations to further assess the plausibility of the effect.

One limitation of our study is that we use turnout intention measured using a survey item in a pre-election survey. Blais, Young, and Lapp (2000) use a similar item, finding a strong but imperfect correlation between turnout intention and reported vote post-election. The question is whether the fact that we are using intentions as the dependent variable may affect our results. In that respect, Achen and Blais (2016) suggest that effects will be larger in studies of vote intention than actual turnout. This suggestion must be considered when interpreting our results.

2 Methods

2.1 Identification: Two Approaches to RD Analysis

In this letter, we use the conceptual and methodological RD frameworks developed by Cattaneo, Idrobo, and Titiunik (2020a, 2020b) (footnote 2). In our RD design, the day and time of the interview define the running variable and the main goal is to study changes produced near the exogenous shock ($Poll_{1}$’s release). The RD design is defined by a triplet: a score (running or forcing variable), a cutoff, and a treatment. We impose a sharp RD design; hence, every unit whose score exceeds the cutoff is assigned to the treatment condition. Assuming that only the conditional probability of treatment changes discontinuously at the cutoff (footnote 3), the average treatment effect (ATE) at the cutoff can be identified as the difference between the observed turnout intention of respondents interviewed immediately after and immediately before the release of $Poll_{1}$. For the RD design, continuity implies that, as the two limits of the running variable approach the cutoff, their average potential outcomes become increasingly similar. It therefore becomes justifiable to use the observations in a very small neighborhood around the cutoff to infer the counterfactual.

In contrast to the continuity-based framework, the local randomization approach to RD assumes that the average potential outcomes do not depend on the units’ position at either side of the cutoff. Instead, there is a small positive range of the score which defines the size of the prespecified randomization window. Under the local randomization assumption, the regression functions describing the average potential outcomes must be flat inside that window, as they are independent of the values of the running variable. As the regression functions within the window are constant, the ATE under local randomization can be estimated as the difference between the average observed outcomes of all units in the treatment and control groups within the window.
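The contrast between the two frameworks can be written compactly. The notation below is introduced here for illustration and does not appear in the letter: $X_i$ is the score, $c$ the cutoff, $Y_i$ the observed outcome, $W = [c - w, c + w]$ the randomization window, and $N_1$ and $N_0$ the numbers of treated and control units inside $W$. Under continuity, the estimand is a difference of two limits at the cutoff:

$$\tau_{\text{cont}} = \lim_{x \downarrow c} \mathbb{E}[Y_i \mid X_i = x] - \lim_{x \uparrow c} \mathbb{E}[Y_i \mid X_i = x]$$

Under local randomization, the regression functions are flat inside $W$, so a plain difference in means over the window is a natural estimator:

$$\hat{\tau}_{\text{LR}} = \frac{1}{N_1} \sum_{i:\, X_i \in W,\; X_i > c} Y_i \;-\; \frac{1}{N_0} \sum_{i:\, X_i \in W,\; X_i < c} Y_i$$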

The continuity assumption requires that the sole discontinuous change existing at the cutoff is the shift in treatment status. By contrast, the randomization assumption states that a small prespecified window might exist where treatment status does not depend on the traits of the respondents. Therefore, the randomization-based approach has a stronger identification assumption. Why would one then be willing to impose such an assumption? One substantive scenario is that of a noncontinuous running variable. Estimating the RD effect under the continuity-based framework will not always be valid for categorical running variables, because such variables present mass points at each of their levels. For example, imagine a categorical running variable with a small number of levels on each side of the cutoff (e.g., days, years, or rating levels). Regardless of the size of the sample, observations would be collapsed at the levels of the running variable. That is, even if the sample is large, every observation falls on one of those levels, and each mass point of the categorical running variable will thus behave as a single observation. The main advantage of the randomization-based approach is that it remains applicable when a discrete running variable prevents the estimation procedure from homing in on observations arbitrarily close to the threshold. The continuity-based approach would only be advisable when the running variable is so rich that the use of additional mass points does not imply substantively violating the continuity assumption.

Finally, randomization-based estimation and RD inference methods rely on knowing the window in which randomization holds. However, unlike actual randomized experiments, situations requiring identification through local randomization RD designs would inevitably entail some ambiguity about the set of observations that received the as-if randomly assigned treatment. Therefore, the selection procedure of the window where causality can be plausibly identified is the most fundamental step in the implementation of local randomization RD methods. In practice, window selection procedures are based on nested balance tests on the relevant predetermined covariates (Cattaneo, Frandsen, and Titiunik 2015). Overall, the window selection algorithm selects the largest window such that all covariates are balanced in that window and in all the smaller windows inside it. Once the window in which randomization holds is known, the data within that window can be analyzed as one would analyze an experiment.
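The nested balance-test logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the letter: it assumes a permutation test on the difference in means of each covariate, takes the minimum p-value across covariates, and stops at the first window where that minimum falls below a threshold. The function names, the 0.05 threshold, and the toy data are all assumptions made for the example.

```python
import random


def diff_in_means(values, treated):
    """Mean of `values` among treated units minus the mean among controls."""
    t = [v for v, flag in zip(values, treated) if flag]
    c = [v for v, flag in zip(values, treated) if not flag]
    return sum(t) / len(t) - sum(c) / len(c)


def balance_pvalue(values, treated, n_perm, rng):
    """Two-sided permutation p-value for the covariate difference in means."""
    observed = abs(diff_in_means(values, treated))
    flags = list(treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(flags)  # re-assign treatment labels at random
        if abs(diff_in_means(values, flags)) >= observed:
            hits += 1
    return hits / n_perm


def select_window(scores, covariates, half_widths, alpha=0.05, n_perm=200, seed=0):
    """Largest half-width whose window, and every smaller candidate window,
    has a minimum balance p-value of at least `alpha` (None if none passes)."""
    rng = random.Random(seed)
    chosen = None
    for w in sorted(half_widths):
        idx = [i for i, s in enumerate(scores) if abs(s) <= w]
        treated = [scores[i] > 0 for i in idx]
        if all(treated) or not any(treated):
            break  # need units on both sides of the cutoff
        p_min = min(
            balance_pvalue([cov[i] for i in idx], treated, n_perm, rng)
            for cov in covariates
        )
        if p_min < alpha:
            break  # balance fails here, so larger windows are not considered
        chosen = w
    return chosen


# Toy data (not the CIS survey): 20 respondents at each mass point from
# -1.2 to 1.2 days, with the cutoff interval itself removed.
steps = [k for k in range(-6, 7) if k != 0]
scores, age, female = [], [], []
for k in steps:
    for j in range(20):
        scores.append(k / 5)
        # Ages are identical across mass points except far to the right of the
        # cutoff, where respondents are made 30 years older on purpose.
        age.append(40 + j % 5 + (30 if k >= 5 else 0))
        female.append(j % 2)

window = select_window(scores, [age, female], [k / 5 for k in range(1, 7)])
print(window)
```

Because the toy covariates are perfectly balanced up to 0.8 days and deliberately imbalanced beyond that, the sketch stops at a half-width of 0.8. Scores are built as integer steps divided by five so that window comparisons are not affected by floating-point drift.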

2.2 The Empirical Case

We exploited the variation in the time at which each interview was conducted to generate the running variable. Each day, the CIS conducted interviews in six time intervals: 9 a.m. to 12 p.m., 12:05 p.m. to 2 p.m., 2:05 p.m. to 4 p.m., 4:05 p.m. to 6 p.m., 6:05 p.m. to 9 p.m., and after 9 p.m. We discarded the last interval as it contained very few observations. We transformed each of the remaining five intervals into 0.2 increments and added them to the variable containing the time of the interview in days. Finally, we normalized the running variable, which is categorical, with mass points at each of the time intervals. $Poll_{1}$ was released on 9 April at 12:30 p.m., causing some uncertainty about whether respondents interviewed in the 12:05 p.m. to 2 p.m. time interval were affected by the treatment. We removed all observations from this interval following the “donut hole” idea employed in previous RD analyses (Cattaneo, Idrobo, and Titiunik 2020b) (footnote 4). Therefore, the treatment status changes on 9 April at 2:05 p.m.

Immediately after $Poll_{1}$’s release, all the main Spanish newspapers opened their online editions with the CIS’s forecast for the national elections. All the main TV networks also opened their news programs with the forecast at 2 p.m. and 3 p.m. For each time slot, about half of all TV viewers in Spain were watching one of these news broadcasts. $Poll_{1}$’s release had a noticeable impact on social media as well. The CIS and its president were a trending topic on Twitter in Spain on 9 April. The large impact of $Poll_{1}$’s release on traditional and new media suggests that most people interviewed in $Poll_{2}$ after $Poll_{1}$’s release were aware of the national election forecast (footnote 5). However, $Poll_{2}$ did not ask whether the respondent had seen the newly released poll. Although our identification strategy assumes that all respondents were exposed to information about the poll, and we provide evidence of the high media impact of its release, we cannot test this assumption empirically; our estimates could therefore be treated as an intention-to-treat (ITT) effect rather than as an ATE (footnote 6).
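The construction of the running variable described above can be sketched as follows. The letter does not spell out the exact normalization, so the centering used here is an assumption: the score is set to 0 at the excluded release interval, which reproduces a symmetric window running from 8 April at 2:05 p.m. (score -0.8) to 10 April at 12 p.m. (score 0.8). The names `running_variable`, `is_treated`, and the interval labels are hypothetical.

```python
from datetime import date

RELEASE_DAY = date(2019, 4, 9)

# The five retained daily interview intervals, in order (the "after 9 p.m."
# interval is discarded, as in the text).
INTERVALS = (
    "09:00-12:00",
    "12:05-14:00",  # contains the 12:30 p.m. release: the donut hole
    "14:05-16:00",
    "16:05-18:00",
    "18:05-21:00",
)
RELEASE_INTERVAL = 1  # index of "12:05-14:00" in INTERVALS


def running_variable(day, interval):
    """Normalized score in days, or None for the excluded release interval.

    Assumption: the score is centred on the release interval, so that
    consecutive intervals are 0.2 days apart and the release interval is 0.
    """
    steps = 5 * (day - RELEASE_DAY).days + (interval - RELEASE_INTERVAL)
    if steps == 0:
        return None  # donut hole: exposure to the poll is ambiguous
    return steps / 5


def is_treated(score):
    """Sharp design: every retained unit after the cutoff is treated."""
    return score > 0


first = running_variable(date(2019, 4, 8), 2)   # 8 April, 2:05 p.m. to 4 p.m.
last = running_variable(date(2019, 4, 10), 0)   # 10 April, 9 a.m. to 12 p.m.
print(first, last)  # -0.8 0.8
```

Working in integer steps and dividing by five only at the end keeps the mass points exactly comparable, which matters when windows are later defined by equality or inequality on the score.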

3 Results

3.1 Window Selection and Randomization-Based Falsification Tests

We conducted an RD analysis using randomization-based methods on all the windows that can be defined within the 2 days before and after the CIS released the $Poll_{1}$ forecast. We ignored observations outside this interval, since the electoral campaign started immediately afterwards and we are interested in isolating the effect of polls on turnout intention. Within that interval, we implemented window selection procedures to maximize the credibility of the identification assumption of the randomization-based approach. We started by selecting the window in which the randomization identification assumption is expected to hold. In principle, units in the treatment and control arms should be similar in terms of both their observable and nonobservable characteristics in the vicinity of the cutoff. That is, no substantive differences should be found in the predetermined covariates for respondents with similar running variable values. In practice, conducting these falsification tests involves testing the null hypothesis of no RD effect on the predetermined covariates for units within a chosen window. We ran the window selection algorithm starting with the smallest possible window. In our application, the smallest window has a size of 0.2 days and comprises 296 respondents; the largest nested window that can be built without interfering with the beginning of the electoral campaign has a length of 2.6 days and comprises 3,047 respondents.

To select the optimal window, we used the automatic data-driven window selection procedure proposed by Cattaneo, Frandsen, and Titiunik (2015) and selected the largest window such that our predetermined covariates (gender, age, and town size) are balanced in that window and all the smaller windows inside it. More specifically, we tested the null hypothesis of no differences in the three predetermined covariates between the control and the treatment groups for each window considered. Figure 1 plots the p-values of Hotelling’s T-squared statistic and shows that the p-value is above 0.05 for all windows smaller than [-1, 1] and approaches 0 as the window gets larger than [-1, 1]. This is why we select the window [-0.8, 0.8] for our main analysis. This window comprises four time intervals (mass points) before and after $Poll_{1}$’s release, going from 8 April at 2:05 p.m. to 10 April at 12 p.m., and has the good property of including all time intervals (except for the treatment interval: 12:05 p.m. to 2 p.m.) before and after the treatment. Thus, not only do we have balance on covariates, but also on types of time intervals.
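For reference, the standard two-sample form of Hotelling’s T-squared statistic is given below. The notation is introduced here and is not taken from the letter: $\bar{x}_1$ and $\bar{x}_0$ are the covariate mean vectors of the treatment and control groups inside a window, $n_1$ and $n_0$ the group sizes, $S_1$ and $S_0$ the group covariance matrices, and $p$ the number of covariates (here $p = 3$).

$$T^2 = \frac{n_1 n_0}{n_1 + n_0} \, (\bar{x}_1 - \bar{x}_0)^{\top} S^{-1} (\bar{x}_1 - \bar{x}_0), \qquad S = \frac{(n_1 - 1) S_1 + (n_0 - 1) S_0}{n_1 + n_0 - 2}$$

Under multivariate normality with a common covariance matrix, the rescaled statistic follows an F distribution:

$$\frac{n_1 + n_0 - p - 1}{p \, (n_1 + n_0 - 2)} \, T^2 \sim F_{p,\; n_1 + n_0 - p - 1}$$

Whether the p-values in Figure 1 come from this F reference distribution or from a randomization distribution of $T^2$ is not stated in the text, so the formula should be read as background rather than as a description of the exact computation.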

Figure 1 Window selection: minimum p-value against window length.

3.2 Main Result

Figure 2 presents a graphical depiction of the estimated RD effect of $Poll_{1}$’s release on turnout intention. On the horizontal axis, we show a range of windows around the cutoff point (0) at which the treatment is assumed to occur. This cutoff is represented by a solid vertical line and the left and right limits of the chosen window are marked by dotted lines. On the vertical axis, we present our measure of turnout intention. Finally, each dot inside the graph represents the average turnout intention reported in a given time interval. Within the 0.8 window, dots to the left of the cutoff tend to be below the dots to the right of it. The two horizontal lines provide the average turnout intention level before and after the cutoff and the vertical difference between these two lines should be interpreted as the ATE. To estimate the ATE, we used local randomization methods (Imbens and Rubin 2015), as implemented by Calonico, Cattaneo, and Titiunik (2014), available in the Stata and R rdrandinf packages (Cattaneo, Titiunik, and Vazquez-Bare 2016) (footnote 7). Confidence intervals (CI) were also estimated using this package. We found that $Poll_{1}$’s release increased the turnout intention by 5.1% (CI: [2.0,9.0]).
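The logic of the estimate can be illustrated with a short Python sketch: a difference in means inside the window, a Fisher permutation p-value for the sharp null of no effect, and a large-sample confidence interval. This is not the rdrandinf implementation. The Neyman normal-approximation interval is a simplification swapped in for the example, the function name is hypothetical, and the outcomes are toy values chosen for clarity rather than the CIS data.

```python
import math
import random


def mean(xs):
    return sum(xs) / len(xs)


def sample_var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def local_randomization_estimate(control, treated, n_perm=1000, seed=0):
    """Difference in means, Fisher permutation p-value, and a Neyman 95% CI."""
    estimate = mean(treated) - mean(control)

    # Fisher test of the sharp null of no effect for any unit: shuffle the
    # pooled outcomes and recompute the difference under random assignment.
    rng = random.Random(seed)
    pooled = list(control) + list(treated)
    n_c = len(control)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        placebo = mean(pooled[n_c:]) - mean(pooled[:n_c])
        if abs(placebo) >= abs(estimate):
            hits += 1
    p_value = hits / n_perm

    # Large-sample (Neyman) interval: an assumed simplification, not the
    # interval reported in the letter.
    se = math.sqrt(
        sample_var(treated) / len(treated) + sample_var(control) / len(control)
    )
    ci = (estimate - 1.96 * se, estimate + 1.96 * se)
    return estimate, p_value, ci


# Toy outcomes inside the window (1 = intends to vote); not the CIS data.
control = [1] * 140 + [0] * 60   # 70% before the release
treated = [1] * 180 + [0] * 20   # 90% after the release

estimate, p_value, ci = local_randomization_estimate(control, treated)
print(round(estimate, 3), p_value, tuple(round(b, 3) for b in ci))
```

The permutation step treats the window as a small randomized experiment, which is exactly what the local randomization assumption licenses once the window has been chosen.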

Figure 2 RD effect for the turnout intention (0.8 window).

3.3 Robustness Checks

So far, we have tested the robustness of our main result through falsification tests of the predetermined covariates. We performed two further robustness checks. First, we assessed the sensitivity of the result to window choice. The result is, in fact, quite robust to window selection: the ATE in the 13 windows reported in Table A2 of the SI Appendix ranges from 3.4% to 5.1%. Second, we report three estimations using placebo cutoffs. The principle underlying the implementation of placebo tests is the same as the one for falsification tests of predetermined outcomes. We performed two falsification tests in the two weeks before the week of the release and one test that uses the treatment sample and exposes units to a pseudo-treatment occurring at the middle of the treatment interval. In these three placebos, the new cutoffs are Tuesday 26 March, Tuesday 2 April, and Wednesday 10 April. We found little evidence of RD effects for any of the possible windows for any of the three placebos (see Tables A3–A5 of the SI Appendix). The effect of the placebo on the outcome variable is not significant for any window, and the magnitude of these effects is small and not even always in the same direction as the main effect.
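A placebo-cutoff check amounts to re-running the same estimator at dates where no treatment occurred. The sketch below shows the idea on deterministic toy data. The data, the helper name `effect_at`, and the mapping of the three placebo dates to 0.2-day steps (which assumes interviews on every calendar day) are assumptions for illustration only.

```python
def effect_at(cutoff, half_width, data):
    """Difference in mean outcome around `cutoff` (all units in 0.2-day steps).

    Units exactly at the cutoff step are dropped, mirroring the donut hole.
    """
    after = [y for s, y in data if 0 < s - cutoff <= half_width]
    before = [y for s, y in data if 0 < cutoff - s <= half_width]
    return sum(after) / len(after) - sum(before) / len(before)


# Toy data (not the CIS survey): ten respondents per mass point, from 16 days
# before to 3 days after the release. Turnout intention is 70% up to the
# release step (0) and 80% after it.
data = []
for step in range(-80, 16):
    yes_answers = 8 if step > 0 else 7
    data.extend((step, 1 if j < yes_answers else 0) for j in range(10))

HALF_WIDTH = 4  # four mass points on each side, as in the [-0.8, 0.8] window
cutoffs = {
    "release, 9 April": 0,
    "placebo, 26 March": -70,  # 14 days earlier
    "placebo, 2 April": -35,   # 7 days earlier
    "placebo, 10 April": 5,    # 1 day later, inside the treated sample
}
effects = {name: effect_at(c, HALF_WIDTH, data) for name, c in cutoffs.items()}
for name, value in effects.items():
    print(f"{name}: {value:.3f}")
```

In this toy setting only the true cutoff shows a jump, while each placebo window lies entirely on one side of the release and returns zero. With real data the placebo estimates would be noisy rather than exactly zero, which is why the letter reports their significance and direction.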

4 Conclusions

In Western democracies, elections are preceded by dozens, sometimes hundreds, of polls trying to forecast the election results. These forecasts are released at different time points before the elections. In this letter, we address the question of whether voters may be affected by information about the predicted election outcomes. We have used a unique dataset resulting from a natural experiment that occurred in Spain before the April 2019 elections: the release of the most important election forecast during the fieldwork of a large-scale poll. Given the nature of our treatment variable, we used a local randomization approach to RD analysis. We found that the poll’s release had a robust positive effect of about 5% on turnout intention. Our estimates reflect short-term effects, which likely differ from the long-term effects. We can speculate that the effect on turnout would be weaker than on turnout intention, especially given the time lag between the measure of turnout intention and the election. Also, our effect decays as we move from the release of the poll to the electoral campaign. Finally, during the campaign other treatments (e.g., debates) may affect actual turnout as well.

A straightforward interpretation of these results is that the poll’s release could have acted as an activation device for the upcoming elections. Furthermore, a feature of the Spanish elections analyzed here may contribute to our general understanding of the interplay between polls and political behavior. Previous papers have found that turnout intention increases when elections are predicted to be close (Matsusaka 1993; Blais 2000). This was not the case in the April 2019 elections in Spain: the Socialist Party led by more than 10 percentage points. Despite this important lead, there was uncertainty about the government coalition that would eventually be formed. The poll predicted a tie between potential left-wing and right-wing coalitions that could have mobilized the voters. This new result on the meaning of election closeness in multiparty systems deserves further attention in subsequent studies (footnote 8).

Data Availability Statement

Replication code for this article has been published at Harvard Dataverse (Miller and Brugarolas 2020).

Supplementary Material

To view supplementary material for this article, please visit


Edited by Jeff Gill

1 According to Titiunik (2020), a natural experiment is a type of observational study in which an external force assigns the treatment of interest.

2 Cattaneo, Titiunik, and Vazquez-Bare (2017) report a previous empirical application and a discussion of the methods employed here.

3 Everything else is allowed to be changing at the cutoff, as long as those changes are continuous.

4 In Table A6 of the SI Appendix, we replicate the analysis excluding all the respondents interviewed during the day when the poll was released. Results hold under this more conservative estimation strategy.

5 Section 1 of the SI Appendix provides further information about the events surrounding the release of $Poll_{1}$.

6 The data, code, and any additional materials required to replicate all analyses in this article are available at Political Analysis Dataverse within the Harvard Dataverse Network (Miller and Brugarolas 2020).

7 The idea of using binomial tests for falsification of the RD design was first proposed by Cattaneo, Titiunik, and Vazquez-Bare (2017).

8 An alternative explanation could be that, after seeing information about a poll where most people claim that they would vote, the social desirability of turnout intention questions increases.


Achen, C., and Blais, A. 2016. “Intention to Vote, Reported Vote and Validated Vote.” In The Act of Voting: Identities, Institutions and Locale, edited by Elkink, J. and Farrell, D., 195–209. New York: Routledge.
Balcells, L., and Torrats-Espinosa, G. 2018. “Using a Natural Experiment to Estimate the Electoral Consequences of Terrorist Attacks.” Proceedings of the National Academy of Sciences 115(42):10624–10629.
Blais, A. 2000. To Vote or Not to Vote? The Merits and Limits of Rational Choice Theory. Pittsburgh, PA: University of Pittsburgh Press.
Blais, A., Young, R., and Lapp, M. 2000. “The Calculus of Voting: An Empirical Test.” European Journal of Political Research 37:181–201.
Calonico, S., Cattaneo, M., and Titiunik, R. 2014. “Robust Data-Driven Inference in the Regression-Discontinuity Design.” The Stata Journal 14(4):909–946.
Cattaneo, M., Frandsen, B., and Titiunik, R. 2015. “Randomization Inference in the Regression Discontinuity Design: An Application to Party Advantages in the U.S. Senate.” Journal of Causal Inference 3(1):1–24.
Cattaneo, M., Idrobo, N., and Titiunik, R. 2020a. A Practical Introduction to Regression Discontinuity Designs: Extensions. Cambridge: Cambridge University Press.
Cattaneo, M., Idrobo, N., and Titiunik, R. 2020b. A Practical Introduction to Regression Discontinuity Designs: Foundations. Cambridge: Cambridge University Press.
Cattaneo, M., Titiunik, R., and Vazquez-Bare, G. 2016. “Inference in Regression Discontinuity Designs Under Local Randomization.” Stata Journal 16(2):331–367.
Cattaneo, M., Titiunik, R., and Vazquez-Bare, G. 2017. “Comparing Inference Approaches for RD Designs: A Reexamination of the Effect of Head Start on Child Mortality.” Journal of Policy Analysis and Management 36(3):643–681.
Cattaneo, M., Titiunik, R., and Vazquez-Bare, G. 2020. “The Regression Discontinuity Design.” Chap. 44 in The SAGE Handbook of Research Methods in Political Science and International Relations, edited by Curini, L. and Franzese, R., 835–857. New York: Sage.
Gerber, A., Hoffman, M., Morgan, J., and Raymond, C. 2020. “One in a Million: Field Experiments on Perceived Closeness of the Election and Voter Turnout.” American Economic Journal: Applied Economics 12(3):287–325.
Großer, J., and Schram, A. 2010. “Public Opinion Polls, Voter Turnout, and Welfare: An Experimental Study.” American Journal of Political Science 54(3):700–717.
Imbens, G., and Rubin, D. 2015. “Causal Inference in Statistics, Social, and Biomedical Sciences.” In Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge University Press.
Matsusaka, J. 1993. “Election Closeness and Voter Turnout: Evidence from California Ballot Proposition.” Public Choice 76:313–334.
Miller, L., and Brugarolas, P. 2020. “Replication data for: The causal effect of polls on turnout intention: A local randomization regression discontinuity approach.” Harvard Dataverse, V1, UNF:6:MZ2djX3cH8yGrKbwJSPGkA== [fileUNF].
Morton, R., Muller, D., Page, L., and Torgler, B. 2015. “Exit Polls, Turnout, and Bandwagon Voting: Evidence from a Natural Experiment.” European Economic Review 77:65–81.
Muñoz, J., Falco-Gimeno, A., and Hernandez, E. 2020. “Unexpected Event During Survey Design: Promise and Pitfalls for Causal Inference.” Political Analysis 28(2):186–206.
Titiunik, R. 2020. “Natural Experiments.” In Advances in Experimental Political Science, edited by Druckman, J. and Green, D., 835–857. New York: Cambridge University Press.
