Generalizing toward Nonrespondents: Effect Estimates in Survey Experiments Are Broadly Similar for Eager and Reluctant Participants

Published online by Cambridge University Press:  17 May 2024

Philip Moniz
Affiliation:
Department of Government, University of Texas at Austin, Austin, 78712 TX, USA
Rodrigo Ramirez-Perez
Affiliation:
Department of Political Science, University of California, Berkeley, Berkeley, 94704 CA, USA
Erin Hartman
Affiliation:
Department of Political Science, University of California, Berkeley, Berkeley, 94704 CA, USA
Stephen Jessee*
Affiliation:
Department of Government, University of Texas at Austin, Austin, 78712 TX, USA
*Corresponding author: Stephen Jessee; Email: sjessee@utexas.edu

Abstract

Survey experiments on probability samples are a popular method for investigating population-level causal questions due to their strong internal validity. However, lower survey response rates and an increased reliance on online convenience samples raise questions about the generalizability of survey experiments. We examine this concern using data from a collection of 50 survey experiments which represent a wide range of social science studies. Recruitment for these studies employed a unique double sampling strategy that first obtains a sample of “eager” respondents and then employs much more aggressive recruitment methods with the goal of adding “reluctant” respondents to the sample in a second sampling wave. This approach substantially increases the number of reluctant respondents who participate and also allows for straightforward categorization of eager and reluctant survey respondents within each sample. We find no evidence that treatment effects for eager and reluctant respondents differ substantially. Within demographic categories often used for weighting surveys, there is also little evidence of response heterogeneity between eager and reluctant respondents. Our results suggest that social science findings based on survey experiments, even in the modern era of very low response rates, provide reasonable estimates of population average treatment effects among a deeper pool of survey respondents in a wide range of settings.

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press on behalf of The Society for Political Methodology

Figure 1 Example ranges of respondent eagerness/reluctance among different types of samples. Ranges chosen for illustrative purposes.

Figure 2 Standardized ATEs among reluctant and eager respondents closely resemble each other. Points show treatment effects estimated separately among eager and reluctant respondents, with 95% confidence interval bars. The solid line shows the Deming regression fit, and the shaded region shows its block-bootstrapped 95% confidence region.
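As a rough illustration of the Deming regression named in the Figure 2 caption, the following sketch fits a line when both the eager and the reluctant estimates carry sampling error. It is not the authors' code: the function name `deming_fit`, the data values, and the default error-variance ratio `delta = 1.0` (which gives orthogonal regression) are assumptions made for illustration, and the block bootstrap used for the confidence region is not reproduced here.

```python
import math


def deming_fit(x, y, delta=1.0):
    """Fit y = intercept + slope * x by Deming regression.

    delta is the assumed ratio of the error variance in y to the
    error variance in x. delta = 1.0 (an assumption in this sketch)
    corresponds to orthogonal regression.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")

    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sample variances and covariance.
    s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    s_yy = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    if s_xy == 0:
        raise ValueError("slope is undefined when the covariance is zero")

    # Closed-form Deming slope for a known error-variance ratio.
    diff = s_yy - delta * s_xx
    slope = (diff + math.sqrt(diff ** 2 + 4 * delta * s_xy ** 2)) / (2 * s_xy)
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical standardized effect estimates, not taken from the article.
eager_effects = [0.00, 0.10, 0.20, 0.30, 0.45]
reluctant_effects = [0.02, 0.08, 0.23, 0.28, 0.47]
intercept, slope = deming_fit(eager_effects, reluctant_effects)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

Under this reading, a slope near 1 and an intercept near 0 would correspond to effect estimates that are similar across the two groups. Ordinary least squares would instead treat the eager estimates as error-free, which tends to attenuate the slope.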

Figure 3 Deming regression estimates relating CATE estimates for eager and reluctant respondents show that effects are similar within subgroups.

Table 1 An analysis of treatment effect heterogeneity. We conducted an omnibus test provided in the grf R package. We also compared the estimates among eager and NRFU respondents. Most studies show no statistically significant treatment effect heterogeneity in either test (Neither). Ten show evidence of heterogeneity using the omnibus test, but it is not driven by differences between eager and NRFU respondents (Omnibus Only). Two studies show no evidence using the omnibus test, but do display NRFU-specific heterogeneity (NRFU Only), and two others display both general and NRFU-specific effect heterogeneity (Both).

Figure 4 Correlations of individual-level treatment effects, $\widehat {\tau _i}$, and propensities of being a reluctant respondent, $\hat {\pi }_i^{*}$, are weak and do not relate to cases with significant treatment effect heterogeneity. The hollow points are cases with significant heterogeneity tests according to omnibus tests from the causal random forest models and solid circles are cases where no significant heterogeneity is detected.

Table 2 Confusion matrix showing the number of correctly and incorrectly predicted observations by the random forest model on the training set (based on out-of-bag data). Below the matrix are the error rate from the 10-fold cross-validation parameter-tuning procedure on the training sample, the out-of-bag error rate for the tuned model on the training set, and the out-of-sample error rate on the test set. Error rates are proportions of cases predicted incorrectly. AUROC is the area under the ROC curve, which plots the true positive rate against the false positive rate.
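The quantities named in the Table 2 caption can be computed as in the minimal sketch below. It is only an illustration: the function names, the labels (1 = reluctant, 0 = eager), the scores, and the 0.5 classification threshold are all hypothetical, and the AUROC is computed through its pairwise-ranking interpretation rather than by tracing the ROC curve.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1/0)."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
    }


def error_rate(y_true, y_pred):
    """Proportion of cases predicted incorrectly."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def auroc(y_true, scores):
    """Area under the ROC curve.

    Uses the equivalence between AUROC and the probability that a
    randomly chosen positive case receives a higher score than a
    randomly chosen negative case, counting ties as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted propensities, not from the article.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(confusion_counts(labels, preds))
print(error_rate(labels, preds))
print(auroc(labels, scores))
```

The error rate depends on the chosen threshold, whereas the AUROC summarizes ranking quality across all thresholds, which is one reason a caption might report both.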

Supplementary material: File

Moniz et al. supplementary material (File, 631.9 KB)