
Who's cheating on your survey? A detection approach with digital trace data

Published online by Cambridge University Press:  28 November 2022

Simon Munzert*
Affiliation:
Hertie School, Berlin, Germany
Sebastian Ramirez-Ruiz
Affiliation:
Hertie School, Berlin, Germany
Pablo Barberá
Affiliation:
Political Science and International Relations, University of Southern California, Los Angeles, USA
Andrew M. Guess
Affiliation:
Department of Politics, Princeton University, Princeton, USA
JungHwan Yang
Affiliation:
University of Illinois at Urbana-Champaign, Urbana, USA
*Corresponding author. Email: munzert@hertie-school.org

Abstract

In this note, we provide direct evidence of cheating in online assessments of political knowledge. We combine survey responses with web tracking data of a German and a US online panel to assess whether people turn to external sources for answers. We observe item-level prevalence rates of cheating that range from 0 to 12 percent depending on question type and difficulty, and find that 23 percent of respondents engage in cheating at least once across waves. In the US panel, which employed a commitment pledge, we observe cheating behavior among less than 1 percent of respondents. We find robust respondent- and item-level characteristics associated with cheating. However, item-level instances of cheating are rare events; as such, they are difficult to predict and correct for without tracking data. Even so, our analyses comparing naive and cheating-corrected measures of political knowledge provide evidence that cheating does not substantially distort inferences.

Information

Type
Research Note
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © The Author(s), 2022. Published by Cambridge University Press on behalf of the European Political Science Association

Fig. 1. Distribution of instances of cheating at the respondent and item level. (a) Respondent-level cheating distribution. (b) Item-level cheating distribution. Note: Number of respondents: 666; number of items: 68. Panel (a) is based on instances of cheating identified in the tracking data at the respondent level, showing that for 77 percent of the respondents we find no evidence of cheating in the tracking data. Panel (b) is based on instances of cheating identified in the tracking data at the item level, showing systematic variation of propensities to cheat by item type.
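The two summaries in Fig. 1 can be sketched as simple aggregations over response-level flags. The following is a minimal illustration, not the authors' code: it assumes a hypothetical table of `(respondent_id, item_id, cheated)` records in which `cheated` marks a response where the tracking data showed a lookup, and the toy records are invented purely for demonstration.

```python
from collections import defaultdict


def cheating_rates(responses):
    """Aggregate response-level cheating flags.

    `responses` is an iterable of (respondent_id, item_id, cheated) tuples.
    This record layout is an assumption made for illustration only.
    Returns (item_rate, share_ever): the per-item share of flagged
    responses, and the share of respondents flagged at least once.
    """
    by_item = defaultdict(list)
    by_respondent = defaultdict(list)
    for respondent_id, item_id, cheated in responses:
        by_item[item_id].append(bool(cheated))
        by_respondent[respondent_id].append(bool(cheated))

    # Item-level prevalence: flagged responses divided by all responses to the item.
    item_rate = {item: sum(flags) / len(flags) for item, flags in by_item.items()}
    # Respondent-level prevalence: share of respondents with at least one flag.
    share_ever = sum(any(flags) for flags in by_respondent.values()) / len(by_respondent)
    return item_rate, share_ever


# Hypothetical toy records, not data from the study.
toy = [
    ("r1", "q1", False), ("r1", "q2", True),
    ("r2", "q1", False), ("r2", "q2", False),
    ("r3", "q1", False), ("r3", "q2", False),
    ("r4", "q1", False), ("r4", "q2", True),
]
item_rate, share_ever = cheating_rates(toy)
```

Keeping the two aggregations separate mirrors the distinction the caption draws: an item can have a low cheating rate even when a sizeable share of respondents cheat at least once across many items.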


Fig. 2. Estimated effects of respondent and item characteristics on response-level cheating incidence. (a) Fixed-effects estimates. (b) Predicted probabilities. Note: Results from a Bayesian logistic mixed-effects model with person and item random effects. Posterior means along with 80 and 95 percent credible intervals reported. Number of observations: 35,486; number of respondents: 656; number of items: 68. To compute the predicted probabilities, numeric covariates are held at their means and the other covariates are set to: female, intermediate education, item type “Elite - Verbal”, and habitual survey-taker.
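One generic way to write a logistic mixed-effects model with person and item random effects, as named in the note to Fig. 2, is given below. The notation is an assumption introduced here for illustration: $y_{ij}$ indicates whether respondent $i$ cheated on item $j$, $x_i$ and $z_j$ stand for respondent and item covariates, and $u_i$ and $v_j$ are the person and item random effects. The priors and exact covariate set are not stated in this excerpt and are left unspecified.

```latex
\begin{aligned}
y_{ij} &\sim \operatorname{Bernoulli}(\pi_{ij}), \\
\operatorname{logit}(\pi_{ij}) &= \beta_0 + x_i^{\top}\beta + z_j^{\top}\gamma + u_i + v_j, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad v_j \sim \mathcal{N}(0, \sigma_v^2).
\end{aligned}
```

Under this formulation, a predicted probability for a covariate profile is $\operatorname{logit}^{-1}(\beta_0 + x^{\top}\beta + z^{\top}\gamma)$ evaluated at the chosen covariate values, with the random effects set to zero or averaged over; which of these the authors use is not stated here.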

Supplementary material

Munzert et al. supplementary material (File, 2.8 MB)
Munzert_et_al._Dataset (File)