The Generalizability of Survey Experiments*

  • Kevin J. Mullinix, Thomas J. Leeper, James N. Druckman, and Jeremy Freese

Survey experiments have become a central methodology across the social sciences. Researchers can combine experiments’ causal power with the generalizability of population-based samples. Yet, due to the expense of population-based samples, much research relies on convenience samples (e.g. students, online opt-in samples). The emergence of affordable, but non-representative online samples has reinvigorated debates about the external validity of experiments. We conduct two studies of how experimental treatment effects obtained from convenience samples compare to effects produced by population samples. In Study 1, we compare effect estimates from four different types of convenience samples and a population-based sample. In Study 2, we analyze treatment effects obtained from 20 experiments implemented on a population-based sample and Amazon's Mechanical Turk (MTurk). The results reveal considerable similarity between many treatment effects obtained from convenience and nationally representative population-based samples. While the results thus bolster confidence in the utility of convenience samples, we conclude with guidance for the use of a multitude of samples for advancing scientific knowledge.


The authors acknowledge support from a National Science Foundation grant for Time-Sharing Experiments in the Social Sciences (SES-1227179). Druckman and Freese are co-Principal Investigators of TESS, and Study 2 was designed and funded as a methodological component of their TESS grant. Study 1 includes data in part funded by an NSF Doctoral Dissertation Improvement Grant to Leeper (SES-1160156) and in part collected via a successful proposal to TESS by Mullinix and Leeper. Druckman and Freese were neither involved in Study 1 nor with any part of the review or approval of Mullinix and Leeper's TESS proposal (via recusal, given other existing collaborations). Only after data from both studies were collected did authors determine that the two studies were so complementary that it would be better to publish them together. The authors thank Lene Aarøe, Kevin Arceneaux, Christoph Arndt, Adam Berinsky, Emily Cochran Bech, Scott Clifford, Adrienne Hosek, Cindy Kam, Lasse Laustsen, Diana Mutz, Helene Helboe Pedersen, Richard Shafranek, Flori So, Rune Slothuus, Rune Stubager, Magdalena Wojcieszak, workshop participants at Southern Denmark University, and participants at The American Panel Survey Workshop at Washington University, St. Louis.

Journal of Experimental Political Science
  • ISSN: 2052-2630
  • EISSN: 2052-2649
  • URL: /core/journals/journal-of-experimental-political-science
Supplementary Materials

  • Mullinix supplementary material 1 — Word document (103 KB)