
Multi-modes for Detecting Experimental Measurement Error

Published online by Cambridge University Press:  14 October 2019

Raymond Duch
Affiliation:
Nuffield College, University of Oxford, Oxford, UK. Email: raymond.duch@nuffield.ox.ac.uk
Denise Laroze
Affiliation:
Centre for Experimental Social Sciences and Departamento de Administración, Universidad de Santiago de Chile, Santiago, Chile. Email: denise.laroze@usach.cl
Thomas Robinson
Affiliation:
Department of Politics and International Relations, University of Oxford, Oxford, UK. Email: thomas.robinson@politics.ox.ac.uk
Pablo Beramendi
Affiliation:
Department of Political Science, Duke University, Durham, NC 27708, USA. Email: pablo.beramendi@duke.edu

Abstract

Experiments should be designed to facilitate the detection of experimental measurement error. To this end, we advocate implementing identical experimental protocols across diverse experimental modes. We suggest iterative nonparametric estimation techniques for assessing the magnitude of heterogeneous treatment effects across these modes, and we propose two diagnostic strategies (measurement metrics embedded in experiments, and measurement experiments) that help assess whether any observed heterogeneity reflects experimental measurement error. To illustrate our argument, we first conduct and analyze results from four identical interactive experiments: in the lab; online with subjects from the CESS lab subject pool; online with an online subject pool; and online with MTurk workers. Second, we implement a measurement experiment in India with CESS Online subjects and MTurk workers.
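The abstract's core diagnostic can be sketched in code: estimate conditional average treatment effects (CATEs) separately within each experimental mode run under an identical protocol, then compare the estimates across modes. The sketch below is illustrative only, not the authors' replication code; it uses a T-learner (one of the metalearners in the Künzel et al. reference) with gradient-boosted trees on simulated data, where an attenuated "online" mode stands in for mode-specific measurement error. All function names and the simulation setup are assumptions for the example.

```python
# Illustrative sketch (not the authors' method as published): compare
# nonparametric CATE estimates across two experimental modes.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

def cate_for_mode(X, treat, y):
    """T-learner: fit separate outcome models for treated and control
    subjects, and return the per-subject difference in predictions."""
    m1 = GradientBoostingRegressor().fit(X[treat == 1], y[treat == 1])
    m0 = GradientBoostingRegressor().fit(X[treat == 0], y[treat == 0])
    return m1.predict(X) - m0.predict(X)

def simulate(n, attenuation):
    """Identical protocol in both modes; `attenuation` < 1 mimics a mode
    that measures the treatment response with systematic error."""
    X = rng.normal(size=(n, 3))          # subject covariates
    treat = rng.integers(0, 2, size=n)   # random assignment
    true_cate = 1.0 + X[:, 0]            # heterogeneous true effect
    y = X.sum(axis=1) + attenuation * true_cate * treat + rng.normal(size=n)
    return X, treat, y

cates = {}
for mode, att in [("lab", 1.0), ("online", 0.5)]:
    X, treat, y = simulate(2000, att)
    cates[mode] = cate_for_mode(X, treat, y)

# A persistent cross-mode gap in effect estimates under an identical
# protocol is the kind of heterogeneity the diagnostics are meant to probe.
for mode, tau in cates.items():
    print(f"{mode}: mean CATE = {tau.mean():.2f}")
```

In this simulation the "lab" mean CATE should sit near the true average effect while the "online" estimate is attenuated; the paper's embedded measurement metrics and measurement experiments are then what distinguish genuine subject-pool heterogeneity from measurement error.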

Type
Articles
Copyright
Copyright © The Author(s) 2019. Published by Cambridge University Press on behalf of the Society for Political Methodology.

Footnotes

Authors’ note: We acknowledge the contributions of the Nuffield College Centre for Experimental Social Sciences postdocs who were instrumental in helping design and implement the experiments reported in the manuscript: John Jensenius III, Aki Matsuo, Sonke Ehret, Mauricio Lopez, Hector Solaz, Wojtek Przepiorka, David Klinowski, Sonja Vogt, and Amma Parin. We have also benefited from very helpful comments from colleagues, including Vera Troeger, Thomas Pluemper, Dominik Duell, Luke Keele, and Mats Ahrenshop, and we thank the Political Analysis reviewers, editor, and editorial team, who were extremely helpful. Of course, we assume responsibility for all shortcomings of the design and analysis. All replication materials are available from the Political Analysis Dataverse, doi.org/10.7910/DVN/F0GMX1 (Duch et al. 2019).

Contributing Editor: Jeff Gill

References

Ahlquist, J. S. 2018. “List Experiment Design, Non-Strategic Respondent Error, and Item Count Technique Estimators.” Political Analysis 26(1):34–53.
Al-Ubaydli, O., List, J. A., LoRe, D., and Suskind, D. 2017. “Scaling for Economists: Lessons from the Non-Adherence Problem in the Medical Literature.” Journal of Economic Perspectives 31(4):125–144.
Athey, S., and Imbens, G. 2017. “The Econometrics of Randomized Experiments.” Handbook of Economic Field Experiments 1:73–140.
Bader, F., Baumeister, B., Berger, R., and Keuschnigg, M. 2019. “On the Transportability of Laboratory Results.” Sociological Methods & Research, https://doi.org/10.1177/0049124119826151.
Bertrand, M., and Mullainathan, S. 2001. “Do People Mean What They Say? Implications for Subjective Survey Data.” Economics and Social Behavior 91(2):67–72.
Blair, G., Chou, W., and Imai, K. 2019. “List Experiments with Measurement Error.” Political Analysis, https://doi.org/10.1017/pan.2018.56.
Blattman, C., Jamison, J., Koroknay-Palicz, T., Rodrigues, K., and Sheridan, M. 2016. “Measuring the Measurement Error: A Method to Qualitatively Validate Survey Data.” Journal of Development Economics 120:99–112.
Burleigh, T., Kennedy, R., and Clifford, S. 2018. “How to Screen Out VPS and International Respondents Using Qualtrics: A Protocol.” Working Paper.
Camerer, C. 2015. The Promise and Success of Lab-Field Generalizability in Experimental Economics: A Critical Reply to Levitt and List. Oxford Scholarship Online.
Centola, D. 2018. How Behavior Spreads: The Science of Complex Contagions. Princeton, NJ: Princeton University Press.
Chang, L., and Krosnick, J. A. 2009. “National Surveys via RDD Telephone Interviewing versus the Internet: Comparing Sample Representativeness and Response Quality.” Public Opinion Quarterly 73(4):641–678.
Coppock, A. 2018. “Generalizing from Survey Experiments Conducted on Mechanical Turk: A Replication Approach.” Political Science Research and Methods 7(3):613–628.
Coppock, A., Leeper, T. J., and Mullinix, K. J. 2018. “Generalizability of Heterogeneous Treatment Effect Estimates Across Samples.” Proceedings of the National Academy of Sciences 115(49):12441–12446.
de Quidt, J., Haushofer, J., and Roth, C. 2018. “Measuring and Bounding Experimenter Demand.” American Economic Review 108(11):3266–3302.
Duch, R., Laroze, D., and Zakharov, A. 2018. “Once a Liar Always a Liar?” Nuffield Centre for Experimental Social Sciences Working Paper.
Duch, R., Laroze, D., Robinson, T., and Beramendi, P. 2019. “Replication Data for: Multi-Modes for Detecting Experimental Measurement Error.” https://doi.org/10.7910/DVN/F0GMX1, Harvard Dataverse, V1, UNF:6:4Q/frMH7kswKFTcwuHsmGQ== [fileUNF].
Dupas, P., and Miguel, E. 2017. “Impacts and Determinants of Health Levels in Low-Income Countries.” In Handbook of Economic Field Experiments, edited by Duflo, E. and Banerjee, A., 3–93. Amsterdam: North-Holland.
Engel, C., and Kirchkamp, O. 2018. “Measurement Errors of Risk Aversion and How to Correct Them.” Working Paper.
Gelman, A. 2013. “Preregistration of Studies and Mock Reports.” Political Analysis 21(1):40–41.
Gerber, A. S., and Green, D. P. 2008. “Field Experiments and Natural Experiments.” In The Oxford Handbook of Political Methodology, 357–381. Oxford: Oxford University Press.
Gillen, B., Snowberg, E., and Yariv, L. 2019. “Experimenting with Measurement Error: Techniques with Applications to the Caltech Cohort Study.” Journal of Political Economy 127(4):1826–1863.
Gooch, A., and Vavreck, L. 2019. “How Face-to-Face Interviews and Cognitive Skill Affect Item Non-Response: A Randomized Experiment Assigning Mode of Interview.” Political Science Research and Methods 7(1):143–162.
Green, D. P., and Kern, H. L. 2012. “Modeling Heterogeneous Treatment Effects in Survey Experiments with Bayesian Additive Regression Trees.” Public Opinion Quarterly 76(3):491–511.
Grimmer, J., Messing, S., and Westwood, S. J. 2017. “Estimating Heterogeneous Treatment Effects and the Effects of Heterogeneous Treatments with Ensemble Methods.” Political Analysis 25(4):413–434.
Hill, J. L. 2011. “Bayesian Nonparametric Modeling for Causal Inference.” Journal of Computational and Graphical Statistics 20(1):217–240.
Huff, C., and Tingley, D. 2015. “‘Who Are These People?’ Evaluating the Demographic Characteristics and Political Preferences of MTurk Survey Respondents.” Research & Politics 2(3):2053168015604648.
Imai, K., and Ratkovic, M. 2013. “Estimating Treatment Effect Heterogeneity in Randomized Programme Evaluation.” The Annals of Applied Statistics 7(1):443–470.
Imai, K., and Strauss, A. 2011. “Estimation of Heterogeneous Treatment Effects from Randomized Experiments, with Application to the Optimal Planning of the Get-Out-the-Vote Campaign.” Political Analysis 19(1):1–19.
Kennedy, R., Clifford, S., Burleigh, T., Jewell, R., and Waggoner, P. 2018. “The Shape of and Solutions to the MTurk Quality Crisis.” Working Paper.
Künzel, S. R., Sekhon, J. S., Bickel, P. J., and Yu, B. 2019. “Metalearners for Estimating Heterogeneous Treatment Effects Using Machine Learning.” Proceedings of the National Academy of Sciences 116(10):4156–4165.
Levitt, S., and List, J. 2007. “What Do Laboratory Experiments Measuring Social Preferences Reveal About the Real World?” The Journal of Economic Perspectives 21(7):153–174.
Levitt, S. D., and List, J. A. 2015. “What Do Laboratory Experiments Measuring Social Preferences Reveal About the Real World?” Oxford Scholarship Online.
Loomes, G. 2005. “Modelling the Stochastic Component of Behaviour in Experiments: Some Issues for the Interpretation of Data.” Experimental Economics 8(4):301–323.
Maniadis, Z., Tufano, F., and List, J. A. 2014. “One Swallow Doesn’t Make a Summer: New Evidence on Anchoring Effects.” American Economic Review 104(1):277–290.
Morton, R., and Williams, K. 2010. From Nature to the Lab: Experimental Political Science and the Study of Causality. New York: Cambridge University Press.
Mullainathan, S., and Spiess, J. 2017. “Machine Learning: An Applied Econometric Approach.” Journal of Economic Perspectives 31(2):87–106.
Mutz, D. C. 2011. Population-Based Survey Experiments. Princeton, NJ: Princeton University Press.
Open Science Collaboration. 2015. “Estimating the Reproducibility of Psychological Science.” Science 349(6251):aac4716.
Tourangeau, R., and Yan, T. 2007. “Sensitive Questions in Surveys.” Psychological Bulletin 133(5):859–883.
Wager, S., and Athey, S. 2018. “Estimation and Inference of Heterogeneous Treatment Effects Using Random Forests.” Journal of the American Statistical Association 113(523):1228–1242.
Zizzo, D. J. 2010. “Experimenter Demand Effects in Economic Experiments.” Experimental Economics 13(1):75–98.

Duch et al. supplementary material

File 510 KB
