Using Split Samples to Improve Inference on Causal Effects

Marcel Fafchamps; Julien Labonne

doi:10.1017/pan.2017.22

Published online by Cambridge University Press: 18 September 2017

Marcel Fafchamps: Stanford University, Freeman Spogli Institute for International Studies, Encina Hall E105, Stanford, CA 94305, USA. Email: fafchamp@stanford.edu
Julien Labonne*: Blavatnik School of Government, University of Oxford, Radcliffe Observatory Quarter, Woodstock Road, Oxford, OX2 6GG, UK. Email: julien.labonne@bsg.ox.ac.uk
*Email: julien.labonne@bsg.ox.ac.uk

Abstract

We discuss a statistical procedure to carry out empirical research that combines recent insights about preanalysis plans (PAPs) and replication. Researchers send their datasets to an independent third party who randomly generates training and testing samples. Researchers perform their analysis on the training sample and are able to incorporate feedback from colleagues, editors, and referees. Once the paper is accepted for publication, the method is applied to the testing sample and it is those results that are published. Simulations indicate that, under empirically relevant settings, the proposed method delivers more power than a PAP. The effect mostly operates through a lower likelihood that relevant hypotheses are left untested. The method appears better suited for exploratory analyses where there is significant uncertainty about the outcomes of interest. We do not recommend using the method in situations where the treatment is very costly and thus the available sample size is limited. An interpretation of the method is that it allows researchers to perform direct replication of their work. We also discuss a number of practical issues about the method’s feasibility and implementation.
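The random split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_sample`, the 50/50 `train_share`, and the seed value are all assumptions made for illustration, and the abstract does not specify how the third party performs the draw.

```python
import random


def split_sample(unit_ids, seed, train_share=0.5):
    """Randomly assign units to a training and a testing sample.

    Hypothetical helper: the seed stands in for the randomness that the
    independent third party would keep private, and train_share=0.5 is an
    assumed split, not a value taken from the article.
    """
    if not 0.0 < train_share < 1.0:
        raise ValueError("train_share must be strictly between 0 and 1")
    rng = random.Random(seed)      # private generator, reproducible from the seed
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)          # random permutation of the units
    cut = round(len(shuffled) * train_share)
    # First `cut` units form the training sample, the rest the testing sample.
    return sorted(shuffled[:cut]), sorted(shuffled[cut:])


# Researchers would receive only the training units; the testing units
# are held back until the paper is accepted.
train_ids, test_ids = split_sample(range(1000), seed=20170918)
```

Under this sketch, all exploratory work and revisions would use `train_ids` only, and the final specification would be run once on `test_ids`.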

Information

Type: Articles
Information: Political Analysis, Volume 25, Issue 4, October 2017, pp. 465–482

DOI: https://doi.org/10.1017/pan.2017.22
Copyright: Copyright © The Author(s) 2017. Published by Cambridge University Press on behalf of the Society for Political Methodology.

Footnotes

Authors’ note: We thank Michael Alvarez (Co-Editor), two anonymous referees, Rob Garlick, and Kate Vyborny for discussions and comments. All remaining errors are ours. Replication data are available on the Harvard Dataverse (Fafchamps and Labonne 2017). Supplementary materials for this article are available on the Political Analysis Web site.

Contributing Editor: R. Michael Alvarez

References

Anderson, Michael L. 2008. Multiple inference and gender differences in the effects of early intervention: A reevaluation of the Abecedarian, Perry Preschool, and Early Training Projects. Journal of the American Statistical Association 103(484):1481–1495.

Athey, Susan, and Imbens, Guido. 2015. Machine learning methods for estimating heterogeneous causal effects. Stanford University. Mimeo.

Bell, Mark, and Miller, Nicholas. 2015. Questioning the effect of nuclear weapons on conflict. Journal of Conflict Resolution 59(1):74–92.

Belloni, Alexandre, Chernozhukov, Victor, and Hansen, Christian. 2014. High-dimensional methods and inference on structural and treatment effects. Journal of Economic Perspectives 28(2):29–50.

Benjamini, Yoav, Krieger, Abba M., and Yekutieli, Daniel. 2006. Adaptive linear step-up procedures that control the false discovery rate. Biometrika 93(3):491–507.

Benjamini, Yoav, and Yekutieli, Daniel. 2001. The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics 29(4):1165–1188.

Benjamini, Yoav, and Hochberg, Yosef. 1995. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological) 57(1):289–300.

Blair, Graeme, Cooper, Jasper, Coppock, Alexander, and Humphreys, Macartan. 2016. Declaring and diagnosing research designs. Columbia University. Mimeo.

Brodeur, Abel, Le, Mathias, Sangnier, Marc, and Zylberberg, Yanos. 2016. Star wars: The empirics strike back. American Economic Journal: Applied Economics 8(1):1–32.

Coffman, Lucas C., and Niederle, Muriel. 2015. Pre-analysis plans are not the solution replications might be. Journal of Economic Perspectives 29(3):81–98.

Dunning, Thad. 2016. Transparency, replication, and cumulative learning: What experiments alone cannot achieve. Annual Review of Political Science 19(1):S1–S23.

Einav, Liran, and Levin, Jonathan. 2014. Economics in the age of big data. Science 346(6210):715.

Fafchamps, Marcel, and Labonne, Julien. 2017. Replication data for “Using split samples to improve inference on causal effects”. doi:10.7910/DVN/Q0IXQY, Harvard Dataverse, V1.

Findley, Michael G., Jensen, Nathan M., Malesky, Edmund J., and Pepinsky, Thomas B. Forthcoming. Can results-free review reduce publication bias? The results and implications of a pilot study. Comparative Political Studies.

Franco, Annie, Malhotra, Neil, and Simonovits, Gabor. 2014. Publication bias in the social sciences: Unlocking the file drawer. Science 345(6203):1502–1505.

Gelman, Andrew. 2014. Preregistration: What’s in it for you? http://andrewgelman.com/2014/03/10/preregistration-whats/.

Gelman, Andrew. 2015. The connection between varying treatment effects and the crisis of unreplicable research. Journal of Management 41(2):632–643.

Gelman, Andrew, Carlin, John, Stern, Hal, Dunson, David, Vehtari, Aki, and Rubin, Donald. 2013. Bayesian data analysis. 3rd edn. London: Chapman and Hall/CRC.

Gerber, Alan, and Malhotra, Neil. 2008. Do statistical reporting standards affect what is published? Publication bias in two leading political science journals. Quarterly Journal of Political Science 3(3):313–326.

Gerber, Alan S., Green, Donald P., and Nickerson, David. 2001. Testing for publication bias in political science. Political Analysis 9(4):385–392.

Green, Don, Humphreys, Macartan, and Smith, Jenny. 2013. Read it, understand it, believe it, use it: Principles and proposals for a more credible research publication. Columbia University. Mimeo.

Grimmer, Justin. 2015. We are all social scientists now: How big data, machine learning, and causal inference work together. PS: Political Science & Politics 48(1):80–83.

Hainmueller, Jens, and Hazlett, Chad. 2013. Kernel regularized least squares: Reducing misspecification bias with a flexible and interpretable machine learning approach. Political Analysis 22(2):143–168.

Hartman, Erin, and Hidalgo, F. Daniel. 2015. What’s the alternative?: An equivalence approach to balance and placebo tests. UCLA. Mimeo.

Humphreys, Macartan, Sanchez de la Sierra, Raul, and van der Windt, Peter. 2013. Fishing, commitment, and communication: A proposal for comprehensive nonbinding research registration. Political Analysis 21(1):1–20.

Ioannidis, John. 2005. Why most published research findings are false. PLOS Medicine 2(8):e124.

Laitin, David D. 2013. Fisheries management. Political Analysis 21:42–47.

Leamer, Edward. 1974. False models and post-data model construction. Journal of the American Statistical Association 69(345):122–131.

Leamer, Edward. 1978. Specification searches. Ad hoc inference with nonexperimental data. New York, NY: Wiley.

Leamer, Edward. 1983. Let’s take the Con out of econometrics. American Economic Review 73(1):31–43.

Lin, Winston, and Green, Donald P. 2016. Standard operating procedures: A safety net for pre-analysis plans. PS: Political Science & Politics 49(3):495–500.

Lovell, M. 1983. Data mining. Review of Economics and Statistics 65(1):1–12.

Miguel, E., Camerer, C., Casey, K., Cohen, J., Esterling, K. M., Gerber, A., Glennerster, R., Green, D. P., Humphreys, M., Imbens, G., Laitin, D., Madon, T., Nelson, L., Nosek, B. A., Petersen, M., Sedlmayr, R., Simmons, J. P., Simonsohn, U., and Van der Laan, M. 2014. Promoting transparency in social science research. Science 343(6166):30–31.

Monogan, James E. 2015. Research preregistration in political science: The case, counterarguments, and a response to critiques. PS: Political Science & Politics 48(3):425–429.

Nyhan, Brendan. 2015. Increasing the credibility of political science research: A proposal for journal reforms. PS: Political Science & Politics 48(S1):78–83.

Olken, Benjamin. 2015. Pre-analysis plans in economics. Journal of Economic Perspectives 29(3):61–80.

Pepinsky, Tom. 2013. The perilous peer review process. http://tompepinsky.com/2013/09/16/the-perilous-peer-review-process/.

Rauchhaus, Robert. 2009. Evaluating the nuclear peace hypothesis: A quantitative approach. Journal of Conflict Resolution 53(2):258–277.

Sankoh, A. J., Huque, M. F., and Dubey, S. D. 1997. Some comments on frequently used multiple endpoint adjustment methods in clinical trials. Statistics in Medicine 16(22):2529–2542.
