
Combining Double Sampling and Bounds to Address Nonignorable Missing Outcomes in Randomized Experiments

  • Alexander Coppock, Alan S. Gerber, Donald P. Green and Holger L. Kern

Missing outcome data plague many randomized experiments. Common solutions rely on ignorability assumptions that may not be credible in all applications. We propose a method for confronting missing outcome data that makes fairly weak assumptions but can still yield informative bounds on the average treatment effect. Our approach is based on a combination of the double sampling design and nonparametric worst-case bounds. We derive a worst-case bounds estimator under double sampling and provide analytic expressions for variance estimators and confidence intervals. We also propose a method for covariate adjustment using poststratification and a sensitivity analysis for nonignorable missingness. Finally, we illustrate the utility of our approach using Monte Carlo simulations and a placebo-controlled randomized field experiment on the effects of persuasion on social attitudes with survey-based outcome measures.
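The core idea of combining double sampling with worst-case bounds can be illustrated with a minimal sketch. This is not the authors' estimator or their R software; it is a simplified illustration that assumes a bounded outcome, a random follow-up subsample of initial nonrespondents, and summary statistics per experimental arm. The names `ArmSummary`, `mean_bounds`, and `ate_bounds`, and all numbers, are hypothetical. Units still missing after follow-up are assigned the lowest or highest possible outcome value, so only their share drives the width of the bounds.

```python
from dataclasses import dataclass


@dataclass
class ArmSummary:
    """Hypothetical per-arm summary under double sampling."""
    p_initial: float      # share responding to the first measurement attempt
    mean_initial: float   # mean outcome among initial respondents
    p_followup: float     # response share among followed-up nonrespondents
    mean_followup: float  # mean outcome among follow-up respondents


def mean_bounds(arm, y_min, y_max):
    """Worst-case bounds on the arm's mean outcome.

    The follow-up subsample is assumed to be a random draw from the
    initial nonrespondents, so its response share and mean stand in
    for all initial nonrespondents.
    """
    nonresp = 1.0 - arm.p_initial
    # Part of the mean that is pinned down by observed data.
    observed = (arm.p_initial * arm.mean_initial
                + nonresp * arm.p_followup * arm.mean_followup)
    # Share of units whose outcomes remain unknown after follow-up.
    still_missing = nonresp * (1.0 - arm.p_followup)
    return (observed + still_missing * y_min,
            observed + still_missing * y_max)


def ate_bounds(treat, control, y_min, y_max):
    """Worst-case bounds on the average treatment effect."""
    t_lo, t_hi = mean_bounds(treat, y_min, y_max)
    c_lo, c_hi = mean_bounds(control, y_min, y_max)
    return t_lo - c_hi, t_hi - c_lo


# Illustrative, made-up inputs for a binary outcome in [0, 1].
treat = ArmSummary(p_initial=0.5, mean_initial=0.6,
                   p_followup=0.5, mean_followup=0.4)
control = ArmSummary(p_initial=0.5, mean_initial=0.5,
                     p_followup=0.5, mean_followup=0.4)
lower, upper = ate_bounds(treat, control, y_min=0.0, y_max=1.0)
print(f"ATE bounds: [{lower:.2f}, {upper:.2f}]")
```

In this made-up example, setting `p_followup` to zero in both arms (no successful follow-up) leaves half of each arm missing and makes the bounds as wide as the full outcome range, which is why a follow-up effort that recovers some initial nonrespondents can make the bounds informative.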


Authors’ note: The authors thank Sebastian Bauhoff, Bill Berry, Chris Blattman, Jake Bowers, Matias Cattaneo, Kosuke Imai, Molly Offer-Westort, Rocio Titiunik, and participants of the 2013 Joint Statistical Meetings for very helpful comments on previous versions of this manuscript. The authors especially thank Peter Aronow for his contributions to previous versions of this paper. This research was approved by the Columbia University IRB (Protocol AAAP1312) and the empirical analyses were preregistered at (ID: 20150702AA). Easy-to-use software for the statistical programming language R that implements the methods described in this paper is available at The replication materials for all analyses reported here are available at

Contributing Editor: Kosuke Imai

Political Analysis
  • ISSN: 1047-1987
  • EISSN: 1476-4989
  • URL: /core/journals/political-analysis
Supplementary materials

Coppock supplementary material: Online Appendix (238 KB)
