Chapter 27: Missing Data and Imputation

pp. 923-942

Authors

University of California, Davis; Indiana University, Bloomington
Summary

Introduction

Missing data is a long-standing problem in survey data, arising from nonresponse or partial response to survey questions. Reasons for nonresponse include unwillingness to provide the information requested, difficulty recalling past events, and not knowing the correct response. Imputation is the process of estimating or predicting the missing observations.

In this chapter we deal with the regression setup with data vector $(y_i, x_i)$, $i = 1, \dots, N$. For some of the observations some elements of $x_i$ or of both $(y_i, x_i)$ are missing. A number of questions are considered. When can we proceed with an analysis of only the complete observations, and when should we attempt to fill the gaps left by the missing observations? What methods of imputation are available? When imputed values for missing observations are obtained, how should estimation and inference then proceed?

If a data set has missing observations, and if these gaps can be filled by a statistically sound procedure, then the benefit is a larger and possibly more representative sample and, under ideal circumstances, more precise inference. The cost of estimating missing data comes from having to make (possibly wrong) assumptions to support a procedure for generating proxies for the missing observations, and from the approximation error inherent in any such procedure. Further, statistical inference on the augmented data, in which imputed values replace missing data, is more complicated because it must take into account the approximation errors introduced by imputation.
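The contrast between analyzing only the complete observations and filling the gaps before estimation can be sketched with simulated data. The following is a minimal illustration, not the chapter's own method: the data-generating process, the 30% missingness rate, the choice of mean imputation, and the helper names `complete_case_mask`, `mean_impute`, and `ols` are all assumptions made for the example. Only elements of x are set to missing, completely at random.

```python
import numpy as np

def complete_case_mask(y, x):
    """True for observations where neither y nor x is missing (NaN)."""
    return ~(np.isnan(y) | np.isnan(x))

def mean_impute(x):
    """Replace missing entries of x by the mean of the observed entries."""
    filled = x.copy()
    filled[np.isnan(filled)] = np.nanmean(x)
    return filled

def ols(y, x):
    """Least-squares fit of y on a constant and x; returns [intercept, slope]."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Hypothetical data: y_i = 1 + 2 x_i + u_i, i = 1, ..., N (assumed for illustration).
rng = np.random.default_rng(0)
N = 200
x = rng.normal(size=N)
y = 1.0 + 2.0 * x + rng.normal(size=N)

# Delete roughly 30% of the x values completely at random (an assumption).
x_obs = x.copy()
x_obs[rng.random(N) < 0.3] = np.nan

# Option 1: use only the complete observations.
keep = complete_case_mask(y, x_obs)
beta_cc = ols(y[keep], x_obs[keep])

# Option 2: fill the gaps with a simple proxy, then use all N observations.
beta_imp = ols(y, mean_impute(x_obs))

print("complete cases used:", int(keep.sum()), "of", N)
print("complete-case [intercept, slope]:", beta_cc)
print("mean-imputed  [intercept, slope]:", beta_imp)
```

In this particular setup (a single regressor, y fully observed) the two slope estimates coincide, because observations imputed at the observed mean of x contribute nothing to the sample covariance with y; only the intercept changes. Conventional standard errors computed from the filled-in sample would treat the proxies as real data, which is the inference complication noted above.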
