The MIDAS Touch: Accurate and Scalable Missing-Data Imputation with Deep Learning

Published online by Cambridge University Press: 26 February 2021

Ranjit Lall*
Affiliation:
Department of International Relations, London School of Economics and Political Science, London, UK. Email: r.lall@lse.ac.uk
Thomas Robinson
Affiliation:
School of Government and International Affairs, Durham University, Durham, UK. Email: thomas.robinson@durham.ac.uk
*Corresponding author: Ranjit Lall

Abstract

Principled methods for analyzing missing values, based chiefly on multiple imputation, have become increasingly popular yet can struggle to handle the kinds of large and complex data that are also becoming common. We propose an accurate, fast, and scalable approach to multiple imputation, which we call MIDAS (Multiple Imputation with Denoising Autoencoders). MIDAS employs a class of unsupervised neural networks known as denoising autoencoders, which are designed to reduce dimensionality by corrupting and attempting to reconstruct a subset of data. We repurpose denoising autoencoders for multiple imputation by treating missing values as an additional portion of corrupted data and drawing imputations from a model trained to minimize the reconstruction error on the originally observed portion. Systematic tests on simulated as well as real social science data, together with an applied example involving a large-scale electoral survey, illustrate MIDAS’s accuracy and efficiency across a range of settings. We provide open-source software for implementing MIDAS.
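The core idea in the abstract (corrupt the inputs, treat missing values as additional corrupted entries, and minimize reconstruction error only on the originally observed cells) can be illustrated with a small NumPy sketch. This is not the authors' software: the function names `train_masked_dae` and `draw_imputations`, the single tanh hidden layer, the zero-fill of missing cells, the toy data, and all hyperparameters are assumptions made for illustration. Drawing several completed datasets by keeping the random corruption switched on at prediction time is likewise an assumption of this sketch, not a statement of how MIDAS generates its draws.

```python
import numpy as np

def train_masked_dae(X, n_hidden=16, corrupt_rate=0.5, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer denoising autoencoder on data containing NaNs.

    Hypothetical illustration: missing cells are zero-filled on input and
    excluded from the reconstruction loss.
    """
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(X)            # True where a value was actually recorded
    X0 = np.where(observed, X, 0.0)    # missing cells enter the network as zeros
    n, p = X0.shape
    W1 = rng.normal(0.0, 0.1, (p, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, p))
    b2 = np.zeros(p)
    n_obs = observed.sum()
    for _ in range(epochs):
        # Corruption step: randomly zero out inputs and rescale the survivors.
        keep = rng.random((n, p)) >= corrupt_rate
        Xc = X0 * keep / (1.0 - corrupt_rate)
        # Forward pass.
        H = np.tanh(Xc @ W1 + b1)
        Xhat = H @ W2 + b2
        # Reconstruction error counted on originally observed cells only.
        err = (Xhat - X0) * observed
        # Backward pass for the masked mean squared error.
        dXhat = 2.0 * err / n_obs
        dW2 = H.T @ dXhat
        db2 = dXhat.sum(axis=0)
        dZ = (dXhat @ W2.T) * (1.0 - H ** 2)
        dW1 = Xc.T @ dZ
        db1 = dZ.sum(axis=0)
        # Plain gradient-descent update.
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return W1, b1, W2, b2

def draw_imputations(X, params, m=5, corrupt_rate=0.5, seed=1):
    """Return m completed copies of X; observed cells are left untouched."""
    rng = np.random.default_rng(seed)
    W1, b1, W2, b2 = params
    observed = ~np.isnan(X)
    X0 = np.where(observed, X, 0.0)
    completed = []
    for _ in range(m):
        # Keeping corruption on here makes each draw differ (assumption of this sketch).
        keep = rng.random(X0.shape) >= corrupt_rate
        Xc = X0 * keep / (1.0 - corrupt_rate)
        Xhat = np.tanh(Xc @ W1 + b1) @ W2 + b2
        completed.append(np.where(observed, X, Xhat))
    return completed

# Toy data: four noisy copies of one latent variable, with about 20% of cells removed.
rng = np.random.default_rng(1)
z = rng.normal(size=(200, 1))
X = np.hstack([z + 0.1 * rng.normal(size=(200, 1)) for _ in range(4)])
X[rng.random(X.shape) < 0.2] = np.nan

params = train_masked_dae(X)
completed = draw_imputations(X, params, m=5)
```

Each element of `completed` is a full data matrix in which the observed values are unchanged and only the missing cells have been filled. In a multiple-imputation workflow, an analysis model would be fitted to each completed dataset and the estimates combined afterwards.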

Information

Type: Article
Copyright: © The Author(s) 2021. Published by Cambridge University Press on behalf of the Society for Political Methodology

Supplementary material

Lall and Robinson Dataset (link)
Lall and Robinson supplementary material (PDF, 396.9 KB)