
Measuring Swing Voters with a Supervised Machine Learning Ensemble

Published online by Cambridge University Press:  17 October 2022

Christopher Hare*
Affiliation:
Department of Political Science, University of California, Davis, CA, USA. Email: cdhare@ucdavis.edu
Mikayla Kutsuris
Affiliation:
Desmond, Nolan, Livaich & Cunningham, Sacramento, CA, USA. Email: mkutsuris@dnlc.net
*Corresponding author: Christopher Hare

Abstract

Theory has long suggested that swing voting is a response to cross-pressures arising from a mix of individual attributes and contextual factors. Unfortunately, existing regression-based approaches are ill-suited to explore the complex combinations of demographic, policy, and political factors that produce swing voters in American elections. This gap between theory and practice motivates our use of an ensemble of supervised machine learning methods to predict swing voters in the 2012, 2016, and 2020 U.S. presidential elections. The results from the learning ensemble substantiate the existence of swing voters in contemporary American elections. Specifically, we demonstrate that the learning ensemble produces well-calibrated and externally valid predictions of swing voter propensity in later elections and for related behaviors such as split-ticket voting. Although interpreting black-box models is more challenging, they can nonetheless provide meaningful substantive insights meriting further exploration. Here, we use flexible model-agnostic tools to perturb the ensemble and demonstrate that cross-pressures (particularly those involving ideological and policy-related considerations) are essential to accurately predict swing voters.

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2022. Published by Cambridge University Press on behalf of the Society for Political Methodology
Table 1 Description and interpretation of model predictive performance metrics.

Table 2 Model performance on the out-of-sample (validation) data.

Figure 1 Comparison of the learning ensemble with the baseline generalized additive model (GAM) on the out-of-sample (validation) data. Notes: AUC, area under the ROC curve; BSS, Brier skill score. The bars show 95% confidence intervals for the estimated proportions.
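The two metrics named in this caption can be illustrated with a short sketch. It is not the authors' code: it assumes the common definitions, in which the Brier skill score compares a model's Brier score against a reference forecast that always predicts the sample base rate, and the AUC is the probability that a randomly chosen positive case is ranked above a randomly chosen negative one. The function names are hypothetical.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def brier_skill_score(y_true, p_pred):
    """1 - BS / BS_ref, where the reference forecast is the base rate (an assumption)."""
    base_rate = sum(y_true) / len(y_true)
    reference = brier_score(y_true, [base_rate] * len(y_true))
    return 1.0 - brier_score(y_true, p_pred) / reference


def auc(y_true, p_pred):
    """Area under the ROC curve via pairwise comparison; ties count as half a win."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))
```

Under these definitions, a BSS of 0 means the model does no better than guessing the base rate, and a BSS of 1 means perfect probabilistic predictions. This matters for a rare outcome such as swing voting, where raw accuracy can look high for an uninformative model.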

Figure 2 Predictive performance on additional indicators of swing voter propensity using out-of-sample (validation) data. The horizontal bars show the overall proportion of respondents satisfying the corresponding indicator. Ambivalence is defined as placing the candidates within one point of each other on a five-point favorability scale. Republican presidential support scores are calculated by estimating ensemble models of 2012 presidential vote choice and 2020 presidential vote intention. Additional details are provided in Appendices D and E in the Supplementary Material.

Figure 3 Feature importance estimates from the learning ensemble. Five hundred random permutations were conducted using out-of-sample (validation) data.
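Permutation-based feature importance of the kind this caption refers to can be sketched as follows. This is a generic, model-agnostic illustration and not the authors' implementation: `predict` and `score` are hypothetical callables standing in for any fitted model and any higher-is-better performance metric, and the default of 500 repeats simply mirrors the caption.

```python
import random


def neg_brier(y_true, p_pred):
    """Example metric (higher is better): the negative Brier score."""
    return -sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def permutation_importance(predict, X, y, score, n_repeats=500, seed=0):
    """Average drop in score when each feature column is shuffled in turn.

    predict: callable mapping a list of feature rows to predicted probabilities.
    X: list of rows (each a list of feature values) from held-out data.
    score: callable score(y_true, p_pred), where higher means better.
    """
    rng = random.Random(seed)
    baseline = score(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the outcome
            # while leaving every other feature untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = []
            for row, value in zip(X, column):
                new_row = list(row)
                new_row[j] = value
                X_perm.append(new_row)
            drops.append(baseline - score(y, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature the model never uses receives an importance of exactly zero, while a feature the model relies on produces a positive average drop. Because the procedure needs only predictions, it applies equally to a single model or to an ensemble.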

Table 3 Learning ensemble performance in predicting swing voters with subsets of predictor variables.

Supplementary material

Hare and Kutsuris Dataset (link)

Hare and Kutsuris supplementary material (PDF, 898.7 KB)