Recalibration of Predicted Probabilities Using the “Logit Shift”: Why Does It Work, and When Can It Be Expected to Work Well?

Published online by Cambridge University Press:  09 January 2023

Evan T. R. Rosenman*
Affiliation: Data Science Initiative, Harvard University, Cambridge, MA 02138, USA. E-mail: erosenm@fas.harvard.edu
Cory McCartan
Affiliation: Department of Statistics, Harvard University, Cambridge, MA 02138, USA. E-mail: cmccartan@g.harvard.edu
Santiago Olivella
Affiliation: Department of Political Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA. E-mail: olivella@unc.edu
*Corresponding author: Evan T. R. Rosenman

Abstract

The output of predictive models is routinely recalibrated by reconciling low-level predictions with known quantities defined at higher levels of aggregation. For example, models predicting vote probabilities at the individual level in U.S. elections can be adjusted so that their aggregation matches the observed vote totals in each county, thus producing better-calibrated predictions. In this research note, we provide theoretical grounding for one of the most commonly used recalibration strategies, known colloquially as the “logit shift.” Although the logit shift is typically cast as a heuristic adjustment strategy (whereby a constant correction on the logit scale is found, such that aggregated predictions match target totals), we show that it offers a fast and accurate approximation to a principled but computationally impractical adjustment strategy: computing the posterior prediction probabilities, conditional on the observed totals. After deriving analytical bounds on the quality of the approximation, we illustrate its accuracy using Monte Carlo simulations. We also discuss scenarios in which the logit shift is less effective at recalibrating predictions: when the target totals are defined only for highly heterogeneous populations, and when the original predictions correctly capture the mean of true individual probabilities, but fail to capture the shape of their distribution.
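
The constant correction described in the abstract can be illustrated with a short sketch. This is not the authors' replication code: the function name `logit_shift`, its parameters, and the choice of bisection as the root-finding method are all assumptions made here for illustration. The sketch searches for a single shift `alpha` such that the shifted probabilities, `sigmoid(logit(p_i) + alpha)`, sum to a known target total (for example, the observed vote count in one county).

```python
import math


def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def _logit(p):
    return math.log(p / (1.0 - p))


def logit_shift(probs, target_total, tol=1e-10, max_iter=200):
    """Hypothetical helper: find a constant alpha on the logit scale so that
    the shifted probabilities sum to target_total.

    Returns (alpha, shifted_probs). Bisection is an illustrative choice;
    any one-dimensional root finder would do, because the shifted sum is
    strictly increasing in alpha.
    """
    if not all(0.0 < p < 1.0 for p in probs):
        raise ValueError("all probabilities must lie strictly in (0, 1)")
    if not 0.0 < target_total < len(probs):
        raise ValueError("target_total must lie strictly in (0, len(probs))")

    logits = [_logit(p) for p in probs]

    def total(alpha):
        return sum(_sigmoid(z + alpha) for z in logits)

    # Expand the bracket until it contains the root.
    lo, hi = -1.0, 1.0
    while total(lo) > target_total:
        lo *= 2.0
    while total(hi) < target_total:
        hi *= 2.0

    # Bisect on alpha.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if total(mid) < target_total:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    alpha = 0.5 * (lo + hi)
    return alpha, [_sigmoid(z + alpha) for z in logits]


# Illustrative (made-up) inputs: four individual predictions whose sum (2.3)
# is below a known aggregate total of 3.0.
example_probs = [0.2, 0.5, 0.7, 0.9]
example_alpha, example_shifted = logit_shift(example_probs, 3.0)
print(example_alpha, example_shifted, sum(example_shifted))
```

Because the same shift is applied to every logit, the ranking of individuals is preserved; only the overall level of the predictions moves to match the aggregate.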

Type
Letter
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of The Society for Political Methodology

Footnotes

Edited by Daniel Hopkins

Supplementary material: PDF

Rosenman et al. supplementary material

