Estimating logit models with small samples

Published online by Cambridge University Press:  29 March 2021

Carlisle Rainey*
Affiliation:
Political Science, Florida State University, 540 Bellamy, Tallahassee, FL 32306, USA
Kelly McCaskey
Affiliation:
Operations Analyst at Accruent, 11500 Alterra Parkway, #110, Austin, TX 78758, USA
*Corresponding author. Email: crainey@fsu.edu

Abstract

In small samples, maximum likelihood (ML) estimates of logit model coefficients have substantial bias away from zero. As a solution, we remind political scientists of Firth's (1993, Biometrika, 80, 27–38) penalized maximum likelihood (PML) estimator. Prior research has described and used PML, especially in the context of separation, but its small sample properties remain under-appreciated. The PML estimator eliminates most of the bias and, perhaps more importantly, greatly reduces the variance of the usual ML estimator. Thus, researchers do not face a bias-variance tradeoff when choosing between the ML and PML estimators—the PML estimator has a smaller bias and a smaller variance. We use Monte Carlo simulations and a re-analysis of George and Epstein (1992, American Political Science Review, 86, 323–337) to show that the PML estimator offers a substantial improvement in small samples (e.g., 50 observations) and noticeable improvement even in larger samples (e.g., 1000 observations).
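The contrast between the two estimators can be made concrete with a minimal sketch. It is not the authors' code or data: the function name `fit_logit` and the 20-observation toy data set are hypothetical, chosen only for illustration. The sketch assumes the standard formulation of Firth's estimator, which maximizes the log-likelihood plus half the log-determinant of the Fisher information, and solves the resulting modified score equations with Newton-type steps (no step-halving or separation handling, which a production implementation would need).

```python
import numpy as np


def fit_logit(X, y, firth=False, max_iter=100, tol=1e-10):
    """Fit a logit model by ML (firth=False) or Firth's PML (firth=True).

    Hypothetical helper for illustration only. Uses Newton-type steps
    based on the Fisher information X'WX.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))   # fitted probabilities
        w = p * (1.0 - p)                     # logit variance weights
        info = X.T @ (X * w[:, None])         # Fisher information X'WX
        resid = y - p                         # ordinary score residuals
        if firth:
            # Leverages h_i = w_i * x_i' (X'WX)^{-1} x_i, the diagonal of
            # the weighted hat matrix; Firth's modified score adds
            # h_i * (1/2 - p_i) to each residual.
            h = w * np.einsum("ij,jk,ik->i", X, np.linalg.inv(info), X)
            resid = resid + h * (0.5 - p)
        step = np.linalg.solve(info, X.T @ resid)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Hypothetical toy data: 10 observations with x = 0 (3 successes)
# and 10 observations with x = 1 (8 successes).
x = np.repeat([0.0, 1.0], 10)
y = np.concatenate([np.repeat([1.0, 0.0], [3, 7]),
                    np.repeat([1.0, 0.0], [8, 2])])
X = np.column_stack([np.ones_like(x), x])

beta_ml = fit_logit(X, y)
beta_pml = fit_logit(X, y, firth=True)
print("ML  (intercept, slope):", beta_ml)
print("PML (intercept, slope):", beta_pml)
```

With a single binary predictor the model is saturated, so the PML fit should coincide with adding one half to each cell of the 2x2 table. In this toy example the slope shrinks from roughly 2.23 under ML to roughly 1.99 under PML, a movement toward zero in the direction the abstract describes. A single data set cannot show bias or variance, which are properties of the estimators over repeated samples.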

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © The Author(s), 2021. Published by Cambridge University Press on behalf of the European Political Science Association
Figure 1. Illustration of the source of the bias using the score functions for two hypothetical samples shown with dashed lines. Notice that, when evaluated at $\beta ^{\rm true}$, the score functions cancel on average (Equation 2). However, the nonlinearity of the score function pushes the ML estimate of the high miss (top line) relatively further from the true value. On the other hand, the nonlinearity of the score function pulls the ML estimate of the low miss (bottom line) relatively closer.

Figure 2. Substantial bias of $\hat {\beta }^{\rm mle}$ and the near unbiasedness of $\hat {\beta }^{{\rm pmle}}$.

Figure 3. Smaller variance of $\hat {\beta }^{{\rm pmle}}$ compared to $\hat {\beta }^{\rm mle}$.

Figure 4. Percent increase in the MSE of $\hat {\beta }^{\rm mle}$ compared to $\hat {\beta }^{{\rm pmle}}$.

Figure 5. Relative contribution of the variance and bias to the MSE inflation. The relative contribution is defined in Equation 5.

Figure 6. Coefficients for a logit model estimating US Supreme Court Decisions by both ML and PML.

Figure 7. Quantities of interest for the effect of the solicitor general filing a brief amicus curiae on the probability of a decision in favor of capital punishment.

Figure 8. MSE inflation as the information in the data set increases. The left panel shows the MSE inflation for the slope coefficients and the right panel shows the MSE inflation for the intercept.

Table 1. Thresholds at which the cost of ML relative to PML becomes substantial, noticeable, and negligible when estimating the slope coefficients and the intercept

Supplementary material: Rainey and McCaskey Dataset (link)

Supplementary material: Rainey and McCaskey supplementary material (PDF, 162.1 KB)