
Separation and Rare Events

Published online by Cambridge University Press:  11 December 2020

Liam F. Beiser-McGrath*
Affiliation:
Department of Politics, International Relations, and Philosophy, Royal Holloway, University of London, Egham, United Kingdom; Department of Politics and Public Administration, Universität Konstanz, Konstanz, Germany; D-GESS, ETH Zürich, Zürich, Switzerland

Abstract

When separation is a problem in binary dependent variable models, many researchers use Firth's penalized maximum likelihood to obtain finite estimates (Firth, 1993; Zorn, 2005; Rainey, 2016). In this paper, I show that this approach can lead to inferences in the opposite direction of the separation when the number of observations is sufficiently large and both the dependent and independent variables are rare events. Because large datasets with rare events are frequently used in political science, such as dyadic data measuring interstate relations, a lack of awareness of this problem may lead to inferential issues. Simulations and an empirical illustration show that independent “weakly-informative” prior distributions centered at zero, for example the Cauchy prior suggested by Gelman et al. (2008), can avoid this issue. More generally, the results caution researchers to be aware of how the choice of prior interacts with the structure of their data when estimating models in the presence of separation.

Information

Type
Research Note
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence (http://creativecommons.org/licenses/by-nc-sa/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is included and the original work is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use.
Copyright
Copyright © The Author(s), 2020. Published by Cambridge University Press on behalf of the European Political Science Association

Table 1. Negative quasi-complete separation and Jeffreys prior

Table 2. Nuclear dyads and war

Figure 1. The use of priors for separation. (a) The joint prior density for Jeffreys prior overlaid with contours of the prior. (b) The prior density for Jeffreys prior overlaid with the contours of the likelihood. The point indicates the resulting estimate from the penalized maximum likelihood. (c) The joint prior density for the Cauchy priors suggested by Gelman et al. (2008), overlaid with contours of the prior. (d) The joint prior density for the Cauchy priors overlaid with the contours of the likelihood. The point indicates the resulting posterior mode.
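
The contrast between the two priors in this caption can be reproduced numerically in a rough way. The sketch below is an illustration, not the paper's replication code. It assumes a logit model with an intercept `alpha` and one binary regressor with coefficient `beta`, for which the determinant of the Fisher information factorizes across the two groups `x = 0` and `x = 1`. The group sizes `m0` and `m1`, the grid of `beta` values, and the fixed `alpha = -5` are hypothetical choices. The Cauchy scales (10 for the intercept, 2.5 for the coefficient) follow the defaults recommended by Gelman et al. (2008), and the input rescaling those authors also recommend is ignored here.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def log_jeffreys(alpha, beta, m0, m1):
    """Unnormalized log Jeffreys prior, 0.5 * log det(X'WX).

    For logit(p) = alpha + beta * x with binary x, the determinant of the
    Fisher information reduces to (m0 * w0) * (m1 * w1), where w = p * (1 - p)
    in each group and m0, m1 are the group sizes (hypothetical inputs).
    """
    p0 = expit(alpha)
    p1 = expit(alpha + beta)
    w0 = p0 * (1.0 - p0)
    w1 = p1 * (1.0 - p1)
    return 0.5 * (math.log(m0 * w0) + math.log(m1 * w1))


def log_cauchy(alpha, beta, scale_intercept=10.0, scale_coef=2.5):
    """Log density of independent Cauchy priors centered at zero."""
    def log_c(value, scale):
        return -math.log(math.pi * scale * (1.0 + (value / scale) ** 2))
    return log_c(alpha, scale_intercept) + log_c(beta, scale_coef)


# Rare outcome: fix a strongly negative intercept and scan beta on a grid.
alpha = -5.0
betas = [b / 10.0 for b in range(-100, 101)]

best_jeffreys = max(betas, key=lambda b: log_jeffreys(alpha, b, m0=1050, m1=50))
best_cauchy = max(betas, key=lambda b: log_cauchy(alpha, b))

print("beta favored by Jeffreys prior:", best_jeffreys)
print("beta favored by Cauchy prior:  ", best_cauchy)
```

Under these assumptions, the Jeffreys prior is largest where both group probabilities equal 0.5, so for a strongly negative intercept it favors a positive `beta` that offsets the intercept. The Cauchy prior favors `beta = 0` regardless of the intercept or the data.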

Figure 2. Estimated coefficients using Firth's penalized maximum likelihood and Gelman et al.'s Cauchy prior when x and y are rare events and there is negative quasi-complete separation. Lines display the estimated coefficient for x, β, for different scenarios. As all scenarios have negative quasi-complete separation, negative estimated coefficients are in the same direction as the separation, while positive coefficients are in the opposite direction of the separation. The x-axis, displayed on a log scale, denotes the number of observations where $x = 0 \wedge y = 0$ (n1), the columns indicate the number of observations where $x = 1 \wedge y = 0$ (n2), and lines denote the number of observations where $x = 0 \wedge y = 1$ (n3).

Figure 3. Estimated coefficients, with measures of uncertainty, for two scenarios using Firth's penalized maximum likelihood and Gelman et al.'s Cauchy prior when x and y are rare events and there is negative quasi-complete separation. The shaded areas indicate the central 50, 68, 90, and 95 percent areas of the posterior density (Cauchy prior) and profile likelihood (Firth's PMLE) for the estimated effect of x (β). Each panel indicates how these distributions change when the number of observations where $x = 0 \wedge y = 0$ (n1) increases from 1000 to 100,000. The numbers of observations where $x = 1 \wedge y = 0$ (n2) and where $x = 0 \wedge y = 1$ (n3) are each held constant at 50.
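
The sign change described in these captions can be checked with a short calculation. The sketch below is an illustration under a stated assumption, not the paper's replication code. It assumes the model contains only an intercept and the binary regressor x. In that saturated case, maximizing the Jeffreys-penalized likelihood is equivalent to adding 1/2 to every cell of the 2×2 table, so the Firth estimate of β has a closed form. The function name `firth_beta_2x2` and the argument `n4` (the count of observations where x = 1 and y = 1, which is zero under negative quasi-complete separation) are hypothetical names introduced here.

```python
import math


def firth_beta_2x2(n1, n2, n3, n4=0):
    """Firth-penalized log odds ratio for a 2x2 table.

    Cell counts follow the caption's notation:
      n1: x = 0, y = 0    n2: x = 1, y = 0
      n3: x = 0, y = 1    n4: x = 1, y = 1 (hypothetical name; 0 under separation)
    Assumes a saturated logit model (intercept plus one binary x), where the
    Jeffreys penalty amounts to adding 0.5 to each cell.
    """
    logit_x1 = math.log((n4 + 0.5) / (n2 + 0.5))  # penalized log odds of y when x = 1
    logit_x0 = math.log((n3 + 0.5) / (n1 + 0.5))  # penalized log odds of y when x = 0
    return logit_x1 - logit_x0


# n2 and n3 held at 50, n1 increased as in the caption.
estimates = {n1: firth_beta_2x2(n1, n2=50, n3=50) for n1 in (1_000, 10_000, 100_000)}
for n1, beta in estimates.items():
    print(f"n1 = {n1:>7}: beta = {beta:+.3f}")
```

Under this assumption the estimate turns positive once 0.5(n1 + 0.5) exceeds (n2 + 0.5)(n3 + 0.5), which for n2 = n3 = 50 happens when n1 is above roughly 5,100. Models with additional covariates have no such closed form and need an iterative fit.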

Table 3. Sanction onsets and war

Figure 4. Estimated coefficients, with measures of uncertainty, for the onset of sanctions using Firth's penalized maximum likelihood and Gelman et al.'s Cauchy prior, estimated via E–M or with MCMC. The shaded areas indicate the central 50, 68, 90, and 95 percent areas of the posterior density (Cauchy prior) and profile likelihood (Firth's PMLE) for the estimates. The left panel displays the results for models with no covariates (models 1–3) and the right panel displays the results for models including covariates (models 4–6).

Table 4. How choice of prior affects inferences about the effect of sanction onsets upon interstate war

Supplementary material: PDF

Beiser-McGrath supplementary material

Supplementary material: Link

Beiser-McGrath Dataset