Assessing Fit Quality and Testing for Misspecification in Binary-Dependent Variable Models

Justin Esarey; Andrew Pierce

doi:10.1093/pan/mps026

Published online by Cambridge University Press: 04 January 2017

Justin Esarey*, Department of Political Science, Rice University
Andrew Pierce, Department of Political Science, Emory University. e-mail: awpierc@emory.edu
*e-mail: justin@justinesarey.com (corresponding author)

Abstract

In this article, we present a technique and critical test statistic for assessing the fit of a binary-dependent variable model (e.g., a logit or probit). We examine how closely a model's predicted probabilities match the observed frequency of events in the data set, and whether these deviations are systematic or merely noise. Our technique allows researchers to detect problems with a model's specification that obscure substantive understanding of the underlying data-generating process, such as missing interaction terms or unmodeled nonlinearities. We also show that these problems go undetected by the fit statistics most commonly used in political science.
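
The core comparison described above, predicted probabilities against observed event frequencies, can be illustrated with a rough sketch. This is not the authors' technique or test statistic: it is a simple binned comparison, and the function names `binned_calibration` and `max_calibration_gap` are hypothetical, introduced here only for illustration. It assumes predicted probabilities from an already fitted logit or probit are available as a plain sequence.

```python
from typing import List, Sequence, Tuple


def binned_calibration(
    predicted: Sequence[float],
    outcomes: Sequence[int],
    n_bins: int = 10,
) -> List[Tuple[float, float, int]]:
    """Compare mean predicted probability with observed event frequency.

    Observations are sorted by predicted probability and split into n_bins
    groups of nearly equal size. Returns one
    (mean_predicted, observed_frequency, count) tuple per non-empty bin.
    """
    if len(predicted) != len(outcomes):
        raise ValueError("predicted and outcomes must have the same length")
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")

    pairs = sorted(zip(predicted, outcomes))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        # Integer arithmetic keeps bin sizes within one observation of each other.
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_freq = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pred, obs_freq, len(chunk)))
    return rows


def max_calibration_gap(rows: List[Tuple[float, float, int]]) -> float:
    """Largest absolute gap between predicted and observed frequency across bins."""
    return max(abs(obs - pred) for pred, obs, _ in rows)
```

A large gap in some bins suggests the model's probabilities are off in that range, but a binned summary like this cannot by itself say whether the deviation is systematic or merely noise, which is the question the article's test statistic is designed to answer.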

Type: Research Article
Information: Political Analysis, Volume 20, Issue 4, Autumn 2012, pp. 480–500

DOI: https://doi.org/10.1093/pan/mps026
Copyright: Copyright © The Author 2012. Published by Oxford University Press on behalf of the Society for Political Methodology

Footnotes

Authors' note: We thank Drew Linzer, Mike Ward, Jacqueline H. R. Demeritt, Jeff Staton, John Freeman, Neal Beck, Patrick Brandt, Phil Schrodt, Teppei Yamamoto, Kevin Clarke, and Will H. Moore for their comments, suggestions, and conversations about previous iterations of the article. Replication materials for all our simulations and data analysis can be found online at the Political Analysis dataverse: http://hdl.handle.net/1902.1/18399. Supplementary materials for the article are available on the Political Analysis Web site.
