Kernel Regularized Least Squares: Reducing Misspecification Bias with a Flexible and Interpretable Machine Learning Approach

  • Jens Hainmueller and Chad Hazlett
Abstract

We propose the use of Kernel Regularized Least Squares (KRLS) for social science modeling and inference problems. KRLS borrows from machine learning methods designed to solve regression and classification problems without relying on linearity or additivity assumptions. The method constructs a flexible hypothesis space that uses kernels as radial basis functions and finds the best-fitting surface in this space by minimizing a complexity-penalized least squares problem. We argue that the method is well-suited for social science inquiry because it avoids strong parametric assumptions, yet allows interpretation in ways analogous to generalized linear models while also permitting more complex interpretation to examine nonlinearities, interactions, and heterogeneous effects. We also extend the method in several directions to make it more effective for social inquiry, by (1) deriving estimators for the pointwise marginal effects and their variances, (2) establishing unbiasedness, consistency, and asymptotic normality of the KRLS estimator under fairly general conditions, (3) proposing a simple automated rule for choosing the kernel bandwidth, and (4) providing companion software. We illustrate the use of the method through simulations and empirical examples.
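The fitting step described in the abstract can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' companion software: the function names, the fixed penalty `lam`, and the default bandwidth (set here to the number of predictors) are assumptions made for the example, and the inputs are assumed to be standardized beforehand. The sketch builds a Gaussian kernel matrix K, solves the penalized least squares problem in closed form as c = (K + lam I)^{-1} y, and differentiates the fitted surface analytically to obtain pointwise marginal effects.

```python
import numpy as np


def gaussian_kernel(A, B, sigma2):
    """Kernel matrix with entries exp(-||a - b||^2 / sigma2)."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / sigma2)


def krls_fit(X, y, lam=0.1, sigma2=None):
    """Minimize ||y - K c||^2 + lam * c' K c, which gives c = (K + lam I)^{-1} y."""
    n, d = X.shape
    if sigma2 is None:
        sigma2 = float(d)  # assumed default bandwidth for this sketch
    K = gaussian_kernel(X, X, sigma2)
    c = np.linalg.solve(K + lam * np.eye(n), y)
    return c, sigma2


def krls_predict(X_train, c, sigma2, X_new):
    """Evaluate the fitted surface f(x) = sum_i c_i k(x, x_i) at the rows of X_new."""
    return gaussian_kernel(X_new, X_train, sigma2) @ c


def krls_marginal_effects(X_train, c, sigma2, X_eval):
    """Analytic partial derivatives of the fitted surface at each row of X_eval."""
    K = gaussian_kernel(X_eval, X_train, sigma2)       # shape (m, n)
    diffs = X_eval[:, None, :] - X_train[None, :, :]   # shape (m, n, d)
    # d f / d x^(d) = (-2 / sigma2) * sum_i c_i k(x, x_i) (x^(d) - x_i^(d))
    return (-2.0 / sigma2) * np.einsum("mn,mnd->md", K * c[None, :], diffs)
```

Averaging the rows returned by `krls_marginal_effects` gives one summary number per predictor, comparable in spirit to a regression coefficient, while the individual rows show how the effect varies across observations. In practice the penalty would be chosen from the data (for example by cross-validation) rather than fixed by hand.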

Corresponding author
e-mail: jhainm@mit.edu (corresponding author)
Footnotes

Authors' note: The authors are listed in alphabetical order and contributed equally. We thank Jeremy Ferwerda, Dominik Hangartner, Danny Hidalgo, Gary King, Lorenzo Rosasco, Marc Ratkovic, Teppei Yamamoto, our anonymous reviewers, the editors, and participants in seminars at NYU, MIT, the Midwest Political Science Conference, and the European Political Science Association Conference for helpful comments. Companion software written by the authors to implement the methods proposed in this article in R, Matlab, and Stata can be downloaded from the authors' Web pages. Replication materials are available in the Political Analysis Dataverse at http://dvn.iq.harvard.edu/dvn/dv/pan. The usual disclaimer applies. Supplementary materials for this article are available on the Political Analysis Web site.

Political Analysis
  • ISSN: 1047-1987
  • EISSN: 1476-4989
  • URL: /core/journals/political-analysis
Supplementary Materials

Hainmueller and Hazlett supplementary material: Appendix, PDF (844 KB)