
Kernel Regularized Least Squares: Reducing Misspecification Bias with a Flexible and Interpretable Machine Learning Approach

  • Jens Hainmueller and Chad Hazlett


We propose the use of Kernel Regularized Least Squares (KRLS) for social science modeling and inference problems. KRLS borrows from machine learning methods designed to solve regression and classification problems without relying on linearity or additivity assumptions. The method constructs a flexible hypothesis space that uses kernels as radial basis functions and finds the best-fitting surface in this space by minimizing a complexity-penalized least squares problem. We argue that the method is well-suited for social science inquiry because it avoids strong parametric assumptions, yet allows interpretation in ways analogous to generalized linear models while also permitting more complex interpretation to examine nonlinearities, interactions, and heterogeneous effects. We also extend the method in several directions to make it more effective for social inquiry, by (1) deriving estimators for the pointwise marginal effects and their variances, (2) establishing unbiasedness, consistency, and asymptotic normality of the KRLS estimator under fairly general conditions, (3) proposing a simple automated rule for choosing the kernel bandwidth, and (4) providing companion software. We illustrate the use of the method through simulations and empirical examples.
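To make the core computation concrete, here is a minimal sketch of the closed-form KRLS fit and the pointwise marginal effects described above, assuming a Gaussian kernel, the simple bandwidth heuristic σ² = D (number of input dimensions), and a regularization parameter fixed by hand for illustration. This is not the authors' companion software, which additionally standardizes inputs, selects the regularizer automatically, and computes variance estimates.

```python
import numpy as np

# Minimal KRLS sketch (illustrative only; names and the fixed lambda are
# assumptions, not the authors' companion implementation).
rng = np.random.default_rng(0)
n, d = 200, 2
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + X[:, 1] ** 2            # nonlinear, non-additive target

sigma2 = d                                     # simple bandwidth heuristic: sigma^2 = D
lam = 0.1                                      # regularizer; fixed here for illustration
diff = X[:, None, :] - X[None, :, :]           # pairwise differences, shape (n, n, d)
K = np.exp(-(diff ** 2).sum(axis=2) / sigma2)  # Gaussian kernel matrix

# Closed-form coefficients of the complexity-penalized least squares problem:
# c = (K + lam * I)^{-1} y,  fitted surface  yhat = K c
c = np.linalg.solve(K + lam * np.eye(n), y)
yhat = K @ c

# Pointwise marginal effects from the kernel's analytic derivative:
# d yhat(x_j) / d x_{jd} = -(2 / sigma2) * sum_i c_i (x_{jd} - x_{id}) K_{ji}
ME = -(2.0 / sigma2) * np.einsum('ji,jid->jd', K * c[None, :], diff)
```

Because the fit is a linear solve in the kernel matrix, no iterative optimization is needed, and the analytic derivative of the kernel yields one marginal effect per observation and regressor, which can then be summarized (e.g., averaged) or inspected pointwise for heterogeneity.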





Authors' note: The authors are listed in alphabetical order and contributed equally. We thank Jeremy Ferwerda, Dominik Hangartner, Danny Hidalgo, Gary King, Lorenzo Rosasco, Marc Ratkovic, Teppei Yamamoto, our anonymous reviewers, the editors, and participants in seminars at NYU, MIT, the Midwest Political Science Conference, and the European Political Science Association Conference for helpful comments. Companion software written by the authors to implement the methods proposed in this article in R, Matlab, and Stata can be downloaded from the authors' Web pages. Replication materials are available in the Political Analysis Dataverse. The usual disclaimer applies. Supplementary materials for this article are available on the Political Analysis Web site.



Supplementary materials: Hainmueller and Hazlett supplementary material, PDF (844 KB).
