
Generalized Kernel Regularized Least Squares

Published online by Cambridge University Press:  01 September 2023

Qing Chang
Affiliation:
Ph.D. Candidate, Department of Political Science, University of Pittsburgh, Pittsburgh, PA, USA
Max Goplerud*
Affiliation:
Assistant Professor, Department of Political Science, University of Pittsburgh, Pittsburgh, PA, USA
*Corresponding author: Max Goplerud; Email: mgoplerud@pitt.edu

Abstract

Kernel regularized least squares (KRLS) is a popular method for flexibly estimating models that may have complex relationships between variables. However, its usefulness to many researchers is limited for two reasons. First, existing approaches are inflexible and do not allow KRLS to be combined with theoretically motivated extensions such as random effects, unregularized fixed effects, or non-Gaussian outcomes. Second, estimation is extremely computationally intensive for even modestly sized datasets. Our paper addresses both concerns by introducing generalized KRLS (gKRLS). We note that KRLS can be reformulated as a hierarchical model, thereby allowing easy inference and modular model construction in which KRLS can be used alongside random effects, splines, and unregularized fixed effects. Computationally, we also implement random sketching to dramatically accelerate estimation while incurring only a limited penalty in estimation quality. We demonstrate that gKRLS can be fit on datasets with tens of thousands of observations in under one minute. Further, state-of-the-art techniques that require fitting the model over a dozen times (e.g., meta-learners) can be estimated quickly.
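
To make the idea of random sketching concrete, the following is a minimal NumPy sketch of kernel regularized least squares in which the kernel matrix is compressed by sub-sampling a small set of observations. It is an illustration under stated assumptions, not the gKRLS implementation: the function names (`gaussian_kernel`, `fit_sketched_krls`, `predict`), the Gaussian kernel with a hand-picked bandwidth, the fixed penalty `lam`, and the choice of sub-sampling as the sketching method are all assumptions made here for illustration, and the simulated data are hypothetical.

```python
import numpy as np


def gaussian_kernel(A, B, bandwidth):
    """Gaussian kernel exp(-||a - b||^2 / bandwidth) between rows of A and B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / bandwidth)


def fit_sketched_krls(X, y, n_sketch, lam, bandwidth, rng):
    """Fit KRLS with a sub-sampling sketch (assumed sketch type).

    With S the (n_sketch x n) matrix that selects rows, the fitted values are
    K S^T alpha and the penalty is lam * alpha^T (S K S^T) alpha.
    """
    n = X.shape[0]
    idx = rng.choice(n, size=n_sketch, replace=False)
    anchors = X[idx]
    KS = gaussian_kernel(X, anchors, bandwidth)  # n x n_sketch, equals K S^T
    SKS = KS[idx]                                # n_sketch x n_sketch, equals S K S^T
    # Normal equations of the penalized least-squares problem, with a small
    # jitter on the diagonal for numerical stability.
    lhs = KS.T @ KS + lam * SKS + 1e-8 * np.eye(n_sketch)
    alpha = np.linalg.solve(lhs, KS.T @ y)
    return anchors, alpha


def predict(X_new, anchors, alpha, bandwidth):
    """Predict at new points using only the sketched anchor observations."""
    return gaussian_kernel(X_new, anchors, bandwidth) @ alpha


# Hypothetical simulated data: a smooth nonlinear signal plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(2000, 2))
signal = np.sin(X[:, 0]) + 0.5 * X[:, 1]
y = signal + rng.normal(scale=0.1, size=X.shape[0])

anchors, alpha = fit_sketched_krls(X, y, n_sketch=50, lam=1e-2, bandwidth=2.0, rng=rng)

X_test = rng.uniform(-3.0, 3.0, size=(500, 2))
truth = np.sin(X_test[:, 0]) + 0.5 * X_test[:, 1]
pred = predict(X_test, anchors, alpha, bandwidth=2.0)

rmse_fit = float(np.sqrt(np.mean((pred - truth) ** 2)))
rmse_baseline = float(np.sqrt(np.mean((y.mean() - truth) ** 2)))
```

The point of the design is that the linear system solved has dimension `n_sketch` rather than the number of observations, so the cost of the solve no longer grows cubically with the sample size. Here the penalty and bandwidth are fixed by hand, whereas in practice they would be estimated or tuned.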

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of the Society for Political Methodology

Figure 1 Comparison of running time for different models. This figure shows the computational time in minutes, averaged across simulations, with 95% confidence intervals. Panel (a) presents the average time in minutes. Panel (b) uses a logarithmic scale.

Figure 2 Performance on out-of-sample predictions. This figure shows the RMSE of predicting the outcome, averaged across 50 simulations. 95% confidence intervals from a percentile bootstrap (1,000 bootstrap samples) are shown.

Figure 3 Performance for the average marginal effect. The figure reports the RMSE of the estimated average marginal effect of $x_{i,1}$ as $\rho$ varies. Each panel shows a different data generating process (linear or nonlinear). 95% confidence intervals from a percentile bootstrap (1,000 bootstrap samples) are shown.

Figure 4 Re-analysis of Newman (2016). The average predicted probability (a) and average marginal effect (b) with 95% confidence intervals are shown.

Figure 5 Effects of electoral quotas. This figure reports estimated treatment effects for all groups and outcomes. 95% confidence intervals are shown.

Supplementary material

Chang and Goplerud Dataset (link)

Chang and Goplerud supplementary material (PDF, 3.4 MB)