THE FACTOR-LASSO AND K-STEP BOOTSTRAP APPROACH FOR INFERENCE IN HIGH-DIMENSIONAL ECONOMIC APPLICATIONS

Christian Hansen; Yuan Liao

doi:10.1017/S0266466618000245

Published online by Cambridge University Press: 22 August 2018

Christian Hansen*, University of Chicago
Yuan Liao†, Rutgers University
*Address correspondence to Christian Hansen, Booth School of Business, University of Chicago, Chicago, IL 60637, USA; e-mail: Christian.Hansen@chicagobooth.edu.
†Yuan Liao, Department of Economics, Rutgers University, New Brunswick, NJ 08901, USA; e-mail: yuan.liao@rutgers.edu.

Abstract

We consider inference about coefficients on a small number of variables of interest in a linear panel data model with additive unobserved individual and time specific effects and a large number of additional time-varying confounding variables. We suppose that, in addition to unrestricted time and individual specific effects, these confounding variables are generated by a small number of common factors and high-dimensional weakly dependent disturbances. We allow that both the factors and the disturbances are related to the outcome variable and other variables of interest. To make informative inference feasible, we impose that the contribution of the part of the confounding variables not captured by time specific effects, individual specific effects, or the common factors can be captured by a relatively small number of terms whose identities are unknown. Within this framework, we provide a convenient inferential procedure based on factor extraction followed by lasso regression and show that the procedure has good asymptotic properties. We also provide a simple k-step bootstrap procedure that may be used to construct inferential statements about the low-dimensional parameters of interest and prove its asymptotic validity. We provide simulation evidence about the performance of our procedure and illustrate its use in an empirical application.
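The pipeline the abstract describes (remove additive individual and time effects, extract common factors from the confounders, then run lasso on what is left) can be illustrated with a small simulation. The following is a minimal sketch under strong simplifying assumptions, not the authors' estimator: it stacks all observations and applies ordinary principal components to the stacked matrix, uses a plain coordinate-descent lasso with a hand-picked penalty `lam`, and selects controls by a simple union of two lasso fits. All function names (`two_way_demean`, `pca_factors`, `residualize`, `lasso_cd`, `factor_lasso`, `simulate`) and the data-generating design are hypothetical and chosen only for illustration. The k-step bootstrap is not sketched here.

```python
import numpy as np

def two_way_demean(a):
    """Remove additive individual (axis 0) and time (axis 1) effects."""
    return (a - a.mean(axis=0, keepdims=True)
              - a.mean(axis=1, keepdims=True)
              + a.mean(axis=(0, 1), keepdims=True))

def pca_factors(X, k):
    """Top-k principal-component factor estimates (rows = observations)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] * np.sqrt(X.shape[0])

def residualize(A, F):
    """Residual of A after least-squares projection on the columns of F."""
    if F.shape[1] == 0:
        return A
    coef, *_ = np.linalg.lstsq(F, A, rcond=None)
    return A - F @ coef

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2m)||y - Xb||^2 + lam * ||b||_1."""
    m, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / m
    r = y.copy()                      # current residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * b[j]       # remove coordinate j from the fit
            rho = X[:, j] @ r / m
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * b[j]       # add the updated coordinate back
    return b

def factor_lasso(y, d, X, k, lam):
    """y, d: (n, T) arrays; X: (n, T, p). Returns the coefficient on d."""
    n, T, p = X.shape
    y_ = two_way_demean(y).reshape(-1)
    d_ = two_way_demean(d).reshape(-1)
    X_ = two_way_demean(X).reshape(n * T, p)
    F = pca_factors(X_, k)            # estimated common factors
    U = residualize(X_, F)            # estimated idiosyncratic part of X
    ey, ed = residualize(y_, F), residualize(d_, F)
    # Keep a control if lasso selects it for the outcome or for d.
    sel = (lasso_cd(U, ey, lam) != 0) | (lasso_cd(U, ed, lam) != 0)
    ey2, ed2 = residualize(ey, U[:, sel]), residualize(ed, U[:, sel])
    return float(ed2 @ ey2 / (ed2 @ ed2))

def simulate(n=40, T=10, p=40, k=2, alpha=1.0, seed=0):
    """Hypothetical design: factors and one disturbance confound d and y."""
    rng = np.random.default_rng(seed)
    f = rng.normal(size=(n, T, k))            # latent factors
    L = 2.0 * rng.normal(size=(p, k))         # loadings
    U = rng.normal(size=(n, T, p))            # idiosyncratic disturbances
    X = f @ L.T + U
    fe = rng.normal(size=(n, 1)) + rng.normal(size=(1, T))  # additive effects
    d = f.sum(axis=2) + U[:, :, 0] + fe + rng.normal(size=(n, T))
    y = (alpha * d + f.sum(axis=2) + U[:, :, 0] + fe
         + 0.5 * rng.normal(size=(n, T)))
    return y, d, X

if __name__ == "__main__":
    y, d, X = simulate()
    print("estimate of alpha:", factor_lasso(y, d, X, k=2, lam=0.15))
```

In this design the true coefficient is 1.0, and both the factors and the first idiosyncratic disturbance affect the outcome and the variable of interest, so a regression that ignores either component would be confounded. The number of factors `k` and the penalty `lam` are fixed by hand here; choosing them in a data-driven way is outside the scope of the sketch.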

Information

Type: ARTICLES
Information: Econometric Theory, Volume 35, Issue 3, June 2019, pp. 465-509

DOI: https://doi.org/10.1017/S0266466618000245
Copyright: Copyright © Cambridge University Press 2018

Footnotes

The authors are grateful to Shakheeb Khan, Roger Moon, and seminar participants at the Australasian Meetings of the Econometric Society, Inference in Large Econometric Models at Montréal, University of Chile, National University of Singapore, Xiamen University, University of Toronto, and Stevens Institute of Technology for helpful comments. This material is based upon work supported by the National Science Foundation under Grant No. 1558636 and the University of Chicago Booth School of Business. First version: June 2016. This version: July 23, 2018.

Hansen and Liao supplementary material: Appendices D-I (PDF, 1.3 MB)

Hansen and Liao supplementary material 1 (File, 413.6 KB)
