
THE FACTOR-LASSO AND K-STEP BOOTSTRAP APPROACH FOR INFERENCE IN HIGH-DIMENSIONAL ECONOMIC APPLICATIONS

  • Christian Hansen (a1) and Yuan Liao (a2)
Abstract

We consider inference about coefficients on a small number of variables of interest in a linear panel data model with additive unobserved individual-specific and time-specific effects and a large number of additional time-varying confounding variables. We suppose that, in addition to unrestricted time-specific and individual-specific effects, these confounding variables are generated by a small number of common factors and high-dimensional weakly dependent disturbances. We allow both the factors and the disturbances to be related to the outcome variable and to the other variables of interest. To make informative inference feasible, we impose that the part of the confounding variables not captured by time-specific effects, individual-specific effects, or the common factors contributes through a relatively small number of terms whose identities are unknown. Within this framework, we provide a convenient inferential procedure based on factor extraction followed by lasso regression and show that the procedure has good asymptotic properties. We also provide a simple k-step bootstrap procedure that may be used to construct inferential statements about the low-dimensional parameters of interest, and we prove its asymptotic validity. We provide simulation evidence about the performance of our procedure and illustrate its use in an empirical application.
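The "factor extraction followed by lasso regression" idea can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes the individual and time effects have already been removed, treats the data as a single stacked sample, takes the number of factors `k` and the penalty `lam` as given, and borrows a post-double-selection final step for the last regression. All function names are hypothetical, and the k-step bootstrap is not shown.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()                      # current residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * b[j]       # remove coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]       # add the updated coordinate back
    return b

def extract_factors(X, k):
    """Principal components: estimated factors F and idiosyncratic part U."""
    Xc = X - X.mean(axis=0)
    left, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    F = left[:, :k] * s[:k]
    U = Xc - F @ Vt[:k]
    return F, U

def residualize(v, Z):
    """Residual of v after least squares on an intercept and the columns of Z."""
    W = np.column_stack([np.ones(len(v)), Z])
    coef, *_ = np.linalg.lstsq(W, v, rcond=None)
    return v - W @ coef

def factor_lasso_alpha(y, d, X, k, lam):
    """Illustrative estimate of the coefficient on d (assumed design, see lead-in)."""
    F, U = extract_factors(X, k)
    y_t = residualize(y, F)           # partial the estimated factors out of y
    d_t = residualize(d, F)           # and out of the variable of interest
    # Lasso selects idiosyncratic components that matter in either equation.
    sel_y = np.flatnonzero(lasso_cd(U, y_t, lam))
    sel_d = np.flatnonzero(lasso_cd(U, d_t, lam))
    sel = np.union1d(sel_y, sel_d)
    if sel.size:
        y_t = residualize(y_t, U[:, sel])
        d_t = residualize(d_t, U[:, sel])
    # Final step: regress the residualized outcome on the residualized d.
    return float(d_t @ y_t / (d_t @ d_t))
```

Selecting controls from the idiosyncratic components `U` rather than from the raw confounders reflects the structure in the abstract, where sparsity is imposed only on the part of the confounders not captured by the common factors.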

Corresponding author
*Address correspondence to Christian Hansen, Booth School of Business, University of Chicago, Chicago, IL 60637, USA; e-mail: Christian.Hansen@chicagobooth.edu
Yuan Liao, Department of Economics, Rutgers University, New Brunswick, NJ 08901, USA; e-mail: yuan.liao@rutgers.edu.
Footnotes

The authors are grateful to Shakheeb Khan, Roger Moon, and seminar participants at the Australasian Meetings of the Econometric Society, Inference in Large Econometric Models at Montréal, University of Chile, National University of Singapore, Xiamen University, University of Toronto, and Stevens Institute of Technology for helpful comments. This material is based upon work supported by the National Science Foundation under Grant No. 1558636 and the University of Chicago Booth School of Business. First version: June 2016. This version: July 23, 2018.

Econometric Theory
  • ISSN: 0266-4666
  • EISSN: 1469-4360