Published online by Cambridge University Press: 05 May 2013
We consider linear, high-dimensional sparse (HDS) regression models in econometrics. The HDS regression model allows for a large number of regressors, p, which is possibly much larger than the sample size, n, but imposes that the model is sparse. That is, we assume that only s ≪ n of these regressors are important for capturing the main features of the regression function. This assumption makes it possible to effectively estimate HDS models by searching for approximately the correct set of regressors. In this chapter, we review estimation methods for HDS models that make use of ℓ1-penalization and then provide a set of novel inference results. We also provide empirical examples that illustrate the potential wide applicability of HDS models and methods in econometrics.
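The idea of ℓ1-penalized estimation under sparsity can be illustrated with a minimal sketch. The simulation below is purely illustrative and is not drawn from the chapter: we generate a design with p much larger than n, a coefficient vector with only s nonzero entries, and fit the Lasso with a cross-validated penalty level using scikit-learn.

```python
# Minimal sketch of l1-penalized (Lasso) estimation in a p >> n sparse setting.
# The data-generating process, dimensions, and signal strengths are illustrative
# assumptions, not taken from the chapter.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p, s = 100, 500, 5                     # sample size, regressors, true sparsity
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = [3.0, -2.0, 1.5, 1.0, -1.0]    # only s of the p coefficients are nonzero
y = X @ beta + rng.standard_normal(n)

lasso = LassoCV(cv=5).fit(X, y)           # penalty chosen by 5-fold cross-validation
support = np.flatnonzero(lasso.coef_)     # indices of selected regressors
print("number of selected regressors:", len(support))
```

Because the penalty shrinks most coefficients exactly to zero, the fitted model selects a small subset of the 500 candidate regressors, approximating the true support.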
The motivation for considering HDS models comes in part from the wide availability of datasets with many regressors. For example, the American Housing Survey records prices as well as a multitude of features of houses sold, and scanner datasets record prices and numerous characteristics of products sold at a store or on the Internet. HDS models are also partly motivated by the use of series methods in econometrics. Series methods use many constructed or series regressors – regressors formed as transformations of elementary regressors – to approximate regression functions. In these applications, it is important to have a parsimonious yet accurate approximation of the regression function. One way to achieve this is to use the data to select a small number of informative terms from among a very large set of control variables or approximating functions.