MODEL SELECTION AND INFERENCE: FACTS AND FICTION

  • Hannes Leeb and Benedikt M. Pötscher
Abstract

Model selection has an important impact on subsequent inference. Ignoring the model selection step leads to invalid inference. We discuss some intricate aspects of data-driven model selection that do not seem to have been widely appreciated in the literature. We debunk some myths about model selection, in particular the myth that consistent model selection has no effect on subsequent inference asymptotically. We also discuss an “impossibility” result regarding the estimation of the finite-sample distribution of post-model-selection estimators.

Corresponding author
Address correspondence to Benedikt Pötscher, Department of Statistics, University of Vienna, Universitätsstrasse 5, A-1010, Vienna, Austria; e-mail: Benedikt.Poetscher@univie.ac.at
Econometric Theory
  • ISSN: 0266-4666
  • EISSN: 1469-4360
  • URL: /core/journals/econometric-theory