Search results for Statistical theory and methods

8 - The Bootstrap
David A. Freedman, University of California, Berkeley
Book:

Statistical Models

Published online:

05 June 2012

Print publication:

27 April 2009, pp 155-175
- Chapter

Contents
David A. Freedman, University of California, Berkeley
Book:

Statistical Models

Published online:

05 June 2012

Print publication:

27 April 2009, pp v-x
- Chapter

Frontmatter
David A. Freedman, University of California, Berkeley
Book:

Statistical Models

Published online:

05 June 2012

Print publication:

27 April 2009, pp i-iv
- Chapter

Index
David A. Freedman, University of California, Berkeley
Book:

Statistical Models

Published online:

05 June 2012

Print publication:

27 April 2009, pp 431-442
- Chapter

Preface
David A. Freedman, University of California, Berkeley
Book:

Statistical Models

Published online:

05 June 2012

Print publication:

27 April 2009, pp xiii-xiv
- Chapter
Summary

This book is primarily intended for advanced undergraduates or beginning graduate students in statistics. It should also be of interest to many students and professionals in the social and health sciences. Although written as a textbook, it can be read on its own. The focus is on applications of linear models, including generalized least squares, two-stage least squares, probits and logits. The bootstrap is explained as a technique for estimating bias and computing standard errors.
The contents of the book can fairly be described as what you have to know in order to start reading empirical papers that use statistical models. The emphasis throughout is on the connection—or lack of connection—between the models and the real phenomena. Much of the discussion is organized around published studies; the key papers are reprinted for ease of reference. Some observers may find the tone of the discussion too skeptical. If you are among them, I would make an unusual request: suspend belief until you finish reading the book. (Suspension of disbelief is all too easily obtained, but that is a topic for another day.)
The first chapter contrasts observational studies with experiments, and introduces regression as a technique that may help to adjust for confounding in observational studies. There is a chapter that explains the regression line, and another chapter with a quick review of matrix algebra. (At Berkeley, half the statistics majors need these chapters.)
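
The summary's description of the bootstrap as a technique for estimating bias and computing standard errors can be illustrated with a minimal sketch. This is not taken from the book: the function name `bootstrap_se_and_bias`, the sample values, and the choice of 2000 resamples are all assumptions made for illustration, and the estimator here is simply the sample mean.

```python
import random
import statistics


def bootstrap_se_and_bias(data, estimator, n_resamples=2000, seed=0):
    """Nonparametric bootstrap estimate of standard error and bias.

    Hypothetical helper for illustration; not the book's code.
    """
    rng = random.Random(seed)
    n = len(data)
    theta_hat = estimator(data)  # estimate on the original sample
    replicates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(estimator(resample))
    # Spread of the replicates estimates the standard error.
    se = statistics.stdev(replicates)
    # Mean of the replicates minus the original estimate estimates the bias.
    bias = statistics.fmean(replicates) - theta_hat
    return theta_hat, se, bias


# Made-up sample, used only to exercise the function.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5]
theta_hat, se, bias = bootstrap_se_and_bias(sample, statistics.fmean)
print(f"estimate={theta_hat:.3f}  bootstrap SE={se:.3f}  bootstrap bias={bias:.3f}")
```

For the sample mean the bootstrap bias should be close to zero, and the bootstrap standard error should be close to the plug-in value (population-style standard deviation divided by the square root of n), which makes this a convenient sanity check before applying the same function to a less tractable estimator.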

The Computer Labs
David A. Freedman, University of California, Berkeley
Book:

Statistical Models

Published online:

05 June 2012

Print publication:

27 April 2009, pp 294-309
- Chapter

Answers to Exercises
David A. Freedman, University of California, Berkeley
Book:

Statistical Models

Published online:

05 June 2012

Print publication:

27 April 2009, pp 235-293
- Chapter
Summary

Chapter 1 Observational Studies and Experiments
Exercise Set A
1. In table 1, there were 837 deaths from other causes in the total treatment group (screened plus refused) and 879 in the control group. Not much different.
Comments. (i) Groups are the same size, so we can look at numbers or rates. (ii) The difference in number of deaths is relatively small, and not statistically significant.
2. This comparison is biased. The control group includes women who would have accepted screening if they had been asked, and are therefore comparable to women in the screening group. But the control group also includes women who would have refused screening. The latter are poorer, less well educated, less at risk from breast cancer. (A comparison that includes only the subjects who follow the investigators' treatment plans is called “per protocol analysis,” and is generally biased.)
3. Natural experiment. The fact that the Lambeth Company moved its pipe (i) sets up the comparison with Southwark & Vauxhall (table 2) and (ii) makes it harder to explain the difference in death rates between the Lambeth customers and the Southwark & Vauxhall customers on the basis of some difference between the two groups, other than the water. For instance, people were generally not choosing between the two water companies on the basis of how the water tasted. If they had been, self-selection and confounding would be bigger issues. The change in water intake point is one basis for the view that the data could be analyzed as if they were from a randomized controlled experiment.
4. […]
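
The remark in answer 1 that the difference in deaths is not statistically significant can be checked with a rough calculation. The sketch below is one possible approach and is not presented as the book's method: it assumes, as the comment in the answer states, that the two groups are the same size, so that under the null hypothesis each of the deaths is equally likely to fall in either group. The variable names are illustrative.

```python
import math

# Counts quoted in answer 1 (deaths from other causes).
deaths_treatment = 837
deaths_control = 879
total = deaths_treatment + deaths_control

# Assumption: equal group sizes, so under the null the number of deaths in
# the control group is Binomial(total, 1/2). The difference
# (control - treatment) then has standard deviation 2*sqrt(total/4) = sqrt(total).
diff = deaths_control - deaths_treatment
se_diff = math.sqrt(total)
z = diff / se_diff

# Two-sided p-value from the normal approximation.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"difference={diff}  SE={se_diff:.1f}  z={z:.2f}  p={p_two_sided:.2f}")
```

A difference of about one standard error is well within chance variation, which agrees with the comment in the answer.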

Preface
Gerda Claeskens, Katholieke Universiteit Leuven, Belgium, Nils Lid Hjort, Universitetet i Oslo
Book:

Model Selection and Model Averaging

Published online:

05 September 2012

Print publication:

28 July 2008, pp xi-xiii
- Chapter
Summary

Every statistician and data analyst has to make choices. The need arises especially when data have been collected and it is time to think about which model to use to describe and summarise the data. Another choice, often, is whether all measured variables are important enough to be included, for example, to make predictions. Can we make life simpler by only including a few of them, without making the prediction significantly worse?
In this book we present several methods to help make these choices easier. Model selection is a broad area and it reaches far beyond deciding on which variables to include in a regression model.
Two generations ago, setting up and analysing a single model was already hard work, and one rarely went to the trouble of analysing the same data via several alternative models. Thus ‘model selection’ was not much of an issue, apart from perhaps checking the model via goodness-of-fit tests. In the 1970s and later, proper model selection criteria were developed and actively used. With unprecedented versatility and convenience, long lists of candidate models, whether thought through in advance or not, can be fitted to a data set. But this creates problems too. With a multitude of models fitted, it is clear that methods are needed that somehow summarise model fits.

2 - Akaike's information criterion
Gerda Claeskens, Katholieke Universiteit Leuven, Belgium, Nils Lid Hjort, Universitetet i Oslo
Book:

Model Selection and Model Averaging

Published online:

05 September 2012

Print publication:

28 July 2008, pp 22-69
- Chapter
Summary

Data can often be modelled in different ways. There might be simple approaches and more advanced ones that perhaps have more parameters. When many covariates are measured we could attempt to use them all to model their influence on a response, or only a subset of them, which would make it easier to interpret and communicate the results. For selecting a model among a list of candidates, Akaike's information criterion (AIC) is among the most popular and versatile strategies. Its essence is a penalised version of the attained maximum log-likelihood, for each model. In this chapter we shall see AIC at work in a range of applications, in addition to unravelling its basic construction and properties. Attention is also given to natural generalisations and modifications of AIC that in various situations aim at performing more accurately.
Information criteria for balancing fit with complexity
In Chapter 1 various problems were discussed where the task of selecting a suitable statistical model, from a list of candidates, was an important ingredient. By necessity there are different model selection strategies, corresponding to different aims and uses associated with the selected model. Most (but not all) selection methods are defined in terms of an appropriate information criterion, a mechanism that uses data to give each candidate model a certain score; this then leads to a fully ranked list of candidate models, from the ostensibly best to the worst.
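
The idea of a penalised maximum log-likelihood that scores and ranks candidate models can be sketched as follows. Everything here is an illustrative assumption rather than material from the chapter: the data are made up, the two candidate models are a zero-mean normal and a free-mean normal, and the sign convention (twice the maximised log-likelihood minus twice the number of parameters, larger is better) is one common choice; many software packages report the negative of this quantity, where smaller is better.

```python
import math


def gaussian_max_loglik(residuals):
    """Maximised normal log-likelihood, with the variance set to its MLE."""
    n = len(residuals)
    sigma2_hat = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2_hat) + 1)


def aic(max_loglik, n_params):
    """Penalised log-likelihood score; larger is better in this convention."""
    return 2 * max_loglik - 2 * n_params


# Made-up observations, for illustration only.
y = [1.2, 0.8, 1.9, 1.4, 0.7, 1.6, 1.1, 1.3]
mean_y = sum(y) / len(y)

scores = {
    # Mean fixed at 0, variance estimated: 1 parameter.
    "zero-mean": aic(gaussian_max_loglik(y), 1),
    # Mean and variance both estimated: 2 parameters.
    "free-mean": aic(gaussian_max_loglik([v - mean_y for v in y]), 2),
}

# Rank candidates from the best score to the worst.
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name}: AIC score = {scores[name]:.2f}")
```

The extra parameter costs the free-mean model two points of penalty, but the gain in fit is far larger for these data, so it ranks first.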

5 - Bigger is not always better
Gerda Claeskens, Katholieke Universiteit Leuven, Belgium, Nils Lid Hjort, Universitetet i Oslo
Book:

Model Selection and Model Averaging

Published online:

05 September 2012

Print publication:

28 July 2008, pp 117-144
- Chapter

8 - Lack-of-fit and goodness-of-fit tests
Gerda Claeskens, Katholieke Universiteit Leuven, Belgium, Nils Lid Hjort, Universitetet i Oslo
Book:

Model Selection and Model Averaging

Published online:

05 September 2012

Print publication:

28 July 2008, pp 227-247
- Chapter

4 - A comparison of some selection methods
Gerda Claeskens, Katholieke Universiteit Leuven, Belgium, Nils Lid Hjort, Universitetet i Oslo
Book:

Model Selection and Model Averaging

Published online:

05 September 2012

Print publication:

28 July 2008, pp 99-116
- Chapter
Summary

In this chapter we compare some information criteria with respect to consistency and efficiency, which are classical themes in model selection. The comparison is driven by a study of the ‘penalty’ applied to the maximised log-likelihood value, in a framework with increasing sample size. AIC is not strongly consistent, though it is efficient, while the opposite is true for the BIC. We also introduce Hannan and Quinn's criterion, which has properties similar to those of the BIC, while Mallows's Cp and Akaike's FPE behave like AIC.
Comparing selectors: consistency, efficiency and parsimony
If we make the assumption that there exists one true model that generated the data and that this model is one of the candidate models, we would want the model selection method to identify this true model. This is related to consistency. A model selection method is weakly consistent if, with probability tending to one as the sample size tends to infinity, the selection method is able to select the true model from the candidate models. Strong consistency is obtained when the selection of the true model happens almost surely. Often, we do not wish to make the assumption that the true model is amongst the candidate models. If instead we are willing to assume that there is a candidate model that is closest in Kullback–Leibler distance to the true model, we can state weak consistency as the property that, with probability tending to one, the model selection method picks such a closest model.
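
The 'penalty' that drives this comparison can be made concrete with a small sketch. It assumes the commonly used forms of the per-parameter penalty: 2 for AIC, log n for the BIC, and 2c log log n for Hannan and Quinn's criterion. The function name and the default constant `hq_constant=1.0` are assumptions for illustration, not taken from the chapter.

```python
import math


def penalty_per_parameter(criterion, n, hq_constant=1.0):
    """Penalty charged per model parameter at sample size n.

    Assumed forms: AIC -> 2, BIC -> log n, Hannan-Quinn -> 2 c log log n.
    """
    if criterion == "AIC":
        return 2.0
    if criterion == "BIC":
        return math.log(n)
    if criterion == "HQ":
        return 2.0 * hq_constant * math.log(math.log(n))
    raise ValueError(f"unknown criterion: {criterion}")


for n in (7, 8, 100, 10_000):
    row = {c: round(penalty_per_parameter(c, n), 3) for c in ("AIC", "BIC", "HQ")}
    print(n, row)
```

The AIC penalty stays fixed while the BIC penalty grows with the sample size, overtaking it once log n exceeds 2, that is, from n = 8 onward. A penalty that grows with n makes it increasingly hard for superfluous parameters to be selected, which is the mechanism behind the consistency properties discussed here.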

Contents
Gerda Claeskens, Katholieke Universiteit Leuven, Belgium, Nils Lid Hjort, Universitetet i Oslo
Book:

Model Selection and Model Averaging

Published online:

05 September 2012

Print publication:

28 July 2008, pp vii-x
- Chapter

1 - Model selection: data examples and introduction
Gerda Claeskens, Katholieke Universiteit Leuven, Belgium, Nils Lid Hjort, Universitetet i Oslo
Book:

Model Selection and Model Averaging

Published online:

05 September 2012

Print publication:

28 July 2008, pp 1-21
- Chapter
Summary

This book is about making choices. If there are several possibilities for modelling data, which should we take? If multiple explanatory variables are measured, should they all be used when forming predictions, making classifications, or attempting to summarise analysis of what influences response variables, or will including only a few of them work equally well, or better? If so, which ones can we best include? Model selection problems arrive in many forms and on widely varying occasions. In this chapter we present some data examples and discuss some of the questions they lead to. Later in the book we come back to these data and suggest some answers. A short preview of what is to come in later chapters is also provided.
Introduction
With the current ease of data collection which in many fields of applied science has become cheaper and cheaper, there is a growing need for methods which point to interesting, important features of the data, and which help to build a model. The model we wish to construct should be rich enough to explain relations in the data, but on the other hand simple enough to understand, explain to others, and use. It is when we negotiate this balance that model selection methods come into play. They provide formal support to guide data users in their search for good models, or for determining which variables to include when making predictions and classifications.

6 - The focussed information criterion
Gerda Claeskens, Katholieke Universiteit Leuven, Belgium, Nils Lid Hjort, Universitetet i Oslo
Book:

Model Selection and Model Averaging

Published online:

05 September 2012

Print publication:

28 July 2008, pp 145-191
- Chapter
Summary

The model selection methods presented earlier (such as AIC and the BIC) have one thing in common: they select one single ‘best model’, which should then be used to explain all aspects of the mechanisms underlying the data and predict all future data points. The tolerance discussion in Chapter 5 showed that sometimes one model is best for estimating one type of estimand, whereas another model is best for another estimand. The point of view expressed via the focussed information criterion (FIC) is that a ‘best model’ should depend on the parameter under focus, such as the mean, or the variance, or the particular covariate values, etc. Thus the FIC allows and encourages different models to be selected for different parameters of interest.
Estimators and notation in submodels
In model selection applications there is a list of models to consider. We shall assume here that there is a ‘smallest’ and a ‘biggest’ model among these, and that the others lie between these two extremes. More concretely, there is a narrow model, which is the simplest model that we possibly might use for the data, having an unknown parameter vector θ of length p. Secondly, in the wide model, the largest model that we consider, there are an additional q parameters γ = (γ1, …, γq). We assume that the narrow model is a special case of the wide model, which means that there is a value γ0 such that with γ = γ0 in the wide model, we get precisely the narrow model.
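
One way to picture the models lying between the narrow and the wide model is to index each candidate by the set of extra parameters it leaves free, with the remaining components of γ held at their γ0 values. The sketch below enumerates these index sets. It rests on an assumption: that every subset is considered, whereas an analysis may restrict attention to only some of them. The function names are hypothetical.

```python
from itertools import combinations


def candidate_submodels(q):
    """All index sets S within {1, ..., q}.

    For j in S, gamma_j is a free parameter; for j not in S it is fixed at
    its narrow-model value. The empty set is the narrow model and the full
    set is the wide model. Illustrative helper only.
    """
    indices = range(1, q + 1)
    return [
        frozenset(subset)
        for size in range(q + 1)
        for subset in combinations(indices, size)
    ]


def parameter_count(p, subset):
    """Free parameters in a submodel: p from theta plus one per free gamma_j."""
    return p + len(subset)


p, q = 2, 3
models = candidate_submodels(q)
for subset in models:
    print(sorted(subset), "->", parameter_count(p, subset), "parameters")
```

With q extra parameters there are 2^q such submodels, so the list grows quickly, which is one reason a restricted candidate list may be preferred in practice.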

3 - The Bayesian information criterion
Gerda Claeskens, Katholieke Universiteit Leuven, Belgium, Nils Lid Hjort, Universitetet i Oslo
Book:

Model Selection and Model Averaging

Published online:

05 September 2012

Print publication:

28 July 2008, pp 70-98
- Chapter

Frontmatter
Gerda Claeskens, Katholieke Universiteit Leuven, Belgium, Nils Lid Hjort, Universitetet i Oslo
Book:

Model Selection and Model Averaging

Published online:

05 September 2012

Print publication:

28 July 2008, pp i-vi
- Chapter

A guide to notation
Gerda Claeskens, Katholieke Universiteit Leuven, Belgium, Nils Lid Hjort, Universitetet i Oslo
Book:

Model Selection and Model Averaging

Published online:

05 September 2012

Print publication:

28 July 2008, pp xiv-xviii
- Chapter

9 - Model selection and averaging schemes in action
Gerda Claeskens, Katholieke Universiteit Leuven, Belgium, Nils Lid Hjort, Universitetet i Oslo
Book:

Model Selection and Model Averaging

Published online:

05 September 2012

Print publication:

28 July 2008, pp 248-268
- Chapter
Summary

In this chapter model selection and averaging methods are applied in some usual regression set-ups, like those of generalised linear models and the Cox proportional hazards regression model, along with some less straightforward models for multivariate data. Answers are suggested to several of the specific model selection questions posed about the data sets of Chapter 1. In the process we explain in detail what the necessary key quantities are, for different strategies, and how these are estimated from data. A concrete application of methods for statistical model selection and averaging is often a nontrivial task. It involves a careful listing of all candidate models as well as specification of focus parameters, and there might be different possibilities for estimating some of the key quantities involved in a given selection criterion. Some of these issues are illustrated in this chapter, which is concerned with data analysis and discussion only; for the methodology we refer to earlier chapters.
AIC and BIC selection for Egyptian skull development data
We perform model selection for the data set consisting of measurements on skulls of male Egyptians, living in different time eras; see Section 1.2 for more details. Our interest lies in studying a possible trend in the measurements over time and in the correlation structure between measurements.
Assuming the normal approximation at work, we construct for each time period, and for each of the four measurements, pointwise 95% confidence intervals for the expected average measurement of that variable and in that time period.
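
A pointwise 95% confidence interval under the normal approximation is the sample mean plus or minus roughly 1.96 standard errors, computed separately for each variable and time period. The sketch below shows that calculation. The numbers are made up and are not the skull data; the period labels and the function name `normal_ci_95` are likewise assumptions for illustration.

```python
import math
import statistics


def normal_ci_95(values):
    """Pointwise 95% interval for the mean under the normal approximation."""
    n = len(values)
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.975)    # about 1.96
    return mean - z * se, mean + z * se


# Hypothetical measurements for one variable in two time periods.
measurements = {
    "period A": [130, 132, 128, 135, 131, 129, 133, 134],
    "period B": [133, 136, 131, 137, 134, 135, 132, 138],
}

intervals = {period: normal_ci_95(vals) for period, vals in measurements.items()}
for period, (lower, upper) in intervals.items():
    print(f"{period}: ({lower:.2f}, {upper:.2f})")
```

Because the intervals are pointwise, each one has nominal 95% coverage on its own; no adjustment is made for looking at several periods and variables at once.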

Subject index
Gerda Claeskens, Katholieke Universiteit Leuven, Belgium, Nils Lid Hjort, Universitetet i Oslo
Book:

Model Selection and Model Averaging

Published online:

05 September 2012

Print publication:

28 July 2008, pp 310-312
- Chapter

2326 results in Statistical theory and methods
