Summary. Approximate forms of inference based on local approximations of the log likelihood in the neighbourhood of its maximum are discussed. An initial discussion of the exact properties of log likelihood derivatives includes a definition of Fisher information. Then the main properties of maximum likelihood estimates and related procedures are developed for a one-dimensional parameter. A notation is used to allow fairly direct generalization to vector parameters and to situations with nuisance parameters. Finally numerical methods and some other issues are discussed in outline.
General remarks
The previous discussion yields formally exact frequentist solutions to a number of important problems, in particular concerning the normal-theory linear model and various problems to do with Poisson, binomial, exponential and other exponential family distributions. Of course the solutions are formal in the sense that they presuppose a specification which is at best a good approximation and which may in fact be inadequate. Bayesian solutions are in principle always available, once the full specification of the model and prior distribution is established.
There remain, however, many situations for which the exact frequentist development does not work; these include nonstandard questions about simple situations and many models where more complicated formulations are unavoidable for any sort of realism. These issues are addressed by asymptotic analysis. That is, approximations are derived on the basis that the amount of information is large, errors of estimation are small, nonlinear relations are locally linear and a central limit effect operates to induce approximate normality of log likelihood derivatives.
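The central limit effect on log likelihood derivatives can be illustrated with a small simulation. The following is a minimal sketch, not taken from the text: it assumes independent exponential observations with rate parameter `rate`, for which the log likelihood is n log(rate) - rate * sum(y), the score is n/rate - sum(y) and the Fisher information is n/rate^2. The function names, sample size, seed and number of replications are illustrative choices.

```python
import math
import random
import statistics


def standardized_score(sample, rate):
    """Score of the exponential log likelihood at `rate`, scaled by sqrt(Fisher information)."""
    n = len(sample)
    # Derivative with respect to rate of n*log(rate) - rate*sum(y).
    score = n / rate - sum(sample)
    # Expected value of minus the second derivative of the log likelihood.
    fisher_information = n / rate ** 2
    return score / math.sqrt(fisher_information)


def simulate_scores(n, rate, replications, seed=0):
    """Standardized scores from repeated samples drawn at the true rate (assumed setup)."""
    rng = random.Random(seed)
    return [
        standardized_score([rng.expovariate(rate) for _ in range(n)], rate)
        for _ in range(replications)
    ]


scores = simulate_scores(n=200, rate=2.0, replications=2000)
inside = sum(abs(s) <= 1.96 for s in scores) / len(scores)
print("mean:", statistics.mean(scores))
print("standard deviation:", statistics.stdev(scores))
print("fraction within +/-1.96:", inside)
```

If the normal approximation is adequate at this sample size, the mean should be near 0, the standard deviation near 1 and the fraction within ±1.96 near 0.95. Repeating the run with a much smaller `n` gives a rough sense of where the approximation starts to degrade.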
Summary. Key ideas about probability models and the objectives of statistical analysis are introduced. The differences between frequentist and Bayesian analyses are illustrated in a very special case. Some slightly more complicated models are introduced as reference points for the following discussion.
Starting point
We typically start with a subject-matter question. Data are or become available to address this question. After preliminary screening, checks of data quality and simple tabulations and graphs, more formal analysis starts with a provisional model. The data are typically split in two parts (y : z), where y is regarded as the observed value of a vector random variable Y and z is treated as fixed. Sometimes the components of y are direct measurements of relevant properties on study individuals and sometimes they are themselves the outcome of some preliminary analysis, such as means, measures of variability, regression coefficients and so on. The set of variables z typically specifies aspects of the system under study that are best treated as purely explanatory and whose observed values are not usefully represented by random variables. That is, we are interested solely in the distribution of outcome or response variables conditionally on the variables z; a particular example is where z represents treatments in a randomized experiment.
We use throughout the notation that observable random variables are represented by capital letters and observations by the corresponding lower case letters.
Much of this book has involved an interplay between broadly frequentist discussion and a Bayesian approach, the latter usually involving a wider notion of the idea of probability. In many, but by no means all, situations numerically similar answers can be obtained from the two routes. Both approaches occur so widely in the current literature that it is important to appreciate the relation between them and for that reason the book has attempted a relatively dispassionate assessment.
This appendix is, by contrast, a more personal statement. Whatever the approach to formal inference, formalization of the research question as being concerned with aspects of a specified kind of probability model is clearly of critical importance. It translates a subject-matter question into a formal statistical question and that translation must be reasonably faithful and, as far as is feasible, the consistency of the model with the data must be checked. How this translation from subject-matter problem to statistical model is done is often the most critical part of an analysis. Furthermore, all formal representations of the process of analysis and its justification are at best idealized models of an often complex chain of argument.
Frequentist analyses are based on a simple and powerful unifying principle. The implications of data are examined using measuring techniques such as confidence limits and significance tests calibrated, as are other measuring instruments, indirectly by the hypothetical consequences of their repeated use.
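Calibration by hypothetical repeated use can be made concrete by simulation. The sketch below is an illustration under stated assumptions rather than a procedure from the text: observations are taken to be normal with unknown mean `mu` and known standard deviation `sigma`, and the usual 95% limits, the sample mean ± 1.96 sigma / sqrt(n), are recomputed on many simulated data sets to see how often they include the true mean. All names and numerical settings are hypothetical.

```python
import math
import random


def confidence_limits(sample, sigma, z=1.96):
    """Limits for a normal mean with known sigma; z=1.96 gives a nominal 95% level."""
    n = len(sample)
    mean = sum(sample) / n
    half_width = z * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width


def coverage(mu, sigma, n, replications, seed=1):
    """Proportion of simulated data sets whose limits include the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replications):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = confidence_limits(sample, sigma)
        if lower <= mu <= upper:
            hits += 1
    return hits / replications


estimated_coverage = coverage(mu=3.0, sigma=2.0, n=25, replications=5000)
print("estimated coverage:", estimated_coverage)
```

The printed proportion estimates the long-run frequency with which the limits include the true value, which is the sense in which the procedure is calibrated. It says nothing directly about any single interval.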
Most statistical work is concerned directly with the provision and implementation of methods for study design and for the analysis and interpretation of data. The theory of statistics deals in principle with the general concepts underlying all aspects of such work and from this perspective the formal theory of statistical inference is but a part of that full theory. Indeed, from the viewpoint of individual applications, it may seem rather a small part. Concern is likely to be more concentrated on whether models have been reasonably formulated to address the most fruitful questions, on whether the data are subject to unappreciated errors or contamination and, especially, on the subject-matter interpretation of the analysis and its relation with other knowledge of the field.
Yet the formal theory is important for a number of reasons. Without some systematic structure, statistical methods for the analysis of data become a collection of tricks that are hard to assimilate and interrelate to one another, or for that matter to teach. The development of new methods appropriate for new problems would become entirely a matter of ad hoc ingenuity. Of course such ingenuity is not to be undervalued and indeed one role of theory is to assimilate, generalize and perhaps modify and improve the fruits of such ingenuity.
The original motivation for writing this book was rather personal. The first author, in the course of his teaching career in the Department of Pure Mathematics and Mathematical Statistics (DPMMS), University of Cambridge, and St John's College, Cambridge, had many painful experiences when good (or even brilliant) students, who were interested in the subject of mathematics and its applications and who performed well during their first academic year, stumbled or nearly failed in the exams. This led to great frustration, which was very hard to overcome in subsequent undergraduate years. A conscientious tutor is always sympathetic to such misfortunes, but even pointing out a student's obvious weaknesses (if any) does not always help. For the second author, such experiences were as a parent of a Cambridge University student rather than as a teacher.
We therefore felt that a monograph focusing on Cambridge University mathematics examination questions would be beneficial for a number of students. Given our own research and teaching backgrounds, it was natural for us to select probability and statistics as the overall topic. The obvious starting point was the first-year course in probability and the second-year course in statistics. In order to cover other courses, several further volumes will be needed; for better or worse, we have decided to embark on such a project.
Thus our essential aim is to present the Cambridge University probability and statistics courses by means of examination (and examination-related) questions that have been set over a number of past years.