When a die (with 3 or more faces) is rolled, the result of each trial can take one of several possible values. The same is true in the context of an urn experiment when the balls in the urn are of multiple colors. Such models are broadly applicable. Indeed, even ‘yes/no’ polls almost always include at least one other option such as ‘not sure’ or ‘no opinion’. Another situation where discrete variables arise is when two or more coins are compared in terms of their chances of landing heads, or more generally, when two or more (otherwise identical) dice are compared in terms of their chances of landing on a particular face. In terms of urn experiments, the analog is a situation where balls are drawn from multiple urns. Experiments of this sort can be used to model clinical trials in which several treatments are compared and the outcome is dichotomous. When the coins are tossed together, or when the dice are rolled together, we might want to test for independence. We thus introduce some classical tests for comparing multiple discrete distributions and for testing the independence of two or more discrete variables that are observed together.
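As a minimal sketch of such a classical test, Pearson's chi-squared test of independence can be applied to a contingency table of counts; the table below (two treatments, three outcome categories) is hypothetical, chosen only to illustrate the mechanics.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts: rows are two treatments, columns are three outcome categories
table = np.array([[30, 14, 6],
                  [22, 18, 10]])

# Pearson's chi-squared test of independence between row and column variables
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.3f}, dof = {dof}, p-value = {p:.3f}")
```

Under the null hypothesis of independence, the statistic is approximately chi-squared distributed with (rows - 1)(columns - 1) degrees of freedom, here 2.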
A prototypical (although somewhat idealized) workflow in any scientific investigation starts with the design of the experiment to probe a question or hypothesis of interest. The experiment is modeled using several plausible mechanisms. The experiment is conducted and the data are collected. These data are finally analyzed to identify the most adequate mechanism, meaning the one among those considered that best explains the data. Although an experiment is supposed to be repeatable, this is not always possible, particularly if the system under study is chaotic or random in nature. When this is the case, the mechanisms above are expressed as probability distributions. We then talk about probabilistic modeling --- albeit with not one but several probability distributions. It is as if we contemplate several probability experiments, and the goal of statistical inference is to decide on the most plausible one in view of the collected data. We introduce core concepts such as estimators, confidence intervals, and tests.
The chapter focuses on discrete probability spaces, where probability calculations are combinatorial in nature. Urn models are presented as the quintessential discrete experiments.
Statistics is the science of data collection and data analysis. We provide, in this chapter, a brief introduction to principles and techniques for data collection, traditionally divided into survey sampling and experimental design --- each the subject of a rich literature. While most of this book is on mathematical theory, covering aspects of Probability Theory and Statistics, the collection of data is, by nature, much more practical, and often requires domain-specific knowledge. Careful data collection is of paramount importance: data that were improperly collected can be completely useless and unsalvageable by any technique of analysis. It is also worth keeping in mind that the collection phase is typically much more expensive than the analysis phase that ensues (e.g., clinical trials, car crash tests, etc.). Thus the collection of data should be carefully planned according to well-established protocols or with expert advice.
This chapter introduces Kolmogorov’s probability axioms and related terminology and concepts such as outcomes and events, sigma-algebras, probability distributions and their properties.
In this chapter we introduce and briefly discuss some properties of estimators and tests that make it possible to compare multiple methods addressing the same statistical problem. We discuss the notions of sufficiency and consistency, and various notions of optimality (including minimax optimality), both for estimators and for tests.
In a wide range of real-life situations, not one but several, even many, hypotheses are to be tested, and not accounting for multiple inference can lead to a grossly incorrect analysis. In this chapter we look closely at this important issue, describing some pitfalls and presenting remedies that ‘correct’ for this multiplicity. Combination tests assess whether there is evidence against any of the null hypotheses being tested. Other procedures aim instead at identifying the null hypotheses that are not congruent with the data while controlling some notion of error rate.
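The two approaches can be sketched side by side on hypothetical p-values (the numbers below are illustrative, not from the text): Fisher's combination test pools evidence against the global null, while a Bonferroni correction identifies individual hypotheses to reject at a controlled family-wise error rate.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical p-values from m = 5 independent tests
pvals = np.array([0.01, 0.20, 0.03, 0.40, 0.08])
m = len(pvals)

# Fisher's combination statistic: -2 * sum(log p_i) ~ chi-squared with 2m df
# under the global null that all m null hypotheses are true
stat = -2 * np.sum(np.log(pvals))
p_combined = chi2.sf(stat, df=2 * m)

# Bonferroni correction: reject H_i when p_i <= alpha / m
alpha = 0.05
rejected = pvals <= alpha / m
print(f"combined p-value = {p_combined:.4f}, rejections: {rejected}")
```

Here the combination test finds strong evidence against the global null, while Bonferroni rejects only the single most significant hypothesis, illustrating the different goals of the two procedures.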
Randomization was presented in a previous chapter as an essential ingredient in the collection of data, both in survey sampling and in experimental design. We argue here that randomization is the essential foundation of statistical inference: It leads to conditional inference in an almost canonical way, and allows for causal inference, which are the two topics covered in the chapter.
Estimating a proportion is one of the most basic problems in statistics. Although basic, it arises in a number of important real-life situations. Examples include election polls, conducted to estimate the proportion of people that will vote for a particular candidate; quality control, where the proportion of defective items manufactured at a particular plant or assembly line needs to be monitored, and one may resort to statistical inference to avoid having to check every single item; and clinical trials, which are conducted in part to estimate the proportion of people that would benefit (or suffer serious side effects) from receiving a particular treatment. The fundamental model is that of Bernoulli trials. The binomial family of distributions plays a central role. Also discussed are sequential designs, which lead to negative binomial distributions.
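A minimal numerical sketch of the election-poll example (sample size and counts are hypothetical): the sample proportion estimates the true proportion, and the normal approximation to the binomial yields an approximate confidence interval.

```python
import math

# Hypothetical poll: n respondents, k of whom favor the candidate
n, k = 1000, 540
p_hat = k / n  # sample proportion, the natural estimator

# Approximate 95% confidence interval via the normal approximation
z = 1.96
se = math.sqrt(p_hat * (1 - p_hat) / n)
lo, hi = p_hat - z * se, p_hat + z * se
print(f"estimate = {p_hat:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

With 540 of 1000 respondents in favor, the interval stays above one half, the kind of conclusion a pollster would want to draw.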
We consider an experiment that yields, as data, a sample of independent and identically distributed (real-valued) random variables with a common distribution on the real line. The estimation of the underlying mean and median is discussed at length, and bootstrap confidence intervals are constructed. Tests comparing the underlying distribution to a given distribution (e.g., the standard normal distribution) or a family of distributions (e.g., the normal family of distributions) are introduced. Censoring, which is very common in some clinical trials, is briefly discussed.
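A percentile bootstrap confidence interval for the median can be sketched in a few lines; the simulated sample below is hypothetical, used only to show the resampling mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical i.i.d. sample from an unknown distribution
sample = rng.normal(loc=2.0, scale=1.0, size=200)

# Percentile bootstrap: resample with replacement, recompute the median each time
B = 2000
boot_medians = np.array([
    np.median(rng.choice(sample, size=sample.size, replace=True))
    for _ in range(B)
])
lo, hi = np.percentile(boot_medians, [2.5, 97.5])
print(f"median = {np.median(sample):.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
```

The appeal of the bootstrap here is that it requires no formula for the sampling distribution of the median, only the ability to resample from the data.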
This compact course is written for the mathematically literate reader who wants to learn to analyze data in a principled fashion. The language of mathematics enables clear exposition that can go quite deep, quite quickly, and naturally supports an axiomatic and inductive approach to data analysis. Starting with a good grounding in probability, the reader moves to statistical inference via topics of great practical importance – simulation and sampling, as well as experimental design and data collection – that are typically displaced from introductory accounts. The core of the book then covers both standard methods and such advanced topics as multiple testing, meta-analysis, and causal inference.
This chapter gives a brief overview of Bayesian hypothesis testing. We first describe a standard Bayesian analysis of a single binomial response, going through the choice of prior distribution and explaining how the posterior is calculated. We then discuss Bayesian hypothesis testing using the Bayes factor, a measure of how much the posterior odds of believing in one hypothesis change from the prior odds. We show, using a binomial example, how the Bayes factor may be highly dependent on the prior distribution, even with extremely large sample sizes. We next discuss Bayesian hypothesis testing using decision theory, reviewing the intrinsic discrepancy of Bernardo, as well as the loss functions proposed by Freedman. Freedman’s loss functions allow the posterior belief in the null hypothesis to equal the p-value. We next discuss well-calibrated null preferences priors, which, applied to parameters from the natural exponential family (binomial, negative binomial, Poisson, normal), also give a posterior belief in the null hypothesis equal to valid one-sided p-values, and give credible intervals equal to valid confidence intervals.
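The prior dependence of the Bayes factor can be illustrated with a small sketch (data and priors are hypothetical): for a point null p = 1/2 against a Beta(a, b) prior on p, both marginal likelihoods are available in closed form via the beta-binomial.

```python
from math import comb
import numpy as np
from scipy.special import betaln

# Hypothetical data: k successes in n Bernoulli trials
n, k = 20, 15

def bayes_factor_01(n, k, a, b):
    """Bayes factor for H0: p = 1/2 versus H1: p ~ Beta(a, b)."""
    # marginal likelihood of the data under H0
    log_m0 = np.log(comb(n, k)) + n * np.log(0.5)
    # marginal likelihood under H1 (beta-binomial)
    log_m1 = np.log(comb(n, k)) + betaln(k + a, n - k + b) - betaln(a, b)
    return float(np.exp(log_m0 - log_m1))

bf_uniform = bayes_factor_01(n, k, 1.0, 1.0)    # uniform Beta(1, 1) prior
bf_jeffreys = bayes_factor_01(n, k, 0.5, 0.5)   # Jeffreys Beta(1/2, 1/2) prior
print(f"BF01 (uniform prior) = {bf_uniform:.3f}, BF01 (Jeffreys prior) = {bf_jeffreys:.3f}")
```

The two priors give different Bayes factors for the same data, a small-scale version of the sensitivity discussed in the chapter.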
The chapter addresses testing when using models. We review linear models, generalized linear models, and proportional odds models, including issues such as checking model assumptions and separation (e.g., when one covariate completely predicts a binary response). We discuss the Neyman–Scott problem, that is, the bias that can arise in the estimate of a fixed parameter when the number of nuisance parameters grows with the sample size. With clustered data, we compare mixed effects models and marginal models, pointing out that for logistic regression and other models the fixed effect estimands are different in the two types of models. We present simulations showing that a model with many effects may be interpreted as a multiple testing situation, and that adjustments should often be made when testing for many effects in a model. We discuss model selection using methods such as Akaike’s information criterion, the lasso, and cross-validation. We compare different model selection processes and their effect on the Type I error rate for a parameter from the final chosen model.
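A minimal sketch of AIC-based model selection for linear models (the simulated data and the helper `aic_ols` are hypothetical constructions, not from the text): AIC trades off the Gaussian log-likelihood against the number of parameters.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)  # x2 is an irrelevant covariate

def aic_ols(X, y):
    """AIC for ordinary least squares under Gaussian errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n_obs = len(y)
    sigma2 = resid @ resid / n_obs  # MLE of the error variance
    loglik = -0.5 * n_obs * (np.log(2 * np.pi * sigma2) + 1)
    k = X.shape[1] + 1  # regression coefficients plus the variance
    return 2 * k - 2 * loglik

X_small = np.column_stack([np.ones(n), x1])        # intercept + x1
X_big = np.column_stack([np.ones(n), x1, x2])      # intercept + x1 + x2
print(f"AIC small = {aic_ols(X_small, y):.1f}, AIC big = {aic_ols(X_big, y):.1f}")
```

Selecting the model with the smaller AIC and then testing a parameter from it, without accounting for the selection step, is exactly the kind of post-selection inference whose effect on the Type I error rate the chapter examines.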