Measurements are often numerical in nature, which naturally leads to distributions on the real line. We start our discussion of such distributions in the present chapter, and in the process introduce the concept of a random variable, which is really a device to facilitate the writing of probability statements and the corresponding computations. We introduce objects such as the distribution function, survival function, and quantile function, any of which characterizes the underlying distribution.
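As a concrete (purely illustrative) complement, the empirical counterparts of these objects can be computed from a sample in a few lines of Python; the sample below is simulated and the choice of distribution is arbitrary.

```python
# Minimal sketch (illustrative only): empirical distribution, survival, and
# quantile functions computed from a simulated sample, using NumPy alone.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=1000)  # sample from N(0, 1)

def ecdf(sample, t):
    """Empirical distribution function: F(t) = proportion of the sample <= t."""
    return np.mean(sample <= t)

def survival(sample, t):
    """Empirical survival function: S(t) = 1 - F(t)."""
    return 1.0 - ecdf(sample, t)

def quantile(sample, p):
    """Empirical quantile function: a value t with F(t) approximately p."""
    return np.quantile(sample, p)

print(ecdf(x, 0.0), survival(x, 1.0), quantile(x, 0.5))
```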
Some experiments lead to considering not one, but several measurements. As before, each measurement is represented by a random variable, and these are stacked into a random vector. For example, in the context of an experiment that consists of flipping a coin multiple times, we defined in a previous chapter as many random variables as there are flips, each indicating the result of one coin flip. These are then concatenated to form a random vector, compactly describing the outcome of the entire experiment. Concepts such as conditional probability and independence are introduced.
We consider an experiment that yields, as data, a sample of independent and identically distributed (real-valued) random variables with a common distribution on the real line. The estimation of the underlying mean and median is discussed at length, and bootstrap confidence intervals are constructed. Tests comparing the underlying distribution to a given distribution (e.g., the standard normal distribution) or to a family of distributions (e.g., the normal family of distributions) are introduced. Censoring, which is very common in some clinical trials, is briefly discussed.
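To make the bootstrap step concrete, here is a minimal Python sketch of a percentile bootstrap confidence interval for the mean; the data, confidence level, and number of replicates are arbitrary illustrative choices, not necessarily the chapter's.

```python
# Minimal sketch (illustrative only): percentile bootstrap confidence
# interval for the mean of an i.i.d. sample.
import numpy as np

rng = np.random.default_rng(1)
sample = rng.exponential(scale=2.0, size=200)  # hypothetical data

B = 2000  # number of bootstrap replicates
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(B)
])

lo, hi = np.percentile(boot_means, [2.5, 97.5])  # 95% percentile interval
print(f"sample mean = {sample.mean():.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
```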
In this chapter we introduce some tools for sampling from a distribution. We also explain how to use computer simulations to approximate probabilities and, more generally, expectations, which can allow one to circumvent complicated mathematical derivations. The methods that are introduced include Monte Carlo sampling/integration, rejection sampling, and Markov Chain Monte Carlo sampling.
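The first two of these methods can be sketched in a few lines of Python; the target distribution and constants below are arbitrary illustrative choices.

```python
# Minimal sketch (illustrative only): Monte Carlo integration and rejection
# sampling.
import numpy as np

rng = np.random.default_rng(2)

# Monte Carlo integration: approximate E[X^2] for X ~ N(0, 1) (exact value 1).
x = rng.normal(size=100_000)
print("E[X^2] estimate:", np.mean(x ** 2))

# Rejection sampling from the Beta(2, 2) density f(u) = 6 u (1 - u) on [0, 1],
# using a Uniform(0, 1) proposal and the envelope constant M = max f = 1.5.
def target_density(u):
    return 6.0 * u * (1.0 - u)

M = 1.5
samples = []
while len(samples) < 10_000:
    u = rng.uniform()                            # draw from the proposal
    if rng.uniform() <= target_density(u) / M:   # accept with probability f(u)/M
        samples.append(u)
print("rejection-sampled mean:", np.mean(samples), "(exact mean is 0.5)")
```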
An expectation is simply a weighted mean, and means are at the core of Probability Theory and Statistics. In Statistics, in particular, such expectations are used to define parameters of interest. It turns out that an expectation can be approximated by an empirical average based on a sample from the distribution of interest, and the accuracy of this approximation can be quantified via what is referred to as concentration inequalities.
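For a sense of how such a quantification looks in practice, the following Python sketch compares the observed deviations of an empirical average with the bound given by Hoeffding's inequality for bounded variables; the distribution, sample size, and threshold are arbitrary illustrative choices.

```python
# Minimal sketch (illustrative only): Hoeffding's inequality for variables in
# [0, 1] says P(|mean - mu| >= t) <= 2 exp(-2 n t^2). Here X ~ Uniform(0, 1),
# so mu = 1/2.
import numpy as np

rng = np.random.default_rng(3)
n, t, reps = 1_000, 0.03, 10_000
deviations = np.abs(rng.uniform(size=(reps, n)).mean(axis=1) - 0.5)
empirical_tail = np.mean(deviations >= t)
hoeffding_bound = 2 * np.exp(-2 * n * t ** 2)
print(f"empirical tail = {empirical_tail:.4f}, Hoeffding bound = {hoeffding_bound:.4f}")
```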
We derive formulae for the moments of the time of ruin in both ordinary and modified Sparre Andersen risk models without specifying either the inter-claim time distribution or the individual claim amount distribution. We illustrate the application of our results in the special case of exponentially distributed claims, as well as for the following ordinary models: the classical risk model, phase-type(2) risk models, and the Erlang($n$) risk model. We also show how the key quantities for modified models can be found.
An empirical average will converge, in some sense, to the corresponding expectation. This famous result, called the Law of Large Numbers, can be anticipated based on the concentration inequalities introduced in the previous chapter, but some appropriate notions of convergence for random variables need to be defined in order to make a rigorous statement. Beyond mere convergence, the fluctuations of an empirical average around the associated expectation can be characterized by the Central Limit Theorem, and are known to be Gaussian in some asymptotic sense. The chapter also discusses the limit of extremes such as the maximum of a sample.
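Both phenomena are easy to check numerically; the sketch below (illustrative only, with an arbitrarily chosen exponential distribution) simulates a running average for the Law of Large Numbers and standardized averages for the Central Limit Theorem.

```python
# Minimal sketch (illustrative only): LLN and CLT for averages of i.i.d.
# Exponential(1) variables (mean 1, variance 1).
import numpy as np

rng = np.random.default_rng(4)

# LLN: the running average settles near the expectation, which is 1.
x = rng.exponential(size=100_000)
running_avg = np.cumsum(x) / np.arange(1, x.size + 1)
print("running average after 100, 10_000, 100_000 draws:",
      running_avg[99], running_avg[9_999], running_avg[-1])

# CLT: sqrt(n) * (average - 1) is approximately standard normal for large n.
n, reps = 500, 20_000
z = np.sqrt(n) * (rng.exponential(size=(reps, n)).mean(axis=1) - 1.0)
print("standardized averages: mean", z.mean(), "std", z.std())
```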
This study assessed neonatal visual maturity in infants with congenital heart disease (CHD) and its predictive value for neurodevelopmental outcomes. Neonates with CHD underwent a standardized visual assessment before and after cardiopulmonary bypass surgery. Visual maturity was rated as normal versus abnormal by means of normative reference data. Twelve-month neurodevelopment was assessed with the Bayley-III. Twenty-five healthy controls served as the reference group. Neonatal visual assessment was performed in five neonates with CHD preoperatively and in 24 postoperatively. Only postoperative assessments were considered for further analysis. Median [IQR] age at assessment was 27.0 [21.5, 42.0] days of life in postoperative neonates with CHD and 24.0 [15.0, 32.0] in controls. Visual performance was within reference values in 87.5% of postoperative neonates with CHD versus 90.5% of healthy controls (p = 1.0). Visual maturity was not predictive of neurodevelopment at 12 months. These results demonstrate the limited feasibility and predictive value of neonatal visual assessments in CHD.
We consider a collection of statistically identical two-state continuous time Markov chains (channels). A controller continuously selects a channel with a view to maximizing the infinite-horizon average reward. A switching cost is paid upon channel changes. We consider two cases: full observation (all channels observed simultaneously) and partial observation (only the current channel observed). We analyze the difference in performance between these cases for various policies. For the partial observation case with two channels or an infinite number of channels, we explicitly characterize an optimal threshold for two sensible policies, which we name “call-gapping” and “cool-off.” Our results present a qualitative view on the interaction of the number of channels, the available information, and the switching costs.
Stochastic processes model experiments whose outcomes are collections of variables organized in some fashion. We focus here on Markov processes, which include random walks (think of the fortune, over time, of a person gambling on black/red at roulette) and branching processes (think of the behavior of a population of an asexual species in which each individual gives birth to a number of otherwise identical offspring according to a given probability distribution).
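Both examples can be simulated in a few lines; the Python sketch below is purely illustrative, with an American-roulette win probability of 18/38 and an arbitrarily chosen Poisson(1.1) offspring distribution.

```python
# Minimal sketch (illustrative only): a gambler's random walk and a
# Galton-Watson branching process.
import numpy as np

rng = np.random.default_rng(5)

# Random walk: fortune of a gambler betting one unit on red each round,
# winning with probability 18/38, stopped at ruin or after 1,000 bets.
fortune, p_win = 100, 18 / 38
bets = 0
while fortune > 0 and bets < 1_000:
    fortune += 1 if rng.uniform() < p_win else -1
    bets += 1
print("fortune after", bets, "bets:", fortune)

# Branching process: each individual has a Poisson(1.1) number of offspring.
population = 1
for generation in range(20):
    population = rng.poisson(lam=1.1, size=population).sum() if population > 0 else 0
print("population after 20 generations:", int(population))
```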
In this chapter we consider distributions on the real line that have a discrete support. It is indeed common to count certain occurrences in an experiment, and the corresponding counts are invariably integer-valued. In fact, all the major distributions of this type are supported on the (non-negative) integers. We introduce the main ones here.
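For reference, the probability mass functions of a few such distributions are directly available in Python via scipy.stats; the parameter values below are arbitrary.

```python
# Minimal sketch (illustrative only): evaluating the probability mass function
# of some common integer-supported distributions.
from scipy import stats

print("Binomial(10, 0.3): P(X = 3) =", stats.binom.pmf(3, 10, 0.3))
print("Geometric(0.3):    P(X = 3) =", stats.geom.pmf(3, 0.3))
print("Poisson(2.5):      P(X = 3) =", stats.poisson.pmf(3, 2.5))
```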
We consider an experiment resulting in two paired numerical variables. The general goal addressed in this chapter is that of quantifying the strength of association between these two variables. By association we mean dependence. In contrast with the previous chapter, here the two variables can be measurements of completely different kinds (e.g., height and weight). Several measures of association are introduced, and used to test for independence.
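Two standard such measures (not necessarily the chapter's exact choices) are the Pearson and Spearman correlations, which come with associated tests of independence; the paired height/weight data below are simulated for illustration.

```python
# Minimal sketch (illustrative only): Pearson and Spearman correlation on
# hypothetical paired height (cm) / weight (kg) data, with p-values for the
# corresponding tests of independence.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
height = rng.normal(170, 10, size=100)
weight = 0.9 * (height - 100) + rng.normal(0, 5, size=100)

r, p_pearson = stats.pearsonr(height, weight)
rho, p_spearman = stats.spearmanr(height, weight)
print(f"Pearson r = {r:.2f} (p = {p_pearson:.3g}), "
      f"Spearman rho = {rho:.2f} (p = {p_spearman:.3g})")
```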
In some areas of mathematics, physics, and elsewhere, continuous objects and structures are often motivated, or even defined, as limits of discrete objects. For example, in mathematics, the real numbers are defined as the limit of sequences of rational numbers, and in physics, the laws of thermodynamics arise as the number of particles in a system tends to infinity (the so-called thermodynamic or macroscopic limit). Taking certain discrete distributions (discussed in the previous chapter) to their continuous limits, which is done by letting their support size increase to infinity in a controlled manner, gives rise to continuous distributions on the real line. We introduce and discuss such distributions in this chapter, including the normal (aka Gaussian) family of distributions, and in the process cover probability densities.
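One classical instance of such a limit is the de Moivre-Laplace theorem, in which suitably standardized binomial probabilities approach the standard normal density; the short Python check below is illustrative only.

```python
# Minimal sketch (illustrative only): rescaled Binomial(n, 1/2) probabilities
# approach the standard normal density as n grows (de Moivre-Laplace).
import numpy as np
from scipy import stats

for n in (10, 100, 1000):
    k = np.arange(n + 1)
    z = (k - n / 2) / np.sqrt(n / 4)                    # standardized counts
    rescaled_pmf = stats.binom.pmf(k, n, 0.5) * np.sqrt(n / 4)
    max_gap = np.max(np.abs(rescaled_pmf - stats.norm.pdf(z)))
    print(f"n = {n:4d}, max gap to the normal density = {max_gap:.4f}")
```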
We consider in this chapter experiments where the variables of interest are paired. Importantly, we assume that these variables are directly comparable (in contrast with the following two chapters). Crossover trials are important examples of such experiments. The main question of interest here is that of exchangeability, which reduces to testing for symmetry when there are only two variables.
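One classical test of symmetry for two paired variables (not necessarily the one emphasized in the chapter) is the sign test applied to the within-pair differences; the crossover-style data below are simulated for illustration.

```python
# Minimal sketch (illustrative only): a sign test for symmetry of the
# within-pair differences in hypothetical two-period crossover data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
period_1 = rng.normal(10.0, 2.0, size=40)
period_2 = period_1 + rng.normal(0.5, 1.0, size=40)   # hypothetical effect

diffs = period_2 - period_1
n_positive = int(np.sum(diffs > 0))
n_nonzero = int(np.sum(diffs != 0))
# Under symmetry about 0, the count of positive differences is Binomial(n, 1/2).
result = stats.binomtest(n_positive, n_nonzero, p=0.5)
print(f"{n_positive}/{n_nonzero} positive differences, p = {result.pvalue:.3f}")
```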
When a die (with 3 or more faces) is rolled, the result of each trial can take as many possible values as there are faces. The same is true in the context of an urn experiment when the balls in the urn are of multiple different colors. Such models are broadly applicable. Indeed, even 'yes/no' polls almost always include at least one other option, such as 'not sure' or 'no opinion'. Another situation where discrete variables arise is when two or more coins are compared in terms of their chances of landing heads, or more generally, when two or more (otherwise identical) dice are compared in terms of their chances of landing on a particular face. In terms of urn experiments, the analog is a situation where balls are drawn from multiple urns. This sort of experiment can be used to model clinical trials where several treatments are compared and the outcome is dichotomous. When the coins are tossed together, or when the dice are rolled together, we might want to test for independence. We thus introduce some classical tests for comparing multiple discrete distributions and for testing for the independence of two or more discrete variables that are observed together.
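One such classical test is the chi-squared test of independence, sketched below in Python on a hypothetical contingency table; the counts are invented for illustration and are not from the text.

```python
# Minimal sketch (illustrative only): chi-squared test of independence between
# treatment group and a three-category response ('yes', 'no', 'not sure').
import numpy as np
from scipy import stats

# Rows: two treatments; columns: 'yes', 'no', 'not sure'.
table = np.array([[30, 50, 20],
                  [45, 40, 15]])
chi2, p_value, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.3f}")
```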