In embarking on statistics we are entering a vast area, enormously developed for the Gaussian distribution in particular. This is classical territory; historically, statistics were developed because the approach now called Bayesian had fallen out of favour. Hence, direct probabilistic inferences were superseded by the indirect and conceptually different route, going through statistics and intimately linked to hypothesis testing. The use of statistics is not particularly easy. The alternatives to Bayes' methods are subtle and not very obvious; they are also associated with some fairly formidable mathematical machinery. We will avoid this, presenting only results and showing the use of statistics, while trying to make clear the conceptual foundations.
Statistics
Statistics are designed to summarize, reduce or describe data. The formal definition of a statistic is that it is some function of the data alone. For a set of data X1, X2, …, some examples of statistics might be the average, the maximum value or the average of the cosines. Statistics are therefore combinations of finite amounts of data. In the following discussion, and indeed throughout, we try to distinguish particular fixed values of the data, and functions of the data alone, by upper case (except for Greek letters). Possible values, being variables, we will denote in the usual algebraic spirit by lower case.
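The three statistics named above can be written as plain functions of the data, which is all the formal definition asks for. The following is a minimal sketch only; the data values and the function names are hypothetical and chosen purely for illustration.

```python
import math

def average(data):
    """Arithmetic mean: a function of the data alone."""
    return sum(data) / len(data)

def average_of_cosines(data):
    """Mean of cos(X_i): another function of the data alone."""
    return sum(math.cos(x) for x in data) / len(data)

# Hypothetical data X1, X2, ... (illustrative values only)
X = [0.0, 1.0, 2.0, 3.0]

statistics_of_X = {
    "average": average(X),
    "maximum": max(X),
    "average of cosines": average_of_cosines(X),
}

for name, value in statistics_of_X.items():
    print(f"{name}: {value:.4f}")
```

None of these functions takes a model parameter as input; each depends only on the finite set of data supplied.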
(interchange between Peter Scheuer and his then student, CRJ)
(The) premise that statistical significance is the only reliable indication of causation is flawed.
(US Supreme Court, Matrixx Initiatives, Inc. vs. Siracusano, 22 March 2011)
It is often the case that we need to do sample comparison: we have someone else's data to compare with ours; or someone else's model to compare with our data; or even our data to compare with our model. We need to make the comparison and to decide something. We are doing hypothesis testing – are our data consistent with a model, with somebody else's data? In searching for correlations as we were in Chapter 4, we were hypothesis testing; in the model-fitting of Chapter 6 we are involved in data modelling and parameter estimation.
A frequentist point of view might be to consider the entire science of statistical inference as hypothesis testing followed by parameter estimation. However, if experiments were properly designed, the Bayesian approach would be right: it answers the sample-comparison questions we wished to pose in the first place, namely what is the probability, given the data, that a particular model is right? Or: what is the probability, given two sets of data, that they agree? The two-stage process should be unnecessary at best.
It is difficult to understand why statisticians commonly limit their inquiries to Averages, and do not revel in more comprehensive views.
(Francis Galton, 1889)
When we make a set of measurements, it is instinct to try to correlate the observations with other results. One or more motives may be involved in this instinct. For instance we might wish (a) to check that other observers' measurements are reasonable, (b) to check that our measurements are reasonable, (c) to test a hypothesis, perhaps one for which the observations were explicitly made, or (d) in the absence of any hypothesis, any knowledge or anything better to do with the data, to find if they are correlated with other results in the hope of discovering some new and universal truth.
The fishing trip
Take the last point first. Suppose that we have plotted something against something, on a fishing expedition of this type. There are grave dangers on this expedition, and we must ask ourselves the following questions.
Does the eye see much correlation? If not, calculation of a formal correlation statistic is probably a waste of time.
Could the apparent correlation be due to selection effects? Consider, for instance, the beautiful correlation in Figure 4.1, in which Sandage (1972) plotted radio luminosities of sources in the 3CR catalogue as a function of distance modulus. […]
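How a selection effect can manufacture a correlation may be seen in a toy simulation. The sketch below is not a reconstruction of the Sandage (1972) data: it assumes luminosities and distances drawn independently and uniformly in the logarithm, and a hypothetical flux limit, with all names and numbers invented for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)  # fixed seed so the sketch is repeatable
N = 20000

# Assumption: log luminosity and log distance are independent (arbitrary units).
log_L = [rng.uniform(0.0, 4.0) for _ in range(N)]
log_d = [rng.uniform(0.0, 2.0) for _ in range(N)]

# Flux scales as L / d^2, so log flux = log L - 2 log d (constant dropped).
LOG_FLUX_LIMIT = 1.0  # hypothetical catalogue limit
detected = [(l, d) for l, d in zip(log_L, log_d) if l - 2.0 * d > LOG_FLUX_LIMIT]

r_all = pearson(log_d, log_L)
r_detected = pearson([d for _, d in detected], [l for l, _ in detected])

print(f"parent population: r = {r_all:.3f}")
print(f"flux-limited sample ({len(detected)} objects): r = {r_detected:.3f}")
```

In the parent population the coefficient is close to zero by construction; in the flux-limited sample it is clearly positive, because distant low-luminosity objects never enter the catalogue.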
An examination of the distribution of the numbers of galaxies recorded on photographic plates shows that it does not conform to the Poisson law and indicates the presence of a factor causing ‘contagion’.
(Neyman et al. 1953)
God not only plays dice. He also sometimes throws the dice where they cannot be seen.
(Stephen Hawking)
The distribution of objects on the celestial sphere, or on an imaged patch of this sphere, has ever been a major preoccupation of astronomers. Avoiding here the science of image processing, province of thousands of books and papers, we consider some of the common statistical approaches used to quantify sky distributions in order to permit contact with theory. Before we turn to the adopted statistical weaponry of galaxy distribution, we discuss some general statistics applicable to the spherical surface.
Statistics on a spherical surface
The distribution of objects on the celestial sphere is the distribution of directions of a set of unit vectors. Many other 3D spaces face similar issues of distribution, such as the Poincaré sphere with unit vectors indicating the state of polarization of radiation. Geophysical topics (orientation of palaeomagnetism, for instance) motivate much analysis.
Thus, this is a thriving sub-field of statistics and there is an excellent handbook (Fisher et al., 1987). The emphasis is on statistical modelling and a variety of distributions is available.
Whether He does or not, the concepts of probability are important in astronomy for two reasons.
Astronomical measurements are subject to random measurement error, perhaps more so than most physical sciences because of our inability to re-run experiments and our perpetual wish to observe at the extreme limit of instrumental capability. We have to express these errors as precisely and usefully as we can. Thus, when we say ‘an interval of 10⁻⁶ units, centred on the measured mass of the Moon, has a 95 per cent chance of containing the true value’, it is a much more quantitative statement than ‘the mass of the Moon is 1 ± 10⁻⁶ units’. The second statement really only means anything because of some unspoken assumption about the distribution of errors. Knowing the error distribution allows us to assign a probability, or measure of confidence, to the answer.
The inability to do experiments on our subject matter leads us to draw conclusions by contrasting properties of controlled samples. These samples are often small and subject to uncertainty in the same way that a Gallup poll is subject to ‘sampling error’. In astronomy we draw conclusions such as: ‘the distributions of luminosity in X-ray-selected Type I and Type II objects differ at the 95 per cent level of significance.’ Very often the strength of this conclusion is dominated by the number of objects in the sample and is virtually unaffected by observational error.
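The first of these points, that a probability statement about an interval rests on a known error distribution, can be checked numerically. The sketch below assumes Gaussian errors with a known standard deviation; that assumption, the numerical values and the variable names are ours and are not taken from the text.

```python
import random
from statistics import NormalDist

TRUE_VALUE = 10.0   # hypothetical true value, arbitrary units
SIGMA = 0.5         # assumed known Gaussian standard error
N_TRIALS = 20000

# Half-width enclosing 95 per cent of a Gaussian (about 1.96 sigma).
half_width = NormalDist().inv_cdf(0.975) * SIGMA

rng = random.Random(42)  # fixed seed so the sketch is repeatable
hits = 0
for _ in range(N_TRIALS):
    measured = rng.gauss(TRUE_VALUE, SIGMA)
    # Does the interval centred on the measurement contain the true value?
    if abs(measured - TRUE_VALUE) <= half_width:
        hits += 1

coverage = hits / N_TRIALS
print(f"fraction of intervals containing the true value: {coverage:.3f}")
```

If the errors were not Gaussian, the same half-width would in general give a different coverage, which is why a bare ‘±’ carries little meaning without the distribution.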
Of the vast literature, we point to some works which we have found useful, enlightening or just plain entertaining. We bin these into six types (somewhat arbitrarily as there is much overlap): popular, the basic text, the rigorous text, the data analysis manual, the texts considering statistical packages, and the statistics treatments of specialist interest to astronomers.
The classic popular books have legendary titles: How to Lie with Statistics (Huff, 1973), Facts from Figures (Moroney, 1965), Statistics in Action (Sprent, 1977) and Statistics without Tears (Rowntree, 1981). They are all fun. To this list we can now add The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century (Salsburg, 2002), an entertaining exposition of the development of modern statistics; Struck by Lightning: the Curious World of Probabilities (Rosenthal, 2006); Making Sense of Statistics: A Non-mathematical Approach (Wood, 2003); and Dicing with Death: Chance, Risk and Health (Senn, 2003). The last of these is a devastatingly blunt, funny and erudite exposition of the importance and application of statistics in decision processes which may affect the lives of millions. As a popular book it is heavy-going in parts; but for scientists, budding or mature, it is a rewarding read.
Textbooks come in types (a) and (b), both of which cover similar material for the first two-thirds of each book. They start with descriptive or summarizing statistics (mean, standard deviation), the distributions of these statistics, and move to the concept of probability and hence statistical inference and hypothesis testing, including correlation of two variables. […]
Watson, you are coming along wonderfully. You have really done very well indeed. It is true that you have missed everything of importance, but you have hit upon the method.
(Sherlock Holmes in ‘A Case of Identity’, Sir Arthur Conan Doyle)
By a small sample we may judge of the whole piece.
(Don Quixote, Miguel de Cervantes)
‘Detection’ is one of the commonest words in the practising astronomer's vocabulary. It is the preliminary to much else that happens in astronomy, whether it means locating a spectral line, a faint star or a gamma-ray burst. Indeed, of its wide range of meanings, here we take the location, and confident measurement, of some sort of feature in a fixed region of an image or spectrum. When a detection is obvious to even the most sceptical referee, statistical questions usually do not arise in the first instance. The parameters that result from such a detection have signal-to-noise ratio so high that the detection finds its way into the literature as fact. However, elusive objects or features at the limit of detectability tend to become the focus of interest in any branch of astronomy. Then, the notion of detection (and non-detection) requires careful examination and definition.
Non-detections are especially important because they define how representative any catalogue of objects may be. This set of non-detections can represent vital information in deducing the properties of a population of objects; if something is never detected, that too is a fact, and can be exploited statistically.
Phase space is a good model of what we are interested in when we talk about a certain range of possibilities. Not all possibilities can be captured in phase space, but, for certain purposes, many of the interesting ones can be. In this chapter, I will introduce an application of phase space that we have not yet seen, and which is its most powerful and useful one: modelling certain types of probabilities or chances.
The leaking tyre
You wake up to find yourself in a closed room, isolated from outside causal influence. In the middle of the room is a bicycle tyre. Because all else is quiet, you can hear that the tyre is hissing very gently, and as you go over to it, you are able to locate the small stream of air that is leaking out of a tiny hole, producing the noise. You judge, by a squeeze of the tyre, that it will take some time before the air will stop hissing out of the tyre.
What is the macro-condition of the room right now? It is one in which there is a large volume of gas at relatively low pressure, and a small volume of gas in the tyre, at relatively high pressure. Moreover, there is a small aperture between these two volumes of gas.
Suppose you were offered the chance to play a simple gambling game, in which you are invited to bet on the outcome of a die-roll. There are only two bets allowed. You can wager that the die will land 6, or you can wager that it will land any of 1, 2, 3, 4, or 5. In either case, if your wager is successful, you will win the same prize: one dollar.
So the bets are:
Die lands 1–5: pays $1.
Die lands 6: pays $1.
Assume that you know, moreover, that the die has no significant asymmetry in its construction. It does not have a physical bias to one or more sides.
Which bet ought you to take? Assuming you would prefer more money to less, it is obvious that you ought to take the bet on 1–5, rather than the bet on 6.
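The reasoning behind this choice can be made explicit. The sketch below assumes that a die with no physical bias gives each face equal probability; that step is an assumption about how to read the scenario, and the names used are hypothetical.

```python
from fractions import Fraction

FACES = range(1, 7)
PRIZE = 1  # dollars paid on a successful wager

def win_probability(winning_faces):
    """Probability of winning, assuming all six faces are equally likely."""
    favourable = sum(1 for face in FACES if face in winning_faces)
    return Fraction(favourable, len(FACES))

bets = {
    "1-5": {1, 2, 3, 4, 5},
    "6": {6},
}

# Expected winnings in dollars for each bet.
expected = {name: win_probability(faces) * PRIZE for name, faces in bets.items()}

for name, value in expected.items():
    print(f"bet on {name}: expected winnings ${value}")
```

Both bets pay the same prize, so the comparison reduces to the probabilities of winning: 5/6 against 1/6.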
Now suppose that you really do play this game, and you play it at the same time as a friend. You sensibly choose to bet on 1–5. Your friend, bizarrely, insists that she has a hunch that the die will land 6; so that is the bet she takes. The die lands 6. Your friend wins.
In the previous chapter we examined phase space as a way of representing mechanical possibilities. These are the ways the world might be, assuming something like the classical picture is correct, and assuming that the world contains a certain number of particles.
Phase space does not capture all of the possibilities, but it is – I suggest – a useful set of possibilities for certain purposes. In this chapter, I want to focus on the limitations of phase space, and similarly constructed spaces of possibilities. Knowing the limits of these approaches, we'll be better placed to use them with confidence in an analysis of chance.
Propositions in phase space
Typically, when we entertain thoughts about the world, it is not in the same degree of detail as the highly specific possibilities that were introduced in the previous chapter. For instance, we might wonder about whether or not there is an elephant in the room. If we try to identify, in terms of mechanical possibilities, which mechanical possibility correlates with an elephant being in the room, then we find that there is a mismatch between the specificity of our thoughts and the mechanical possibilities. There are lots of different mechanical ways you can arrange particles in space so as to have an elephant in the room. So what might have seemed like ‘one’ possibility turns out to correspond to many different mechanical possibilities.
We understand the actual world only when we can locate it accurately in logical space.
(Bigelow and Pargetter 1990)
Possibilism
In the opening chapter, I characterised chance as the degree of belief recommended by the best identifiable advice function, given the available evidence. In the following chapters, I have sketched the sort of conceptual tools used by physicists to obtain probabilities in classical statistical mechanics: a measure over phase space. So a naive response to this presentation is to think that classical statistical mechanics actually tells us what makes a fact of chance. It is a measure over a space of possibilities. The space of possibilities contains the states the system might be in, given the available evidence. Given the right measure – something that we have confirmed by experience – chances are just facts about the relative measures of different macro-conditions. Call this the modal volume theory of chance. In order to assess this idea adequately, we first need to unpack it.
Chances are ratios of volumes
Given I toss a coin, what is the chance that it will land heads? The answer has something to do with two sets of possibilities: that in which I toss a coin, and that in which I toss a coin and it lands heads.
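A toy version of this ratio can be written down directly. The sketch below is an illustration only and not the author's model: it assumes a small discrete space of possibilities, a counting measure that weights each possibility equally, and a coin that starts heads-up and lands heads after an even number of half-turns. All of these are hypothetical choices.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy space: each possibility is a pair (tossed, half_turns).
HALF_TURNS = range(0, 40)
possibilities = list(product([True, False], HALF_TURNS))

def measure(region):
    """Counting measure: every possibility has equal weight (an assumption)."""
    return len(region)

def lands_heads(possibility):
    tossed, half_turns = possibility
    # The coin starts heads-up, so an even number of half-turns gives heads.
    return tossed and half_turns % 2 == 0

# The two sets of possibilities named in the text.
toss = [p for p in possibilities if p[0]]
toss_and_heads = [p for p in toss if lands_heads(p)]

chance_heads = Fraction(measure(toss_and_heads), measure(toss))
print(f"chance of heads given a toss: {chance_heads}")
```

The possibilities in which no toss occurs play no part in the ratio; only the relative measure of the two nested sets matters.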