Everything should be made as simple as possible, but not simpler.
Attributed to Einstein
We can use what we have learnt to start making some inferences about data. Maybe we have collected measurements of a quantity and wish to see if these are consistent with some theoretical expectation: we don't just want to compute the sample mean, we want to compare it with something else. Perhaps we have two samples, taken under different conditions (such as a ‘treatment’ and a ‘control’ group), and wish to see if their mean responses differ. Another very common situation is that we have measurements of some response (y) taken at different values of some explanatory variable (x) and wish to quantify the way that y responds. We can go some way to getting useful inferences out of such data using numerical and graphical summaries (Chapter 2). These can be refined once we have studied some probability theory (Chapters 4 and 5).
Inference about the mean of a sample
We take repeated measurements of a single quantity, or measure the same quantity for each member of a finite sample, and wish to discover whether these data are consistent with a predetermined theoretical value. We want to know if our sample is consistent with being randomly drawn from a theoretical population with some particular population mean. As an example, let's consider the first ‘experiment’ (batch of 20 runs) of Michelson's dataset (see Appendix B, section B.1).
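As a taste of the calculations to come, this comparison takes only a few lines in R. The sketch below uses R's built-in morley data frame, which may differ in detail from the version described in Appendix B; it records Speed as km/s minus 299 000, so the modern value of the speed of light (299 792.458 km/s) corresponds to an offset of 792.458.

    speed1 <- morley$Speed[morley$Expt == 1]  # first batch of 20 runs
    mean(speed1)                              # sample mean of the batch
    sd(speed1) / sqrt(length(speed1))         # standard error of the mean
    t.test(speed1, mu = 792.458)              # compare with the modern value

Here t.test() performs a one-sample t test of the hypothesis that the population mean equals the value supplied as the mu argument.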
Dotted throughout the book are extracts of computer code that show how to perform the calculations under discussion. The examples are based on specific problems discussed in the text, but should be clear enough that they can also be used, with very little effort, for ‘real life’ data analysis problems. The computer codes are written in the R environment, which is introduced in this appendix.
What is R?
R is an environment for statistical computation and data analysis. You can think of it as a suite of software for manipulating data, producing plots and performing calculations, with a very wide range of powerful statistical tools. But it is also a programming language, so you can construct your own analyses with a little programming effort. It is one of the standard packages used by statisticians (professional and academic). To install R, visit www.r-project.org/.
A first R session
First of all, start R, either by typing R at the command prompt (e.g. Linux) or double-clicking on the relevant icon (e.g. Windows).
A typical R session involves typing some commands into a ‘console’ window and viewing the text and/or graphical output (which may appear in a pop-up window). The prompt is usually a ‘>’ sign, but can be changed if desired. At the prompt you can enter commands to execute. Virtually all commands in R have a command(arguments) format, where the name of the command is followed by some arguments enclosed in brackets (if there are no arguments, the brackets are still present but empty).
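By way of illustration (this particular session is not from the text), a first session might look like the following, where ‘>’ is the prompt and the unprompted lines are R's responses:

    > x <- c(850, 740, 900)   # store three values in a vector named x
    > mean(x)                 # call mean() with one argument
    [1] 830
    > ls()                    # list objects; no arguments, but brackets remain
    [1] "x"
    > q()                     # quit the session

Note the command(arguments) format throughout, including the empty brackets of ls() and q().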
It is remarkable that a science which began with the consideration of games of chance should have become the most important object of human knowledge.
Pierre-Simon Laplace (1812) Théorie Analytique des Probabilités
Why should a scientist bother with statistics? Because science is about dealing rigorously with uncertainty, and the tools to accomplish this are statistical. Statistics and data analysis are an indispensable part of modern science.
In scientific work we look for relationships between phenomena, and try to uncover the underlying patterns or laws. But science is not just an ‘armchair’ activity in which we can make progress by pure thought: our ideas about the workings of the world must somehow be connected to what actually goes on in the world. Scientists perform experiments and make observations to look for new connections, test ideas, estimate quantities or identify qualities of phenomena. However, experimental data are never perfect. Statistical data analysis is the set of tools that helps scientists handle the limitations and uncertainties that always come with data. The purpose of statistical data analysis is insight, not just numbers. (That's why this book is called Scientific Inference and not something like Statistics for Physics.)
Scientific method
Broadly speaking, science is the investigation of the physical world and its phenomena by experimentation. There are different schools of thought about the philosophy of science and the scientific method, but there are some elements that almost everyone agrees are components of the scientific method.
The generation of random numbers is too important to be left to chance.
Title of an article by Coveyou (1969)
In the preceding chapters, we have discussed ways to estimate various statistics that summarise data and/or hypotheses, such as sample means and variances, parameters of models, their distributions, confidence intervals and p-values from goodness-of-fit tests. We can calibrate these if we know the sampling distribution of the relevant statistic: that is, we can place the observed value in the distribution expected (under a given hypothesis) and assess whether it lies in the expected range. For example, to compute a p-value from a goodness-of-fit test we need to know the distribution of the test statistic, and to find the variance (or bias) of an estimator we need to know the sampling distribution of that estimator. These follow from the distribution of the data and the mathematical relationship between the data and the statistic. Often this is difficult, and sometimes impossible, to work out analytically. But the Monte Carlo method makes many of these problems tractable, and provides a powerful tool for analysing data and for understanding the properties of analysis procedures and experiments.
The core of the Monte Carlo method is to generate random data and use them to compute estimates of derived quantities. We can use the Monte Carlo method to evaluate integrals, explore the distributions of estimators and estimate other properties of sampling distributions.
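As a minimal sketch of the idea (an illustrative case, not one worked in the text), the following R fragment simulates the sampling distribution of the sample median for twenty exponentially distributed data points, a distribution that is awkward to derive analytically:

    set.seed(1)                       # make the simulation repeatable
    m <- replicate(10000, median(rexp(20, rate = 1)))
    mean(m)                           # mean of the simulated sampling distribution
    sd(m)                             # its standard deviation
    hist(m, breaks = 50)              # visualise the sampling distribution

Comparing mean(m) with the true median of the exponential distribution, log(2) ≈ 0.693, gives a Monte Carlo estimate of the estimator's bias.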
An important difference between the classical and quantum perspectives is their different criteria of distinguishability. Identical particles are classically distinguishable when separated in phase space. On the other hand, identical particles are always quantum mechanically indistinguishable for the purpose of counting distinct microstates. But these concepts and these distinctions do not tell the whole story of how we count the microstates and determine the multiplicity of a quantized system.
There are actually two different ways of counting the accessible microstates of a quantized system of identical, and so indistinguishable, particles. While these two ways were discovered in the years 1924–1926 independently of Erwin Schrödinger’s (1887–1961) invention of wave mechanics in 1926, their most convincing explanation is in terms of particle wave functions. The following two paragraphs may be helpful to those familiar with the basic features of wave mechanics.
A system of identical particles has, as one might expect, a probability density that is symmetric under particle exchange, that is, the probability density is invariant under the exchange of two identical particles. But here wave mechanics surprises the classical physicist. A system wave function may either keep the same sign or change signs under particle exchange. In particular, a system wave function may be either symmetric or antisymmetric under particle exchange.
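A small enumeration may make the distinction concrete (an illustrative count, not an example from the text). For two identical particles and three single-particle states, classical (distinguishable) counting gives 3^2 = 9 microstates; symmetric (Bose-Einstein) counting merges the ordered pairs, leaving 6; antisymmetric (Fermi-Dirac) counting also forbids double occupancy, leaving 3. In R:

    states <- expand.grid(p1 = 1:3, p2 = 1:3)        # all ordered assignments
    nrow(states)                                     # classical count: 9
    sym <- unique(t(apply(states, 1, sort)))         # merge ordered pairs
    nrow(sym)                                        # Bose-Einstein count: 6
    nrow(sym[sym[, 1] != sym[, 2], , drop = FALSE])  # Fermi-Dirac count: 3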
The existence of entropy follows inevitably from the first and second laws of thermodynamics. However, our purpose is not to reproduce this deduction, but rather to focus on the concept of entropy, its meaning and its applications. Entropy is a central concept for many reasons, but its chief function in thermodynamics is to quantify the irreversibility of a thermodynamic process. Each term in this phrase deserves elaboration. Here we define thermodynamics and process; in subsequent sections we take up irreversibility. We will also learn how entropy or, more precisely, differences in entropy tell us which processes of an isolated system are possible and which are not.
Thermodynamics is the science of macroscopic objects composed of many parts. The very size and complexity of thermodynamic systems allow us to describe them simply in terms of a mere handful of equilibrium or thermodynamic variables, for instance, pressure, volume, temperature, mass or mole number, internal energy, and, of course, entropy. Some of these variables are related to others via equations of state in ways that differently characterize different kinds of systems, whether gas, liquid, solid, or composed of magnetized parts.
The thermodynamic view of a physical system is the “black box” view. We monitor the input and output of a black box and measure its superficial characteristics with the human-sized instruments available to us: pressure gauges, thermometers, and meter sticks. The laws of thermodynamics govern the relations among these measurements. For instance, the zeroth law of thermodynamics requires that two black boxes each in thermal equilibrium with a third are in thermal equilibrium with each other, the first law that the energy of an isolated black box can never change, and the second law that the entropy of an isolated black box can never decrease. According to these laws and these measurements each black box has an entropy function S(E, V, …) whose dependence on a small set of variables encapsulates all that can be known of the black box system.
But we are not satisfied with black boxes – especially when they work well! We want to look inside a black box and see what makes it work. Yet when we first look into the black box of a thermodynamic system we see even more thermodynamic systems. A building, for instance, is a thermodynamic system. But so also is each room in the building, each cabinet in each room, and each drawer in each cabinet. But actual thermodynamic systems cannot be subdivided indefinitely. At some point the concepts and methods of thermodynamics cease to apply. Eventually the subdivisions of a thermodynamic system cease to be smaller thermodynamic systems and instead become groups of atoms and molecules.
One of the most important contributions that the science of astronomy has made to human progress is an understanding of the distance and size of celestial objects. After millennia of using our eyes and about four centuries of using telescopes, we now have a very good idea of where we are in the Universe and how our planet fits in among the other bodies in our Solar System, the Milky Way galaxy, and the Universe. Several of the techniques astronomers use to estimate distance and size are based on angles, and the purpose of this chapter is to make sure you understand the mathematical foundation of these techniques. Specifically, the concepts of parallax and angular size are discussed in the first two sections of this chapter, and the third section describes the angular resolution of astronomical instruments.
Parallax
Parallax is a perspective phenomenon that makes a nearby object appear to shift position with respect to more distant objects when the observation point is changed. This section begins with an explanation of the parallax concept and proportionality relationships and concludes with examples of parallax calculations relevant to astronomy.
Parallax concept
You can easily demonstrate the effect of parallax by holding your index finger upright at arm's length and then observing that finger and the background behind it with your left eye open and your right eye closed.
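The same geometry gives distances to stars. For small angles, distance ≈ baseline / parallax angle (with the angle in radians), and astronomers use the Earth-Sun distance as the baseline. As a rough illustration (the numbers are approximate and the fragment is not from the text), the R lines below recover the distance to Proxima Centauri from its measured parallax of about 0.77 arcseconds:

    arcsec <- pi / (180 * 3600)     # one arcsecond in radians
    p <- 0.7687 * arcsec            # parallax of Proxima Centauri (approx.)
    baseline <- 1.496e8             # Earth-Sun distance (1 au) in km
    d <- baseline / p               # small-angle approximation, in km
    d / 9.461e12                    # convert to light years: about 4.2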
The mathematician John von Neumann once urged the information theorist Claude Shannon to assign the name entropy to the measure of uncertainty Shannon had been investigating. After all, a structurally identical measure with the name entropy had long been an element of statistical mechanics. Furthermore, “No one really knows what entropy really is, so in a debate you will always have the advantage.” Most of us love clever one-liners and allow each other to bend the truth in making them. But von Neumann was wrong about entropy. Many people have understood the concept of entropy since it was first discovered 150 years ago.
Actually, scientists have no choice but to understand entropy because the concept describes an important aspect of reality. We know how to calculate and how to measure the entropy of a physical system. We know how to use entropy to solve problems and to place limits on processes. We understand the role of entropy in thermodynamics and in statistical mechanics. We also understand the parallelism between the entropy of physics and chemistry and the entropy of information theory.
But von Neumann’s witticism contains a kernel of truth: entropy is difficult, if not impossible, to visualize. Consider that we are able to invest the concept of the energy of a rod of iron with meaning by imagining the rod broken into its smallest parts, atoms of iron, and comparing the energy of an iron atom to that of a macroscopic, massive object attached to a network of springs that model the interactions of the atom with its nearest neighbors. The object’s energy is then the sum of its kinetic and potential energies – types of energy that can be studied in elementary physics laboratories. Finally, the energy of the entire system is the sum of the energy of its parts.