The subject of probability is made particularly interesting and useful by certain universal features that appear when an experiment with random outcomes is tried a large number of times. This topic is developed intuitively here. We shall play with an example used in Chapter 2 and, after extracting the general pattern from the particular case, we shall infer the remarkable fact that only a very small fraction of the possible outcomes associated with many trials has a reasonable likelihood of occurring. This principle is at the root of the statistical regularities on which the banking and insurance industries, heat engines, chemistry and much of physics, and to some extent life itself depend. A relatively simple mathematical phenomenon has such far-reaching consequences because, in a manner to be made clear in this chapter, it is the agency through which certainty almost re-emerges from uncertainty.
The binomial distribution
To illustrate these ideas, we will go back to rolling our hypothetical fair dice. Following the example of the dissolute French noblemen of the seventeenth century, one of whose games we analyzed in such detail in the last chapter, we shall classify outcomes for each die into the mutually exclusive categories ‘six’ and ‘not-six’, which exhaust all possibilities. If the repeatable experiment consists of rolling a single die, the probabilities for these two outcomes are the numbers 1/6 ≈ 0.16667 and 5/6 ≈ 0.83333.
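A minimal simulation makes this concrete (a Python sketch, not from the text; the seed and sample sizes are arbitrary choices). As the number of rolls grows, the observed fraction of sixes crowds ever more tightly around 1/6:

```python
import random

# Roll a fair die n times and record the fraction of sixes.
# Repeated over several trials, the fractions cluster more and
# more tightly around the single-roll probability 1/6 ~ 0.1667
# as n grows.

random.seed(1)

def fraction_of_sixes(n_rolls):
    sixes = sum(1 for _ in range(n_rolls) if random.randint(1, 6) == 6)
    return sixes / n_rolls

for n in (10, 100, 10_000):
    samples = [fraction_of_sixes(n) for _ in range(5)]
    print(n, ["%.3f" % f for f in samples])
```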
The mathematical theory of probability was born somewhat disreputably in the study of gambling. It quickly matured both as a practical way of dealing with uncertainties and as a branch of pure mathematics. When it was about 200 years old, the concept was introduced into physics as a way of dealing with the chaotic microscopic motions that constitute heat. In our century, probability has found its way into the foundations of quantum mechanics, the physical theory of the atomic and subatomic world. The improbable yet true tale of how a way of thinking especially suited to the gambling salon became necessary for understanding the inner workings of nature is the topic of this book.
The next three chapters contain some of the basic ideas of the mathematical theory of probability, presented by way of a few examples. Although common sense will help us to get started and avoid irrelevancies, we shall find that a little mathematical analysis yields simple, useful, easy to remember, and quite unobvious results. The necessary mathematics will be picked up as we go along.
In the couplet by John Gay (1688–1732), the author of the Beggar's Opera, probability has a traditional meaning, implying uncertainty but reasonable likelihood. At roughly the same time that the verse was written, the word was acquiring its mathematical meaning.
The most subtle of the concepts that surfaced in the last chapter is equilibrium. Although the word suggests something unchanging in time, the molecular viewpoint has offered a closer look and shown temperature to be the average effect of molecular agitation. At first sight, it seems hard to say anything, let alone anything quantitative, about such chaotic motions. Yet, paradoxically, their very disorder provides a foundation upon which to build a microscopic theory of heat.
How does a dilute gas reach equilibrium? Contemplate a system in a thermally insulating rigid container, so that there is no transfer of energy or matter between the inside and the outside. As time passes, the energy the gas started with is exchanged among the molecules through random collisions. Finally, a state is reached which is unchanging as far as macroscopic observations are concerned. In this (equilibrium) state each molecule is still engaged in a complicated dance. To calculate the motion of any one molecule, one would eventually need to calculate the motion of all the molecules, because collisions between more and more of them, in addition to collisions with container walls, are the cause of the random motion of any one. Such a calculation, in addition to being impossible, even for the largest modern computer, would be futile: the details of the motion depend on the precise positions and velocities of all the molecules at some earlier time, and we lack this richness of knowledge.
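Although the exact calculation is impossible, a caricature of it is easy and illustrates the approach to equilibrium. In the Python sketch below (an illustration, not the author's model), ‘molecules’ are just numbers, and a ‘collision’ splits the combined energy of a random pair at a random fraction. The total energy is conserved exactly, yet the energy histogram settles into an unchanging, roughly exponential shape while individual molecules keep fluctuating:

```python
import random

# A caricature of a gas approaching equilibrium: N 'molecules'
# start with equal energy; each 'collision' picks two at random
# and splits their combined energy at a random fraction.

random.seed(2)
N = 10_000
energy = [1.0] * N            # everyone starts with one unit

for _ in range(20 * N):       # many random collisions
    i, j = random.randrange(N), random.randrange(N)
    if i == j:
        continue
    pool = energy[i] + energy[j]
    split = random.random()
    energy[i], energy[j] = split * pool, (1 - split) * pool

# Fraction of molecules in a few energy bands after equilibration.
for lo, hi in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    frac = sum(lo <= e < hi for e in energy) / N
    print(f"energy in [{lo},{hi}): {frac:.3f}")
```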
If a thing is worth doing, it is worth doing badly.
G.K. Chesterton
The only mathematical operations we have needed so far have been addition, subtraction, division, multiplication, and an extension of the last: the operation of taking the square-root. [The square-root of a number multiplied by itself is equal to the number.] We have also had the help of a friendly work-horse, the computer, which gave insight into formulas we produced. That insight was always particular to the formula being evaluated; the general rules of thumb we used in the last chapter were obtained by nothing more or less than sleight of hand. To do better, there is no way of avoiding the mathematical operation associated with taking arbitrarily small steps an arbitrarily large number of times. This is the essential ingredient of what is called ‘calculus’ or ‘analysis.’ But don't be alarmed: our needs are modest. We shall be able to manage very well with only some information about the exponential and the logarithm. You will have heard, at least vaguely, of these ‘functions’ – to use the mathematical expression for a number that depends on another number, so that it can be represented by a graph – but rather than relying on uncertain prior knowledge, we shall learn what is needed by empirical self-discovery.
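In that spirit of empirical self-discovery, here is one way to sneak up on the exponential numerically (a Python sketch; the limit of (1 + 1/n)^n as n grows is the standard route to the number e):

```python
# 'Taking arbitrarily small steps an arbitrarily large number of
# times': compound growth in n tiny steps of size 1/n approaches
# a definite limit, the number e = 2.71828..., the base of the
# exponential function.

for n in (1, 10, 100, 10_000, 1_000_000):
    print(n, (1 + 1/n) ** n)
```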
Powers
The constructions we need are very natural generalizations of the concept of ‘powers’, which you will find quite familiar from what has gone before.
Here are a few randomly chosen and occasionally whimsical uses for your new knowledge about the workings of chance. The situations I shall describe all have to do with everyday life. In such applications, the difficulty is not only in the mathematical scheme but also in the frequently unstated assumptions that lie beneath it. It helps to be able to identify the repeatable random experiment and, when many trials are being treated as independent, to be able to argue that they are in fact unconnected.
The examples have been chosen to illustrate the role of statistical fluctuations, because this is the most interesting aspect of randomness for the physical applications to follow. A statistician or a mathematician would choose other examples, but, then, such a person would write a different book.
Polling
Opinion polls are second only to weather forecasts in bringing probability, often controversially, into our daily lives. ‘Polls Wrong,’ a headline might say after an election. Opinions change, and often suddenly. A pollster needs experience and common sense; his or her statistical knowledge need not be profound. But there is a statistical basis to polling. Consider the question: ‘If 508 of 1000 randomly selected individuals prefer large cars to small, what information is gleaned about the car preferences of the population at large?’ If we ignore the subtleties just alluded to, the question is equivalent to the following one.
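A back-of-envelope treatment of that question looks like this (a Python sketch; the square-root-of-N uncertainty and the two-sigma band are standard conventions, not necessarily the ones the text goes on to develop). Note that the band straddles 0.5, so 508 out of 1000 does not by itself establish a majority preference:

```python
import math

# 508 of 1000 randomly chosen people prefer large cars.
# The sample fraction estimates the population fraction p,
# with a statistical uncertainty of order sqrt(p*(1-p)/N).

N, k = 1000, 508
p_hat = k / N
sigma = math.sqrt(p_hat * (1 - p_hat) / N)

print(f"estimate  : {p_hat:.3f}")
print(f"one sigma : {sigma:.3f}")
print(f"~95% band : {p_hat - 2*sigma:.3f} to {p_hat + 2*sigma:.3f}")
```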
The eternal mystery of the world is its comprehensibility
Albert Einstein
The purpose of this little book is to introduce the interested non-scientist to statistical reasoning and its use in physics. I have in mind someone who knows little mathematics and little or no physics. My wider aim is to capture something of the nature of the scientific enterprise as it is carried out by physicists – particularly theoretical physicists.
Every physicist is familiar with the amiable party conversation that ensues when someone – whose high school experience of physics left a residue of dread and despair – says brightly: ‘How interesting! What kind of physics do you do?’ How natural to hope that passing from the general to the particular might dispel the dread and alleviate the despair. Inevitably, though, such a conversation is burdened by a sense of futility: because there are few common premises, there is no reasonable starting point. Yet it would be foolishly arrogant not to recognize the seriousness behind the question. As culprit or as savior, science is perceived as the force in modern society, and scientific illiteracy is out of fashion.
However much I would like to be a guru in a new surge toward literacy in physics, ministering to the masses on television and becoming rich beyond the dreams of avarice, this, alas, is not to be.
A man hath sapiences thre,
Memorie, engin, and intellect also
Chaucer
Engines are, etymologically and in fact, ingenious things. By the end of this chapter we shall understand heat-engines, which are devices that convert ‘heat’ – an intuitive idea to be made precise here – into pushes or pulls or turns. First, however, we need to systematize some of the things we have already learnt about energy and entropy, and thereby extract the science of Thermodynamics from the statistical viewpoint of the last chapter.
Thermodynamics is a theory based on two ‘Laws,’ which are not basic laws of nature in the sense mentioned in Chapter 1, but commonsense rules concerning the flow of energy between macroscopic systems. From the point of view we are taking here, these axioms follow from an understanding of the random motion of the microscopic constituents of matter. It is a tribute to the genius of the scientists – particularly Carnot, Clausius, and Kelvin – who formulated thermodynamics as a macroscopic science, that their scheme is in no way undermined by the microscopic view, which does, however, offer a broader perspective and open up avenues for more detailed calculations. (For example, effusion – discussed at the end of the last chapter – lies beyond the scope of thermodynamics.)
Work, heat, and the First Law of Thermodynamics
The First Law of Thermodynamics is about the conservation of energy.
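In symbols, with one common sign convention (conventions differ, so this choice is an assumption here), the First Law reads:

```latex
% First Law of Thermodynamics (one common sign convention):
%   \Delta U : change in the internal energy of the system
%   Q        : heat supplied to the system
%   W        : work done by the system on its surroundings
\Delta U = Q - W
```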
Linear polymers may be thought of as very long flexible chains made up of single units called monomers. When placed in a solvent at low dilutions, they may exhibit several different types of morphology. If the interactions between different parts of the chain are primarily repulsive, they tend to be in extended configurations with a large entropy. If, however, the forces are sufficiently attractive, the chains collapse into compact objects with little entropy. The collapse transition between these two states occurs at the theta point, where the energy of attraction balances the entropy difference between the two states. This turns out to be a continuous phase transition, to be described later in Section 9.5. However, even in the swollen, entropy-dominated phase, it turns out that the statistics of very long chains are governed by non-trivial critical exponents. Like the percolation problem, this is a purely geometrical phenomenon, yet, through a mapping to a magnetic system, all the standard results of the renormalization group and scaling may be applied. Before describing this, however, it is important to recall some of the simpler approaches to the problem.
Random walk model
If the problem of a long polymer chain is equivalent to some kind of critical behaviour, we would expect universality to hold, and some of the important quantities to be independent of the microscopic details. This means that we may forget all about polymer chemistry, and regard the monomers as rigid links of length a, like a bicycle chain.
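A Python sketch of this model (illustrative; the link length a = 1 and sample sizes are arbitrary) recovers the ideal-chain result that the root-mean-square end-to-end distance grows as a√N. Self-avoidance, which is responsible for the non-trivial exponents mentioned above, is deliberately omitted:

```python
import math
import random

# Ideal (non-self-avoiding) chain: N rigid links of length a
# pointing in independent random directions in 3D.
# The root-mean-square end-to-end distance grows as a * sqrt(N).

random.seed(3)
a = 1.0

def end_to_end(n_links):
    x = y = z = 0.0
    for _ in range(n_links):
        # random direction, uniform on the sphere
        cos_t = random.uniform(-1, 1)
        phi = random.uniform(0, 2 * math.pi)
        sin_t = math.sqrt(1 - cos_t * cos_t)
        x += a * sin_t * math.cos(phi)
        y += a * sin_t * math.sin(phi)
        z += a * cos_t
    return math.sqrt(x*x + y*y + z*z)

for n in (100, 400, 1600):
    rms = math.sqrt(sum(end_to_end(n)**2 for _ in range(200)) / 200)
    print(f"N={n:5d}  RMS={rms:7.2f}  a*sqrt(N)={a*math.sqrt(n):7.2f}")
```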
In the previous chapter, the ferromagnetic Ising model provided a simple example of a phase diagram with an associated fixed point structure of the renormalization group flows. There were stable fixed points corresponding to low and high temperature phases, and a critical fixed point controlling the behaviour of critical Ising systems. However, more realistic systems often have more complicated phase diagrams, and therefore a richer fixed point structure. In this chapter we study some of these examples, and show how, even with a rather qualitative description of renormalization group flows, it is possible to understand phase diagrams from the renormalization group viewpoint. More importantly, when more than one non-trivial fixed point is present, the question arises as to which is the dominant one in a particular region of the phase space. The renormalization group answers this question through the theory of cross-over behaviour. Such phenomena, whereby different fixed points may influence the properties of the same system on different length scales, are totally absent in mean field treatments.
Ising model with vacancies
As a first example, consider a generalisation of the Ising model in which the spin variables s(r) may take the value 0 as well as ±1. This may be viewed as the classical version of a quantum spin-1 magnet, or as a lattice gas of magnetic particles, with |s(r)| playing the role of the occupation number.
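The text does not write the hamiltonian out; a minimal sketch in Python, assuming the standard Blume–Capel form H = −J Σ⟨ij⟩ s(r)s(r′) + Δ Σ_r s(r)², where Δ controls the cost of occupying a site:

```python
# Energy of a spin configuration in the dilute Ising (Blume-Capel)
# model on a periodic square lattice, spins s(r) in {-1, 0, +1}.
# Assumed standard form (the text does not spell it out):
#   H = -J * sum_<ij> s_i s_j  +  Delta * sum_i s_i**2

def energy(spins, J=1.0, Delta=0.0):
    L = len(spins)
    H = 0.0
    for i in range(L):
        for j in range(L):
            s = spins[i][j]
            # each nearest-neighbour bond counted once (right and down)
            H -= J * s * (spins[i][(j + 1) % L] + spins[(i + 1) % L][j])
            H += Delta * s * s
    return H

# A tiny 3x3 example: all spins up, no vacancies.
print(energy([[1, 1, 1], [1, 1, 1], [1, 1, 1]], J=1.0, Delta=0.5))
```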
We saw in Chapter 3 that the hamiltonian for a system at a critical point flows under the renormalization group into a critical fixed point. Under a renormalization group transformation, the microscopic length scale is rescaled by a constant factor b, and so the coordinates of a given point, as measured in units of this length scale, transform according to r → b−1r. This is called a scale transformation. Once the flows reach such a fixed point, the parameters of the hamiltonian no longer change, and it is said to be scale invariant. As well as being scale invariant, the fixed point hamiltonian usually possesses other spatial symmetries. For example, if the underlying model is defined on a lattice, so that its hamiltonian is invariant under lattice translations, the corresponding critical fixed point hamiltonian is generally invariant under arbitrary uniform translations. This is because terms which might be added to the hamiltonian which break the symmetry under continuous translations down to its subgroup of lattice translations are irrelevant at such a fixed point. Similarly, if the lattice model is invariant under a sufficiently large subgroup of the rotation group (for example, if the interactions in the x and y directions on a square lattice are equal), then the fixed point hamiltonian enjoys full rotational invariance. As discussed on p.58, even if the interactions are anisotropic, rotational invariance may often be recovered by a suitable finite relative rescaling of the coordinates.
One of the most striking aspects of critical behaviour is the crucial role played by the geometry of the system. Critical exponents depend in a non-trivial manner on the dimensionality d. The very existence of a phase transition depends on the way in which the infinite volume limit is taken, as discussed in Section 4.4. This happens because the critical fluctuations, which determine the universal properties, occur at long wavelengths and are therefore very sensitive to the large scale geometry. By contrast, the non-critical properties of a system are sensitive to fluctuations on the scale of the correlation length and are therefore much less influenced. This line of reasoning also suggests that not all points in a system are equivalent in the way the local degrees of freedom couple to these critical fluctuations. So far, we have considered the behaviour of scaling operators only at points deep inside the bulk of a system. Near a boundary, however, the local environment of a given degree of freedom is different, and we might expect to find different critical properties there. In general, such differences should extend into the bulk only over distances of the order of the bulk correlation length. However, at a continuous bulk phase transition, this distance diverges, and we should expect the influence of boundaries to be more pronounced.
The simplest modification of the bulk geometry to consider is that of a (d − 1)-dimensional hyperplane bounding a semi-infinite d-dimensional system.
Scaling concepts play a central role in the analysis of the ever more complex systems which nowadays are the focus of much attention in the physical sciences. Whether these problems relate to the very large scale structure of the universe, to the complicated forms of everyday macroscopic objects, or to the behaviour of the interactions between the fundamental constituents of matter at very short distances, they all have the common feature of possessing large numbers of degrees of freedom which interact with each other in a complicated and highly non-linear fashion, often according to laws which are only poorly understood. Yet it is often possible to make progress in understanding such problems by isolating a few relevant variables which characterise the behaviour of these systems on a particular length or time scale, and postulating simple scaling relations between them. These may serve to unify sets of experimental and numerical data taken under widely differing conditions, a phenomenon called universality. When there is only a single independent variable, these relations often take the form of power laws, with exponents which do not appear to be simple rational numbers yet are, once again, universal.
The existence of such scaling behaviour may often be explained through a framework of theoretical ideas loosely grouped under the term renormalization. Roughly speaking, this describes how the parameters specifying the system must be adjusted, under putative changes of the underlying dynamics, in such a way as not to modify the measurable properties on the length or time scales of interest.
Before embarking on an exploration of the modern theories of critical behaviour, it is wise to consider first the more traditional approaches, generally lumped together under the heading of mean field theory. Despite the fact that such methods generally fail sufficiently close to the critical point, there are nonetheless several good reasons for their study. Mean field theory is relatively simple to apply in most cases, and often gives a qualitatively correct picture of the phase diagram of a given model. In some cases, where fluctuation effects are suppressed for some physical reason, its predictions are also quantitatively accurate. In a sufficiently large number of spatial dimensions, it gives exact values for the various critical exponents. Moreover, it often serves as an important adjunct to the renormalization group, since the latter by itself may give direct information about the existence and location of phase transitions, but not about the nature of the phases which they separate. For this, further input is necessary, and this often may be provided by applying mean field theory in a region of the phase diagram far from the critical region, where it is applicable.
The mean field free energy
There are as many derivations of the basic equations of mean field theory as there are books written on the subject, all of varying degrees of rigour and completeness.
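Whatever the route, for the nearest-neighbour Ising ferromagnet they all arrive at the same self-consistency condition, m = tanh((zJm + h)/T) in units with k_B = 1. A Python sketch solving it by fixed-point iteration (the model choice, coordination number z, and iteration scheme are assumptions for illustration) exhibits the spontaneous magnetization below the mean field critical temperature T_c = zJ:

```python
import math

# Mean field self-consistency for the Ising ferromagnet
# (assumed model; units with k_B = 1, z = number of neighbours):
#     m = tanh((z*J*m + h) / T)
# Solved by fixed-point iteration from an ordered initial guess.

def magnetization(T, J=1.0, z=4, h=0.0):
    m = 1.0                       # fully ordered starting guess
    for _ in range(10_000):
        m_new = math.tanh((z * J * m + h) / T)
        if abs(m_new - m) < 1e-12:
            break
        m = m_new
    return m

for T in (2.0, 3.0, 3.9, 4.1, 5.0):   # T_c = 4 for z=4, J=1
    print(f"T={T:4.1f}  m={magnetization(T):.4f}")
```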