Several real data sets are used in this book to illustrate aspects of the methods that are developed. Here we provide brief descriptions of each of these real data examples, along with key points to indicate which substantive questions they relate to. Key words are also included to indicate the data sources, the types of models we apply, and pointers to where in our book the data sets are analysed. For completeness and convenience of orientation the list below also includes the six ‘bigger examples’ already introduced in Chapter 1.
Egyptian skulls
There are four measurements on each of 30 skulls, for five different archaeological eras (see Section 1.2). One wishes to provide adequate statistical models that also make it possible to investigate whether there have been changes over time. Such evolutionary changes in skull parameters might relate to influx of immigrant populations. Source: Thomson and Randall-Maciver (1905), Manly (1986).
We use multivariate normal models, with different attempts at structuring the mean vectors and variance matrices, and apply AIC and BIC for model selection; see Example 9.1.
The (not so) Quiet Don
We use sentence length distributions to decide whether Sholokhov or Kriukov is the more likely author of the Nobel Prize-winning novel (see Section 1.3). Source: Private files of the authors, collected by combining information from different tables in Kjetsaa et al. (1984), with additional help from Geir Kjetsaa (private communication); see also Hjort (2007a).
Chirality is a term coined by Lord Kelvin to designate the quality of “any geometrical figure, or group of points of which a plane mirror image, ideally realized, cannot be brought to coincide with itself.” The handedness of molecules was first identified by Pasteur, and since then the investigation of its presence, effects, and quantification in the natural sciences has received continued interest.
For example, molecular chirality is a necessary and sufficient condition for a substance to exhibit optical activity, a property discovered by Huygens in 1690 when studying a crystal of calcite. Optically active substances are capable of rotating the plane of polarization of a ray of plane-polarized light. The presence of the symmetry of an improper rotation (an axial rotation followed by reflection in a plane orthogonal to the axis of rotation) in a molecule is a sufficient condition for its nonchirality, since its mirror image created by the plane of reflection is, after a rotation, superimposable on the molecule itself. In particular, molecules with the symmetry of a planar reflection or with point symmetry are achiral and hence optically inactive.
Many molecules such as amino acids and sugars are chiral, which in turn can cause DNA molecules to be chiral. The differential effect of the two pairs becomes observable only in the presence of a collective of chiral molecules, or by probing the pair with circularly polarized light, which is in itself a chiral mechanism. Planar chirality adds the constraint that the (planar and bounded) objects cannot be lifted from the plane, and consequently, the allowed symmetry operations are restricted to planar transformations.
George Pólya, in his introduction to Mathematics and Plausible Reasoning, observes that
A great part of the naturalist's work is aimed at describing and classifying the objects that he observes. A good classification is important because it reduces the observable variety to relatively few clearly characterized and well-ordered types.
Pólya's (1954, p. 88) remark introduces us directly to the practical aspect of partitioning a large number of objects by exploring certain rules of equivalence among them. This is how symmetry will be understood in the present text: as a set of rules with which we may describe certain regularities among experimental objects or concepts. The classification of crystals, for example, is based on the presence of certain symmetries in their molecular framework, which in turn becomes observable by their optical activity and other measurable quantities.
The delicate notion of measuring something on these objects and recording their data is included in the naturalist's methods of description, so that the classification of the objects may imply the classification or partitioning of their corresponding data. Pólya's picture also includes the notion of interpreting, or characterizing, the resulting types of varieties. That is, the naturalist has a better result when he can explain why certain varieties fall into the same type or category.
This chapter is an introduction to the interplay among symmetry, classification, and experimental data, which is the driving motive underlying any symmetry study and is often present in the basic sciences. The purpose here is to demonstrate that principles derived from such interplay often lead to novel ways of looking at data, particularly of planning experiments and, potentially, of facilitating contextual explanation.
This text is an introduction to data-analytic applications of symmetry principles and arguments or symmetry studies. Its motivation comes from a variety of disciplines in which these principles continue to play a significant role in describing natural phenomena, and from the goal of methodologically applying them to classification, description, and analysis of data. The product of the methodology presented here is a broader class of data-analytic tools derived from well-established and theoretically related areas in algebra and statistics, such as group representations and analysis of variance.
The principles discussed in the text reflect many defining aspects of symmetry, a Greek conception dating from the Hellenic Era and part of a class of terms and forms of expression that designated harmony, rhythm, balance, stability, good proportions, and evenness of structure. Early Greek art and architecture often capture the outstanding dualism intrinsic to the original notion of symmetry – that of retaining the static uniqueness of one's being and at the same time promoting its dynamical multipresent realizations. This dualism is only apparently hidden in the 12th-century Athenian detail shown on the front cover. I invite you to recognize the presence of these pleasant concepts in the methodology to be introduced in the coming pages.
The text is divided as follows. Chapter 1 gives a complete overview of the methodology, including an introduction to the concepts of data indexed by symmetries, finite groups, group actions, orbits and classification, and representations in the data space. At the same time it outlines the step-by-step connection between the algebra and statistical inference, in the context of analysis of variance.
The analysis of variance shown in Table (1.13) on page 9 of Chapter 1 was a consequence of the joint action of G = {1, h, ν, o} and S5 shuffling, respectively, the rows and columns of Table (1.12). The study and data-analytic applications of these rules for shuffling the experimental labels, called group actions, are among the main objectives of the present chapter. The study of group actions and orbits is an integral part of any symmetry study, and was identified earlier in the summary of Chapter 1, on page 23, as steps 3 and 4. The algebraic aspects in this and the next two chapters follow closely from Serre (1977, Part I).
Permutations
In the classification of the binary sequences of length 4 introduced in Chapter 1, the symmetries of interest were the permutations of the four positions and the permutations of the two symbols {u, y}. These sets of permutations, together with the operation of composition of functions, share all the defining algebraic properties identified in the multiplication table of {1, ν, h, o}, which are characteristic of a finite group. In Section 1.7 of Chapter 1, it was shown that permutations appear in many, if not all, steps of a symmetry study. They appear as labels for the voting preference data on page 12, in matrix form as linear representations on page 6, and as shuffling mechanisms with which sets of labels could be classified and interpreted for the purpose of describing the data indexed by those labels.
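The classification described above can be illustrated with a minimal Python sketch. It is not taken from the text: the function names (compose, act_on_positions, swap_symbols, orbits) and the encoding of a permutation as a tuple are assumptions made here for illustration only. The sketch enumerates the 16 sequences of length 4 in the symbols {u, y}, lets the 24 permutations of the four positions act on them, and collects the resulting orbits, first under position shuffling alone and then together with the exchange of the two symbols.

```python
from itertools import permutations, product

SYMBOLS = ("u", "y")
LENGTH = 4

# The 16 binary sequences of length 4 in the symbols {u, y}.
sequences = ["".join(s) for s in product(SYMBOLS, repeat=LENGTH)]

# The 24 permutations of the four positions; p[i] is the image of position i.
# (Tuple encoding is an illustrative choice, not the book's notation.)
position_perms = list(permutations(range(LENGTH)))


def compose(p, q):
    """Composition of functions, 'p after q': position i goes to p[q[i]]."""
    return tuple(p[q[i]] for i in range(len(q)))


def act_on_positions(p, seq):
    """Shuffle a sequence: the symbol at position i moves to position p[i]."""
    out = [None] * len(seq)
    for i, symbol in enumerate(seq):
        out[p[i]] = symbol
    return "".join(out)


def swap_symbols(seq):
    """The non-trivial permutation of the two symbols: u <-> y."""
    return seq.translate(str.maketrans("uy", "yu"))


def orbits(points, transformations):
    """Partition the points into orbits by repeatedly applying the transformations."""
    remaining = set(points)
    found = []
    while remaining:
        frontier = [min(remaining)]
        orbit = set(frontier)
        while frontier:
            x = frontier.pop()
            for t in transformations:
                y = t(x)
                if y not in orbit:
                    orbit.add(y)
                    frontier.append(y)
        found.append(sorted(orbit))
        remaining -= orbit
    return found


# Shuffling by position permutations only.
shuffles = [lambda s, p=p: act_on_positions(p, s) for p in position_perms]
position_orbits = orbits(sequences, shuffles)

# Position permutations together with the exchange of symbols.
joint_orbits = orbits(sequences, shuffles + [swap_symbols])

for orbit in position_orbits:
    print(len(orbit), orbit)
```

Under position shuffling alone, two sequences fall in the same orbit exactly when they contain the same number of u's, which gives five orbits; adding the symbol exchange merges orbits whose counts of u and y are interchanged, leaving three. The helper compose can be used to confirm that the position permutations are closed under composition and that act_on_positions respects it, which are the properties that make this shuffling a group action.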