Since about the turn of the millennium, the study of parametric families of probability distributions has received new, intense interest. The present work is an account of one approach which has generated a great deal of activity.
The distinctive feature of the construction to be discussed is to start from a symmetric density function and, by suitable modification of this, generate a set of non-symmetric distributions. The simplest effect of this process is represented by skewness in the distribution so obtained, and this explains why the prefix ‘skew’ recurs so often in this context. The focus of this construction is not, however, skewness as such, and we shall not discuss the quintessential nature of skewness and how to measure it. The target is instead to study flexible parametric families of continuous distributions for use in statistical work. Many of those in standard use are symmetric when the sample space is unbounded. The aim here is to allow for possible departure from symmetry to produce more flexible and more realistic families of distributions.
The concentrated development of research in this area has attracted the interest of both scientists and practitioners, but the variety of proposals and the existence of related but different formulations often bewilder them, as we have been told by a number of colleagues in recent years. The main aim of this work is to provide a key to enter this theme.
The skew-normal density has very short tails. In fact, the rate of decay to 0 of the density φ(x; α) as |x| → ∞ is either the same as that of the normal density or even faster, depending on whether x and α have equal or opposite sign, as specified by Proposition 2.8. This behaviour makes the skew-normal family unsuitable for a range of application areas where the distribution of the observed data is known to have heavier tails than the normal ones, sometimes appreciably heavier.
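The following minimal Python sketch (our illustration, not taken from the book) makes this tail behaviour visible numerically, using the density φ(x; α) = 2φ(x)Φ(αx): the ratio to the normal density stabilizes near 2 when x and α share a sign, but collapses rapidly when the signs are opposite.

```python
import numpy as np
from scipy.stats import norm

def sn_pdf(x, alpha):
    """Skew-normal density: phi(x; alpha) = 2 * phi(x) * Phi(alpha * x)."""
    return 2.0 * norm.pdf(x) * norm.cdf(alpha * x)

# Ratio to the normal density phi(x): ~2 for matching signs of x and alpha,
# vanishing rapidly (faster-than-normal decay) for opposite signs.
for x in (2.0, 4.0, 6.0):
    print(x, sn_pdf(x, 3.0) / norm.pdf(x), sn_pdf(x, -3.0) / norm.pdf(x))
```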
To construct a family of distributions of type (1.2) whose tails can be thicker than the normal ones, a solution cannot be sought by replacing the term Φ(αx) in (2.1) with some other term G0{w(x)}, since essentially the same behaviour of the SN tails would be reproduced. The only real alternative is to adopt a base density f0 in (1.2) with heavier tails than the normal density.
For instance, we could select the Laplace density exp(−|x|)/2, whose tails decrease at exponential rate, to play the role of base density and proceed along lines similar to the skew-normal case. This is a legitimate program, but it is preferable that f0 itself be a member of a family of symmetric density functions depending on a tail weight parameter, ν say, which allows us to regulate tail thickness. For instance, one such choice for f0 is the Student's t family, where ν is represented by the degrees of freedom.
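As a sketch of this idea (our illustration, not the book's canonical skew-t construction, which modulates the t base density somewhat differently), we can take f0 to be the Student's t density with ν degrees of freedom, G0 its own distribution function and w(x) = αx, and check numerically that the modulated function of type (1.2) still integrates to one while inheriting the polynomial tails of the base:

```python
import numpy as np
from scipy.stats import t as student_t
from scipy.integrate import quad

def modulated_t_pdf(x, alpha=2.0, nu=3.0):
    # f(x) = 2 * f0(x) * G0(alpha * x), with f0 the Student's t density;
    # the tails inherit the heavy, polynomial decay of the t base.
    return 2.0 * student_t.pdf(x, df=nu) * student_t.cdf(alpha * x, df=nu)

total, _ = quad(modulated_t_pdf, -np.inf, np.inf)
print(total)  # ~1.0: the modulation factor preserves total mass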
In the remaining two chapters of this book we consider some more specialized topics. The enormous number of directions which have been explored prevents, however, any attempt at a detailed discussion within the targeted area. Consequently, we adopt a quite different style of exposition compared with previous chapters: from now on, we aim to present only the key concepts of the various formulations and their interconnections, referring more extensively to the original sources in the literature for a detailed treatment. Broadly speaking, this chapter focuses more on probabilistic aspects, the next chapter on statistical and applied work.
Use of multiple latent variables
General remarks
In Chapters 2 to 6 we dealt almost exclusively with distributions of type (1.2), or of its slight extension (1.26), closely associated with a selection mechanism which involves one latent variable; see (1.8) and (1.11). For the more important families of distributions, an additional type of genesis exists, based on an additive form of representation, of type (5.19), which again involves an auxiliary variable. Irrespective of the stochastic representation which one prefers to think of as the underlying mechanism, the effect of this additional variable is to introduce a factor of type G0{w(x)} or G0{α0 + w(x)} which modulates the base density, where G0 is a univariate distribution function.
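To make the one-latent-variable selection mechanism concrete, here is a hedged Python simulation (our sketch, not the book's code) of the classical representation of the skew-normal: with X0, X1 independent standard normals, set Z = X0 when X1 < αX0 and Z = −X0 otherwise; the latent X1 is what induces the modulation factor.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, n = 3.0, 200_000
x0 = rng.standard_normal(n)   # base variable
x1 = rng.standard_normal(n)   # latent (selection) variable
z = np.where(x1 < alpha * x0, x0, -x0)

# Compare with the known skew-normal mean sqrt(2/pi) * delta,
# where delta = alpha / sqrt(1 + alpha^2).
delta = alpha / np.sqrt(1.0 + alpha**2)
print(z.mean(), np.sqrt(2.0 / np.pi) * delta)
```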
Interest in the skew-normal and related families of distributions has grown enormously over recent years, as theory has advanced, challenges of data have grown, and computational tools have made substantial progress. This comprehensive treatment, blending theory and practice, will be the standard resource for statisticians and applied researchers. Assuming only basic knowledge of (non-measure-theoretic) probability and statistical inference, the book is accessible to the wide range of researchers who use statistical modelling techniques. Guiding readers through the main concepts and results, it covers both the probability and the statistics sides of the subject, in the univariate and multivariate settings. The theoretical development is complemented by numerous illustrations and applications to a range of fields including quantitative finance, medical statistics, environmental risk studies, and industrial and business efficiency. The author's R package sn, freely available from CRAN, equips readers to put the methods into action with their own data.
This introduction to wavelet analysis 'from the ground level and up', and to wavelet-based statistical analysis of time series, focuses on practical discrete-time techniques, with detailed descriptions of the theory and algorithms needed to understand and implement the discrete wavelet transforms. Numerous examples illustrate the techniques on actual time series. The many embedded exercises - with complete solutions provided in the Appendix - allow readers to use the book for self-guided study. Additional exercises can be used in a classroom setting. A Web site offers access to the time series and wavelets used in the book, as well as information on accessing software in S-Plus and other languages. Students and researchers wishing to use wavelet methods to analyze time series will find this book essential.
Man denkt an das, was man verließ; was man gewohnt war, bleibt ein Paradies (Johann Wolfgang von Goethe, Faust II, 1749–1832). We think of what we left behind; what we are familiar with remains a paradise.
Introduction
Gaussian random vectors are special: uncorrelated Gaussian vectors are independent. The difference between independence and uncorrelatedness is subtle and is related to the deviation of the distribution of the random vectors from the Gaussian distribution.
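A minimal illustration of how uncorrelatedness falls short of independence outside the Gaussian world (our example, not the book's): if X is standard normal, then X and X² are uncorrelated, since E[X·X²] = E[X³] = 0, yet X² is completely determined by X.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(100_000)
y = x**2                          # a deterministic function of x
print(np.corrcoef(x, y)[0, 1])    # ~0: uncorrelated despite total dependence
```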
In Principal Component Analysis and Factor Analysis, the variability in the data drives the search for low-dimensional projections. In the next three chapters the search for direction vectors focuses on independence and deviations from Gaussianity of the low-dimensional projections:
• Independent Component Analysis in Chapter 10 explores the close relationship between independence and non-Gaussianity and finds directions which are as independent and as non-Gaussian as possible;
• Projection Pursuit in Chapter 11 ignores independence and focuses more specifically on directions that deviate most from the Gaussian distribution;
• The methods of Chapter 12 attempt to find characterisations of independence and integrate these properties in the low-dimensional direction vectors.
As in Parts I and II, the introductory chapter in this final part collects and summarises ideas and results that we require in the following chapters.
We begin with a visual comparison of Gaussian and non-Gaussian data.
The truth is rarely pure and never simple (Oscar Wilde, The Importance of Being Earnest, 1854–1900).
Introduction
In the Factor Analysis model X = AF + μ + ε, an essential aim is to find an expression for the unknown d × k matrix of factor loadings A. Of secondary interest is the estimation of F. If X comes from a Gaussian distribution, then the principal component (PC) solution for A and F results in independent scores, but this luxury is lost in the PC solution of non-Gaussian random vectors and data. Surprisingly, it is not the search for a generalisation of Factor Analysis, but the departure from Gaussianity that has paved the way for new developments.
In psychology, for example, scores in mathematics, language and literature or comprehensive tests are used to describe a person's intelligence. A Factor Analysis approach aims to find the underlying or hidden kinds of intelligence from the test scores, typically under the assumption that the data come from the Gaussian distribution. Independent Component Analysis, too, strives to find these hidden quantities, but under the assumption that the data are non-Gaussian. This assumption precludes the use of the Gaussian likelihood, and the independent component (IC) solution will differ from the maximum-likelihood (ML) Factor Analysis solution, which may not be appropriate for non-Gaussian data.
To get some insight into the type of solution one hopes to obtain with Independent Component Analysis, consider, for example, the superposition of sound tracks.
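As a toy version of this separation problem (our sketch; FastICA from scikit-learn is one implementation choice, not prescribed by the text), two synthetic 'sound tracks' are superposed by a hypothetical mixing matrix and then unmixed:

```python
import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 8, 2000)
s = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t))]   # two synthetic source tracks
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])                          # hypothetical mixing matrix
x = s @ A.T                                         # observed superpositions

ica = FastICA(n_components=2, random_state=0)
s_hat = ica.fit_transform(x)   # recovered sources, up to order, sign and scale
```

The recovered columns of s_hat match the original tracks only up to permutation, sign and scale, which is the intrinsic indeterminacy of the IC solution.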
In mathematics you don't understand things. You just get used to them (John von Neumann, 1903–1957; in Gary Zukav (1979), The Dancing Wu Li Masters).
Introduction
Suppose that we have n objects and that for each pair of objects a numeric quantity or a ranking describes the relationship between objects. The objects could be geographic locations, with a distance describing the relationship between locations. Other examples are different types of food or drink, with judges comparing items pairwise and providing a score for each pair. Multidimensional Scaling combines such pairwise information into a whole picture of the data and leads to a visual representation of the relationships.
Visual and geometric aspects have been essential parts of Multidimensional Scaling. For geographic locations, they lead to a map (see Figure 8.1). From comparisons and rankings of foods, drinks, perfumes or laptops, one typically reconstructs low-dimensional representations of the data and displays these representations graphically in order to gain insight into the relationships between the different objects of interest. In addition to these graphical representations, in a ranking of wines, for example, we might want to know which features result in wines that will sell well; the type of grape, the alcohol content and the region might be of interest.
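As a hedged sketch of this reconstruction step (our example; scikit-learn's MDS with a precomputed dissimilarity matrix is one convenient implementation, not the book's), four objects with hypothetical pairwise distances are embedded in the plane:

```python
import numpy as np
from sklearn.manifold import MDS

# Hypothetical symmetric matrix of pairwise dissimilarities between 4 objects.
D = np.array([[0., 2., 5., 6.],
              [2., 0., 4., 5.],
              [5., 4., 0., 2.],
              [6., 5., 2., 0.]])

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(D)   # planar configuration approximately preserving D
print(coords)
```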