In subsequent chapters we will make substantial use of some basic results from the Fourier theory of sequences and – to a lesser extent – functions, and we will find that filters play a central role in the application of wavelets. This chapter is intended as a self-contained guide to some key results from Fourier and filtering theory. Our selection of material is intentionally limited to just what we will use later on. For a more thorough discussion employing the same notation and conventions adopted here, see Percival and Walden (1993). We also recommend Briggs and Henson (1995) and Hamming (1989) as complementary sources for further study.
Readers who have extensive experience with Fourier analysis and filters can just quickly scan this chapter to become familiar with our notation and conventions. We encourage others to study the material carefully and to work through as many of the embedded exercises as possible (answers are provided in the appendix). It is particularly important that readers understand the concept of periodized filters presented in Section 2.6 since we use this idea repeatedly in Chapters 4 and 5.
Complex Variables and Complex Exponentials
The most elegant version of Fourier theory for sequences and functions involves the use of complex variables, so here we review a few key concepts regarding them (see, for example, Brown and Churchill, 1995, for a thorough treatment). Let $i \equiv \sqrt{-1}$ so that $i^2 = -1$ (throughout the book, we take ‘≡’ to mean ‘equal by definition’).
As discussed in Chapter 4, the discrete wavelet transform (DWT) allows us to analyze (decompose) a time series X into DWT coefficients W, from which we can then synthesize (reconstruct) our original series. We have already noted that the synthesis phase can be used, for example, to construct a multiresolution analysis of a time series (see Equation (64) or (104a)) and to simulate long memory processes (see Section 9.2). In this chapter we study another important use for the synthesis phase that provides an answer to the signal estimation (or function estimation, or denoising) problem, in which we want to estimate a signal hidden by noise within an observed time series. The basic idea here is to modify the elements of W to produce, say, W′, from which an estimate of the signal can be synthesized. With the exception of methods briefly discussed in Section 10.8, once certain parameters have been estimated, the elements $W_n$ of W are treated one at a time; i.e., how we modify $W_n$ is not directly influenced by the remaining DWT coefficients. The wavelet-based techniques that we concentrate on here are thus conceptually very simple, yet they are remarkably adaptive to a wide variety of signals.
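The analyze, modify, synthesize idea described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses the Haar filter, applies hard thresholding (one of several possible rules) to each wavelet coefficient independently, and treats the threshold `delta` as a hypothetical user-supplied parameter rather than one estimated from the data. The function names are made up for this sketch.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt(x):
    """Full Haar DWT of a sequence whose length is a power of two.

    Returns (details, v): a list of wavelet-coefficient lists, finest
    level first, and the final list of scaling coefficients.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    v = list(x)
    details = []
    while len(v) > 1:
        half = len(v) // 2
        # Differences give wavelet coefficients, sums give scaling coefficients.
        w = [(v[2 * t + 1] - v[2 * t]) / SQRT2 for t in range(half)]
        v = [(v[2 * t + 1] + v[2 * t]) / SQRT2 for t in range(half)]
        details.append(w)
    return details, v

def haar_idwt(details, v):
    """Invert haar_dwt: synthesize the series from its coefficients."""
    v = list(v)
    for w in reversed(details):
        nxt = []
        for wt, vt in zip(w, v):
            nxt.append((vt - wt) / SQRT2)  # even-indexed value
            nxt.append((vt + wt) / SQRT2)  # odd-indexed value
        v = nxt
    return v

def hard_threshold(details, delta):
    """Zero every wavelet coefficient whose magnitude is at most delta.

    Each coefficient is handled on its own, without reference to the others.
    """
    return [[c if abs(c) > delta else 0.0 for c in w] for w in details]

def denoise(x, delta):
    """Analyze, threshold the wavelet coefficients, then synthesize."""
    details, v = haar_dwt(x)
    return haar_idwt(hard_threshold(details, delta), v)
```

With `delta = 0` the series is reconstructed unchanged; with a very large `delta` every wavelet coefficient is removed and the estimate collapses to the sample mean. Useful thresholds lie in between and depend on the noise level.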
Wavelets are mathematical tools for analyzing time series or images (although not exclusively so: for examples of usage in other applications, see Stollnitz et al., 1996, and Sweldens, 1996). Our discussion of wavelets in this book focuses on their use with time series, which we take to be any sequence of observations associated with an ordered independent variable t (the variable t can assume either a discrete set of values such as the integers or a continuum of values such as the entire real axis – examples of both types include time, depth or distance along a line, so a time series need not actually involve time). Wavelets are a relatively new way of analyzing time series in that the formal subject dates back to the 1980s, but in many aspects wavelets are a synthesis of older ideas with new elegant mathematical results and efficient computational algorithms. Wavelet analysis is in some cases complementary to existing analysis techniques (e.g., correlation and spectral analysis) and in other cases capable of solving problems for which little progress had been made prior to the introduction of wavelets.
Broadly speaking (and with apologies for the play on words!), there have been two main waves of wavelets. The first wave resulted in what is known as the continuous wavelet transform (CWT), which is designed to work with time series defined over the entire real axis; the second, in the discrete wavelet transform (DWT), which deals with series defined essentially over a range of integers (usually t = 0, 1, …, N – 1, where N denotes the number of values in the time series). In this chapter we introduce and motivate wavelets via the CWT.
Here we introduce the discrete wavelet transform (DWT), which is the basic tool needed for studying time series via wavelets and plays a role analogous to that of the discrete Fourier transform in spectral analysis. We assume only that the reader is familiar with the basic ideas from linear filtering theory and linear algebra presented in Chapters 2 and 3. Our exposition builds slowly upon these ideas and hence is more detailed than necessary for readers with strong backgrounds in these areas. We encourage such readers just to use the Key Facts and Definitions in each section or to skip directly to Section 4.12 – this has a concise self-contained development of the DWT. For complementary introductions to the DWT, see Strang (1989, 1993), Rioul and Vetterli (1991), Press et al. (1992) and Mulcahy (1996).
The remainder of this chapter is organized as follows. Section 4.1 gives a qualitative description of the DWT using primarily the Haar and D(4) wavelets as examples. The formal mathematical development of the DWT begins in Section 4.2, which defines the wavelet filter and discusses some basic conditions that a filter must satisfy to qualify as a wavelet filter. Section 4.3 presents the scaling filter, which is constructed in a simple manner from the wavelet filter. The wavelet and scaling filters are used in parallel to define the pyramid algorithm for computing (and precisely defining) the DWT – various aspects of this algorithm are presented in Sections 4.4, 4.5 and 4.6.
The last decade has seen an explosion of interest in wavelets, a subject area that has coalesced from roots in mathematics, physics, electrical engineering and other disciplines. As a result, wavelet methodology has had a significant impact in areas as diverse as differential equations, image processing and statistics. This book is an introduction to wavelets and their application in the analysis of discrete time series typical of those acquired in the physical sciences. While we present a thorough introduction to the basic theory behind the discrete wavelet transform (DWT), our goal is to bridge the gap between theory and practice by
• emphasizing what the DWT actually means in practical terms;
• showing how the DWT can be used to create informative descriptive statistics for time series analysts;
• discussing how stochastic models can be used to assess the statistical properties of quantities computed from the DWT; and
• presenting substantive examples of wavelet analysis of time series representative of those encountered in the physical sciences.
To date, most books on wavelets describe them in terms of continuous functions and often introduce the reader to a plethora of different types of wavelets. We concentrate on developing wavelet methods in discrete time via standard filtering and matrix transformation ideas.
The continuous time wavelet transform is becoming a well-established tool for multiple scale representation of a continuous time ‘signal,’ which by definition is a finite energy function defined over the entire real axis. This transform essentially correlates a signal with ‘stretched’ versions of a wavelet function (in essence a continuous time band-pass filter) and yields a multiresolution representation of the signal. In this chapter we summarize the important ideas and results for the multiresolution view of the continuous time wavelet transform. Our primary intent is to demonstrate the close relationship between continuous time wavelet analysis and the discrete time wavelet analysis presented in Chapter 4. To make this connection, we adopt a formalism that allows us to bridge the gap between the inner product convention used in mathematical discussions on wavelets and the filtering convention favored by engineers. For simplicity we deal only with signals, scaling functions and wavelet functions that are all taken to be real-valued. Only the case of dyadic wavelet analysis (where the scaling factor in the dilation of the basis function takes the value of two) is considered here.
As we saw in Chapters 4 and 5, one important use for the discrete wavelet transform (DWT) and its variant, the maximal overlap DWT (MODWT), is to decompose the sample variance of a time series on a scale-by-scale basis. In this chapter we explore wavelet-based analysis of variance (ANOVA) in more depth by defining a theoretical quantity known as the wavelet variance (sometimes called the wavelet spectrum). This theoretical variance can be readily estimated based upon the DWT or MODWT and has been successfully used in a number of applications; see, for example, Gamage (1990), Bradshaw and Spies (1992), Flandrin (1992), Gao and Li (1993), Hudgins et al. (1993), Kumar and Foufoula-Georgiou (1993, 1997), Tewfik et al. (1993), Wornell (1993), Scargle (1997), Torrence and Compo (1998) and Carmona et al. (1998). The definition for the wavelet variance and rationales for considering it are given in Section 8.1, after which we discuss a few of its basic properties in Section 8.2. We consider in Section 8.3 how to estimate the wavelet variance given a time series that can be regarded as a realization of a portion of length N of a stochastic process with stationary backward differences. We investigate the large sample statistical properties of wavelet variance estimators and discuss methods for determining an approximate confidence interval for the true wavelet variance based upon the estimated wavelet variance (Section 8.4).
In Chapter 4 we discussed the discrete wavelet transform (DWT), which essentially decomposes a time series X into coefficients that can be associated with different scales and times. We can thus regard the DWT of X as a ‘time/scale’ decomposition. The wavelet coefficients for a given scale $\tau_j \equiv 2^{j-1}$ tell us how localized weighted averages of X vary from one averaging period to the next. The scale $\tau_j$ gives us the effective width in time (i.e., degree of localization) of the weighted averages. Because the DWT can be formulated in terms of filters, we can relate the notion of scale to certain bands of frequencies. The equivalent filter that yields the wavelet coefficients for scale $\tau_j$ is approximately a band-pass filter with a pass-band given by $[1/2^{j+1}, 1/2^j]$. For a sample size $N = 2^J$, the $N - 1$ wavelet coefficients constitute – when taken together – an octave band decomposition of the frequency interval $[1/2^{J+1}, 1/2]$, while the single scaling coefficient is associated with the interval $[0, 1/2^{J+1}]$. Taken as a whole, the DWT coefficients thus decompose the frequency interval [0, 1/2] into adjacent individual intervals.
In this chapter we consider the discrete wavelet packet transform (DWPT), which can be regarded as any one of a collection of orthonormal transforms, each of which can be readily computed using a very simple modification of the pyramid algorithm for the DWT.
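To make the octave-band bookkeeping concrete, the following sketch tabulates, for a sample size of 2^J, each level j together with its scale 2^(j-1), its nominal pass-band [1/2^(j+1), 1/2^j], and the number of wavelet coefficients at that level. The count N/2^j per level is the usual one for a full DWT of a power-of-two series. The function name `dwt_bands` is hypothetical, and the bands are only the nominal (approximate) pass-bands.

```python
def dwt_bands(J):
    """Nominal scale/frequency layout of a full DWT for N = 2**J.

    Returns (rows, scaling_band) where each row is
    (level j, scale 2**(j-1), number of wavelet coefficients, (f_low, f_high)).
    """
    if J < 1:
        raise ValueError("J must be a positive integer")
    N = 2 ** J
    rows = []
    for j in range(1, J + 1):
        scale = 2 ** (j - 1)                 # tau_j
        count = N // 2 ** j                  # coefficients at level j
        band = (1 / 2 ** (j + 1), 1 / 2 ** j)
        rows.append((j, scale, count, band))
    scaling_band = (0.0, 1 / 2 ** (J + 1))   # single scaling coefficient
    return rows, scaling_band

rows, scaling_band = dwt_bands(4)            # N = 16
for j, scale, count, (lo, hi) in rows:
    print(f"level {j}: scale {scale}, {count} coefficients, band [{lo}, {hi}]")
print(f"scaling coefficient: band [{scaling_band[0]}, {scaling_band[1]}]")
```

The coefficient counts sum to N - 1, and the bands tile [0, 1/2] without gaps, which is the decomposition described above.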
The chi-square statistic for testing hypotheses concerning multinomial distributions derives its name from the asymptotic approximation to its distribution. Two important applications are the testing of independence in a two-way classification and the testing of goodness-of-fit. In the second application the multinomial distribution is created artificially by grouping the data, and the asymptotic chi-square approximation may be lost if the original data are used to estimate nuisance parameters.
Quadratic Forms in Normal Vectors
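The chi-square distribution with k degrees of freedom is (by definition) the distribution of $\sum_{i=1}^k Z_i^2$ for i.i.d. N(0, 1)-distributed variables $Z_1, \ldots, Z_k$. The sum of squares is the squared norm $\|Z\|^2$ of the standard normal vector $Z = (Z_1, \ldots, Z_k)$. The following lemma gives a characterization of the distribution of the norm of a general zero-mean normal vector.
17.1 Lemma. If the vector X is $N_k(0, \Sigma)$-distributed, then $\|X\|^2$ is distributed as $\sum_{i=1}^k \lambda_i Z_i^2$ for i.i.d. N(0, 1)-distributed variables $Z_1, \ldots, Z_k$ and $\lambda_1, \ldots, \lambda_k$ the eigenvalues of $\Sigma$.
Proof. There exists an orthogonal matrix $O$ such that $O \Sigma O^T = \mathrm{diag}(\lambda_i)$. Then the vector $OX$ is $N_k(0, \mathrm{diag}(\lambda_i))$-distributed, which is the same as the distribution of the vector $(\sqrt{\lambda_1} Z_1, \ldots, \sqrt{\lambda_k} Z_k)$. Now $\|X\|^2 = \|OX\|^2$ has the same distribution as $\sum_{i=1}^k \lambda_i Z_i^2$.
The distribution of a quadratic form of the type $\sum_{i=1}^k \lambda_i Z_i^2$ is complicated in general. However, in the case that every $\lambda_i$ is either 0 or 1, it reduces to a chi-square distribution. If this is not naturally the case in an application, then a statistic is often transformed to achieve this desirable situation. The definition of the Pearson statistic illustrates this.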
Pearson Statistic
Suppose that we observe a vector $X_n = (X_{n,1}, \ldots, X_{n,k})$ with the multinomial distribution corresponding to $n$ trials and $k$ classes having probabilities $p = (p_1, \ldots, p_k)$. The Pearson statistic for testing the null hypothesis $H_0: p = a$ is given by
$$C_n(a) = \sum_{i=1}^k \frac{(X_{n,i} - n a_i)^2}{n a_i}.$$
We shall show that the sequence $C_n(a)$ converges in distribution to a chi-square distribution if the null hypothesis is true. The practical relevance is that we can use the chi-square table to find critical values for the test.
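As a small numerical illustration, the Pearson statistic can be computed directly from observed class counts and hypothesized class probabilities as the sum over classes of (observed - expected)^2 / expected, with expected count n times the null probability. The sketch below assumes this standard form; the function name `pearson_statistic` and the die-roll counts are invented for the example.

```python
def pearson_statistic(counts, null_probs):
    """Pearson chi-square statistic for multinomial counts.

    counts     : observed counts per class (they sum to the number of trials n)
    null_probs : class probabilities under the null hypothesis (sum to 1)
    """
    if len(counts) != len(null_probs):
        raise ValueError("counts and null_probs must have the same length")
    if any(a <= 0 for a in null_probs):
        raise ValueError("null probabilities must be positive")
    n = sum(counts)
    return sum((x - n * a) ** 2 / (n * a) for x, a in zip(counts, null_probs))

# Hypothetical data: 120 rolls of a die, testing that all faces are equally likely.
counts = [18, 22, 20, 25, 15, 20]
stat = pearson_statistic(counts, [1 / 6] * 6)
print(round(stat, 4))
```

For these made-up counts the expected count is 20 per face and the statistic is 58/20 = 2.9. Under the usual chi-square approximation with k - 1 = 5 degrees of freedom, this is well below the tabulated 5% critical value of about 11.07, so the null hypothesis would not be rejected.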
In this chapter we derive the asymptotic distribution of estimators of quantiles from the asymptotic distribution of the corresponding estimators of a distribution function. Empirical quantiles are an example, and hence we also discuss some results concerning order statistics. Furthermore, we discuss the asymptotics of the median absolute deviation, which is the empirical 1/2-quantile of the observations centered at their 1/2-quantile.
Weak Consistency
The quantile function of a cumulative distribution function $F$ is the generalized inverse $F^{-1}: (0, 1) \to \mathbb{R}$ given by
$$F^{-1}(p) = \inf\{x : F(x) \ge p\}.$$
It is a left-continuous function with range equal to the support of F and hence is often unbounded. The following lemma records some useful properties.
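Proof. The proofs of the inequalities in (i) through (iv) are best given by a picture. The equalities (v) follow from (ii) and (iv) and the monotonicity of $F$ and $F^{-1}$. If $p = F(x)$ for some $x$, then, by (ii), $p \le F \circ F^{-1}(p) = F \circ F^{-1} \circ F(x) \le F(x) = p$, by (iv). This proves the first statement in (ii); the second is immediate from the inequalities in (ii) and (iii). Statement (vi) follows from (i) and the definition of $(F \circ G)^{-1}$.
Consequences of (ii) and (iv) are that $F \circ F^{-1}(p) \equiv p$ on (0, 1) if and only if $F$ is continuous (i.e., has range [0, 1]), and $F^{-1} \circ F(x) \equiv x$ on $\mathbb{R}$ if and only if $F$ is strictly increasing (i.e., has no ‘flats’). Thus $F^{-1}$ is a proper inverse if and only if $F$ is both continuous and strictly increasing, as one would expect.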
By (i) the random variable $F^{-1}(U)$ has distribution function $F$ if $U$ is uniformly distributed on [0, 1]. This is called the quantile transformation. On the other hand, by (i) and (ii) the variable $F(X)$ is uniformly distributed on [0, 1] if and only if $X$ has a continuous distribution function $F$. This is called the probability integral transformation.
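The quantile transformation is the basis of inverse-transform sampling, and a short sketch may help fix ideas. The example below assumes an exponential distribution with rate `rate` (chosen only because its quantile function has a closed form, -log(1 - p)/rate) and also computes the empirical quantile of a sample as the generalized inverse inf{x : F_n(x) >= p} of the empirical distribution function F_n. All function names are hypothetical.

```python
import math
import random

def exp_cdf(x, rate):
    """Distribution function of the exponential distribution."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

def exp_quantile(p, rate):
    """Quantile function (inverse of exp_cdf) for 0 <= p < 1."""
    return -math.log(1.0 - p) / rate

def sample_by_quantile_transform(n, rate):
    """Draw n exponential variates by applying the quantile function to uniforms."""
    # random.random() lies in [0, 1), so 1 - u is never zero.
    return [exp_quantile(random.random(), rate) for _ in range(n)]

def empirical_quantile(sample, p):
    """Generalized inverse of the empirical distribution function.

    Returns the smallest order statistic x with F_n(x) >= p, for 0 < p <= 1.
    Note: n * p is computed in floating point, so p values that are not
    exactly representable can land one order statistic too high.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    xs = sorted(sample)
    k = math.ceil(len(xs) * p)   # smallest k with k/n >= p
    return xs[k - 1]

random.seed(0)
draws = sample_by_quantile_transform(20000, rate=1.0)
print(empirical_quantile(draws, 0.5), math.log(2))  # sample vs. true median
```

The sample median should be close to log 2, the true median of the rate-1 exponential distribution, which is what the quantile transformation predicts.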
A sequence of quantile functions is defined to converge weakly to a limit quantile function, denoted $F_n^{-1} \rightsquigarrow F^{-1}$, if and only if $F_n^{-1}(t) \to F^{-1}(t)$ at every $t$ where $F^{-1}$ is continuous. This type of convergence is not only analogous in form to the weak convergence of distribution functions, it is the same.