Wavelets are everywhere nowadays. Be it in signal or image processing, in astronomy, in fluid dynamics (turbulence), in condensed matter physics, wavelets have found applications in almost every corner of physics. In addition, wavelet methods have become standard in applied mathematics, numerical analysis, approximation theory, etc. It is hardly possible to attend a conference on any of these fields without encountering several contributions dealing with them. Correspondingly, hundreds of papers appear every year and new books on the topic get published at a sustained pace, with publishers strongly competing with each other. So, why bother to publish an additional one?
The answer lies in the finer distinction between various types of wavelet transforms. There is, indeed, a crucial difference between two approaches, namely, the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT). Furthermore, one has to distinguish between problems in one dimension (signal analysis) and problems in two dimensions (image processing), since the status of the literature is very different in the two cases.
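To make the continuous side of this distinction concrete, the following is a minimal sketch of a one-dimensional CWT computed by direct convolution with dilated copies of a wavelet. The Mexican hat wavelet, the 1/sqrt(a) normalisation, the truncation of the wavelet at five scale lengths and the function names are all assumptions chosen for illustration; they are not taken from the text.

```python
import numpy as np

def mexican_hat(t):
    """Mexican hat wavelet (second derivative of a Gaussian, up to sign and scale)."""
    return (1.0 - t**2) * np.exp(-0.5 * t**2)

def cwt(signal, scales, dt=1.0):
    """Direct 1-D CWT sketch; returns an array of shape (len(scales), len(signal))."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    out = np.empty((len(scales), n))
    for i, a in enumerate(scales):
        # Sample the dilated wavelet on a symmetric, odd-length grid that is
        # never longer than the signal, so mode="same" stays centred.
        half = min(int(np.ceil(5.0 * a / dt)), (n - 1) // 2)
        t = np.arange(-half, half + 1) * dt
        psi = mexican_hat(t / a) / np.sqrt(a)   # assumed L2-style normalisation
        # The wavelet is real and even, so convolution equals correlation here.
        out[i] = np.convolve(signal, psi, mode="same") * dt
    return out
```

Each row of the result corresponds to one scale and each column to one position, so the scale parameter can be sampled as finely as desired, in contrast with the fixed dyadic grid of a typical DWT.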
Take first the one-dimensional case. Beginning with the classic textbook of Ingrid Daubechies [Dau92], several books, such as those of M. Holschneider [Hol95], B. Torrésani [Tor95] or A. Arnéodo et al. [Arn95], cover the continuous wavelet transform, in a more or less mathematically oriented approach.
We live in a world where objects (cars, animals, men, birds, aeroplanes, the Sun, etc.) that surround us are constantly in relative motion. One would like to extract the motion information from the observation of the scene and use it for various purposes, such as detection, tracking and identification. In particular, tracking of multiple objects is of great importance in many real-world scenarios. Examples include traffic monitoring, autonomous vehicle navigation, and tracking of ballistic missile warheads. Tracking is a complex problem that often requires estimating motion parameters, such as position and velocity, under very challenging conditions. Algorithms of this type typically have difficulty in the presence of noise, when the object is obscured, when trajectories cross, and when highly maneuvering objects are present.
Most motion estimation (ME) techniques such as the ones based on block matching, optical flow, and phase difference [Jah97,280,281] assume that the object is constant from frame to frame. That is, the signature of the object does not change with time. Consequently, these techniques tend to have difficulty handling complex motion, particularly when noise is present.
The time-dependent continuous wavelet transform (CWT) is attractive as a tool for analysis, in that important motion parameters can be compactly and clearly represented.
In the previous chapters, we have thoroughly discussed the 2-D CWT and some of its applications. Then we have made the connection with the group theoretical origins of the method, thus establishing a general framework, based on the coherent state formalism. In the present chapter, we will apply the same technique to a number of different situations involving higher dimensions: wavelets in 3-D space ℝ³, wavelets in ℝⁿ (n > 3), and wavelets on the 2-sphere S². Then, in the next chapter, we will treat time-dependent wavelets, that is, wavelets on space–time, designed for motion analysis.
In all cases, the technique is the same. First one identifies the manifold on which the signals are defined and the appropriate group of transformations acting on the latter. Next one chooses a square integrable representation of that group, possibly modulo some subgroup. Then one constructs wavelets as admissible vectors and derives the corresponding wavelet transform.
Three-dimensional wavelets
Some physical phenomena are intrinsically multiscale and three-dimensional. Typical examples may be found in fluid dynamics, for instance the appearance of coherent structures in turbulent flows, or the disentangling of a wave train in (mostly underwater) acoustics, as discussed above. In such cases, a 3-D wavelet analysis is clearly more adequate and likely to yield a deeper understanding [56].
In Chapters 1 and 2, we have studied systematically the continuous wavelet transform in one and two dimensions, respectively. As already emphasized there, the properties of the transforms in the two cases are remarkably similar. In 2-D we have formalized them in the three propositions 2.2.1, 2.2.2 and 2.2.3, and essentially the same statements may be made in 1-D. A moment's reflection shows that one could write out, without difficulty, an entirely parallel mathematical description in any dimension n ≥ 1. Clearly there must be some unifying principle underlying the picture. The question is, of course, what is this principle? As so often in such situations, the answer is to be found in group representation theory, i.e., by looking at the underlying geometry of the space of signals. The various transformations (translation, rotation, zoom, etc.) that a signal may undergo determine a set of mathematical symmetries which, interestingly enough, can be expressed in simple matrix terms. As will be made clear in the following, the signal space itself, as a mathematical object, emerges as a consequence of this geometry.
But we have been using group theory all along! Indeed, to draw on a literary analogy, like Molière's Monsieur Jourdain speaking in prose without knowing so, we have been using group-theoretical language throughout our analysis! It is the aim of the present chapter to demonstrate this fact.
The last chapter has already familiarized us with the use of group theoretical methods for the construction and analysis of wavelets and gaborettes. We aim in this chapter to first indicate the general applicability of these techniques and then to look at the case of the two-dimensional continuous transform, using the SIM(2) group. Later, we look at general matrix groups of the type that can be used for constructing other types of wavelet transforms in two dimensions. We shall be led, in this manner, to studying a class of semidirect product type groups, certain coadjoint orbits of which are isomorphic to the group itself. In all these cases, the common features of such a matrix-group analysis will be: (a) the group will refer to a set of possible symmetry transformations which the signal may undergo; (b) the space over which the signals are defined (as L²-functions) is intrinsic to the group; (c) the parameters in terms of which the wavelet transform is expressed are the parameters of the group itself, i.e., symmetry parameters of the signal, and (d) these parameter spaces, which arise as coadjoint orbits of the group, are also identifiable with phase spaces of signals.
Referring back to the 2-D wavelet transform introduced in Chapter 2, we shall see that this transform is again related to a square integrable representation of a matrix group.
Segmentation of speech signals based on fractal dimension
Computer speech recognition is an important subject that has been studied for many years. Until relatively recently, classical mathematics and signal processing techniques have played a major role in the development of speech recognition systems. This includes the use of frequency-time analysis, the Wigner transform, applications of wavelets and a wide range of artificial neural network paradigms. Relatively little attention has been paid to the application of random scaling fractals to speech recognition. The fractal characterization of speech waveforms was first reported by Pickover and Al Khorasani [1], who investigated the self-affinity and fractal dimension for human speech in general. They found a fractal dimension of 1.66 using Hurst analysis (see e.g. [2]). In the present chapter, we investigate the use of fractal-dimension segmentation for feature extraction and recognition of isolated words. We shall start with a few preliminaries that relate to speech recognition techniques in general.
Speech recognition techniques
Speech recognition systems are based on digitizing an appropriate waveform from which useful data is then extracted using appropriate pre-processing techniques. After that, the data is processed to obtain a signature or representation of the speech signal. This signature is ideally a highly compressed form of the original data that represents the speech signal uniquely and unambiguously. The signature is then matched against previously created signatures (templates), each obtained by averaging a set of such signatures for a particular word.
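The template step described above can be sketched as follows. This is only an illustration under stated assumptions: signatures are taken to be fixed-length numeric vectors, the match uses Euclidean distance (the text does not name a distance measure), and the names `build_templates` and `recognise` are hypothetical.

```python
import numpy as np

def build_templates(training):
    """Average the signatures recorded for each word.

    training: dict mapping a word to a list of equal-length signature vectors.
    Returns a dict mapping each word to its mean signature (the template).
    """
    return {word: np.mean(np.asarray(sigs, dtype=float), axis=0)
            for word, sigs in training.items()}

def recognise(signature, templates):
    """Return the word whose template is nearest to the signature.

    Euclidean distance is an assumption made for this sketch.
    """
    sig = np.asarray(signature, dtype=float)
    return min(templates, key=lambda w: np.linalg.norm(sig - templates[w]))
```

A call such as `recognise(new_signature, build_templates(training))` returns the best-matching word; a practical system would also need a rejection threshold for signatures that are far from every template.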
Developing mathematical models to simulate and analyse noise has an important role in digital signal and image processing. Computer generated noise is routinely used to test the robustness of different types of algorithm; it is used for data encryption and even to enhance or amplify signals through ‘stochastic resonance’. Accurate statistical models for noise (e.g. the probability density function or the characteristic function) are particularly important in image restoration using Bayesian estimation [1], maximum-entropy methods for signal and image reconstruction [2] and in the image segmentation of coherent images in which ‘speckle’ (arguably a special type of noise, i.e. coherent Gaussian noise) is a prominent feature [3]. The noise characteristics of a given imaging system often dictate the type of filters that are used to process and analyse the data. Noise simulation is also important in the synthesis of images used in computer graphics and computer animation systems, in which fractal noise has a special place (e.g. [4, 5]).
The application of fractal geometry for modelling naturally occurring signals and images is well known. This is due to the fact that the ‘statistics’ and spectral characteristics of random scaling fractals are consistent with many objects found in nature, a characteristic that is expressed in the term ‘statistical self-affinity’. This term refers to random processes whose statistics are scale invariant. An RSF signal is one whose PDF remains the same irrespective of the scale over which the signal is sampled.
In this chapter we investigate the use of fractal geometry for segmenting digital signals and images. A method of texture segmentation is introduced that is based on the fractal dimension. Using this approach, variations in texture across a signal or image can be characterized in terms of variations in the fractal dimension. By analysing the spatial fluctuations in fractal dimension obtained using a conventional moving-window approach, a digital signal or image can be texture segmented; this is the principle of fractal-dimension segmentation (FDS). In this book, we apply this form of texture segmentation to isolated speech signals.
An overview of methods for computing the fractal dimension is presented, focusing on an approach that makes use of the characteristic power spectral density function (PSDF) of a random scaling fractal (RSF) signal. A more general model for the PSDF of a stochastic signal is then introduced and discussed with reference to texture segmentation.
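A minimal sketch of a spectral estimate of the fractal dimension, combined with a moving window, might look as follows. It assumes that the PSDF of an RSF signal decays as 1/k^β and that, for a one-dimensional signal, the fractal dimension follows from D = (5 − β)/2. That relation is a commonly quoted one and is not stated in the text above; the least-squares fit, the window sizes and all function names are likewise illustrative choices.

```python
import numpy as np

def spectral_exponent(x):
    """Estimate beta in P(k) ~ 1/k**beta by a log-log least-squares fit."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    k = np.arange(1, len(power))                  # skip the k = 0 (DC) term
    # The tiny offset only guards against log(0) for empty bins.
    slope, _ = np.polyfit(np.log(k), np.log(power[1:] + 1e-30), 1)
    return -slope

def fractal_dimension(x):
    """Assumed relation for a 1-D signal: D = (5 - beta) / 2."""
    return (5.0 - spectral_exponent(x)) / 2.0

def fds(x, window=256, step=64):
    """Moving-window profile of the fractal dimension along the signal."""
    x = np.asarray(x, dtype=float)
    starts = range(0, len(x) - window + 1, step)
    return np.array([fractal_dimension(x[s:s + window]) for s in starts])
```

The values returned are raw estimates and are not forced into any range, so a window whose spectrum is far from a power law can yield a value outside the interval expected of a fractal signal.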
We shall apply fractal-dimension segmentation to a number of different speech signals and discuss the results for isolated words and the components (e.g. fricatives) from which these words are composed. In particular, it will be shown that by pre-filtering speech signals with a low-pass filter of the form 1/k, they can be classified into fractal dimensions that lie within the correct range, i.e. [1, 2]. This provides confidence in the approach to speech segmentation considered in this book and, in principle, allows a template-matching scheme to be designed that is based exclusively on FDS.
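The pre-filtering step can be sketched as a Fourier-space multiplication by 1/k. The sketch below assumes that k is the integer DFT bin index and that the DC term, where 1/k is undefined, is set to zero; neither detail is specified in the text, and the function name is hypothetical.

```python
import numpy as np

def prefilter_inverse_k(x):
    """Low-pass filter a real signal by multiplying its spectrum by 1/k.

    k is taken to be the DFT bin index; the k = 0 (DC) term is zeroed,
    which is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    k = np.arange(len(spectrum), dtype=float)
    gain = np.zeros_like(k)
    gain[1:] = 1.0 / k[1:]
    return np.fft.irfft(spectrum * gain, n=len(x))
```

Because the amplitude spectrum is divided by k, the power spectrum is divided by k², so a power-law spectral exponent increases by 2. This is one way to see why such a filter can move the estimated fractal dimension of a noisy speech segment towards the interval [1, 2].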
Modern information security manifests itself in many ways, according to the situation and its requirements. It deals with such concepts as confidentiality, data integrity, access control, identification, authentication and authorization. Practical applications, closely related to information security, are private messaging, electronic money, online services and many others.
Cryptography is the study of mathematical techniques related to aspects of information security. The word is derived from the Greek kryptos, meaning hidden. Cryptography is closely related to the disciplines of cryptanalysis and cryptology. In simple words, cryptanalysis is the art of breaking cryptosystems, i.e. retrieving the original message without knowing the proper key or forging an electronic signature. Cryptology is the mathematics, such as number theory, and the application of formulas and algorithms that underpin cryptography and cryptanalysis.
Cryptology is a branch of mathematical science describing an ideal world. It is the only instrument that allows the application of strict mathematical methods to design a cryptosystem and estimate its theoretical security. However, real security deals with complex systems involving human beings from the real world. Mathematical strength in a cryptographic algorithm is a necessary but not sufficient requirement for a system to be acceptably secure.
Moreover, in the ideal mathematical world, the cryptographic security of an object can be checked only by means of proving its resistance to various kinds of known attack. Practical security does not imply that the system is secure: other, unknown, types of attack may occur.
Speech is and will remain perhaps the most desirable medium of communication between humans. There are several ways of characterizing the communications potential of speech. One highly quantitative approach is in terms of information theory. According to information theory, speech can be represented in terms of its message content, or information. An alternative way of characterizing speech is in terms of the signal carrying the message information, that is the acoustic waveform [1].
The widespread application of speech processing technology required that touch-tone telephones be readily available. The first touch-tone (dual-tone multifrequency, DTMF) telephone was demonstrated at Bell Laboratories in 1958, and deployment in the business and consumer world started in the early 1960s. Since DTMF service was introduced to the commercial and consumer world less than 40 years ago, it can be seen that voice processing has a relatively short history.
Research in speech processing by computer has traditionally been focused on a number of somewhat separable, but overlapping, problem areas. One of these is isolated word recognition, where the signal to be recognized consists of a single word or phrase, delimited by silence, to be identified as a unit without characterization of its internal structure. For this kind of problem, certain traditional pattern recognition techniques can be applied directly.
The computational background to digital signal processing (DSP) involves a number of techniques of numerical analysis. Those techniques which are of particular value are:
solutions to linear systems of equations
finite difference analysis
numerical integration
A large number of DSP algorithms can be written in terms of a matrix equation or a set of matrix equations. Hence, computational methods in linear algebra are an important aspect of the subject. Many DSP algorithms can be classified in terms of a digital filter. Two important classes of digital filter are used in DSP, as follows.
Convolution filters are nonrecursive filters. They use linear processes that operate on the data directly.
Fourier filters operate on data obtained by computing the discrete Fourier transform of a signal. This is accomplished using the fast Fourier transform algorithm.
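As an illustration of the second class, a Fourier filter can be sketched in three steps: transform, multiply by a transfer function, and transform back. The ideal low-pass transfer function and the name `fourier_lowpass` are assumptions made for this example only.

```python
import numpy as np

def fourier_lowpass(x, cutoff_bin):
    """Zero every DFT bin above cutoff_bin and transform back.

    np.fft.rfft and np.fft.irfft compute the DFT of a real signal
    with a fast Fourier transform algorithm.
    """
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    spectrum[cutoff_bin + 1:] = 0.0      # ideal (brick-wall) low-pass, assumed
    return np.fft.irfft(spectrum, n=len(x))
```

A brick-wall response is the simplest choice to write down, but it produces ringing on signals with sharp edges, so smoother transfer functions are often preferred in practice.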
Digital filters
Digital filters fall into two main categories:
real-space filters
Fourier-space filters
Real-space filters
Real-space filters are based on some form of ‘moving window’ principle. A sample of data from a given element of the signal is processed giving (typically) a single output value. The window is then moved on to the next element of the signal and the process repeated. A common real-space filter is the finite impulse response (FIR) filter.
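The moving-window principle can be sketched with a causal FIR filter written as an explicit loop. The zero padding at the start of the signal and the function name are assumptions of this sketch; the loop is written for clarity rather than speed.

```python
import numpy as np

def fir_filter(x, taps):
    """Causal FIR filter: y[n] = sum_j taps[j] * x[n - j], with x[n] = 0 for n < 0."""
    x = np.asarray(x, dtype=float)
    taps = np.asarray(taps, dtype=float)
    m = len(taps)
    padded = np.concatenate([np.zeros(m - 1), x])   # assumed zero initial state
    y = np.empty(len(x))
    for n in range(len(x)):
        window = padded[n:n + m]                    # samples x[n-m+1] .. x[n]
        y[n] = np.dot(window, taps[::-1])           # one output value per window
    return y
```

With `taps = [1/3, 1/3, 1/3]` this is a three-point moving average. The result agrees with `np.convolve(x, taps)[:len(x)]`, which is the usual vectorised way to compute the same output.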
The genesis of the bootstrap is generally attributed to Bradley Efron. In 1977 he wrote the famous Rietz Lecture on the estimation of sampling distributions based on observed data (Efron, 1979a). Since then, a number of outstanding statistical texts, now considered classics, have been written on the topic (Efron, 1982; Hall, 1992; Efron and Tibshirani, 1993; Shao and Tu, 1995), complemented by other interesting monographic exposés (LePage and Billard, 1992; Mammen, 1992; Davison and Hinkley, 1997; Manly, 1997; Barbe and Bertail, 1995; Chernick, 1999).
Efron and Tibshirani (1993) state in the Preface of their book: ‘Our goal in this book is to arm scientists and engineers, as well as statisticians, with computational techniques that they can use to analyze and understand complicated data sets.’ We share the view that Efron and Tibshirani (1993) have written an outstanding book which, unlike other texts on the bootstrap, is more accessible to an engineer. Many colleagues and graduate students of ours prefer to use this text as the major source of knowledge on the bootstrap. We believe, however, that the readership of Efron and Tibshirani (1993) is more likely to be researchers and (post-)graduate students in mathematical statistics than engineers.
To the best of our knowledge there are currently no books or monographs on the bootstrap written for electrical engineers, particularly for signal processing practitioners.
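For readers new to the idea, the basic nonparametric bootstrap for estimating a sampling distribution from observed data can be sketched as below. The function name, the number of resamples and the fixed seed are illustrative assumptions, and the sketch presumes independent, identically distributed observations.

```python
import numpy as np

def bootstrap(data, statistic, n_resamples=2000, seed=0):
    """Approximate the sampling distribution of a statistic.

    Draws n_resamples samples of the same size as the data, with
    replacement, and evaluates the statistic on each of them.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    return np.array([statistic(data[rng.integers(0, n, size=n)])
                     for _ in range(n_resamples)])
```

For instance, `bootstrap(data, np.mean).std()` gives a bootstrap estimate of the standard error of the sample mean without assuming a particular distribution for the data.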