Introduction
Early work on ICA [Jutten & Herault, 1991, Comon, 1994, Bell & Sejnowski, 1995] focused on the case where the number of sources equals the dimensionality of the data, the mixing is invertible, the data are noise free, and the source distributions are known in advance. These assumptions are very restrictive, and several authors have proposed ways to relax them (e.g., [Lewicki & Sejnowski, 1998, Lee et al., 1998, Lee et al., 1999b, Attias, 1999a]). This chapter presents one strand of research that aims to handle the blind separation problem in full generality and in a principled manner, by casting blind separation as a problem of learning and inference with probabilistic graphical models.
Graphical models (see [Jordan, 1999] for a review) serve as an increasingly important tool for constructing machine learning algorithms in many fields, including computer science, signal processing, text modelling, molecular biology, and finance. In the graphical model framework, one starts with a statistical parametric model that describes how the observed data are generated. The model has a set of parameters, which in the case of blind separation include, e.g., the mixing matrix and the noise variance; it may also contain hidden variables, e.g., the sources. The machinery of probability theory is then applied to learn the parameters from the dataset. Alongside learning the parameters, the same machinery computes the conditional distributions over the hidden variables given the data.
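To make this recipe concrete, the sketch below instantiates it for a noisy linear mixing model, x = A s + noise, under the simplifying assumption of Gaussian source priors, which reduces the model to factor analysis; all names and dimensions are hypothetical, not taken from this chapter. The E-step computes the posterior distribution over the hidden sources given the data, and the M-step updates the parameters (the mixing matrix A and the diagonal noise variances).

```python
import numpy as np

rng = np.random.default_rng(0)

# Generative model: x_n = A s_n + noise (hypothetical dimensions)
K, D, N = 3, 8, 2000            # sources, observed dims, samples
A_true = rng.normal(size=(D, K))
psi_true = 0.1 * np.ones(D)     # diagonal noise variances
S = rng.normal(size=(N, K))     # Gaussian sources => factor analysis
X = S @ A_true.T + rng.normal(size=(N, D)) * np.sqrt(psi_true)

# EM: learn A and psi; the E-step yields the source posterior
A = rng.normal(size=(D, K))
psi = np.ones(D)
for _ in range(200):
    # E-step: p(s_n | x_n) = N(m_n, V) with shared covariance V
    Pinv_A = A / psi[:, None]                     # Psi^{-1} A
    V = np.linalg.inv(np.eye(K) + A.T @ Pinv_A)   # posterior covariance
    M = X @ Pinv_A @ V                            # posterior means, row per sample
    # M-step: update mixing matrix and noise variances
    S1 = X.T @ M                                  # sum_n x_n m_n^T
    S2 = N * V + M.T @ M                          # sum_n E[s_n s_n^T]
    A = S1 @ np.linalg.inv(S2)
    psi = np.mean(X**2, axis=0) - np.einsum('dk,dk->d', A, S1) / N
    psi = np.maximum(psi, 1e-6)                   # numerical floor

print("learned noise variances:", np.round(psi, 3))
```

Note that with a Gaussian prior the likelihood is invariant to rotations of the source space, so A is identifiable only up to an orthogonal transformation; substituting a non-Gaussian source prior (a common choice in this literature) breaks that symmetry and yields genuine source separation, while the EM structure above carries over.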