Introduction
Early work on ICA [Jutten & Herault, 1991, Comon, 1994, Bell & Sejnowski, 1995] focused on the case where the number of sources equals the dimensionality of the data, the mixing is invertible, the data are noise free, and the source distributions are known in advance. These assumptions are very restrictive, and several authors have proposed ways to relax them (e.g., [Lewicki & Sejnowski, 1998, Lee et al., 1998, Lee et al., 1999b, Attias, 1999a]). This chapter presents one strand of research that aims to handle the blind separation problem in full generality and in a principled manner, by casting blind separation as a problem of learning and inference with probabilistic graphical models.
Graphical models (see [Jordan, 1999] for a review) serve as an increasingly important tool for constructing machine learning algorithms in many fields, including computer science, signal processing, text modelling, molecular biology, and finance. In the graphical model framework, one starts with a statistical parametric model that describes how the observed data are generated. The model has a set of parameters, which in the case of blind separation include, e.g., the mixing matrix and the noise variance; it may also contain hidden variables, e.g., the sources. The machinery of probability theory is then applied to learn the parameters from the dataset. Alongside learning the parameters, the same machinery computes the conditional distributions over the hidden variables given the data.
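To make this recipe concrete, the sketch below instantiates it for a noisy linear mixing model, x = A s + noise, under the simplifying assumption of Gaussian source priors, which reduces the model to factor analysis; all names and dimensions are hypothetical, not taken from this chapter. The E-step computes the posterior distribution over the hidden sources given the data, and the M-step updates the parameters (the mixing matrix A and the diagonal noise variances).

```python
import numpy as np

rng = np.random.default_rng(0)

# Generative model: x_n = A s_n + noise (hypothetical dimensions)
K, D, N = 3, 8, 2000            # sources, observed dims, samples
A_true = rng.normal(size=(D, K))
psi_true = 0.1 * np.ones(D)     # diagonal noise variances
S = rng.normal(size=(N, K))     # Gaussian sources => factor analysis
X = S @ A_true.T + rng.normal(size=(N, D)) * np.sqrt(psi_true)

# EM: learn A and psi; the E-step yields the source posterior
A = rng.normal(size=(D, K))
psi = np.ones(D)
for _ in range(200):
    # E-step: p(s_n | x_n) = N(m_n, V) with shared covariance V
    Pinv_A = A / psi[:, None]                     # Psi^{-1} A
    V = np.linalg.inv(np.eye(K) + A.T @ Pinv_A)   # posterior covariance
    M = X @ Pinv_A @ V                            # posterior means, row per sample
    # M-step: update mixing matrix and noise variances
    S1 = X.T @ M                                  # sum_n x_n m_n^T
    S2 = N * V + M.T @ M                          # sum_n E[s_n s_n^T]
    A = S1 @ np.linalg.inv(S2)
    psi = np.mean(X**2, axis=0) - np.einsum('dk,dk->d', A, S1) / N
    psi = np.maximum(psi, 1e-6)                   # numerical floor

print("learned noise variances:", np.round(psi, 3))
```

Note that with a Gaussian prior the likelihood is invariant to rotations of the source space, so A is identifiable only up to an orthogonal transformation; substituting a non-Gaussian source prior (a common choice in this literature) breaks that symmetry and yields genuine source separation, while the EM structure above carries over.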