Many real-world problems can be described by models that extend the classical linear Gaussian dynamical system with (unobserved) discrete regime indicators. In such extended models the discrete indicators dictate what transition and observation model the process follows at a particular time. The problems of tracking and estimation in models with manoeuvring targets [1], multiple targets [25], non-Gaussian disturbances [15], unknown model parameters [9], failing sensors [20] and different trends [8] are all examples of problems that have been formulated in a conditionally Gaussian state space model framework. Since the extended model is so general it has been invented and re-invented many times in multiple fields, and is known by many different names, such as switching linear dynamical system, conditionally Gaussian state space model, switching Kalman filter model and hybrid model.
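The generative process just described, in which a discrete indicator selects the active transition and observation model at each time step, can be sketched in a few lines. This is a minimal illustration, not the model of any particular cited application; the dimensions, regime parameters and transition probabilities are all chosen arbitrarily for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-regime switching linear dynamical system: the discrete
# indicator s_t selects which linear-Gaussian dynamics are active at time t.
A = [np.array([[1.0]]), np.array([[0.5]])]   # per-regime transition matrices
Q = [0.1, 1.0]                               # per-regime transition noise variances
C = np.array([[1.0]])                        # shared observation matrix
R = 0.2                                      # observation noise variance
P = np.array([[0.95, 0.05],                  # regime transition probabilities
              [0.10, 0.90]])

def simulate(T):
    s, x = 0, np.zeros(1)
    states, obs = [], []
    for _ in range(T):
        s = rng.choice(2, p=P[s])                        # sample next regime
        x = A[s] @ x + rng.normal(0, np.sqrt(Q[s]), 1)   # regime-specific dynamics
        y = C @ x + rng.normal(0, np.sqrt(R), 1)         # noisy observation
        states.append(s)
        obs.append(y[0])
    return states, obs

states, obs = simulate(100)
```

Simulating from the model is easy; as discussed next, it is inference over the latent indicators and states that is hard.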
Although the extended model is highly expressive, it is notorious for the intractability of exact posterior estimation. In general, exact filtered, smoothed or predicted posteriors have a complexity exponential in the number of observations. Even when only marginals on the indicator variables are required, the problem remains NP-hard [19].
In this chapter we introduce a deterministic approximation scheme that is particularly suited to finding smoothed one- and two-time-slice posteriors. It can be seen as a symmetric backward pass and iteration scheme for previously proposed assumed density filtering approaches [9].
The chapter is organised as follows. In Section 7.2 we present the general model; variants where only the transition or only the observation model switches, or where states or observations are multi- or univariate, can be treated as special cases.
Markov processes are probabilistic models for describing data with a sequential structure. Probably the most common example is a dynamical system, whose state evolves over time. For modelling purposes it is often convenient to assume that the system states are not directly observed: each observation is a possibly incomplete, non-linear and noisy measurement (or transformation) of the underlying hidden state. In general, observations of the system occur only at discrete times, while the underlying system is inherently continuous in time. Continuous-time Markov processes arise in a variety of scientific areas such as physics, environmental modelling, finance, engineering and systems biology.
The continuous-time evolution of the system imposes strong constraints on the model dynamics. For example, the individual trajectories of a diffusion process are rough, but the mean trajectory is a smooth function of time. Unfortunately, this information is often underexploited, or not exploited at all, when devising practical systems. The main reason is that inferring the state trajectories and the model parameters is a difficult problem, as trajectories are infinite-dimensional objects. Hence, a practical approach usually requires some sort of approximation. For example, Markov chain Monte Carlo (MCMC) methods usually discretise time [41, 16, 34, 2, 20], while particle filters approximate continuous densities by a finite number of point masses [13, 14, 15]. More recently, approaches using perfect simulation have been proposed [7, 8, 18]. The main advantage of these MCMC techniques is that they do not require approximations of the transition density using time discretisations.
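The time discretisation mentioned above is most commonly carried out with the Euler–Maruyama scheme, which replaces the stochastic differential equation dx = f(x) dt + g(x) dW by a discrete-time recursion with Gaussian increments. The sketch below applies it to an Ornstein–Uhlenbeck process; the drift and diffusion functions and all parameter values are illustrative choices, not taken from the cited works.

```python
import numpy as np

rng = np.random.default_rng(1)

# Euler-Maruyama discretisation of a diffusion dx = f(x) dt + g(x) dW.
# Illustrative choice: an Ornstein-Uhlenbeck (mean-reverting) process.
theta, sigma = 2.0, 0.5
f = lambda x: -theta * x   # drift pulls the state back towards zero
g = lambda x: sigma        # constant diffusion coefficient

def euler_maruyama(x0, dt, n_steps):
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))               # Brownian increment
        x[k + 1] = x[k] + f(x[k]) * dt + g(x[k]) * dW   # one Euler-Maruyama step
    return x

path = euler_maruyama(x0=1.0, dt=0.01, n_steps=1000)
```

A single simulated path is rough, in line with the remark above, while the mean trajectory (here x0 e^{-theta t}) decays smoothly; the approximation error of the scheme vanishes as the step size dt shrinks.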
In nature there are many examples of group behaviour arising from the actions of individuals without any apparent central coordinator, such as the highly coordinated movements of flocks of birds or schools of fish. These are among the most fascinating phenomena to be found in nature: the groups seem to turn and manoeuvre as a single unit, changing direction almost instantaneously. Similarly, in man-made activities, there are many cases of group-like behaviour, such as a group of aircraft flying in formation.
There are two principal reasons why it is very helpful to model the behaviour of groups explicitly, as opposed to treating all objects independently as in most multiple target tracking approaches. The first is that the joint tracking of (a priori) dependent objects within a group will lead to greater detection and tracking ability in hostile environments with high noise and low detection probabilities. For example, in the radar target tracking application, if several targets are in a group formation, then some information on the positions and speeds of those targets with missing measurements (due to poor detection probability) can be inferred given those targets that are detected. Similarly, if a newly detected target appears close to an existing group, the target can be initialised using the group velocity.
Many time series are characterised by abrupt changes in structure, such as sudden jumps in level or volatility. We consider changepoints to be those time points which divide a dataset into distinct homogeneous segments. In practice the number of changepoints will not be known. The ability to detect changepoints is important for both methodological and practical reasons including: the validation of an untested scientific hypothesis [27]; monitoring and assessment of safety critical processes [14]; and the validation of modelling assumptions [21].
The development of inference methods for changepoint problems is by no means a recent phenomenon, with early works including [39], [45] and [28]. Increasingly the ability to detect changepoints quickly and accurately is of interest to a wide range of disciplines. Recent examples of application areas include numerous bioinformatic applications [37, 15], the detection of malware within software [51], network traffic analysis [35], finance [46], climatology [32] and oceanography [34].
In this chapter we describe and compare a number of different approaches for estimating changepoints. For a more general overview of changepoint methods, we refer interested readers to [8] and [11]. The structure of this chapter is as follows. First we introduce the model we focus on. We then describe methods for detecting a single changepoint and methods for detecting multiple changepoints, covering both frequentist and Bayesian approaches.
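For the single-changepoint case, the simplest frequentist approach is to scan over all candidate split points and pick the one that maximises the likelihood of fitting separate segment parameters on each side. The sketch below does this for a change in the mean of Gaussian data with known unit variance; it is a generic illustration of the idea, not a reproduction of any specific method from this chapter, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)

def single_changepoint(y):
    """Most likely single change in mean, assuming Gaussian data with
    known unit variance: maximise the split log-likelihood over tau."""
    n = len(y)
    best_tau, best_ll = None, -np.inf
    for tau in range(1, n):
        left, right = y[:tau], y[tau:]
        # Log-likelihood (up to constants) when each segment gets its own mean
        ll = -0.5 * (np.sum((left - left.mean()) ** 2)
                     + np.sum((right - right.mean()) ** 2))
        if ll > best_ll:
            best_tau, best_ll = tau, ll
    return best_tau

# Synthetic series with an abrupt jump in level at t = 50
y = np.concatenate([rng.normal(0, 1, 50), rng.normal(5, 1, 50)])
tau = single_changepoint(y)
```

In practice one would compare the maximised likelihood-ratio statistic against a threshold to decide whether a change occurred at all, and extend the search recursively or by dynamic programming for multiple changepoints.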
A common way to handle non-linearity in complex time series data is to try splitting the data up into a number of simpler segments. Sometimes we have domain knowledge to support this piecewise modelling approach, for example in condition monitoring applications. In such problems, the evolution of some observed data is governed by a number of hidden factors that switch between different modes of operation. In real-world data, e.g. from medicine, robotic control or finance, we might be interested in factors which represent pathologies, mechanical failure modes, or economic conditions respectively. Given just the monitoring data, we are interested in recovering the state of the factors that gave rise to it.
A good model for this type of problem is the switching linear dynamical system (SLDS), which has been discussed in previous chapters. A latent ‘switch’ variable in this type of model selects between different linear-Gaussian state spaces. In this chapter we consider a generalisation, the factorial switching linear dynamical system (FSLDS), where instead of a single switch setting there are multiple discrete factors that collectively determine the dynamics. In practice there may be a very large number of possible factors, and we may only have explicit knowledge of commonly occurring ones.
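The combinatorial consequence of factorial switching is worth making concrete: the effective switch setting is the cross-product of the individual factors' settings, so the number of combined regimes grows multiplicatively in the factors. The factor names and cardinalities below are hypothetical illustrations, not the factors used in the monitoring application discussed in this chapter.

```python
import itertools
import numpy as np

# In an FSLDS the effective switch setting is the joint configuration of
# all discrete factors; each joint setting indexes its own dynamics.
# These factor names and cardinalities are made up for illustration.
factors = {"pathology": 2, "sensor_fault": 2, "handling": 3}

# Enumerate every joint switch setting
joint_settings = list(itertools.product(*(range(k) for k in factors.values())))
n_regimes = len(joint_settings)   # 2 * 2 * 3 = 12 combined regimes

# Each joint setting would index its own transition model, e.g.
dynamics = {s: np.eye(2) for s in joint_settings}  # placeholder A matrices
```

This multiplicative growth is exactly why explicit knowledge of every factor is unrealistic in practice, motivating the treatment of 'unknown' factors mentioned below.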
We illustrate how the FSLDS can be used in the physiological monitoring of premature babies in intensive care. This application is a useful introduction because it has complex observed data, a diverse range of factors affecting the observations, and the challenge of many ‘unknown’ factors.
Time series are studied in a variety of disciplines and appear in many modern applications such as financial time series prediction, video-tracking, music analysis, control and genetic sequence analysis. This widespread interest at times obscures the commonalities in the developed models and techniques. A central aim of this book is to attempt to make modern time series techniques, specifically those based on probabilistic modelling, accessible to a broad range of researchers.
In order to achieve this goal, leading researchers who span the more traditional disciplines of statistics, control theory, engineering and signal processing, and the more recent areas of machine learning and pattern recognition, have been brought together to discuss advancements and developments in their respective fields. In addition, the book makes extensive use of the graphical models framework. This framework facilitates the representation of many classical models and provides insight into the computational complexity of their implementation. Furthermore, it makes it easy to envisage new models tailored to a particular environment. For example, the book discusses novel state space models and their application in signal processing, including condition monitoring and tracking. The book also describes modern developments in the machine learning community applied to more traditional areas of control theory.
The effective application of probabilistic models in the real world is gaining pace, largely through increased computational power which brings more general models into consideration through carefully developed implementations.
Sensor networks have recently generated a great deal of research interest within the computer and physical sciences, and their use for the scientific monitoring of remote and hostile environments is increasingly commonplace. While early sensor networks were a simple evolution of existing automated data loggers that collected data for later offline scientific analysis, more recent sensor networks typically make current data available through the Internet, and thus are increasingly being used for the real-time monitoring of environmental events such as floods or storms (see [10] for a review of such environmental sensor networks).
Using real-time sensor data in this manner presents many novel challenges. However, more significantly for us, many of the information processing tasks that would previously have been performed offline by the owner or single user of an environmental sensor network (such as detecting faulty sensors, fusing noisy measurements from several sensors, and deciding how frequently readings should be taken), must now be performed in real-time on the mobile computers and PDAs carried by the multiple different users of the system (who may have different goals and may be using sensor readings for very different tasks). Importantly, it may also be necessary to use the trends and correlations observed in previous data to predict the value of environmental parameters into the future, or to predict the reading of a sensor that is temporarily unavailable (e.g. due to network outages).
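One simple way to exploit observed correlations when a sensor is temporarily unavailable is to fit a jointly Gaussian model to historical readings from correlated sensors and predict the missing reading by Gaussian conditioning. This is a generic sketch of the idea, not the method developed in this chapter; the two-sensor setup, the synthetic data and the correlation strength are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic historical readings from two correlated sensors
# (e.g. two nearby temperature probes); the coupling is illustrative.
n = 500
s1 = rng.normal(20.0, 2.0, n)             # sensor 1 readings
s2 = 0.8 * s1 + rng.normal(0.0, 1.0, n)   # sensor 2, correlated with sensor 1

# Fit a joint Gaussian model to the historical data
mu = np.array([s1.mean(), s2.mean()])
cov = np.cov(np.vstack([s1, s2]))

def predict_missing(observed_s1, mu, cov):
    """Conditional mean E[s2 | s1] under the fitted joint Gaussian."""
    return mu[1] + cov[1, 0] / cov[0, 0] * (observed_s1 - mu[0])

# Sensor 2 is offline; predict its reading from sensor 1's current value
pred = predict_missing(22.0, mu, cov)
```

The same conditioning operation, applied to a richer covariance model over space and time, underlies more sophisticated approaches to fusing noisy measurements and forecasting future values.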
Optimising a sequence of actions to attain some future goal is the general topic of control theory [26, 9]. It views an agent as an automaton that seeks to maximise expected reward (or minimise cost) over some future time period. Two typical examples that illustrate this are motor control and foraging for food.
As an example of a motor control task, consider a human throwing a spear to kill an animal. This requires executing a motor program such that, at the moment the hand releases the spear, the spear has the correct speed and direction to hit the desired target. A motor program is a sequence of actions, and this sequence can be assigned a cost that generally consists of two terms: a path cost that specifies the energy consumed by contracting the muscles to execute the motor program, and an end cost that specifies whether the spear will kill the animal, just hurt it, or miss it altogether. The optimal control solution is a sequence of motor commands that results in killing the animal by throwing the spear with minimal physical effort. If x denotes the state space (the positions and velocities of the muscles), the optimal control solution is a function u(x, t) that depends both on the actual state of the system at each time t and also explicitly on time.
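The path-cost-plus-end-cost structure described above has a classical tractable instance: linear dynamics with quadratic costs (the finite-horizon LQR problem), where the optimal u(x, t) is a time-varying linear feedback law computed by a backward Riccati recursion. The scalar example below is a minimal sketch of that standard computation, with all numbers chosen for illustration.

```python
# Finite-horizon discrete-time LQR: minimise the sum of path costs
# Q*x^2 + R*u^2 plus an end cost Qf*x^2, for dynamics x_{t+1} = A x + B u.
# Scalars keep the algebra visible; all values are illustrative.
A, B = 1.0, 1.0
Q, R, Qf = 1.0, 1.0, 10.0
T = 20

# Backward Riccati recursion: cost-to-go S_t and feedback gains K_t
S = Qf
gains = []
for t in reversed(range(T)):
    K = (B * S * A) / (R + B * S * B)   # optimal feedback gain at time t
    S = Q + A * S * A - A * S * B * K   # updated cost-to-go coefficient
    gains.append(K)
gains.reverse()

# The optimal control is time-varying linear feedback: u(x, t) = -K_t x
x = 5.0
for t in range(T):
    u = -gains[t] * x
    x = A * x + B * u   # the state is driven towards zero
```

The explicit time dependence of u(x, t) appears here as the time-varying gain K_t, which differs from its steady-state value near the end of the horizon, where the end cost dominates.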