Timeseries require specialised models since the number of variables can be very large and typically increases as new datapoints arrive. In this chapter we discuss models in which the process generating the observed data is fundamentally discrete. This assumption gives rise to classical models with interesting applications in many fields, from finance to speech processing and website ranking.
Markov models
Timeseries are datasets for which the constituent datapoints can be naturally ordered. This order often corresponds to an underlying single physical dimension, typically time, though any other single dimension may be used. The timeseries models we consider are probability models over a collection of random variables v1, …, vT with individual variables vt indexed by discrete time t. A probabilistic timeseries model requires a specification of the joint distribution p(v1, …, vT). For the case in which the observed data vt are discrete, the joint probability table for p(v1, …, vT) has exponentially many entries. We therefore cannot expect to specify all these entries independently and instead need simplified models under which they can be parameterised in a lower-dimensional manner. Such simplifications are at the heart of timeseries modelling and we will discuss some classical models in the following sections.
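As an illustrative sketch (not taken from the text), consider the first-order Markov assumption p(v1, …, vT) = p(v1) ∏t p(vt | vt−1). For a discrete timeseries over K states this replaces the K^T-entry joint table with an initial distribution (K entries) and a K × K transition matrix. The state names and probabilities below are hypothetical:

```python
import numpy as np

# A minimal sketch: under a first-order Markov assumption,
# p(v1,...,vT) = p(v1) * prod_t p(vt | v_{t-1}), so a discrete chain
# over K states needs only K + K*K parameters rather than K**T entries.
# States and numbers are made up for illustration.

states = ["rain", "sun"]                 # hypothetical K = 2 state space
p_init = np.array([0.5, 0.5])            # p(v1)
p_trans = np.array([[0.7, 0.3],          # p(vt | v_{t-1} = rain)
                    [0.4, 0.6]])         # p(vt | v_{t-1} = sun)

def joint_prob(seq):
    """Probability of a state-index sequence under the Markov chain."""
    p = p_init[seq[0]]
    for prev, nxt in zip(seq[:-1], seq[1:]):
        p *= p_trans[prev, nxt]
    return p

def sample(T, rng=np.random.default_rng(0)):
    """Draw a length-T state sequence from the chain."""
    v = [rng.choice(2, p=p_init)]
    for _ in range(T - 1):
        v.append(rng.choice(2, p=p_trans[v[-1]]))
    return v

print(joint_prob([0, 0, 1]))                 # p(rain, rain, sun) = 0.5*0.7*0.3
print([states[i] for i in sample(5)])
```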
In Chapter 3 we saw how belief networks are used to represent statements about independence of variables in a probabilistic model. Belief networks are simply one way to unite probability and graphical representation. Many others exist, all under the general heading of ‘graphical models’. Each has specific strengths and weaknesses. Broadly, graphical models fall into two classes: those useful for modelling, such as belief networks, and those useful for inference. This chapter will survey the most popular models from each class.
Graphical models
Graphical Models (GMs) are depictions of independence/dependence relationships for distributions. Each class of GM is a particular union of graph and probability constructs and details the form of independence assumptions represented. Graphical models are useful since they provide a framework for studying a wide class of probabilistic models and associated algorithms. In particular they help to clarify modelling assumptions and provide a unified framework under which inference algorithms in different communities can be related.
It needs to be emphasised that all forms of GM have a limited ability to graphically express conditional (in)dependence statements [281]. As we've seen, belief networks are useful for modelling ancestral conditional independence. In this chapter we'll introduce other types of GM that are more suited to representing different assumptions. Here we'll focus on Markov networks, chain graphs (which marry belief and Markov networks) and factor graphs. There are many more inhabitants of the zoo of graphical models; see [73, 314].
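As a small hedged example of the kind of independence statement a GM encodes (the tables below are arbitrary and purely illustrative): the belief network a → b → c encodes the factorisation p(a, b, c) = p(a) p(b|a) p(c|b), which implies that a is independent of c given b. This can be checked numerically:

```python
import numpy as np

# Sketch: verify that the factorisation p(a,b,c) = p(a) p(b|a) p(c|b)
# implies p(a,c|b) = p(a|b) p(c|b), for arbitrary (made-up) tables.

rng = np.random.default_rng(1)
def random_dist(*shape):
    t = rng.random(shape)
    return t / t.sum(axis=-1, keepdims=True)   # normalise the last axis

p_a = random_dist(2)          # p(a)
p_b_a = random_dist(2, 2)     # p(b|a), rows indexed by a
p_c_b = random_dist(2, 2)     # p(c|b), rows indexed by b

# Build the joint p(a,b,c) from the factorisation via broadcasting.
joint = p_a[:, None, None] * p_b_a[:, :, None] * p_c_b[None, :, :]

for b in range(2):
    p_ac = joint[:, b, :] / joint[:, b, :].sum()   # p(a,c|b)
    p_a_given_b = p_ac.sum(axis=1)                 # p(a|b)
    p_c_given_b = p_ac.sum(axis=0)                 # p(c|b)
    assert np.allclose(p_ac, np.outer(p_a_given_b, p_c_given_b))

print("a is independent of c given b under this factorisation")
```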
In Part I we discussed inference and showed that for certain models this is computationally tractable. However, for many models of interest, one cannot perform inference exactly and approximations are required.
In Part V we discuss approximate inference methods, beginning with sampling-based approaches. These are popular and well known in many branches of the mathematical sciences, having their origins in chemistry and physics. We also discuss alternative deterministic approximate inference methods which in some cases can have remarkably accurate performance.
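To make the sampling-based idea concrete, here is a toy sketch (not a method from the text): an expectation that may be awkward to compute exactly is approximated by an average over samples. The target here is chosen so the exact answer is known.

```python
import numpy as np

# Toy sketch of sampling-based (Monte Carlo) approximation:
# estimate E[f(x)] under p(x) by averaging f over samples from p.
# Here p is a standard normal and f(x) = x**2, so the exact answer is 1.

rng = np.random.default_rng(42)
samples = rng.standard_normal(100_000)
estimate = np.mean(samples ** 2)
print(f"Monte Carlo estimate of E[x^2]: {estimate:.4f} (exact: 1)")
```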
It is important to bear in mind that no single algorithm is going to be best on all inference tasks. For this reason, we attempt throughout to explain the assumptions behind the techniques so that one may select an appropriate technique for the problem at hand.
Natural organisms inhabit a dynamical environment and arguably a large part of natural intelligence is in modelling causal relations and consequences of actions. In this sense, modelling temporal data is of fundamental interest. In a more artificial environment, there are many instances where predicting the future is of interest, particularly in areas such as finance and also in tracking of moving objects.
In Part IV, we discuss some of the classical models of timeseries that may be used to represent temporal data and also to make predictions of the future. Many of these models are well known in different branches of science from physics to engineering and are heavily used in areas such as speech recognition, financial prediction and control. We also discuss some more sophisticated models in Chapter 25, which may be skipped at first reading.
As an allusion to the fact that natural organisms inhabit a temporal world, we also address in Chapter 26 some basic models of how information processing might be achieved in distributed systems.