This chapter introduces probability. We begin with an informal definition which enables us to build intuition about the properties of probability. Then, we present a more rigorous definition, based on the mathematical framework of probability spaces. Next, we describe conditional probability, a concept that makes it possible to update probabilities when additional information is revealed. In our first encounter with statistics, we explain how to estimate probabilities and conditional probabilities from data, as illustrated by an analysis of votes in the United States Congress. Building upon the concept of conditional probability, we define independence and conditional independence, which are critical concepts in probabilistic modeling. The chapter ends with a surprising twist: In practice, probabilities are often impossible to compute analytically! Fortunately, the Monte Carlo method provides a pragmatic solution to this challenge, allowing us to approximate probabilities very accurately using computer simulations. We apply it to a 3 × 3 basketball tournament from the 2020 Tokyo Olympics.
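As a rough illustration of the Monte Carlo idea mentioned in this abstract, the sketch below estimates a probability as the fraction of simulated outcomes in which an event occurs. The dice example and the names `estimate_probability`, `roll_two_dice` and `sum_at_least_ten` are assumptions chosen for illustration; they are not taken from the chapter, which applies the method to a basketball tournament.

```python
import random


def estimate_probability(event, sample, num_trials=100_000, seed=0):
    """Approximate P(event) by the fraction of simulated outcomes where it occurs."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(event(sample(rng)) for _ in range(num_trials))
    return hits / num_trials


def roll_two_dice(rng):
    """Simulate one outcome: a pair of fair six-sided dice (illustrative assumption)."""
    return rng.randint(1, 6), rng.randint(1, 6)


def sum_at_least_ten(outcome):
    """Event of interest: the two dice add up to 10 or more."""
    return sum(outcome) >= 10


estimate = estimate_probability(sum_at_least_ten, roll_two_dice)
exact = 6 / 36  # six of the 36 equally likely outcomes have a sum of 10, 11 or 12
print(f"Monte Carlo estimate: {estimate:.4f}, exact value: {exact:.4f}")
```

Here the exact answer is known, which makes it easy to check the simulation; the method becomes useful in cases where no such closed-form value is available.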
This self-contained guide introduces two pillars of data science, probability theory and statistics, side by side, in order to illuminate the connections between statistical techniques and the probabilistic concepts they are based on. The topics covered in the book include random variables, nonparametric and parametric models, correlation, estimation of population parameters, hypothesis testing, principal component analysis, and both linear and nonlinear methods for regression and classification. Examples throughout the book draw from real-world datasets to demonstrate concepts in practice and confront readers with fundamental challenges in data science, such as overfitting, the curse of dimensionality, and causal inference. Code in Python reproducing these examples is available on the book's website, along with videos, slides, and solutions to exercises. This accessible book is ideal for undergraduate and graduate students, data science practitioners, and others interested in the theoretical concepts underlying data science methods.
In this chapter we present two spatial dependent models: one based on defining a latent variable for each area, and the other by defining one latent variable for each pair of latent areas. We call the latter the latent edges model. We compare both models with a real data set. Extensions to spatio-temporal constructions are also considered.
In this chapter we define what a conjugate family is in a Bayesian analysis context and develop detailed examples of some cases; in particular, we review the beta and binomial case, the Pareto and inverse Pareto case, the gamma and gamma case and the gamma and Poisson case. We conclude by providing a list of conjugate models.
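To make the beta and binomial case concrete, here is a minimal sketch of the standard conjugate update: a Beta(a, b) prior on a success probability, combined with a binomial likelihood, gives a Beta(a + successes, b + failures) posterior. The function names and the numbers used are illustrative assumptions, not the chapter's own example.

```python
from fractions import Fraction


def beta_binomial_update(a, b, successes, trials):
    """Return the Beta posterior parameters after observing binomial data.

    Prior: Beta(a, b). Data: `successes` out of `trials`.
    Posterior: Beta(a + successes, b + trials - successes).
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return a + successes, b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution, a / (a + b), for integer parameters."""
    return Fraction(a, a + b)


prior = (2, 2)  # hypothetical prior, chosen only for illustration
posterior = beta_binomial_update(*prior, successes=7, trials=10)

print("posterior parameters:", posterior)       # (9, 5)
print("prior mean:", beta_mean(*prior))          # 1/2
print("posterior mean:", beta_mean(*posterior))  # 9/14
```

The posterior stays in the same family as the prior, which is what makes the pair conjugate: updating reduces to adding counts to the parameters.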
In this chapter we show how to define temporal dependent sequences using a moving average type of construction. We compare the performance of this construction with a Markov-process type. We finally extend the models to include seasonal and periodic dependencies.
In this chapter we start with some attempts to construct dependent sequences with order larger than one and present a general result to achieve an invariant distribution via a three-level hierarchical model. We finally present some results involving exponential families.
In this chapter we describe a general procedure to construct Markov sequences with invariant distributions. The procedure can be used with conjugate and non-conjugate models and with parametric and nonparametric distributions. We derive several examples in detail and finish with some applications in survival analysis.
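One well-known way to build a Markov sequence with a prescribed invariant distribution uses a conjugate pair and a latent variable. The sketch below assumes a Beta(a, b) marginal with a binomial latent step; it illustrates the general idea and may differ from the procedure developed in the chapter. The function name `beta_markov_chain` and the parameter values are hypothetical.

```python
import random


def beta_markov_chain(a, b, n, length, seed=0):
    """Simulate a Markov sequence whose marginal distribution stays Beta(a, b).

    Each step draws a latent y ~ Binomial(n, x_t) and then
    x_{t+1} ~ Beta(a + y, b + n - y), the posterior given y.
    Larger n gives stronger dependence between consecutive values.
    """
    rng = random.Random(seed)
    x = rng.betavariate(a, b)  # start from the invariant distribution
    chain = [x]
    for _ in range(length - 1):
        # Binomial(n, x) draw as a sum of n Bernoulli(x) trials
        y = sum(rng.random() < x for _ in range(n))
        x = rng.betavariate(a + y, b + n - y)
        chain.append(x)
    return chain


chain = beta_markov_chain(a=2.0, b=3.0, n=5, length=20_000)
sample_mean = sum(chain) / len(chain)
print(f"sample mean: {sample_mean:.3f}, Beta(2, 3) mean: {2 / 5:.3f}")
```

Because x_{t+1} is drawn from the posterior of x given a y that was itself generated from x_t, the pair (x_t, y) and the pair (x_{t+1}, y) have the same joint distribution, so the Beta(a, b) marginal is preserved at every step.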
In this chapter we introduce the concept of exchangeability and present our first result on how to construct exchangeable sequences that maintain a desired marginal distribution, with detailed examples. We finish with an application of exchangeable constructions in a meta-analysis. BUGS and R code are provided.
In this chapter we start by reviewing the different types of inference procedures: frequentist, Bayesian, parametric and non-parametric. We introduce notation by providing a list of the probability distributions that will be used later on, together with their first two moments. We review some results on conditional moments and carry out several examples. We review definitions of stochastic processes, stationary processes and Markov processes, and finish by introducing the most common discrete-time stochastic processes that show dependence in time and space.
In this chapter we conclude the book by presenting dependent models for random vectors and for stochastic processes. The types of dependence are exchangeable, Markov, moving average, spatial or a combination of the latter two.
Bringing together years of research into one useful resource, this text empowers the reader to creatively construct their own dependence models. Intended for senior undergraduate and postgraduate students, it takes a step-by-step look at the construction of specific dependence models, including exchangeable, Markov, moving average and, in general, spatio-temporal models. All constructions maintain a desired property of pre-specifying the marginal distribution and keeping it invariant. They do not separate the dependence from the marginals and the mechanisms followed to induce dependence are so general that they can be applied to a very large class of parametric distributions. All the constructions are based on appropriate definitions of three building blocks: prior distribution, likelihood function and posterior distribution, in a Bayesian analysis context. All results are illustrated with examples and graphical representations. Applications with data and code are interspersed throughout the book, covering fields including insurance and epidemiology.