In this chapter, we extend the methods from Chapter 2 to handle the case where the number of features d is large. Least squares does not work well in this case. We introduce ridge regression and the lasso, which are two methods for high-dimensional regression. We also discuss the bias–variance tradeoff and the challenges in constructing confidence intervals in this setting.
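To make the shrinkage idea behind ridge regression and the lasso concrete, here is a minimal sketch for the simplest possible case: a single feature and no intercept. The objective scalings, the function names, and the toy data are assumptions chosen for illustration and are not taken from the chapter; the high-dimensional case needs a linear solver or an iterative method instead of these one-line formulas.

```python
from typing import Sequence


def ridge_1d(x: Sequence[float], y: Sequence[float], lam: float) -> float:
    """Minimise 0.5 * sum((y_i - b * x_i)^2) + 0.5 * lam * b^2 over b."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    # The penalty adds lam to the denominator, pulling b towards zero.
    return sxy / (sxx + lam)


def lasso_1d(x: Sequence[float], y: Sequence[float], lam: float) -> float:
    """Minimise 0.5 * sum((y_i - b * x_i)^2) + lam * |b| over b."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    # Soft-thresholding: shrink |sxy| by lam, and return exactly 0 if it is smaller.
    shrunk = max(abs(sxy) - lam, 0.0)
    return (shrunk if sxy >= 0 else -shrunk) / sxx


# Toy data (hypothetical): y = 2 * x exactly, so least squares gives b = 2.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]

b_ols = ridge_1d(x, y, 0.0)       # no penalty: ordinary least squares
b_ridge = ridge_1d(x, y, 14.0)    # shrunk towards zero, never exactly zero
b_lasso = lasso_1d(x, y, 14.0)    # shrunk towards zero
b_lasso_big = lasso_1d(x, y, 28.0)  # large penalty sets the coefficient to zero
```

Both estimators trade some bias for lower variance. The difference visible even in this toy case is that the lasso can set a coefficient exactly to zero, while ridge only shrinks it.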
So far, we have focused on predicting an outcome Y from a set of features X. In this chapter, we turn to causal inference where we ask: What would Y be if we set X to a particular value x? This question concerns the distribution of the outcome Y after some hypothetical intervention. Answering such causal questions requires new tools and stronger assumptions than prediction.
Regression and classification are used to produce a prediction of an outcome Y. In many cases, we will also want a prediction set C that contains Y with a specified probability. In this chapter, we discuss two methods for constructing prediction sets. The first is based on quantile regression. The second uses a method called conformal inference.
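As a rough illustration of the conformal idea, the sketch below implements split conformal prediction with absolute-residual scores, one common variant that may differ from the one developed in the chapter. The function name, the fixed predictor, the miscoverage level `alpha`, and the calibration data are all hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple


def split_conformal_interval(
    predict: Callable[[float], float],
    x_cal: Sequence[float],
    y_cal: Sequence[float],
    x_new: float,
    alpha: float,
) -> Tuple[float, float]:
    """Prediction interval for Y at x_new from a held-out calibration set.

    `predict` must have been fitted on data separate from the calibration set.
    """
    # Conformity scores: absolute residuals on the calibration set.
    scores = sorted(abs(y - predict(x)) for x, y in zip(x_cal, y_cal))
    n = len(scores)
    # Rank of the score used as the half-width, with the finite-sample correction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: only the trivial set is valid.
        return (-math.inf, math.inf)
    half_width = scores[k - 1]
    center = predict(x_new)
    return (center - half_width, center + half_width)


# Hypothetical predictor and calibration data (7 points, residuals chosen by hand).
predict = lambda x: 2.0 * x
x_cal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
residuals = [0.25, -0.5, 0.75, -1.0, 1.25, -1.5, 2.0]
y_cal = [predict(x) + r for x, r in zip(x_cal, residuals)]

interval = split_conformal_interval(predict, x_cal, y_cal, x_new=10.0, alpha=0.25)
```

If the calibration points and the new point are exchangeable, an interval built this way covers Y with probability at least 1 - alpha, whatever predictor is plugged in.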
In this chapter, we discuss nonparametric regression, which does not assume that the regression function is linear or even approximately linear. We only assume that the regression function is a smooth function of x. This chapter describes some of these methods in the case where there is a single feature. Chapter 7 considers the case of multiple features.
This book aims to accelerate the inference of deep neural networks. Let us first consider why acceleration matters. Faster inference means quicker response times. For instance, in a speech recognition engine, transcription results can be delivered faster. In a translation engine, translations can be provided more quickly. This leads to an improved user experience.
In the previous chapters, we discussed methods to accelerate models by approximating pre-existing models. In this chapter, we describe how to obtain fast models by designing the architecture with acceleration in mind right from the beginning.
In this chapter, we introduce the concept of regression, which is a way to quantify the relationship between an outcome Y and a vector of features X. This relationship is expressed by the regression function, which is the mean of Y given X. This chapter discusses the main ideas and goals of regression analysis.
Quantization is the process of mapping continuous values to a smaller set of discrete values. A typical example is converting a floating-point number into an integer representation. For example, a matrix of 32-bit floating-point numbers
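The float-to-integer mapping described above can be sketched as follows. This is a minimal example of symmetric linear quantization to 8-bit integers, which is only one of several possible schemes; the function names and the sample values are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def quantize_int8(values: Sequence[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # The scale maps the largest magnitude to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: Sequence[int], scale: float) -> List[float]:
    """Recover approximate floats from the integer codes."""
    return [q * scale for q in quantized]


# Hypothetical weights standing in for one row of a floating-point matrix.
weights = [0.5, -1.27, 0.0, 1.27]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the price paid for storing 8 bits per value instead of 32.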
Statistical modelling and machine learning offer a vast toolbox of inference methods with which to model the world, discover patterns and reach beyond the data to make predictions when the truth is not certain. This concise book provides a clear introduction to those tools and to the core ideas – probabilistic model, likelihood, prior, posterior, overfitting, underfitting, cross-validation – that unify them. Toy and real examples illustrate diverse applications ranging from biomedical data to treasure hunts, while the accompanying datasets and computational notebooks in R and Python encourage hands-on learning. Instructors can benefit from online lecture slides and solutions to all the exercises. Requiring only first-year university-level knowledge of calculus, probability and linear algebra, the book equips students in statistics, data science and machine learning, as well as those in quantitative applied and social science programmes, with the tools and conceptual foundations to explore more advanced techniques.
The previous chapters have given an overview of various theoretical and applied aspects of differential equations on graphs. We foresee both a broadening and deepening of the field in the near future.
In this chapter, we will discuss how to implement the graph processes we have defined elsewhere in this book. The fundamental implementation obstacle is that for practical applications, the graphs one desires to use can be very large.
An important topic which has seen a considerable amount of attention in recent years concerns the continuum limits of the graph-based models discussed in this book.
In Sections 3.1, 3.2, and 3.3, we defined graph versions of the Allen–Cahn equation, MBO scheme, and MCF respectively. It was first speculated in Van Gennip et al. (2014) that these dynamics on graphs might be related.
Starting from this section, we only consider undirected graphs unless stated differently in specific situations. In the definition of , we make the choices and . We will specify whether is a double-well or double-obstacle potential where this is relevant. Also, we recall from Remark 2.1.9 that we choose .