When the outcome Y is binary or an integer, we need to modify our methods. In this chapter, we introduce logistic regression for binary data and Poisson regression for count data. These are special cases of a class of regression models called generalized linear models. Logistic regression is a special case of a more general suite of methods called classification, which are discussed in Chapter 9.
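As a brief sketch of how these two models fit the generalized linear model pattern, each one relates a transformed conditional mean of Y to a linear function of the features. The notation below ($p(x)$, $\mu(x)$, $\beta_j$, $d$ features) is assumed for illustration and may differ from the chapter's own symbols.

```latex
\begin{align*}
\text{Logistic regression:}\quad
  & \log\frac{p(x)}{1-p(x)} = \beta_0 + \sum_{j=1}^{d}\beta_j x_j,
  & p(x) &= P(Y=1 \mid X=x),\\
\text{Poisson regression:}\quad
  & \log \mu(x) = \beta_0 + \sum_{j=1}^{d}\beta_j x_j,
  & \mu(x) &= E[Y \mid X=x].
\end{align*}
```

In both cases the left-hand side is a link function (logit and log, respectively) applied to the conditional mean, which is what makes them special cases of the same family.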
In this chapter, we explain how to estimate the prediction error of a regression model. The training error (the average of the squared residuals) underestimates the prediction error. Instead, we use cross-validation, which separates the data into one part for fitting the model and another part for estimating the prediction error. We can use the estimated prediction error to choose among a set of possible regression models.
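The gap between training error and cross-validated error can be illustrated with a minimal sketch. It assumes a simple one-feature linear model, synthetic data, and 5-fold cross-validation; the function names (`fit_line`, `mse`, `k_fold_cv_error`) and all numeric settings are hypothetical choices for this example, not taken from the chapter.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def mse(a, b, xs, ys):
    """Average squared residual of the line a + b*x on (xs, ys)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_cv_error(xs, ys, k=5, seed=0):
    """Estimate prediction error by k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint held-out sets
    total_sq_err = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Fit on the training part only, then score on the held-out part.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        total_sq_err += sum((ys[i] - (a + b * xs[i])) ** 2 for i in fold)
    return total_sq_err / len(xs)


# Synthetic data (assumed for illustration): y = 1 + 2x + Gaussian noise.
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(40)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

a, b = fit_line(xs, ys)
training_error = mse(a, b, xs, ys)
cv_error = k_fold_cv_error(xs, ys, k=5)
print("training error:", training_error)
print("5-fold CV error:", cv_error)
```

For least squares, the held-out residuals are never smaller in total than the in-sample residuals, so the cross-validated estimate here is at least as large as the training error. To compare candidate models, one would compute `k_fold_cv_error` for each and prefer the smallest value.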
In this chapter, we briefly describe methods for performing training in a fast and cost-effective manner. This book has primarily focused on accelerating inference. Techniques for inference, such as low-precision methods and fast architectures, can also be employed to accelerate training. In this chapter, we discuss issues that are unique to training.
Distillation (Bucila et al., 2006; Hinton et al., 2015) is a technique for obtaining a small and computationally efficient model that performs the same function as a large and computationally intensive model. The original large model is referred to as the teacher model, and the resulting small model is called the student model. It is also common to employ an ensemble of multiple models as the teacher model.
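A minimal sketch of a distillation objective for a single example, assuming one common formulation in the style of Hinton et al. (2015): the student is trained on a weighted sum of the usual cross-entropy with the true label and a KL divergence to the teacher's temperature-softened outputs. The function names, the weighting `alpha`, the temperature value, and the example logits are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-target KL divergence."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions.
    soft = sum(pt * math.log(pt / ps)
               for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy of the student (temperature 1) with the true label.
    hard = -math.log(softmax(student_logits)[true_label])
    # The temperature**2 factor keeps the soft-term gradients on a comparable
    # scale as the temperature changes.
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


# Hypothetical logits for a 3-class problem with true class 0.
teacher = [4.0, 1.0, 0.0]
close_student = [3.5, 1.2, 0.1]   # roughly agrees with the teacher
far_student = [0.0, 1.0, 4.0]     # disagrees with the teacher

loss_close = distillation_loss(close_student, teacher, true_label=0)
loss_far = distillation_loss(far_student, teacher, true_label=0)
print("loss (close student):", loss_close)
print("loss (far student):", loss_far)
```

With an ensemble as the teacher, one plausible variant is to average the ensemble members' softened probabilities and use that average in place of `p_teacher`.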
In this chapter, we briefly cover a few other topics related to regression. Each topic is the subject of entire textbooks. Our goal is to give a very concise introduction to each topic. The topics include random effects and empirical Bayes, neural nets and deep learning, survival analysis, graphical models, and time series.
In this chapter, we consider nonparametric regression when we have more than one feature. First, we show how the methods in Chapter 6 can be extended to handle this case. Then, we consider additive regression, regression trees, and random forests. Another estimator based on neural nets is discussed in Chapter 12.
In linear regression, we approximate the regression function with a linear function and estimate the coefficients using least squares. We construct confidence intervals and prediction bands, and show how residual plots help check the linear approximation. We also review some other regression tools that are not as widely used as in the past.
This chapter gives an overview of the main acceleration methods. Each method is explained in more detail in the following chapters.
When the outcome Y is discrete rather than continuous, we refer to the problem of predicting Y as classification. In many ways, this is easier than predicting a continuous outcome since Y can only take a few values. Most of the methods we have covered so far can be adapted to handle discrete outcomes. One particular method, based on neural nets, is covered in Chapter 12.