Linear regression is an additive model, which does not work for binary outcomes, that is, data y that take on the values 0 or 1. To model binary data, we need to add two features to the base model y = a + bx: a nonlinear transformation that bounds the output between 0 and 1 (unlike a + bx, which is unbounded), and a model that treats the resulting numbers as probabilities and maps them into random binary outcomes. This chapter and the next describe one such model, logistic regression, and then in Chapter 15 we discuss generalized linear models, a larger class that includes linear and logistic regression as special cases. In the present chapter, we introduce the mathematics of logistic regression and also its latent-data formulation, in which the binary outcome y is a discretized version of an unobserved or latent continuous measurement z. As with the linear model, we show how to fit logistic regression, interpret its coefficients, and plot data and fitted curves. The nonlinearity of the model increases the challenges of interpretation and model-building, as we discuss in the context of several examples.
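The two added features, and the latent-data formulation, can be illustrated with a minimal simulation sketch. It is not taken from the chapter: the function names, the coefficient values a = -1.0 and b = 0.5, and the predictor value x = 3 are hypothetical, and the inverse logit is assumed as the bounding transformation because that is the one logistic regression uses. The sketch draws y directly from Pr(y = 1) = logit^-1(a + bx), then draws it again by thresholding a latent z = a + bx + e at zero, where e is assumed to follow a standard logistic distribution.

```python
import math
import random


def inv_logit(x):
    """Map any real number into the interval (0, 1), in a numerically stable way."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    t = math.exp(x)
    return t / (1.0 + t)


def simulate_direct(a, b, x, rng):
    """Draw y in {0, 1} with Pr(y = 1) = inv_logit(a + b*x)."""
    p = inv_logit(a + b * x)
    return 1 if rng.random() < p else 0


def simulate_latent(a, b, x, rng):
    """Latent-data version: z = a + b*x + e, with y = 1 if z > 0, else 0.

    The error e is drawn from a standard logistic distribution (an assumption
    of this sketch) by inverting its CDF.
    """
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() can return exactly 0.0
        u = rng.random()
    e = math.log(u / (1.0 - u))
    z = a + b * x + e
    return 1 if z > 0 else 0


def proportion_of_ones(draw, a, b, x, n, seed):
    """Fraction of simulated outcomes equal to 1 over n independent draws."""
    rng = random.Random(seed)
    return sum(draw(a, b, x, rng) for _ in range(n)) / n


# Hypothetical coefficients and predictor value, chosen only for illustration.
a, b, x = -1.0, 0.5, 3.0
p_model = inv_logit(a + b * x)
p_direct = proportion_of_ones(simulate_direct, a, b, x, n=20000, seed=1)
p_latent = proportion_of_ones(simulate_latent, a, b, x, n=20000, seed=2)
print(p_model, p_direct, p_latent)
```

Under these assumptions the two simulated proportions should both be close to the model probability, since Pr(z > 0) = Pr(e > -(a + bx)) = logit^-1(a + bx) when e is standard logistic.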