Just as datasets can be usefully summarized to compress much information into small pieces (Chapter 2), so can probability distributions. We review some of these summaries in this chapter. They revolve around various kinds of expectations of the behavior of a random variable. These expectations are important because they typically relate to specific aspects of stochastic data-generating processes (DGPs), called parameters, that social scientists connect to substantive theory and attempt to uncover in empirical work. In one sense, the material in this chapter helps to explain why that is. It also provides basic familiarity with important formal constructs that we use repeatedly in subsequent material.
One of the most important goals of this chapter is to define the regression function, or expected value of Y as a function of X, as it exists in a DGP specified as a joint distribution function. We then establish some important properties of this DGP regression that help to motivate and justify the widespread interest in this function in empirical social science. Later chapters spend a great deal of time developing common models for this regression (and techniques for making inferences about the DGP regression from regression models fit to sample data), so it is important to get a handle on why anyone should care about it.
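To make the definition concrete, the following sketch computes the regression function E[Y | X = x] directly from a joint distribution. The discrete joint pmf here is purely illustrative and not taken from the chapter; it simply shows that once a DGP is specified as a joint distribution, the regression function is fully determined by it.

```python
# Illustrative example: recovering E[Y | X = x] from a hypothetical
# discrete joint pmf. The probabilities below are made up for illustration.

joint_pmf = {
    # (x, y): P(X = x, Y = y)
    (0, 0): 0.20, (0, 1): 0.10,
    (1, 0): 0.15, (1, 1): 0.25,
    (2, 0): 0.05, (2, 1): 0.25,
}

def regression_function(pmf):
    """Return {x: E[Y | X = x]} computed from a discrete joint pmf."""
    reg = {}
    xs = {x for (x, _) in pmf}
    for x in xs:
        # marginal P(X = x), summing the joint pmf over y
        p_x = sum(p for (xi, _), p in pmf.items() if xi == x)
        # sum over y of y * P(X = x, Y = y)
        e_xy = sum(y * p for (xi, y), p in pmf.items() if xi == x)
        reg[x] = e_xy / p_x  # E[Y | X = x] by the definition of conditional expectation
    return reg

print(regression_function(joint_pmf))
```

Each value E[Y | X = x] is just the mean of Y under the conditional distribution P(Y | X = x), so the regression function is a feature of the DGP itself, prior to any model or sample.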