A correlation coefficient can be used to predict dependent variable values through a procedure called linear regression. Regression can be performed with either of two equations: the standardized regression equation or the unstandardized regression equation. Both produce a straight line that gives the predicted value on the dependent variable for a sample member with a given score on the independent (X) variable.
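In conventional notation (the symbols here are standard, though not necessarily the chapter's own), the two equations can be written as

\hat{z}_Y = r\, z_X \qquad \text{(standardized form)}

\hat{Y} = bX + a, \quad b = r\,\frac{s_Y}{s_X}, \quad a = \bar{Y} - b\bar{X} \qquad \text{(unstandardized form)}

The standardized equation works entirely in z-score units, while the unstandardized equation returns the prediction in the raw units of the dependent variable.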
One statistical phenomenon to be aware of when making predictions is regression towards the mean: the predicted dependent variable value is closer to the mean of the dependent variable than the person's score on the independent variable was to the mean of the independent variable. As a result, outliers and rare events can be difficult or impossible to predict with the regression equations.
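A brief numerical illustration (the correlation value is arbitrary, chosen only for the example): because $|r| \le 1$, the standardized prediction $\hat{z}_Y = r\, z_X$ can never be farther from zero than $z_X$ itself. With $r = .50$ and $z_X = +2.0$, the predicted score is $\hat{z}_Y = +1.0$, only one standard deviation above the mean of the dependent variable even though the independent variable score was two standard deviations above its own mean.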
There are important assumptions of Pearson's r and regression: (1) a linear relationship between variables, (2) homogeneity of variance of the residuals (homoscedasticity), (3) an absence of a restriction of range, (4) a lack of outliers/extreme values that distort the relationship between variables, (5) equivalence of subgroups within the sample, and (6) interval- or ratio-level data for both variables. Violating any of these assumptions can distort the correlation coefficient.
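As an illustrative sketch (not taken from the chapter) of how these quantities can be computed and informally checked, assuming two small interval-level variables x and y and using the NumPy and SciPy libraries:

import numpy as np
from scipy import stats

# Hypothetical interval-level data for the independent (X) and dependent (Y) variables.
x = np.array([2.0, 4.0, 5.0, 7.0, 8.0, 10.0])
y = np.array([1.5, 3.0, 4.5, 5.0, 7.5, 9.0])

# Pearson's r quantifies the strength of the linear relationship between X and Y.
r, p_value = stats.pearsonr(x, y)

# Unstandardized regression line: Y-hat = slope * X + intercept.
slope, intercept, r_value, p_lin, stderr = stats.linregress(x, y)
y_hat = slope * x + intercept

# Residuals (Y minus Y-hat); plotting these against X is one informal way to
# examine the linearity and homogeneity-of-residuals assumptions.
residuals = y - y_hat

print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")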