We propose a new estimator, the quadratic form estimator, of the Kronecker product model for covariance matrices. We show that this estimator has good properties in the large dimensional case (i.e., the cross-sectional dimension n is large relative to the sample size T). In particular, the quadratic form estimator is consistent in a relative Frobenius norm sense provided $\log^3 n/T\to 0$. We obtain the limiting distributions of the Lagrange multiplier and Wald tests under both the null and local alternatives concerning the mean vector $\mu$. Testing linear restrictions of $\mu$ is also investigated. Finally, our methodology is shown to perform well in finite sample situations both when the Kronecker product model is true and when it is not true.
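As a rough sketch of the structure being estimated (not the paper’s quadratic form estimator itself, and with purely illustrative factor matrices), a Kronecker product covariance model writes a large covariance matrix as the Kronecker product of much smaller factors:

```python
import numpy as np

# Illustrative Kronecker product covariance model: Sigma = V1 kron V2,
# so an n = n1 * n2 dimensional covariance matrix is parameterized by
# two much smaller factors (V1 is n1 x n1, V2 is n2 x n2).
V1 = np.array([[1.0, 0.3],
               [0.3, 1.0]])          # 2 x 2 factor (made-up values)
V2 = np.array([[2.0, 0.5, 0.1],
               [0.5, 1.0, 0.2],
               [0.1, 0.2, 0.5]])     # 3 x 3 factor (made-up values)

Sigma = np.kron(V1, V2)              # 6 x 6 covariance matrix
print(Sigma.shape)                   # (6, 6)

# The Kronecker structure needs only n1*(n1+1)/2 + n2*(n2+1)/2 distinct
# parameters instead of n*(n+1)/2 for an unrestricted covariance matrix.
```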
A correlation coefficient can be used to make predictions of dependent variable values using a procedure called linear regression. There are two equations that can be used to perform regression: the standardized regression equation and the unstandardized regression equation. Both regression equations produce a straight line that represents the predicted value on the dependent variable for a sample member with a given X variable score.
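A minimal sketch, assuming made-up scores for an independent variable X and a dependent variable Y, of how the two regression equations can be computed:

```python
import numpy as np

# Made-up data: X = hours studied, Y = exam score
X = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
Y = np.array([60.0, 65.0, 70.0, 78.0, 85.0])

r = np.corrcoef(X, Y)[0, 1]                     # Pearson's r

# Standardized regression equation: predicted z_Y = r * z_X
z_X = (X - X.mean()) / X.std(ddof=1)
z_Y_hat = r * z_X

# Unstandardized regression equation: Y_hat = a + b * X
b = r * (Y.std(ddof=1) / X.std(ddof=1))         # slope
a = Y.mean() - b * X.mean()                     # intercept
Y_hat = a + b * X

print(f"r = {r:.3f}, slope b = {b:.3f}, intercept a = {a:.2f}")
```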
One statistical phenomenon to be aware of when making predictions is regression towards the mean, in which the predicted dependent variable value is closer to the mean of the dependent variable than the person’s score on the independent variable was to the mean of the independent variable. This means that outliers and rare events can be difficult or impossible to predict via the regression equations.
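A small worked illustration, assuming a correlation of r = 0.50, of why predicted scores are pulled towards the mean:

```python
# Regression towards the mean with the standardized equation z_Y_hat = r * z_X.
# The correlation of r = 0.50 is an assumed, illustrative value.
r = 0.50
z_X = 2.0                 # a person 2 standard deviations above the mean on X
z_Y_hat = r * z_X         # predicted to be only 1 SD above the mean on Y
print(z_Y_hat)            # 1.0 -- closer to the mean than the original score
```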
There are important assumptions of Pearson’s r and regression: (1) a linear relationship between variables, (2) homogeneity of residuals, (3) an absence of a restriction of range, (4) a lack of outliers/extreme values that distort the relationship between variables, (5) subgroups within the sample are equivalent, and (6) interval- or ratio-level data for both variables. Violating any of these assumptions can distort the correlation coefficient.
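As one informal illustration (not a formal diagnostic), the homogeneity-of-residuals assumption can be checked by comparing residual spread across the range of X, using made-up data:

```python
import numpy as np

# Rough check of the homogeneity-of-residuals assumption: after fitting the
# regression line, residuals should show a similar spread across the range of X.
X = np.array([2.0, 4.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0])
Y = np.array([60.0, 64.0, 71.0, 75.0, 84.0, 88.0, 95.0, 99.0])

b, a = np.polyfit(X, Y, 1)            # slope and intercept of the regression line
residuals = Y - (a + b * X)

# Compare residual spread in the lower and upper halves of X (an informal check;
# a formal test such as Levene's test could also be used).
lower = residuals[X <= np.median(X)]
upper = residuals[X > np.median(X)]
print(lower.std(ddof=1), upper.std(ddof=1))
```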
All null hypothesis statistical significance test (NHST) procedures follow eight steps: (1) form groups in the data, (2) define the null hypothesis (H0), (3) set alpha (α), (4) choose a one-tailed or a two-tailed test, (5) calculate the observed value, (6) find the critical value, (7) compare the observed value and the critical value, and (8) calculate an effect size. For a z-test the effect size is Cohen’s d, which can be interpreted as the number of standard deviations between the two means.
A z-test is the simplest NHST and tests the H0 that a sample’s dependent variable mean and a population’s dependent variable mean are equal. If H0 is retained, the difference between the means is no greater than what would be expected from sampling error. If H0 is rejected, the null hypothesis is not a good statistical model for the data.
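A minimal sketch of a z-test and its Cohen’s d effect size, with made-up population and sample values and an assumed two-tailed α of .05:

```python
import math
from scipy import stats

# Illustrative z-test: does a sample mean differ from a known population mean?
# All numbers below are made up for the example.
pop_mean, pop_sd = 100.0, 15.0     # known population parameters
sample_mean, n = 106.0, 36         # sample statistics
alpha = 0.05                       # two-tailed test

# Observed value (step 5): z = (M - mu) / (sigma / sqrt(n))
z_obs = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

# Critical value (step 6) for a two-tailed test
z_crit = stats.norm.ppf(1 - alpha / 2)

# Compare (step 7) and compute the effect size (step 8): Cohen's d,
# the number of standard deviations between the two means.
d = (sample_mean - pop_mean) / pop_sd
print(f"z_obs = {z_obs:.2f}, z_crit = {z_crit:.2f}, "
      f"reject H0: {abs(z_obs) > z_crit}, d = {d:.2f}")
```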
When conducting NHSTs, it is possible to make the wrong decision about the H0. A Type I error occurs when a person rejects a H0 that is actually true. A Type II error occurs when a person retains a H0 that is actually false. It is impossible to know whether a correct decision has been made or not.
All the NHSTs in previous chapters compare two dependent variable means. When there are three or more group means, it is possible to run an unpaired two-sample t-test for each pair of group means, but there are two problems with this strategy. First, as the number of groups increases, the number of t-tests required grows much faster than the number of groups. Second, the familywise risk of a Type I error increases with each additional t-test.
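A short piece of arithmetic, assuming a per-test α of .05 and independent tests, illustrates both problems:

```python
from math import comb

# Number of pairwise t-tests needed for k groups, and the familywise Type I
# error rate if each test is run at alpha = .05 (assuming independent tests).
alpha = 0.05
for k in (3, 5, 10):
    m = comb(k, 2)                      # k*(k-1)/2 pairwise comparisons
    familywise = 1 - (1 - alpha) ** m   # probability of at least one Type I error
    print(f"k = {k:2d} groups -> {m:2d} t-tests, familywise error ~ {familywise:.2f}")
```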
The analysis of variance (ANOVA) fixes both problems. Its null hypothesis is that all group means are equal. ANOVA follows the same eight steps as other NHST procedures. ANOVA produces an effect size, η2. The η2 effect size can be interpreted in two ways. First, η2 quantifies the percentage of dependent variable variance that is shared with the independent variable’s variance. Second, η2 measures how much better the group mean functions as a predicted score when compared to the grand mean.
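A minimal sketch, with three made-up groups, of a one-way ANOVA and its η2 effect size computed from the sums of squares:

```python
import numpy as np
from scipy import stats

# Illustrative one-way ANOVA with three made-up groups
g1 = np.array([4.0, 5.0, 6.0, 5.0])
g2 = np.array([7.0, 8.0, 6.0, 7.0])
g3 = np.array([9.0, 10.0, 8.0, 9.0])

F, p = stats.f_oneway(g1, g2, g3)

# Eta-squared: between-groups sum of squares / total sum of squares
all_scores = np.concatenate([g1, g2, g3])
grand_mean = all_scores.mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in (g1, g2, g3))
ss_total = ((all_scores - grand_mean) ** 2).sum()
eta_squared = ss_between / ss_total

print(f"F = {F:.2f}, p = {p:.4f}, eta^2 = {eta_squared:.2f}")
```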
ANOVA only says whether a difference exists – not which means differ from other means. To determine this, a post hoc test is frequently performed. The most common procedure is Tukey’s test. This helps researchers identify the location of the difference(s).
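A minimal sketch of Tukey’s test on the same three made-up groups, assuming a SciPy release recent enough to provide scipy.stats.tukey_hsd (statsmodels’ pairwise_tukeyhsd is an alternative):

```python
import numpy as np
from scipy import stats

# Tukey's HSD post hoc test on the same three illustrative groups
g1 = np.array([4.0, 5.0, 6.0, 5.0])
g2 = np.array([7.0, 8.0, 6.0, 7.0])
g3 = np.array([9.0, 10.0, 8.0, 9.0])

result = stats.tukey_hsd(g1, g2, g3)
print(result)   # pairwise mean differences with confidence intervals and p-values
```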
In this chapter, there are two types of probabilities that can be estimated: empirical probability and theoretical probability. Empirical probability is calculated by conducting a number of trials and finding the proportion that resulted in each outcome. Theoretical probability is calculated by dividing the number of ways of obtaining an outcome by the total number of possible outcomes. Adding together the probabilities of two mutually exclusive events produces the probability that either one will occur. Multiplying together the probabilities of two independent events produces the probability that both will occur at the same time or in succession. As the number of trials increases, the empirical probability and theoretical probability converge.
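A minimal sketch, using simulated rolls of a fair die, of empirical versus theoretical probability and the addition and multiplication rules:

```python
import numpy as np

rng = np.random.default_rng(0)

# Empirical vs. theoretical probability of rolling a 6 on a fair die
rolls = rng.integers(1, 7, size=10_000)
empirical = (rolls == 6).mean()       # proportion of trials that produced a 6
theoretical = 1 / 6                   # ways to get a 6 / possible outcomes
print(empirical, theoretical)

# Addition rule (mutually exclusive events): P(roll a 1 or a 2) = 1/6 + 1/6
print(1 / 6 + 1 / 6)

# Multiplication rule (independent events): P(two sixes in a row) = 1/6 * 1/6
print((1 / 6) * (1 / 6))
```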
It is possible to build a histogram of empirical or theoretical probabilities. As the number of trials increases, the empirical and theoretical probability distributions converge. If an outcome is produced by adding together (or averaging) the results of events, the probability distribution is approximately normally distributed. Because of this, it is possible to make inferences about the population based on sample data – a process called generalization. The mean of sample means converges to the population mean, and the standard deviation of the sample means (the standard error) converges on the population standard deviation divided by the square root of the sample size.
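A minimal simulation sketch, using a made-up skewed population, of how the mean and standard deviation of sample means behave:

```python
import numpy as np

rng = np.random.default_rng(1)

# A skewed (non-normal) population with mean ~50 and standard deviation ~10
population = rng.exponential(scale=10.0, size=100_000) + 40.0

# Draw many samples of size n and compute each sample's mean
n = 25
sample_means = np.array([rng.choice(population, size=n).mean() for _ in range(5_000)])

print(sample_means.mean(), population.mean())          # mean of means ~= population mean
print(sample_means.std(ddof=1),                        # standard error of the mean ...
      population.std() / np.sqrt(n))                   # ... ~= sigma / sqrt(n)
```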
This chapter discusses two types of descriptive statistics: models of central tendency and models of variability. Models of central tendency describe the location of the middle of the distribution, and models of variability describe the degree that scores are spread out from one another. There are four models of central tendency in this chapter. Listed in ascending order of the complexity of their calculations, these are the mode, median, mean, and trimmed mean. There are also four principal models of variability discussed in this chapter: the range, interquartile range, standard deviation, and variance. For the latter two statistics, students are shown three possible formulas (sample standard deviation and variance, population standard deviation and variance, and population standard deviation and variance estimated from sample data), along with an explanation of when it is appropriate to use each formula. No statistical model of central tendency or variability tells you everything you may need to know about your data. Only by using multiple models in conjunction with each other can you have a thorough understanding of your data.
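A minimal sketch, with made-up scores, of the models of central tendency and variability; numpy’s ddof argument switches between the divide-by-n and divide-by-(n − 1) standard deviation formulas:

```python
import numpy as np
from scipy import stats

scores = np.array([2, 3, 3, 4, 5, 5, 5, 6, 7, 30])    # made-up data with one extreme score

# Models of central tendency
values, counts = np.unique(scores, return_counts=True)
mode = values[counts.argmax()]                         # most frequent score
median = np.median(scores)
mean = scores.mean()
trimmed_mean = stats.trim_mean(scores, 0.1)            # mean after trimming 10% from each tail

# Models of variability
data_range = scores.max() - scores.min()
iqr = stats.iqr(scores)                                # interquartile range
sample_sd = scores.std(ddof=1)                         # population SD estimated from sample data
population_sd = scores.std(ddof=0)                     # SD when the data are the whole population

print(mode, median, mean, trimmed_mean)
print(data_range, iqr, sample_sd, population_sd)
```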
Pearson’s correlation describes the relationship between two interval- or ratio-level variables. Positive correlation values indicate that individuals who have high X scores tend to have high Y scores (and that individuals with low X scores tend to have low Y scores). A negative correlation indicates that individuals with high X scores tend to have low Y scores (and that individuals with low X scores tend to have high Y scores). Correlation values closer to +1 or –1 indicate stronger relationships between the variables; values close to zero indicate weaker relationships. A correlation between two variables does not imply a causal relationship between them.
It is also possible to test a correlation coefficient for statistical significance, where the null hypothesis is that the population correlation is zero. This follows the same eight steps as all NHSTs. The effect size for Pearson’s r is calculated by squaring the r value (r2).
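A minimal sketch, with made-up interval-level data, of computing Pearson’s r, its r2 effect size, and its significance test:

```python
import numpy as np
from scipy import stats

# Made-up interval-level data for two variables
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 2.9, 3.6, 4.8, 5.1, 6.3, 6.8, 8.2])

r, p_value = stats.pearsonr(x, y)     # Pearson's r and its two-tailed p-value
r_squared = r ** 2                    # effect size: proportion of shared variance

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}, p = {p_value:.4f}")
```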
A correlation is visualized with a scatterplot. Scatterplots for strong correlations have dots that are closely grouped together; scatterplots showing weak correlations have widely spaced dots. Positive correlations have dots that cluster in the lower-left and upper-right quadrants of a scatterplot. Negative correlations have the reverse pattern.