Analysis of variance (ANOVA) is a core technique for analysing data in the Life Sciences. This reference book bridges the gap between statistical theory and practical data analysis by presenting a comprehensive set of tables for all standard models of analysis of variance and covariance with up to three treatment factors. The book will serve as a tool to help post-graduates and professionals define their hypotheses, design appropriate experiments, translate them into a statistical model, validate the output from statistics packages and verify results. The systematic layout makes it easy for readers to identify which types of model best fit the themes they are investigating, and to evaluate the strengths and weaknesses of alternative experimental designs. In addition, a concise introduction to the principles of analysis of variance and covariance is provided, alongside worked examples illustrating issues and decisions faced by analysts.
Every model in Chapters 2 and 3 has one or more equivalents without full replication. For model 2.1 it is 1.1, for 2.2 it is 2.1, for 3.1 it is 4.1 or 6.1, for 3.2 it is 4.2 or 6.2, for 3.3 it is 5.6 or 6.3, and for 3.4 it is 3.1. Here we give two further versions of factorial models 3.1 and 3.2 without full replication. The lack of replicated sampling units means that at least one of the factors must be random, as demonstrated by model 7.1(i) below in comparison to (ii) and (iii). Factorial designs that lack full replication must further assume that there are no significant higher-order interactions between factors, which cannot be tested by the model since there is no measure of the residual error among replicate observations (subjects). This is problematic because lower-order effects can only be interpreted fully with respect to their higher-order interactions (Chapter 3). Falsely assuming an absence of higher-order interactions will cause tests of lower-order effects to overestimate the Type I error (rejection of a true null hypothesis) and to underestimate the Type II error (acceptance of a false null hypothesis). Without testing for interactions, causality cannot be attributed to significant main effects, and no conclusion can be drawn about non-significant main effects. For some analyses, the existence of a significant main effect when levels of an orthogonal random block are pooled together may hold interest regardless of whether or not the effect also varies with block; the main effect indicates an overall trend averaged across levels of the random factor.
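The absence of residual error in an unreplicated design can be seen by counting degrees of freedom. The following is a minimal sketch for a balanced two-factor crossed design; the function name `two_factor_df` and the example values of a, b and n are illustrative assumptions, not taken from the book.

```python
def two_factor_df(a, b, n):
    """Degrees of freedom for a balanced two-factor crossed design
    with a levels of A, b levels of B and n replicates per combination."""
    return {
        "A": a - 1,
        "B": b - 1,
        "B*A": (a - 1) * (b - 1),
        "Residual": a * b * (n - 1),  # zero when n = 1
        "Total": a * b * n - 1,
    }

replicated = two_factor_df(a=4, b=3, n=2)
unreplicated = two_factor_df(a=4, b=3, n=1)
print(replicated)
print(unreplicated)
```

With n = 1 the residual row has zero degrees of freedom, so the interaction B*A is the highest-order term that can be estimated and there is nothing left against which to test it.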
Empirical research invariably requires making informed choices about the design of data collection. Although the number and identity of experimental treatments is determined by the question(s) being addressed, the investigator must decide at what spatial and temporal scales to apply them and whether to include additional fixed or random factors to extend the generality of the study. The investigator can make efficient use of resources by balancing the cost of running the experiment against the power of the experiment to detect a biologically significant effect. In practice this means either minimising the resources required to achieve a desired level of statistical power or maximising the statistical power that can be attained using the finite resources available. An optimum design can be achieved only by careful planning before data collection, particularly in the selection of an appropriate model and allocation of sampling effort at appropriate spatial and temporal scales.
Inadequate statistical power continues to plague biological research (Jennions and Møller 2003; Ioannidis 2005), despite repeated calls to incorporate it into planning (Peterman 1990; Greenwood 1993; Thomas and Juanes 1996). Yet efficient experimentation has never been more in demand.
Nested designs extend one-factor ANOVA to consider two or more factors in a hierarchical structure. Nested factors cannot be cross factored with each other because each level of one factor exists only in one level of another (but see models 3.3 and 3.4 for cross-factored models with nesting). Nested designs allow us to quantify and compare the magnitudes of variation in the response at different spatial, temporal or organisational scales. They are used particularly for testing a factor of interest without confounding different scales of variation. For example, spatial variation in the infestation of farmed salmon with sea lice could be compared at three scales – among farms (A′), among cages within each farm (B′) and among fish within each cage (S′) – by sampling n fish in each of b cages on each of a farms. Similarly, seasonal variation (A) in infestation of farmed salmon by sea lice, over and above short-term fluctuations in time (B′), could be measured by sampling n independent fish on b random occasions in each of a seasons.
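The three scales of the sea lice example can be made concrete by counting the degrees of freedom available at each scale. This is a minimal sketch assuming a balanced design; the function name `nested_df` and the values a = 3, b = 4, n = 5 are hypothetical choices for illustration.

```python
def nested_df(a, b, n):
    """Degrees of freedom for a balanced nested design:
    n fish (S') in each of b cages (B') on each of a farms (A')."""
    return {
        "A": a - 1,                # among farms
        "B(A)": a * (b - 1),       # among cages within farms
        "S(B(A))": a * b * (n - 1),  # among fish within cages
        "Total": a * b * n - 1,
    }

df = nested_df(a=3, b=4, n=5)
print(df)
# The three scales account for all of the total degrees of freedom.
assert df["A"] + df["B(A)"] + df["S(B(A))"] == df["Total"]
```

Note how few degrees of freedom sit at the top of the hierarchy: adding fish per cage increases only the lowest row, whereas adding farms increases the row for A.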
Designs are inherently nested when treatments are applied across one organisational scale and responses are measured at a finer scale. For example, the genotype of a plant may influence the mean length of its parasitic fungal hyphae. A test of this hypothesis must recognise the fact that hyphae grow in colonies (S′) that are nested within leaves (C′), which in turn are nested within plants (B′), which in turn are nested in genotype (A′) (discussed further on page 23).
You will need to declare any random factors and covariates as such. For balanced designs you may have an option to use the restricted form of the model (see page 242).
For a fully replicated design, most packages will give you all main effects and their interactions if you request the model in its abbreviated form. For example, the design Y = C|B|A+ε (model 3.2) can be requested as: ‘C|B|A’. Where a model has nested factors, you may need to request it with expansion of the nesting. For example, the design Y = C|B′(A)+ε (model 3.3) is requested with ‘C|A+C|B(A)’.
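What the abbreviated form stands for can be sketched in a few lines. The helper below is hypothetical and not part of any statistics package: it expands a fully crossed request such as ‘C|B|A’ into its main effects and interactions, and it deliberately does not handle nested terms such as B(A).

```python
from itertools import combinations

def expand_crossed(request):
    """Expand a fully crossed request like 'C|B|A' into every
    main effect and interaction, lowest order first."""
    factors = request.split("|")
    terms = []
    for order in range(1, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append("*".join(combo))
    return terms

print(expand_crossed("C|B|A"))
# ['C', 'B', 'A', 'C*B', 'C*A', 'B*A', 'C*B*A']
```

Three crossed factors therefore yield seven terms: three main effects, three two-way interactions and one three-way interaction.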
Repeated-measures and unreplicated designs have no true residual variation. The package may require residual variation nevertheless, in which case declare all the terms except the highest-order term (always the last row with non-zero d.f. in the ANOVA tables in this book). For example, for the design Y = B|S′(A) (model 6.3) request: ‘B|A+B|S(A)–B*S(A)’, and the package will take the residual from the subtracted term. Likewise, for the design Y = S′|A (model 4.1) request: ‘S|A–S*A’, and the package will take the residual from the subtracted term; or equally, request ‘A+S’, and the package will take the residual from the one remaining undeclared term: S*A.
Correctly identifying the appropriate model to use (see page 57) is the principal hurdle in any analysis, but running the chosen model in your favourite statistics package also presents a number of potential pitfalls. If you encounter problems when using a statistics package, do refer to its help routines and tutorials in order to understand the input requirements and output formats, and to help you interpret error messages. If that fails then look to see if you have encountered one of these common problems.
Problems with sampling design
If I just want to identify any differences amongst a suite of samples, can I do t tests on all sample pairs? No, the null hypothesis of no difference requires a single test yielding a single P-value. Multiple P-values are problematic in any unplanned probing of the data with more than one test of the same null hypothesis, because the repeated testing inflates the Type I error rate (illustrated by an example on page 252). If an ANOVA reveals a general difference between samples, explore where the significance lies using post hoc tests designed to account for the larger family-wise error (page 245).
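The inflation of the Type I error rate can be approximated with a short calculation. This is a rough sketch that treats the pairwise tests as independent, which they are not strictly, so the figures are indicative only; the function name `familywise_error` and the choice of α = 0.05 are assumptions for illustration.

```python
from math import comb

def familywise_error(k_samples, alpha=0.05):
    """Approximate probability of at least one false positive when
    t tests are run on every pair among k_samples samples,
    assuming (for simplicity) that the tests are independent."""
    m = comb(k_samples, 2)          # number of pairwise tests
    return m, 1 - (1 - alpha) ** m

for k in (3, 5, 10):
    m, fwe = familywise_error(k)
    print(f"{k} samples: {m} tests, family-wise error ~ {fwe:.2f}")
```

Under this approximation, five samples require ten pairwise tests and the chance of at least one spurious significant result is about 0.40 rather than the nominal 0.05.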
Hypothesis testing in the life sciences often involves comparing samples of observations, and analysis of variance is a core technique for analysing such information. Parametric analysis of variance, abbreviated as ‘ANOVA’, encompasses a generic methodology for identifying sources of variation in continuous data, from the simplest test of trend in a single sample, or difference between two samples, to complex tests of multiple interacting effects. Whilst simple one-factor models may suffice for closely controlled experiments, the inherent complexities of the natural world mean that rigorous tests of causality often require more sophisticated multi-factor models. In many cases, the same hypothesis can be tested using several different experimental designs, and alternative designs must be evaluated to select a robust and efficient model. Textbooks on statistics are available to explain the principles of ANOVA and statistics packages will compute the analyses. The purpose of this book is to bridge between the texts and the packages by presenting a comprehensive selection of ANOVA models, emphasising the strengths and weaknesses of each and allowing readers to compare between alternatives.
Our motivation for writing the book comes from a desire for a more systematic comparison than is available in textbooks, and a more considered framework for constructing tests than is possible with generic software. The obvious utility of computer packages for automating otherwise cumbersome analyses has a downside in their uncritical production of results. Packages adopt default options until instructed otherwise, which will not suit all types of data.
In the following Chapters 1 to 7, we will describe all common models with up to three treatment factors for seven principal classes of ANOVA design:
One-factor – replicate measures at each level of a single explanatory factor;
Nested – one factor nested in one or more other factors;
Factorial – fully replicated measures on two or more crossed factors;
Randomised blocks – repeated measures on spatial or temporal groups of sampling units;
Split plot – treatments applied at multiple spatial or temporal scales;
Repeated measures – subjects repeatedly measured or tested in temporal or spatial sequence;
Unreplicated factorial – a single measure per combination of two or more factors.
For each model we provide the following information:
The model equation;
The test hypothesis;
A table illustrating the allocation of factor levels to sampling units;
Illustrative examples;
Any special assumptions;
Guidance on analysis and interpretation;
Full analysis of variance tables showing all sources of variation, their associated degrees of freedom, components of variation estimated in the population, and appropriate error mean squares for the F-ratio denominator;
Options for pooling error mean square terms.
As an introduction to Chapters 1 to 7, we first describe the notation used, explain the layout of the allocation tables, present some worked examples and provide advice on identifying the appropriate statistical model.
Notation
Chapters 1 to 3 describe fully randomised and replicated designs. This means that each combination of levels of categorical factors (A, B, C) is assigned randomly to n sampling units (S′), which are assumed to be selected randomly and independently from the population of interest.
Balanced designs have the same number of replicate observations in each sample. Thus a one-factor model Y = A+ε will be balanced if sample sizes all take the same value n at each of the a levels of factor A. Balanced designs are generally straightforward to analyse because factors are completely independent of each other and the total sum of squares (SS) can be partitioned completely among the various terms in the model. The SS explained by each term is simply the improvement in the residual SS as that term is added to the model. These are often termed ‘sequential SS’ or ‘Type I SS’.
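The complete partitioning of the total SS in a balanced one-factor model Y = A+ε can be checked numerically. The sketch below uses only the Python standard library; the function name `one_factor_ss` and the data values are made up for illustration.

```python
def one_factor_ss(samples):
    """Partition the total sum of squares for a one-factor design.
    samples: list of lists, one list of observations per level of A."""
    all_obs = [y for s in samples for y in s]
    grand_mean = sum(all_obs) / len(all_obs)
    means = [sum(s) / len(s) for s in samples]
    ss_total = sum((y - grand_mean) ** 2 for y in all_obs)
    ss_a = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    ss_resid = sum((y - m) ** 2 for s, m in zip(samples, means) for y in s)
    return ss_total, ss_a, ss_resid

# Hypothetical balanced data: a = 3 levels, n = 3 replicates each.
data = [[4, 5, 6], [7, 8, 9], [1, 2, 3]]
ss_total, ss_a, ss_resid = one_factor_ss(data)
a, n = len(data), len(data[0])
f_ratio = (ss_a / (a - 1)) / (ss_resid / (a * (n - 1)))
print(ss_total, ss_a, ss_resid, f_ratio)
```

For these values the SS explained by A and the residual SS sum exactly to the total SS, and the F-ratio is the mean square for A divided by the residual mean square.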
Designs become unbalanced when some sampling units are lost, destroyed or cannot be measured, or when practicalities mean that it is easier to sample some populations than others. For nested models, imbalance may result from unequal nesting as well as unequal sample sizes. Thus a nested model Y = B′(A)+ε will be balanced only if each of the a levels of factor A has b levels of factor B′, and each of the ba levels of B′ has n replicate observations. For factorial models, an imbalance means that some combinations of treatments have more observations than others. An extreme case of unbalanced data arises in factorial designs where there are no observations for one or more combinations of treatments, resulting in missing samples and a substantially more complicated analysis.
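The two conditions for balance in a nested model Y = B′(A)+ε can be checked directly from the data layout. This is a minimal sketch; the function name `is_balanced_nested` and the farm and cage labels are hypothetical.

```python
from collections import Counter, defaultdict

def is_balanced_nested(observations):
    """observations: list of (a_level, b_level) pairs, one pair per
    replicate observation. Returns True only if every level of A has
    the same number of B' levels and every B' level has the same n."""
    counts = Counter(observations)      # n for each level of B'(A)
    b_per_a = defaultdict(set)
    for a_level, b_level in counts:
        b_per_a[a_level].add(b_level)
    equal_nesting = len({len(bs) for bs in b_per_a.values()}) == 1
    equal_n = len(set(counts.values())) == 1
    return equal_nesting and equal_n

# Two farms, two cages per farm, three fish per cage.
balanced = [(farm, cage)
            for farm in ("farm1", "farm2")
            for cage in ("cage1", "cage2")
            for _ in range(3)]
print(is_balanced_nested(balanced))       # True
print(is_balanced_nested(balanced[:-1]))  # False: one fish lost
```

Because cages are identified by the (farm, cage) pair, a cage labelled "cage1" on one farm is treated as distinct from "cage1" on another, as nesting requires.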
Repeated-measures designs involve measuring each sampling unit repeatedly over time or applying treatment levels in temporal or spatial sequence to each sampling unit. Because these designs were developed primarily for use in medical research, sampling units are often referred to as subjects. Those factors for which each subject participates in every level are termed ‘within-subject’ or ‘repeated-measures’ factors; levels of the within-subject factor are applied in sequence to each subject. Conversely, ‘between-subjects’ factors are grouping factors, for which each subject participates in only one level. Repeated-measures models are classified into two types, subject-by-trial and subject-by-treatment models, according to the nature of the within-subject factors (Kirk 1994).
Subject-by-trial designs apply the levels of the within-subject factor to each subject in an order that cannot be randomised, because time or space is an inherent component of the factor. Subjects (sampling units) may be measured repeatedly over time to track natural temporal changes in some measurable trait – for example, blood pressure of patients at age 40, 50 and 60, biomass of plants in plots at fixed times after planting, build-up of lactic acid in muscle during exercise. Likewise, subjects may be measured repeatedly through space to determine how the response varies with position – for example barnacle density in plots at different shore elevations, or lichen diversity on the north and south sides of trees.
Analysis of variance, often abbreviated to ANOVA, is a powerful statistic and a core technique for testing causality in biological data. Researchers use ANOVA to explain variation in the magnitude of a response variable of interest. For example, an investigator might be interested in the sources of variation in patients' blood cholesterol level, measured in mg/dL. Factors that are hypothesised to contribute to variation in the response may be categorical or continuous. A categorical factor has levels – the categories – that are each applied to a different group of sampling units. For example, sampling units of hospital patients may be classified as male or female, representing two levels of the factor ‘Gender’. By contrast, a continuous factor has a continuous scale of values and is therefore a covariate of the response. For example, age of patients may be quantified by the covariate ‘Age’. ANOVA determines the influence of these effects on the response by testing whether the response differs among levels of the factor, or displays a trend across values of the covariate. Thus, blood cholesterol level of patients may be deemed to differ among male and female patients, or to increase or decrease with age of the patient.
A factor of interest can be experimental, with sampling units that are manipulated to impose contrasting treatments. For example, patients may be given a cholesterol-lowering drug or a placebo, which represent two levels of the factor ‘Drug’.