## 1. Introduction

Phenotypic plasticity, developmental homoeostasis and canalization are concepts related to environmental sensitivity and are importantly linked to mechanisms of adaptation and evolution. Environmental sensitivity can be defined either as mean phenotypic changes of a given genotype in different environments or as differences in the residual variance of different genotypes in the same environment (Jinks & Pooni, Reference Jinks and Pooni1988). The first definition gives rise to genotype by environment interactions acting at the level of the mean, a topic that is not the focus of the present work.

The subject of this study is related to the second definition, which gives rise to genetically structured heterogeneous variance models. In a given macro-environment, for a particular genotype a certain phenotypic value is expected. Typically, the observed phenotype differs from its expectation, and this deviation, which is the focus of the modelling exercise, may include a genetic component. This model has interesting implications in animal and plant improvement and in evolutionary biology. From a breeding point of view, it offers the possibility to decrease variation by selection, leading to more homogeneous products. Particularly in plant breeding this model is used to study stability of genotypes (Cotes *et al.*, Reference Cotes, Crossa, Sanches and Cornelius2006). In evolutionary biology, a fundamental problem is to understand the forces that maintain phenotypic variation. Most of the models assume that environmental variance is constant and explain the observed levels of variation by invoking a balance between a gain of genetic variance by mutation and a loss by different forms of selection and drift. Recently, Zhang & Hill (Reference Zhang and Hill2005) discussed models where environmental variance is partly under genetic control and study conditions for its maintenance under stabilizing selection. Considerable experimental support for the presence of genetic factors affecting environmental variance has been provided in recent years. The majority of this evidence stems from analyses of experimental or field data fitting models of different levels of complexity that account for genetic variance in environmental variance. The topic has been comprehensibly reviewed by Hill & Mulder (Reference Hill and Mulder2010).

Genotype by environment interaction and heterogeneity of variance are topics that received considerable attention in the plant breeding literature (Piepho, Reference Piepho1999; Cotes *et al.*, Reference Cotes, Crossa, Sanches and Cornelius2006), partly because yield trials are conducted in multiple environments and years. Recently, Edwards & Jannink (Reference Edwards and Jannink2006) used a Bayesian hierarchical model to study whether heterogeneity of residual and genotype by interaction variances could be assigned to genetic and genotype×environment effects. A similar model was implemented by Sorensen & Waagepetersen (Reference Sorensen and Waagepetersen2003) to investigate genetic heterogeneity of environmental variance for litter size in pigs. Inferences from such highly complex and parameterized models can be artefacts of the lack of normality of the data (Yang *et al.*, Reference Yang, Christensen and Sorensen2011).

The objective of the present work is to use relatively simple statistical models to test whether environmental variation of dry matter grain yield in maize is heterogeneous, and whether this heterogeneity can be specifically attributed to genotypic effects. The focus is on the general environmental variance (Falconer & Mackay, Reference Falconer and Mackay1996) which is referred here simply as environmental variance. Part of the present data, consisting of replicates of the same genotype over a given environment, are particularly well suited to investigate this problem. The variance between observations within a genotype reflects environmental variation, and a different variance across genotypes indicates that a genetic component operates at the level of the environmental variance. This type of evidence can be compelling under experimentally controlled conditions and the analysis is in principle straightforward: it requires comparing sample variances across genotypes. Such an analysis, using abdominal and sternopleural bristle number in *Drosophila* was reported by Mackay & Lyman (Reference Mackay and Lyman2005). The present data include genotypes tested in multiple locations; this also allows studying the contribution to variance heterogeneity due to location–years and due to their interaction with genotype.

The statistical evidence for a dependence of environmental variance on genotype is produced using two different types of analysis. The first is based on a non-parametric approach and the second on a parametric Bayesian approach. The latter allows, in principle, a formal analysis accounting for and quantifying all known sources of variation. Flexibility of inferences comes at the cost of complexity and model dependence.

## 2. Data and preliminary calculations

The data consist of dry matter grain yield per plot (in kg, hereinafter referred to as yield, units omitted in the rest of the text for simplicity) from trials of a German Maize Breeding Programme. Plants were grown in 9 m^{2} plots (1·5 m×6 m), 2 rows per plot. At each location, the same number of seeds were planted per plot producing 100 plants per plot, with very little variation in this number across plots. Three genetically homogeneous single-cross maize hybrids (all individuals from each of the three genotypes were genetically identical) were tested at 20 locations in 3 years. There were 27 location–year subclasses and 61 genotype–location–year subclasses, with a number of records (plots) ranging from 21 to 226 and a median of 64. Genotype 1 was represented in 26 of the 61 subclasses, genotype 2 in 19 and genotype 3 in 16. Preliminary analyses led to the elimination of extreme observations that had marked effects on inferences and after the editing process the number of observations for genotypes 1, 2 and 3 is 2046, 1595 and 1183, respectively, resulting in a total of 4824 records. The 20 locations represent a broad range of agroecological conditions in Germany.

The raw means , *i*=1, 2, 3, for genotypes 1, 2 and 3 are 9·17, 9·92 and 9·85, respectively. The corresponding sample variances are 2·36, 2·48 and 2·87, where *n*
_{
i
} is the number of observations in genotype *i*. With values of *n*
_{
i
} larger than 1000 as in the present data, simple *F*-tests reveal significant differences at a level smaller than 0·01%, and indicate that the hypothesis of equality of environmental variances across genotypes must be rejected.

This naive analysis is deficient in two ways. Firstly, the presence of location–years violate the assumption of common mean among observations of the same genotype. The second is the assumption of normality of residuals (discussed further below and illustrated in Fig. 1). The two approaches implemented here, that differ in the degree of parameterization and formality, attempt to overcome these shortcomings.

## 3. Non-parametric analysis

A first approach is to test for variance heterogeneity among the 61 genotype–location–year subclasses by means of Bartlett's statistic (Bartlett, Reference Bartlett1937). Along the same lines as in Ordas *et al.* (Reference Ordas, Malvar and Hill2008), the sampling distribution of Bartlett's statistic under the null hypothesis of variance homogeneity is approximated using 10 000 Monte Carlo replications, where in each, the indexes for the subclasses labelling the 4824 records are permuted and the 61 sample variances are obtained from the permuted records.

Repeating the exercise using as input the sample variances of the three genotypes, or the 27 location– year subclasses would lead to tests of heterogeneity associated with genotypes or location–years that are difficult to interpret, due to the very unbalanced structure of the data. Instead, heterogeneity due to genotype was studied using five independent subsets of the original 4824 yield data. The five subsets correspond to five location–year classes where the sampling variances of the three genotypes are computed using a minimum of 110 records per genotype (Table 2). This allows investigating the presence of genetic variance heterogeneity without the noise contributed by location–year effects.

## 4. Parametric Bayesian analysis

The other analysis is based on fitting a fully parametric Bayesian normal linear model with heterogeneous variance, to transformed data using the Box–Cox non-linear transformation (Box & Cox, Reference Box and Cox1964). The data consist of the original 4824 yield records, rather than the 61 sample variances used previously. The non-linear transformation parameter is first estimated using maximum likelihood and in a second step, the Bayesian model is implemented, conditional on this estimate. The transformation is used because a simple least squares analysis (using genotype–location–year location parameters) of the 4824 residuals revealed that the sample skewness and kurtosis are equal to −1·16 and 2·92, respectively. This signals the presence of asymmetry in the data but kurtosis is not detected. Skewness can lead to erroneous inferences, especially in the present work where the focus of inference is the variance.

### (i) Likelihoods

The general form of the six Bayesian models fitted to the data is (see Table 1 for the detailed description of the six models)

where the vector of fixed effects **b** contains either main effects of genotype and of location–year, or interaction effects of genotype by location–year. The term σ_{
i,M
}
^{2} is the conditional or residual variance for the *i*th observation under the *M*th model, and **X** is an observed incidence matrix. The residual variance is interpreted as the environmental variance.

*ME*, main effect parameters; *INT*, interaction parameters; G, main effect of genotypes (3 levels); LY, main effects of location–years (27 levels); GLY, interaction of genotype by location–years (61 levels).

Of the six models, two assume that the residual variance is homogeneous (Models *ME* and *INT*, with main effects and interaction effects, respectively, acting on the mean). In this case,

where μ* is an unknown scalar parameter in the real line. The term exp(μ*) can be interpreted as an unobserved average environmental variance. The remaining four models assume heterogeneity of the residual variance and the general model for this variance is

Here **w**
_{
i
}′ is the *i*th row in the observed incidence matrix **W** and **b*** is a parameter vector with the same classification variables as in **b**. The data vector **y** is of length *n*=4824. All the models assume normality of the unobserved conditional distribution of the observations (1).

The distribution of the fitted residuals from all the models displayed left skewness, violating the assumption of normality in (1). Therefore, the six models were also fitted to transformed data using the Box–Cox non-linear transformation parameter λ. The general form of the models under the transformation is

and *y*
_{
i
}
^{(λ)} is defined as (Box & Cox, Reference Box and Cox1964)

for some unobserved transformation parameter λ. The transformation parameter λ is introduced to induce normality of the conditional distribution and linearity of the conditional expectation, but not to achieve constant variance.

The Bayesian analyses are based on fitting the above linear models, and slight variations of them, conditional on either λ=1 (corresponding to an analysis on the original scale) or on the maximum likelihood estimate of λ (corresponding to an analysis under the transformed scale). Given λ, posterior distributions have known closed forms for all the models but due to availability of software we choose a Markov chain Monte Carlo (McMC) implementation. This has the advantage of providing a flexible environment for reporting inferences. Details of the Bayesian models and their McMC implementation have been reported in Yang *et al.* (Reference Yang, Christensen and Sorensen2011).

### (ii) Bayesian measures of model comparison and model fit

The models are compared using the pseudo log-marginal probability of the data, which is a commonly used quantity to compare the relative overall fit of a model (Gelfand, Reference Gelfand, Gilks, Richardson and Spiegelhalter1996). A larger value of the log-marginal probability indicates a better overall fit.

Departure from normality of the conditional distribution of the data is studied by means of posterior predictive model checking (Gelman *et al.*, Reference Gelman, Carlin, Stern and Rubin2004). A Monte Carlo estimate of the posterior distribution of the sample coefficient of skewness is computed under λ=1 or under , where is the maximum likelihood estimate of λ. The idea is to check whether residuals under λ=1 or under , satisfy the assumption of normality in (1) using the discrepancy measure

where

and **θ**
^{(j)}′=(**b**
^{(j)}′, μ*^{(j)}, **b***^{(j)}′) represent draws of the parameter vectors of a given model at the *j*th McMC iteration. Depending on whether the null model or the transformed model is under scrutiny, λ=1 or λ is equal to its maximum likelihood estimate. A graphical display involves plotting histograms of *T*(**z**, **θ**
^{(j)}) for both models.

The discrepancy measure *T*(**z**, **θ**
^{(j)}) represents the *j*th draw from the estimate of the posterior distribution of the sample skewness. The plot constructed using all the draws, displays the Monte Carlo estimate of the posterior distribution of the coefficient of skewness of the conditional distribution (1). Under normality the posterior distribution of the coefficient of skewness should be centred around a posterior mean of zero.

## 5. Results of the non-parametric analyses

The non-parametric analysis of the 61 sampling variances corresponding to the 61 genotype–location– year interaction subclasses, based on the Monte Carlo estimate of the sampling distribution of Bartlett's statistic, resulted in a *p*-value of the order of 10^{−5} which provides strong evidence against the null hypothesis of variance homogeneity.

To concentrate specifically on the effects of genotype on the variance free from the effects of location– years, the non-parametric analysis of the sampling variances was also performed within the five location– years where the three genotypes are represented with a minimum of 110 records per genotype. This resulted in *p*-values equal to 0·001, 0·0003, 0·25, 0·27 and 0·90 (ordered from left to right matching the order of the five locations in Table 2), revealing significant effects of genotype in the first two location–years.

To give an idea of the size of effects, Table 2 shows average yield, sample variance and number of records per genotype, for the five location–year classes, following the same order as the *p*-values above. The differences in sample variance within year–locations are small, relative to differences between year– locations (there are many locations with less than 100 records per genotype, with sample variance well over 1·3). The figures do not show association between mean yield and variance.

## 6. Results of the Bayesian analysis

### (i) The need for transforming the data

Maximum likelihood estimates of the transformation parameter λ for the six models are as follows. Model *ME*: 2·5, model *INT*: 2·5, model *ME–ME*: 1·8, model *ME–INT*: 1·9, model *INT–ME*: 1·8, model *INT– INT*: 1·8. Under variance homogeneity (models *ME* and *INT*) a larger value of λ is necessary to induce normality of residuals.

Four out of the six models were chosen to illustrate the effect of the transformation on the degree of skewness of the data (results for models *INT* and *ME*, and for models *ME–INT* and *INT–INT*, are very similar). Figure 1, top panel, shows posterior distributions of the coefficient of skewness (4) from the four chosen models fitted to untransformed data. The bottom panel shows the same distributions obtained from analyses of the transformed data. The first figure of the top panel indicates very severe left skewness under the *ME* model (the distribution of the coefficient of skewness has a posterior mean less than zero), which is only partly accommodated by fitting either the *ME–ME* model (second figure in top panel), or other heterogeneous variance models. The figures in the bottom panel show that the Monte Carlo estimate of the posterior distributions of the coefficient of skewness from analyses of transformed data are symmetrically distributed around a mean of zero, indicating absence of skewness. This provides strong evidence for the need of transforming the data.

### (ii) The effect of genotype on environmental variance: analysis of the complete data

Table 3 shows the measure of global fit of the models when data are analysed in the original metric or in the transformed scale. The figures indicate that in all cases, the quality of fit improves when the analysis is performed on the transformed scale, and when variance heterogeneity is accounted for. The best fitting model is *INT–ME* followed closely by *INT–INT*. Two other models with interaction effects acting on the mean were also fitted, one with only location–year effects acting on the variance, and the other with only genotype effects. The differences in log-marginal probabilities of the data between these models and model *INT–ME* are −32 and −266, respectively (in favour of the *INT–ME* model). The model with only location–year effects acting on the variance produced a considerably better fit than the one with only genotype effects, and both were outperformed by model *INT–ME*. This indicates that the effect of genotype should not be ignored as a source of environmental variance heterogeneity. However, the major source of environmental variance heterogeneity in these data is due to location–year effects.

Posterior means of the ratio of variances of different genotypes *G*
_{
i
}, *i*=1, 2, 3 (95% posterior intervals in parentheses) obtained under the best fitting model, (*INT–ME*) are shown in Table 4. The analysis in the transformed scale provides support for an effect of genotype on environmental variance (the value of 1 is at the extreme of the posterior interval in the first two cases). As an illustration, the table also shows that an analysis in the original scale (λ=1), without correcting for skewness, can lead to different results.

### (iii) The effect of genotype on environmental variance: analysis within location–years

In order to avoid the overwhelming effect of location– years on variance heterogeneity, five independent analyses were undertaken within the same five location–years in which the number of records per genotype is greater than 110 (see also Table 2). In agreement with the previous two approaches, no support for genetic heterogeneity was detected from the three right-most locations in the table and results are presented in Table 5 for the first two, labelled location–year 1 and 2, respectively. The models fitted are labelled *G-HOM*, with genotype effects on the mean and homogeneous environmental variance, and *G-HET* with genotype effects on both mean and variance. For location–year 1, the maximum likelihood estimates of λ using both models is equal to 2·6. For location–year 2, the ML estimate of λ is 2·4 under *G-HOM* and 1·5 under *G-HET*. For both models, the distribution of residuals in the original scale is left-skewed and symmetry is restored incorporating the transformation parameter in the analysis (not shown).

Table 5 shows the measure of global fit for data analysed in the original and transformed scales. For both location–years, the genetically heterogeneous variance model *G-HET* is favoured under both scales.

Table 6 shows posterior means and 95% posterior intervals of the ratios of variances for the 3 genotypic classes, for the two location–years, with data analysed in the transformed scale under model *G-HET*. There is a very strong evidence for genetic variance heterogeneity in both locations (the value of 1 is excluded from the 95% posterior interval in two out of three cases in both locations). The evidence for genetic variance heterogeneity is stronger in location–year 2, where relative differences of the order of 50% are reported. In these two locations there is no evidence of location–year by genotype interaction at the level of the variance.

## 7. Discussion

The objective of this work was to provide experimental support for the presence of genetic factors operating at the level of the environmental variance. The data available seem at first sight well suited to study this problem, allowing an analysis free from the assumptions inherent of complex models. However, despite the availability of replicated individuals of the same genotype, the absence of a well-defined homogeneous environment created the need to account for a non-genetic source of variation, that could act at the level of mean and variance, and could interact with genotype. This partly interfered with the goal of simplicity, and was dealt with using a non-parametric approach and more formally, applying a Bayesian analysis incorporating a Box–Cox transformation parameter to induce normality. The latter is particularly important because inferences on variances based on normality assumptions can be misleading if the data display skewness, as shown by Yang *et al.* (Reference Yang, Christensen and Sorensen2011).

The non-parametric analysis of the 61 genotype– location–year subclasses led to a rejection of the null hypothesis of equality of variances. To study heterogeneity due to genotypic effects only, analyses were carried out within the largest five location–year classes where the number of records per genotype exceed 110. Out of five such subsets of data marked differences in residual variance between genotypes were detected in two out of the five location–years. The same conclusion was reached using the Bayesian analyses. The latter allowed a proper quantification of the effect of genotypes on variance, leading to estimates of differences of the order of 30 and 50% in environmental variance between genotypes. The best-fitting Bayesian model postulates interaction of genotype by location–year effects on the mean, and main effects of genotype and of location–year at the level of the environmental variance. The Bayesian analysis also allowed establishing that in the current data, the biggest source of variance heterogeneity is due to location–year effects.

The analyses of dry matter grain yield in these data indicate that differences in the effects of genotypes on environmental variance is location–year dependent. This result fits well with recent work in yeast, reporting that levels of noise in the expression of a gene varies across environments and genetic background (Ansel *et al.*, Reference Ansel, Bottin, Rodriguez-Beltran, Damon, Nagarajan, Fehrmann, François and Yvert2008).

Identical number of seeds were planted per plot at each location and germination rate is known to be greater than 95%. Therefore, the number of plants harvested per plot varies little across plots, and this is expected to have a minor effect on plot yield. Any variation between plots that may arise due to this factor contributes to the macro-environmental variation. If variability in number of plants harvested per plot within genotypes were to be associated with genotype-dependent variability in germination rate, this would contribute positively to differences in within genotype variance between genotypes. An approximate idea of the size of this effect can be obtained as follows (Hozo *et al.*, Reference Hozo, Djulbegovic and Hozo2005). Chebyshev's inequality (Mood *et al.*, Reference Mood, Graybill and Boes1974) states that for a random variable *X* with mean μ and variance σ^{2} following an arbitrary distribution,

Setting *k*=3 yields 0·89 in the right-hand side. Therefore, the range *R* covers approximately 6σ and σ≈*R*/6. The average yield per plot is of the order of 10 kg, and with 100 plants per plot, the average contribution to plot yield per plant is 0·1 kg. Assuming that the minimum and maximum number of plants per plot is 90 and 100 plants, the range in yield is 1 kg, which results in a SD from this source equal to 1/6 kg and a variance of around 0·03 kg^{2}. Using the figures of sampling variances *S*
^{2} in Table 2, this only amounts to between 2 and 6% of within genotype variability.

In conclusion, this work has provided strong experimental evidence for the presence of genetic effects operating at the level of the residual variance of dry matter grain yield in maize. However, for a given genotype, its effect on the variance is environment (i.e. location–year) specific. Further, in the present analysis restricted to three genotypes, the biggest contributing source of heterogeneity are location–year effects. An assessment of the possibility of reducing residual variance by selecting specific genotypes would require an analysis involving a larger number of genotypes.

We are grateful to KWS SAAT AG for providing the maize data, and to Professor W. G. Hill for comments on an earlier draft of this manuscript.