Improved Lasso for genomic selection

Published online by Cambridge University Press:  14 December 2010

ANDRÉS LEGARRA*
Affiliation:
INRA, UR 631 SAGA, F-31326 Castanet-Tolosan, France
CHRISTÈLE ROBERT-GRANIÉ
Affiliation:
INRA, UR 631 SAGA, F-31326 Castanet-Tolosan, France
PASCAL CROISEAU
Affiliation:
INRA, UMR1313 GABI, F-78352 Jouy en Josas, France
FRANÇOIS GUILLAUME
Affiliation:
Institut de l'Elevage, F-75595 Paris, France
SÉBASTIEN FRITZ
Affiliation:
UNCEIA, F-75595 Paris, France
*Corresponding author. INRA, UR 631 SAGA, BP52627, F-31326 Castanet Tolosan, France. Tel: +33561285182. Fax: +33561285353. e-mail: andres.legarra@toulouse.inra.fr

Summary

Empirical experience with genomic selection in dairy cattle suggests that, for some traits, the distribution of the effects of single nucleotide polymorphisms (SNPs) might be far from normal. An alternative that avoids the use of arbitrary prior information is the Bayesian Lasso (BL). Regular BL uses a common variance parameter for residual and SNP effects (BL1Var). We propose here a BL with different residual and SNP effect variances (BL2Var), equivalent to the original Lasso formulation. The λ parameter in Lasso is related to genetic variation in the population. We also suggest precomputing individual variances of SNP effects by BL2Var, to be used later in a linear mixed model (HetVar-GBLUP). Models were tested in a cross-validation design including 1756 Holstein and 678 Montbéliarde French bulls, with 1216 and 451 bulls, respectively, used as training data; 51 325 and 49 625 polymorphic SNPs were used. Milk production traits were tested. Other methods tested included linear mixed models using variances inferred from pedigree estimates or integrated out from the data. Estimates of genetic variation in the population were close to pedigree estimates in BL2Var but not in BL1Var. BL1Var shrank breeding values too little because of the common variance. BL2Var was the most accurate method for prediction and accommodated major genes well, in particular for fat percentage. BL1Var was the least accurate. HetVar-GBLUP was almost as accurate as BL2Var and allows for simple computations and extensions.
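
As an illustration of two ideas in the summary, the following is a minimal Python sketch and not the authors' implementation. It assumes a Laplace prior on SNP effects with density (λ/2)·exp(−λ|a|), whose variance is 2/λ², and it assumes Hardy-Weinberg and linkage equilibrium so that the population genetic variance is 2·Σp(1−p) times the SNP effect variance. The second function sketches the HetVar-GBLUP idea as a SNP-level ridge regression with one precomputed variance per SNP, assuming a centred phenotype vector, no other fixed effects and unweighted records. The function names and arguments are hypothetical.

```python
import numpy as np


def laplace_effect_variance(lam):
    """Variance of a Laplace prior with density (lam / 2) * exp(-lam * |a|)."""
    return 2.0 / lam ** 2


def genetic_variance_from_lambda(lam, allele_freqs):
    """Assumed link between lambda and population genetic variance.

    Uses sigma_u^2 = 2 * sum(p * (1 - p)) * Var(a), which holds only under
    Hardy-Weinberg and linkage equilibrium (an assumption of this sketch).
    """
    p = np.asarray(allele_freqs, dtype=float)
    return 2.0 * np.sum(p * (1.0 - p)) * laplace_effect_variance(lam)


def hetvar_snp_blup(Z, y, snp_variances, residual_variance):
    """Solve (Z'Z + diag(sigma_e^2 / sigma_i^2)) a = Z'y for SNP effects a.

    Z: genotype covariates (records x SNPs), y: centred phenotypes,
    snp_variances: one precomputed variance per SNP (e.g. from a prior run).
    """
    Z = np.asarray(Z, dtype=float)
    y = np.asarray(y, dtype=float)
    ratios = residual_variance / np.asarray(snp_variances, dtype=float)
    lhs = Z.T @ Z + np.diag(ratios)   # SNPs with larger variance are shrunk less
    rhs = Z.T @ y
    return np.linalg.solve(lhs, rhs)
```

In this sketch a SNP with a large precomputed variance receives almost no shrinkage, which is one way to see how a linear mixed model can accommodate a major gene once heterogeneous variances are supplied.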

Type: Research Papers
Copyright © Cambridge University Press 2010

Table 1. Estimates of ‘sharpness’ parameter λ (±se) in Holstein

Table 2. Results in Montbéliarde: estimates (±se) of ‘sharpness’ parameter λ and of population genetic variance $\sigma_u^2$, and accuracies r (correlations between GEBVs and 2DYDs in the validation data set)

Table 3. Estimates of population genetic variance $\sigma_u^2$ (±se) in Holstein

Table 4. Accuracies: correlations between GEBVs and 2DYDs in the validation data set, in Holstein

Table 5. Regression coefficients b of 2DYDs on GEBVs in the validation data set, in Holstein
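
The two validation statistics named in the captions of Tables 4 and 5 can be sketched as follows. This is a minimal illustration assuming two equal-length arrays of GEBVs and 2DYDs for the validation bulls and an unweighted analysis; the function name is hypothetical and the source does not state how the statistics were computed in practice.

```python
import numpy as np


def validation_statistics(gebv, dyd2):
    """Return (r, b): correlation of GEBVs with 2DYDs, and slope of 2DYDs on GEBVs."""
    gebv = np.asarray(gebv, dtype=float)
    dyd2 = np.asarray(dyd2, dtype=float)
    r = np.corrcoef(gebv, dyd2)[0, 1]     # accuracy as defined in the captions
    b = np.polyfit(gebv, dyd2, 1)[0]      # least-squares slope of 2DYD on GEBV
    return r, b
```

A slope b close to 1 indicates that the GEBVs are on the same scale as the 2DYDs; b below 1 indicates that predictions are too dispersed, that is, shrunk too little.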

Table 6. Correlation among GEBVs in the validation data set predicted by various methods for milk yield (above diagonal) and fat percentage (below diagonal), in Holstein

Fig. 1. Estimated effects of SNP loci for fat percentage (FP) in Holstein by the HetVar-GBLUP method, for chromosomes 13 (crosses) and 14 (circles).

Fig. 2. Theoretical distribution of SNP effects for fat content according to estimates of $\sigma_e^2$, $\sigma_a^2$ and λ in BL2Var (continuous line), BL1Var (grey dashed line) and MCMC-GBLUP normal model (dotted black line). The figure has been scaled so that the normal distribution has a variance of 1.