
Fine tuning genomic evaluations in dairy cattle through SNP pre-selection with the Elastic-Net algorithm

Published online by Cambridge University Press:  22 December 2011

PASCAL CROISEAU*
Affiliation:
INRA, UMR1313 – Génétique Animale et Biologie Intégrative, 78352 Jouy en Josas, France
ANDRÉS LEGARRA
Affiliation:
INRA, UR 631, Station d'Amélioration Génétique des Animaux, F-31320 Castanet-Tolosan, France
FRANÇOIS GUILLAUME
Affiliation:
Institut de l’élevage, 149 rue de Bercy, 75595 Paris, France
SÉBASTIEN FRITZ
Affiliation:
UNCEIA, 149 rue de Bercy, 75595 Paris, France
AURÉLIA BAUR
Affiliation:
UNCEIA, 149 rue de Bercy, 75595 Paris, France
CARINE COLOMBANI
Affiliation:
INRA, UR 631, Station d'Amélioration Génétique des Animaux, F-31320 Castanet-Tolosan, France
CHRISTÈLE ROBERT-GRANIÉ
Affiliation:
INRA, UR 631, Station d'Amélioration Génétique des Animaux, F-31320 Castanet-Tolosan, France
DIDIER BOICHARD
Affiliation:
INRA, UMR1313 – Génétique Animale et Biologie Intégrative, 78352 Jouy en Josas, France
VINCENT DUCROCQ
Affiliation:
INRA, UMR1313 – Génétique Animale et Biologie Intégrative, 78352 Jouy en Josas, France
*Corresponding author: INRA, UMR1313 – Génétique Animale et Biologie Intégrative, 78352 Jouy en Josas, France. E-mail: pascal.croiseau@jouy.inra.fr

Summary

For genomic selection methods, the statistical challenge is to estimate the effect of each of the available single-nucleotide polymorphisms (SNPs). When the number of SNPs (p) is much higher than the number of bulls (n), these SNP effects may be poorly estimated if, as in genomic BLUP (gBLUP), all SNPs are given a non-null effect. An alternative is to use approaches that have been developed specifically to solve the ‘p>>n’ problem. This is the case for variable selection methods; among them, we focus on the Elastic-Net (EN) algorithm, a penalized regression approach. The performance of EN, gBLUP and pedigree-based BLUP was compared using data from three French dairy cattle breeds, with very encouraging results for EN. We then pushed further the idea of improving SNP effect estimates by considering fewer SNPs. This variable selection strategy was applied to both gBLUP and EN by adding an SNP pre-selection step based on quantitative trait locus (QTL) detection. Similar results were observed with or without the pre-selection step, in terms of correlations between direct genomic value (DGV) and observed daughter yield deviation (DYD) in a validation data set. However, when applied to the EN algorithm, this strategy led to a substantial reduction in the number of SNPs included in the prediction equation. As the numbers of genotyped animals and of SNPs keep growing, SNP pre-selection strongly alleviates computing requirements and ensures that national evaluations can be completed within a reasonable time frame.
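
To illustrate how a penalized regression such as EN sets some SNP effects exactly to zero, the following is a minimal sketch of an elastic-net fit by coordinate descent. It is not the implementation used in the study. It assumes the commonly used parameterization in which λ scales the overall penalty and α mixes the L1 and L2 parts, and it runs on simulated genotypes. The function names, the simulated data and the choice of causal columns are all hypothetical.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator: shrinks z towards 0 by gamma, exactly 0 inside [-gamma, gamma]."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net(X, y, lam, alpha, n_iter=100):
    """Coordinate descent for the (assumed) objective
        (1/2n) * ||y - X b||^2 + lam * (alpha * ||b||_1 + (1 - alpha)/2 * ||b||^2).
    X is a list of n rows with p centred columns, y is centred, no intercept is fitted."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # monomorphic column carries no information
            # Covariance of column j with the residual that excludes column j
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_b = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_b
    return beta


# Simulated example with p >> n: 20 animals, 50 SNPs coded 0/1/2, two causal SNPs (hypothetical)
random.seed(1)
n, p = 20, 50
raw = [[random.choice((0, 1, 2)) for _ in range(p)] for _ in range(n)]
means = [sum(row[j] for row in raw) / n for j in range(p)]
X = [[row[j] - means[j] for j in range(p)] for row in raw]
y = [2.0 * row[3] - 1.5 * row[17] + random.gauss(0.0, 0.5) for row in X]
y_mean = sum(y) / n
y = [v - y_mean for v in y]

# Smallest lam that zeroes every effect when alpha = 1
lam_max = max(abs(sum(X[i][j] * y[i] for i in range(n))) / n for j in range(p))

for alpha in (1.0, 0.5):
    beta = elastic_net(X, y, lam=0.3 * lam_max, alpha=alpha)
    print("alpha =", alpha, "non-null effects:", sum(b != 0.0 for b in beta))
```

In this sketch, raising λ or α shrinks more effects to exactly zero, which is the mechanism by which an EN-type method ends up with a prediction equation that uses only a subset of the SNPs.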

Information

Type
Research Papers
Copyright
Copyright © Cambridge University Press 2011

Table 1. Number of animals genotyped per data set for the three breeds studied

Table 2. Optimal α and λ parameters and corresponding number of SNPs with non-null effect for the six traits studied and for the three breeds using the EN procedure on the complete set of SNPs

Table 3. Number of LRT peaks identified for milk yield as a function of LRT threshold and window size in the Montbéliarde, Normande and Holstein breeds

Table 4. Weighted correlation between DGV and observed DYD for the three breeds obtained using pedigree-based BLUP, gBLUP and EN on the complete set of SNP (54 K) or after a pre-selection of the SNP (PS)

Table 5. Slope of the regression of observed DYD on DGV for the Holstein breed obtained using pedigree-based BLUP, gBLUP and EN on the complete set of SNP (54 K) or after a pre-selection of the SNP (PS)

Table 6. Correlation and number of SNP used in the prediction equation using the EN algorithm on the whole set of SNP (54 K) or after a pre-selection of the SNP (PS) for the Holstein breed

Table 7. Highest correlation and corresponding number of selected SNPs when using the whole set of SNP (54 K), after a pre-selection of the SNP (PS) or when the number of selected SNPs is limited to 2500, 1500 or 1000 in the Holstein breed

Fig. 1. Mean change in correlation (dashed lines) over the 25 traits for Montbéliarde (▪), Normande (•) and Holstein (▴) when the maximum number of SNPs selected by EN is restricted to the value indicated on the x-axis. Continuous lines represent the actual number of selected SNPs.