Improved Lasso for genomic selection



Empirical experience with genomic selection in dairy cattle suggests that, for some traits, the distribution of the effects of single nucleotide polymorphisms (SNPs) may be far from normal. An alternative that avoids the use of arbitrary prior information is the Bayesian Lasso (BL). The regular BL uses a common variance parameter for residual and SNP effects (BL1Var). We propose here a BL with different residual and SNP effect variances (BL2Var), equivalent to the original Lasso formulation. The λ parameter in the Lasso is related to genetic variation in the population. We also suggest precomputing individual variances of SNP effects with BL2Var, to be used later in a linear mixed model (HetVar-GBLUP). Models were tested in a cross-validation design including 1756 Holstein and 678 Montbéliarde French bulls, with 1216 and 451 bulls, respectively, used as training data; 51 325 and 49 625 polymorphic SNPs, respectively, were used. Milk production traits were tested. Other methods tested included linear mixed models using variances inferred from pedigree estimates or integrated out from the data. Estimates of genetic variation in the population were close to pedigree estimates in BL2Var but not in BL1Var. BL1Var shrank breeding values too little because of the common variance. BL2Var was the most accurate method for prediction and accommodated major genes well, in particular for fat percentage. BL1Var was the least accurate. HetVar-GBLUP was almost as accurate as BL2Var and allows for simple computations and extensions.
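The link between λ and genetic variation can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual double-exponential prior on each SNP effect, with density (λ/2)·exp(−λ|a|) and therefore variance 2/λ², written as a normal whose individual variance τ² is exponential with rate λ²/2. It further assumes Hardy-Weinberg and linkage equilibrium, so that the genetic variance is 2Σp(1−p) times the SNP effect variance. All function names and the allele frequencies are hypothetical.

```python
import math
import random


def snp_effect_variance(lam):
    """Variance of a double-exponential prior with density (lam/2)*exp(-lam*|a|)."""
    return 2.0 / lam ** 2


def expected_heterozygosity(allele_freqs):
    """Sum of 2*p*(1-p) over SNPs (assumes Hardy-Weinberg equilibrium)."""
    return sum(2.0 * p * (1.0 - p) for p in allele_freqs)


def genetic_variance(lam, allele_freqs):
    """Genetic variance implied by lam under the assumptions stated above."""
    return expected_heterozygosity(allele_freqs) * snp_effect_variance(lam)


def lambda_from_genetic_variance(sigma2_u, allele_freqs):
    """Inverse of genetic_variance: the lam that implies a given genetic variance."""
    return math.sqrt(2.0 * expected_heterozygosity(allele_freqs) / sigma2_u)


def sample_snp_effect(lam, rng):
    """Draw one SNP effect through the normal scale-mixture form of the prior."""
    tau2 = rng.expovariate(lam ** 2 / 2.0)  # individual SNP variance, mean 2/lam^2
    return rng.gauss(0.0, math.sqrt(tau2))


if __name__ == "__main__":
    rng = random.Random(1)
    freqs = [0.1, 0.25, 0.5, 0.4]  # hypothetical allele frequencies
    lam = lambda_from_genetic_variance(1.0, freqs)
    draws = [sample_snp_effect(lam, rng) for _ in range(100000)]
    empirical = sum(a * a for a in draws) / len(draws)
    print(lam, snp_effect_variance(lam), empirical)
```

In this parameterization a larger λ means stronger shrinkage and a smaller implied genetic variance. The individual variances τ² drawn here are the kind of per-SNP quantity that a heterogeneous-variance linear mixed model could take as fixed inputs.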


Corresponding author

*Corresponding author. INRA, UR 631 SAGA, BP52627, F-31326 Castanet Tolosan, France. Tel: +33561285182. Fax: +33561285353. e-mail:



