AlleleCoder: a PERL script for coding co-dominant polymorphism data for PCA

  • Angela M. Baldo (a1), David M. Francis (a2), Martina Caramante (a3), Larry D. Robertson (a1) and Joanne A. Labate (a1)...

A useful biological interpretation of diploid heterozygotes is in terms of the dose of the common allele (0, 1 or 2 copies). We have developed a PERL script that converts FASTA files into coded spreadsheets suitable for principal component analysis. In combination with R and R Commander, two- and three-dimensional plots can be generated for visualizing genetic relationships. Such plots are useful for characterizing plant genetic resources. This method nicely illustrated the spectrum of genetic diversity in tomato landraces and the varieties categorized according to human-mediated dispersal.
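The coding step described above can be illustrated with a minimal Python sketch. It is not the authors' Perl script, and several details are assumptions made for illustration: each FASTA record is taken to hold one accession's genotypes, one character per site; heterozygotes are assumed to be written as two-base IUPAC ambiguity codes (for example `R` for A/G); and the names `parse_fasta` and `code_by_common_allele` are hypothetical. Each site is coded as the number of copies (0, 1 or 2) of its most common allele.

```python
from collections import Counter

# Standard IUPAC nucleotide codes, expanded to diploid genotypes.
# Assumption: heterozygotes are stored as two-base ambiguity codes.
IUPAC = {
    "A": "AA", "C": "CC", "G": "GG", "T": "TT",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
}


def parse_fasta(text):
    """Return {record_name: sequence} from FASTA-formatted text."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def code_by_common_allele(records):
    """Code each site as the dose (0, 1 or 2) of its most common allele.

    Symbols outside the IUPAC table above (N, gaps, ...) become None.
    """
    names = list(records)
    n_sites = min((len(seq) for seq in records.values()), default=0)
    coded = {name: [] for name in names}
    for i in range(n_sites):
        genotypes = {name: IUPAC.get(records[name][i]) for name in names}
        counts = Counter(a for g in genotypes.values() if g for a in g)
        if not counts:
            # No usable genotype at this site: mark it missing for everyone.
            for name in names:
                coded[name].append(None)
            continue
        # Most frequent allele; ties are broken alphabetically.
        common = min(counts, key=lambda a: (-counts[a], a))
        for name in names:
            g = genotypes[name]
            coded[name].append(None if g is None else g.count(common))
    return coded


example = """>acc1
AR
>acc2
AG
>acc3
GG
"""

coded = code_by_common_allele(parse_fasta(example))
for name, doses in coded.items():
    print(name, doses)
```

The resulting table (one row per accession, one column per site) is the kind of numeric matrix that can be saved as a spreadsheet and passed to a principal component analysis in R.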

Plant Genetic Resources
  • ISSN: 1479-2621
  • EISSN: 1479-263X
  • URL: /core/journals/plant-genetic-resources
Supplementary Materials

  • Baldo Supplementary Material (PDF, 249 KB)