
AlleleCoder: a PERL script for coding co-dominant polymorphism data for PCA

  • Angela M. Baldo (a1), David M. Francis (a2), Martina Caramante (a3), Larry D. Robertson (a1) and Joanne A. Labate (a1)...

A useful biological interpretation of diploid heterozygotes is in terms of the dose of the common allele (0, 1 or 2 copies). We have developed a PERL script that converts FASTA files into coded spreadsheets suitable for principal component analysis. In combination with R and R Commander, two- and three-dimensional plots can be generated for visualizing genetic relationships. Such plots are useful for characterizing plant genetic resources. This method nicely illustrated the spectrum of genetic diversity in tomato landraces and the varieties categorized according to human-mediated dispersal.
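The dose coding described above can be sketched in a few lines of Python. This is not the AlleleCoder script and it does not parse FASTA; it is a minimal illustration that assumes genotypes are already available as pairs of alleles per locus. The function name `code_by_common_allele`, the input layout and the sample data are all hypothetical. For each locus the sketch finds the most frequent allele in the sample and codes every diploid genotype as the number of copies (0, 1 or 2) of that allele.

```python
from collections import Counter


def code_by_common_allele(genotypes):
    """Code diploid genotypes as the dose of the common allele at each locus.

    genotypes: dict mapping a sample name to a list of (allele1, allele2)
               tuples, one tuple per locus (hypothetical input layout).
    Returns (common, coded): the common allele per locus, and a dict
    mapping each sample to its list of doses (0, 1 or 2).
    """
    samples = list(genotypes)
    n_loci = len(genotypes[samples[0]])

    common = []
    for i in range(n_loci):
        counts = Counter()
        for sample in samples:
            counts.update(genotypes[sample][i])  # counts both alleles
        # Highest count wins; ties are broken alphabetically (an assumption).
        common.append(min(counts, key=lambda a: (-counts[a], a)))

    coded = {
        sample: [genotypes[sample][i].count(common[i]) for i in range(n_loci)]
        for sample in samples
    }
    return common, coded


# Made-up example: four accessions scored at two SNP loci.
example = {
    "acc1": [("A", "A"), ("C", "T")],
    "acc2": [("A", "G"), ("T", "T")],
    "acc3": [("G", "G"), ("T", "T")],
    "acc4": [("A", "A"), ("C", "C")],
}
common_alleles, dose_matrix = code_by_common_allele(example)
for name, doses in dose_matrix.items():
    print(name, doses)
```

A matrix of this kind (samples in rows, loci in columns) could then be written to a spreadsheet or CSV file and loaded into R for principal component analysis. Missing data and loci with more than two alleles are not handled in this sketch.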

References
Fox, J (2005) The R Commander: a basic statistics graphical user interface to R. Journal of Statistical Software 14: 1–42.
Horne, BD and Camp, NJ (2004) Principal component analysis for selection of optimal SNP-sets that capture intragenic genetic variation. Genetic Epidemiology 26: 11–21.
Labate, JA, Sheffer, SM, Balch, T and Robertson, LD (2011) Diversity and population structure in a geographic sample of tomato accessions. Crop Science. doi: 10.2135/cropsci2010.05.0305 (in press).
Lin, Z and Altman, RB (2004) Finding haplotype tagging SNPs by use of principal components analysis. American Journal of Human Genetics 75: 850–861.
Peakall, R and Smouse, PE (2006) GENALEX 6: genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes 6: 288–295.
Pearson, WR and Lipman, DJ (1988) Improved tools for biological sequence comparison. Proceedings of the National Academy of Sciences USA 85: 2444–2448.
R Development Core Team (2011) A Language and Environment for Statistical Computing. Vienna, Austria. R Foundation for Statistical Computing.
Rohlf, FJ (2002) NTSYSpc: Numerical Taxonomy System, Version 2.1. Setauket, NY: Exeter Publishing, Ltd.
Stajich, JE, Block, D, Boulez, K, Brenner, SE, Chervitz, SA, Dagdigian, C, Fuellen, G, Gilbert, JG, Korf, I, Lapp, H, Lehväslaiho, H, Matsalla, C, Mungall, CJ, Osborne, BI, Pocock, MR, Schattner, P, Senger, M, Stein, LD, Stupka, E, Wilkinson, MD and Birney, E (2002) The Bioperl toolkit: Perl modules for the life sciences. Genome Research 12: 1611–1618.

Plant Genetic Resources
  • ISSN: 1479-2621
  • EISSN: 1479-263X
  • URL: /core/journals/plant-genetic-resources


Supplementary materials

Baldo Supplementary Material: PDF (249 KB)

