
9 - Multiple Sequence Alignment

from PART II - MOLECULAR PHYLOGENETICS

Published online by Cambridge University Press:  26 October 2017

Tandy Warnow
Affiliation:
University of Illinois, Urbana-Champaign

Summary

Introduction

Phylogeny estimation generally begins by estimating a multiple sequence alignment on the set of sequences. Once the multiple sequence alignment is computed, a tree can then be computed on the alignment (Figure 9.1). Not surprisingly, errors in multiple sequence alignment estimation tend to produce errors in estimated trees (Ogden and Rosenberg, 2006; Nelesen et al., 2008; Liu et al., 2009a; Wang et al., 2012) and other downstream analyses. Hence, multiple sequence alignment is an important part of phylogeny estimation.

As we have seen, there are many methods for estimating trees from gap-free data. However, because multiple sequence alignments almost always contain gaps, represented as dashes, phylogeny estimation methods must be modified to be able to analyze alignments with dashes. Typically this is done by treating the dashes as missing data (i.e., the position is assumed to hold an actual nucleotide or amino acid whose identity is not known). Alternatively, the dashes are sometimes treated as an additional state in the sequence evolution model, thus producing five states for nucleotide alignments or 21 states for amino acid alignments. Finally, sometimes sites (i.e., columns in the multiple sequence alignment) containing dashes are eliminated from the alignment before a tree is computed. These different treatments of gaps can result in quite different theoretical and empirical performance.
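The three treatments of dashes described above can be illustrated with a minimal Python sketch. It is not taken from the chapter: the function names, the toy alignment, the use of `?` as the missing-data symbol, and the integer encoding of states are all assumptions made for illustration, and the sketch only preprocesses the alignment rather than estimating a tree.

```python
from typing import Dict, List

# Hypothetical representation: taxon name -> aligned row; all rows have equal length.
Alignment = Dict[str, str]

NUCLEOTIDE_STATES = "ACGT"


def gaps_as_missing(alignment: Alignment, missing: str = "?") -> Alignment:
    """Treatment 1: recode each dash as a missing-data symbol (assumed here to be '?')."""
    return {name: row.replace("-", missing) for name, row in alignment.items()}


def gaps_as_extra_state(
    alignment: Alignment, states: str = NUCLEOTIDE_STATES
) -> Dict[str, List[int]]:
    """Treatment 2: encode the dash as one additional state (index 4 for nucleotides)."""
    index = {ch: i for i, ch in enumerate(states + "-")}
    return {name: [index[ch] for ch in row] for name, row in alignment.items()}


def drop_gapped_sites(alignment: Alignment) -> Alignment:
    """Treatment 3: remove every site (column) in which at least one row has a dash."""
    rows = list(alignment.values())
    n_sites = len(rows[0]) if rows else 0
    keep = [j for j in range(n_sites) if all(row[j] != "-" for row in rows)]
    return {name: "".join(row[j] for j in keep) for name, row in alignment.items()}


if __name__ == "__main__":
    # Toy nucleotide alignment, invented for this example.
    toy = {"s1": "AC-GT", "s2": "A-TGT", "s3": "ACTGT"}
    print(gaps_as_missing(toy))
    print(gaps_as_extra_state(toy))
    print(drop_gapped_sites(toy))
```

On the toy alignment, the third treatment keeps only three of the five sites, which shows how quickly column removal can discard data when gaps are spread across many columns.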

Multiple sequence alignments are computed for different purposes, including phylogeny estimation and protein structure prediction, and the definition of what constitutes a correct alignment depends, at least in part, on the purpose for the alignment. For some biological datasets, curated alignments, typically based on experimentally confirmed structural features of the molecules (e.g., secondary structures or tertiary structures of RNAs and proteins), are used as benchmarks for evaluating alignment methods. Examples of such benchmarks for evaluating large amino acid alignments include HomFam (Sievers et al., 2011), BAliBASE (Thompson et al., 1999), and the 10AA collection (Nguyen et al., 2015b), while the Comparative Ribosomal Website (CRW) provides benchmarks for RNA alignment (Cannone et al., 2002). Evolutionary alignments, on the other hand, are defined by the evolutionary history relating the sequences.

Type: Chapter
Book: Computational Phylogenetics: An Introduction to Designing Methods for Phylogeny Estimation, pp. 178-233
Publisher: Cambridge University Press
Print publication year: 2017

  • Multiple Sequence Alignment
  • Tandy Warnow, University of Illinois, Urbana-Champaign
  • Book: Computational Phylogenetics
  • Online publication: 26 October 2017
  • Chapter DOI: https://doi.org/10.1017/9781316882313.011