The contributions in this part of the book were all written by students, postdocs and visitors who in some way were involved in the graduate course Algebraic Statistics for Computational Biology that we taught in the mathematics department at UC Berkeley during the fall of 2004. The eighteen chapters offer a more in-depth study of some of the themes which were introduced in Part I. Most of the chapters contain original research that has not been published elsewhere. Highlights among new research results include:
New results about polytope propagation and parametric inference (Chapters 5, 6 and 8).
An example of a biologically correct alignment which is not the optimal alignment for any choice of parameters in the pair HMM (Chapter 7).
Theorem 9.3, which states that the number of inference functions of a graphical model grows polynomially for a fixed number of parameters.
Theorem 10.5, which states that, for alphabets with four or more letters, every toric Viterbi sequence is a Viterbi sequence.
Explicit calculations of phylogenetic invariants for the strand symmetric model, which interpolates between the general reversible model and group-based models (Chapter 16).
Tree reconstruction based on singular value decomposition (Chapter 19).
The other chapters also include new mathematical results or methodological advances in computational biology. Chapter 15 introduces a standardized framework for working with small trees. Even results on the smallest non-trivial tree (with three leaves) are interesting, and are discussed in Chapter 18. Similarly, Chapter 14 presents a unified algebraic statistical view of mutagenetic tree models.
Part I of this book is devoted to outlining the basic principles of algebraic statistics and their relationship to computational biology. Although some of the ideas are complex, and their relationships intricate, the underlying philosophy of our approach to biological sequence analysis is summarized in the cartoon on the cover of the book. The fictional character is DiaNA, who appears throughout the book, and is the statistical surrogate for our biological intuition. In the cartoon, DiaNA is walking randomly on a graph and is tossing tetrahedral dice that can land on one of the letters A, C, G or T. A key feature of the tosses is that the outcome depends on the direction of her route. We, the observers, record the letters that appear on the successive throws, but are unable to see the path that DiaNA takes on her graph. Our goal is to guess DiaNA's path from the die roll outcomes. That is, we wish to make an inference about missing data from certain observations.
In this book, the observed data are DNA sequences. A standard problem of computational biology is to infer an optimal alignment for two given DNA sequences. We shall see that this problem is precisely our example of guessing DiaNA's path. In Chapter 4 we give an introduction to the relevant biological concepts, and we argue that our example is not just a toy problem but is fundamental for designing efficient algorithms for analyzing real biological data.
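The problem of guessing DiaNA's path can be illustrated with a small hidden Markov model. The following is only a minimal sketch, not the book's own formulation: it assumes a hypothetical two-node graph with nodes `u` and `v`, assumes that the die depends on the node DiaNA currently stands on (a simplification of the cartoon, where the outcome depends on the direction of her route), and uses made-up probabilities. The function `viterbi` and all parameter names are illustrative.

```python
import math


def viterbi(observations, states, start, trans, emit):
    """Return (log_probability, path) for the most probable hidden path.

    All probabilities are assumed to be strictly positive so that
    math.log is defined everywhere.
    """
    # best[s] holds (log-probability, path) of the best path ending in state s.
    best = {
        s: (math.log(start[s]) + math.log(emit[s][observations[0]]), [s])
        for s in states
    }
    for symbol in observations[1:]:
        new_best = {}
        for s in states:
            # Choose the best predecessor r for arriving at state s.
            score, path = max(
                (best[r][0] + math.log(trans[r][s]), best[r][1]) for r in states
            )
            new_best[s] = (score + math.log(emit[s][symbol]), path + [s])
        best = new_best
    # The overall optimum is the best entry over all final states.
    return max(best.values())


# Hypothetical toy parameters (not taken from the book).
states = ["u", "v"]
start = {"u": 0.5, "v": 0.5}
trans = {"u": {"u": 0.9, "v": 0.1}, "v": {"u": 0.1, "v": 0.9}}
emit = {
    "u": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},  # node u favours A and T
    "v": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},  # node v favours C and G
}

log_prob, path = viterbi("AATCGG", states, start, trans, emit)
print("".join(path), log_prob)
```

Working in log-probabilities replaces products by sums, so the recursion maximizes a sum of weights along a path. Under these toy parameters the recovered path stays at `u` for the A/T-rich prefix and moves to `v` for the C/G-rich suffix.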
The quantitative analysis of biological sequence data is based on methods from statistics coupled with efficient algorithms from computer science. Algebra provides a framework for unifying many of the seemingly disparate techniques used by computational biologists. This book, first published in 2005, offers an introduction to this mathematical framework and describes tools from computational algebra for designing new algorithms for exact, accurate results. These algorithms can be applied to biological problems such as aligning genomes, finding genes and constructing phylogenies. The first part of this book consists of four chapters on the themes of Statistics, Computation, Algebra and Biology, offering speedy, self-contained introductions to the emerging field of algebraic statistics and its applications to genomics. In the second part, the four themes are combined and developed to tackle real problems in computational genomics. As the first book in this exciting and dynamic area, it will be welcomed as a text for self-study or for advanced undergraduate and beginning graduate courses.