Preface

Veli Mäkinen; Djamal Belazzougui; Fabio Cunial; Alexandru I. Tomescu

doi:10.1017/CBO9781139940023.002

Published online by Cambridge University Press: 05 May 2015

Veli Mäkinen: University of Helsinki
Djamal Belazzougui: University of Helsinki
Fabio Cunial: University of Helsinki
Alexandru I. Tomescu: University of Helsinki

Summary

Background

High-throughput sequencing has recently revolutionized the field of biological sequence analysis, both by stimulating the development of fundamentally new data structures and algorithms, and by changing the routine workflow of biomedical labs. Most key analytical steps now exploit index structures based on the Burrows–Wheeler transform, which have been under active development in theoretical computer science for over ten years. The ability of these structures to scale to very large datasets quickly led to their widespread adoption by the bioinformatics community, and their flexibility continues to spur new applications in genomics, transcriptomics, and metagenomics. Despite their fast and still ongoing development, the key techniques behind these indexes are by now well understood, and they are ready to be taught in graduate-level computer science courses.

This book focuses on the rigorous description of the fundamental algorithms and data structures that power modern sequence analysis workflows, ranging from the foundations of biological sequence analysis (like alignments and hidden Markov models) and classical index structures (like k-mer indexes, suffix arrays, and suffix trees), to Burrows–Wheeler indexes and to a number of advanced omics applications built on such a basis. The topics and the computational problems are chosen to cover the actual steps of large-scale sequencing projects, including read alignment, variant calling, haplotyping, fragment assembly, alignment-free genome comparison, compression of genome collections and of read sets, transcript prediction, and analysis of metagenomic samples: see Figure 1 for a schematic summary of all the main steps and data structures covered in this book. Although strongly motivated by high-throughput sequencing, many of the algorithms and data structures described in this book are general, and can be applied to a number of other fields that require the processing of massive sets of sequences. Most of the book builds on a coherent, self-contained set of algorithmic techniques and tools, which are gradually introduced, developed, and refined from the basics to more advanced variations.

Information

Type: Chapter
Information: Genome-Scale Algorithm Design: Biological Sequence Analysis in the Era of High-Throughput Sequencing, pp. xvii-xxii

DOI: https://doi.org/10.1017/CBO9781139940023.002

Publisher: Cambridge University Press

Print publication year: 2015
