Book contents
- Frontmatter
- Contents
- Preface
- I Exact String Matching: The Fundamental String Problem
- II Suffix Trees and Their Uses
- III Inexact Matching, Sequence Alignment, Dynamic Programming
- IV Currents, Cousins, and Cameos
- 16 Maps, Mapping, Sequencing, and Superstrings
- 17 Strings and Evolutionary Trees
- 18 Three Short Topics
- 19 Models of Genome-Level Mutations
- Epilogue – where next?
- Bibliography
- Glossary
- Index
16 - Maps, Mapping, Sequencing, and Superstrings
from IV - Currents, Cousins, and Cameos
Published online by Cambridge University Press: 23 June 2010
Summary
A look at some DNA mapping and sequencing problems
In this chapter we consider a number of theoretical and practical issues in creating and using genome maps and in large-scale (genomic) DNA sequencing. These areas are considered in this book for two reasons: First, we want to more completely explain the origin of molecular sequence data, since string problems on such data provide a large part of the motivation for studying string algorithms in general. Second, we need to more completely explain specific problems on strings that arise in obtaining molecular sequence data.
We start with a discussion of mapping in general and the distinction between physical maps and genetic maps. This leads to the discussion of several physical mapping techniques such as STS-content mapping and radiation-hybrid mapping. Our discussion emphasizes the combinatorial and computational aspects common to those techniques. We follow with a discussion of the tightest layout problem, and a short introduction to map comparison and map alignment. Then we move to large-scale sequencing and its relation to physical mapping. We emphasize shotgun sequencing and the string problems involved in sequence assembly under the shotgun strategy. Shotgun sequencing leads naturally to a beautiful pure string problem, the shortest common superstring problem. This pure, exact string problem is motivated by the practical problem of shotgun sequence assembly and deserves attention if only for the elegance of the results that have been obtained.
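To make the shortest common superstring problem concrete: given a set of fragments, the goal is a shortest string that contains every fragment as a substring. The sketch below is a well-known greedy heuristic that repeatedly merges the pair of fragments with the largest suffix-prefix overlap. It is an illustration only, not necessarily the treatment given in the chapter; the function names `overlap` and `greedy_superstring` are hypothetical, and the result is a valid superstring but is not guaranteed to be the shortest one.

```python
from itertools import permutations


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(fragments: list[str]) -> str:
    """Greedy heuristic (an assumption for illustration, not an exact solver)."""
    # Drop duplicates, then drop fragments already contained in another one.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        # Find the ordered pair (i, j) with the largest overlap.
        best_k, best_i, best_j = -1, 0, 1
        for i, j in permutations(range(len(frags)), 2):
            k = overlap(frags[i], frags[j])
            if k > best_k:
                best_k, best_i, best_j = k, i, j
        # Merge the pair, writing the overlapping region only once.
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)

    return frags[0] if frags else ""


if __name__ == "__main__":
    reads = ["ATTAG", "TAGGC", "GGCAT"]
    print(greedy_superstring(reads))
```

In shotgun assembly terms, the fragments play the role of sequenced reads and the superstring is a candidate reconstruction. Real reads contain errors and repeats, so this exact-overlap model is a simplification.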
- Type: Chapter
- Information: Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, pp. 395-446
- Publisher: Cambridge University Press
- Print publication year: 1997