Published online by Cambridge University Press: 05 February 2016
Data visualization denotes techniques for presenting complex data sets visually in order to display multiple data dimensions simultaneously, connect related data points across data sets, or show data distribution patterns. These techniques are of great value for data processing, data analysis, and data presentation.
Genomics and functional genomics are the major driving forces behind the development and use of visualization tools in biology. Following the completion of the genome sequencing projects for human and other model organisms around the beginning of this century, the number of known genes has jumped to the tens of thousands per species. Expression profiling microarrays can generate millions of data points per experiment. The challenge posed by the sheer size of these data sets, together with the need to integrate different data sources in analyses, has prompted significant research and development work by both academic and industrial bioinformaticians. As a result, many visualization methods, proposals, and tools for biological data have been developed. This chapter describes the problems and solutions in visualizing the three most basic and largest (and thus most challenging) genomics/functional genomics data types. The first two sections discuss visualization of sequence data and of pathway/gene network data, two data types specific to genomics and other fields of biology. The third section reviews visualization methods for numeric data, such as expression profiling data, proteomic data, and genotyping data. Most of the techniques in that section can also be applied to other areas. However, some topics, such as viewing numeric data in the context of a genome or of pathways, remain biology-specific.
Sequence and genomes
The genome is the complete set of genetic material of an organism, including genes, regulatory and replication-related sequences, and non-functional intergenic regions. For most organisms other than RNA viruses, long linear or circular DNA molecules form the biochemical basis of the genome and store all the genetic information. Visualization of the genome refers to the visual display of DNA sequences and their associated annotations. Depending on their purpose, genome visualization tools fall into two categories: sequence viewers, for visualizing sequences and annotations, and genome alignment viewers, for comparing different genomes.
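To make the idea of a sequence viewer concrete, the following is a minimal text-only sketch, not a description of any existing tool. It prints a DNA sequence in fixed-width rows with a single annotation track underneath. The function name `render_tracks`, the `(label, start, end)` annotation tuple (0-based, end-exclusive), and the example features are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical annotation record: (label, start, end), 0-based, end-exclusive.
Annotation = Tuple[str, int, int]


def render_tracks(sequence: str, annotations: List[Annotation], width: int = 60) -> str:
    """Return a plain-text view: sequence rows, each with an annotation track below."""
    # One track character per base; "." means "no annotation here".
    track = ["."] * len(sequence)
    for label, start, end in annotations:
        mark = label[:1] or "#"  # first letter of the label, "#" if the label is empty
        # Clip the feature to the sequence bounds before marking it.
        for i in range(max(start, 0), min(end, len(sequence))):
            track[i] = mark  # later annotations overwrite earlier ones
    track_str = "".join(track)

    lines = []
    for offset in range(0, len(sequence), width):
        # Left column shows the 1-based coordinate of the first base in the row.
        lines.append(f"{offset + 1:>6} {sequence[offset:offset + width]}")
        lines.append(f"{'':>6} {track_str[offset:offset + width]}")
    return "\n".join(lines)


if __name__ == "__main__":
    seq = "ATGGCGTACGTTAGCTAGGCTAACGT"  # made-up sequence
    feats = [("gene", 0, 12), ("regulatory", 15, 20)]  # made-up features
    print(render_tracks(seq, feats, width=13))
```

A single track cannot show overlapping features, so a viewer that needs them would stack several tracks. The sketch keeps one track only for brevity.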