2 - An Introduction to The Cancer Genome Atlas
- By Bradley M. Broom, The University of Texas; Rehan Akbani, The University of Texas
- Edited by Kim-Anh Do; Zhaohui Steve Qin, Emory University, Atlanta; Marina Vannucci, Rice University, Houston
- Book: Advances in Statistical Bioinformatics
- Published online: 05 June 2013
- Print publication: 10 June 2013, pp 31-53
- Chapter
Summary
Introduction
The Cancer Genome Atlas (TCGA) is an ambitious undertaking of the National Institutes of Health (NIH), jointly led by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), to identify all key genomic changes in the major types and subtypes of cancer. In the following section, we briefly review the history and goals of the TCGA project. Section 2.3 describes how samples are collected and analyzed by the TCGA. Section 2.4 details how data are processed, stored, and made available to qualified researchers. Section 2.5 briefly surveys several widely available tools that can be used to analyze TCGA data. Section 2.6 summarizes the chapter.
History and Goals of the TCGA Project
At the turn of the century, it was clear (Balmain et al., 2003) that genomic alterations played a key role in cancer development and progression. It was equally clear that understanding these changes would be enormously important both for devising improved methods of diagnosing clinically relevant cancer subtypes and for developing novel molecular therapies aimed at a specific cancer subtype. Several successful treatments that target cancer cells with specific genomic changes had already been developed, for instance Gleevec for chronic myeloid leukemia and Herceptin for breast cancer. Early experiments to determine the genomic basis of specific cancers had also made it clear that the genomic changes involved were enormously complex: an individual cancer could involve hundreds or thousands of genomic alterations, and these changes were for the most part specific to that cancer.
12 - Methods for the Analysis of Copy Number Data in Cancer Research
- By Bradley M. Broom, The University of Texas; Kim-Anh Do, The University of Texas; Melissa Bondy, Baylor College of Medicine; Patricia Thompson, University of Arizona; Kevin Coombes, The University of Texas
- Edited by Kim-Anh Do; Zhaohui Steve Qin, Emory University, Atlanta; Marina Vannucci, Rice University, Houston
- Book: Advances in Statistical Bioinformatics
- Published online: 05 June 2013
- Print publication: 10 June 2013, pp 244-271
- Chapter
Summary
Introduction
Cancers are fundamentally caused by genomic changes in the cancer cells that lead to their uncontrolled growth (Balmain et al., 2003; Stratton et al., 2009). Understanding these changes, which include DNA copy number alterations, is an intense focus of current research into the causes of, and potential therapies for, every type of cancer. Major research projects, such as the Cancer Genome Atlas (TCGA) project (The Cancer Genome Atlas Research Network, 2008), aim to comprehensively catalog all genomic changes in cancer. This chapter discusses the problem of interpreting copy number data, specifically in the context of cancer research.
To measure copy number, whole-genome genotyping array assays hybridize sample DNA to oligonucleotides deposited on the array. Modern designs use synthetic oligonucleotides to measure copy number at frequent intervals along the genome, especially in regions of known copy number variation. Modern arrays also include many probes that target both alleles of a large number of common single-nucleotide polymorphisms (SNPs). These platforms are therefore widely used in genotyping studies. Array-based assays available for measuring genome-wide copy number include arrays from Illumina, Sentrix, Agilent, and Affymetrix. Next-generation sequencing of DNA can also be used to detect copy number alterations and is rapidly becoming cost-competitive with array-based platforms.
Molecular inversion probe (MIP) arrays (Wang et al., 2007, 2009; Ji and Welch, 2009) are another platform that can be used for large-scale copy number analysis and genotyping. Compared with other array-based approaches, MIP technology uses less DNA, can handle lower-quality DNA, has a greater dynamic range, has higher-quality markers, and better separates allelic information.
20 - Computational Methods for Learning Bayesian Networks from High-Throughput Biological Data
- Edited by Kim-Anh Do, University of Texas, MD Anderson Cancer Center; Peter Müller, Swiss Federal Institute of Technology, Zürich; Marina Vannucci, Rice University, Houston
- Book: Bayesian Inference for Gene Expression and Proteomics
- Published online: 23 November 2009
- Print publication: 24 July 2006, pp 385-400
- Chapter
Summary
Abstract
Data from high-throughput technologies, such as gene expression microarrays, promise to yield insight into the nature of the cellular processes that have been disrupted by disease, thus improving our understanding of the disease and hastening the discovery of effective new treatments. Most of the analysis thus far has focused on identifying differential measurements, which form the basis of biomarker discovery. However, merely listing differentially expressed genes or gene products is not sufficient to explain the molecular basis of disease. Consequently, there is increasing interest in extracting more information from available data in the form of biologically meaningful relationships between the quantities being measured. The holy grail of such techniques is the robust identification of causal models of disease from data.
The goal of this chapter is to survey computational learning methods that extract models of altered interactions that lead to and occur in the diseased state. Our focus is on methods that represent biological processes as Bayesian networks and that learn these networks from experimental measurements of cellular activity. Specifically, we will survey computational methods for learning Bayesian networks from high-throughput biological data.
Introduction
Many diseases, especially cancers, involve the disruption or deregulation of many cellular processes. It is hoped that high-throughput technologies, such as gene expression microarrays – which provide a snapshot of the level of gene transcription occurring in a cell, for many thousands of genes – will yield insight into the nature of the affected processes, improve our understanding of the disease, and hasten the discovery of effective new treatments.
However, merely identifying differential measurements is not enough.