
7 - A Latent Variable Approach for Integrative Clustering of Multiple Genomic Data Types

from Part B - Vertical Integrative Analysis (General Methods)

Published online by Cambridge University Press: 05 September 2015

George Tseng, University of Pittsburgh
Debashis Ghosh, Pennsylvania State University
Xianghong Jasmine Zhou, University of Southern California
Ronglai Shen, Memorial Sloan Kettering Cancer Center, New York, NY

Abstract

Clustering analysis is an unsupervised learning method that aims to group data into subsets based on the similarity among the data points. In gene expression microarray studies, clustering analysis has been used to identify biologically meaningful disease subtypes (samples in the same subtype share similar gene expression profiles), or to discover gene expression modules co-regulated through a similar mechanism. Recent technological advances have facilitated integrated genomic profiling across multiple platforms simultaneously, including next-generation sequencing and high-throughput array platforms. With the rapid accumulation of multidimensional datasets, there is an increasing need for robust and scalable statistical and computational methods for the analysis of such datasets. This book covers a wide range of topics on information integration of omics datasets. In this chapter, we briefly review recent advances in integrative clustering methods, focusing on a latent variable approach developed by the authors and its extensions to perform variable selection and to account for both discrete and continuous data types in the joint model. We also discuss several important questions in clustering analysis, including how to determine the number of clusters and how to assess cluster stability. Finally, we demonstrate the application of the method to the TCGA colorectal cancer (CRC) dataset, which includes whole-exome DNA sequencing, Affymetrix SNP 6.0 array, and RNA-sequencing data from 276 CRC samples.

Introduction

Cancer is a heterogeneous disease. Identifying clinically relevant tumor subtypes that correlate with patient outcome (e.g., treatment response, survival) is an important yet difficult task. In recent years, molecular classification based on microarray gene expression data has led to important discoveries of novel cancer subtypes (Perou et al., 1999; Alizadeh et al., 2000; Sorlie et al., 2001; Lapointe et al., 2003; Hoshida et al., 2003). However, the biological and therapeutic implications of most cancer expression subtypes remain largely unknown because the underlying disease mechanisms are poorly understood. In addition, expression changes may be related to cellular activities independent of tumorigenesis, and may therefore lead to subtypes that are not directly relevant for diagnostic and prognostic purposes.

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2015
