Joining the dots: network analysis of gene perturbation data

doi:10.1017/CBO9781139012751.006

6 - Joining the dots: network analysis of gene perturbation data

Published online by Cambridge University Press: 05 July 2015

Xin Wang, Ke Yuan and Florian Markowetz

Edited by Florian Markowetz and Michael Boutros

Xin Wang: Cancer Research UK Cambridge Institute
Ke Yuan: Cancer Research UK Cambridge Institute
Florian Markowetz: Cancer Research UK Cambridge Institute
Michael Boutros: German Cancer Research Center, Heidelberg

Summary

How to link genotypes and phenotypes is a long-standing question in modern biology. Modern high-throughput approaches are key technologies at the forefront of genetic research. They enable the analysis of a biological response to thousands of experimental perturbations and require a tight collaboration between experimental and computational scientists. Perturbation studies and computational approaches have revolutionized research in functional genomics and genetics and promise to lay the foundation for personalized medicine. For modern high-throughput technologies, computation is as important as experimentation. Genome-wide image-based RNA interference (RNAi) screens, for example, are only feasible because of computational techniques. Computational skills to analyse the data have become as important as experimental skills to generate the data.

Design and analysis of phenotyping screens depend on the number of genes perturbed and the richness of the phenotype observed (Figure 6.1). At one extreme are high-throughput screens with single reporters, e.g. a genome-wide screen for new components of a pathway. At the other extreme are perturbations of individual genes with very rich phenotypes, e.g. assessing the effects of a single gene perturbation on several molecular levels over time. Between these two extremes lie a variety of possible screen designs. Two widely used scenarios are small-scale perturbations (<20 genes) of a single target pathway with rich readouts, e.g. a global transcriptional profile, and medium-scale perturbations (hundreds of genes) with multi-parametric readouts, e.g. cell morphology or growth in different media. In the following we will discuss statistical and computational methodologies for functional analysis in all four scenarios.

Scenario 1: Genome-wide screens with single reporters

RNAi screens have been frequently and successfully applied for functional profiling of genes on a large scale (Boutros & Ahringer 2008). The vast majority of these applications use a single phenotype (e.g. cell viability, growth rate, activity of reporter constructs) to characterize the function of genes in specific biological pathways.

Information

Type: Chapter
Information: Systems Genetics: Linking Genotypes and Phenotypes, pp. 83–107

DOI: https://doi.org/10.1017/CBO9781139012751.006

Publisher: Cambridge University Press

Print publication year: 2015

References

Ahmed, A. & Xing, E. P. (2009), ‘Recovering time-varying networks of dependencies in social and biological studies’, Proceedings of the National Academy of Sciences of the USA 106(29), 11878–11883.

Alexa, A., Rahnenführer, J. & Lengauer, T. (2006), ‘Improved scoring of functional groups from gene expression data by decorrelating GO graph structure’, Bioinformatics 22(13), 1600–1607.

Anchang, B., Sadeh, M., Jacob, J., Tresch, A., Vlad, M. et al. (2009), ‘Modeling the temporal interplay of molecular signaling and gene expression by using dynamic nested effects models’, Proceedings of the National Academy of Sciences of the USA 106(16), 6447–6452.

Arora, S., Gonzales, I., Hagelstrom, R., Beaudry, C., Choudhary, A. et al. (2010), ‘RNAi phenotype profiling of kinases identifies potential therapeutic targets in Ewing's sarcoma’, Molecular Cancer 9(1), 218.

Ashburner, M., Ball, C., Blake, J., Botstein, D., Butler, H. et al. (2000), ‘Gene Ontology: tool for the unification of biology’, Nature Genetics 25(1), 25–29.

Bakal, C., Aach, J., Church, G. & Perrimon, N. (2007), ‘Quantitative morphological signatures define local signaling networks regulating cell morphology’, Science 316(5832), 1753–1756.

Baryshnikova, A., Costanzo, M., Kim, Y., Ding, H., Koh, J. et al. (2010), ‘Quantitative analysis of fitness and genetic interactions in yeast on a genome scale’, Nature Methods 7(12), 1017–1024.

Battle, A., Jonikas, M. C., Walter, P., Weissman, J. S. & Koller, D. (2010), ‘Automated identification of pathways from quantitative genetic interaction data’, Molecular Systems Biology 6, 379.

Bauer, S., Grossmann, S., Vingron, M. & Robinson, P. (2008), ‘Ontologizer 2.0: a multifunctional tool for GO term enrichment analysis and data exploration’, Bioinformatics 24(14), 1650–1651.

Beißbarth, T. & Speed, T. (2004), ‘GOstat: find statistically overrepresented Gene Ontologies within a group of genes’, Bioinformatics 20(9), 1464–1465.

Beisser, D., Klau, G., Dandekar, T., Müller, T. & Dittrich, M. (2010), ‘BioNet: an R-package for the functional analysis of biological networks’, Bioinformatics 26(8), 1129–1130.

Birmingham, A., Selfors, L., Forster, T., Wrobel, D., Kennedy, C. et al. (2009), ‘Statistical methods for analysis of high-throughput RNA interference screens’, Nature Methods 6(8), 569–575.

Booker, M., Samsonova, A. A., Kwon, Y., Flockhart, I., Mohr, S. E. et al. (2011), ‘False negative rates in Drosophila cell-based RNAi screens: a case study’, BMC Genomics 12, 50.

Boutros, M. & Ahringer, J. (2008), ‘The art and design of genetic screens: RNA interference’, Nature Reviews Genetics 9(7), 554–566.

Boutros, M., Brás, L. P. & Huber, W. (2006), ‘Analysis of cell-based RNAi screens’, Genome Biology 7(7), R66.

Boutros, M., Kiger, A. A., Armknecht, S., Kerr, K., Hild, M. et al. (2004), ‘Genome-wide RNAi analysis of growth and viability in Drosophila cells’, Science 303(5659), 832–835.

Breitkreutz, B., Stark, C., Reguly, T., Boucher, L., Breitkreutz, A. et al. (2008), ‘The BioGRID interaction database: 2008 update’, Nucleic Acids Research 36(Suppl 1), D637–D640.

Brideau, C., Gunter, B., Pikounis, B. & Liaw, A. (2003), ‘Improved statistical methods for hit selection in high-throughput screening’, Journal of Biomolecular Screening 8(6), 634–647.

Castro, M., Wang, X., Fletcher, M., Meyer, K. & Markowetz, F. (2012), ‘RedeR: R/Bioconductor package for representing modular structures, nested networks and multiple levels of hierarchical associations’, Genome Biology 13(4), R29.

Cheung, H. W., Cowley, G. S., Weir, B. A., Boehm, J. S., Rusin, S. et al. (2011), ‘Systematic investigation of genetic vulnerabilities across cancer cell lines reveals lineage-specific dependencies in ovarian cancer’, Proceedings of the National Academy of Sciences of the USA 108(30), 12372–12377.

Collins, S., Miller, K., Maas, N., Roguev, A., Fillingham, J. et al. (2007), ‘Functional dissection of protein complexes involved in yeast chromosome biology using a genetic interaction map’, Nature 446(7137), 806–810.

Costanzo, M., Baryshnikova, A., Bellay, J., Kim, Y., Spear, E. et al. (2010), ‘The genetic landscape of a cell’, Science 327(5964), 425.

Dadgostar, H., Zarnegar, B., Hoffmann, A., Qin, X., Truong, U. et al. (2002), ‘Cooperation of multiple signaling pathways in CD40-regulated gene expression in B lymphocytes’, Proceedings of the National Academy of Sciences of the USA 99(3), 1497–1502.

de Hoon, M., Imoto, S. & Miyano, S. (2002), ‘A comparison of clustering techniques for gene expression data’, Proceedings of the 10th International Conference on Intelligent Systems for Molecular Biology, Abstract 33A.

Dempster, A., Laird, N. & Rubin, D. (1977), ‘Maximum likelihood from incomplete data via the EM algorithm’, Journal of the Royal Statistical Society, Series B (Methodological) 39(1), 1–38.

Echeverri, C. & Perrimon, N. (2006), ‘High-throughput RNAi screening in cultured cells: a user's guide’, Nature Reviews Genetics 7(5), 373–384.

Echeverri, C., Beachy, P., Baum, B., Boutros, M., Buchholz, F. et al. (2006), ‘Minimizing the risk of reporting false positives in large-scale RNAi screens’, Nature Methods 3(10), 777–779.

Eisen, M., Spellman, P., Brown, P. & Botstein, D. (1998), ‘Cluster analysis and display of genome-wide expression patterns’, Proceedings of the National Academy of Sciences of the USA 95(25), 14863–14868.

Failmezger, H., Praveen, P., Tresch, A. & Fröhlich, H. (2013), ‘Learning gene network structure from time lapse cell imaging in RNAi knockdowns’, Bioinformatics 29(12), 1534–1540.

Falcon, S. & Gentleman, R. (2007), ‘Using GOstats to test gene lists for GO term association’, Bioinformatics 23(2), 257–258.

Farha, M. & Brown, E. (2010), ‘Chemical probes of Escherichia coli uncovered through chemical-chemical interaction profiling with compounds of known biological activity’, Chemistry & Biology 17(8), 852–862.

Friedman, N. (2004), ‘Inferring cellular networks using probabilistic graphical models’, Science 303(5659), 799–805.

Friedman, N., Linial, M., Nachman, I. & Pe'er, D. (2000), ‘Using Bayesian networks to analyze expression data’, Journal of Computational Biology 7(3–4), 601–620.

Fröhlich, H., Beißbarth, T., Tresch, A., Kostka, D., Jacob, J. et al. (2008a), ‘Analyzing gene perturbation screens with nested effects models in R and Bioconductor’, Bioinformatics 24(21), 2549–2550.

Fröhlich, H., Fellmann, M., Sueltmann, H., Poustka, A. & Beissbarth, T. (2007), ‘Large scale statistical inference of signaling pathways from RNAi and microarray data’, BMC Bioinformatics 8, 386.

Fröhlich, H., Fellmann, M., Sueltmann, H., Poustka, A. & Beissbarth, T. (2008b), ‘Estimating large-scale signaling networks through nested effect models with intervention effects from microarray data’, Bioinformatics 24(22), 2650–2656.

Fröhlich, H., Praveen, P. & Tresch, A. (2011), ‘Fast and efficient dynamic nested effects models’, Bioinformatics 27(2), 238–244.

Fuchs, F., Pau, G., Kranz, D., Sklyar, O., Budjan, C. et al. (2010), ‘Clustering phenotype populations by genome-wide RNAi and multiparametric imaging’, Molecular Systems Biology 6, 370.

Geyer, C. (2010), ‘Introduction to Markov chain Monte Carlo’, in S. Brooks, A. Gelman, G. Jones & X.-L. Meng, eds, Handbook of Markov chain Monte Carlo, CRC Press, Boca Raton, FL, pp. 3–48.

Green, R., Kao, H., Audhya, A., Arur, S., Mayers, J. et al. (2011), ‘A high-resolution C. elegans essential gene network based on phenotypic profiling of a complex tissue’, Cell 145(3), 470–482.

Hahne, F., Arlt, D., Sauermann, M., Majety, M., Poustka, A. et al. (2006), ‘Statistical methods and software for the analysis of high-throughput reverse genetic assays using flow cytometry readouts’, Genome Biology 7(8), R77.

Horn, T., Sandmann, T., Fischer, B., Axelsson, E., Huber, W. et al. (2011), ‘Mapping of signaling networks through synthetic genetic interaction analysis by RNAi’, Nature Methods 8(4), 341–346.

House, C. D., Vaske, C. J., Schwartz, A. M., Obias, V., Frank, B. et al. (2010), ‘Voltage-gated Na+ channel SCN5A is a key regulator of a gene transcriptional network that controls colon cancer invasion’, Cancer Research 70(17), 6957–6967.

Jeffreys, H. (1998), Theory of probability, 3rd edn, Oxford University Press.

Jensen, L. J., Kuhn, M., Stark, M., Chaffron, S., Creevey, C. et al. (2009), ‘String 8: a global view on proteins and their functional interactions in 630 organisms’, Nucleic Acids Research 37(Database issue), D412–D416.

Kaderali, L., Dazert, E., Zeuge, U., Frese, M. & Bartenschlager, R. (2009), ‘Reconstructing signaling pathways from RNAi data using probabilistic Boolean threshold networks’, Bioinformatics 25(17), 2229–2235.

Kessler, J., Kahle, K., Sun, T., Meerbrey, K., Schlabach, M. et al. (2012), ‘A SUMOylation-dependent transcriptional subprogram is required for Myc-driven tumorigenesis’, Science 335(6066), 348–353.

Li, C. & Wong, W. (2001), ‘Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application’, Genome Biology 2(8), 0032.

Liberzon, A., Subramanian, A., Pinchback, R., Thorvaldsdottir, H., Tamayo, P. et al. (2011), ‘Molecular signatures database (MSigDB) 3.0’, Bioinformatics 27(12), 1739–1740.

Lu, R., Markowetz, F., Unwin, R. D., Leek, J. T., Airoldi, E. M. et al. (2009), ‘Systems level dynamic analyses of fate change in murine embryonic stem cells’, Nature 462(7271), 358–362.

Maathuis, M. H., Colombo, D., Kalisch, M. & Bühlmann, P. (2010), ‘Predicting causal effects in large-scale systems from observational data’, Nature Methods 7(4), 247–248.

Madigan, D., York, J. & Allard, D. (1995), ‘Bayesian graphical models for discrete data’, International Statistical Review/Revue Internationale de Statistique 63(2), 215–232.

Malo, N., Hanley, J., Cerquozzi, S., Pelletier, J. & Nadon, R. (2006), ‘Statistical practice in high-throughput screening data analysis’, Nature Biotechnology 24(2), 167–175.

Mani, R., St Onge, R., Hartman, J., Giaever, G. & Roth, F. (2008), ‘Defining genetic interaction’, Proceedings of the National Academy of Sciences of the USA 105(9), 3461–3466.

Markowetz, F. (2010), ‘How to understand the cell by breaking it: network analysis of gene perturbation screens’, PLoS Computational Biology 6(2), e1000655.

Markowetz, F. & Spang, R. (2007), ‘Inferring cellular networks - a review’, BMC Bioinformatics 8(Suppl 6), S5.

Markowetz, F., Bloch, J. & Spang, R. (2005a), ‘Non-transcriptional pathway features reconstructed from secondary effects of RNA interference’, Bioinformatics 21(21), 4026–4032.

Markowetz, F., Grossmann, S. & Spang, R. (2005b), ‘Probabilistic soft interventions in conditional Gaussian networks’, Proceedings of 10th International Workshop on Artificial Intelligence and Statistics.

Markowetz, F., Kostka, D., Troyanskaya, O. G. & Spang, R. (2007), ‘Nested effects models for high-dimensional phenotyping screens’, Bioinformatics 23(13), i305–i312.

Markowetz, F., Mulder, K. W., Airoldi, E. M., Lemischka, I. R. & Troyanskaya, O. G. (2010), ‘Mapping dynamic histone acetylation patterns to gene expression in nanog-depleted murine embryonic stem cells’, PLoS Computational Biology 6(12), e1001034.

Merico, D., Isserlin, R., Stueker, O., Emili, A. & Bader, G. (2010), ‘Enrichment map: a network-based method for gene-set enrichment visualization and interpretation’, PLoS One 5(11), e13984.

Mulder, K. W., Wang, X., Escriu, C., Ito, Y., Schwarz, R. F. et al. (2012), ‘Diverse epigenetic strategies interact to control epidermal differentiation’, Nature Cell Biology 14(7), 753–763.

Müller, P., Kuttenkeuler, D., Gesellchen, V., Zeidler, M. P. & Boutros, M. (2005), ‘Identification of JAK/STAT signalling components by genome-wide RNA interference’, Nature 436(7052), 871–875.

Murphy, K. (2002), ‘Dynamic Bayesian networks: representation, inference and learning’, PhD thesis, University of California - Berkeley.

Niederberger, T., Etzold, S., Lidschreiber, M., Maier, K., Martin, D. et al. (2012), ‘MC EMiNEM maps the interaction landscape of the mediator’, PLoS Computational Biology 8(6), e1002568.

Ogata, H., Goto, S., Sato, K., Fujibuchi, W., Bono, H. et al. (1999), ‘KEGG: Kyoto encyclopedia of genes and genomes’, Nucleic Acids Research 27(1), 29–34.

Orvedahl, A., Sumpter Jr, R., Xiao, G., Ng, A., Zou, Z. et al. (2011), ‘Image-based genome-wide siRNA screen identifies selective autophagy factors’, Nature 480(7375), 113–117.

Pearl, J. (1988), Probabilistic reasoning in intelligent systems: networks of plausible inference, Morgan Kaufmann, San Mateo, CA.

Pearl, J. (2000), Causality: models, reasoning, and inference, Cambridge University Press.

Pe'er, D. (2005), ‘Bayesian network analysis of signaling networks: a primer’, Science STKE 2005(281), l4.

Pe'er, D., Regev, A., Elidan, G. & Friedman, N. (2001), ‘Inferring subnetworks from perturbed expression profiles’, Bioinformatics 17(Suppl 1), S215–S224.

Pelz, O., Gilsdorf, M. & Boutros, M. (2010), ‘web-cellHTS2: a web-application for the analysis of high-throughput screening data’, BMC Bioinformatics 11(1), 185.

Rung, J., Schlitt, T., Brazma, A., Freivalds, K. & Vilo, J. (2002), ‘Building and analysing genome-wide gene disruption networks’, Bioinformatics 18(Suppl 2), S202–S210.

Sachs, K., Perez, O., Pe'er, D., Lauffenburger, D. A. & Nolan, G. P. (2005), ‘Causal protein-signaling networks derived from multiparameter single-cell data’, Science 308(5721), 523–529.

Shimoni, Y., Fink, M. Y., Choi, S.-G. & Sealfon, S. C. (2010), ‘Plato's cave algorithm: inferring functional signaling networks from early gene expression shadows’, PLoS Computational Biology 6(6), e1000828.

Smyth, G. K. (2005), ‘Limma: linear models for microarray data’, in R. Gentleman, V. Carey, S. Dudoit, R. Irizarry & W. Huber, eds, Bioinformatics and computational biology solutions using R and Bioconductor, Springer, New York, pp. 397–420.

Song, L., Kolar, M. & Xing, E. P. (2009), ‘Time-varying dynamic Bayesian networks’, Advances in Neural Information Processing Systems 22, 1732–1740.

Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L. et al. (2005), ‘Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles’, Proceedings of the National Academy of Sciences of the USA 102(43), 15545–15550.

Suzuki, R. & Shimodaira, H. (2006), ‘Pvclust: an R package for assessing the uncertainty in hierarchical clustering’, Bioinformatics 22(12), 1540.

Tong, A., Lesage, G., Bader, G., Ding, H., Xu, H. et al. (2004), ‘Global mapping of the yeast genetic interaction network’, Science 303(5659), 808.

Tresch, A. & Markowetz, F. (2008), ‘Structure learning in nested effects models’, Statistical Applications in Genetics and Molecular Biology 7(1), 9.

Vaske, C. J., House, C., Luu, T., Frank, B., Yeang, C.-H. et al. (2009), ‘A factor graph nested effects model to identify networks from genetic perturbations’, PLoS Computational Biology 5(1), e1000274.

Wagner, A. (2001), ‘How to reconstruct a large genetic network from n gene perturbations in fewer than n² easy steps’, Bioinformatics 17(12), 1183–1197.

Wang, X., Castro, M. A., Mulder, K. W. & Markowetz, F. (2012), ‘Posterior association networks and functional modules inferred from rich phenotypes of gene perturbations’, PLoS Computational Biology 8(6), e1002566.

Wang, X., Terfve, C., Rose, J. C. & Markowetz, F. (2011), ‘HTSanalyzeR: an R/Bioconductor package for integrated network analysis of high-throughput screens’, Bioinformatics 27(6), 879–880.

Wang, X., Yuan, K., Hellmayr, C., Liu, W. & Markowetz, F. (2014), ‘Reconstructing evolving signalling networks by hidden Markov nested effects models’, Annals of Applied Statistics 8(1), 448–480.

Zhang, J., Chung, T. & Oldenburg, K. (1999), ‘A simple statistical parameter for use in evaluation and validation of high throughput screening assays’, Journal of Biomolecular Screening 4(2), 67–73.

Zhang, X., Yang, X., Chung, N., Gates, A., Stec, E. et al. (2006), ‘Robust statistical methods for hit selection in RNA interference high-throughput screening experiments’, Pharmacogenomics 7(3), 299–309.
