6 - Joining the dots: network analysis of gene perturbation data

Published online by Cambridge University Press:  05 July 2015

Xin Wang
Affiliation:
Cancer Research UK Cambridge Institute
Ke Yuan
Affiliation:
Cancer Research UK Cambridge Institute
Florian Markowetz
Affiliation:
Cancer Research UK Cambridge Institute
Florian Markowetz
Affiliation:
Cancer Research UK Cambridge Institute
Michael Boutros
Affiliation:
German Cancer Research Center, Heidelberg

Summary

How to link genotypes and phenotypes is a long-standing question in biology. High-throughput approaches are key technologies at the forefront of genetic research: they enable the analysis of a biological response to thousands of experimental perturbations and require tight collaboration between experimental and computational scientists. Perturbation studies and computational approaches have revolutionized research in functional genomics and genetics and promise to lay the foundation for personalized medicine. For these technologies, computation is as important as experimentation. Genome-wide image-based RNA interference (RNAi) screens, for example, are only feasible because of computational techniques, and the computational skills needed to analyse the data have become as important as the experimental skills needed to generate them.

Design and analysis of phenotyping screens depend on the number of genes perturbed and the richness of the phenotype observed (Figure 6.1). At one extreme are high-throughput screens with single reporters, e.g. a genome-wide screen for new components of a pathway. At the other extreme are perturbations of individual genes with very rich phenotypes, e.g. assessing the effects of a single gene perturbation on several molecular levels over time. Between these two extremes lie a variety of possible screen designs. Two widely used scenarios are small-scale perturbations (<20 genes) of a single target pathway with rich readouts, e.g. a global transcriptional profile, and medium-scale perturbations (hundreds of genes) with multi-parametric readouts, e.g. cell morphology or growth in different media. In the following we will discuss statistical and computational methodologies for functional analysis in all four scenarios.

Scenario 1: Genome-wide screens with single reporters

RNAi screens have been frequently and successfully applied for functional profiling of genes on a large scale (Boutros & Ahringer 2008). The vast majority of these applications use a single phenotype (e.g. cell viability, growth rate, activity of reporter constructs) to characterize the function of genes in specific biological pathways.
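To make the single-reporter setting concrete, the sketch below scores each perturbed gene by a robust z-score (median and median absolute deviation) of its readout and flags strong outliers as candidate hits. This is a minimal illustration under stated assumptions, not the procedure used in this chapter: the function names `robust_z_scores` and `select_hits`, the threshold of 3, and the example viability values are all hypothetical.

```python
import statistics


def robust_z_scores(readouts):
    """Return gene -> robust z-score for a dict of gene -> single-reporter readout."""
    values = list(readouts.values())
    med = statistics.median(values)
    # Median absolute deviation, scaled by 1.4826 so that it is comparable to
    # a standard deviation when the readouts are normally distributed.
    mad = 1.4826 * statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; readouts have no spread")
    return {gene: (v - med) / mad for gene, v in readouts.items()}


def select_hits(readouts, threshold=3.0):
    """Return sorted genes whose |robust z| meets the (assumed) threshold."""
    scores = robust_z_scores(readouts)
    return sorted(gene for gene, z in scores.items() if abs(z) >= threshold)


# Hypothetical normalized viability readouts, one value per perturbed gene.
example = {
    "geneA": 1.00, "geneB": 0.98, "geneC": 1.03, "geneD": 0.35,
    "geneE": 1.01, "geneF": 0.97, "geneG": 1.02,
}
hits = select_hits(example)
print(hits)
```

The median and MAD are used here instead of the mean and standard deviation so that the hits themselves, which are outliers by construction, do not inflate the scale against which they are judged.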

Type: Chapter
In: Systems Genetics: Linking Genotypes and Phenotypes, pp. 83–107
Publisher: Cambridge University Press
Print publication year: 2015

References

Ahmed, A. & Xing, E. P. (2009), ‘Recovering time-varying networks of dependencies in social and biological studies’, Proceedings of the National Academy of Sciences of the USA 106(29), 11878–11883.
Alexa, A., Rahnenfuhrer, J. & Lengauer, T. (2006), ‘Improved scoring of functional groups from gene expression data by decorrelating GO graph structure’, Bioinformatics 22(13), 1600–1607.
Anchang, B., Sadeh, M., Jacob, J., Tresch, A., Vlad, M. et al. (2009), ‘Modeling the temporal interplay of molecular signaling and gene expression by using dynamic nested effects models’, Proceedings of the National Academy of Sciences of the USA 106(16), 6447–6452.
Arora, S., Gonzales, I., Hagelstrom, R., Beaudry, C., Choudhary, A. et al. (2010), ‘RNAi phenotype profiling of kinases identifies potential therapeutic targets in Ewing's sarcoma’, Molecular Cancer 9(1), 218.
Ashburner, M., Ball, C., Blake, J., Botstein, D., Butler, H. et al. (2000), ‘Gene Ontology: tool for the unification of biology’, Nature Genetics 25(1), 25–29.
Bakal, C., Aach, J., Church, G. & Perrimon, N. (2007), ‘Quantitative morphological signatures define local signaling networks regulating cell morphology’, Science 316(5832), 1753–1756.
Baryshnikova, A., Costanzo, M., Kim, Y., Ding, H., Koh, J. et al. (2010), ‘Quantitative analysis of fitness and genetic interactions in yeast on a genome scale’, Nature Methods 7(12), 1017–1024.
Battle, A., Jonikas, M. C., Walter, P., Weissman, J. S. & Koller, D. (2010), ‘Automated identification of pathways from quantitative genetic interaction data’, Molecular Systems Biology 6, 379.
Bauer, S., Grossmann, S., Vingron, M. & Robinson, P. (2008), ‘Ontologizer 2.0: a multifunctional tool for GO term enrichment analysis and data exploration’, Bioinformatics 24(14), 1650–1651.
Beißbarth, T. & Speed, T. (2004), ‘GOstat: find statistically overrepresented Gene Ontologies within a group of genes’, Bioinformatics 20(9), 1464–1465.
Beisser, D., Klau, G., Dandekar, T., Muller, T. & Dittrich, M. (2010), ‘BioNet: an R-package for the functional analysis of biological networks’, Bioinformatics 26(8), 1129–1130.
Birmingham, A., Selfors, L., Forster, T., Wrobel, D., Kennedy, C. et al. (2009), ‘Statistical methods for analysis of high-throughput RNA interference screens’, Nature Methods 6(8), 569–575.
Booker, M., Samsonova, A. A., Kwon, Y., Flockhart, I., Mohr, S. E. et al. (2011), ‘False negative rates in Drosophila cell-based RNAi screens: a case study’, BMC Genomics 12, 50.
Boutros, M. & Ahringer, J. (2008), ‘The art and design of genetic screens: RNA interference’, Nature Reviews Genetics 9(7), 554–566.
Boutros, M., Brás, L. P. & Huber, W. (2006), ‘Analysis of cell-based RNAi screens’, Genome Biology 7(7), R66.
Boutros, M., Kiger, A. A., Armknecht, S., Kerr, K., Hild, M. et al. (2004), ‘Genome-wide RNAi analysis of growth and viability in Drosophila cells’, Science 303(5659), 832–835.
Breitkreutz, B., Stark, C., Reguly, T., Boucher, L., Breitkreutz, A. et al. (2008), ‘The BioGRID interaction database: 2008 update’, Nucleic Acids Research 36(Suppl 1), D637–D640.
Brideau, C., Gunter, B., Pikounis, B. & Liaw, A. (2003), ‘Improved statistical methods for hit selection in high-throughput screening’, Journal of Biomolecular Screening 8(6), 634–647.
Castro, M., Wang, X., Fletcher, M., Meyer, K. & Markowetz, F. (2012), ‘RedeR: R/Bioconductor package for representing modular structures, nested networks and multiple levels of hierarchical associations’, Genome Biology 13(4), R29.
Cheung, H. W., Cowley, G. S., Weir, B. A., Boehm, J. S., Rusin, S. et al. (2011), ‘Systematic investigation of genetic vulnerabilities across cancer cell lines reveals lineage-specific dependencies in ovarian cancer’, Proceedings of the National Academy of Sciences of the USA 108(30), 12372–12377.
Collins, S., Miller, K., Maas, N., Roguev, A., Fillingham, J. et al. (2007), ‘Functional dissection of protein complexes involved in yeast chromosome biology using a genetic interaction map’, Nature 446(7137), 806–810.
Costanzo, M., Baryshnikova, A., Bellay, J., Kim, Y., Spear, E. et al. (2010), ‘The genetic landscape of a cell’, Science 327(5964), 425.
Dadgostar, H., Zarnegar, B., Hoffmann, A., Qin, X., Truong, U. et al. (2002), ‘Cooperation of multiple signaling pathways in CD40-regulated gene expression in B lymphocytes’, Proceedings of the National Academy of Sciences of the USA 99(3), 1497–1502.
de Hoon, M., Imoto, S. & Miyano, S. (2002), ‘A comparison of clustering techniques for gene expression data’, Proceedings of the 10th International Conference on Intelligent Systems for Molecular Biology, Abstract 33A.
Dempster, A., Laird, N. & Rubin, D. (1977), ‘Maximum likelihood from incomplete data via the EM algorithm’, Journal of the Royal Statistical Society, Series B (Methodological) 39(1), 1–38.
Echeverri, C. & Perrimon, N. (2006), ‘High-throughput RNAi screening in cultured cells: a user's guide’, Nature Reviews Genetics 7(5), 373–384.
Echeverri, C., Beachy, P., Baum, B., Boutros, M., Buchholz, F. et al. (2006), ‘Minimizing the risk of reporting false positives in large-scale RNAi screens’, Nature Methods 3(10), 777–779.
Eisen, M., Spellman, P., Brown, P. & Botstein, D. (1998), ‘Cluster analysis and display of genome-wide expression patterns’, Proceedings of the National Academy of Sciences of the USA 95(25), 14863–14868.
Failmezger, H., Praveen, P., Tresch, A. & Fröhlich, H. (2013), ‘Learning gene network structure from time lapse cell imaging in RNAi knockdowns’, Bioinformatics 29(12), 1534–1540.
Falcon, S. & Gentleman, R. (2007), ‘Using GOstats to test gene lists for GO term association’, Bioinformatics 23(2), 257–258.
Farha, M. & Brown, E. (2010), ‘Chemical probes of Escherichia coli uncovered through chemical-chemical interaction profiling with compounds of known biological activity’, Chemistry & Biology 17(8), 852–862.
Friedman, N. (2004), ‘Inferring cellular networks using probabilistic graphical models’, Science 303(5659), 799–805.
Friedman, N., Linial, M., Nachman, I. & Pe'er, D. (2000), ‘Using Bayesian networks to analyze expression data’, Journal of Computational Biology 7(3–4), 601–620.
Fröhlich, H., Beißbarth, T., Tresch, A., Kostka, D., Jacob, J. et al. (2008a), ‘Analyzing gene perturbation screens with nested effects models in R and Bioconductor’, Bioinformatics 24(21), 2549–2550.
Fröhlich, H., Fellmann, M., Sueltmann, H., Poustka, A. & Beissbarth, T. (2007), ‘Large scale statistical inference of signaling pathways from RNAi and microarray data’, BMC Bioinformatics 8, 386.
Fröhlich, H., Fellmann, M., Sueltmann, H., Poustka, A. & Beissbarth, T. (2008b), ‘Estimating large-scale signaling networks through nested effect models with intervention effects from microarray data’, Bioinformatics 24(22), 2650–2656.
Fröhlich, H., Praveen, P. & Tresch, A. (2011), ‘Fast and efficient dynamic nested effects models’, Bioinformatics 27(2), 238–244.
Fuchs, F., Pau, G., Kranz, D., Sklyar, O., Budjan, C. et al. (2010), ‘Clustering phenotype populations by genome-wide RNAi and multiparametric imaging’, Molecular Systems Biology 6, 370.
Geyer, C. (2010), ‘Introduction to Markov chain Monte Carlo’, in S. Brooks, A. Gelman, G. Jones & X.-L. Meng, eds., Handbook of Markov chain Monte Carlo, CRC Press, Boca Raton, FL, pp. 3–48.
Green, R., Kao, H., Audhya, A., Arur, S., Mayers, J. et al. (2011), ‘A high-resolution C. elegans essential gene network based on phenotypic profiling of a complex tissue’, Cell 145(3), 470–482.
Hahne, F., Arlt, D., Sauermann, M., Majety, M., Poustka, A. et al. (2006), ‘Statistical methods and software for the analysis of high-throughput reverse genetic assays using flow cytometry readouts’, Genome Biology 7(8), R77.
Horn, T., Sandmann, T., Fischer, B., Axelsson, E., Huber, W. et al. (2011), ‘Mapping of signaling networks through synthetic genetic interaction analysis by RNAi’, Nature Methods 8(4), 341–346.
House, C. D., Vaske, C. J., Schwartz, A. M., Obias, V., Frank, B. et al. (2010), ‘Voltage-gated Na+ channel SCN5A is a key regulator of a gene transcriptional network that controls colon cancer invasion’, Cancer Research 70(17), 6957–6967.
Jeffreys, H. (1998), Theory of probability, 3rd edn, Oxford University Press.
Jensen, L. J., Kuhn, M., Stark, M., Chaffron, S., Creevey, C. et al. (2009), ‘String 8: a global view on proteins and their functional interactions in 630 organisms’, Nucleic Acids Research 37(Database issue), D412–D416.
Kaderali, L., Dazert, E., Zeuge, U., Frese, M. & Bartenschlager, R. (2009), ‘Reconstructing signaling pathways from RNAi data using probabilistic Boolean threshold networks’, Bioinformatics 25(17), 2229–2235.
Kessler, J., Kahle, K., Sun, T., Meerbrey, K., Schlabach, M. et al. (2012), ‘A SUMOylation-dependent transcriptional subprogram is required for Myc-driven tumorigenesis’, Science 335(6066), 348–353.
Li, C. & Wong, W. (2001), ‘Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application’, Genome Biology 2(8), 0032.
Liberzon, A., Subramanian, A., Pinchback, R., Thorvaldsdottir, H., Tamayo, P. et al. (2011), ‘Molecular signatures database (MSigDB) 3.0’, Bioinformatics 27(12), 1739–1740.
Lu, R., Markowetz, F., Unwin, R. D., Leek, J. T., Airoldi, E. M. et al. (2009), ‘Systems level dynamic analyses of fate change in murine embryonic stem cells’, Nature 462(7271), 358–362.
Maathuis, M. H., Colombo, D., Kalisch, M. & Bühlmann, P. (2010), ‘Predicting causal effects in large-scale systems from observational data’, Nature Methods 7(4), 247–248.
Madigan, D., York, J. & Allard, D. (1995), ‘Bayesian graphical models for discrete data’, International Statistical Review/Revue Internationale de Statistique 63(2), 215–232.
Malo, N., Hanley, J., Cerquozzi, S., Pelletier, J. & Nadon, R. (2006), ‘Statistical practice in high-throughput screening data analysis’, Nature Biotechnology 24(2), 167–175.
Mani, R., St Onge, R., Hartman, J., Giaever, G. & Roth, F. (2008), ‘Defining genetic interaction’, Proceedings of the National Academy of Sciences of the USA 105(9), 3461–3466.
Markowetz, F. (2010), ‘How to understand the cell by breaking it: network analysis of gene perturbation screens’, PLoS Computational Biology 6(2), e1000655.
Markowetz, F. & Spang, R. (2007), ‘Inferring cellular networks - a review’, BMC Bioinformatics 8(Suppl 6), S5.
Markowetz, F., Bloch, J. & Spang, R. (2005a), ‘Non-transcriptional pathway features reconstructed from secondary effects of RNA interference’, Bioinformatics 21(21), 4026–4032.
Markowetz, F., Grossmann, S. & Spang, R. (2005b), ‘Probabilistic soft interventions in conditional Gaussian networks’, Proceedings of 10th International Workshop on Artificial Intelligence and Statistics.
Markowetz, F., Kostka, D., Troyanskaya, O. G. & Spang, R. (2007), ‘Nested effects models for high-dimensional phenotyping screens’, Bioinformatics 23(13), i305–i312.
Markowetz, F., Mulder, K. W., Airoldi, E. M., Lemischka, I. R. & Troyanskaya, O. G. (2010), ‘Mapping dynamic histone acetylation patterns to gene expression in nanog-depleted murine embryonic stem cells’, PLoS Computational Biology 6(12), e1001034.
Merico, D., Isserlin, R., Stueker, O., Emili, A. & Bader, G. (2010), ‘Enrichment map: a network-based method for gene-set enrichment visualization and interpretation’, PLoS One 5(11), e13984.
Mulder, K. W., Wang, X., Escriu, C., Ito, Y., Schwarz, R. F. et al. (2012), ‘Diverse epigenetic strategies interact to control epidermal differentiation’, Nature Cell Biology 14(7), 753–763.
Müller, P., Kuttenkeuler, D., Gesellchen, V., Zeidler, M. P. & Boutros, M. (2005), ‘Identification of JAK/STAT signalling components by genome-wide RNA interference’, Nature 436(7052), 871–875.
Murphy, K. (2002), ‘Dynamic Bayesian networks: representation, inference and learning’, PhD thesis, University of California - Berkeley.
Niederberger, T., Etzold, S., Lidschreiber, M., Maier, K., Martin, D. et al. (2012), ‘MC EMiNEM maps the interaction landscape of the mediator’, PLoS Computational Biology 8(6), e1002568.
Ogata, H., Goto, S., Sato, K., Fujibuchi, W., Bono, H. et al. (1999), ‘KEGG: Kyoto encyclopedia of genes and genomes’, Nucleic Acids Research 27(1), 29–34.
Orvedahl, A., Sumpter Jr, R., Xiao, G., Ng, A., Zou, Z. et al. (2011), ‘Image-based genome-wide siRNA screen identifies selective autophagy factors’, Nature 480(7375), 113–117.
Pearl, J. (1988), Probabilistic reasoning in intelligent systems: networks of plausible inference, Morgan Kaufmann, San Mateo, CA.
Pearl, J. (2000), Causality: models, reasoning, and inference, Cambridge University Press.
Pe'er, D. (2005), ‘Bayesian network analysis of signaling networks: a primer’, Science STKE 2005(281), l4.
Pe'er, D., Regev, A., Elidan, G. & Friedman, N. (2001), ‘Inferring subnetworks from perturbed expression profiles’, Bioinformatics 17(Suppl 1), S215–S224.
Pelz, O., Gilsdorf, M. & Boutros, M. (2010), ‘web-cellHTS2: a web-application for the analysis of high-throughput screening data’, BMC Bioinformatics 11(1), 185.
Rung, J., Schlitt, T., Brazma, A., Freivalds, K. & Vilo, J. (2002), ‘Building and analysing genome-wide gene disruption networks’, Bioinformatics 18(Suppl 2), S202–S210.
Sachs, K., Perez, O., Pe'er, D., Lauffenburger, D. A. & Nolan, G. P. (2005), ‘Causal protein-signaling networks derived from multiparameter single-cell data’, Science 308(5721), 523–529.
Shimoni, Y., Fink, M. Y., Choi, S.-G. & Sealfon, S. C. (2010), ‘Plato's cave algorithm: inferring functional signaling networks from early gene expression shadows’, PLoS Computational Biology 6(6), e1000828.
Smyth, G. K. (2005), ‘Limma: linear models for microarray data’, in R. Gentleman, V. Carey, S. Dudoit, R. Irizarry & W. Huber, eds., Bioinformatics and computational biology solutions using R and Bioconductor, Springer, New York, pp. 397–420.
Song, L., Kolar, M. & Xing, E. P. (2009), ‘Time-varying dynamic Bayesian networks’, Advances in Neural Information Processing Systems 22, 1732–1740.
Subramanian, A., Tamayo, P., Mootha, V. K., Mukherjee, S., Ebert, B. L. et al. (2005), ‘Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles’, Proceedings of the National Academy of Sciences of the USA 102(43), 15545–15550.
Suzuki, R. & Shimodaira, H. (2006), ‘Pvclust: an R package for assessing the uncertainty in hierarchical clustering’, Bioinformatics 22(12), 1540.
Tong, A., Lesage, G., Bader, G., Ding, H., Xu, H. et al. (2004), ‘Global mapping of the yeast genetic interaction network’, Science 303(5659), 808.
Tresch, A. & Markowetz, F. (2008), ‘Structure learning in nested effects models’, Statistical Applications in Genetics and Molecular Biology 7(1), 9.
Vaske, C. J., House, C., Luu, T., Frank, B., Yeang, C.-H. et al. (2009), ‘A factor graph nested effects model to identify networks from genetic perturbations’, PLoS Computational Biology 5(1), e1000274.
Wagner, A. (2001), ‘How to reconstruct a large genetic network from n gene perturbations in fewer than n² easy steps’, Bioinformatics 17(12), 1183–1197.
Wang, X., Castro, M. A., Mulder, K. W. & Markowetz, F. (2012), ‘Posterior association networks and functional modules inferred from rich phenotypes of gene perturbations’, PLoS Computational Biology 8(6), e1002566.
Wang, X., Terfve, C., Rose, J. C. & Markowetz, F. (2011), ‘HTSanalyzeR: an R/Bioconductor package for integrated network analysis of high-throughput screens’, Bioinformatics 27(6), 879–880.
Wang, X., Yuan, K., Hellmayr, C., Liu, W. & Markowetz, F. (2014), ‘Reconstructing evolving signalling networks by hidden Markov nested effects models’, Annals of Applied Statistics 8(1), 448–480.
Zhang, J., Chung, T. & Oldenburg, K. (1999), ‘A simple statistical parameter for use in evaluation and validation of high throughput screening assays’, Journal of Biomolecular Screening 4(2), 67–73.
Zhang, X., Yang, X., Chung, N., Gates, A., Stec, E. et al. (2006), ‘Robust statistical methods for hit selection in RNA interference high-throughput screening experiments’, Pharmacogenomics 7(3), 299–309.
