Networks' characteristics are important for systems biology

  • ANDREW K. RIDER (a1), TIJANA MILENKOVIĆ (a1), GEOFFREY H. SIWO (a2), RICHARD S. PINAPATI (a2), SCOTT J. EMRICH (a3), MICHAEL T. FERDIG (a2) and NITESH V. CHAWLA (a4)...

Abstract

A fundamental goal of systems biology is to create models that describe relationships between biological components. Networks are an increasingly popular approach to this problem. However, a scientist interested in modeling biological (e.g., gene expression) data as a network quickly confronts a fundamental problem: how should the network be constructed? It is fairly easy to construct a network, but is it the right network for the problem being considered? This important problem raises three fundamental issues. How should edges in the network be weighted in order to capture actual biological interactions? What is the effect of the type of biological experiment used to collect the data from which the network is constructed? How should the weighted edges be pruned (that is, what cut-off should be applied)? Differences in the construction of networks could lead to different biological interpretations.

Indeed, we find statistically significant dissimilarities in functional content and topology between gene co-expression networks constructed using different edge weighting methods, data types, and edge cut-offs. We show that different types of known interactions, such as those found through Affinity Capture-Luminescence or Synthetic Lethality experiments, appear in significantly varying amounts in networks constructed in different ways. Hence, we demonstrate that different networks may answer different biological questions. Consequently, we posit that the approach taken to build a network can be matched to the biological question in order to obtain targeted answers. More study is required to understand the implications of different network inference approaches and to draw reliable conclusions from networks used in the field of systems biology.
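To make the construction choices concrete, the following is a minimal sketch of how the edge weighting method and the cut-off alone can change which edges end up in a co-expression network. It is not the authors' pipeline: the abstract does not name the weighting methods it compares, so Pearson and Spearman correlation are used here purely as illustrative assumptions, the expression profiles are synthetic toy data, and every name (`build_network`, `jaccard`, `geneA` to `geneD`) is hypothetical.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def ranks(x):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def build_network(expr, weight, cutoff):
    """Keep an edge between two genes when |weight| >= cutoff."""
    edges = set()
    for a, b in itertools.combinations(sorted(expr), 2):
        if abs(weight(expr[a], expr[b])) >= cutoff:
            edges.add((a, b))
    return edges


def jaccard(e1, e2):
    """Edge-set overlap: 1.0 means identical networks."""
    union = e1 | e2
    return len(e1 & e2) / len(union) if union else 1.0


# Synthetic toy profiles (assumption): geneB tracks geneA linearly,
# geneC is a monotone but non-linear function of geneA, geneD is noise.
rng = random.Random(0)
base = [rng.gauss(0, 1) for _ in range(30)]
expr = {
    "geneA": base,
    "geneB": [2 * v + rng.gauss(0, 0.1) for v in base],
    "geneC": [math.exp(3 * v) for v in base],
    "geneD": [rng.gauss(0, 1) for _ in range(30)],
}

# One network per (weighting method, cut-off) combination.
networks = {
    (name, cutoff): build_network(expr, fn, cutoff)
    for name, fn in (("pearson", pearson), ("spearman", spearman))
    for cutoff in (0.5, 0.9)
}

# Pairwise overlap between the resulting edge sets.
for (k1, n1), (k2, n2) in itertools.combinations(networks.items(), 2):
    print(k1, k2, round(jaccard(n1, n2), 2))
```

In this sketch, a rank-based weight keeps the monotone but non-linear geneA/geneC relationship at any cut-off, whereas a linear weight may or may not, and raising the cut-off can only remove edges. Real analyses would add the third factor from the abstract, the type of experiment that produced the data, by building networks from separate datasets.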

