Generating weighted and thresholded gene coexpression networks using signed distance correlation

Javier Pardo-Diaz; Philip S. Poole; Mariano Beguerisse-Díaz; Charlotte M. Deane; Gesine Reinert

doi:10.1017/nws.2022.13

Published online by Cambridge University Press: 16 June 2022

Affiliations

Javier Pardo-Diaz: Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
Philip S. Poole: Department of Plant Sciences, University of Oxford, Oxford OX1 3RB, UK
Mariano Beguerisse-Díaz: Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK
Charlotte M. Deane: Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
Gesine Reinert*: Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
*Corresponding author. Email: reinert@stats.ox.ac.uk

Abstract

Even within well-studied organisms, many genes lack useful functional annotations. One way to generate such functional information is to infer biological relationships between genes or proteins, using a network of gene coexpression data that includes functional annotations. Signed distance correlation has proved useful for the construction of unweighted gene coexpression networks. However, transforming correlation values into unweighted networks may lead to a loss of important biological information related to the intensity of the correlation. Here, we introduce a principled method to construct weighted gene coexpression networks using signed distance correlation. These networks contain weighted edges only between those pairs of genes whose correlation value is higher than a given threshold. We analyze data from different organisms and find that networks generated with our method based on signed distance correlation are more stable and capture more biological information than networks obtained from Pearson correlation. Moreover, we show that our weighted signed distance correlation networks capture more biological information than unweighted networks based on the same metric. While we use biological data sets to illustrate the method, the approach is general and can be used to construct networks in other domains. Code and data are available at https://github.com/javier-pardodiaz/sdcorGCN.
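
As an illustration of the kind of construction the abstract describes, the following is a minimal Python sketch, not the authors' implementation (their code is in the repository linked above). It assumes that signed distance correlation is the sample distance correlation of two expression profiles multiplied by the sign of their Pearson correlation, and that an edge is kept, with the signed value as its weight, only when that value exceeds the threshold. The function names, the dictionary input format, and the threshold value are all hypothetical; the abstract does not say how the threshold is chosen, so it is left as a free parameter here.

```python
import math
from itertools import combinations


def _centered_distances(v):
    """Double-centered matrix of pairwise absolute differences of v."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[d[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def _dcov2(A, B):
    """Squared sample distance covariance from two centered matrices."""
    n = len(A)
    return sum(A[i][j] * B[i][j] for i in range(n) for j in range(n)) / (n * n)


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length profiles, in [0, 1]."""
    A, B = _centered_distances(x), _centered_distances(y)
    denom = math.sqrt(_dcov2(A, A) * _dcov2(B, B))
    if denom == 0:
        return 0.0  # at least one profile is constant
    return math.sqrt(max(_dcov2(A, B), 0.0) / denom)


def pearson_sign(x, y):
    """Sign (-1, 0, or 1) of the Pearson correlation of x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (cov > 0) - (cov < 0)


def signed_distance_correlation(x, y):
    """Assumed definition: distance correlation carrying the Pearson sign."""
    return pearson_sign(x, y) * distance_correlation(x, y)


def weighted_thresholded_network(expression, threshold):
    """Map gene pairs to weights, keeping only values above the threshold.

    `expression` is a hypothetical input format: gene name -> list of
    expression values measured over the same samples.
    """
    edges = {}
    for g, h in combinations(sorted(expression), 2):
        w = signed_distance_correlation(expression[g], expression[h])
        if w > threshold:
            edges[(g, h)] = w
    return edges


# Toy data: B rises with A, C falls as A rises.
toy = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1]}
print(weighted_thresholded_network(toy, threshold=0.5))
```

In the toy data, only the pair A and B survives the threshold, because the pairs involving C have negative signed values. An unweighted network would record that edge as simply present; the weighted version keeps the correlation value itself, which is the information the abstract argues is lost by binarizing. The pairwise loop is quadratic in both genes and samples, so it is meant for illustration rather than genome-scale data.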

Keywords

gene expression; robustness; weighted networks; correlation

Information

Type: Research Article
Information: Network Science, Volume 10, Issue 2, June 2022, pp. 131-145

DOI: https://doi.org/10.1017/nws.2022.13
Copyright: © The Author(s), 2022. Published by Cambridge University Press

Footnotes

Action Editor: Christoph Stadtfeld

A preliminary version of this paper was presented at the Ninth International Conference on Complex Networks and their Applications (COMPLEX NETWORKS 2020).

