
Generating weighted and thresholded gene coexpression networks using signed distance correlation

Published online by Cambridge University Press: 16 June 2022

Javier Pardo-Diaz
Affiliation:
Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
Philip S. Poole
Affiliation:
Department of Plant Sciences, University of Oxford, Oxford OX1 3RB, UK
Mariano Beguerisse-Díaz
Affiliation:
Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK
Charlotte M. Deane
Affiliation:
Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
Gesine Reinert*
Affiliation:
Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
*Corresponding author. Email: reinert@stats.ox.ac.uk

Abstract

Even within well-studied organisms, many genes lack useful functional annotations. One way to generate such functional information is to infer biological relationships between genes or proteins, using a network of gene coexpression data that includes functional annotations. Signed distance correlation has proved useful for the construction of unweighted gene coexpression networks. However, transforming correlation values into unweighted networks may lead to a loss of important biological information related to the intensity of the correlation. Here, we introduce a principled method to construct weighted gene coexpression networks using signed distance correlation. These networks contain weighted edges only between those pairs of genes whose correlation value is higher than a given threshold. We analyze data from different organisms and find that networks generated with our method based on signed distance correlation are more stable and capture more biological information than networks obtained from Pearson correlation. Moreover, we show that weighted signed distance correlation networks capture more biological information than unweighted networks based on the same metric. While we use biological data sets to illustrate the method, the approach is general and can be used to construct networks in other domains. Code and data are available at https://github.com/javier-pardodiaz/sdcorGCN.
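
As a rough illustration of the idea described in the abstract, the following Python sketch builds a weighted, thresholded network from a few toy expression profiles. It is not the authors' implementation (their code is in the repository linked above). The abstract does not define signed distance correlation, so the sketch assumes it is the sample distance correlation multiplied by the sign of the Pearson correlation. The function names, the toy data, and the threshold value of 0.5 are all hypothetical, and no threshold-selection procedure is shown.

```python
import math
from itertools import combinations


def _centered_distances(x):
    """Double-centered matrix of pairwise absolute differences of x."""
    n = len(x)
    d = [[abs(a - b) for b in x] for a in x]
    row_means = [sum(r) / n for r in d]  # equals column means (d is symmetric)
    grand_mean = sum(row_means) / n
    return [[d[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length sequences."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar2_x = sum(v * v for r in a for v in r) / n**2
    dvar2_y = sum(v * v for r in b for v in r) / n**2
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0:
        return 0.0  # at least one profile is constant
    return math.sqrt(max(dcov2, 0.0) / denom)


def pearson_sign(x, y):
    """Sign (+1, -1 or 0) of the sample Pearson correlation."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((u - mx) * (v - my) for u, v in zip(x, y))
    return (cov > 0) - (cov < 0)


def signed_distance_correlation(x, y):
    # Assumption: signed distance correlation = sign(Pearson) * dCor.
    return pearson_sign(x, y) * distance_correlation(x, y)


def weighted_thresholded_network(profiles, threshold):
    """Keep an edge, weighted by its correlation, only above the threshold."""
    edges = {}
    for g1, g2 in combinations(profiles, 2):
        w = signed_distance_correlation(profiles[g1], profiles[g2])
        if w > threshold:
            edges[(g1, g2)] = w
    return edges


# Toy expression profiles (hypothetical gene names and values).
profiles = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10],
    "geneC": [5, 4, 3, 2, 1],
}
network = weighted_thresholded_network(profiles, threshold=0.5)
```

In this toy example only the positively correlated pair is retained, and its edge carries the correlation value as a weight rather than being reduced to a binary link, which is the distinction between weighted and unweighted networks that the abstract draws.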

Information

Type
Research Article
Copyright
© The Author(s), 2022. Published by Cambridge University Press
