Diffusion profile embedding as a basis for graph vertex similarity

Scott Payne; Edgar Fuller; George Spirou; Cun-Quan Zhang

doi:10.1017/nws.2021.11

Published online by Cambridge University Press: 07 October 2021

Scott Payne: Department of Mathematics, West Virginia University, Morgantown, WV, USA (e-mail: spayne7@mix.wvu.edu)
Edgar Fuller*: Department of Mathematics and Statistics, Florida International University, Miami, FL, USA
George Spirou: Department of Medical Engineering, University of South Florida, Tampa, FL, USA (e-mail: gspirou@usf.edu)
Cun-Quan Zhang: Department of Mathematics, West Virginia University, Morgantown, WV, USA (e-mail: cun-quan.zhang@mail.wvu.edu)
*Corresponding author. Email: ejfuller@gmail.com

Abstract

We describe here a notion of diffusion similarity, a method for defining similarity between vertices in a given graph using the properties of random walks on the graph to model the relationships between vertices. Using the approach of graph vertex embedding, we characterize a vertex v_i by considering two types of diffusion patterns: the ways in which random walks emanate from the vertex v_i to the remaining graph and how they converge to the vertex v_i from the graph. We define the similarity of two vertices v_i and v_j as the average of the cosine similarity of the vectors characterizing v_i and v_j. We obtain these vectors by modifying the solution to a differential equation describing a type of continuous time random walk.

This method can be applied to any dataset that can be assigned a graph structure that is weighted or unweighted, directed or undirected. It can be used to represent similarity of vertices within community structures of a network while at the same time representing similarity of vertices within layered substructures (e.g., bipartite subgraphs) of the network. To validate the performance of our method, we apply it to synthetic data as well as the neural connectome of the C. elegans worm and a connectome of neurons in the mouse retina. A tool developed to characterize the accuracy of the similarity values in detecting community structures, the uncertainty index, is introduced in this paper as a measure of the quality of similarity methods.
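
The construction outlined in the abstract can be illustrated with a minimal sketch. The abstract does not give the differential equation or the modification applied to its solution, so the code below substitutes a standard choice and should not be read as the authors' method: it assumes the continuous-time random walk dp/dt = -p(I - P), where P is the row-normalized adjacency matrix, and uses the unmodified solution exp(-t(I - P)). The function names, the time parameter `t`, and the two-triangle example graph are all hypothetical. Rows of the solution matrix play the role of the "emanating" profile of each vertex, columns play the role of the "converging" profile, and the similarity is the average of the two cosine similarities.

```python
import numpy as np


def transition_matrix(A):
    """Row-normalize a weighted or unweighted, directed or undirected adjacency matrix."""
    A = np.asarray(A, dtype=float)
    out_weight = A.sum(axis=1, keepdims=True)
    if np.any(out_weight == 0):
        # Assumption of this sketch only: no vertex is a sink.
        raise ValueError("this sketch assumes every vertex has an out-edge")
    return A / out_weight


def matrix_exp(M, terms=20, squarings=6):
    """Approximate exp(M) with a truncated Taylor series plus scaling and squaring."""
    scaled = M / (2 ** squarings)
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k      # scaled^k / k!
        result = result + term
    for _ in range(squarings):
        result = result @ result      # undo the scaling: exp(M) = exp(M / 2^s)^(2^s)
    return result


def diffusion_profiles(A, t=1.0):
    """Return (out_profiles, in_profiles); row i of each describes vertex i."""
    P = transition_matrix(A)
    # H[i, j]: probability that a walk started at i is at j at time t (assumed ODE).
    H = matrix_exp(-t * (np.eye(P.shape[0]) - P))
    return H, H.T


def cosine_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    U = np.divide(X, norms, out=np.zeros_like(X), where=norms > 0)
    return U @ U.T


def diffusion_similarity(A, t=1.0):
    """Average of the cosine similarities of the out- and in-profiles."""
    out_prof, in_prof = diffusion_profiles(A, t)
    return 0.5 * (cosine_matrix(out_prof) + cosine_matrix(in_prof))


# Hypothetical example: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

S = diffusion_similarity(A, t=1.0)
print(np.round(S, 3))
```

In this toy graph, vertices inside the same triangle should score higher than vertices in different triangles. The choice of `t` controls how far the walk spreads before profiles are compared, and is a free parameter of the sketch rather than a value taken from the paper.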

Information

Type: Research Article
Information: Network Science, Volume 9, Issue 3, September 2021, pp. 328–353

DOI: https://doi.org/10.1017/nws.2021.11
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence (http://creativecommons.org/licenses/by-nc-sa/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is included and the original work is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use.
Copyright: © The Author(s), 2021. Published by Cambridge University Press

Footnotes

Action Editor: Ulrik Brandes
