Hostname: page-component-6b989bf9dc-md2j5 Total loading time: 0 Render date: 2024-04-14T16:46:02.145Z Has data issue: false hasContentIssue false

Network classification-based structural analysis of real networks and their model-generated counterparts

Published online by Cambridge University Press:  20 May 2022

Marcell Nagy*
Affiliation:
Department of Stochastics, Institute of Mathematics, Budapest University of Technology and Economics, Műegyetem rkp. 3., H-1111 Budapest, Hungary
Roland Molontay
Affiliation:
Department of Stochastics, Institute of Mathematics, Budapest University of Technology and Economics, Műegyetem rkp. 3., H-1111 Budapest, Hungary MTA-BME Stochastics Research Group, Műegyetem rkp. 3., H-1111 Budapest, Hungary
*
Corresponding author: Marcell Nagy, email: marcessz@math.bme.hu, molontay@math.bme.hu

Abstract

Data-driven analysis of complex networks has been in the focus of research for decades. An important area of research is to study how well real networks can be described with a small selection of metrics, furthermore how well network models can capture the relations between graph metrics observed in real networks. In this paper, we apply machine-learning techniques to investigate the aforementioned problems. We study 500 real-world networks along with 2000 synthetic networks generated by four frequently used network models with previously calibrated parameters to make the generated graphs as similar to the real networks as possible. This paper unifies several branches of data-driven complex network analysis, such as the study of graph metrics and their pair-wise relationships, network similarity estimation, model calibration, and graph classification. We find that the correlation profiles of the structural measures significantly differ across network domains and the domain can be efficiently determined using a small selection of graph metrics. The structural properties of the network models with fixed parameters are robust enough to perform parameter calibration. The goodness-of-fit of the network models highly depends on the network domain. By solving classification problems, we find that the models lack the capability of generating a graph with a high clustering coefficient and relatively large diameter simultaneously. On the other hand, models are able to capture exactly the degree-distribution-related metrics.

Type
Research Article
Copyright
© The Author(s), 2022. Published by Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Footnotes

Action Editor: Christoph Stadtfeld

A preliminary version of this paper was presented at the Ninth International Conference on Complex Networks and their Applications (COMPLEX NETWORKS 2020).

References

Aliakbary, S., Habibi, J., & Movaghar, A. (2014). Quantification and comparison of degree distributions in complex networks. In 7th international symposium on telecommunications (IST 2014) (pp. 464–469). IEEE.CrossRefGoogle Scholar
Aliakbary, S., Motallebi, S., Rashidian, S., Habibi, J., & Movaghar, A. (2015a). Distance metric learning for complex networks: Towards size-independent comparison of network structures. Chaos: An Interdisciplinary Journal of Nonlinear Science, 25(2), 023111.CrossRefGoogle ScholarPubMed
Aliakbary, S., Motallebi, S., Rashidian, S., Habibi, J., & Movaghar, A. (2015b). Noise-tolerant model selection and parameter estimation for complex networks. Physica A: Statistical Mechanics and Its Applications, 427, 100112.CrossRefGoogle Scholar
Arnold, N. A., Mondragón, R. J., & Clegg, R. G. (2021). Likelihood-based approach to discriminate mixtures of network models that vary in time. Scientific Reports, 11(1), 113.Google ScholarPubMed
Attar, N., & Aliakbary, S. (2017). Classification of complex networks based on similarity of topological network features. Chaos: An Interdisciplinary Journal of Nonlinear Science, 27(9), 091102.CrossRefGoogle ScholarPubMed
Bagrow, J. P., Bollt, E. M., Skufca, J. D., & Ben-Avraham, D. (2008). Portraits of complex networks. EPL (Europhysics Letters), 81(6), 68004.Google Scholar
Bai, Y., Ding, H., Bian, S., Chen, T., Sun, Y., & Wang, W. (2019). Simgnn: A neural network approach to fast graph similarity computation. In Proceedings of the twelfth ACM international conference on web search and data mining (pp. 384–392). ACM.CrossRefGoogle Scholar
Barabási, A.-L. (2016). Network science. Cambridge University Press.Google Scholar
Barabási, A-L., & Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439), 509512.CrossRefGoogle ScholarPubMed
Barnett, I., Malik, N., Kuijjer, M. L., Mucha, P. J., & Onnela, J.-P. (2019). Endnote: Feature-based classification of networks. Network Science, 7(3), 438.CrossRefGoogle Scholar
Bezáková, I., Kalai, A., & Santhanam, R. (2006). Graph model selection using maximum likelihood. In Proceedings of the 23rd International Conference on Machine Learning (pp. 105–112). ACM.CrossRefGoogle Scholar
Bläsius, T., Friedrich, T., Katzmann, M., Krohmer, A., & Striebel, J. (2018). Towards a systematic evaluation of generative network models. In International workshop on algorithms and models for the web-graph (pp. 99–114). Springer.CrossRefGoogle Scholar
Bonner, S., Brennan, J., Theodoropoulos, G., Kureshi, I., & McGough, A. S. (2016a). Deep topology classification: A new approach for massive graph classification. In 2016 IEEE international conference on big data (pp. 3290–3297). IEEE.CrossRefGoogle Scholar
Bonner, S., Brennan, J., Theodoropoulos, G., Kureshi, I., & McGough, A. S. (2016b). Efficient comparison of massive graphs through the use of ’graph fingerprints’. In 12th international workshop on mining and learning with graphs, KDD 2016.Google Scholar
Bordino, I., Donato, D., Gionis, A., & Leonardi, S. (2008). Mining large networks with subgraph counting. In 2008 eighth IEEE international conference on data mining (pp. 737–742). IEEE.CrossRefGoogle Scholar
Bounova, G., & de Weck, O. (2012). Overview of metrics and their correlation patterns for multiple-metric topology analysis on heterogeneous graph ensembles. Physical Review E, 85(1), 016117.CrossRefGoogle ScholarPubMed
Butler, D., & Djordjević, S. (2014). University of Exeter, Centre for Water Systems: Benchmarks.Google Scholar
Canning, J. P., Ingram, E. E., Nowak-Wolff, S., Ortiz, A. M., Ahmed, N. K., Rossi, R. A., Schmitt, K. R. B., & Soundarajan, S. (2018). Predicting graph categories from structural properties. arxiv preprint arxiv:1805.02682.Google Scholar
Chatterjee, A., Manohar, M., & Ramadurai, G. (2016). Statistical analysis of bus networks in india. Plos One, 11(12), e0168478.CrossRefGoogle Scholar
Chen, D., Shi, D.-D., Qin, M., Xu, S.-M., & Pan, G.-J. (2018). Complex network comparison based on communicability sequence entropy. Physical Review E, 98(1), 012319.CrossRefGoogle ScholarPubMed
Clauset, A., Tucker, E., & Sainz, M. (2016). The Colorado Index of complex networks.Google Scholar
Croux, C., & Dehon, C. (2008). Robustness versus efficiency for nonparametric correlation measures. FBE Research Report Kbi_0803.Google Scholar
Csardi, G., & Nepusz, T. (2006). The igraph software package for complex network research. International Journal on Complex Systems, 1695(5), 19.Google Scholar
Del Genio, C. I., Gross, T., & Bassler, K. E. (2011). All scale-free networks are sparse. Physical Review Letters, 107(17), 178701.CrossRefGoogle ScholarPubMed
Diego, V., Jeremy, G., & Rupesh, N. (2003). Interaction web database.Google Scholar
Dunne, J. A., Maschner, H., Betts, M. W., Huntly, N., Russell, R., Williams, R. J., & Wood, S. A. (2016). The roles and impacts of human hunter-gatherers in north pacific marine food webs. Scientific Reports, 6, 21179.CrossRefGoogle ScholarPubMed
Faust, K. (2006). Comparing social networks: size, density, and local structure. Metodoloski zvezki, 3(2), 185.Google Scholar
Fay, D., Moore, A. W., Brown, K., Filosi, M., & Jurman, G. (2014). Graph metrics as summary statistics for approximate bayesian computation with application to network model parameter estimation. Journal of Complex Networks, 3(1), 5283.CrossRefGoogle Scholar
Filkov, V., Saul, Z. M., Roy, S., D’Souza, R. M., & Devanbu, P. T. (2009). Modeling and verifying a broad array of network properties. EPL (Europhysics Letters), 86(2), 28003.CrossRefGoogle Scholar
Fong, P. W. L., Anwar, M., & Zhao, Z. (2009). A privacy preservation model for Facebook-style social network systems. In European symposium on research in computer security (pp. 303–320). Springer.CrossRefGoogle Scholar
Friedman, J., Hastie, T., & Tibshirani, R. (2001). The elements of statistical learning. Vol. 1. Springer Series in Statistics. New York, NY, USA: Springer.Google Scholar
Gao, X., Xiao, B., Tao, D., & Li, X. (2010). A survey of graph edit distance. Pattern Analysis and Applications, 13(1), 113129.CrossRefGoogle Scholar
Garcia-Robledo, A., Diaz-Perez, A., & Morales-Luna, G. (2013). Correlation analysis of complex network metrics on the topology of the Internet. In 2013 10th international conference and expo on emerging technologies for a smarter world (cewit) (pp. 1–6). IEEE.CrossRefGoogle Scholar
Gjoka, M., Tillman, B., & Markopoulou, A. (2015). Construction of simple graphs with a target joint degree matrix and beyond. In 2015 IEEE conference on computer communications (INFOCOM) (pp. 1553–1561). Citeseer.CrossRefGoogle Scholar
Goldenberg, A., Zheng, A. X., Fienberg, S. E., et al. (2010). A survey of statistical network models. Foundations and Trends in Machine Learning, 2(2), 129233.CrossRefGoogle Scholar
Griffith, V., Xu, Y., & Ratti, C. (2017). Graph theoretic properties of the darkweb. arxiv preprint arxiv:1704.07525.Google Scholar
Hagberg, A., Swart, P., & D., S Chult (2008). Exploring network structure, dynamics, and function using Networkx. Tech. rept. Los Alamos National Lab.(LANL), NM, USA.Google Scholar
Hammond, D. K., Gur, Y., & Johnson, C. R. (2013). Graph diffusion distance: A difference measure for weighted graphs based on the graph Laplacian exponential kernel. In 2013 IEEE global conference on signal and information processing (pp. 419–422). IEEE.CrossRefGoogle Scholar
Harrison, K. R. (2014). Network similarity measures and automatic construction of graph models using genetic programming. M.Phil. thesis, Brock University.Google Scholar
Holme, P., & Kim, B. J. (2002). Growing scale-free networks with tunable clustering. Physical Review E, 65(2), 026107.CrossRefGoogle ScholarPubMed
Ikehara, K., & Clauset, A. (2017). Characterizing the structural diversity of complex networks across domains. arxiv preprint arxiv:1710.11304.Google Scholar
Jamakovic, A., & Uhlig, S. (2008). On the relationships between topological measures in real-world networks. Networks and Heterogeneous Media, 3(2), 345.CrossRefGoogle Scholar
Janssen, J., Hurshman, M., & Kalyaniwalla, N. (2012). Model selection for social networks using graphlets. Internet Mathematics, 8(4), 338363.CrossRefGoogle Scholar
Kang, U., Tong, H., & Sun, J. (2012). Fast random walk graph kernel. In Proceedings of the 2012 SIAM international conference on data mining (pp. 828–838). SIAM.CrossRefGoogle Scholar
Kashima, H., Tsuda, K., & Inokuchi, A. (2004). Kernels for graphs. Kernel Methods in Computational Biology, 39(1), 101113.Google Scholar
Kasthuri, N., & Lichtman, J. (2008). Neurodata’s graph database.Google Scholar
Kelmans, A. K. (1976). Comparison of graphs by their number of spanning trees. Discrete Mathematics, 16(3), 241261.CrossRefGoogle Scholar
Kiar, G. (2016). Gremlin: Graph estimation from mr images leading to inference in neuroscience. Ph.D. thesis, Johns Hopkins University.Google Scholar
Ktena, S. I., Parisot, S., Ferrante, E., Rajchl, M., Lee, M., Glocker, B., & Rueckert, D. (2017). Distance metric learning using graph convolutional networks: Application to functional brain networks. In International conference on medical image computing and computer-assisted intervention (pp. 469–477). Springer.CrossRefGoogle Scholar
Kunegis, J. (2013). KONECT – The Koblenz network collection. In Proceedings of international conference on world wide web companion (pp. 13431350).Google Scholar
Langley, P., & Iba, W. (1993). Average-case analysis of a nearest neighbor algorithm. In International joint conference on artificial intelligence, vol. 13 (pp. 889–889). Citeseer.Google Scholar
Leskovec, J., Kleinberg, J., & Faloutsos, C. (2005). Graphs over time: Densification laws, shrinking diameters and possible explanations. In Proceedings of the eleventh ACM SIGKDD international conference on knowledge discovery in data mining (pp. 177–187). ACM.CrossRefGoogle Scholar
Leskovec, J., Kleinberg, J., & Faloutsos, C. (2007). Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data (TKDD), 1(1), 2.CrossRefGoogle Scholar
Leskovec, J., Chakrabarti, D., Kleinberg, J., Faloutsos, C., & Ghahramani, Z. (2010). Kronecker graphs: An approach to modeling networks. Journal of Machine Learning Research, 11(Feb), 9851042.Google Scholar
Li, G., Semerci, M., Yener, B., & Zaki, M. J. (2012). Effective graph classification based on topological and label attributes. Statistical Analysis and Data Mining: The ASA Data Science Journal, 5(4), 265283.CrossRefGoogle Scholar
Lim, S.-H., Lee, S. M., Powers, S., Shankar, M., & Imam, N. (2015). Survey of approaches to generate realistic synthetic graphs. Oak Ridge National Laboratory.Google Scholar
Mahadevan, P., Krioukov, D., Fall, K., & Vahdat, A. (2006). Systematic topology analysis and generation using degree correlations. In Acm sigcomm computer communication review, vol. 36 (pp. 135146). ACM.CrossRefGoogle Scholar
Martnez, Johann H, & Chavez, Mario. (2019). Comparing complex networks: in defence of the simple. New Journal of Physics.CrossRefGoogle Scholar
Mheich, A., Hassan, M., Khalil, M., Gripon, V., Dufor, O., & Wendling, F. (2018). SimiNet: A novel method for quantifying brain network similarity. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(9), 22382249.CrossRefGoogle ScholarPubMed
Middendorf, M., Ziv, E., & Wiggins, C. H. (2005). Inferring network mechanisms: The drosophila melanogaster protein interaction network. Proceedings of the National Academy of Sciences, 102(9), 31923197.Google Scholar
Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D., & Alon, U. (2002). Network motifs: simple building blocks of complex networks. Science, 298(5594), 824827.CrossRefGoogle ScholarPubMed
Nagy, M. (2018). Data-driven analysis of fractality and other characteristics of complex networks. Masters Thesis. Budapest University of Technology and Economics.Google Scholar
Nagy, M., & Molontay, R. (2021). Supplementary Material for Network Classification Based Structural Analysis of Real Networks and their Model-Generated Counterparts. github.com/marcessz/complex-networks.Google Scholar
Narayanan, A., Chandramohan, M., Venkatesan, R., Chen, L., Liu, Y., & Jaiswal, S. (2017). graph2vec: Learning distributed representations of graphs. arxiv preprint arxiv:1707.05005.Google Scholar
Newman, M. E. J. (2003). Properties of highly clustered networks. Physical Review E, 68(2), 026121.CrossRefGoogle ScholarPubMed
Peixoto, T. P. (2014). The graph-tool python library. figshare.Google Scholar
Peixoto, T. P. (2017). Nonparametric bayesian inference of the microcanonical stochastic block model. Physical Review E, 95(1), 012317.CrossRefGoogle ScholarPubMed
Pržulj, N. (2007). Biological network comparison using graphlet degree distribution. Bioinformatics, 23(2), e177e183.CrossRefGoogle ScholarPubMed
Ralaivola, L., Swamidass, S. J, Saigo, H., & Baldi, P. (2005). Graph kernels for chemical informatics. Neural Networks, 18(8), 10931110.CrossRefGoogle ScholarPubMed
Rossi, R. A., & Ahmed, N. K. (2015). The network data repository with interactive graph analytics and visualization. In Proceedings of the 29th AAAI conference on artificial intelligence.Google Scholar
Rossi, R. A., Zhou, R., & Ahmed, N. K. (2017). Deep feature learning for graphs. arxiv preprint arxiv:1704.08829.Google Scholar
Sala, A., Cao, L., Wilson, C., Zablit, R., Zheng, H., & Zhao, B. Y. (2010). Measurement-calibrated graph models for social network experiments. In Proceedings of the 19th international conference on world wide web (pp. 861–870). ACM.CrossRefGoogle Scholar
Schieber, T. A., Carpi, L., Daz-Guilera, A., Pardalos, P. M., Masoller, C., & Ravetti, M. G. (2017). Quantification of network structural dissimilarities. Nature communications, 8, 13928.Google ScholarPubMed
Soundarajan, S., Eliassi-Rad, T., & Gallagher, B. (2014). A guide to selecting a network similarity method. In Proceedings of the 2014 SIAM international conference on data mining (pp. 1037–1045). SIAM.Google Scholar
Stabler, B., Bar-Gera, H., Sall, E., & Transportation Networks for Research Core Team. (2019). Transportation networks for research.Google Scholar
Sukrit, G., Rami, P., & Konstantin, K. (2016). Comparative network analysis using KronFit. Complex Networks VII. Studies in Computational Intelligence, 644, 363375.CrossRefGoogle Scholar
Sun, X., & Wandelt, S. (2014). Network similarity analysis of air navigation route systems. Transportation Research Part E: Logistics and Transportation Review, 70, 416434.CrossRefGoogle Scholar
Ugander, J., Backstrom, L., & Kleinberg, J. (2013). Subgraph frequencies: Mapping the empirical and extremal geography of large graph collections. In Proceedings of the 22nd international conference on world wide web (pp. 1307–1318). ACM.CrossRefGoogle Scholar
van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9(Nov), 25792605.Google Scholar
Vishwanathan, S. V. N., Schraudolph, N. N., Kondor, R., & Borgwardt, K. M. (2010). Graph kernels. Journal of Machine Learning Research, 11(Apr), 12011242.Google Scholar
Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of ‘small-world’ networks. Nature, 393(6684), 440.Google ScholarPubMed
Wegner, A. E., Ospina-Forero, L., Gaunt, R. E., Deane, C. M., & Reinert, G. (2018). Identifying networks with common organizational principles. Journal of Complex Networks, 6(6), 887913.CrossRefGoogle Scholar
Weinberger, K. Q., & Saul, L. K. (2009). Distance metric learning for large margin nearest neighbor classification. Journal of Machine Learning Research, 10(Feb), 207244.Google Scholar
Wilson, R. C, & Zhu, P. (2008). A study of graph spectra for comparing graphs and trees. Pattern Recognition, 41(9), 28332841.CrossRefGoogle Scholar
Yang, L., & Jin, R. (2006). Distance metric learning: A comprehensive survey. Michigan State Universiy, 2(2), 4.Google Scholar