NetworKit: A tool suite for large-scale complex network analysis

CHRISTIAN L. STAUDT; ALEKSEJS SAZONOVS; HENNING MEYERHENKE

doi:10.1017/nws.2016.20

NetworKit: A tool suite for large-scale complex network analysis

Published online by Cambridge University Press: 28 December 2016

CHRISTIAN L. STAUDT ,

ALEKSEJS SAZONOVS and

HENNING MEYERHENKE

Show author details

CHRISTIAN L. STAUDT: Affiliation:
Institute of Theoretical Informatics, Karlsruhe Institute of Technology (KIT), 76131 Karlsruhe, Germany (e-mail: christian.staudt@kit.edu)
ALEKSEJS SAZONOVS: Affiliation:
Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK (e-mail: as45@sanger.ac.uk)
HENNING MEYERHENKE: Affiliation:
Institute of Theoretical Informatics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany (e-mail: meyerhenke@kit.edu)

Article contents

Abstract
References

Rights & Permissions

Abstract

Core share and HTML view are not available for this content. However, as you have access to this content, a full PDF is available via the ‘Save PDF’ action button.

We introduce NetworKit, an open-source software package for analyzing the structure of large complex networks. Appropriate algorithmic solutions are required to handle increasingly common large graph data sets containing up to billions of connections. We describe the methodology applied to develop scalable solutions to network analysis problems, including techniques like parallelization, heuristics for computationally expensive problems, efficient data structures, and modular software architecture. Our goal for the software is to package results of our algorithm engineering efforts and put them into the hands of domain experts. NetworKit is implemented as a hybrid combining the kernels written in C++ with a Python frontend, enabling integration into the Python ecosystem of tested tools for data analysis and scientific computing. The package provides a wide range of functionality (including common and novel analytics algorithms and graph generators) and does so via a convenient interface. In an experimental comparison with related software, NetworKit shows the best performance on a range of typical analysis tasks.

Keywords

complex networks network analysis network science parallel graph algorithms data analysis software

Information

Type: Research Article
Information: Network Science , Volume 4 , Issue 4 , December 2016 , pp. 508 - 530

DOI: https://doi.org/10.1017/nws.2016.20 [Opens in a new window]
Copyright: Copyright © Cambridge University Press 2016

References

Aiello, W., Chung, F., & Lu, L. (2000). A random graph model for massive graphs. In Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, Portland, OR, USA. ACM, pp. 171–180.CrossRef Google Scholar

Alahakoon, T., Tripathi, R., Kourtellis, N., Simha, R., & Iamnitchi, A. (2011). K-path centrality: A new centrality measure in social networks. In Proceedings of the 4th Workshop on Social Network Systems, Salzburg, Austria. ACM, p. 1.CrossRef Google Scholar

Albert, R., & Barabási, A. (2002). Statistical mechanics of complex networks. Reviews of Modern Physics, 74 (1), 47.CrossRef Google Scholar

Bader, D. A., Meyerhenke, H., Sanders, P., Schulz, C., Kappes, A., & Wagner, D. (2014). Benchmarking for graph clustering and partitioning. In Encyclopedia of Social Network Analysis and Mining (pp. 73–82). Springer.CrossRef Google Scholar

Bastian, M., Heymann, S., & Jacomy, M. (2009). Gephi: An open source software for exploring and manipulating networks. In Proceedings of the Third International Conference on Weblogs and Social Media, ICWSM 2009, San Diego, CA, USA, May 17–20, 2009.CrossRef Google Scholar

Batagelj, V., & Brandes, U. (2005). Efficient generation of large random networks. Physical Review E, 71 (3), 036113.CrossRef Google Scholar PubMed

Batagelj, V., & Mrvar, A. (2004). Pajek—analysis and visualization of large networks, Lecture Notes in Computer Science, vol. 2265 (pp 477–478). Springer.Google Scholar

Batagelj, V., & Zaveršnik, M. (2011). Fast algorithms for determining (generalized) core groups in social networks. Advances in Data Analysis and Classification, 5 (2), 129–145.CrossRef Google Scholar

Behnel, S., Bradshaw, R., Citro, C., Dalcin, L., Seljebotn, D. S., & Smith, K. (2011). Cython: The best of both worlds. Computing in Science & Engineering, 13 (2), 31–39.CrossRef Google Scholar

Bergamini, E., & Meyerhenke, H. (2015). Fully-dynamic approximation of betweenness centrality. In Proceedings of the Algorithms - ESA 2015 - 23rd Annual European Symposium, Patras, Greece, September 14–16. Springer, pp. 155–166.CrossRef Google Scholar

Bergamini, E., Meyerhenke, H., & Staudt, C. (2015). Approximating betweenness centrality in large evolving networks. In Proceedings of the 17th Workshop on Algorithm Engineering and Experiments, ALENEX 2015. San Diego, CA, USA, January 5, 2015. SIAM, pp. 133–146.CrossRef Google Scholar

Blondel, V. D., Guillaume, J.-L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008 (10), P10008.CrossRef Google Scholar

Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., & Hwang, D.-U. (2006). Complex networks: Structure and dynamics. Physics Reports, 424 (4), 175–308.CrossRef Google Scholar

Bode, M., Fountoulakis, N., & Müller, T. (2015). On the largest component of a hyperbolic model of complex networks. Electronic Journal of Combinatorics, 22 (3), P3.24.CrossRef Google Scholar

Boldi, P., Codenotti, B., Santini, M., & Vigna, S. (2004). Ubicrawler: A scalable fully distributed web crawler. Software: Practice & Experience, 34 (8), 711–726.Google Scholar

Brandes, U. (2001). A faster algorithm for betweenness centrality. Journal of Mathematical Sociology, 25 (2), 163–177.CrossRef Google Scholar

Brin, S., & Page, L. (2012). Reprint of: The anatomy of a large-scale hypertextual web search engine. Computer Networks, 56 (18), 3825–3833.CrossRef Google Scholar

CAIDA (2003). Caida skitter router-level topology measurements. Retrieved from http://www.caida.org/data/router-adjacencies/.Google Scholar

Chakrabarti, D., Zhan, Y., & Faloutsos, C. (2004). R-MAT: A recursive model for graph mining. In SDM (vol. 4, pp. 442–446). Orlando, FL, USA: SIAM.Google Scholar

Costa, L. d. F., Oliveira, O. N. Jr, Travieso, G., Rodrigues, F. A., Villas Boas, P. R., Antiqueira, L., . . . Correa Rocha, L. E. (2011). Analyzing and modeling real-world phenomena with complex networks: A survey of applications. Advances in Physics, 60 (3), 329–412.CrossRef Google Scholar

Crescenzi, P., Grossi, R., Habib, M., Lanzi, L., & Marino, A. (2013). On computing the diameter of real-world undirected graphs. Theoretical Computer Science, 514, 84–95.CrossRef Google Scholar

Csardi, G., & Nepusz, T. (2006). The igraph software package for complex network research. InterJournal, Complex Systems, 1695 (5), 1–9.Google Scholar

Dasari, N. S., Ranjan, D., & Zubair, M. (2014). ParK: An efficient algorithm for k-core decomposition on multicore processors. In Lin, J., Pei, J., Hu, X., Chang, W., Nambiar, R., Aggarwal, C., Cercone, N., Honavar, V., Huan, J., Mobasher, B., & Pyne, S. (Eds.), IEEE International Conference on Big Data, Big Data 2014 (pp. 9–16). Washington, DC: IEEE, October 27–30, 2014.Google Scholar

Ediger, D., Jiang, K., Riedy, E. J., & Bader, D. A. (2013). GraphCT: Multithreaded algorithms for massive graph analysis. IEEE Transactions on Parallel and Distributed Systems, 24 (11), 2220–2229.Google Scholar

Ediger, D., McColl, R., Riedy, E. J., & Bader, D. A. (2012). STINGER: High performance data structure for streaming graphs. In IEEE Conference on High Performance Extreme Computing HPEC 2012, Waltham, MA, USA, September 10–12, 2012, pp. 1–5.CrossRef Google Scholar

Eppstein, D., & Wang, J. (2004). Fast approximation of centrality. Journal of Graph Algorithms Applications, 8, 39–45.CrossRef Google Scholar

Esders, K. (2015). Link prediction in large-scale complex networks. Master's thesis, Karlsruhe Institute of Technology.Google Scholar

Flick, P. (2014). Analysis of human tissue-specific protein-protein interaction networks. Master's thesis, Karlsruhe Institute of Technology.Google Scholar

Freeman, L. C. (1979). Centrality in social networks conceptual clarification. Social Networks, 1 (3), 215–239.CrossRef Google Scholar

Geisberger, R., Sanders, P., & Schultes, D. (2008). Better approximation of betweenness centrality. In Proceedings of the Tenth Workshop on Algorithm Engineering and Experiments (ALENEX), San Francisco, CA, USA. SIAM, pp. 90–100.CrossRef Google Scholar

Girvan, M., & Newman, M. (2002). Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99 (12), 7821.CrossRef Google Scholar PubMed

Gugelmann, L., Panagiotou, K., & Peter, U. (2012). Random hyperbolic graphs: Degree sequence and clustering - (extended abstract). In Automata, Languages, and Programming - 39th International Colloquium, ICALP 2012, Proceedings, Part II, pp. 573–585.Google Scholar

Hagberg, A., Swart, P., & S Chult, D. (2008). Exploring network structure, dynamics, and function using NetworkX. Technical report, Los Alamos National Laboratory (LANL).Google Scholar

Hakimi, S. L. (1962). On realizability of a set of integers as degrees of the vertices of a linear graph. I. Journal of the Society for Industrial & Applied Mathematics, 10 (3), 496–506.CrossRef Google Scholar

Hamann, M., Lindner, G., Meyerhenke, H., Staudt, C. L., & Wagner, D. (2016). Structure-preserving sparsification methods for social networks. Social Network Analysis and Mining, 6 (1), 22:1–22:22.CrossRef Google Scholar

Katz, L. (1953). A new status index derived from sociometric analysis. Psychometrika, 18 (1), 39–43.CrossRef Google Scholar

Kiwi, M. A., & Mitsche, D. (2015). A bound for the diameter of random hyperbolic graphs. In Proceedings of the Twelfth Workshop on Analytic Algorithmics and Combinatorics, ANALCO 2015, San Diego, CA, USA, January 4, 2015, pp. 26–39.CrossRef Google Scholar

Koch, J., Staudt, C. L., Vogel, M., & Meyerhenke, H. (2015). Complex network analysis on distributed systems: An empirical comparison. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Paris, France. ACM, pp. 1169–1176.CrossRef Google Scholar

Krioukov, D., Papadopoulos, F., Kitsak, M., Vahdat, A., & Boguñá, M. (2010). Hyperbolic geometry of complex networks. Physical Review E, 82, 036106.CrossRef Google Scholar PubMed

Kunegis, J. (2013). KONECT: The koblenz network collection. In 22nd International World Wide Web Web Conference, WWW '13, Rio de Janeiro, Brazil, May 13–17, 2013, Companion Volume, pp. 1343–1350.CrossRef Google Scholar

Lambiotte, R. (2010). Multi-scale modularity in complex networks. In Proceedings of the Eighth International Symposium on Modeling and Optimization in Mobile, Ad-Hoc and Wireless Networks (WiOpt 2010), May 31–June 4, 2010, University of Avignon, Avignon, France, pp. 546–553.Google Scholar

Lancichinetti, A., & Fortunato, S. (2009). Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Physical Review E, 80 (1), 016118.CrossRef Google Scholar PubMed

Leskovec, J., & Krevl, A. (2014). SNAP Datasets: Stanford large network dataset collection. Retrieved from http://snap.stanford.edu/data.Google Scholar

Leskovec, J., & Sosič, R. (2014). SNAP: A general purpose network analysis and graph mining library in C++. Retrieved from http://snap.stanford.edu/snap.Google Scholar

Low, Y., Bickson, D., Gonzalez, J., Guestrin, C., Kyrola, A., & Hellerstein, J. M. (2012). Distributed graphlab: A framework for machine learning and data mining in the cloud. Proceedings of the VLDB Endowment, 5 (8), 716–727.CrossRef Google Scholar

Lugowski, A., Alber, D.M., Buluç, A., Gilbert, J. R., Reinhardt, S.P., Teng, Y., & Waranis, A. (2012). A flexible open-source toolbox for scalable complex graph analysis. In Proceedings of the Twelfth SIAM International Conference on Data Mining, Anaheim, CA, USA, April 26–28, 2012, pp. 930–941.CrossRef Google Scholar

Newman, M. (2010). Networks: An introduction. New York, NY, USA: Oxford University Press.CrossRef Google Scholar

O'Madadhain, J., Fisher, D., White, S., & Boey, Y. (2003). The JUNG (java universal network/graph) framework. Irvine, California: University of California.Google Scholar

Erdős, P., & Rényi, A. (1960). On the evolution of random graphs. Publication of the Mathematical Institute of the Hungarian Academy of Sciences, 5.Google Scholar

Peixoto, T. P. (2006). graph-tool. Retrieved from http://graph-tool.skewed.de.Google Scholar

Perez, F., Granger, B. E., & Obispo, C. (2013). An open source framework for interactive, collaborative and reproducible scientific computing and education.Google Scholar

Raghavan, U. N., Albert, R., & Kumara, S. (2007). Near linear time algorithm to detect community structures in large-scale networks. Physical Review E, 76 (3), 036106.CrossRef Google Scholar PubMed

Riondato, M., & Kornaropoulos, E. M. (2016). Fast approximation of betweenness centrality through sampling. Data Mining and Knowledge Discovery, 30 (2), 438–475.CrossRef Google Scholar

Schank, T., & Wagner, D. (2005). Approximating clustering coefficient and transitivity. Journal of Graph Algorithms and Applications, 9 (2), 265–275.CrossRef Google Scholar

Shun, J., & Blelloch, G. E. (2013). Ligra: A lightweight graph processing framework for shared memory. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '13, Shenzhen, China, February 23–27, 2013. ACM, pp. 135–146.CrossRef Google Scholar

Sporns, O., & Betzel, R. F. (2016). Modular brain networks. Annual Review of Psychology, 67, 613–640.CrossRef Google Scholar PubMed

Staudt, C. L., & Meyerhenke, H. (2016). Engineering parallel algorithms for community detection in massive networks. IEEE Transactions on Parallel and Distributed Systems, 27 (1), 171–184.CrossRef Google Scholar

Traud, A. L., Mucha, P. J., & Porter, M. A. (2012). Social structure of facebook networks. Physica A: Statistical Mechanics and its Applications, 391 (16), 4165–4180.CrossRef Google Scholar

von Looz, M., Meyerhenke, H., & Prutkin, R. (2015). Generating random hyperbolic graphs in subquadratic time. In Proceedings of 26th International Symposium on Algorithms and Computation (ISAAC 2015), LNCS, Nagoya, Japan. Springer.Google Scholar

Zafarani, R., & Liu, H. (2009). Social computing data repository at ASU. Retrieved from http://socialcomputing.asu.edu.Google Scholar

Article contents

NetworKit: A tool suite for large-scale complex network analysis

Abstract

Keywords

Information

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests