Efficient detection of communities with significant overlaps in networks: Partial community merger algorithm

  • ELVIS H. W. XU and PAK MING HUI
Abstract

Detecting communities in large-scale social networks is a challenging task because each vertex may belong to multiple communities. Such multiple memberships, and the strong overlaps among communities that they imply, render many detection algorithms invalid. We develop a Partial Community Merger Algorithm (PCMA) for detecting communities with significant overlaps as well as slightly overlapping and disjoint ones. It is a bottom-up approach that reconstructs complete communities by properly reassembling the partial information about communities revealed in the ego networks of vertices. We propose a novel similarity measure of communities and an efficient merger process to address the two key issues in implementing this approach: noise control and merger order. PCMA is tested against two benchmarks, and overall it outperforms all compared algorithms in both accuracy and efficiency. It is applied to two huge online social networks, Friendster and Sina Weibo. Millions of communities are detected, and they are of higher quality than the corresponding metadata groups. We find that the latter should not be regarded as the ground truth for structural communities. The significant overlapping pattern found in the detected communities confirms the need for new algorithms, such as PCMA, that handle multiple memberships of vertices in social networks.
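
To make the bottom-up idea concrete, the following is a minimal Python sketch of the general "ego network, then merge" pattern described in the abstract. It is not the PCMA implementation: the abstract does not specify the similarity measure or the merger process, so this sketch substitutes plain Jaccard similarity and a greedy most-similar-first merge. All function names, the `threshold` parameter, and the rule of taking connected components of each ego network as partial communities are assumptions made for illustration only.

```python
def ego_partial_communities(adj, ego):
    """Return partial communities seen from `ego` (illustrative assumption:
    connected components of the ego's neighbourhood, with the ego added back)."""
    neighbours = adj[ego]
    seen = set()
    parts = []
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first search restricted to the ego's neighbours.
        stack = [start]
        seen.add(start)
        component = {start}
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w in neighbours and w not in seen:
                    seen.add(w)
                    component.add(w)
                    stack.append(w)
        # Ignore isolated neighbours; they carry no group information.
        if len(component) >= 2:
            parts.append(frozenset(component | {ego}))
    return parts


def jaccard(a, b):
    """Jaccard similarity of two vertex sets (a stand-in, not PCMA's measure)."""
    return len(a & b) / len(a | b)


def merge_partial_communities(parts, threshold=0.5):
    """Greedily merge the most similar pair until no pair reaches `threshold`."""
    merged = [set(p) for p in set(parts)]  # drop exact duplicates first
    while True:
        best = None
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                s = jaccard(merged[i], merged[j])
                if s >= threshold and (best is None or s > best[0]):
                    best = (s, i, j)
        if best is None:
            return merged
        _, i, j = best
        merged[i] |= merged[j]
        del merged[j]


def detect_communities(adj, threshold=0.5):
    """Collect partial communities from every ego network, then merge them."""
    parts = []
    for v in adj:
        parts.extend(ego_partial_communities(adj, v))
    return merge_partial_communities(parts, threshold)
```

On a toy graph of two triangles sharing one vertex, this sketch keeps the triangles as two separate communities that overlap at the shared vertex, because their Jaccard similarity (1/5) is below the default threshold. The quadratic pair search is chosen for readability and would not scale to networks of the size discussed in the abstract.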

Network Science
  • ISSN: 2050-1242
  • EISSN: 2050-1250
  • URL: /core/journals/network-science