A spectral method for community detection in moderately sparse degree-corrected stochastic block models

  • Lennart Gulikers (a1), Marc Lelarge (a2) and Laurent Massoulié (a3)

Abstract

We consider community detection in degree-corrected stochastic block models. We propose a spectral clustering algorithm based on a suitably normalized adjacency matrix. We show that this algorithm consistently recovers the block membership of all but a vanishing fraction of nodes, in the regime where the lowest degree is of order log(n) or higher. Recovery succeeds even for very heterogeneous degree distributions. The algorithm does not rely on parameters as input. In particular, it does not need to know the number of communities.
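As a rough illustration of the kind of procedure the abstract describes, the sketch below samples a small degree-corrected block model and clusters it from the spectrum of a degree-normalized adjacency matrix. Everything specific here is an assumption made for illustration and is not taken from the paper: the symmetric normalization D^{-1/2} A D^{-1/2} is a common stand-in for the paper's "suitably normalized" matrix, the helper names `sample_dcsbm` and `spectral_two_communities` are hypothetical, and the number of communities is fixed to two, whereas the paper's algorithm does not need that number as input.

```python
import numpy as np


def sample_dcsbm(labels, weights, p_in, p_out, rng):
    """Sample a symmetric 0/1 adjacency matrix from a degree-corrected SBM.

    Edge (i, j) appears with probability weights[i] * weights[j] * p,
    where p is p_in inside a community and p_out across communities.
    """
    n = len(labels)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out) * np.outer(weights, weights)
    probs = np.clip(probs, 0.0, 1.0)
    # Draw only the strict upper triangle, then mirror it (no self-loops).
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(float)


def spectral_two_communities(adj):
    """Split nodes by the sign of the second eigenvector of D^{-1/2} A D^{-1/2}.

    Assumption: this normalization is illustrative, not the paper's.
    """
    deg = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1.0))  # guard isolated nodes
    norm_adj = adj * inv_sqrt[:, None] * inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(norm_adj)
    second = eigvecs[:, -2]  # eigenvector of the second-largest eigenvalue
    return (second > 0).astype(int)


rng = np.random.default_rng(0)
n = 400
labels = np.repeat([0, 1], n // 2)
weights = rng.uniform(0.5, 1.5, size=n)  # heterogeneous expected degrees
adj = sample_dcsbm(labels, weights, p_in=0.3, p_out=0.05, rng=rng)

pred = spectral_two_communities(adj)
agree = np.mean(pred == labels)
accuracy = max(agree, 1.0 - agree)  # community labels are only defined up to a swap
print(f"fraction of nodes recovered: {accuracy:.3f}")
```

The degree normalization is what keeps high-degree nodes from dominating the leading eigenvectors. In this toy setting the sign pattern of the second eigenvector tracks community membership rather than degree.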

Corresponding author

* Postal address: Microsoft Research - Inria Joint Centre, Campus de l'École Polytechnique, Bâtiment Alan Turing, 1 rue Honoré d'Estienne d'Orves, 91120 Palaiseau, France.
** Email address: lennart.gulikers@inria.fr
*** Postal address: Inria Paris, 2 rue Simone Iff, CS 42112, 75589 Paris Cedex 12, France. Email address: marc.lelarge@ens.fr
**** Email address: laurent.massoulie@inria.fr
