
Two provably consistent divide-and-conquer clustering algorithms for large networks.

Soumendu Sundar Mukherjee, Purnamrita Sarkar, Peter J Bickel.

Abstract

In this article, we advance divide-and-conquer strategies for solving the community detection problem in networks. We propose two algorithms that perform clustering on several small subgraphs and then patch the results into a single clustering. The main advantage of these algorithms is that they significantly reduce the computational cost of traditional algorithms, including spectral clustering, semidefinite programs, modularity-based methods, and likelihood-based methods, without losing accuracy, and at times even improving it. These algorithms are also, by nature, parallelizable. Since most traditional algorithms are accurate, and the corresponding optimization problems are much simpler for small problems, our divide-and-conquer methods provide an omnibus recipe for scaling traditional algorithms up to large networks. We prove the consistency of these algorithms under various subgraph selection procedures and perform extensive simulations and real-data analysis to understand the advantages of the divide-and-conquer approach in various settings.
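
The divide-and-conquer idea in the abstract (cluster several small subgraphs, then patch the results into one clustering) can be illustrated with a minimal sketch. This is not the authors' algorithm: the patching rule (averaging how often two nodes are co-clustered across subgraphs, then thresholding), the toy base clusterer (connected components), and all function and parameter names are assumptions chosen for illustration. Any base clustering routine with the same interface could be passed in instead.

```python
import random
from itertools import combinations


def connected_components(nodes, adj):
    """Toy base clusterer (an assumption for illustration): label each node
    by the connected component it belongs to within the given node subset."""
    node_set = set(nodes)
    labels = {}
    for start in nodes:
        if start in labels:
            continue
        labels[start] = start
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v in node_set and v not in labels:
                    labels[v] = start
                    stack.append(v)
    return labels


def divide_and_conquer_cluster(adj, num_subgraphs, subgraph_size,
                               base_cluster, threshold=0.5, seed=0):
    """Hypothetical sketch: cluster random node-induced subgraphs, then patch
    the local results by averaging pairwise co-clustering indicators."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    seen = {}      # pair -> number of subgraphs containing both nodes
    together = {}  # pair -> number of those subgraphs that co-clustered them

    # Divide: each subgraph is clustered independently, so this loop could
    # run in parallel.
    for _ in range(num_subgraphs):
        sample = rng.sample(nodes, subgraph_size)
        labels = base_cluster(sample, adj)
        for u, v in combinations(sorted(sample), 2):
            seen[(u, v)] = seen.get((u, v), 0) + 1
            if labels[u] == labels[v]:
                together[(u, v)] = together.get((u, v), 0) + 1

    # Patch: link pairs whose averaged co-clustering frequency exceeds the
    # threshold, then read off the groups of the resulting consensus graph.
    consensus = {u: set() for u in nodes}
    for (u, v), count in seen.items():
        if together.get((u, v), 0) / count > threshold:
            consensus[u].add(v)
            consensus[v].add(u)
    return connected_components(nodes, consensus)


# Toy network: two disjoint cliques of six nodes each (nodes 0-5 and 6-11).
adj = {u: set() for u in range(12)}
for block in (range(0, 6), range(6, 12)):
    for u, v in combinations(block, 2):
        adj[u].add(v)
        adj[v].add(u)

result = divide_and_conquer_cluster(adj, num_subgraphs=40, subgraph_size=6,
                                    base_cluster=connected_components)
print(result)
```

In this toy setting the base clusterer only ever sees six nodes at a time, yet the patched clustering separates the two cliques. Real networks are noisy, so a practical base clusterer would be one of the traditional methods the abstract lists, applied to each subgraph.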


Keywords:  clustering; divide-and-conquer; networks

Year:  2021        PMID: 34716259      PMCID: PMC8612351          DOI: 10.1073/pnas.2100482118

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


