
CGC: A Flexible and Robust Approach to Integrating Co-Regularized Multi-Domain Graph for Clustering.

Wei Cheng, Zhishan Guo, Xiang Zhang, Wei Wang.

Abstract

Multi-view graph clustering aims to enhance clustering performance by integrating heterogeneous information collected in different domains. Each domain provides a different view of the data instances. Leveraging cross-domain information has been demonstrated to be an effective way to achieve better clustering results. Despite this success, existing multi-view graph clustering methods usually assume that different views are available for the same set of instances, so that instances in different domains can be treated as having a strict one-to-one relationship. In many real-life applications, however, data instances in one domain may correspond to multiple instances in another domain. Moreover, relationships between instances in different domains may be associated with weights based on prior (partial) knowledge. In this paper, we propose a flexible and robust framework, CGC (Co-regularized Graph Clustering), based on non-negative matrix factorization (NMF), to tackle these challenges. CGC has several advantages over existing methods. First, it supports many-to-many cross-domain instance relationships. Second, it incorporates weights on cross-domain relationships. Third, it allows partial cross-domain mapping, so that graphs in different domains may have different sizes. Finally, it reports the extent to which each cross-domain instance relationship violates the in-domain clustering structure, and thus enables users to re-evaluate the consistency of the relationship. We develop an efficient optimization method that is guaranteed to find the globally optimal solution with a given confidence requirement. The proposed method can automatically identify noisy domains and assign smaller weights to them, which helps to obtain an optimal graph partition for the focused domain. Extensive experimental results on UCI benchmark data sets, newsgroup data sets, and biological interaction networks demonstrate the effectiveness of our approach.
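The abstract does not state the objective function, so the following is only a minimal sketch of how NMF-based co-regularization across two domains might look, not the authors' algorithm. It assumes a symmetric factorization A ≈ H Hᵀ per domain and a squared penalty on S·H1 − H2, where S is a weighted many-to-many mapping whose all-zero rows mark unmapped instances. The function name `co_regularized_symnmf`, the parameters `lam` and `lr`, and the per-instance `violation` score are all hypothetical. Projected gradient descent with backtracking is swapped in for the paper's optimizer, so it carries no global-optimality guarantee.

```python
import numpy as np

def _loss(A1, A2, S, H1, H2, lam, mapped):
    # In-domain reconstruction error plus the cross-domain penalty (assumed form).
    D = (S @ H1 - H2) * mapped[:, None]
    return (np.sum((A1 - H1 @ H1.T) ** 2)
            + np.sum((A2 - H2 @ H2.T) ** 2)
            + lam * np.sum(D ** 2))

def co_regularized_symnmf(A1, A2, S, k, lam=1.0, lr=1e-2, iters=300, seed=0):
    """Hypothetical two-domain sketch; not the optimizer described in the paper.

    A1: (n1, n1) symmetric non-negative affinity matrix of domain 1
    A2: (n2, n2) symmetric non-negative affinity matrix of domain 2
    S:  (n2, n1) non-negative weighted mapping; all-zero rows are unmapped
    """
    rng = np.random.default_rng(seed)
    H1 = rng.random((A1.shape[0], k))
    H2 = rng.random((A2.shape[0], k))
    mapped = (S.sum(axis=1) > 0).astype(float)  # partial mapping mask
    history = [_loss(A1, A2, S, H1, H2, lam, mapped)]
    for _ in range(iters):
        D = (S @ H1 - H2) * mapped[:, None]
        G1 = 4 * (H1 @ H1.T - A1) @ H1 + 2 * lam * S.T @ D
        G2 = 4 * (H2 @ H2.T - A2) @ H2 - 2 * lam * D
        step = lr
        for _try in range(30):
            # Gradient step, then projection onto the non-negative orthant.
            N1 = np.maximum(H1 - step * G1, 0.0)
            N2 = np.maximum(H2 - step * G2, 0.0)
            new = _loss(A1, A2, S, N1, N2, lam, mapped)
            if new <= history[-1]:
                H1, H2 = N1, N2
                history.append(new)
                break
            step *= 0.5  # backtrack until the loss does not increase
        else:
            break  # no improving step found; stop early
    D = (S @ H1 - H2) * mapped[:, None]
    violation = np.linalg.norm(D, axis=1)  # per-instance disagreement (assumed score)
    return H1, H2, history, violation

# Toy data: domain 1 has 4 nodes, domain 2 has 5 nodes, two blocks each.
A1 = np.array([[1, 1, 0, 0],
               [1, 1, 0, 0],
               [0, 0, 1, 1],
               [0, 0, 1, 1]], dtype=float)
A2 = np.array([[1, 1, 0, 0, 0],
               [1, 1, 0, 0, 0],
               [0, 0, 1, 1, 1],
               [0, 0, 1, 1, 1],
               [0, 0, 1, 1, 1]], dtype=float)
S = np.array([[0.5, 0.5, 0.0, 0.0],   # one instance mapped to two, with weights
              [1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.3, 0.7],
              [0.0, 0.0, 0.0, 0.0]])  # unmapped instance
H1, H2, history, violation = co_regularized_symnmf(A1, A2, S, k=2)
labels1, labels2 = H1.argmax(axis=1), H2.argmax(axis=1)
```

In this sketch, cluster labels are read off as the largest entry in each row of H1 and H2. A large `violation` entry would flag an instance whose cross-domain mapping disagrees with its in-domain cluster structure.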


Keywords:  Algorithms; Design; Performance; co-regularization; graph clustering; nonnegative matrix factorization

Year:  2016        PMID: 29081726      PMCID: PMC5658064          DOI: 10.1145/2903147

Source DB:  PubMed          Journal:  ACM Trans Knowl Discov Data        ISSN: 1556-4681            Impact factor:   2.713


