Breaking the hierarchy--a new cluster selection mechanism for hierarchical clustering methods.

László A Zahoránszky, Gyula Y Katona, Péter Hári, András Málnási-Csizmadia, Katharina A Zweig, Gergely Zahoránszky-Köhalmi.

Abstract

BACKGROUND: Hierarchical clustering methods such as Ward's method have been used for decades to understand biological and chemical data sets. To obtain a partition of the data set, an optimal level of the hierarchy must be chosen by a so-called level selection algorithm. In 2005, Palla et al. introduced a new kind of hierarchical clustering method that differs from Ward's method in two ways: it can be used on data for which no full similarity matrix is defined, and it can produce overlapping clusters, i.e., it allows items to belong to several clusters. These features are well suited to biological and chemical data sets, but until now no level selection algorithm has been published for this method.
RESULTS: In this article we provide a general selection scheme, the level-independent clustering selection method, called LInCS. With it, clusters can be selected from any level in quadratic time with respect to the number of clusters. Since hierarchically clustered data is not necessarily associated with a similarity measure, the selection is based on a graph-theoretic notion of cohesive clusters. We present results of our method on two data sets: a set of drug-like molecules and a set of protein-protein interaction (PPI) data. In both cases the method provides a clustering with very good sensitivity and specificity values with respect to a given reference clustering. Moreover, we show for the PPI data set that our graph-theoretic cohesiveness measure in most cases selects biologically homogeneous clusters and disregards inhomogeneous ones. Finally, we discuss how the method can be generalized to other hierarchical clustering methods to allow for level-independent cluster selection.
CONCLUSION: Combining our new cluster selection method with the method of Palla et al. yields an interesting new clustering mechanism that can compute overlapping clusters, which is especially valuable for biological and chemical data sets.
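
ILLUSTRATION: The abstract does not spell out how the method of Palla et al. produces a hierarchy of overlapping clusters. The sketch below is one way to picture it, assuming the usual description of clique percolation: two k-cliques are adjacent when they share k-1 nodes, and a cluster is the union of a chain of adjacent k-cliques. Each value of k then gives one level of the hierarchy. The function name `k_clique_communities`, the brute-force search, and the toy graph are illustrative assumptions, not code from the article, and the sketch does not implement LInCS itself.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Brute-force k-clique percolation; only suitable for tiny graphs."""
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nodes = sorted(adj)

    # Enumerate every k-clique: a k-subset whose nodes are pairwise adjacent.
    cliques = [
        frozenset(c)
        for c in combinations(nodes, k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques; merge cliques that share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # A cluster is the union of the nodes of all cliques in one component.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())


# Toy graph (hypothetical): two triangles that share the node "c".
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"), ("d", "e"), ("c", "e")]

level_2 = k_clique_communities(edges, 2)  # one cluster with all five nodes
level_3 = k_clique_communities(edges, 3)  # two clusters overlapping in "c"
```

In this toy graph the level k = 2 yields a single cluster, while k = 3 splits it into two clusters that both contain "c". A level-independent selection scheme in the sense of the abstract would be free to pick clusters from different values of k rather than committing to one level.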

Year:  2009        PMID: 19840391      PMCID: PMC2774311          DOI: 10.1186/1748-7188-4-12

Source DB:  PubMed          Journal:  Algorithms Mol Biol        ISSN: 1748-7188            Impact factor:   1.405


  15 in total

1.  Protein interactions: two methods for assessment of the reliability of high throughput observations.

Authors:  Charlotte M Deane; Łukasz Salwiński; Ioannis Xenarios; David Eisenberg
Journal:  Mol Cell Proteomics       Date:  2002-05       Impact factor: 5.911

Review 2.  Community structure in social and biological networks.

Authors:  M Girvan; M E J Newman
Journal:  Proc Natl Acad Sci U S A       Date:  2002-06-11       Impact factor: 11.205

3.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

4.  ZINC--a free database of commercially available compounds for virtual screening.

Authors:  John J Irwin; Brian K Shoichet
Journal:  J Chem Inf Model       Date:  2005 Jan-Feb       Impact factor: 4.956

5.  Uncovering the overlapping community structure of complex networks in nature and society.

Authors:  Gergely Palla; Imre Derényi; Illés Farkas; Tamás Vicsek
Journal:  Nature       Date:  2005-06-09       Impact factor: 49.962

6.  BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks.

Authors:  Steven Maere; Karel Heymans; Martin Kuiper
Journal:  Bioinformatics       Date:  2005-06-21       Impact factor: 6.937

Review 7.  Molecular similarity and diversity in chemoinformatics: from theory to applications.

Authors:  Ana G Maldonado; J P Doucet; Michel Petitjean; Bo-Tao Fan
Journal:  Mol Divers       Date:  2006-02       Impact factor: 2.943

8.  Identification of functional modules in a PPI network by clique percolation clustering.

Authors:  Shihua Zhang; Xuemei Ning; Xiang-Sun Zhang
Journal:  Comput Biol Chem       Date:  2006-11-13       Impact factor: 2.877

9.  CFinder: locating cliques and overlapping modules in biological networks.

Authors:  Balázs Adamcsek; Gergely Palla; Illés J Farkas; Imre Derényi; Tamás Vicsek
Journal:  Bioinformatics       Date:  2006-02-10       Impact factor: 6.937

10.  Diagnostic tests. 1: Sensitivity and specificity.

Authors:  D G Altman; J M Bland
Journal:  BMJ       Date:  1994-06-11
  6 in total

1.  Modulation of Triple Artemisinin-Based Combination Therapy Pharmacodynamics by Plasmodium falciparum Genotype.

Authors:  Megan R Ansbro; Zina Itkin; Lu Chen; Gergely Zahoranszky-Kohalmi; Chanaki Amaratunga; Olivo Miotto; Tyler Peryea; Charlotte V Hobbs; Seila Suon; Juliana M Sá; Arjen M Dondorp; Rob W van der Pluijm; Thomas E Wellems; Anton Simeonov; Richard T Eastman
Journal:  ACS Pharmacol Transl Sci       Date:  2020-11-02

2.  Clustering of High Throughput Gene Expression Data.

Authors:  Harun Pirim; Burak Ekşioğlu; Andy Perkins; Cetin Yüceer
Journal:  Comput Oper Res       Date:  2012-12       Impact factor: 4.008

3.  Specialization and utilization after hepatectomy in academic medical centers.

Authors:  Joshua J Shaw; Heena P Santry; Shimul A Shah
Journal:  J Surg Res       Date:  2013-05-21       Impact factor: 2.192

4.  Impact of similarity threshold on the topology of molecular similarity networks and clustering outcomes.

Authors:  Gergely Zahoránszky-Kőhalmi; Cristian G Bologa; Tudor I Oprea
Journal:  J Cheminform       Date:  2016-03-30       Impact factor: 5.514

5.  SmartGraph: a network pharmacology investigation platform.

Authors:  Gergely Zahoránszky-Kőhalmi; Timothy Sheils; Tudor I Oprea
Journal:  J Cheminform       Date:  2020-01-21       Impact factor: 5.514

6.  A network-based method to assess the statistical significance of mild co-regulation effects.

Authors:  Emőke-Ágnes Horvát; Jitao David Zhang; Stefan Uhlmann; Özgür Sahin; Katharina Anna Zweig
Journal:  PLoS One       Date:  2013-09-09       Impact factor: 3.240
