Information-based clustering.

Noam Slonim, Gurinder Singh Atwal, Gasper Tkacik, William Bialek.

Abstract

In an age of increasingly large data sets, investigators in many different disciplines have turned to clustering as a tool for data analysis and exploration. Existing clustering methods, however, typically depend on several nontrivial assumptions about the structure of data. Here, we reformulate the clustering problem from an information theoretic perspective that avoids many of these assumptions. In particular, our formulation obviates the need for defining a cluster "prototype," does not require an a priori similarity metric, is invariant to changes in the representation of the data, and naturally captures nonlinear relations. We apply this approach to different domains and find that it consistently produces clusters that are more coherent than those extracted by existing algorithms. Finally, our approach provides a way of clustering based on collective notions of similarity rather than the traditional pairwise measures.
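Two of the properties named in the abstract, invariance to changes in the representation of the data and sensitivity to nonlinear relations, can be illustrated with a small sketch. The example below assumes mutual information as the information-theoretic measure and uses a simple plug-in estimate on discrete samples; the function names (`mutual_information`, `pearson`) and the toy data are hypothetical and are not taken from the paper, and this is not the authors' clustering algorithm.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))  # joint counts
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def pearson(xs, ys):
    """Pearson correlation coefficient, a purely linear similarity measure."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


# Toy data (assumption): y depends on x deterministically but nonlinearly.
xs = [-2, -1, 0, 1, 2] * 20
ys = [x * x for x in xs]

# An arbitrary invertible relabeling of x, i.e. a change of representation.
relabel = {-2: "a", -1: "b", 0: "c", 1: "d", 2: "e"}
xs_relabeled = [relabel[x] for x in xs]

r = pearson(xs, ys)                                # linear correlation
mi = mutual_information(xs, ys)                    # information in bits
mi_relabeled = mutual_information(xs_relabeled, ys)

print(f"Pearson r         = {r:.3f}")
print(f"I(X;Y)            = {mi:.3f} bits")
print(f"I(relabel(X);Y)   = {mi_relabeled:.3f} bits")
```

In this toy case the linear correlation vanishes even though y is fully determined by x, while the mutual information is positive and does not change when x is relabeled.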

Year:  2005        PMID: 16352721      PMCID: PMC1317937          DOI: 10.1073/pnas.0507432102

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


References (8 in total)

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data.

Authors:  Eran Segal; Michael Shapira; Aviv Regev; Dana Pe'er; David Botstein; Daphne Koller; Nir Friedman
Journal:  Nat Genet       Date:  2003-06       Impact factor: 38.330

3.  Network information and connected correlations.

Authors:  Elad Schneidman; Susanne Still; Michael J Berry; William Bialek
Journal:  Phys Rev Lett       Date:  2003-12-02       Impact factor: 9.161

4.  Open source clustering software.

Authors:  M J L de Hoon; S Imoto; J Nolan; S Miyano
Journal:  Bioinformatics       Date:  2004-02-10       Impact factor: 6.937

5.  Use of logic relationships to decipher protein network organization.

Authors:  Peter M Bowers; Shawn J Cokus; David Eisenberg; Todd O Yeates
Journal:  Science       Date:  2004-12-24       Impact factor: 47.728

6.  Exploring the new world of the genome with DNA microarrays. (Review)

Authors:  P O Brown; D Botstein
Journal:  Nat Genet       Date:  1999-01       Impact factor: 38.330

7.  Genomic expression programs in the response of yeast cells to environmental changes.

Authors:  A P Gasch; P T Spellman; C M Kao; O Carmel-Harel; M B Eisen; G Storz; D Botstein; P O Brown
Journal:  Mol Biol Cell       Date:  2000-12       Impact factor: 4.138

8.  Cluster analysis and display of genome-wide expression patterns.

Authors:  M B Eisen; P T Spellman; P O Brown; D Botstein
Journal:  Proc Natl Acad Sci U S A       Date:  1998-12-08       Impact factor: 11.205

Cited by (67 in total)

1.  Identifying informative subsets of the Gene Ontology with information bottleneck methods.

Authors:  Bo Jin; Xinghua Lu
Journal:  Bioinformatics       Date:  2010-08-11       Impact factor: 6.937

2.  A universal framework for regulatory element discovery across all genomes and data types.

Authors:  Olivier Elemento; Noam Slonim; Saeed Tavazoie
Journal:  Mol Cell       Date:  2007-10-26       Impact factor: 17.970

3.  Probabilistic assembly of human protein interaction networks from label-free quantitative proteomics.

Authors:  Mihaela E Sardiu; Yong Cai; Jingji Jin; Selene K Swanson; Ronald C Conaway; Joan W Conaway; Laurence Florens; Michael P Washburn
Journal:  Proc Natl Acad Sci U S A       Date:  2008-01-24       Impact factor: 11.205

4.  Annealing and the Normalized N-Cut.

Authors:  Tomáš Gedeon; Albert E Parker; Collette Campion; Zane Aldworth
Journal:  Pattern Recognit       Date:  2008-02       Impact factor: 7.740

5.  Information flow and optimization in transcriptional regulation.

Authors:  Gasper Tkacik; Curtis G Callan; William Bialek
Journal:  Proc Natl Acad Sci U S A       Date:  2008-08-21       Impact factor: 11.205

6.  MIST: Maximum Information Spanning Trees for dimension reduction of biological data sets.

Authors:  Bracken M King; Bruce Tidor
Journal:  Bioinformatics       Date:  2009-03-04       Impact factor: 6.937

7.  Evolutionarily conserved coding properties of auditory neurons across grasshopper species.

Authors:  Daniela Neuhofer; Sandra Wohlgemuth; Andreas Stumpner; Bernhard Ronacher
Journal:  Proc Biol Sci       Date:  2008-09-07       Impact factor: 5.349

8.  Coverage of protein domain families with structural protein-protein interactions: current progress and future trends. (Review)

Authors:  Alexander Goncearenco; Benjamin A Shoemaker; Dachuan Zhang; Alexey Sarychev; Anna R Panchenko
Journal:  Prog Biophys Mol Biol       Date:  2014-06-13       Impact factor: 3.667

9.  RiceArrayNet: a database for correlating gene expression from transcriptome profiling, and its application to the analysis of coexpressed genes in rice.

Authors:  Tae-Ho Lee; Yeon-Ki Kim; Thu Thi Minh Pham; Sang Ik Song; Ju-Kon Kim; Kyu Young Kang; Gynheung An; Ki-Hong Jung; David W Galbraith; Minkyun Kim; Ung-Han Yoon; Baek Hie Nahm
Journal:  Plant Physiol       Date:  2009-07-15       Impact factor: 8.340

10.  Functional clustering algorithm for the analysis of dynamic network data.

Authors:  S Feldt; J Waddell; V L Hetrick; J D Berke; M Zochowski
Journal:  Phys Rev E Stat Nonlin Soft Matter Phys       Date:  2009-05-07