
Clustering by soft-constraint affinity propagation: applications to gene-expression data.

Michele Leone, Martin Weigt.

Abstract

MOTIVATION: Similarity-measure-based clustering is a crucial problem appearing throughout scientific data analysis. Recently, a powerful new algorithm called Affinity Propagation (AP), based on message-passing techniques, was proposed by Frey and Dueck (2007a). In AP, each cluster is identified by a common exemplar to which all other data points of the same cluster refer, and exemplars have to refer to themselves. Despite its proven power, AP in its present form suffers from a number of drawbacks. The hard constraint of having exactly one exemplar per cluster restricts AP to classes of regularly shaped clusters and leads to suboptimal performance, e.g. in analyzing gene expression data.
RESULTS: This limitation can be overcome by relaxing the AP hard constraints. A new parameter controls the importance of the constraints compared to the aim of maximizing the overall similarity, and allows interpolation between the simple case where each data point selects its closest neighbor as an exemplar and the original AP. The resulting soft-constraint affinity propagation (SCAP) is more informative and more accurate, and leads to more stable clustering. Even though a new a priori free parameter is introduced, the overall dependence of the algorithm on external tuning is reduced, as robustness is increased and an optimal strategy for parameter selection emerges more naturally. SCAP is tested on biological benchmark data, including in particular microarray data related to various cancer types. We show that the algorithm efficiently unveils the hierarchical cluster structure present in the data sets. Furthermore, it allows sparse gene expression signatures to be extracted for each cluster.
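
The interpolation described above can be illustrated with a small brute-force sketch. It is not the message-passing algorithm of the paper and not necessarily the authors' exact objective. It assumes a toy objective: the total similarity of each point to its chosen exemplar, minus a penalty p for every exemplar that is chosen by another point but does not choose itself. The names soft_objective and best_assignment, the three 1-D points, and the self-similarity ("preference") value of -4.0 are all hypothetical choices made for illustration.

```python
from itertools import product

# Hypothetical toy data: three points on a line.
points = [0.0, 1.0, 2.5]
PREFERENCE = -4.0  # assumed self-similarity S[i][i]; not taken from the paper

# Similarity = negative squared distance; the diagonal holds the preference.
S = [[PREFERENCE if i == j else -(a - b) ** 2
      for j, b in enumerate(points)]
     for i, a in enumerate(points)]


def soft_objective(S, c, p):
    """Total similarity minus p per violated exemplar constraint.

    c[i] is the exemplar chosen by point i. A constraint is violated when
    some other point chooses k as its exemplar but k does not choose itself.
    """
    n = len(c)
    total = sum(S[i][c[i]] for i in range(n))
    chosen_by_others = {c[i] for i in range(n) if c[i] != i}
    violations = sum(1 for k in chosen_by_others if c[k] != k)
    return total - p * violations


def best_assignment(S, p):
    """Exhaustively search all n**n assignments (only viable for tiny n)."""
    n = len(S)
    return max(product(range(n), repeat=n),
               key=lambda c: soft_objective(S, c, p))


# p = 0: constraints are ignored, so each point picks its closest neighbor.
print(best_assignment(S, 0.0))
# Large p: constraints are effectively hard, as in the original AP setting.
print(best_assignment(S, 100.0))
```

With p = 0 every point simply picks its most similar partner, and two points may pick each other. With a large p, only self-consistent assignments remain competitive, and on this toy data the middle point becomes the single exemplar. Intermediate values of p trade similarity against constraint violations.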

Year: 2007; PMID: 17895277; DOI: 10.1093/bioinformatics/btm414

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937
