Clustering microarray gene expression data using weighted Chinese restaurant process.

Zhaohui S Qin

Abstract

MOTIVATION: Clustering microarray gene expression data is a powerful tool for elucidating co-regulatory relationships among genes. Many different clustering techniques have been applied successfully, with promising results. However, the substantial noise in microarray data, the unknown number of clusters, and the complex regulatory mechanisms underlying biological systems make the clustering problem highly challenging.
RESULTS: We devised an improved model-based Bayesian approach to cluster microarray gene expression data. Cluster assignment is carried out by an iterative weighted Chinese restaurant seating scheme, so that the optimal number of clusters is determined simultaneously with cluster assignment. The predictive updating technique was applied to improve the efficiency of the Gibbs sampler. An additional step during reassignment allows genes that display complex correlation relationships, such as time-shifted and/or inverted patterns, to be clustered together. Analysis of a real dataset showed that as many as 30% of the significant genes clustered in the same group display complex relationships with the consensus pattern of the cluster. Other notable features include automatic handling of missing data and quantitative measures of cluster strength and assignment confidence. Synthetic and real microarray gene expression datasets were analyzed to demonstrate the method's performance.
AVAILABILITY: A computer program named Chinese restaurant cluster (CRC) has been developed based on this algorithm. The program can be downloaded at http://www.sph.umich.edu/csg/qin/CRC/.
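
The weighted Chinese restaurant seating scheme mentioned in the abstract can be illustrated with a minimal sketch. This is not the CRC program and not necessarily the author's model. It assumes a simple conjugate setup chosen here for illustration: independent Normal values per condition, a known noise variance `sigma2`, a Normal(0, `tau2`) prior on each cluster mean, and a concentration parameter `alpha`. The function names `log_predictive`, `reseat`, and `cluster` are hypothetical. The time-shifted/inverted step and missing-data handling are omitted.

```python
import math
import random


def log_predictive(x, members, sigma2=1.0, tau2=4.0):
    """Log predictive density of profile x given profiles already in a cluster.

    Assumed model (illustrative, not taken from the paper): each dimension is
    Normal with known variance sigma2 and a Normal(0, tau2) prior on the
    cluster mean, so the cluster mean is integrated out analytically.
    """
    n = len(members)
    total = 0.0
    for d, xd in enumerate(x):
        s = sum(m[d] for m in members)
        mean = tau2 * s / (n * tau2 + sigma2)               # posterior mean
        var = sigma2 * tau2 / (n * tau2 + sigma2) + sigma2  # predictive variance
        total += -0.5 * math.log(2 * math.pi * var) - (xd - mean) ** 2 / (2 * var)
    return total


def reseat(i, data, labels, alpha=1.0, rng=random):
    """One weighted seating step: take gene i out, then sample a table for it."""
    tables = {}
    for j, lab in enumerate(labels):
        if j != i:
            tables.setdefault(lab, []).append(data[j])
    keys = list(tables)
    # Existing table: (number of occupants) x (predictive density of gene i).
    log_w = [math.log(len(tables[k])) + log_predictive(data[i], tables[k])
             for k in keys]
    # New table: alpha x (prior predictive density of gene i).
    keys.append(max(labels) + 1)
    log_w.append(math.log(alpha) + log_predictive(data[i], []))
    top = max(log_w)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(v - top) for v in log_w]
    labels[i] = rng.choices(keys, weights=weights)[0]
    return labels[i]


def cluster(data, sweeps=20, alpha=1.0, seed=0):
    """Run Gibbs sweeps; the number of clusters is not fixed in advance."""
    rng = random.Random(seed)
    labels = [0] * len(data)  # start with every gene at a single table
    for _ in range(sweeps):
        for i in range(len(data)):
            reseat(i, data, labels, alpha, rng)
    return labels
```

Calling `cluster(profiles)` on a list of equal-length expression profiles returns one integer label per gene; the number of distinct labels is the number of occupied tables at the final sweep. Because the cluster mean is integrated out, each seating weight depends only on the current occupants of a table, which is one common way to obtain a collapsed Gibbs sampler of this kind.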

Year: 2006; PMID: 16766561; DOI: 10.1093/bioinformatics/btl284

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937


