A mixture model with random-effects components for clustering correlated gene-expression profiles.

S K Ng, G J McLachlan, K Wang, L Ben-Tovim Jones, S-W Ng.

Abstract

MOTIVATION: The clustering of gene profiles across some experimental conditions of interest contributes significantly to the elucidation of unknown gene function, the validation of gene discoveries and the interpretation of biological processes. However, this clustering problem is not straightforward as the profiles of the genes are not all independently distributed and the expression levels may have been obtained from an experimental design involving replicated arrays. Ignoring the dependence between the gene profiles and the structure of the replicated data can result in important sources of variability in the experiments being overlooked in the analysis, with the consequent possibility of misleading inferences being made. We propose a random-effects model that provides a unified approach to the clustering of genes with correlated expression levels measured in a wide variety of experimental situations. Our model is an extension of the normal mixture model to account for the correlations between the gene profiles and to enable covariate information to be incorporated into the clustering process. Hence the model is applicable to longitudinal studies with or without replication, for example, time-course experiments by using time as a covariate, and to cross-sectional experiments by using categorical covariates to represent the different experimental classes.
RESULTS: We show that our random-effects model can be fitted by maximum likelihood via the EM algorithm, for which the E (expectation) and M (maximization) steps can be implemented in closed form. Hence our model can be fitted deterministically without the need for time-consuming Monte Carlo approximations. The effectiveness of our model-based procedure for the clustering of correlated gene profiles is demonstrated on three real datasets, representing typical microarray experimental designs, covering time-course, repeated-measurement and cross-sectional data. In these examples, relevant clusters of the genes are obtained, which are supported by existing gene-function annotation. A synthetic dataset is also considered.
AVAILABILITY: A Fortran program called EMMIX-WIRE (EM-based MIXture analysis WIth Random Effects) is available on request from the corresponding author.
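
For readers unfamiliar with closed-form EM, the sketch below may help. It fits an ordinary normal mixture with diagonal covariances, which is the base model the abstract says is being extended. It is not the authors' random-effects model and it is not EMMIX-WIRE (a Fortran program that is not reproduced here). The function name `em_normal_mixture`, its parameters and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def em_normal_mixture(y, g, n_iter=100, seed=0):
    """Fit a g-component normal mixture (diagonal covariances) by EM.

    Illustrative only: a plain mixture without random effects or covariates.
    y is an (n, p) array of n profiles measured under p conditions.
    Returns mixing proportions, means, variances and posterior probabilities.
    """
    rng = np.random.default_rng(seed)
    n, p = y.shape
    # Initialise means at randomly chosen profiles, with equal proportions.
    mu = y[rng.choice(n, size=g, replace=False)].copy()
    var = np.tile(y.var(axis=0) + 1e-6, (g, 1))
    pi = np.full(g, 1.0 / g)
    for _ in range(n_iter):
        # E-step (closed form): tau[j, i] is the posterior probability
        # that profile j belongs to component i.
        diff = y[:, None, :] - mu[None, :, :]
        log_dens = -0.5 * (np.log(2 * np.pi * var)[None]
                           + diff ** 2 / var[None]).sum(axis=2)
        log_w = np.log(pi)[None, :] + log_dens
        log_w -= log_w.max(axis=1, keepdims=True)  # numerical stability
        tau = np.exp(log_w)
        tau /= tau.sum(axis=1, keepdims=True)
        # M-step (closed form): weighted proportions, means and variances.
        nk = tau.sum(axis=0) + 1e-12
        pi = nk / n
        mu = (tau.T @ y) / nk[:, None]
        diff = y[:, None, :] - mu[None, :, :]
        var = (tau[:, :, None] * diff ** 2).sum(axis=0) / nk[:, None] + 1e-6
    return pi, mu, var, tau

# Usage on simulated (not real) data: two groups of 3-condition profiles.
rng = np.random.default_rng(1)
y = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(5, 1, (50, 3))])
pi, mu, var, tau = em_normal_mixture(y, g=2)
labels = tau.argmax(axis=1)  # assign each profile to its most probable cluster
```

Every update here is an explicit formula, so no Monte Carlo sampling is needed. This is the property the abstract reports for the extended model as well, although the updates for the random-effects components are more involved than those shown.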

Year:  2006        PMID: 16675467     DOI: 10.1093/bioinformatics/btl165

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  37 in total

1.  Wavelet-based functional clustering for patterns of high-dimensional dynamic gene expression.

Authors:  Bong-Rae Kim; Timothy McMurry; Wei Zhao; Rongling Wu; Arthur Berg
Journal:  J Comput Biol       Date:  2010-08       Impact factor: 1.479

2.  A computational approach to the functional clustering of periodic gene-expression profiles.

Authors:  Bong-Rae Kim; Li Zhang; Arthur Berg; Jianqing Fan; Rongling Wu
Journal:  Genetics       Date:  2008-09-09       Impact factor: 4.562

3.  Bayesian model-based tight clustering for time course data.

Authors:  Yongsung Joo; G Casella; J Hobert
Journal:  Comput Stat       Date:  2010-03       Impact factor: 1.000

4.  Repeated measures regression mixture models.

Authors:  Minjung Kim; M Lee Van Horn; Thomas Jaki; Jeroen Vermunt; Daniel Feaster; Kenneth L Lichstein; Daniel J Taylor; Brant W Riedel; Andrew J Bush
Journal:  Behav Res Methods       Date:  2020-04

5.  Whole-Volume Clustering of Time Series Data from Zebrafish Brain Calcium Images via Mixture Modeling.

Authors:  Hien D Nguyen; Jeremy F P Ullmann; Geoffrey J McLachlan; Venkatakaushik Voleti; Wenze Li; Elizabeth M C Hillman; David C Reutens; Andrew L Janke
Journal:  Stat Anal Data Min       Date:  2017-12-06       Impact factor: 1.051

6.  Importance of replication in analyzing time-series gene expression data: corticosteroid dynamics and circadian patterns in rat liver.

Authors:  Tung T Nguyen; Richard R Almon; Debra C DuBois; William J Jusko; Ioannis P Androulakis
Journal:  BMC Bioinformatics       Date:  2010-05-26       Impact factor: 3.169

7.  A temporal precedence based clustering method for gene expression microarray data.

Authors:  Ritesh Krishna; Chang-Tsun Li; Vicky Buchanan-Wollaston
Journal:  BMC Bioinformatics       Date:  2010-01-30       Impact factor: 3.169

8.  Functional clustering of periodic transcriptional profiles through ARMA(p,q).

Authors:  Ning Li; Timothy McMurry; Arthur Berg; Zhong Wang; Scott A Berceli; Rongling Wu
Journal:  PLoS One       Date:  2010-04-16       Impact factor: 3.240

9.  AutoClass@IJM: a powerful tool for Bayesian classification of heterogeneous data in biology.

Authors:  Fiona Achcar; Jean-Michel Camadro; Denis Mestivier
Journal:  Nucleic Acids Res       Date:  2009-05-27       Impact factor: 16.971

10.  The bimodality index: a criterion for discovering and ranking bimodal signatures from cancer gene expression profiling data.

Authors:  Jing Wang; Sijin Wen; W Fraser Symmans; Lajos Pusztai; Kevin R Coombes
Journal:  Cancer Inform       Date:  2009-08-05