A mixture model-based approach to the clustering of microarray expression data.

G J McLachlan, R W Bean, D Peel.

Abstract

MOTIVATION: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes.
RESULTS: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.
AVAILABILITY: EMMIX-GENE is available at http://www.maths.uq.edu.au/~gjm/emmix-gene/
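
The gene-selection step described under MOTIVATION can be illustrated with a minimal sketch. It is not the EMMIX-GENE implementation: it substitutes normal components for the t components used in the paper, starts EM from a simple median split, and reads the cluster-size threshold as a minimum size for the smaller cluster. The function names (`lr_statistic`, `select_genes`) and the parameters `lr_threshold`, `min_cluster_size`, and `var_floor` are hypothetical and chosen only for illustration.

```python
import math


def normal_logpdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def mean_and_var(values, var_floor):
    """Sample mean and maximum-likelihood variance, floored for stability."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, var_floor)


def _joint_logs(x, weights, means, variances):
    """log(weight_k) + log density of x under component k, for k = 0, 1."""
    return [math.log(weights[k]) + normal_logpdf(x, means[k], variances[k])
            for k in range(2)]


def _logsumexp(logs):
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))


def fit_two_components(values, n_iter=200, var_floor=1e-3):
    """EM for a two-component normal mixture (assumption: the paper uses t).

    Needs at least two values. Returns (log-likelihood, size of the
    smaller cluster under hard assignment).
    """
    xs = sorted(values)
    n = len(xs)
    halves = [xs[: n // 2], xs[n // 2:]]          # median-split start
    start = [mean_and_var(h, var_floor) for h in halves]
    means = [p[0] for p in start]
    variances = [p[1] for p in start]
    weights = [len(h) / n for h in halves]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in xs:
            logs = _joint_logs(x, weights, means, variances)
            total = _logsumexp(logs)
            resp.append([math.exp(v - total) for v in logs])
        # M-step: update weights, means, and variances.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs))
            variances[k] = max(spread / nk, var_floor)
    loglik, sizes = 0.0, [0, 0]
    for x in xs:
        logs = _joint_logs(x, weights, means, variances)
        loglik += _logsumexp(logs)
        sizes[0 if logs[0] >= logs[1] else 1] += 1
    return loglik, min(sizes)


def lr_statistic(values, var_floor=1e-3):
    """-2 log(lambda) for one versus two components, plus smaller cluster size."""
    mean, var = mean_and_var(values, var_floor)
    loglik_one = sum(normal_logpdf(v, mean, var) for v in values)
    loglik_two, smallest = fit_two_components(values, var_floor=var_floor)
    return max(0.0, 2.0 * (loglik_two - loglik_one)), smallest


def select_genes(expression, lr_threshold, min_cluster_size):
    """Rank genes by the statistic (largest first) and apply both thresholds.

    `expression` maps a gene name to its expression values across tissues.
    """
    scored = [(name,) + lr_statistic(vals) for name, vals in expression.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [name for name, stat, smallest in scored
            if stat >= lr_threshold and smallest >= min_cluster_size]
```

Calling `select_genes(expression, lr_threshold, min_cluster_size)` on a dictionary of per-gene expression vectors returns the retained gene names, ordered from the largest statistic down. Both thresholds are left to the caller; the abstract does not state the values used.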

Year:  2002        PMID: 11934740     DOI: 10.1093/bioinformatics/18.3.413

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  96 in total

1.  Assessment of reliability of microarray data and estimation of signal thresholds using mixture modeling.

Authors:  Musa H Asyali; Mohamed M Shoukri; Omer Demirkaya; Khalid S A Khabar
Journal:  Nucleic Acids Res       Date:  2004-04-27       Impact factor: 16.971

2.  Wavelet-based functional clustering for patterns of high-dimensional dynamic gene expression.

Authors:  Bong-Rae Kim; Timothy McMurry; Wei Zhao; Rongling Wu; Arthur Berg
Journal:  J Comput Biol       Date:  2010-08       Impact factor: 1.479

3.  Introducing knowledge into differential expression analysis.

Authors:  Ewa Szczurek; Przemysław Biecek; Jerzy Tiuryn; Martin Vingron
Journal:  J Comput Biol       Date:  2010-08       Impact factor: 1.479

4.  A simple Bayesian mixture model with a hybrid procedure for genome-wide association studies.

Authors:  Yu-Chung Wei; Shu-Hui Wen; Pei-Chun Chen; Chih-Hao Wang; Chuhsing K Hsiao
Journal:  Eur J Hum Genet       Date:  2010-04-21       Impact factor: 4.246

5.  Context-specific infinite mixtures for clustering gene expression profiles across diverse microarray dataset.

Authors:  X Liu; S Sivaganesan; K Y Yeung; J Guo; R E Bumgarner; Mario Medvedovic
Journal:  Bioinformatics       Date:  2006-05-18       Impact factor: 6.937

6.  Differential gene expression of wheat progeny with contrasting levels of transpiration efficiency.

Authors:  Gang-Ping Xue; C Lynne McIntyre; Scott Chapman; Neil I Bower; Heather Way; Antonio Reverter; Bryan Clarke; Ray Shorter
Journal:  Plant Mol Biol       Date:  2006-08       Impact factor: 4.076

7.  Analysis of time-series gene expression data: methods, challenges, and opportunities.

Authors:  I P Androulakis; E Yang; R R Almon
Journal:  Annu Rev Biomed Eng       Date:  2007       Impact factor: 9.590

8.  The t-mixture model approach for detecting differentially expressed genes in microarrays.

Authors:  Shuo Jiao; Shunpu Zhang
Journal:  Funct Integr Genomics       Date:  2008-01-22       Impact factor: 3.410

9.  The wisdom of the commons: ensemble tree classifiers for prostate cancer prognosis.

Authors:  James A Koziol; Anne C Feng; Zhenyu Jia; Yipeng Wang; Seven Goodison; Michael McClelland; Dan Mercola
Journal:  Bioinformatics       Date:  2008-07-15       Impact factor: 6.937

10.  Penalized mixtures of factor analyzers with application to clustering high-dimensional microarray data.

Authors:  Benhuai Xie; Wei Pan; Xiaotong Shen
Journal:  Bioinformatics       Date:  2009-12-23       Impact factor: 6.937
