Statistical inference for simultaneous clustering of gene expression data.

Katherine S Pollard, Mark J van der Laan.

Abstract

Current methods for analysis of gene expression data are mostly based on clustering and classification of either genes or samples. We offer support for the idea that more complex patterns can be identified in the data if genes and samples are considered simultaneously. We formalize the approach and propose a statistical framework for two-way clustering. A simultaneous clustering parameter is defined as a function theta = Phi(P) of the true data generating distribution P, and an estimate is obtained by applying this function to the empirical distribution P_n. We illustrate that a wide range of clustering procedures, including generalized hierarchical methods, can be defined as parameters which are compositions of individual mappings for clustering patients and genes. This framework allows one to assess classical properties of clustering methods, such as consistency, and to formally study statistical inference regarding the clustering parameter. We present results of simulations designed to assess the asymptotic validity of different bootstrap methods for estimating the distribution of Phi(P_n). The method is illustrated on a publicly available data set.
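
The plug-in and bootstrap ideas in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it clusters genes only (not genes and samples simultaneously), uses an exhaustive k-medoid search as a stand-in for the clustering map Phi, and uses the nonparametric bootstrap (resampling samples with replacement). The function names, the toy data, and the co-membership summary are all hypothetical choices made for this example.

```python
import itertools
import math
import random


def gene_distance(data, g, h):
    """Euclidean distance between the expression profiles of genes g and h."""
    return math.sqrt(sum((row[g] - row[h]) ** 2 for row in data))


def cluster_genes(data, k=2):
    """Plug-in estimate: apply the clustering map to the empirical distribution.

    Assumption: the map picks the k medoid genes minimising the total distance
    of every gene to its nearest medoid (exhaustive search, small p only) and
    labels each gene by the index of its nearest medoid.
    """
    p = len(data[0])
    dist = [[gene_distance(data, g, h) for h in range(p)] for g in range(p)]
    best = min(
        itertools.combinations(range(p), k),
        key=lambda meds: sum(min(dist[g][m] for m in meds) for g in range(p)),
    )
    return [min(best, key=lambda m: dist[g][m]) for g in range(p)]


def bootstrap_comembership(data, n_boot=200, k=2, seed=0):
    """Proportion of bootstrap replicates in which each gene pair co-clusters.

    Samples (rows) are resampled with replacement and the same clustering map
    is reapplied. Pairwise co-membership avoids matching labels across
    replicates.
    """
    rng = random.Random(seed)
    n, p = len(data), len(data[0])
    counts = [[0] * p for _ in range(p)]
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        labels = cluster_genes(resample, k)
        for g in range(p):
            for h in range(p):
                counts[g][h] += labels[g] == labels[h]
    return [[c / n_boot for c in row] for row in counts]


# Hypothetical toy data: 8 samples x 4 genes; genes 0-1 and genes 2-3
# form two well-separated groups.
toy = [[5 + 0.1 * i, 5 - 0.1 * i, -5 + 0.1 * i, -5 - 0.1 * i] for i in range(8)]
estimate = cluster_genes(toy)
stability = bootstrap_comembership(toy, n_boot=100)
```

Here `estimate` plays the role of the plug-in estimate, and `stability[g][h]` is a simple bootstrap summary of how reliably genes `g` and `h` are placed in the same cluster. On real expression data the groups would be far less separated, and the proportions would fall strictly between 0 and 1.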

Year: 2002    PMID: 11867086    DOI: 10.1016/s0025-5564(01)00116-x

Source DB: PubMed    Journal: Math Biosci    ISSN: 0025-5564    Impact factor: 2.144


  3 in total

1.  Model validation for gene selection and regulation maps.

Authors:  Enrico Capobianco
Journal:  Funct Integr Genomics       Date:  2007-12-07       Impact factor: 3.410

2.  Dating phylogenies with hybrid local molecular clocks.

Authors:  Stéphane Aris-Brosou
Journal:  PLoS One       Date:  2007-09-12       Impact factor: 3.240

3.  Global analysis of patterns of gene expression during Drosophila embryogenesis.

Authors:  Pavel Tomancak; Benjamin P Berman; Amy Beaton; Richard Weiszmann; Elaine Kwan; Volker Hartenstein; Susan E Celniker; Gerald M Rubin
Journal:  Genome Biol       Date:  2007       Impact factor: 13.583
