Shrinkage-based similarity metric for cluster analysis of microarray data.

Vera Cherepinsky, Jiawu Feng, Marc Rejali, Bud Mishra.

Abstract

The current standard correlation coefficient used in the analysis of microarray data was introduced by M. B. Eisen, P. T. Spellman, P. O. Brown, and D. Botstein [(1998) Proc. Natl. Acad. Sci. USA 95, 14863-14868]. Its formulation is rather arbitrary. We give a mathematically rigorous correlation coefficient of two data vectors based on James-Stein shrinkage estimators. We use the assumptions described by Eisen et al., also using the fact that the data can be treated as transformed into normal distributions. While Eisen et al. use zero as an estimator for the expression vector mean μ, we start with the assumption that for each gene, μ is itself a zero-mean normal random variable [with a priori distribution N(0, τ²)], and use Bayesian analysis to obtain a posteriori distribution of μ in terms of the data. The shrunk estimator for μ differs from the mean of the data vectors and ultimately leads to a statistically robust estimator for correlation coefficients. To evaluate the effectiveness of shrinkage, we conducted in silico experiments and also compared similarity metrics on a biological example by using the data set from Eisen et al. For the latter, we classified genes involved in the regulation of yeast cell-cycle functions by computing clusters based on various definitions of correlation coefficients and contrasting them against clusters based on the activators known in the literature. The estimated false positives and false negatives from this study indicate that using the shrinkage metric improves the accuracy of the analysis.
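The abstract gives no formulas, so the following Python sketch only illustrates the general idea and is not the authors' implementation. It assumes a correlation in which each vector is offset by a fraction `gamma` of its own sample mean: `gamma = 0` corresponds to using zero as the estimator of the mean (as attributed to Eisen et al. above), `gamma = 1` gives the ordinary Pearson correlation, and values in between shrink the mean toward zero. The helper `shrinkage_factor` uses the standard posterior-mean weight for a normal prior N(0, τ²) with a known per-observation noise variance; the names `offset_correlation`, `shrinkage_factor`, `tau2`, and `beta2` are hypothetical. How the paper estimates the amount of shrinkage from the data is not stated in the abstract and is not reproduced here.

```python
import numpy as np


def shrinkage_factor(tau2, beta2, n):
    """Weight on the sample mean in the posterior mean of mu.

    Assumes mu ~ N(0, tau2) and n observations x_i | mu ~ N(mu, beta2),
    with tau2 and beta2 treated as known (an assumption of this sketch).
    """
    noise = beta2 / n  # variance of the sample mean given mu
    return tau2 / (tau2 + noise)


def offset_correlation(x, y, gamma):
    """Correlation of x and y after subtracting gamma * mean from each.

    gamma = 1.0 -> Pearson correlation (offset is the sample mean)
    gamma = 0.0 -> uncentered correlation (offset is zero)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - gamma * x.mean()  # shrunk-mean offset for x
    dy = y - gamma * y.mean()  # shrunk-mean offset for y
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2)))


if __name__ == "__main__":
    x = [1.0, 2.0, 4.0, 3.0]
    y = [2.0, 1.0, 5.0, 4.0]
    g = shrinkage_factor(tau2=1.0, beta2=1.0, n=len(x))
    print("gamma:", g)
    print("zero offset:", offset_correlation(x, y, 0.0))
    print("shrunk offset:", offset_correlation(x, y, g))
    print("Pearson:", offset_correlation(x, y, 1.0))
```

Passing `gamma` in as a parameter keeps the two limiting cases easy to check against each other; a vector equal to its own offset everywhere would give a zero denominator, which this sketch does not handle.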

Year:  2003        PMID: 12902543      PMCID: PMC187810          DOI: 10.1073/pnas.1633770100

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


  6 in total

1.  Serial regulation of transcriptional regulators in the yeast cell cycle.

Authors:  I Simon; J Barnett; N Hannett; C T Harbison; N J Rinaldi; T L Volkert; J J Wyrick; J Zeitlinger; D K Gifford; T S Jaakkola; R A Young
Journal:  Cell       Date:  2001-09-21       Impact factor: 41.582

2.  The transcriptional program of sporulation in budding yeast.

Authors:  S Chu; J DeRisi; M Eisen; J Mulholland; D Botstein; P O Brown; I Herskowitz
Journal:  Science       Date:  1998-10-23       Impact factor: 47.728

3.  Parallel human genome analysis: microarray-based expression monitoring of 1000 genes.

Authors:  M Schena; D Shalon; R Heller; A Chai; P O Brown; R W Davis
Journal:  Proc Natl Acad Sci U S A       Date:  1996-10-01       Impact factor: 11.205

4.  Exploring the metabolic and genetic control of gene expression on a genomic scale.

Authors:  J L DeRisi; V R Iyer; P O Brown
Journal:  Science       Date:  1997-10-24       Impact factor: 47.728

5.  Cluster analysis and display of genome-wide expression patterns.

Authors:  M B Eisen; P T Spellman; P O Brown; D Botstein
Journal:  Proc Natl Acad Sci U S A       Date:  1998-12-08       Impact factor: 11.205

6.  Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization.

Authors:  P T Spellman; G Sherlock; M Q Zhang; V R Iyer; K Anders; M B Eisen; P O Brown; D Botstein; B Futcher
Journal:  Mol Biol Cell       Date:  1998-12       Impact factor: 4.138

