Inference with Transposable Data: Modeling the Effects of Row and Column Correlations.

Genevera I Allen, Robert Tibshirani.

Abstract

We consider the problem of large-scale inference on the row or column variables of data in the form of a matrix. Many of these data matrices are transposable, meaning that neither the row variables nor the column variables can be considered independent instances. An example of this scenario is detecting significant genes in microarrays when the samples may be dependent because of latent variables or unknown batch effects. By modeling these matrix data with the matrix-variate normal distribution, we study and quantify the effects of row and column correlations on procedures for large-scale inference. We then propose a simple solution to the myriad problems presented by unanticipated correlations: we simultaneously estimate row and column covariances and use these to sphere, or de-correlate, the noise in the underlying data before conducting inference. This procedure yields data with approximately independent rows and columns, so that test statistics more closely follow their null distributions and multiple testing procedures correctly control the desired error rates. Results on simulated models and real microarray data demonstrate major advantages of this approach: (1) increased statistical power, (2) less bias in estimating the false discovery rate, and (3) reduced variance of the false discovery rate estimators.
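
The sphering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the row and column covariances are assumed known here, whereas the paper estimates them from the data, and the AR(1) covariance structure, matrix sizes, and function names (`inv_sqrt`, `sphere`, `ar1_cov`) are hypothetical choices made only for illustration. Under the matrix-variate normal model with row covariance Σ and column covariance Δ, multiplying the noise matrix by Σ^(-1/2) on the left and Δ^(-1/2) on the right gives entries that are independent standard normal.

```python
import numpy as np


def inv_sqrt(cov):
    """Symmetric inverse square root of a positive-definite matrix."""
    vals, vecs = np.linalg.eigh(cov)
    # V diag(1 / sqrt(lambda)) V^T
    return (vecs / np.sqrt(vals)) @ vecs.T


def sphere(X, row_cov, col_cov):
    """De-correlate rows and columns: row_cov^(-1/2) X col_cov^(-1/2)."""
    return inv_sqrt(row_cov) @ X @ inv_sqrt(col_cov)


def ar1_cov(size, rho):
    """AR(1) covariance, an illustrative assumption (not from the paper)."""
    idx = np.arange(size)
    return rho ** np.abs(idx[:, None] - idx[None, :])


rng = np.random.default_rng(0)
n_rows, n_cols = 200, 20           # e.g. genes x samples (hypothetical sizes)
row_cov = ar1_cov(n_rows, 0.5)     # assumed known; estimated in practice
col_cov = ar1_cov(n_cols, 0.8)

# Simulate matrix-variate normal noise: X = A Z B^T,
# where A A^T = row_cov and B B^T = col_cov.
Z = rng.standard_normal((n_rows, n_cols))
X = np.linalg.cholesky(row_cov) @ Z @ np.linalg.cholesky(col_cov).T

X_sphered = sphere(X, row_cov, col_cov)


def max_offdiag_corr(M):
    """Largest absolute off-diagonal entry of the column correlation matrix."""
    corr = np.corrcoef(M, rowvar=False)
    return np.max(np.abs(corr - np.diag(np.diag(corr))))


print("max |column correlation| before sphering:", max_offdiag_corr(X))
print("max |column correlation| after sphering: ", max_offdiag_corr(X_sphered))
```

With real data the covariances are unknown, so the quality of the sphering depends on how well they are estimated. The sketch only shows why de-correlated data behave more like independent observations.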

Keywords:  covariance estimation; false discovery rate; large-scale inference; matrix-variate normal; multiple testing; transposable regularized covariance models

Year:  2012        PMID: 34880705      PMCID: PMC8649963          DOI: 10.1111/j.1467-9868.2011.01027.x

Source DB:  PubMed          Journal:  J R Stat Soc Series B Stat Methodol        ISSN: 1369-7412            Impact factor:   4.488

