
Variance component score test for time-course gene set analysis of longitudinal RNA-seq data.

Denis Agniel, Boris P Hejblum.

Abstract

As gene expression measurement technology is shifting from microarrays to sequencing, the statistical tools available for their analysis must be adapted since RNA-seq data are measured as counts. It has been proposed to model RNA-seq counts as continuous variables using nonparametric regression to account for their inherent heteroscedasticity. In this vein, we propose tcgsaseq, a principled, model-free, and efficient method for detecting longitudinal changes in RNA-seq gene sets defined a priori. The method identifies those gene sets whose expression varies over time, based on an original variance component score test accounting for both covariates and heteroscedasticity without assuming any specific parametric distribution for the (transformed) counts. We demonstrate that despite the presence of a nonparametric component, our test statistic has a simple form and limiting distribution, and both may be computed quickly. A permutation version of the test is additionally proposed for very small sample sizes. Applied to both simulated data and two real datasets, tcgsaseq is shown to exhibit very good statistical properties, with an increase in stability and power when compared to state-of-the-art methods ROAST (rotation gene set testing), edgeR, and DESeq2, which can fail to control the type I error under certain realistic settings. We have made the method available for the community in the R package tcgsaseq.
© The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
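
To illustrate the general idea behind a permutation-based score test for time-varying expression in a gene set, here is a minimal Python sketch. It is not the tcgsaseq implementation: it assumes already log-transformed counts, ignores the heteroscedasticity weights and the within-individual correlation that the paper accounts for, and permutes time labels freely across samples. The function names, the simulated data, and the effect size are all hypothetical.

```python
import numpy as np


def score_statistic(resid, phi):
    """Sum over genes of squared projections of null residuals onto the time basis."""
    n = resid.shape[0]
    proj = phi.T @ resid / np.sqrt(n)  # shape: (n_basis, n_genes)
    return float(np.sum(proj ** 2))


def permutation_gene_set_test(y, x, phi, n_perm=999, seed=0):
    """Toy permutation test for a time effect in a gene set.

    y   : (n_samples, n_genes) transformed expression values
    x   : (n_samples, n_covariates) covariates, including an intercept column
    phi : (n_samples, n_basis) basis of time (here, centered time)
    """
    rng = np.random.default_rng(seed)
    # Fit the null model (covariates only, no time effect) gene by gene.
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ beta
    q_obs = score_statistic(resid, phi)
    n = y.shape[0]
    # Simplifying assumption: samples are exchangeable under the null,
    # so rows of the time basis are shuffled without regard to individuals.
    q_perm = np.array(
        [score_statistic(resid, phi[rng.permutation(n)]) for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(q_perm >= q_obs)) / (1 + n_perm)
    return q_obs, float(p_value)


# Simulated example: 30 samples (6 time points x 5 replicates), 5 genes.
rng = np.random.default_rng(42)
time = np.tile(np.arange(6.0), 5)
x = np.ones((30, 1))                      # intercept-only covariates
phi = (time - time.mean())[:, None]       # centered linear time basis
noise = rng.normal(size=(30, 5))

y_null = noise                            # no time effect
y_alt = noise + 1.0 * time[:, None]       # every gene drifts with time

q_null, p_null = permutation_gene_set_test(y_null, x, phi)
q_alt, p_alt = permutation_gene_set_test(y_alt, x, phi)
print(f"null set: Q={q_null:.2f}, p={p_null:.3f}")
print(f"time-varying set: Q={q_alt:.2f}, p={p_alt:.3f}")
```

For real longitudinal designs, the permutation scheme would need to respect the repeated-measures structure, and the R package tcgsaseq named in the abstract should be preferred over this sketch.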

Keywords:  Gene Set Analysis; Heteroscedasticity; Longitudinal data; RNA-seq data; Variance component testing

Year:  2017        PMID: 28334305      PMCID: PMC5862256          DOI: 10.1093/biostatistics/kxx005

Source DB:  PubMed          Journal:  Biostatistics        ISSN: 1465-4644            Impact factor:   5.899


