Single-gene negative binomial regression models for RNA-Seq data with higher-order asymptotic inference.

Yanming Di

Abstract

We consider negative binomial (NB) regression models for RNA-Seq read counts and investigate an approach in which such models are fitted to each gene separately. In particular, the NB dispersion parameter is estimated from each gene on its own, without assuming commonalities between genes. This single-gene approach contrasts with the more widely used dispersion-modeling approach, in which the NB dispersion is modeled as a simple function of the mean or of other measures of read abundance and is then estimated from a large number of genes combined. We show that, through the use of higher-order asymptotic techniques, inferences with correct type I errors can be made about the regression coefficients in a single-gene NB regression model even when the dispersion is unknown and the sample size is small. The motivations for studying single-gene models include: 1) they provide a basis of reference for understanding and quantifying the power-robustness trade-offs of the dispersion-modeling approach; 2) they are potentially useful in practice if moderate sample sizes become available and diagnostic tools indicate potential problems with simple models of dispersion.
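
As a rough illustration of what fitting an NB model to a single gene involves, the sketch below estimates the dispersion from one gene's counts alone and tests a two-group difference. It is not the method of the paper: it uses an ordinary first-order likelihood ratio test with a chi-square reference distribution, which is the kind of test the higher-order techniques are meant to improve on at small sample sizes. The two-group design, the equal library sizes, the grid search over the dispersion, and all function names are assumptions made for this example.

```python
import math


def nb_logpmf(y, mu, phi):
    """Log NB probability of count y with mean mu and dispersion phi.

    Parametrization (an assumption of this sketch): Var(Y) = mu + phi * mu**2.
    """
    size = 1.0 / phi
    return (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
            + size * math.log(size / (size + mu))
            + y * math.log(mu / (size + mu)))


def profile_loglik(groups, phi):
    """Log-likelihood at dispersion phi with each group's mean at its MLE.

    With equal library sizes and a fixed dispersion, the MLE of a group's
    mean is its sample mean, so only phi remains to be optimized.
    """
    total = 0.0
    for counts in groups:
        mu = max(sum(counts) / len(counts), 1e-8)  # guard against log(0)
        total += sum(nb_logpmf(y, mu, phi) for y in counts)
    return total


def single_gene_lr_test(group_a, group_b):
    """First-order LR test of equal means for one gene.

    The dispersion is estimated from this gene only, by a crude grid
    search over phi in [1e-4, 1e2] (hypothetical bounds).
    """
    grid = [10 ** (-4 + 6 * i / 600) for i in range(601)]
    # Full model: separate means per group, one shared dispersion.
    l1 = max(profile_loglik([group_a, group_b], phi) for phi in grid)
    # Null model: one common mean, one dispersion.
    l0 = max(profile_loglik([group_a + group_b], phi) for phi in grid)
    stat = max(2.0 * (l1 - l0), 0.0)
    # Survival function of chi-square with 1 df: P(|Z| > sqrt(stat)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Made-up counts for one gene in two conditions.
    stat, p = single_gene_lr_test([5, 7, 6, 4], [50, 62, 55, 48])
    print(f"LR statistic = {stat:.2f}, p-value = {p:.3g}")
```

With only a handful of replicates per group, the chi-square approximation used here can be inaccurate, which is the situation the abstract's higher-order asymptotic inference addresses.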

Keywords: Extra-Poisson variation; Higher-order asymptotics; Negative binomial; Overdispersion; Power-robustness; Regression; RNA-Seq. Subject classifications: Primary 62P10; 92D20

Year: 2015. PMID: 28042360. PMCID: PMC5193394. DOI: 10.4310/SII.2015.v8.n4.a1

Source DB: PubMed. Journal: Stat Interface. ISSN: 1938-7989. Impact factor: 0.582

