Negative Binomial mixed models estimated with the maximum likelihood method can be used for longitudinal RNAseq data.

Roula Tsonaka, Pietro Spitali.

Abstract

Time-course RNAseq experiments, in which tissues are repeatedly collected from the same subjects (e.g. humans or animals) over time or under several different experimental conditions, are becoming more popular as sequencing costs fall. Such designs offer great potential to identify genes that change over time or progress differently over time across experimental groups. Modelling longitudinal gene expression in such time-course RNAseq data is complicated by serial correlations, missing values due to subject dropout or sequencing errors, long follow-up with potentially non-linear progression over time, and a low number of subjects. Negative Binomial mixed models can address all these issues. However, such models under the maximum likelihood (ML) approach are less popular for RNAseq data due to convergence issues (see, e.g. [1]). We argue in this paper that it is the use of an inaccurate numerical integration method, in combination with the typically small sample sizes, that causes such mixed models to fail for a large proportion of tested genes. We show that when the accurate adaptive Gaussian quadrature approach is used to approximate the integrals over the random-effects terms, the model parameters can be estimated successfully with the maximum likelihood method. Moreover, we show that the bootstrap method can be used to preserve the type I error rate in small sample settings. We evaluate empirically the small sample properties of the test statistics and compare them with state-of-the-art approaches. The method is applied to a longitudinal mouse experiment to study the dynamics in Duchenne Muscular Dystrophy.

Contact: s.tsonaka@lumc.nl

Roula Tsonaka is an assistant professor at Medical Statistics, Department of Biomedical Data Sciences, Leiden University Medical Center. Her research focuses on statistical methods for longitudinal omics data.

Pietro Spitali is an assistant professor at the Department of Human Genetics, Leiden University Medical Center. His research focuses on the identification of biomarkers for neuromuscular disorders.
© The Author(s) 2020. Published by Oxford University Press.
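
The adaptive Gaussian quadrature step named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a single gene, a single subject, a Negative Binomial model with a log link and one normally distributed random intercept, and all function names, parameter names and example numbers are hypothetical. The sketch centres and scales Gauss-Hermite nodes at the mode of the integrand and compares the result with brute-force numerical integration.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar
from scipy.special import gammaln, logsumexp


def nb_logpmf(y, mu, size):
    """Log-pmf of a Negative Binomial with mean mu and variance mu + mu^2/size."""
    return (gammaln(y + size) - gammaln(size) - gammaln(y + 1)
            + size * np.log(size / (size + mu))
            + y * np.log(mu / (size + mu)))


def log_integrand(b, y, eta, size, sigma):
    """log[ prod_j NB(y_j | exp(eta_j + b)) * N(b | 0, sigma^2) ] for one subject."""
    mu = np.exp(eta + b)
    log_prior = -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * (b / sigma) ** 2
    return nb_logpmf(y, mu, size).sum() + log_prior


def agq_loglik(y, eta, size, sigma, n_nodes=11):
    """Subject-level marginal log-likelihood via adaptive Gauss-Hermite quadrature."""
    # 1. Locate the mode of the integrand in the random intercept b.
    b_hat = minimize_scalar(
        lambda b: -log_integrand(b, y, eta, size, sigma)
    ).x

    # 2. Curvature at the mode (analytic second derivative for the log link).
    mu = np.exp(eta + b_hat)
    d2 = -np.sum((y + size) * size * mu / (size + mu) ** 2) - 1.0 / sigma**2
    s = 1.0 / np.sqrt(-d2)

    # 3. Gauss-Hermite nodes/weights for weight function exp(-x^2),
    #    shifted to b_hat and scaled by sqrt(2) * s.
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    vals = np.array(
        [log_integrand(b_hat + np.sqrt(2) * s * xk, y, eta, size, sigma)
         for xk in x]
    )

    # 4. Integral ~ sqrt(2) * s * sum_k w_k * exp(x_k^2) * g(b_k), on the log scale.
    return np.log(np.sqrt(2) * s) + logsumexp(vals + x**2, b=w)


# Hypothetical example: four repeated counts for one subject.
y = np.array([3, 7, 5, 12])
eta = np.log(np.array([4.0, 6.0, 6.0, 9.0]))  # fixed-effects linear predictor
size, sigma = 5.0, 0.5                         # NB size and random-intercept SD

ll_agq = agq_loglik(y, eta, size, sigma)

# Reference value by brute-force integration over a wide interval.
ref, _ = quad(
    lambda b: np.exp(log_integrand(b, y, eta, size, sigma)),
    -5, 5, epsabs=0, epsrel=1e-10,
)
ll_ref = np.log(ref)
print(ll_agq, ll_ref)
```

In a full model fit, the subject-level values would be summed over subjects and maximised over the fixed effects, the dispersion and the random-effects variance; that outer optimisation is omitted here.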

Keywords:  Adaptive Gaussian quadrature integration; Bootstrap; Negative Binomial mixed effects model; Random effects models

Year:  2021        PMID: 33152752      PMCID: PMC8293834          DOI: 10.1093/bib/bbaa264

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


  11 in total

1.  Use of within-array replicate spots for assessing differential expression in microarray experiments.

Authors:  Gordon K Smyth; Joëlle Michaud; Hamish S Scott
Journal:  Bioinformatics       Date:  2005-01-18       Impact factor: 6.937

2.  What if we ignore the random effects when analyzing RNA-seq data in a multifactor experiment.

Authors:  Shiqi Cui; Tieming Ji; Jilong Li; Jianlin Cheng; Jing Qiu
Journal:  Stat Appl Genet Mol Biol       Date:  2016-04

3.  A scaling normalization method for differential expression analysis of RNA-seq data.

Authors:  Mark D Robinson; Alicia Oshlack
Journal:  Genome Biol       Date:  2010-03-02       Impact factor: 13.583

4.  Differential expression analysis for sequence count data.

Authors:  Simon Anders; Wolfgang Huber
Journal:  Genome Biol       Date:  2010-10-27       Impact factor: 13.583

5.  ShrinkBayes: a versatile R-package for analysis of count-based sequencing data in complex study designs.

Authors:  Mark A van de Wiel; Maarten Neerincx; Tineke E Buffart; Daoud Sie; Henk M W Verheul
Journal:  BMC Bioinformatics       Date:  2014-04-26       Impact factor: 3.169

6.  voom: Precision weights unlock linear model analysis tools for RNA-seq read counts.

Authors:  Charity W Law; Yunshun Chen; Wei Shi; Gordon K Smyth
Journal:  Genome Biol       Date:  2014-02-03       Impact factor: 13.583

7.  Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.

Authors:  Michael I Love; Wolfgang Huber; Simon Anders
Journal:  Genome Biol       Date:  2014       Impact factor: 13.583

8.  edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Authors:  Mark D Robinson; Davis J McCarthy; Gordon K Smyth
Journal:  Bioinformatics       Date:  2009-11-11       Impact factor: 6.937

9.  Next maSigPro: updating maSigPro bioconductor package for RNA-seq time series.

Authors:  María José Nueda; Sonia Tarazona; Ana Conesa
Journal:  Bioinformatics       Date:  2014-06-03       Impact factor: 6.937

10.  Statistical inference for time course RNA-Seq data using a negative binomial mixed-effect model.

Authors:  Xiaoxiao Sun; David Dalpiaz; Di Wu; Jun S Liu; Wenxuan Zhong; Ping Ma
Journal:  BMC Bioinformatics       Date:  2016-08-26       Impact factor: 3.169

  1 in total

1.  A comparison of methods for multiple degree of freedom testing in repeated measures RNA-sequencing experiments.

Authors:  Elizabeth A Wynn; Brian E Vestal; Tasha E Fingerlin; Camille M Moore
Journal:  BMC Med Res Methodol       Date:  2022-05-28       Impact factor: 4.612
