
SimSeq: a nonparametric approach to simulation of RNA-sequence datasets.

Sam Benidt, Dan Nettleton.

Abstract

MOTIVATION: RNA sequencing analysis methods are often derived by relying on hypothetical parametric models for read counts that are not likely to be precisely satisfied in practice. Methods are often tested by analyzing data that have been simulated according to the assumed model. This testing strategy can result in an overly optimistic view of the performance of an RNA-seq analysis method.
RESULTS: We develop a data-based simulation algorithm for RNA-seq data. The vector of read counts simulated for a given experimental unit has a joint distribution that closely matches the distribution of a source RNA-seq dataset provided by the user. We conduct simulation experiments based on the negative binomial distribution and our proposed nonparametric simulation algorithm. We compare performance between the two simulation experiments over a small subset of statistical methods for RNA-seq analysis available in the literature. We use as a benchmark the ability of a method to control the false discovery rate. Not surprisingly, methods based on parametric modeling assumptions seem to perform better with respect to false discovery rate control when data are simulated from parametric models rather than using our more realistic nonparametric simulation strategy.
AVAILABILITY AND IMPLEMENTATION: The nonparametric simulation algorithm developed in this article is implemented in the R package SimSeq, which is freely available under the GNU General Public License (version 2 or later) from the Comprehensive R Archive Network (http://cran.r-project.org/).
CONTACT: sgbenidt@gmail.com
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
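
The data-based idea described under RESULTS can be illustrated with a minimal sketch. It assumes a source dataset with two treatment groups and builds a simulated two-group dataset by resampling whole experimental units (columns), so that equivalently expressed genes keep the joint distribution of the source data. Genes chosen to be differentially expressed have their values swapped in from the other source group. The function name `simulate_counts`, its parameters, and the swapping scheme are illustrative assumptions, not the SimSeq package's API or its exact algorithm. In particular, the sketch ignores library-size normalization and any rule for choosing which genes are differentially expressed.

```python
import random


def simulate_counts(source, groups, n_per_group, de_genes, seed=0):
    """Resample columns of a source count matrix into a simulated dataset.

    source      : list of samples, each a list of read counts (one per gene)
    groups      : list of 0/1 treatment labels, one per source sample
    n_per_group : number of simulated samples per group
    de_genes    : indices of genes to simulate as differentially expressed
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    g0 = [j for j, g in enumerate(groups) if g == 0]
    g1 = [j for j, g in enumerate(groups) if g == 1]
    if len(g0) < 2 * n_per_group or len(g1) < n_per_group:
        raise ValueError("not enough source samples for this design")

    # Draw distinct group-0 samples for both simulated groups (no reuse),
    # plus group-1 "donor" samples that supply the DE gene values.
    picked = rng.sample(g0, 2 * n_per_group)
    base_a, base_b = picked[:n_per_group], picked[n_per_group:]
    donors = rng.sample(g1, n_per_group)
    de = set(de_genes)

    sim, labels = [], []
    for j in base_a:
        # Simulated group 0: unmodified copies of source group-0 samples.
        sim.append(list(source[j]))
        labels.append(0)
    for j, d in zip(base_b, donors):
        # Simulated group 1: group-0 counts, except DE genes from the donor.
        col = [source[d][i] if i in de else source[j][i]
               for i in range(len(source[j]))]
        sim.append(col)
        labels.append(1)
    return sim, labels


# Toy source data: 6 samples in group 0, 3 samples in group 1, 4 genes each.
toy_source = [[10 + j] * 4 for j in range(6)] + [[500 + j] * 4 for j in range(3)]
toy_groups = [0] * 6 + [1] * 3
toy_sim, toy_labels = simulate_counts(toy_source, toy_groups, 2, de_genes=[1])
```

Because the true set of differentially expressed genes is known by construction, a dataset simulated this way can be used to check whether an analysis method's reported false discovery rate matches the proportion of false discoveries it actually makes.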

Year:  2015        PMID: 25725090      PMCID: PMC4481850          DOI: 10.1093/bioinformatics/btv124

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


