Camille Stephan-Otto Attolini (1), Victor Peña (2), David Rossell (3). 1. Institute for Research in Biomedicine (IRB Barcelona), Barcelona, Spain. 2. Department of Statistical Science, Duke University, Durham, North Carolina, USA. 3. Department of Statistics, University of Warwick, Coventry, UK.
Abstract
MOTIVATION: Designing an RNA-seq study depends critically on its specific goals, technology and underlying biology, which renders general guidelines inadequate. We propose a Bayesian framework to customize experiments so that goals can be attained and resources are not wasted, with a focus on alternative splicing.

RESULTS: We studied how read length, sequencing depth, library preparation and the number of replicates affect the cost-effectiveness of single-sample and group comparison studies. Optimal settings varied strongly according to the target organism or tissue (potential 50-500% cost cuts) and, interestingly, short reads outperformed long reads for standard analyses. Our framework learns key characteristics for study design from the data, and predicts whether and how to continue experimentation. These predictions matched several follow-up experimental datasets that were used for validation. We provide default pipelines, but the framework can be combined with other data analysis methods and can help assess their relative merits.

AVAILABILITY AND IMPLEMENTATION: The casper package is available at www.bioconductor.org/packages/release/bioc/html/casper.html; the Supplementary Manual is available by typing casperDesign() at the R prompt.

CONTACT: rosselldavid@gmail.com

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.