Scalable Genomics with R and Bioconductor.

Michael Lawrence, Martin Morgan.

Abstract

This paper reviews strategies for solving problems encountered when analyzing large genomic data sets and describes the implementation of those strategies in R by packages from the Bioconductor project. We treat the scalable processing, summarization and visualization of big genomic data. The general ideas are well established and include restrictive queries, compression, iteration and parallel computing. We demonstrate the strategies by applying Bioconductor packages to the detection and analysis of genetic variants from a whole genome sequencing experiment.
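Three of the strategies named above (restrictive queries, iteration, and parallel computing) can be illustrated with a small sketch. It is written in Python rather than R and uses no Bioconductor API. The tab-separated record layout, the function names, and the toy data are all assumptions made for illustration and do not come from the paper. Compression is not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def restricted_records(lines, chrom, start, end):
    """Restrictive query: yield only (pos, depth) for records inside the region."""
    for line in lines:
        c, pos, depth = line.split("\t")
        pos = int(pos)
        if c == chrom and start <= pos <= end:
            yield pos, int(depth)


def chunks(iterable, size):
    """Iteration: yield lists of at most `size` items, one chunk at a time."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def summarize(chunk):
    """Per-chunk summary: (number of records, total depth)."""
    return len(chunk), sum(depth for _, depth in chunk)


def mean_depth(lines, chrom, start, end, chunk_size=2, workers=2):
    """Map summarize() over chunks in a worker pool, then reduce the partials."""
    records = restricted_records(lines, chrom, start, end)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Note: Executor.map consumes the chunk iterator eagerly, so a real
        # pipeline would need to bound the number of chunks in flight.
        partials = list(pool.map(summarize, chunks(records, chunk_size)))
    n = sum(count for count, _ in partials)
    total = sum(depth for _, depth in partials)
    return total / n if n else 0.0


# Hypothetical toy input: chromosome, position, read depth (tab-separated).
toy_lines = [
    "chr1\t100\t10",
    "chr1\t150\t20",
    "chr1\t200\t30",
    "chr2\t120\t99",
    "chr1\t900\t50",
]

if __name__ == "__main__":
    print(mean_depth(toy_lines, "chr1", 100, 200))
```

The per-chunk summaries are small (a count and a sum), so only the reduced values need to be combined at the end. A thread pool is used here only to keep the sketch self-contained; for CPU-bound work a process pool or a cluster back end would be the more likely choice.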

Keywords:  Bioconductor; R; big data; biology; genomics

Year: 2014; PMID: 28018047; PMCID: PMC5181792; DOI: 10.1214/14-STS476

Source DB: PubMed; Journal: Stat Sci; ISSN: 0883-4237; Impact factor: 2.901


