
Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

Bo Liu, Ravi K Madduri, Borja Sotomayor, Kyle Chard, Lukasz Lacinski, Utpal J Dave, Jianqiang Li, Chunchen Liu, Ian T Foster.

Abstract

The coming deluge of genome data presents significant challenges: storing and processing large-scale genome data, providing easy access to biomedical analysis tools, and sharing and retrieving data efficiently. Because data volumes vary, computing and storage requirements vary as well, so biomedical researchers are seeking more reliable, dynamic, and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses that enables reliable and highly scalable execution of sequencing analysis workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding: data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer); domain-specific analysis tools preconfigured for immediate use by researchers (via user-specific tools integration); automatic deployment on the Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision); a Cloud provisioning tool for auto-scaling (via the HTCondor scheduler); and support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases and a performance evaluation are presented to validate the feasibility of the proposed approach.
Copyright © 2014 Elsevier Inc. All rights reserved.

Keywords:  Bioinformatics; Cloud computing; Galaxy; Scientific workflow; Sequencing analyses

Year: 2014; PMID: 24462600; PMCID: PMC4203338; DOI: 10.1016/j.jbi.2014.01.005

Source DB: PubMed; Journal: J Biomed Inform; ISSN: 1532-0464; Impact factor: 6.317


