
Experiences Building Globus Genomics: A Next-Generation Sequencing Analysis Service using Galaxy, Globus, and Amazon Web Services.

Ravi K Madduri, Dinanath Sulakhe, Lukasz Lacinski, Bo Liu, Alex Rodriguez, Kyle Chard, Utpal J Dave, Ian T Foster

Abstract

We describe Globus Genomics, a system that we have developed for the rapid analysis of large quantities of next-generation sequencing (NGS) genomic data. The system achieves a high degree of end-to-end automation that encompasses every stage of data analysis: initial data retrieval from remote sequencing centers or storage (via the Globus file transfer system); specification, configuration, and reuse of multi-step processing pipelines (via the Galaxy workflow system); creation of custom Amazon Machine Images and on-demand resource acquisition via a specialized elastic provisioner (on Amazon EC2); and efficient scheduling of these pipelines over many processors (via the HTCondor scheduler). It allows biomedical researchers to analyze large NGS datasets rapidly and in a fully automated manner, without installing software or maintaining any local computing infrastructure. We report performance and cost results for some representative workloads.


Keywords:  Cloud; HPC; HTC; NGS; workflows

Year: 2014
PMID: 25342933
PMCID: PMC4203657
DOI: 10.1002/cpe.3274

Source DB: PubMed
Journal: Concurr Comput
ISSN: 1532-0626
Impact factor: 1.536


