
MR-Tandem: parallel X!Tandem using Hadoop MapReduce on Amazon Web Services.

Brian Pratt, J Jeffry Howbert, Natalie I Tasman, Erik J Nilsson.

Abstract

SUMMARY: MR-Tandem adapts the popular X!Tandem peptide search engine to work with Hadoop MapReduce for reliable parallel execution of large searches. MR-Tandem runs on any Hadoop cluster but offers special support for Amazon Web Services for creating inexpensive on-demand Hadoop clusters, enabling search volumes that might not otherwise be feasible with the compute resources a researcher has at hand. MR-Tandem is designed to drop in wherever X!Tandem is already in use and requires no modification to existing X!Tandem parameter files, and only minimal modification to X!Tandem-based workflows.
AVAILABILITY AND IMPLEMENTATION: MR-Tandem is implemented as a lightly modified X!Tandem C++ executable and a Python script that drives Hadoop clusters, including Amazon Web Services (AWS) Elastic MapReduce (EMR), using the modified X!Tandem program as a Hadoop Streaming mapper and reducer. The modified X!Tandem C++ source code is Artistic licensed, supports pluggable scoring, and is available as part of the Sashimi project at http://sashimi.svn.sourceforge.net/viewvc/sashimi/trunk/trans_proteomic_pipeline/extern/xtandem/. The MR-Tandem Python script is Apache licensed and available as part of the Insilicos Cloud Army project at http://ica.svn.sourceforge.net/viewvc/ica/trunk/mr-tandem/. Full documentation and a Windows installer that configures MR-Tandem, Python and all necessary packages are available at this same URL.
CONTACT: brian.pratt@insilicos.com
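
The Hadoop Streaming arrangement described under AVAILABILITY AND IMPLEMENTATION, in which a Python driver submits one executable as both mapper and reducer, might look roughly like the sketch below. This is an illustration only, not the MR-Tandem script: the function name `build_streaming_command`, the jar path, the HDFS paths and the `./tandem` command strings are all hypothetical placeholders. Only the `hadoop jar` entry point and the `-input`, `-output`, `-mapper`, `-reducer` and `-file` Streaming options are standard Hadoop.

```python
import shlex
from typing import List, Sequence


def build_streaming_command(
    streaming_jar: str,
    input_path: str,
    output_path: str,
    mapper_cmd: str,
    reducer_cmd: str,
    shipped_files: Sequence[str] = (),
) -> List[str]:
    """Assemble the argument list for a Hadoop Streaming job.

    Nothing is executed here; the caller decides whether to pass the
    list to subprocess.run() on a machine where `hadoop` is installed.
    """
    cmd = ["hadoop", "jar", streaming_jar]
    # Ship local files (for example the search executable) to every task node.
    for path in shipped_files:
        cmd += ["-file", path]
    cmd += [
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper_cmd,
        "-reducer", reducer_cmd,
    ]
    return cmd


if __name__ == "__main__":
    # All values below are placeholders chosen for illustration.
    command = build_streaming_command(
        streaming_jar="/path/to/hadoop-streaming.jar",
        input_path="hdfs:///search/input",
        output_path="hdfs:///search/output",
        mapper_cmd="./tandem",   # hypothetical mapper invocation
        reducer_cmd="./tandem",  # hypothetical reducer invocation
        shipped_files=["tandem"],
    )
    print(shlex.join(command))
```

Returning the argument list instead of running it keeps the sketch usable without a cluster; the real MR-Tandem driver, per the abstract, also handles cluster creation on AWS, which is not shown here.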

Year:  2011        PMID: 22072385      PMCID: PMC3244769          DOI: 10.1093/bioinformatics/btr615

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  3 in total

1.  TANDEM: matching proteins with tandem mass spectra.

Authors:  Robertson Craig; Ronald C Beavis
Journal:  Bioinformatics       Date:  2004-02-19       Impact factor: 6.937

2.  Parallel tandem: a program for parallel processing of tandem mass spectra using PVM or MPI and X!Tandem.

Authors:  Dexter T Duncan; Robertson Craig; Andrew J Link
Journal:  J Proteome Res       Date:  2005 Sep-Oct       Impact factor: 4.466

3.  X!!Tandem, an improved method for running X!tandem in parallel on collections of commodity computers.

Authors:  Robert D Bjornson; Nicholas J Carriero; Christopher Colangelo; Mark Shifman; Kei-Hoi Cheung; Perry L Miller; Kenneth Williams
Journal:  J Proteome Res       Date:  2007-09-29       Impact factor: 4.466

  8 in total

1.  Processing shotgun proteomics data on the Amazon cloud with the trans-proteomic pipeline.

Authors:  Joseph Slagel; Luis Mendoza; David Shteynberg; Eric W Deutsch; Robert L Moritz
Journal:  Mol Cell Proteomics       Date:  2014-11-23       Impact factor: 5.911

Review 2.  Minireview: progress and challenges in proteomics data management, sharing, and integration.

Authors:  Lauren B Becnel; Neil J McKenna
Journal:  Mol Endocrinol       Date:  2012-08-17

Review 3.  Current algorithmic solutions for peptide-based proteomics data generation and identification.

Authors:  Michael R Hoopmann; Robert L Moritz
Journal:  Curr Opin Biotechnol       Date:  2012-11-08       Impact factor: 9.740

4.  Communication Lower-Bounds for Distributed-Memory Computations for Mass Spectrometry based Omics Data.

Authors:  Fahad Saeed; Muhammad Haseeb; S S Iyengar
Journal:  J Parallel Distrib Comput       Date:  2021-11-17       Impact factor: 3.734

5.  High Performance Computing Framework for Tera-Scale Database Search of Mass Spectrometry Data.

Authors:  Muhammad Haseeb; Fahad Saeed
Journal:  Nat Comput Sci       Date:  2021-08-20

6.  Method for rapid protein identification in a large database.

Authors:  Wenli Zhang; Xiaofang Zhao
Journal:  Biomed Res Int       Date:  2013-08-13       Impact factor: 3.411

7.  Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework.

Authors:  Steven Lewis; Attila Csordas; Sarah Killcoyne; Henning Hermjakob; Michael R Hoopmann; Robert L Moritz; Eric W Deutsch; John Boyle
Journal:  BMC Bioinformatics       Date:  2012-12-05       Impact factor: 3.169

8.  Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment.

Authors:  Wei-Po Lee; Yu-Ting Hsiao; Wei-Che Hwang
Journal:  BMC Syst Biol       Date:  2014-01-16