Cloud parallel processing of tandem mass spectrometry based proteomics data.

Yassene Mohammed, Ekaterina Mostovenko, Alex A Henneman, Rob J Marissen, André M Deelder, Magnus Palmblad.

Abstract

Data analysis in mass spectrometry based proteomics struggles to keep pace with advances in instrumentation and the increasing rate of data acquisition. Analyzing these data involves multiple steps requiring diverse software, using different algorithms and data formats. The speed and performance of mass spectral search engines are continuously improving, although not necessarily fast enough to meet the challenges posed by the large volumes of acquired data. Improving and parallelizing the search algorithms is one possibility; data decomposition presents another, simpler strategy for introducing parallelism. We describe a general method for parallelizing identification of tandem mass spectra using data decomposition that keeps the search engine intact and wraps the parallelization around it. We introduce two algorithms for decomposing mzXML files and recomposing the resulting pepXML files. This makes the approach applicable to different search engines, including those relying on sequence databases and those searching spectral libraries. We use cloud computing to deliver the computational power, and scientific workflow engines to interface and automate the different processing steps. We show how to leverage these technologies to achieve faster data analysis in proteomics and present three scientific workflows for parallel database and spectral library search that combine our data decomposition programs with X!Tandem and SpectraST.
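
The decompose, search, recompose pattern described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it works on in-memory records rather than real mzXML and pepXML files, and the names `decompose`, `recompose`, `parallel_search`, and `toy_search` are hypothetical. The search engine is treated as an opaque function that is called once per chunk and never modified, which is the property the abstract emphasizes.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-ins: a real pipeline would read spectra from mzXML
# and write matches to pepXML; plain dictionaries are used here instead.
Spectrum = Dict[str, object]
Match = Dict[str, object]


def decompose(spectra: Sequence[Spectrum], n_parts: int) -> List[List[Spectrum]]:
    """Split spectra into at most n_parts contiguous, near-equal chunks."""
    if n_parts < 1:
        raise ValueError("n_parts must be >= 1")
    size, extra = divmod(len(spectra), n_parts)
    chunks: List[List[Spectrum]] = []
    start = 0
    for i in range(n_parts):
        # The first `extra` chunks each take one additional spectrum.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when n_parts > len(spectra)
            chunks.append(list(spectra[start:end]))
        start = end
    return chunks


def recompose(partial_results: Sequence[Sequence[Match]]) -> List[Match]:
    """Merge per-chunk results into one list ordered by scan number."""
    merged = [match for part in partial_results for match in part]
    merged.sort(key=lambda match: match["scan"])
    return merged


def parallel_search(
    spectra: Sequence[Spectrum],
    search: Callable[[List[Spectrum]], List[Match]],
    n_parts: int = 4,
) -> List[Match]:
    """Run an unmodified search function on each chunk, then merge."""
    chunks = decompose(spectra, n_parts)
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        partial = list(pool.map(search, chunks))
    return recompose(partial)


def toy_search(chunk: List[Spectrum]) -> List[Match]:
    """Placeholder for an external search engine; returns one dummy hit per scan."""
    return [{"scan": s["scan"], "peptide": "UNKNOWN"} for s in chunk]


if __name__ == "__main__":
    spectra = [{"scan": i} for i in range(1, 11)]
    results = parallel_search(spectra, toy_search, n_parts=3)
    print(len(results), "matches, scans", results[0]["scan"], "to", results[-1]["scan"])
```

In a cloud setting each chunk would be sent to a separate worker machine rather than a local thread. The sketch also assumes that spectra are searched independently of one another, so that splitting the input does not change per-spectrum results; statistics computed over the whole run would need to be recalculated after recomposition.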

Year:  2012        PMID: 22916831     DOI: 10.1021/pr300561q

Source DB:  PubMed          Journal:  J Proteome Res        ISSN: 1535-3893            Impact factor:   4.466


  9 in total

1.  Processing shotgun proteomics data on the Amazon cloud with the trans-proteomic pipeline.

Authors:  Joseph Slagel; Luis Mendoza; David Shteynberg; Eric W Deutsch; Robert L Moritz
Journal:  Mol Cell Proteomics       Date:  2014-11-23       Impact factor: 5.911

Review 2.  Big data in medicine is driving big changes.

Authors:  F Martin-Sanchez; K Verspoor
Journal:  Yearb Med Inform       Date:  2014-08-15

3.  Advanced Multidimensional Separations in Mass Spectrometry: Navigating the Big Data Deluge.

Authors:  Jody C May; John A McLean
Journal:  Annu Rev Anal Chem (Palo Alto Calif)       Date:  2016-03-30       Impact factor: 10.745

4.  An Open Data Format for Visualization and Analysis of Cross-Linked Mass Spectrometry Results.

Authors:  Michael R Hoopmann; Luis Mendoza; Eric W Deutsch; David Shteynberg; Robert L Moritz
Journal:  J Am Soc Mass Spectrom       Date:  2016-07-28       Impact factor: 3.109

5.  Cloudy with a Chance of Peptides: Accessibility, Scalability, and Reproducibility with Cloud-Hosted Environments.

Authors:  Benjamin A Neely
Journal:  J Proteome Res       Date:  2021-01-29       Impact factor: 4.466

6.  Deep learning embedder method and tool for mass spectra similarity search.

Authors:  Chunyuan Qin; Xiyang Luo; Chuan Deng; Kunxian Shu; Weimin Zhu; Johannes Griss; Henning Hermjakob; Mingze Bai; Yasset Perez-Riverol
Journal:  J Proteomics       Date:  2020-12-08       Impact factor: 3.855

Review 7.  Toward a Literature-Driven Definition of Big Data in Healthcare.

Authors:  Emilie Baro; Samuel Degoul; Régis Beuscart; Emmanuel Chazard
Journal:  Biomed Res Int       Date:  2015-06-02       Impact factor: 3.411

8.  Scientific workflow optimization for improved peptide and protein identification.

Authors:  Sonja Holl; Yassene Mohammed; Olav Zimmermann; Magnus Palmblad
Journal:  BMC Bioinformatics       Date:  2015-09-03       Impact factor: 3.169

9.  Low cost, high performance processing of single particle cryo-electron microscopy data in the cloud.

Authors:  Michael A Cianfrocco; Andres E Leschziner
Journal:  Elife       Date:  2015-05-08       Impact factor: 8.140
