
High Performance Computing Framework for Tera-Scale Database Search of Mass Spectrometry Data.

Muhammad Haseeb, Fahad Saeed.

Abstract

Database peptide search algorithms deduce peptides from mass spectrometry (MS) data. Substantial effort has gone into improving their computational efficiency to enable larger and more complex systems biology studies. However, modern serial and high-performance computing (HPC) algorithms exhibit sub-optimal performance, mainly due to ineffective parallel designs (low resource utilization) and high overhead costs. We present an HPC framework, called HiCOPS, for efficient acceleration of database peptide search algorithms on distributed-memory supercomputers. HiCOPS provides, on average, more than a 10-fold improvement in speed and superior parallel performance over several existing HPC database search tools. We also formulate a mathematical model for performance analysis and optimization, and report near-optimal results for several key metrics, including strong-scaling efficiency, hardware utilization, load balance, inter-process communication, and I/O overheads. The core parallel design, techniques, and optimizations presented in HiCOPS are independent of the search algorithm and can be extended to efficiently accelerate existing and future algorithms and software.
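For readers unfamiliar with the metrics named in the abstract, the following minimal sketch shows how strong-scaling efficiency and load imbalance are commonly computed from wall-clock timings. It uses the standard textbook definitions, not the performance model formulated in the paper; the function names and all timing values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def speedup(t_base: float, t_parallel: float) -> float:
    """Ratio of baseline wall time to parallel wall time."""
    return t_base / t_parallel


def strong_scaling_efficiency(t_base: float, p_base: int,
                              t_parallel: float, p_parallel: int) -> float:
    """Efficiency for a fixed problem size.

    Ideal scaling keeps (time * processes) constant, so the ratio of
    the baseline product to the scaled product is 1.0 in the ideal case.
    """
    return (t_base * p_base) / (t_parallel * p_parallel)


def load_imbalance(per_process_times: list[float]) -> float:
    """Relative excess of the slowest process over the mean.

    0.0 means perfectly balanced work; larger values mean more idle
    time while the other processes wait for the slowest one.
    """
    return max(per_process_times) / mean(per_process_times) - 1.0


# Hypothetical timings in seconds (assumed values, not from the paper).
eff = strong_scaling_efficiency(t_base=1200.0, p_base=4,
                                t_parallel=160.0, p_parallel=32)
imb = load_imbalance([40.0, 42.0, 38.0, 40.0])
print(f"strong-scaling efficiency: {eff:.4f}")
print(f"load imbalance: {imb:.2%}")
```

With these assumed numbers, going from 4 to 32 processes gives a 7.5-fold speedup against an ideal of 8, which is an efficiency of 0.9375.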


Keywords:  bulk synchronous parallel; high performance computing; mass spectrometry; peptide identification; proteomics

Year:  2021        PMID: 34723198      PMCID: PMC8554525          DOI: 10.1038/s43588-021-00113-z

Source DB:  PubMed          Journal:  Nat Comput Sci        ISSN: 2662-8457


