An Out-of-Core GPU based dimensionality reduction algorithm for Big Mass Spectrometry Data and its application in bottom-up Proteomics.

Muaaz Gul Awan, Fahad Saeed.

Abstract

Modern high-resolution mass spectrometry (MS) instruments can generate millions of spectra in a single systems biology experiment. Each spectrum consists of thousands of peaks, but only a small number of them actively contribute to the deduction of peptides. Pre-processing MS data to detect noisy and non-useful peaks is therefore an active area of research. Most sequential noise-reduction algorithms are impractical as a pre-processing step because of their high time complexity. In this paper, we present a GPU-based dimensionality-reduction algorithm for MS2 spectra, called G-MSR. The proposed algorithm uses novel data structures that optimize memory and computational operations inside the GPU: Binary Spectra and Quantized Indexed Spectra (QIS). The former communicates essential information between the CPU and GPU using a minimal amount of data, while the latter allows a complex 3-D data structure to be stored and processed as a 1-D array while maintaining the integrity of the MS data. The algorithm also takes the limited memory of GPUs into account and switches between in-core and out-of-core modes based on the size of the input data. G-MSR achieves a peak speed-up of 386x over its sequential counterpart and is shown to process over a million spectra in just 32 seconds. The code for this algorithm is available as open source under the GPL on GitHub at the following link: https://github.com/pcdslab/G-MSR.
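The abstract names three ideas without giving their definitions: a 1-D layout for nested spectral data, a compact binary representation of which peaks matter, and a switch between in-core and out-of-core processing. The following Python sketch is only a loose, CPU-side illustration of those general ideas and is not the authors' Binary Spectra, QIS, or G-MSR implementation. All function names are hypothetical, the "keep the most intense fraction of peaks" rule is a stand-in for whatever selection criterion G-MSR actually applies, and the peak budget is an assumed proxy for available GPU memory.

```python
from typing import List, Tuple

# A spectrum is modelled here as a list of (m/z, intensity) pairs (assumption).
Spectrum = List[Tuple[float, float]]


def flatten_spectra(spectra: List[Spectrum]):
    """Pack ragged spectra into flat 1-D lists plus an offsets index.

    Peaks of spectrum s occupy positions offsets[s]:offsets[s + 1].
    """
    mz, intensity, offsets = [], [], [0]
    for spectrum in spectra:
        for m, i in spectrum:
            mz.append(m)
            intensity.append(i)
        offsets.append(len(mz))
    return mz, intensity, offsets


def keep_mask(intensity: List[float], offsets: List[int], keep_fraction: float):
    """Return one 0/1 flag per peak (1 = keep, 0 = drop).

    Stand-in rule: keep the most intense `keep_fraction` of each spectrum,
    and at least one peak for every non-empty spectrum.
    """
    mask = [0] * len(intensity)
    for s in range(len(offsets) - 1):
        start, end = offsets[s], offsets[s + 1]
        if end == start:
            continue  # empty spectrum, nothing to flag
        n_keep = max(1, round((end - start) * keep_fraction))
        ranked = sorted(range(start, end), key=lambda k: intensity[k], reverse=True)
        for k in ranked[:n_keep]:
            mask[k] = 1
    return mask


def plan_batches(offsets: List[int], max_peaks_per_batch: int):
    """Split spectra into consecutive (first, last_exclusive) batches.

    Each batch holds at most `max_peaks_per_batch` peaks, except that a single
    spectrum larger than the budget still gets a batch of its own.
    One batch corresponds to in-core processing, several to out-of-core.
    """
    batches = []
    n_spectra = len(offsets) - 1
    start = 0
    while start < n_spectra:
        end = start + 1  # always take at least one spectrum
        while end < n_spectra and offsets[end + 1] - offsets[start] <= max_peaks_per_batch:
            end += 1
        batches.append((start, end))
        start = end
    return batches
```

In this sketch, only the flat arrays and the small 0/1 mask would need to cross the CPU-GPU boundary, and `plan_batches` returning more than one batch is the point at which a real implementation would fall back to streaming the data through the device in pieces.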

Keywords:  BigData; Data Reduction; GPU; Mass Spectrometry; Out-of-Core; Proteomics

Year:  2017        PMID: 28868521      PMCID: PMC5580946          DOI: 10.1145/3107411.3107466

Source DB:  PubMed          Journal:  ACM BCB

