
Exploiting Thread-Level and Instruction-Level Parallelism to Cluster Mass Spectrometry Data using Multicore Architectures.

Fahad Saeed, Jason D Hoffert, Trairak Pisitkun, Mark A Knepper.

Abstract

Modern mass spectrometers can produce large numbers of peptide spectra from complex biological samples in a short time. These data sets contain substantial redundancy, because the same peptide may be selected multiple times in a Liquid Chromatography Tandem Mass Spectrometry (LC-MS/MS) experiment. In addition, many spectra cannot be mapped to specific peptide sequences because of their low signal-to-noise (S/N) ratio. Clustering is one way to mitigate these problems. We recently presented a graph-theoretic framework, known as CAMS, for clustering large-scale mass spectrometry data. CAMS uses a novel metric that exploits the spatial patterns in the mass spectrometry peaks, which allows highly accurate clustering. However, comparing each spectrum with every other spectrum makes the clustering computationally expensive. In this paper we present a parallel algorithm, called P-CAMS, that uses thread-level and instruction-level parallelism on multicore architectures to substantially decrease running times. P-CAMS relies on intelligent matrix completion to reduce the number of comparisons, on threads that run on each core, and on the Single Instruction Multiple Data (SIMD) paradigm inside each thread to exploit massive parallelism on multicore architectures. A carefully crafted load-balancing scheme that uses the spatial locations of the mass spectrometry peaks, mapped to the nearest-level cache and core, allows super-linear speedups. We study the scalability of the algorithm across a wide variety of mass spectrometry data and architecture-specific parameters. The results show that SIMD-style data parallelism combined with thread-level parallelism is a powerful combination on multicore architectures, allowing substantial reductions in runtime even for all-to-all comparison algorithms. Quality assessment, performed on a real-world data set, shows results consistent with the serial version of the same algorithm.
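
The all-to-all comparison and the need for load balancing can be illustrated with a minimal sketch. This is not the P-CAMS implementation: the `similarity` function is a placeholder (a Jaccard overlap of integer-binned peaks, not the CAMS metric), the cyclic row assignment stands in for the cache-aware scheme described in the abstract, and all function names are hypothetical. Because the similarity matrix is symmetric, only pairs with i < j are compared; row i then carries n - 1 - i comparisons, so contiguous blocks of rows would give the first worker far more work than the last.

```python
from concurrent.futures import ThreadPoolExecutor


def similarity(a, b):
    """Placeholder metric (NOT the CAMS metric): Jaccard overlap of binned peaks."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def balanced_rows(n, workers):
    """Deal rows out cyclically so each worker gets a mix of long and short rows."""
    return [list(range(w, n, workers)) for w in range(workers)]


def compare_rows(spectra, rows):
    """Compare each assigned row i against every later spectrum j > i."""
    n = len(spectra)
    out = {}
    for i in rows:
        for j in range(i + 1, n):
            out[(i, j)] = similarity(spectra[i], spectra[j])
    return out


def all_pairs(spectra, workers=4):
    """Compute the upper triangle of the similarity matrix, one row set per worker."""
    parts = balanced_rows(len(spectra), workers)
    result = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(lambda rows: compare_rows(spectra, rows), parts):
            result.update(part)
    return result


if __name__ == "__main__":
    # Toy spectra: each is a list of integer-binned peak positions (assumed format).
    toy = [[100, 200, 300], [100, 200, 400], [500], [100, 200, 300]]
    for pair, score in sorted(all_pairs(toy, workers=2).items()):
        print(pair, score)
```

In CPython the global interpreter lock prevents these threads from running the pure-Python inner loop in parallel, so the sketch shows only the partitioning of work. A native implementation would pin one thread per core and vectorize the inner comparison with SIMD instructions, as the abstract describes.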


Year:  2014        PMID: 25045604      PMCID: PMC4100726          DOI: 10.1007/s13721-014-0054-1

Source DB:  PubMed          Journal:  Netw Model Anal Health Inform Bioinform        ISSN: 2192-6670


