
CAMS-RS: Clustering Algorithm for Large-Scale Mass Spectrometry Data Using Restricted Search Space and Intelligent Random Sampling.

Fahad Saeed, Jason D Hoffert, Mark A Knepper.   

Abstract

High-throughput mass spectrometers can produce massive amounts of redundant data at an astonishing rate, and many of the resulting spectra have a poor signal-to-noise (S/N) ratio. These low S/N ratio spectra may not be interpreted by conventional spectra-to-database matching techniques. In this paper, we present an efficient algorithm, CAMS-RS (Clustering Algorithm for Mass Spectra using Restricted Space and Sampling), for clustering raw mass spectrometry data. CAMS-RS uses a novel metric (called F-set) that exploits temporal and spatial patterns to accurately assess the similarity between two given spectra. The F-set similarity metric is independent of retention time and allows clustering of mass spectrometry data from independent LC-MS/MS runs. A novel restricted search space strategy is devised to limit the number of spectrum-to-spectrum comparisons. An intelligent sampling method is executed on individual bins, and the per-bin results are then merged to form the final clusters. Our experiments, using experimentally generated data sets, show that the proposed algorithm clusters spectra with high accuracy and is helpful in interpreting low S/N ratio spectra. The CAMS-RS algorithm scales well with an increasing number of spectra, and our implementation allows clustering of up to a million spectra within minutes.
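The general idea of restricting the search space can be illustrated with a minimal sketch. It is not the CAMS-RS implementation: the abstract does not define the F-set metric, the binning rule, or the sampling step, so everything below is an assumption made for illustration. Here spectra are binned by precursor m/z (assumed), compared only within a bin, and scored with a plain Jaccard overlap of their strongest peaks as a stand-in for F-set. The names `top_peaks`, `similarity`, `cluster_restricted`, and all parameter values are hypothetical.

```python
from collections import defaultdict


def top_peaks(peaks, k=20, bin_width=1.0):
    """Return the set of m/z bins occupied by the k most intense peaks."""
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:k]
    return {int(mz // bin_width) for mz, _ in strongest}


def similarity(a, b):
    """Jaccard overlap of two peak-bin sets (a stand-in, NOT the F-set metric)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_restricted(spectra, precursor_window=2.0, threshold=0.5):
    """Greedy clustering that only compares spectra sharing a precursor bin.

    spectra: list of (precursor_mz, [(mz, intensity), ...]) tuples.
    Returns a list of clusters, each a list of indices into `spectra`.
    """
    # Assumed restriction rule: group spectra by precursor m/z window.
    bins = defaultdict(list)
    for idx, (precursor_mz, _) in enumerate(spectra):
        bins[int(precursor_mz // precursor_window)].append(idx)

    clusters = []
    for key in sorted(bins):
        local = []  # (representative peak set, member indices) per cluster
        for idx in bins[key]:
            peak_set = top_peaks(spectra[idx][1])
            for rep, members in local:
                if similarity(rep, peak_set) >= threshold:
                    members.append(idx)
                    break
            else:
                # No cluster in this bin was similar enough: start a new one.
                local.append((peak_set, [idx]))
        # Per-bin results are concatenated to form the final clusters.
        clusters.extend(members for _, members in local)
    return clusters


# Toy input: spectra 0 and 1 share a bin and peaks; 2 shares only the bin;
# 3 has the same peaks as 0 but a distant precursor, so it is never compared.
example = [
    (500.2, [(100.1, 10.0), (200.2, 50.0), (300.3, 30.0)]),
    (500.9, [(100.4, 12.0), (200.1, 45.0), (300.6, 28.0)]),
    (500.5, [(150.0, 5.0), (250.0, 9.0), (350.0, 7.0)]),
    (800.0, [(100.1, 10.0), (200.2, 50.0), (300.3, 30.0)]),
]
result = cluster_restricted(example)
```

Because comparisons happen only inside a bin, the cost grows with the bin sizes rather than with the square of the total number of spectra. A fixed-window binning like this one can separate spectra that sit just on either side of a bin boundary, which a real implementation would need to handle.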


Year:  2014        PMID: 26355513      PMCID: PMC6143137          DOI: 10.1109/TCBB.2013.152

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


