Numerical compression schemes for proteomics mass spectrometry data.

Johan Teleman, Andrew W Dowsey, Faviel F Gonzalez-Galarza, Simon Perkins, Brian Pratt, Hannes L Röst, Lars Malmström, Johan Malmström, Andrew R Jones, Eric W Deutsch, Fredrik Levander.

Abstract

The open XML format mzML, used for representation of MS data, is pivotal for the development of platform-independent MS analysis software. Although conversion from vendor formats to mzML must take place on a platform on which the vendor libraries are available (i.e. Windows), once mzML files have been generated, they can be used on any platform. However, the mzML format has turned out to be less efficient than vendor formats. In many cases, the naïve mzML representation is fourfold or even up to 18-fold larger than the original vendor file. In disk I/O limited setups, a larger data file also leads to longer processing times, which is a problem given the data production rates of modern mass spectrometers. In an attempt to reduce this problem, we here present a family of numerical compression algorithms called MS-Numpress, intended for efficient compression of MS data. To ease adoption, the algorithms target the binary data in the mzML standard, and support in the main proteomics tools is already available. Using a test set of 10 representative MS data files, we demonstrate typical file size decreases of 90% when combined with traditional compression, as well as read time decreases of up to 50%. It is envisaged that these improvements will be beneficial for data handling within the MS community.
© 2014 by The American Society for Biochemistry and Molecular Biology, Inc.
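
The abstract does not spell out the MS-Numpress algorithms, so the following is only an illustrative sketch of the general idea it describes: applying a numerical transform to an mzML-style binary array before traditional compression. It assumes little-endian 64-bit floats compressed with zlib and base64-encoded as the baseline. The fixed-point delta transform, the function names, and the `scale` parameter are hypothetical choices made for this example and are not taken from the paper.

```python
import base64
import struct
import zlib


def encode_plain(values):
    """Baseline: little-endian float64 array, zlib-compressed, base64-encoded."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return base64.b64encode(zlib.compress(raw))


def encode_delta_fixed_point(values, scale=1e5):
    """Hypothetical transform (not the published algorithm):
    round each value to a fixed-point integer, store differences between
    neighbours, then apply the same zlib + base64 steps as the baseline."""
    ints = [round(v * scale) for v in values]
    deltas = [ints[0]] + [b - a for a, b in zip(ints, ints[1:])]
    raw = struct.pack(f"<{len(deltas)}q", *deltas)
    return base64.b64encode(zlib.compress(raw))


def decode_delta_fixed_point(blob, scale=1e5):
    """Invert encode_delta_fixed_point; precision is limited to about 0.5 / scale."""
    raw = zlib.decompress(base64.b64decode(blob))
    deltas = struct.unpack(f"<{len(raw) // 8}q", raw)
    values, acc = [], 0
    for d in deltas:
        acc += d  # running sum restores the fixed-point integers
        values.append(acc / scale)
    return values


if __name__ == "__main__":
    # Synthetic, evenly spaced m/z-like axis (made-up data, for illustration only).
    mz = [400.0 + 0.0123 * i for i in range(1000)]
    plain = encode_plain(mz)
    packed = encode_delta_fixed_point(mz)
    print(len(plain), len(packed))
```

On smoothly increasing data such as an m/z axis, the differences between neighbouring fixed-point values repeat far more often than the raw floating-point bytes do, which is what lets the general-purpose compressor do better. The transform is lossy by design, so the choice of `scale` bounds the rounding error.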

Year:  2014        PMID: 24677029      PMCID: PMC4047472          DOI: 10.1074/mcp.O114.037879

Source DB:  PubMed          Journal:  Mol Cell Proteomics        ISSN: 1535-9476            Impact factor:   5.911


