Comparison of Programmatic Approaches for Efficient Accessing to mzML Files.

Abstract

The Human Proteome Organization (HUPO) Proteomics Standards Initiative has been tasked with developing file formats for storing raw data (mzML) and the results of spectral processing, that is, protein identification and quantification (mzIdentML), from proteomics experiments. In order to fully characterize complex experiments, special data types have been designed. Standardized file formats will promote visualization, validation and dissemination of data independent of the vendor-specific binary data storage files. Innovative programmatic solutions for robust and efficient data access to standardized file formats will contribute to more rapid wide-scale acceptance of these file formats by the proteomics community.

In this work, we compare algorithms for accessing spectral data in the mzML file format. Because mzML is an XML format, its data structures can be parsed efficiently with XML-specific class types. These classes provide only sequential access to files. However, random access to spectral data is needed in many algorithmic applications for processing proteomics datasets. Here, we demonstrate an implementation of memory streams that converts sequential access into random access. Our application preserves the elegant XML parsing capabilities. Benchmarking of file access times in sequential and random access modes shows that random access is more time efficient when retrieving a small number of spectra, whereas sequential access becomes more efficient when retrieving a large number of spectra. We also provide comparisons to other file accessing methods from academia and industry.
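The general idea in the abstract, turning a sequential XML reader into a random-access one by way of memory streams, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the sample document is a hypothetical, heavily simplified stand-in for mzML (no namespace, no binary arrays), and the helper names build_offset_index, read_spectrum and iter_spectra are assumptions introduced here. One sequential pass records the byte offsets of each spectrum element; after that, a single spectrum can be copied into an in-memory stream and handed to an ordinary XML parser.

```python
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for an mzML file (not schema-valid).
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<mzML>
  <run id="demo">
    <spectrumList count="3">
      <spectrum index="0" id="scan=1"><cvParam name="ms level" value="1"/></spectrum>
      <spectrum index="1" id="scan=2"><cvParam name="ms level" value="2"/></spectrum>
      <spectrum index="2" id="scan=3"><cvParam name="ms level" value="2"/></spectrum>
    </spectrumList>
  </run>
</mzML>
"""

# Matches "<spectrum " or "<spectrum>" but not "<spectrumList".
OPEN_TAG = re.compile(rb"<spectrum[\s>]")
CLOSE_TAG = b"</spectrum>"


def build_offset_index(stream):
    """One sequential pass: record (start, end) byte offsets of each spectrum."""
    stream.seek(0)
    data = stream.read()  # assumption: file fits in memory; a real reader would scan in chunks
    index = []
    for match in OPEN_TAG.finditer(data):
        start = match.start()
        end = data.index(CLOSE_TAG, start) + len(CLOSE_TAG)
        index.append((start, end))
    return index


def read_spectrum(stream, index, i):
    """Random access: seek to spectrum i, copy it into a memory stream, parse it."""
    start, end = index[i]
    stream.seek(start)
    chunk = io.BytesIO(stream.read(end - start))  # memory stream holding one element
    return ET.parse(chunk).getroot()


def iter_spectra(stream):
    """Sequential access: stream through the whole document element by element."""
    stream.seek(0)
    for _event, elem in ET.iterparse(stream, events=("end",)):
        # Real mzML tags carry a namespace prefix; the simplified sample has none.
        if elem.tag == "spectrum":
            yield elem


source = io.BytesIO(SAMPLE)          # stands in for an open binary file handle
index = build_offset_index(source)   # paid once per file
print(read_spectrum(source, index, 1).get("id"))
print([s.get("id") for s in iter_spectra(source)])
```

In this sketch each random read costs a seek plus a small parse, while the sequential reader pays for one pass over the whole file regardless of how many spectra are wanted. That is consistent with the trade-off reported in the abstract, although the actual crossover point depends on the file, the parser and the storage medium.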


Year: 2011 PMID: 21766049 PMCID: PMC3135311 DOI: 10.4172/2153-0602.1000109

Source DB: PubMed Journal: J Data Mining Genomics Proteomics
