Abstract
The application of mass spectrometry (MS) to the analysis of proteomes has enabled the high-throughput identification and abundance measurement of hundreds to thousands of proteins per experiment. However, the formidable informatics challenge associated with analyzing MS data has required a wide variety of data file formats to encode the complex data types associated with MS workflows. These formats encode the input instructions for instruments, the output products of the instruments, and several levels of information and results consumed and produced by the informatics analysis tools. A brief overview of the most common file formats in use today is presented here, along with a discussion of related topics.
Year: 2012 PMID: 22956731 PMCID: PMC3518119 DOI: 10.1074/mcp.R112.019695
Source DB: PubMed Journal: Mol Cell Proteomics ISSN: 1535-9476 Impact factor: 5.911
Fig. 1. Overview graph of the mass spectrometry proteomics formats discussed here. The overall workflow of MS proteomics is depicted by the large shapes and the arrows connecting them. Ovals represent the major data types within the workflow. The small rectangles represent the individual file formats, each connected by an edge to its general data type. Shaded formats are officially approved or soon-to-be-approved standards. Different formats associated with the same data type are not necessarily redundant or equivalent.
List of some of the most prominent software tools and libraries and the formats they support
Most search engines, which all support a variety of formats, are not included.
| Tool | Formats | Reference |
|---|---|---|
| ProteoWizard | mzML, TraML, mzIdentML, mzXML, vendor formats | ( |
| OpenMS | mzML, TraML, mzIdentML, mzData, mzQuantML, et al. | ( |
| Trans-Proteomic Pipeline (TPP) | mzML, mzXML, pepXML, protXML (ProteoWizard) | ( |
| compomics-utilities | MSF, tandem, mzML, omx, dat, FASTA | ( |
| jmzReader | mzML, mzXML, mzData, PRIDE XML, dta, MGF, ms2, pkl | ( |
| jTraML | TraML | ( |
| multiplierz | Vendor formats | ( |
| PEFF Viewer | PEFF | |
| PRIDE Converter 2 | mzTab, PRIDE XML (jmzReader) | ( |
| Mascot & Distiller | MGF, mzML, mzXML, mzIdentML, vendor formats | |
| SpectraST | msp, splib, blib, ASF, mzML, mzXML, pepXML, etc. | ( |
| ProHits | PSI-MI (TPP formats) | ( |
| Anubis | TraML, mzML, mzXML | ( |
| Proteios | TraML, mzML, mzXML | ( |
| Skyline | .sky, .skyd, mzML, mzXML, vendor formats | ( |
| ATAQS | TraML, mzML, mzXML | ( |
| Corra | APML, mzXML | ( |
| Java MIAPE API | PRIDE XML, mzML, mzIdentML, GelML | ( |
Fig. 2. Example of a set of peaks depicted in “profile” mode, as it is collected and commonly written by an instrument; “thresholded” mode, in which values below a certain threshold (or sometimes just zeros) are not written out to save space; and “centroided” mode, wherein only the detected peaks are written. Formats such as mzML can encode any one of these types per spectrum.
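The three modes in Fig. 2 can be illustrated with a minimal sketch in plain Python. Everything here is an assumption made for illustration: the function names `threshold` and `centroid`, the toy `profile` data, and the choice of an intensity-weighted mean as the centroid. Real peak-picking algorithms in instrument and analysis software are considerably more sophisticated, and this is not how any particular tool or format implements them.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)

# Toy profile-mode spectrum (hypothetical values): every sampled point is kept.
profile: List[Peak] = [
    (500.00, 0.0), (500.01, 2.0), (500.02, 10.0), (500.03, 2.0), (500.04, 0.0),
    (500.05, 0.0), (500.06, 1.0), (500.07, 4.0), (500.08, 1.0), (500.09, 0.0),
]


def threshold(points: List[Peak], cutoff: float = 0.0) -> List[Peak]:
    """Thresholded mode: drop points whose intensity is at or below `cutoff`.

    With the default cutoff of 0.0, only the zero-intensity points are dropped.
    """
    return [(mz, inten) for mz, inten in points if inten > cutoff]


def _collapse(run: List[Peak]) -> Peak:
    """Reduce one contiguous run of points to a single (m/z, intensity) peak."""
    total = sum(inten for _, inten in run)
    # Assumption: centroid m/z is the intensity-weighted mean of the run.
    mz = sum(m * inten for m, inten in run) / total
    # Assumption: reported intensity is the apex (maximum) of the run.
    return (mz, max(inten for _, inten in run))


def centroid(points: List[Peak], cutoff: float = 0.0) -> List[Peak]:
    """Centroided mode: one peak per contiguous run of points above `cutoff`."""
    peaks: List[Peak] = []
    run: List[Peak] = []
    for mz, inten in points:
        if inten > cutoff:
            run.append((mz, inten))
        elif run:
            peaks.append(_collapse(run))
            run = []
    if run:  # flush a run that reaches the end of the spectrum
        peaks.append(_collapse(run))
    return peaks


thresholded = threshold(profile)  # zero-intensity points removed
centroided = centroid(profile)    # one entry per detected peak
```

In this toy case the ten profile points shrink to six in thresholded mode and to two in centroided mode, which is the space saving the caption describes. The reduction is lossy: peak shape cannot be recovered from centroided data.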