Yasset Perez-Riverol, Rui Wang, Henning Hermjakob, Markus Müller, Vladimir Vesada, Juan Antonio Vizcaíno.
Abstract
Data processing, management and visualization are central and critical components of a state-of-the-art high-throughput mass spectrometry (MS)-based proteomics experiment, and are often among the most time-consuming steps, especially for labs without much bioinformatics support. The growing interest in the field of proteomics has triggered an increase in the development of new software libraries, including freely available and open-source software. Although the objectives of these libraries and packages vary significantly, from database search analysis to post-processing of the identification results, they usually share a number of features. Common use cases include the handling of protein and peptide sequences, the parsing of output files from various proteomics search engines, and the visualization of MS-related information (including mass spectra and chromatograms). In this review, we provide an overview of the existing software libraries and open-source frameworks, and give information on some of the freely available applications that make use of them. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.
Keywords: (HUPO)-PSI; (Human Proteome Organization) — Proteomics Standards Initiative; AMT; ATAQS; Accurate Mass Tag; Application programming interface; Automated and Targeted Analysis with Quantitative SRM; Bioinformatics; CV; Controlled Vocabulary; DAO; Data Access Object; Databases; EBI; European Bioinformatics Institute; FDR; False Discovery Rate; GUI; Graphical User Interface; ICAT; ICPL; IPTL; ISB; Institute for Systems Biology; Isobaric Peptide Termini Labeling; Isobaric Tag for Relative and Absolute Quantitation; Isotope-Coded Affinity Tags; Isotope-Coded Protein Label; JPL; Java Proteomic Library; LC-MS; LIMS; Laboratory Information Management System; Liquid Chromatography–Mass Spectrometry; MGF; MIAPE; MS; Mascot Generic Format; Mass Spectrometry; Minimum Information About a Proteomics Experiment; Open source software; PASSEL; PRIDE; PRoteomics IDEntifications (database); PSM; PTM; Peptide Spectrum Match; PeptideAtlas SRM Experiment Library; Post-Translational Modifications; Proteomics; RT; Retention Time; SILAC; SRM; Selected Reaction Monitoring; Software libraries; Stable Isotope Labeling by Amino acids in Cell culture; TMT; TOPP; TPP; Tandem Mass Tag; The OpenMS Proteomics Pipeline; Trans-Proteomic Pipeline; emPAI; exponentially modified Protein Abundance Index; iTRAQ
Year: 2013 PMID: 23467006 PMCID: PMC3898926 DOI: 10.1016/j.bbapap.2013.02.032
Source DB: PubMed Journal: Biochim Biophys Acta ISSN: 0006-3002
Fig. 1. Schema of the possible computational processing steps of a proteomics data set.
Libraries for in silico analysis of proteins. Abbreviations: isoelectric point (pI), retention time (RT), sequence digestion (SD), decoy database generation (DDG), support for post-translational modifications (PTM), molecular formula prediction (MFP), FASTA sequence database reader (FD).
| Library | Language | Version | Property prediction | Custom features | Supported formats | URL | Integration | Reference |
|---|---|---|---|---|---|---|---|---|
| BioJava | Java | Legacy 1.8.2 (2012) | | SD | FD | | Maven | |
| compomics-utilities | Java | 3.6.12 (2012) | RT, GRAVY index, isotopic distribution | SD, PTM, sequence pattern filtering, DDG | FD, Mascot dat, X!Tandem XML, OMSSA output, Proteome Discoverer msf files | | Maven | |
| InsilicoSpectro | Perl | 1.3.24 (2008) | RT | SD, PTM | FD, Mascot XML output | | CPAN | |
| Java Proteomic Library (JPL) | Java | 1.0 (2012) | | SD, PTM, MFP | FD | | – | |
| mspire | Ruby | 0.8.2 (2012) | Mass, isotopic distribution | SD, MFP | FD | | – | |
| multiplierz | Python | (2011) | Mass | SD | FD | | – | |
| OpenMS | C++ | 1.9 (2012) | Mass, RT | SD, PTM, DDG | FD, Mascot XML output | | – | |
| pyteomics | Python | 1.2.5 (2012) | | SD | FD | | PyPI | |
| TPP (Trans-Proteomic Pipeline) | C++, Java | 4.6 (2012) | | SD, PTM, proteotypic peptide prediction, DDG | FD | | – | |
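Three of the features compared in the table (FD, SD and DDG) can be illustrated with a minimal, standard-library-only Python sketch. It does not use the API of any library listed above. The function names, the simplified trypsin rule (cleave after K or R unless the next residue is P), the sequence-reversal decoy strategy and the `DECOY_` prefix are all assumptions made for illustration.

```python
from typing import Iterator, List, Tuple


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text (FD)."""
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def tryptic_digest(sequence: str, missed_cleavages: int = 0) -> List[str]:
    """In silico digestion (SD) with a simplified trypsin rule:
    cleave after K or R unless the next residue is P."""
    pieces, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])  # C-terminal peptide

    # Join adjacent pieces to model up to `missed_cleavages` missed sites.
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides


def reversed_decoy(header: str, sequence: str,
                   prefix: str = "DECOY_") -> Tuple[str, str]:
    """Decoy database generation (DDG) by reversing the sequence.
    The prefix is a hypothetical naming convention."""
    return prefix + header, sequence[::-1]


if __name__ == "__main__":
    # Made-up protein entry, for illustration only.
    fasta = ">example_protein\nMKWVTFISLLRPAGK\nAAR\n"
    for header, seq in read_fasta(fasta):
        print(header, tryptic_digest(seq, missed_cleavages=1))
        print(reversed_decoy(header, seq))
```

Real libraries typically support further enzymes, cleavage exceptions and decoy strategies (for example shuffling), so this sketch only conveys the general idea behind the table's abbreviations.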
Fig. 2. (A) Evolution of mass spectrometry file formats. (B) Schema of the PRIDE tool suite (PRIDE Converter 2 and PRIDE Inspector).
Software libraries to read (r) and write (w) MS-based information from different file formats.
| Library | Language | mzML | mzXML | mzData | Peak list files | Search engine output files | mzIdentML | mzTab | FASTA | PRIDE XML | URL | Integration | Reference |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| compomics-utilities | Java | – | – | – | r/w (mgf) | r (OMSSA, Mascot, X!Tandem) | – | – | r/w | r/w | | Maven | |
| jmzIdentML | Java | – | – | – | – | – | r/w | – | – | – | | Maven | |
| jmzML | Java | r/w | – | – | – | – | – | – | – | – | | Maven | |
| jmzReader | Java | r | r | r | r (mgf, pkl, ms2, dta) | – | – | – | – | – | | Maven | |
| jmzTab | Java | – | – | – | – | – | – | r/w | – | – | | Maven | |
| JRAP | Java | – | r/w | – | – | – | – | – | – | – | | – | |
| MGFp | C++ | – | – | – | – | r (Mascot) | – | – | – | – | | – | |
| OpenMS | C++ | r/w | r/w | r/w | – | r (Mascot, Sequest, OMSSA, X!Tandem) | – | – | r | – | | – | |
| PRIDE Converter 2 | Java | r | r | r | r (mgf, pkl, ms2, dta) | r (Mascot, X!Tandem, OMSSA, SpectraST, CRUX, MSGF, Proteome Discoverer) | r | – | r | r/w | | Maven | |
| ProteoWizard | C++ | r/w | r/w | – | r/w (mgf, ms2) | – | r/w | – | – | – | | – | |
| pymzML | Python | r/w | – | – | – | – | – | – | – | – | | PyPI | |
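To give a sense of what reading a peak list file involves, here is a minimal Python sketch of a reader for the Mascot Generic Format (MGF), one of the peak list formats in the table. It is a simplified illustration, not the implementation of any listed library: it handles only `BEGIN IONS`/`END IONS` blocks, `KEY=value` parameter lines and whitespace-separated `m/z intensity` peak lines. The function name and the dictionary layout are assumptions.

```python
from typing import Any, Dict, Iterator


def read_mgf(text: str) -> Iterator[Dict[str, Any]]:
    """Yield one dict per spectrum with 'params' and 'peaks' entries."""
    spectrum = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line == "BEGIN IONS":
            spectrum = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if spectrum is not None:
                yield spectrum
            spectrum = None
        elif spectrum is not None:
            if "=" in line and not line[0].isdigit():
                # Parameter line such as TITLE=..., PEPMASS=..., CHARGE=...
                key, value = line.split("=", 1)
                spectrum["params"][key.upper()] = value
            else:
                # Peak line: m/z, intensity, optionally further columns.
                fields = line.split()
                if len(fields) >= 2:
                    spectrum["peaks"].append(
                        (float(fields[0]), float(fields[1])))


if __name__ == "__main__":
    # Made-up spectrum, for illustration only.
    example = (
        "BEGIN IONS\n"
        "TITLE=example scan\n"
        "PEPMASS=500.25 1200.0\n"
        "CHARGE=2+\n"
        "100.1 10\n"
        "200.2 20.5\n"
        "END IONS\n"
    )
    for spec in read_mgf(example):
        print(spec["params"]["TITLE"], len(spec["peaks"]), "peaks")
```

XML-based formats such as mzML or mzIdentML carry far more metadata and controlled-vocabulary terms, which is one reason dedicated reader and writer libraries exist for them.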
Software packages for pre-processing MS-based proteomics and metabolomics data.
| Library | Language | File formats | Spectrum normalization | Spectrum clustering | Deconvolution | Spectrum alignment | Spectrum quality assessment | URL | Reference |
|---|---|---|---|---|---|---|---|---|---|
| maltcms | Java | mzML, mzXML, mzData | X | X | | | | | |
| mMass | Python | mzML, mzXML, mzData, MGF | X | X | | | | | |
| msInspect | Java | mzXML | X | X | | | | | |
| mzMine2 | Java | mzML, mzXML, mzData | X | | | | | | |
| OpenMS | C++ | mzML, mzXML, mzData | X | X | X | | | | |
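As an illustration of the first processing method in the table, spectrum normalization, the following Python sketch rescales peak intensities in two common ways: relative to the most intense (base) peak and relative to the summed intensity of the spectrum. The function names and the list-of-tuples spectrum representation are assumptions, and the listed packages may implement normalization differently.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def normalize_to_base_peak(peaks: List[Peak], scale: float = 100.0) -> List[Peak]:
    """Rescale intensities so that the most intense peak equals `scale`."""
    if not peaks:
        return []
    base = max(intensity for _, intensity in peaks)
    if base <= 0:
        return [(mz, 0.0) for mz, _ in peaks]
    return [(mz, scale * intensity / base) for mz, intensity in peaks]


def normalize_to_total(peaks: List[Peak]) -> List[Peak]:
    """Rescale intensities so that they sum to 1 across the spectrum."""
    total = sum(intensity for _, intensity in peaks)
    if total <= 0:
        return [(mz, 0.0) for mz, _ in peaks]
    return [(mz, intensity / total) for mz, intensity in peaks]


if __name__ == "__main__":
    # Made-up peaks, for illustration only.
    spectrum = [(100.0, 50.0), (200.0, 200.0)]
    print(normalize_to_base_peak(spectrum))
    print(normalize_to_total(spectrum))
```

Normalization of this kind makes intensities comparable between spectra before steps such as clustering or alignment.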
Fig. 3. Classification of MS-based quantification methods including the open-source packages available for each of them.