Arnold Kuzniar, Roland Kanaar.
Abstract
We present the Proteomics Identifications and Quantitations Data Management and Integration Service (PIQMIe), which aids in reliable and scalable data management, analysis and visualization of semi-quantitative mass spectrometry-based proteomics experiments. PIQMIe readily integrates peptide and (non-redundant) protein identifications and quantitations from multiple experiments with additional biological information on the protein entries, and makes the linked data available in the form of a light-weight relational database, which enables dedicated data analyses (e.g. in R) and user-driven queries. Using the web interface, users are presented with a concise summary of their proteomics experiments in numerical and graphical forms, as well as with a searchable protein grid and interactive visualization tools to aid in the rapid assessment of the experiments and in the identification of proteins of interest. The web server not only provides data access through a web interface but also supports programmatic access through a RESTful web service. The web server is available at http://piqmie.semiqprot-emc.cloudlet.sara.nl or http://www.bioinformatics.nl/piqmie. This website is free and open to all users and there is no login requirement.
Year: 2014 PMID: 24861615 PMCID: PMC4086067 DOI: 10.1093/nar/gku478
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Computational proteomics workflow including the PIQMIe service. Before using the service, the semi-quantitative MS data are analyzed by the MaxQuant software. The resulting files, i.e. the peptide list (‘evidence.txt’) and the protein list (‘proteinGroups.txt’), are uploaded to the server through the submission web page, together with the FASTA sequence library used in the search. PIQMIe then populates an SQLite database called the Integrated Proteomics database (IPdb), and makes the linked data accessible through (i) a local SQL interface, (ii) a remote RESTful web service or (iii) a web browser.
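The local SQL route in (i) can be illustrated with a minimal sketch using Python's standard `sqlite3` module. The record does not describe the IPdb schema, so the table and column names below (`protein_group`, `peptide`, `grp_id`, `leading_acc`, `seq`) are hypothetical stand-ins, the peptide sequences and the accession `X00001` are placeholders, and an in-memory database takes the place of a downloaded IPdb file.

```python
import sqlite3

# In-memory stand-in for a downloaded IPdb file; with a real file one would
# pass its path to sqlite3.connect() instead.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: NOT the actual IPdb layout, which is not given here.
conn.executescript("""
CREATE TABLE protein_group (
    grp_id      INTEGER PRIMARY KEY,
    leading_acc TEXT
);
CREATE TABLE peptide (
    pep_id INTEGER PRIMARY KEY,
    grp_id INTEGER REFERENCES protein_group(grp_id),
    seq    TEXT
);
""")

# Placeholder rows; only the group ID 596 and accession P14618-2 are taken
# from the Figure 3 caption, everything else is made up for illustration.
conn.executemany(
    "INSERT INTO protein_group VALUES (?, ?)",
    [(596, "P14618-2"), (597, "X00001")],
)
conn.executemany(
    "INSERT INTO peptide (grp_id, seq) VALUES (?, ?)",
    [(596, "PEPTIDEK"), (596, "SAMPLER")],
)

# User-driven query: number of identified peptides per protein group.
rows = conn.execute("""
SELECT g.grp_id, g.leading_acc, COUNT(p.pep_id) AS n_pep
FROM protein_group AS g
LEFT JOIN peptide AS p ON p.grp_id = g.grp_id
GROUP BY g.grp_id
ORDER BY g.grp_id
""").fetchall()

for grp_id, acc, n_pep in rows:
    print(grp_id, acc, n_pep)
```

Because the database is a single SQLite file, the same kind of query could presumably be run from R or any other environment with an SQLite driver.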
Figure 2. Overall summary of the SILAC ECM and MV experiments at the peptide and protein levels using bar charts: (A) database-dependent protein identifications; (B) non-redundant protein (groups) identifications and quantitations; (C) potentially regulated non-redundant proteins (FC ≥ 1.5; P value < 0.05); (D) peptide identifications and quantitations. Full description of the abbreviations used in the bar charts: n_prot_acc, number of protein accessions including isoforms in the source database (or FASTA sequence library); n_prot_ids, number of MS-based protein identifications including splice isoforms, filtered for decoys and contaminants; n_prot_acc_evid_protein, number of protein accessions with protein-level evidence; n_prot_acc_evid_transcript, number of accessions with transcript-level evidence; n_prot_acc_evid_homology, number of accessions with homology-based evidence; n_prot_acc_evid_predicted, number of accessions predicted in silico; n_prot_acc_evid_uncertain, number of accessions with uncertain evidence; n_pgrp_ids, number of non-redundant protein identifications, filtered for decoys and contaminants; n_pgrp_qts, number of non-redundant protein quantitations; n_pgrp_ids_by_site, number of non-redundant proteins identified by modification site; n_pgrp_decoys, number of non-redundant proteins detected as decoys (false positives); n_pgrp_conts, number of non-redundant proteins detected as contaminants; n_pgrp_ids, union of differentially regulated proteins identified in all conditions, filtered for decoys and contaminants; n_pgrp_ids_H/L+L/H, number of up- AND down-regulated proteins identified in both conditions H/L and L/H; n_pgrp_ids_H/L, number of up- OR down-regulated proteins identified in the H/L condition; n_pgrp_ids_L/H, number of up- OR down-regulated proteins identified in the L/H condition; n_pep_ids, number of redundant peptide identifications, filtered for decoys and contaminants; n_pep_qts, number of redundant peptide quantitations; n_unq_pep_seq+mod_ids, number of non-redundant peptide identifications unique by sequence and modifications; n_unq_pep_seq+mod_qts, number of non-redundant peptide quantitations unique by sequence and modifications; n_unq_pep_seq_ids, number of non-redundant peptide identifications unique by sequence; n_unq_pep_seq_qts, number of non-redundant peptide quantitations unique by sequence; n_pep_ids_decoys, number of redundant peptides detected as decoys (false positives); n_pep_ids_conts, number of redundant peptides detected as contaminants; n_unq_pep_seq_decoys, number of non-redundant peptide decoys unique by sequence; n_unq_pep_seq_conts, number of non-redundant peptide contaminants unique by sequence. For the exact values shown in the bar charts, refer to the tabulated data in the Supplementary Tables S2–S5.
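The counts in panel (C) can be illustrated with a small sketch of the filtering and set logic. It assumes that the fold-change cutoff is applied symmetrically (at least 1.5-fold up or down), which the caption does not state explicitly; the function name `is_regulated`, the dictionaries `hl` and `lh`, and all ratios and P values are hypothetical and are not data from the study.

```python
import math

FC_CUTOFF = 1.5   # fold-change cutoff from the caption (FC >= 1.5)
P_CUTOFF = 0.05   # significance cutoff from the caption (P value < 0.05)


def is_regulated(ratio, p_value, fc=FC_CUTOFF, alpha=P_CUTOFF):
    """Return True if the ratio is at least fc-fold up OR down and p < alpha.

    Assumption: the cutoff is symmetric on the log2 scale.
    """
    if ratio is None or p_value is None or ratio <= 0:
        return False
    return abs(math.log2(ratio)) >= math.log2(fc) and p_value < alpha


# Made-up example data: protein group ID -> (SILAC ratio, P value).
hl = {1: (2.0, 0.01), 2: (1.2, 0.001), 3: (0.5, 0.03), 4: (1.6, 0.20)}
lh = {1: (0.4, 0.02), 2: (3.0, 0.01), 3: (1.1, 0.50), 5: (0.6, 0.04)}

reg_hl = {g for g, (r, p) in hl.items() if is_regulated(r, p)}
reg_lh = {g for g, (r, p) in lh.items() if is_regulated(r, p)}

counts = {
    "union": len(reg_hl | reg_lh),  # regulated in any condition
    "both": len(reg_hl & reg_lh),   # regulated in both H/L and L/H
    "H/L": len(reg_hl),             # regulated in the H/L condition
    "L/H": len(reg_lh),             # regulated in the L/H condition
}
print(counts)
```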
Figure 3. Interactive visualization and query tools available through the PIQMIe web interface. (A) 2D scatterplots of protein quantitations from reciprocal SILAC ECM experiments before (left) and after (right) applying the fold-change and intensity-based significance B cutoffs (FC ≥ 1.5; P value < 0.05). The plots are divided into four quadrants: the 1st and 3rd quadrants contain proteins that are inconsistently up- or down-regulated in the reciprocal experiments (false positives), whereas the 2nd and 4th quadrants contain proteins that are consistently up- and down-regulated by activin A signaling, respectively. For example, UGP2 is consistently up-regulated in both ECM experiments. The scatterplots are also accompanied by the Pearson correlation coefficient (r) computed for a pair of (reciprocal) SILAC experiments to aid in assessing the reproducibility of the replicate experiments. (B) Searchable protein grid with a query builder that enables filtering of the tabulated data by applying Boolean and/or relational operators to one or more columns of the grid. An example query is shown that selects proteins with consistent (normalized) SILAC ratios from the set of potentially regulated proteins (FC ≥ 2; P value < 0.05) annotated as ‘kinase’. (C) Peptide coverage map showing the location and distribution of identified peptides (in red) within their parent proteins of a group. In the group (ID: 596), three protein entries belong to the manually curated UniProtKB/Swiss-Prot section (in blue and green), including the ‘leading’ or best-scoring protein (accession: P14618-2, in green), as identified by the MaxQuant/Andromeda search, while the remaining nine proteins belong to the automatically annotated (unreviewed) UniProtKB/TrEMBL section (in gray). In addition, users can choose which experiments to view in the map. Note: All protein groups, accessions and peptides reported in the web interface are provided with hyperlinks to the appropriate site.
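The quadrant reading of the scatterplots in panel (A) can be sketched as follows. The sketch assumes that both axes show log2-transformed H/L ratios and that one of the two experiments is label-swapped, so that a consistently regulated protein has log ratios of opposite sign; the caption does not specify the axes, and the helper names `quadrant`, `is_consistent` and `pearson_r` are hypothetical.

```python
import math


def quadrant(x, y):
    """Return the quadrant (1-4) of the point (x, y), or 0 if it lies on an axis."""
    if x > 0 and y > 0:
        return 1
    if x < 0 and y > 0:
        return 2
    if x < 0 and y < 0:
        return 3
    if x > 0 and y < 0:
        return 4
    return 0


def is_consistent(x, y):
    """Consistent in a label-swap design: log ratios of opposite sign (quadrant 2 or 4)."""
    return quadrant(x, y) in (2, 4)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Made-up log2 ratios for three proteins in two reciprocal experiments.
exp1 = [1.0, 2.0, 3.0]
exp2 = [-1.0, -2.0, -3.0]
print([is_consistent(x, y) for x, y in zip(exp1, exp2)])
print(pearson_r(exp1, exp2))
```

Under this assumption, a well-reproduced pair of reciprocal experiments would give a strongly negative r; if the ratios of the swapped experiment were inverted before plotting, the sign convention would flip.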