Chris Bauer, Frank Kleinjung, Dorothea Rutishauser, Christian Panse, Alexandra Chadt, Tanja Dreja, Hadi Al-Hasani, Knut Reinert, Ralph Schlapbach, Johannes Schuchhardt.
Abstract
BACKGROUND: The recent development of novel technologies has paved the way for quantitative proteomics. One of the most important among them is iTRAQ, which employs isobaric tags for relative or absolute quantitation. Despite considerable progress in technology development, many challenges remain in deriving and interpreting quantitative results. One of these challenges is the consistent assignment of peptides to proteins.
Year: 2012 PMID: 22340093 PMCID: PMC3368728 DOI: 10.1186/1471-2105-13-34
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Peptide heterogeneity. The protein accession 40S ribosomal protein S30 (RS_30) serves as an example of peptide heterogeneity. Every line represents a unique peptide profile (peptide-spectrum match) identified as originating from the RS_30 protein. iTRAQ ratios are calculated using the 116 channel (SJL mouse with standard diet) as reference. For every ratio, a box plot showing the lower quartile, median and upper quartile is drawn. The quantitation ratios are especially heterogeneous for the 117/116 ratio (NZO mouse with high-fat diet), ranging from -0.5 to +1 on the log2 scale (corresponding to a 1.4-fold down-regulation and a 2-fold up-regulation, respectively).
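The correspondence between the ratio range and the fold changes quoted in the Figure 1 caption can be checked with a short Python sketch. It assumes the ratios are on a log2 scale, which is what makes -0.5 and +1 match the stated 1.4-fold and 2-fold changes; the function name `fold_change` is hypothetical and not taken from the paper.

```python
def fold_change(log2_ratio: float) -> tuple[float, str]:
    """Convert a log2 ratio into a fold-change magnitude and a direction."""
    magnitude = 2.0 ** abs(log2_ratio)
    if log2_ratio > 0:
        direction = "up"
    elif log2_ratio < 0:
        direction = "down"
    else:
        direction = "unchanged"
    return magnitude, direction


# The two extremes quoted for the 117/116 ratio in Figure 1.
for value in (-0.5, 1.0):
    magnitude, direction = fold_change(value)
    print(f"log2 ratio {value:+.1f}: {magnitude:.2f}-fold {direction}")
```

Reporting the magnitude separately from the direction keeps down-regulation readable as "1.41-fold down" rather than as a ratio of 0.71.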
Figure 2. Workflow. Standard workflow of proteomics data evaluation (left-hand side) compared to the PPINGUIN workflow presented in our manuscript (right-hand side). The starting point for both workflows is the mzML [48] file containing the spectral peak data. In contrast to the standard workflow, we employ clustering as a very early step prior to protein inference. This splits the spectra into different groups. Quantitation and identification are performed independently for each group. The result is a list of identified and quantified proteins ready for downstream analysis.
Experimental Design
| | NZO_SD | NZO_HF | SJL_SD | SJL_HF |
|---|---|---|---|---|
| Exp 1 | mouse:1 - channel:114 | mouse:4 - channel:117 | mouse:7 - channel:116 | mouse:10 - channel:115 |
| Exp 2 | mouse:2 - channel:115 | mouse:5 - channel:114 | mouse:8 - channel:116 | mouse:11 - channel:117 |
| Exp 3 | mouse:3 - channel:116 | mouse:6 - channel:115 | mouse:9 - channel:117 | mouse:12 - channel:114 |
Experimental design and iTRAQ labeling (114 - 117) for three experimental replications (Exp 1, Exp 2 and Exp 3). For every distinct combination of genotype and diet, three different mice are used.
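Because the channel assigned to each condition changes between replications, the reference channel must be looked up per experiment. The following Python sketch encodes the design table as a dictionary and computes log2 ratios against the SJL_SD reference. The channel assignments come from the table; the function name `log2_ratios` and the reporter intensities in the usage example are hypothetical.

```python
import math

# Channel assignments copied from the Experimental Design table.
DESIGN = {
    "Exp 1": {"NZO_SD": 114, "NZO_HF": 117, "SJL_SD": 116, "SJL_HF": 115},
    "Exp 2": {"NZO_SD": 115, "NZO_HF": 114, "SJL_SD": 116, "SJL_HF": 117},
    "Exp 3": {"NZO_SD": 116, "NZO_HF": 115, "SJL_SD": 117, "SJL_HF": 114},
}
REFERENCE = "SJL_SD"


def log2_ratios(experiment: str, intensities: dict[int, float]) -> dict[str, float]:
    """Return log2(condition / reference) for one set of reporter intensities."""
    channels = DESIGN[experiment]
    reference_intensity = intensities[channels[REFERENCE]]
    return {
        condition: math.log2(intensities[channel] / reference_intensity)
        for condition, channel in channels.items()
        if condition != REFERENCE
    }


# Hypothetical reporter-ion intensities for a single spectrum in Exp 1.
example = {114: 200.0, 115: 100.0, 116: 100.0, 117: 400.0}
print(log2_ratios("Exp 1", example))
```

Keying the ratios by condition rather than by channel number lets results from the three replications be compared directly, even though the reference moves from channel 116 (Exp 1 and Exp 2) to channel 117 (Exp 3).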
Experimental Reproducibility
| Statistic | Ratio | MASCOT | X!Tandem/OpenMS | PPINGUIN |
|---|---|---|---|---|
| CV | NZO_SD/SJL_SD | 0.13 | 0.12 | 0.10 |
| CV | NZO_HFD/SJL_SD | 0.17 | 0.16 | 0.14 |
| CV | SJL_HFD/SJL_SD | 0.18 | 0.17 | 0.15 |
| StDev | NZO_SD/SJL_SD | 0.19 | 0.17 | 0.14 |
| StDev | NZO_HFD/SJL_SD | 0.25 | 0.22 | 0.20 |
| StDev | SJL_HFD/SJL_SD | 0.24 | 0.24 | 0.21 |
Experimental reproducibility using the analysis methods investigated (columns). For the 3 experimental ratios (NZO_SD/SJL_SD, NZO_HFD/SJL_SD and SJL_HFD/SJL_SD) the mean coefficient of variation (CV) and the mean standard deviation for log2 quantitation ratios (see Methods) of all proteins are stated.
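A minimal Python sketch of how such summary values could be obtained is given below. The exact definitions are in the Methods, which are not reproduced here, so the sketch rests on assumptions: the standard deviation is taken per protein over the log2 ratios of the three replications, the CV is taken per protein on the back-transformed (linear) ratios, and both are then averaged over all proteins. The function name `reproducibility` and the input values are hypothetical.

```python
from statistics import mean, stdev


def reproducibility(log2_by_protein: dict[str, list[float]]) -> tuple[float, float]:
    """Return (mean CV, mean standard deviation) over all proteins.

    Assumption: StDev is computed on log2 ratios, CV on linear-scale ratios.
    """
    cvs, sds = [], []
    for replicates in log2_by_protein.values():
        sds.append(stdev(replicates))  # sample standard deviation of log2 ratios
        linear = [2.0 ** value for value in replicates]
        cvs.append(stdev(linear) / mean(linear))  # CV on back-transformed ratios
    return mean(cvs), mean(sds)


# Hypothetical log2 ratios for two proteins across three replications.
data = {
    "protein_a": [0.10, 0.25, -0.05],
    "protein_b": [-0.90, -1.10, -0.95],
}
mean_cv, mean_sd = reproducibility(data)
print(f"mean CV = {mean_cv:.3f}, mean StDev = {mean_sd:.3f}")
```

Lower values of either statistic indicate that the ratios of a protein agree more closely across replications, which is how the columns of the table are compared.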
Figure 3. Venn diagram. Venn diagram visualizing the number of significantly identified protein accessions using the three different approaches: MASCOT, X!Tandem/OpenMS and PPINGUIN. We refer to protein accessions identified in all three experimental replications of the diabetes dataset (see Methods).
Figure 4. Sample peptide profiles. Peptide quantitation profiles obtained with the three approaches (rows) for three example proteins (columns). The rows correspond to the applied method: first row = MASCOT, second row = X!Tandem and OpenMS, last row = PPINGUIN. Each individual plot shows ratio profiles of peptides uniquely assigned to the corresponding protein. For every ratio, a box plot showing the lower quartile, median and upper quartile is drawn.
Figure 5. Ribosomal proteins. Upper part: Quantitation profiles of the unique peptides assigned to the ribosomal protein RS30 detected in the first experiment. Labels represent the samples: 114 - NZO_SD; 115 - SJL_HF; 116 - SJL_SD and 117 - NZO_HF. Orange and blue indicate clusters 1 and 4, in which the peptides were identified. Lower part: Protein sequence with positions of mapped peptides. Vertical bars displayed on the x-axis indicate predicted trypsin cleavage sites. The non-tryptic peptide was found with the X!Tandem option 'refine unanticipated cleavages'.
Accordance with prior knowledge
| Protein ID | Description | log2 Ratio | P-Value | #Peptides | X!Tandem Score |
|---|---|---|---|---|---|
| Q9Z204 | heterogeneous nuclear ribonucleoprotein C | 1.21 | 0.158 | 2 | 2.8 |
| O35490 | betaine-homocysteine methyltransferase | -0.979 | 0.00148 | 24 | 59.6 |
| P33267 | cytochrome P450, family 2, subfamily f* | -0.857 | 0.131 | 3 | 10.1 |
| P97872 | flavin containing monooxygenase 5 | -0.799 | 0.0425 | 3 | 10.6 |
| Q91V92 | ATP citrate lyase* | 0.72 | 0.231 | 5 | 9.4 |
| Q9Z2V4 | phosphoenolpyruvate carboxykinase 1* | -0.706 | 0.0782 | 2 | 6.8 |
| Q8VCH0 | acetyl-Coenzyme A acyltransferase 1B | 0.693 | 0.0318 | 2 | 6.1 |
| P10649 | glutathione S-transferase, mu 1* | -0.689 | 0.105 | 5 | 8.3 |
| P01942 | hemoglobin alpha, adult chain 1 | 0.678 | 0.359 | 16 | 16.2 |
| P70694 | aldo-keto reductase family 1* | -0.634 | 0.0245 | 6 | 17.7 |
| Q9CPY7 | leucine aminopeptidase 3 | -0.629 | 0.0926 | 4 | 17.5 |
| Q8R0Y6 | aldehyde dehydrogenase 1 family, member L1 | -0.566 | 0.0747 | 5 | 15.6 |
| P12710 | fatty acid binding protein 1* | 0.524 | 0.221 | 17 | 50.8 |
| P53657 | pyruvate kinase liver and red blood cell* | 0.51 | 0.278 | 8 | 35.3 |
Top list of differentially regulated proteins identified using PPINGUIN. Proteins marked with an asterisk (*) have previously been associated with diabesity [44]. P-values are calculated using a one-sample t-test (null hypothesis: log2(NZO_HFD/SJL_SD) = 0). P-values are not used as a criterion for differential expression and are not corrected for multiple testing. With a larger number of replicates in future studies, the significance of the p-values may improve.
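The one-sample t-test named in the caption can be sketched in plain Python. The sketch assumes one log2 ratio per experimental replication, so n = 3 and the test has 2 degrees of freedom; for that case the Student t distribution has a closed-form CDF, which avoids any external library. The function name `one_sample_t_test_n3` and the input values are hypothetical, and this is not the authors' implementation.

```python
import math
from statistics import mean, stdev


def one_sample_t_test_n3(log2_ratios: list[float]) -> tuple[float, float]:
    """Two-sided one-sample t-test of H0: mean log2 ratio = 0, for exactly n = 3.

    With 2 degrees of freedom the t CDF is F(t) = 1/2 + t / (2 * sqrt(2 + t^2)),
    so the two-sided p-value is 1 - |t| / sqrt(2 + t^2).
    """
    if len(log2_ratios) != 3:
        raise ValueError("closed form used here is only valid for n = 3")
    standard_error = stdev(log2_ratios) / math.sqrt(3)
    if standard_error == 0:
        raise ValueError("t statistic is undefined when all values are identical")
    t = mean(log2_ratios) / standard_error
    p = 1.0 - abs(t) / math.sqrt(2.0 + t * t)
    return t, p


# Hypothetical log2(NZO_HFD/SJL_SD) values of one protein in Exp 1 to Exp 3.
t_stat, p_value = one_sample_t_test_n3([-0.95, -1.02, -0.97])
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

With only two degrees of freedom, a small p-value requires very tight agreement between replications, which is consistent with the caption's remark that more replicates may improve significance.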