Gorka Prieto, Kerman Aloria, Nerea Osinalde, Asier Fullaondo, Jesus M Arizmendi, Rune Matthiesen.
Abstract
BACKGROUND: Protein inference from peptide identifications in shotgun proteomics must deal with ambiguities that arise from peptides shared between different proteins, which is common in higher eukaryotes. Recently, data-independent acquisition (DIA) approaches have emerged as an alternative to the traditional data-dependent acquisition (DDA) in shotgun proteomics experiments. MSE is the name of one of the DIA approaches used in QTOF instruments. MSE data require specialized software to process the acquired spectra and to perform peptide and protein identifications. However, the software currently available does not group the identified proteins in a transparent way that takes peptide evidence categories into account. Furthermore, inspecting, comparing and reporting the results requires tedious manual intervention. Here we report a software tool that addresses these limitations for MSE data.
Year: 2012 PMID: 23126499 PMCID: PMC3548767 DOI: 10.1186/1471-2105-13-288
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Classification of proteins into four evidence groups. Nine different proteins (Protein A to Protein I) comprising six different peptides (boxes 1 to 6) are used to illustrate all possible scenarios. Conclusive proteins have at least one unique peptide; indistinguishable proteins have the same peptides, including at least one discriminating peptide; an ambiguous group is formed by proteins sharing discriminating peptides; non-conclusive proteins have only non-discriminating peptides.
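The grouping rules in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PAnalyzer's implementation (the tool itself targets .NET). The peptide categories are taken as already assigned, the names `classify_proteins`, `protein_peptides` and `peptide_category` are hypothetical, and a protein with discriminating peptides is labelled indistinguishable whenever another such protein has an identical peptide set, and ambiguous otherwise.

```python
from collections import defaultdict


def classify_proteins(protein_peptides, peptide_category):
    """Assign an evidence group to each protein from pre-labelled peptides.

    protein_peptides: dict mapping protein id -> set of peptide ids
    peptide_category: dict mapping peptide id -> "unique",
                      "discriminating" or "non-discriminating"
    (Both structures are assumptions made for this sketch.)
    """
    evidence = {}
    for protein, peptides in protein_peptides.items():
        categories = {peptide_category[p] for p in peptides}
        if "unique" in categories:
            # At least one unique peptide: conclusive.
            evidence[protein] = "conclusive"
        elif "discriminating" not in categories:
            # Only non-discriminating peptides: non-conclusive.
            evidence[protein] = "non-conclusive"
        else:
            # Has discriminating peptides; resolved in the next step.
            evidence[protein] = None

    # Unresolved proteins with identical peptide sets cannot be told apart.
    by_peptide_set = defaultdict(list)
    for protein, label in evidence.items():
        if label is None:
            by_peptide_set[frozenset(protein_peptides[protein])].append(protein)

    for members in by_peptide_set.values():
        label = "indistinguishable" if len(members) > 1 else "ambiguous"
        for protein in members:
            evidence[protein] = label
    return evidence


# Toy data, invented for illustration only.
example_proteins = {
    "A": {1, 7},
    "B": {2, 3},
    "C": {2, 3},
    "D": {4, 5},
    "E": {5, 6},
    "F": {4, 6},
    "G": {7},
}
example_categories = {
    1: "unique",
    2: "discriminating",
    3: "discriminating",
    4: "discriminating",
    5: "discriminating",
    6: "discriminating",
    7: "non-discriminating",
}
example_evidence = classify_proteins(example_proteins, example_categories)
```

In this toy input, protein A is conclusive, B and C are indistinguishable, D, E and F fall into the ambiguous category, and G is non-conclusive. The sketch only labels proteins. Collecting ambiguous proteins into groups by shared discriminating peptides would need an extra step.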
Figure 2. Peptide classification algorithm. Algorithm used by PAnalyzer to classify peptides before grouping the proteins into the different evidence groups.
Figure 3. Replicate run example. Four different proteins (A to D) comprising five peptides (1 to 5) are used to illustrate the peptide voting and peptide merging approaches in a multiple run analysis example. The result for each individual sample (single run analysis) is also indicated.
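One way to picture peptide voting across replicates is a simple count against a runs threshold. The sketch below is an assumption-laden illustration, not the tool's code. It assumes that a peptide is kept when it is detected in at least `runs_threshold` runs, so that a threshold of 1 amounts to merging all runs. The function name `vote_peptides` and the input layout are hypothetical.

```python
from collections import Counter


def vote_peptides(runs, runs_threshold):
    """Return the peptides detected in at least `runs_threshold` runs.

    runs: list of iterables, one per replicate, holding peptide ids
    runs_threshold: minimum number of runs in which a peptide must appear
                    (assumed meaning of the threshold in this sketch)
    """
    # Count each peptide once per run, even if it is listed several times.
    counts = Counter(peptide for run in runs for peptide in set(run))
    return {peptide for peptide, n in counts.items() if n >= runs_threshold}


# Toy replicate data, invented for illustration only.
example_runs = [{1, 2}, {2, 3}, {2, 3, 4}]
merged = vote_peptides(example_runs, 1)   # union of all runs
voted = vote_peptides(example_runs, 2)    # seen in two or more runs
```

Under this reading, raising the threshold can only shrink the peptide set, and proteins left without supporting peptides would be the ones reported as filtered.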
Figure 4. Block diagram for running PAnalyzer. PAnalyzer can read one or more ProteinLynx Global Server output files and mzIdentML files, and exports the reorganized protein list to CSV, HTML and mzIdentML. The tool runs on every platform where a .NET version 4 compliant CLR is available.
Figure 5. HTML output. Example of the protein details section of the HTML output. This conclusive protein has one unique peptide and two non-discriminating peptides. The unique peptide has been detected in 3 out of the 5 replicates (run numbers 2, 3 and 4). No PTMs have been detected in peptides number 459 and 1567, while two peptide variants (different PTMs) have been identified for peptide number 2621. A number is used for unique peptides, a “*” is appended to discriminating peptides, and a “**” to non-discriminating peptides.
Comparison of results reported by PLGS and PAnalyzer
| Single run analysis | | 1 | 2 | 3 | 4 | 5 |
| PLGS | Hit | 143 | 161 | 144 | 129 | 134 |
| PLGS | Protein | 305 | 338 | 324 | 293 | 307 |
| PAnalyzer | Conclusive | 173 | 178 | 179 | 161 | 157 |
| PAnalyzer | Indistinguishable | 77 (29) | 94 (39) | 82 (32) | 63 (26) | 78 (31) |
| PAnalyzer | Ambiguous group | 0 | 4 (1) | 12 (3) | 0 | 18 (2) |
| PAnalyzer | Non-conclusive | 55 | 62 | 51 | 69 | 54 |
| PAnalyzer | Filtered | 0 | 0 | 0 | 0 | 0 |
| Multiple run analysis | | Th1 | Th2 | Th3 | Th4 | Th5 |
| PLGS | Hit | - | - | - | - | - |
| PLGS | Protein | - | - | - | - | - |
| PAnalyzer | Conclusive | 335 | 161 | 100 | 69 | 54 |
| PAnalyzer | Indistinguishable | 74 (37) | 103 (38) | 106 (33) | 104 (31) | 84 (28) |
| PAnalyzer | Ambiguous group | 11 (3) | 9 (2) | 12 (2) | 11 (2) | 24 (2) |
| PAnalyzer | Non-conclusive | 98 | 81 | 82 | 66 | 65 |
| PAnalyzer | Filtered | 0 | 164 | 218 | 268 | 291 |
Total protein extracts from HEK 293T cells were analyzed in 5 replicate runs. In single run analysis, the number in the top row indicates the run identifier, while in multiple run analysis it indicates the runs threshold. The ProteinLynx Global Server (PLGS) rows show the number of hits and proteins reported, while the PAnalyzer rows show the number of proteins reported in each of the evidence categories presented in the paper. For indistinguishable and ambiguous proteins, the number of groups is given in parentheses. PLGS cannot perform multiple run analysis. The peptide score filter is low confidence (red).