Julian Uszkoreit, Nicole Plohnke, Sascha Rexroth, Katrin Marcus, Martin Eisenacher.
Abstract
BACKGROUND: Proteogenomics combines cutting-edge methods from genomics and proteomics. While sequencing whole genomes has become cheap, correctly annotating the protein-coding regions of a genome is still tedious and error-prone. Mass spectrometry, on the other hand, relies on good characterizations of the proteins derived from the genome, but it can also be used to help improve genome annotation or to find species-specific peptides. Additionally, proteomics is widely used to find evidence for differential expression of proteins under different conditions, e.g. growth conditions for bacteria. The concept of proteogenomics is not altogether new: in-house scripts are used by different labs, and some special tools for eukaryotic and human analyses are available.
Year: 2014 PMID: 25521444 PMCID: PMC4290607 DOI: 10.1186/1471-2164-15-S9-S19
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Screenshot of the Bacterial Proteogenomic Pipeline GUI. The GUI of the Bacterial Proteogenomic Pipeline leads the user through all steps required for a proteogenomic analysis. Shown is the final step, the analysis of the combined search results. After opening a file created in the "Combine Identifications" step, the identified peptide sequences are shown in a table with information about the sequence, the originating genomic sequence (usually the chromosome or a plasmid), corresponding protein accessions, whether or not the peptide occurs only in a pseudo protein, in an elongation of an annotated protein or is a standalone pseudo protein. Additionally the numbers of distinct identifications in all files and the (normalized) numbers of identifications per condition of the searched samples are given and represented in the bar charts in the lower half of the screen. For a selected peptide, the protein sequences containing the peptide are depicted, with the identified sequences highlighted in bold. The result table can be filtered and additional spectrum identification files can be added, for which the condition groups may be freely chosen.
Peptides found in the B. japonicum analysis
| Sequence | number of identifications (normalized) | elongation / standalone | ORF start | ORF end | |
|---|---|---|---|---|---|
| VLVEGIER | 5 (2.62) | standalone | |||
| FSDYAFPPAVGYPSFAR | 23 (14.78) | standalone | yes | ||
| GRPVYGPSGPNTVYQQGR | 15 (10.79) | standalone | yes | ||
| KADLEAR | 24 (12.65) | standalone | |||
| ALVAEISR | 6 (3.02) | standalone | |||
| APPIEPR | 7 (5.19) | elongation | |||
| ASVQYFVTR | 7 (5.40) | standalone | yes | ||
| VAVDAAHK | 6 (3.41) | standalone | yes | ||
| VAVDAAHKEGK | 5 (3.01) | standalone | yes | ||
| IGELAEATGVTVR | 9 (6.21) | elongation | |||
| ALNLGIGLGHQR | 10 (7.00) | standalone | yes | ||
| VIESDAGDGER | 6 (4.99) | standalone | yes | ||
| ASADPAPSPAEAER | 5 (3.40) | standalone | yes | ||
| LAASQCPVAAIR | 5 (3.01) | standalone | yes | ||
| TTMEQATAAAK | 14 (7.63) | standalone | yes | ||
| LQMSADNVADSYAR | 6 (3.80) | standalone | yes | ||
| ADADLDVVIR | 5 (3.40) | standalone | yes | ||
| MVDCRIK | 5 (2.41) | standalone | |||
| AAEGTLR | 6 (4.01) | standalone | |||
| VIAGEQGAQR | 5 (3.40) | standalone | yes | ||
| ILVLYGSYR | 5 (3.60) | standalone | yes | ||
| VLDASTAYR | 5 (3.99) | standalone | yes | ||
| CYQSAAAYVGQDR | 7 (4.21) | standalone | yes | ||
| LVQIQCER | 5 (2.62) | standalone | |||
| GNALLNFGK | 5 (3.40) | standalone | |||
| AGSTPIPSAEAPDR | 5 (3.40) | standalone | yes | ||
| GQGEGAPGQASDR | 9 (4.42) | elongation | |||
| VVSKPLPTFTAASDLQIK | 16 (11.60) | standalone | yes | ||
| YKPFQWGASTYK | 5 (2.80) | standalone | yes | ||
| LILAEPAPGVR | 5 (3.60) | standalone | yes | ||
| AVGVLAAEYLR | 6 (4.40) | elongation | |||
| GCITPQTGRGQAASPVR | 16 (9.03) | standalone | |||
This table shows the peptides of pseudo proteins found in a proteogenomic analysis of B. japonicum. MS/MS spectra were identified with MS-GF+ and X!Tandem, and the combined search results were filtered at a Combined FDR Score level of 0.01. Only peptides with at least 5 distinct peptide-spectrum matches are reported. Peptides from the same ORF (that is, the same pseudo protein) are visually grouped by alternating bold and italic ORF positions.
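The two reporting criteria in the table caption (a Combined FDR Score level of 0.01 and at least 5 distinct peptide-spectrum matches) can be illustrated with a minimal sketch. This is not the pipeline's own code: the function name `filter_peptides`, the tuple layout of the input, and the choice of "less than or equal" for the score cutoff are all assumptions made for illustration.

```python
from collections import defaultdict


def filter_peptides(psms, fdr_threshold=0.01, min_psms=5):
    """Return {peptide sequence: number of distinct PSMs} for reported peptides.

    `psms` is a hypothetical input format: an iterable of
    (peptide_sequence, spectrum_id, combined_fdr_score) tuples.
    """
    spectra_per_peptide = defaultdict(set)
    for sequence, spectrum_id, score in psms:
        # Assumption: a PSM passes if its score is at or below the threshold.
        if score <= fdr_threshold:
            # A set keeps each spectrum only once, so counts are distinct PSMs.
            spectra_per_peptide[sequence].add(spectrum_id)
    return {
        sequence: len(spectrum_ids)
        for sequence, spectrum_ids in spectra_per_peptide.items()
        if len(spectrum_ids) >= min_psms
    }
```

Under these assumptions, a peptide with five passing spectra is kept, while one with only four passing spectra is dropped even if it has further matches above the score threshold.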