Dmitry Ischenko, Dmitry Alexeev, Egor Shitikov, Alexandra Kanygina, Maja Malakhova, Elena Kostryukova, Andrey Larin, Sergey Kovalchuk, Olga Pobeguts, Ivan Butenko, Nikolay Anikanov, Ilya Altukhov, Elena Ilina, Vadim Govorun.
Abstract
BACKGROUND: Proteomics of bacterial pathogens is a developing field that explores microbial physiology, gene expression and the complex interactions between bacteria and their hosts. One complication of the proteomic approach is the micro- and macro-heterogeneity of bacterial species, which makes it impossible to build a comprehensive database of bacterial genomes for identification, even though most existing algorithms rely largely on genomic data.
Keywords: SAP; Spectral library
Year: 2016 PMID: 27821049 PMCID: PMC5100282 DOI: 10.1186/s12859-016-1301-5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Major steps of the speptide algorithm
Fig. 2 Schematic representation of the major steps of the study and the data used
Results of Mascot identification of MS spectra against the annotated genomes of the species
| Species | Strain | Genome NCBI accession | Total # of spectra | # of identified spectra | # of identified unique peptides | Mascot threshold score | Decoy threshold score |
|---|---|---|---|---|---|---|---|
| H. pylori | A45 | AMYU00000000 | 74800 | 15108 | 6606 | 8 | 8 |
| H. pylori | 26695 | NC_018939 | 71801 | 14732 | 6737 | 8 | 8 |
| H. pylori | E48 | AYHQ00000000 | 58704 | 14655 | 6506 | 8 | 8 |
| H. pylori | J99 | NC_000921 | 68739 | 15279 | 6374 | 8 | 8 |
| H. pylori | H13-1 | AYUH00000000 | 64585 | 17554 | 7341 | 8 | 8 |
| | i19.05 | JFBA00000000 | 72851 | 24964 | 6374 | 9 | 6 |
| N. gonorrhoeae | n01.08 | JIBZ00000000 | 83672 | 32005 | 6761 | 9 | 6 |
| N. gonorrhoeae | FA1090 | NC_002946.2 | 44803 | 11891 | 5072 | 9 | 10 |
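The share of spectra that Mascot identified in each sample can be read off the table directly. The short Python sketch below is only an illustration of that arithmetic: the counts are copied from the table, while the names `SPECTRA` and `identification_rate` are hypothetical and not part of the original study.

```python
# Values copied from the table above: strain -> (total spectra, identified spectra).
SPECTRA = {
    "A45": (74800, 15108),
    "26695": (71801, 14732),
    "E48": (58704, 14655),
    "J99": (68739, 15279),
    "H13-1": (64585, 17554),
    "i19.05": (72851, 24964),
    "n01.08": (83672, 32005),
    "FA1090": (44803, 11891),
}


def identification_rate(total, identified):
    """Fraction of acquired spectra that received a Mascot identification."""
    if total <= 0:
        raise ValueError("total must be positive")
    return identified / total


# Identification rate per strain.
rates = {strain: identification_rate(*counts) for strain, counts in SPECTRA.items()}

# Print strains from the lowest to the highest identification rate.
for strain, rate in sorted(rates.items(), key=lambda kv: kv[1]):
    print(f"{strain:>7}: {rate:.1%}")
```

With these counts the rate ranges from roughly a fifth of the spectra (A45) to well over a third (n01.08).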
Estimation of the number of possible SAP identifications based on the comparison of the whole set of peptides identified for each sample
| Subject database | # of spectra in subject database | # of peptides in subject database | Query database | # of spectra in query database | # of peptides in query database | # of peptides with 1 SAP detected by genomes | # of possible peptide identifications with 1 SAP by proteomes |
|---|---|---|---|---|---|---|---|
| | 13306 | 6012 | | 13789 | 5747 | 4293 | 630 |
| | 13657 | 5932 | | 13306 | 6012 | 3857 | 603 |
| | 13789 | 5747 | | 13657 | 5932 | 4308 | 502 |
| H. pylori (A45, J99, 26695) | 40752 | 10759 | | 12440 | 5529 | 3446 | 467 |
| H. pylori (A45, J99, 26695) | 40752 | 10759 | | 15194 | 6402 | 3322 | 523 |
| N. gonorrhoeae FA1090 | 10854 | 4587 | | 29462 | 6606 | 764 | 66 |
| N. gonorrhoeae FA1090 | 10854 | 4587 | | 32815 | 6453 | 765 | 87 |
Fig. 3 The number of true and false identifications made by the speptide algorithm for the samples in the validation study at 5 % and 1 % FDR. Each diagram shows: (i) the number of peptides (spectra on the right) identified for the selected isolate at the two FDR settings, (ii) the number of NA identifications (see the text), (iii) the number of false identifications, (iv) the total number of peptides (spectra on the right) with SAPs that could be detected
Sensitivity and specificity of SAP identification for H. pylori and N. gonorrhoeae at the validation stage for different a priori FDR cut-offs
| Strain | Sensitivity (5 % FDR) | Sensitivity (1 % FDR) | Specificity (5 % FDR) | Specificity (1 % FDR) |
|---|---|---|---|---|
| | 0.693 | 0.546 | 0.994 | 0.996 |
| | 0.730 | 0.570 | 0.993 | 0.996 |
| | 0.818 | 0.712 | 0.996 | 0.997 |
| | 0.724 | 0.620 | 0.996 | 0.998 |
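For readers who want to see how values like these are usually obtained, the Python sketch below applies the standard definitions of sensitivity and specificity. It assumes these standard definitions apply here; this record does not state the exact formulas or the underlying counts, so the numbers in the example are purely illustrative and the function names are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Standard definition: share of detectable SAP peptides that were found."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Standard definition: share of peptides without a SAP that were not reported as SAPs."""
    return true_neg / (true_neg + false_pos)


# Illustrative counts only (NOT taken from the study).
example_sensitivity = sensitivity(true_pos=70, false_neg=30)
example_specificity = specificity(true_neg=990, false_pos=10)

print(f"sensitivity = {example_sensitivity:.3f}")
print(f"specificity = {example_specificity:.3f}")
```

Under these definitions, a stricter FDR cut-off is expected to discard more true identifications along with the false ones, which is consistent with the pattern in the table: sensitivity drops from 5 % to 1 % FDR while specificity rises slightly.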
Fig. 4 Comparison of SAP identification in the peptides of H. pylori E48 against the combined H. pylori database (A45, J99, 26695) for four different algorithms: S - speptide, B - Byonic, P - SPIDER, M - pMatch. (a) Venn diagram of the unique peptides with SAPs confirmed by Mascot search. (b) Numbers of true and false identifications for each algorithm
Fig. 5 The x-axis shows the number of false peptide identifications made by speptide when the spectra of H. pylori H13-1 are searched against the spectral library created from H. pylori 26695 spectra. False identifications were assigned by an additional Mascot search of the H. pylori H13-1 spectra against a protein database constructed from the annotated genome of H. pylori H13-1. The y-axis shows the number of peptide identifications of H. pylori H13-1 spectra in different speptide decoy searches: (i) all MS1 peaks shifted by +20 Th, (ii) all MS2 peaks shifted by +20 Th, (iii) library constructed from N. gonorrhoeae FA1090 spectra, (iv) library constructed from N. gonorrhoeae n01.08 spectra. The number of decoy identifications is normalized so that the number of spectrum-peptide pairs is the same in the direct and decoy searches
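The two peak-shift decoys named in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the speptide implementation: the dictionary layout of a library spectrum, the function names, and the linear rescaling used for the normalization are all assumptions made for the example.

```python
SHIFT_TH = 20.0  # shift in Th (m/z units), as stated in the caption


def shift_ms1(spectrum, shift=SHIFT_TH):
    """Decoy variant (i): move the precursor (MS1) m/z, keep fragments unchanged."""
    return {
        "precursor_mz": spectrum["precursor_mz"] + shift,
        "peaks": list(spectrum["peaks"]),
    }


def shift_ms2(spectrum, shift=SHIFT_TH):
    """Decoy variant (ii): move every fragment (MS2) m/z, keep the precursor unchanged."""
    return {
        "precursor_mz": spectrum["precursor_mz"],
        "peaks": [(mz + shift, intensity) for mz, intensity in spectrum["peaks"]],
    }


def normalized_decoy_hits(decoy_hits, decoy_pairs, direct_pairs):
    """Rescale decoy hits to the number of spectrum-peptide pairs of the direct search.

    Assumption: a simple linear rescaling; the record does not give the exact formula.
    """
    if decoy_pairs <= 0:
        raise ValueError("decoy_pairs must be positive")
    return decoy_hits * direct_pairs / decoy_pairs


# Hypothetical library spectrum: precursor m/z plus (m/z, intensity) fragment peaks.
library_spectrum = {"precursor_mz": 500.25, "peaks": [(175.1, 30.0), (288.2, 100.0)]}

decoy_ms1 = shift_ms1(library_spectrum)
decoy_ms2 = shift_ms2(library_spectrum)
print(decoy_ms1["precursor_mz"], decoy_ms2["peaks"])
print(normalized_decoy_hits(decoy_hits=10, decoy_pairs=2000, direct_pairs=1000))
```

Shifting peaks keeps the number and intensity distribution of peaks realistic while making genuine matches unlikely, which is why identifications in such a search can serve as an estimate of false matches.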
Fig. 6 Unrooted phylogenetic tree of 73 E. coli isolates, built with the neighbor-joining method and additional maximum parsimony methods on the basis of identified SAPs. Leaves are coded by the host organism of each strain: (p) - pig, (d) - dog, (c) - cow and (s) - raw sewage