Roger Karlsson, Lucia Gonzales-Siles, Margarita Gomila, Antonio Busquets, Francisco Salvà-Serra, Daniel Jaén-Luchoro, Hedvig E Jakobsson, Anders Karlsson, Fredrik Boulund, Erik Kristiansson, Edward R B Moore.
Abstract
A range of methodologies may be used for analyzing bacteria, depending on the purpose and the level of resolution needed. The capability to recognize species distinctions within the complex spectrum of bacterial diversity is necessary for progress in microbiological research. In clinical settings, accurate, rapid and cost-effective methods are essential for early and efficient treatment of infections. Characterization and identification of microorganisms using bottom-up proteomics, or "proteotyping", relies on recognition of species-unique or species-associated peptides by tandem mass spectrometry (MS/MS) analyses. It depends upon an accurate and comprehensive foundation of genome sequence data, which allows species to be differentiated at amino acid-level resolution. In this study, the high resolution and accuracy of MS/MS-based proteotyping were demonstrated through analyses of the three phylogenetically and taxonomically most closely related species of the Mitis Group of the genus Streptococcus: the pathogenic species, Streptococcus pneumoniae (pneumococcus), and the commensal species, Streptococcus pseudopneumoniae and Streptococcus mitis. To achieve high accuracy, a genome sequence database used for matching peptides was created and carefully curated. MS-based, bottom-up proteotyping was confirmed to attain the level of resolution necessary for differentiating and identifying these most closely related bacterial species, even when S. pneumoniae was mixed with S. pseudopneumoniae and S. mitis, by matching and identifying more than 200 unique peptides for each species.
Year: 2018 PMID: 30532202 PMCID: PMC6287849 DOI: 10.1371/journal.pone.0208804
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Phylogenetic tree based on core genome analysis of the strains included in the “Curated Database”.
The tree includes all strains of S. pneumoniae, S. pseudopneumoniae and S. mitis, including the experimental strains used in the study, as well as the type strains of the other species of the Streptococcus Mitis Group included in the database and S. pyogenes. The tree is based on 168,439 homologous amino acid positions and was constructed using PhyML software and the aLRT algorithm.
Proteotyping results of the twelve representative strains included in the study.
For each species, the Type strain, as well as two additional well-characterized reference strains, were included. The numbers of identified proteins, peptides and species-unique peptides after analyses with TCUP are shown (averages of triplicate analyses). The accuracies (%), i.e. proportion of correctly assigned peptides of the total number of species-unique peptides, are also shown. For a confirmed identification, a minimum threshold of five peptide matches per species was used.
| Organism | Strain | Distinct proteins | Peptide matches | Species-unique peptide matches | Accuracy (%) |
|---|---|---|---|---|---|
| | CCUG 28588 | 590 | 3188 | 227 | 97 |
| | CCUG 7206 | 590 | 2251 | 175 | 100 |
| | CCUG 35180 | 600 | 2384 | 214 | 100 |
| | CCUG 31611 | 610 | 3418 | 272 | |
| | CCUG 63687 | 660 | 2542 | 287 | |
| | CCUG 69183 | 479 | 3478 | 506 | |
| | CCUG 49455 | 611 | 2743 | 433 | |
| | CCUG 62647 | 574 | 2450 | 257 | |
| | CCUG 63747 | 524 | 2082 | 245 | |
| | CCUG 4207 | 401 | 1290 | 314 | 100 |
| | CCUG 25570 | 450 | 1801 | 351 | 100 |
| | CCUG 47803 | 493 | 2002 | 418 | 100 |
* The genome sequence of CCUG 69183 (= SK271) was present in the “Curated Database”, which is reflected by the higher number of species-unique peptide matches.
T signifies type strain of the species.
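
The accuracy and threshold described in the table caption can be illustrated with a minimal sketch. It assumes each species-unique peptide match has already been assigned a species label. The function name `summarize` and the split of 227 matches into 220/4/3 are hypothetical and chosen only for illustration; they are not the TCUP implementation or measured data.

```python
from collections import Counter

MIN_PEPTIDES = 5  # minimum species-unique peptide matches for a confirmed identification

def summarize(assignments, expected_species):
    """Return (accuracy in %, species passing the threshold).

    assignments: one species label per species-unique peptide match.
    Accuracy is the share of matches assigned to the expected species.
    """
    counts = Counter(assignments)
    total = sum(counts.values())
    accuracy = 100.0 * counts[expected_species] / total if total else 0.0
    confirmed = sorted(sp for sp, n in counts.items() if n >= MIN_PEPTIDES)
    return accuracy, confirmed

# Illustrative (made-up) assignment of 227 species-unique peptide matches.
assignments = (["S. pneumoniae"] * 220
               + ["S. mitis"] * 4
               + ["S. pseudopneumoniae"] * 3)
accuracy, confirmed = summarize(assignments, "S. pneumoniae")
print(round(accuracy, 1), confirmed)
```

In this toy case the few stray matches to the other two species stay below the five-peptide threshold, so only one species would count as a confirmed identification.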
List of proteins identified from the species-unique peptides.
Peptides detected and identified in triplicate analyses of one of the twelve experimental strains, S. pneumoniae CCUG 28588T, were linked to their respective proteins. Only proteins with two or more peptide matches are shown here. Full lists of proteins and peptides for the representative strains of S. pneumoniae, S. pseudopneumoniae and S. mitis included in the study can be found in S4–S12 Tables.
| Accession number | Description | Number of peptides | Sequence coverage |
|---|---|---|---|
| WP_001035310.1 | hypothetical protein | 25 | 47% |
| WP_000685088.1 | UDP-glucose dehydrogenase | 11 | 44% |
| WP_001844726.1 | endo-beta-N-acetylglucosaminidase | 10 | 9% |
| WP_065251743.1 | choline-binding protein | 10 | 23% |
| WP_000679960.1 | beta-N-acetylhexosaminidase | 8 | 8% |
| WP_001214397.1 | NAD-dependent dehydratase | 7 | 24% |
| WP_000727933.1 | Foldase | 6 | 24% |
| WP_000434652.1 | thiol reductase thioredoxin | 4 | 44% |
| WP_000495824.1 | transcriptional regulator | 4 | 43% |
| WP_000811753.1 | alanine—tRNA ligase | 4 | 4% |
| WP_000036661.1 | dihydroxyacetone kinase | 3 | 9% |
| WP_000064115.1 | general stress protein | 3 | 24% |
| WP_000767195.1 | hypothetical protein | 3 | 13% |
| WP_000862350.1 | glycosyl transferase family 1 | 3 | 10% |
| WP_001079795.1 | galacturonic acid acetylase | 3 | 22% |
| WP_001818788.1 | PTS glucose transporter subunit IIABC | 3 | 8% |
| WP_000116461.1 | trigger factor | 2 | 8% |
| WP_000164758.1 | glycine—tRNA ligase subunit beta | 2 | 4% |
| WP_000201902.1 | RNA polymerase sigma factor SigA | 2 | 5% |
| WP_000245505.1 | 30S ribosomal protein S8 | 2 | 14% |
| WP_000411198.1 | choline kinase | 2 | 10% |
| WP_000432756.1 | alpha-mannosidase | 2 | 5% |
| WP_000529016.1 | serine protease | 2 | 4% |
| WP_000599104.1 | ribosome-associated factor Y | 2 | 6% |
| WP_000639574.1 | hypothetical protein | 2 | 43% |
| WP_000664173.1 | capsular polysaccharide biosynthesis CpsC | 2 | 16% |
| WP_000701442.1 | PTS fructose transporter subunit IIC | 2 | 5% |
| WP_000790743.1 | hypothetical protein | 2 | 16% |
| WP_001032504.1 | YSIRK signal domain/LPXTG anchor | 2 | 2% |
| WP_001092741.1 | arginine—tRNA ligase | 2 | 6% |
| WP_001162938.1 | dihydrolipoyl dehydrogenase | 2 | 5% |
| WP_001229596.1 | arginine ABC transporter ATP-binding | 2 | 15% |
| WP_001232820.1 | alkaline amylopullulanase | 2 | 4% |
| WP_001818543.1 | cell division DivIVA | 2 | 11% |
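
How a table of this kind could be derived from peptide-level results is sketched below. It assumes peptide hits are available as (accession, peptide) pairs together with the protein sequences. The accessions `P1` and `P2`, the sequences and the function names are hypothetical, and coverage is computed simply as the percentage of residues covered by at least one matched peptide, which may differ from the calculation used in the study.

```python
from collections import defaultdict

def coverage_percent(protein_seq, peptides):
    """Percentage of residues covered by at least one matched peptide."""
    covered = [False] * len(protein_seq)
    for pep in peptides:
        start = protein_seq.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein_seq.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein_seq)

def proteins_with_support(peptide_hits, sequences, min_peptides=2):
    """Group peptides by protein and keep proteins with enough distinct peptides."""
    by_protein = defaultdict(set)
    for accession, peptide in peptide_hits:
        by_protein[accession].add(peptide)
    rows = [
        (acc, len(peps), coverage_percent(sequences[acc], peps))
        for acc, peps in by_protein.items()
        if len(peps) >= min_peptides
    ]
    # Most peptides first, then by accession.
    return sorted(rows, key=lambda r: (-r[1], r[0]))

# Hypothetical toy data, not taken from the study.
sequences = {"P1": "MKTAYIAKQRQISFVKSHFSRQ", "P2": "MSLLNDKELA"}
hits = [("P1", "MKTAYIAK"), ("P1", "SHFSR"), ("P2", "MSLLNDK")]
rows = proteins_with_support(hits, sequences)
for acc, n_peptides, cov in rows:
    print(acc, n_peptides, f"{cov:.0f}%")
```

`P2` is dropped because it has a single peptide match, mirroring the "two or more peptide matches" filter in the caption.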
Fig 2. Proteotyping results of mixed samples.
Cells of S. pneumoniae were mixed with cells of either S. pseudopneumoniae or S. mitis in a 1:1 ratio. Following sample preparation, digestion and LC-MS/MS analyses, the results were evaluated using pre-computed correction factors that reflect the expected proportion of unique peptides for each species. In both mixes, S. pneumoniae:S. pseudopneumoniae and S. pneumoniae:S. mitis, the results reflected a composition of approximately 50% of each species (error bars show standard errors of the averages from triplicate analyses).
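
One way such a correction could work is sketched below. The exact formula is not given here, so this is an assumption: each species' count of unique peptides is divided by its correction factor (the expected fraction of its peptides that are species-unique), and the corrected values are normalized to percentages. The counts, factors and function name are hypothetical.

```python
def estimate_composition(unique_counts, correction_factors):
    """Estimate relative composition (%) from species-unique peptide counts.

    Assumption: dividing by the expected fraction of unique peptides makes
    counts comparable between species before normalization.
    """
    corrected = {
        species: unique_counts[species] / correction_factors[species]
        for species in unique_counts
    }
    total = sum(corrected.values())
    return {species: 100.0 * value / total for species, value in corrected.items()}

# Hypothetical values: S. mitis yields more unique peptides per cell here,
# so raw counts alone would overestimate its share.
counts = {"S. pneumoniae": 120, "S. mitis": 180}
factors = {"S. pneumoniae": 0.08, "S. mitis": 0.12}
composition = estimate_composition(counts, factors)
print({species: round(pct, 1) for species, pct in composition.items()})
```

Without the correction, the raw counts in this toy case would suggest a 40:60 mix; after correction the estimate is an even split.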
Fig 3. Ranking according to matching efficiency (%) of the identified peptides against complete genome sequences from the RefSeq database.
Proteotyping results, following MS-proteomic analyses of strains of S. aureus (A), P. aeruginosa (B) and S. pneumoniae (C). For S. aureus and P. aeruginosa, the top-ranked peptide matches are all with the correct species in the RefSeq database. The next-best ranked matches, marked with arrows, are for other species of Staphylococcus and Pseudomonas, reflecting a distinct drop in the matching efficiencies of identified peptides from almost 100% down to 30%. In the analysis of S. pneumoniae (C), the top-ranked matches all belong to the correct species (S. pneumoniae). However, because of the close phylogenetic relationships and taxonomy of this species within the genus Streptococcus, the matching efficiencies to other species, especially S. pseudopneumoniae and S. mitis, are considerably higher (as much as 70–80%), which makes discovery of species-unique peptides more difficult.
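
The ranking in this figure can be illustrated with a short sketch. It assumes matching efficiency is the percentage of identified peptides that are found in a given genome's predicted peptides. The peptide strings, genome labels and function name are hypothetical.

```python
def rank_by_matching_efficiency(identified, genome_peptides):
    """Rank genomes by the percentage of identified peptides they contain."""
    identified = set(identified)
    ranking = []
    for genome, peptides in genome_peptides.items():
        efficiency = 100.0 * len(identified & set(peptides)) / len(identified)
        ranking.append((genome, efficiency))
    # Highest matching efficiency first.
    return sorted(ranking, key=lambda item: -item[1])

# Hypothetical toy data: five identified peptides and three reference genomes.
identified = ["AAK", "CDR", "EFK", "GHR", "ILK"]
genome_peptides = {
    "S. pneumoniae genome": {"AAK", "CDR", "EFK", "GHR", "ILK", "MNK"},
    "S. pseudopneumoniae genome": {"AAK", "CDR", "EFK", "GHR"},
    "S. pyogenes genome": {"AAK", "QQR"},
}
ranking = rank_by_matching_efficiency(identified, genome_peptides)
for genome, efficiency in ranking:
    print(f"{genome}: {efficiency:.0f}%")
```

A small gap between the first and second rank, as in this toy case, corresponds to the situation described for S. pneumoniae, where few peptides remain to separate the species.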
Lists of genomes of species of the Mitis Group of the genus Streptococcus, used for matching the proteomic data at two different time points.
Two databases were used, created in February 2015 (“Initial Database”) and August 2016 (“Curated Database”). The (T) denotes the presence of the Type strain genome of a given species in the database.
| Organism | “Initial Database” | “Curated Database” |
|---|---|---|
| | 0 | 1 (T) |
| | 0 | 1 (T) |
| | 1 | 3 (T) |
| | 0 | 1 (T) |
| | 1 | 30 (T) |
| | 1 (T) | 1 (T) |
| | 1 (T) | 13 (T) |
| | 2 (T) | 3 (T) |
| | 0 | 1 (T) |
| | 27 | 31 (T) |
| | 1 | 6 (T) |
| | 1 | 6 |
| | 0 | 1 (T) |
| | 0 | 4 (T) |
| Sum of Mitis Group genomes | 35 | 102 |
| | 19 | 47 |
*S. oligofermentans is considered to be a later heterotypic synonym of S. cristatus and S. tigurinus is considered to be a subspecies of S. oralis [29].
+ S. oralis subsp. tigurinus genomes, previously classified as S. tigurinus, are included as S. oralis, according to Jensen et al., 2016 [29]. The reference strain for the subspecies, S. oralis subsp. tigurinus AZ_3a, is included in the database.
Ψ S. pyogenes is included here, as an out-group species, since this species was part of the model system.
Proteotyping results of the twelve representative strains included in the study (averages of triplicate analyses).
The number of species-unique peptide matches and accuracies (%), using the two databases are shown in the columns headed “Initial Database” and “Curated Database”. A minimum threshold of five peptide matches per species was used in the analysis. The improvement in accuracies for S. mitis and S. pseudopneumoniae is highlighted in bold.
| Organism | Strain | Initial Database: species-unique peptide matches | Initial Database: accuracy (%) | Curated Database: species-unique peptide matches | Curated Database: accuracy (%) |
|---|---|---|---|---|---|
| | CCUG 28588 | 354 | 98 | 227 | 97 |
| | CCUG 7206 | 286 | 100 | 175 | 100 |
| | CCUG 35180 | 380 | 100 | 214 | 100 |
| | CCUG 31611 | 377 | | 272 | |
| | CCUG 63687 | 201 | | 287 | |
| | CCUG 69183 | 227 | | 506 | |
| | CCUG 49455 | 329 | | 433 | |
| | CCUG 62647 | 230 | | 257 | |
| | CCUG 63747 | 232 | | 245 | |
| | CCUG 4207 | 319 | 100 | 314 | 100 |
| | CCUG 25570 | 360 | 100 | 351 | 100 |
| | CCUG 47803 | 427 | 100 | 418 | 100 |
T signifies type strain of the species.
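
Why database content changes the number of species-unique peptides can be illustrated with a toy sketch. It assumes a peptide counts as species-unique when every database genome it matches belongs to a single species. The peptide names, genome identifiers and function name are hypothetical, and this is not the TCUP algorithm.

```python
def species_unique(peptide_to_genomes, genome_species):
    """Map each peptide to a species if all its database matches share one species.

    peptide_to_genomes: peptide -> set of genome IDs containing it.
    genome_species: genome ID -> species, for genomes present in the database.
    """
    unique = {}
    for peptide, genomes in peptide_to_genomes.items():
        species = {genome_species[g] for g in genomes if g in genome_species}
        if len(species) == 1:
            unique[peptide] = next(iter(species))
    return unique

# Hypothetical peptides: PEPB occurs in both a S. pneumoniae and a S. mitis genome.
peptide_to_genomes = {"PEPA": {"g1"}, "PEPB": {"g1", "g2"}}

initial_db = {"g1": "S. pneumoniae"}                    # S. mitis genome missing
curated_db = {"g1": "S. pneumoniae", "g2": "S. mitis"}  # S. mitis genome added

initial_unique = species_unique(peptide_to_genomes, initial_db)
curated_unique = species_unique(peptide_to_genomes, curated_db)
print(len(initial_unique), len(curated_unique))
```

With the sparser database, the shared peptide is wrongly treated as unique to S. pneumoniae; adding the second genome removes it. This is one plausible reading of why the S. pneumoniae counts in the table are lower with the “Curated Database”.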