Ramya Srinivasan, Ulas Karaoz, Marina Volegova, Joanna MacKichan, Midori Kato-Maeda, Steve Miller, Rohan Nadarajan, Eoin L Brodie, Susan V Lynch.
Abstract
According to World Health Organization statistics from 2011, infectious diseases remain among the top five causes of mortality worldwide. However, despite sophisticated research tools for microbial detection, rapid and accurate molecular diagnostics for identifying infection in humans have not been widely adopted, and time-consuming culture-based methods remain at the forefront of clinical microbial detection. The 16S rRNA gene, a molecular marker for identification of bacterial species, is ubiquitous among members of this domain and, thanks to ever-expanding databases of sequence information, a useful tool for bacterial identification. In this study, we assembled an extensive repository of clinical isolates (n = 617), representing 30 medically important pathogenic species and originally identified using traditional culture-based or non-16S molecular methods. This strain repository was used to systematically evaluate the ability of 16S rRNA to provide species-level identification. To enable the most accurate species-level classification given the paucity of sequence data accumulated in public databases, we built a Naïve Bayes classifier representing a diverse set of high-quality sequences from medically important bacterial organisms. We show that for species identification, a model-based approach is superior to an alignment-based method. Overall, between 16S gene-based and clinical identities, our study shows a genus-level concordance rate of 96% and a species-level concordance rate of 87.5%. We point to multiple cases of probable clinical misidentification with traditional culture-based identification across a wide range of gram-negative rods and gram-positive cocci as well as common gram-negative cocci.
Year: 2015 PMID: 25658760 PMCID: PMC4319838 DOI: 10.1371/journal.pone.0117617
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
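The abstract describes a Naïve Bayes classifier trained on 16S rRNA sequences but gives no implementation details. As a rough illustration only, the following sketch assumes a k-mer (word-based) formulation similar to RDP-style classifiers. The class name `KmerNaiveBayes`, the word size, the smoothing scheme, and the toy sequences are all assumptions made for this example, not the authors' method or data.

```python
import math
from collections import defaultdict


def kmers(seq, k):
    """Return the set of overlapping k-mers (words) in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


class KmerNaiveBayes:
    """Hypothetical word-based Naive Bayes classifier for labelled sequences."""

    def __init__(self, k=8):
        self.k = k
        self.total = 0                           # number of training sequences
        self.taxon_sizes = defaultdict(int)      # taxon -> number of sequences
        self.global_counts = defaultdict(int)    # word -> sequences containing it
        self.word_counts = defaultdict(lambda: defaultdict(int))  # taxon -> word -> count

    def fit(self, labelled):
        """labelled: iterable of (taxon, sequence) pairs."""
        for taxon, seq in labelled:
            self.total += 1
            self.taxon_sizes[taxon] += 1
            for word in kmers(seq, self.k):
                self.global_counts[word] += 1
                self.word_counts[taxon][word] += 1
        return self

    def log_score(self, taxon, seq):
        """Sum of smoothed log word probabilities for one taxon."""
        m = self.taxon_sizes[taxon]
        score = 0.0
        for word in kmers(seq, self.k):
            # Word prior from the whole training set keeps unseen words non-zero.
            prior = (self.global_counts.get(word, 0) + 0.5) / (self.total + 1)
            score += math.log((self.word_counts[taxon].get(word, 0) + prior) / (m + 1))
        return score

    def predict(self, seq):
        """Return the taxon with the highest log score."""
        return max(self.taxon_sizes, key=lambda t: self.log_score(t, seq))


# Toy training data (made up); a short word size suits these short strings.
training = [
    ("taxon_A", "ACGTACGTACGTACGT"),
    ("taxon_A", "ACGTACGTACGAACGT"),
    ("taxon_B", "TTGGCCAATTGGCCAA"),
    ("taxon_B", "TTGGCCAATTGGCCAT"),
]
model = KmerNaiveBayes(k=4).fit(training)
print(model.predict("ACGTACGTACGT"))
```

A model of this kind scores a query against every taxon, which is what distinguishes it from an alignment-based lookup that reports only the nearest database hits.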
Clinical identity, identification technique, and source of all the clinical isolates used in this study.
| Clinical Identity | Identification Technique | Number of Isolates | Source |
|---|---|---|---|
| | culture-based | 20 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 20 | Microbial Diseases Laboratory-California Public Health Department, CA |
| | | 2 | |
| | | 6 | |
| | | 7 | |
| | | 3 | |
| | | 3 | |
| | | 3 | |
| | | 2 | |
| | | 4 | |
| | culture-based | 20 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 20 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 29 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 30 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 30 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 31 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 37 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 20 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 30 | Clinical Microbiology Laboratory-UCSF, CA |
| | nucleic acid hybridization | 29 | MTB Research Laboratory-UCSF, CA |
| | serotyping; | 2 | Clinical Microbiology Laboratory-UCSF, CA |
| | serotyping; | 28 | Institute of Environmental Science and Research-Wellington, New Zealand |
| | culture-based | 30 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 30 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 30 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 30 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 30 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 30 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 30 | Clinical Microbiology Laboratory-UCSF, CA |
| | culture-based | 31 | Clinical Microbiology Laboratory-UCSF, CA |
Fig 1. Bacterial identification by clinical microbiology laboratory techniques.
Typical temporal workflow of a clinical microbiology laboratory for identifying microbes from clinical samples using phenotypic, biochemical, and culture-based techniques.
Fig 2. 16S rRNA percent identity within and between genera.
Distributions (shown as violin plots) of 16S rRNA percent identity (y-axis of each figure) of pairs of training set sequences belonging to the same (gray) and different genera. The traditional genus-level cutoff of 95% identity is marked for reference. The genus Mycobacterium is categorized as gram-positive in the figure. For all genera, sequence identity between sequences from the same genus was significantly higher than between sequences from different genera (one-sided Wilcoxon test, p-value<0.0001).
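The within-genus versus between-genus comparison summarized in this caption can be sketched as follows. This is a minimal illustration that assumes the sequences are already aligned to equal length; the function names `percent_identity` and `split_pairs` and the toy records are hypothetical and do not come from the study.

```python
from itertools import combinations

GENUS_CUTOFF = 95.0  # traditional genus-level cutoff cited in the caption


def percent_identity(a, b):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        matches += (x == y)
    return 100.0 * matches / compared if compared else 0.0


def split_pairs(records):
    """Group pairwise identities into same-genus and different-genus lists.

    records: list of (genus, aligned_sequence) tuples.
    """
    within, between = [], []
    for (g1, s1), (g2, s2) in combinations(records, 2):
        target = within if g1 == g2 else between
        target.append(percent_identity(s1, s2))
    return within, between


# Toy aligned records (made up for illustration).
records = [
    ("GenusA", "ACGTACGTAC"),
    ("GenusA", "ACGTACGTAT"),
    ("GenusB", "ACGAACTTGC"),
    ("GenusB", "ACGAACTTGG"),
]
within, between = split_pairs(records)
print("within:", within)
print("between:", between)
print("between-genus pairs at or above cutoff:",
      sum(v >= GENUS_CUTOFF for v in between))
```

On real data the two lists would feed a one-sided rank-sum test, for example `scipy.stats.mannwhitneyu(within, between, alternative="greater")`, assuming SciPy is available.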
Fig 4. 16S rRNA percent identity within and between species (gram-negative bacteria).
Distributions (shown as violin plots) of 16S rRNA percent identity (y-axis of each figure) of pairs of training set sequences belonging to the same (gray) and different species for select gram-negative bacteria. The traditional species-level cutoff of 97% identity is marked for reference. Species for which sequence identity between sequences from the same species was significantly higher than between sequences from different species are marked with a * (one-sided Wilcoxon test, p-value<0.0001).
Fig 3. 16S rRNA percent identity within and between species (gram-positive bacteria).
Distributions (shown as violin plots) of 16S rRNA percent identity (y-axis of each figure) of pairs of training set sequences belonging to the same (gray) and different species for select gram-positive bacteria. The traditional species-level cutoff of 97% identity is marked for reference. Species for which sequence identity between sequences from the same species was significantly higher than between sequences from different species are marked with a * (one-sided Wilcoxon test, p-value<0.0001).
Concordance rates between clinical and 16S rRNA based identification.
| Clinical Identification | Naïve Bayes Classifier: Genus % Concordance | Naïve Bayes Classifier: Species % Concordance | 16SpathDB: Genus % Concordance | 16SpathDB: Species % Concordance | 16SpathDB: Genus, No. isolates with definite identification | 16SpathDB: Species, No. isolates with definite identification |
|---|---|---|---|---|---|---|
| | 100 (20/20) | 85 (17/20) | 100 (20/20) | 85 (17/20) | 20/20 | 17/17 |
| | 100 (20/20) | 100 (20/20) | 100 (20/20) | 100 (20/20) | 20/20 | 1/20 |
| | 100 (2/2) | 0 (0/2) | 100 (2/2) | 50 (1/2) | 2/2 | 0/1 |
| | 100 (6/6) | 83.33 (5/6) | 100 (6/6) | 0 (0/6) | 6/6 | - |
| | 100 (7/7) | 0 (0/7) | 100 (7/7) | 57.14 (4/7) | 7/7 | 0/4 |
| | 100 (3/3) | 0 (0/3) | 100 (3/3) | 100 (3/3) | 3/3 | 3/3 |
| | 100 (3/3) | 100 (3/3) | 100 (3/3) | 100 (3/3) | 3/3 | 3/3 |
| | 100 (3/3) | 0 (0/3) | 100 (3/3) | 0 (0/3) | 3/3 | - |
| | 100 (2/2) | 0 (0/2) | 100 (2/2) | 50 (1/2) | 2/2 | 1/1 |
| | 100 (4/4) | 75 (3/4) | 100 (4/4) | 0 (0/4) | 4/4 | - |
| | 90 (18/20) | 90 (18/20) | 85 (17/20) | 80 (16/20) | 16/17 | 10/16 |
| | 90 (18/20) | 85 (17/20) | 90 (18/20) | 85 (17/20) | 18/18 | 17/17 |
| | 100 (29/29) | 96.55 (28/29) | 100 (29/29) | 62.07 (18/29) | 27/29 | 15/18 |
| | 100 (30/30) | 96.67 (29/30) | 100 (30/30) | 96.67 (29/30) | 30/30 | 29/29 |
| | 100 (31/31) | 93.55 (29/31) | 100 (31/31) | 0 (0/31) | 31/31 | - |
| | 100 (30/30) | 60 (18/30) | 100 (30/30) | 93.33 (28/30) | 30/30 | 0/28 |
| | 94.59 (35/37) | 91.89 (34/37) | 94.59 (35/37) | 89.19 (33/37) | 35/35 | 33/33 |
| | 75 (15/20) | 75 (15/20) | 30 (6/20) | 30 (6/20) | 2/6 | 2/6 |
| | 93.33 (28/30) | 93.33 (28/30) | 70 (21/30) | 60 (18/30) | 21/21 | 17/18 |
| | 96.55 (28/29) | 93.1 (27/29) | 96.55 (28/29) | 96.55 (28/29) | 28/28 | 0/28 |
| | 100 (2/2) | 100 (2/2) | 100 (2/2) | 100 (2/2) | 2/2 | 2/2 |
| | 100 (28/28) | 100 (28/28) | 100 (28/28) | 96.43 (27/28) | 28/28 | 27/27 |
| | 100 (30/30) | 100 (30/30) | 100 (30/30) | 100 (30/30) | 30/30 | 30/30 |
| | 100 (30/30) | 96.67 (29/30) | 100 (30/30) | 96.67 (29/30) | 30/30 | 29/29 |
| | 100 (30/30) | 100 (30/30) | 100 (30/30) | 100 (30/30) | 30/30 | 30/30 |
| | 100 (30/30) | 100 (30/30) | 100 (30/30) | 100 (30/30) | 30/30 | 30/30 |
| | 96.67 (29/30) | 76.67 (23/30) | 96.67 (29/30) | 76.67 (23/30) | 29/29 | 23/23 |
| | 86.67 (26/30) | 86.67 (26/30) | 100 (30/30) | 100 (30/30) | 30/30 | 30/30 |
| | 100 (30/30) | 93.33 (28/30) | 100 (30/30) | 90 (27/30) | 30/30 | 25/27 |
| | 83.87 (26/31) | 77.42 (24/31) | 83.87 (26/31) | 83.87 (26/31) | 26/26 | 24/26 |
| ALL ISOLATES | 96.11 (593/617) | 87.5 (540/617) | 94 (580/617) | 80 (496/617) | 573/580 | 398/496 |
16S-based genus and species identities were obtained using the Naïve Bayes classifier and an alignment-based approach (16SpathDB). For each clinical identity, in addition to the concordance rates, the number of concordant isolates among all isolates is listed in parentheses. For the latter method, the last two columns give the number of isolates with definite identification among the concordant isolates.
* For species-level comparisons, the species accepted for the clinical identifications Enterobacter cloacae complex and Streptococcus viridans group were:
Enterobacter cloacae complex species: Enterobacter asburiae, Enterobacter cloacae, Enterobacter hormaechei, Enterobacter kobei, Enterobacter ludwigii, Enterobacter nimipressuralis. Streptococcus viridans group species: (1) S. mitis group: S. mitis, S. sanguinis, S. parasanguinis, S. gordonii, S. oralis, S. cristatus, S. infantis, S. peroris, S. australis, S. oligofermentans, S. pneumoniae, S. pseudopneumoniae; (2) S. mutans group: S. mutans, S. sobrinus; (3) S. salivarius group: S. salivarius, S. vestibularis, S. thermophilus; (4) S. bovis group: S. equinus, S. gallolyticus, S. infantarius, S. pasteurianus, and S. alactolyticus; (5) S. anginosus group: S. anginosus, S. constellatus, S. intermedius.
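The footnote implies that a species-level call counts as concordant when the 16S-based species is any member of the clinically reported complex or group. A minimal sketch of that rule, under stated assumptions, is shown below. The helper names and the example isolate pairs are hypothetical, only the Enterobacter cloacae complex membership is taken from the footnote, and the genus is assumed to be the first word of each name.

```python
# Membership list copied from the footnote above.
GROUPS = {
    "Enterobacter cloacae complex": {
        "Enterobacter asburiae",
        "Enterobacter cloacae",
        "Enterobacter hormaechei",
        "Enterobacter kobei",
        "Enterobacter ludwigii",
        "Enterobacter nimipressuralis",
    },
}


def genus_of(name):
    """Assume the genus is the first word of a binomial or group name."""
    return name.split()[0]


def species_concordant(clinical, predicted, groups=GROUPS):
    """True if the predicted species matches the clinical identity,
    or belongs to the clinically reported complex or group."""
    accepted = groups.get(clinical, {clinical})
    return predicted in accepted


def concordance(pairs, groups=GROUPS):
    """Return (genus_hits, species_hits, n) for (clinical, predicted) pairs."""
    genus_hits = sum(genus_of(c) == genus_of(p) for c, p in pairs)
    species_hits = sum(species_concordant(c, p, groups) for c, p in pairs)
    return genus_hits, species_hits, len(pairs)


# Made-up isolate calls, purely for illustration.
pairs = [
    ("Enterobacter cloacae complex", "Enterobacter kobei"),      # genus and species agree
    ("Enterobacter cloacae complex", "Enterobacter aerogenes"),  # genus only
    ("Escherichia coli", "Escherichia coli"),                    # genus and species agree
    ("Escherichia coli", "Shigella sonnei"),                     # neither
]
genus_hits, species_hits, n = concordance(pairs)
print(f"genus: {100 * genus_hits / n:.2f} ({genus_hits}/{n})")
print(f"species: {100 * species_hits / n:.2f} ({species_hits}/{n})")
```

Keeping the group definitions in a single dictionary makes the comparison rule explicit and easy to audit when a complex is revised.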
Concordance rates between 16S rRNA based and clinical identification for isolates clinically identified by culture-based or non-16S based molecular methods.
| Clinical identification method | Naïve Bayes Classifier: Genus | Naïve Bayes Classifier: Species | 16SpathDB: Genus | 16SpathDB: Species |
|---|---|---|---|---|
| Molecular (non-16S based) | 98.87 (88/89) | 76.4 (68/89) | 98.87 (88/89) | 77.5 (69/89) |
| Culture | 96 (505/528) | 90 (473/528) | 93 (492/528) | 81 (427/528) |
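The overall concordance rates are count-weighted combinations of the two strata in this table, which can be checked by pooling the counts directly. The sketch below uses only the counts listed above; the dictionary keys and the function name `pooled_rate` are assumptions made for illustration.

```python
# Concordant counts per stratum, copied from the table above.
STRATA = {
    "Molecular (non-16S based)": {
        "n": 89, "nb_genus": 88, "nb_species": 68, "db_genus": 88, "db_species": 69,
    },
    "Culture": {
        "n": 528, "nb_genus": 505, "nb_species": 473, "db_genus": 492, "db_species": 427,
    },
}


def pooled_rate(strata, key):
    """Pool concordant counts across strata.

    Returns (hits, total, percent), with percent rounded to two decimals.
    """
    hits = sum(s[key] for s in strata.values())
    total = sum(s["n"] for s in strata.values())
    return hits, total, round(100.0 * hits / total, 2)


for key in ("nb_genus", "nb_species", "db_genus", "db_species"):
    hits, total, pct = pooled_rate(STRATA, key)
    print(f"{key}: {pct} ({hits}/{total})")
```

Pooling counts rather than averaging the two percentages matters here because the culture stratum is about six times larger than the molecular one. Pooled this way, the Naïve Bayes species counts sum to 541 of 617, one more than the 540 listed in the all-isolates row of the per-organism table, so one of the two tallies may carry a small transcription difference.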
Fig 5. 16S rRNA based genus and species level isolate identities with the Naïve Bayes classifier.
Each isolate was assigned to one of 12 categories (A-H, a-d) based on the agreement between clinical and 16S rRNA based genus and species classifications and the confidence scores.