Irene L G Newton, Guus Roeselers.
Abstract
BACKGROUND: Microbial ecologists now routinely use next-generation sequencing methods to assess microbial diversity in the environment. One tool used heavily by many groups is the Naïve Bayesian Classifier developed by the Ribosomal Database Project (RDP-NBC). However, the consistency and confidence of the classifications provided by the RDP-NBC depend on the training set used.
Year: 2012 PMID: 23013113 PMCID: PMC3520854 DOI: 10.1186/1471-2180-12-221
Source DB: PubMed Journal: BMC Microbiol ISSN: 1471-2180 Impact factor: 3.605
Figure 1. Phylogenetic relationships for the bacterial species included in the honey bee specific database (bootstrap support is indicated above branches if > 75%). Class level designations are highlighted in red, while lower rank taxonomic designations are indicated using arrows on nodes. Specific clades identified previously in honey bees are colored in blue, while novel clades identified here, including cultured isolates and well-described genera (such as Wolbachia), are colored in yellow.
Figure 2. The effect of training set on the classification of sequences from the honey bee gut, visualized as a heat map. Unique sequences (4,480) were classified using the NBC trained on RDP, GG, or SILVA (A); on three custom databases that include near full length honey bee-associated sequences (RDP + bees, GG + bees, SILVA + bees) (B); or on the near full length honey bee-associated sequences alone (C). Family-level taxonomic designations are shown. Classifications that occur across all three datasets are highlighted in bold lettering, and classifications unique to one training set are highlighted in red font. The average bootstrap score resulting from the classification is provided for each taxonomic assignment.
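The summary behind such a heat map can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the input layout (one `(training_set, family, bootstrap)` tuple per classified sequence), the function name `summarize_heatmap`, and the toy values are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def summarize_heatmap(calls):
    """Summarize family-level calls per training set.

    calls: iterable of (training_set, family, bootstrap) tuples
           (hypothetical layout, one tuple per classified sequence).
    Returns (mean_bootstrap, shared, unique) where
      mean_bootstrap maps (training_set, family) to the average bootstrap,
      shared is the set of families reported by every training set,
      unique is the set of families reported by exactly one training set.
    """
    scores = defaultdict(list)
    for training_set, family, bootstrap in calls:
        scores[(training_set, family)].append(bootstrap)

    mean_bootstrap = {key: mean(vals) for key, vals in scores.items()}

    # Record which training sets reported each family.
    sets_per_family = defaultdict(set)
    for training_set, family in scores:
        sets_per_family[family].add(training_set)

    all_sets = {training_set for training_set, _ in scores}
    shared = {f for f, s in sets_per_family.items() if s == all_sets}
    unique = {f for f, s in sets_per_family.items() if len(s) == 1}
    return mean_bootstrap, shared, unique


# Toy input with invented values, for illustration only.
toy_calls = [
    ("RDP", "Lactobacillaceae", 90),
    ("RDP", "Lactobacillaceae", 100),
    ("GG", "Lactobacillaceae", 80),
    ("SILVA", "Lactobacillaceae", 70),
    ("GG", "Bartonellaceae", 60),
]
mean_bootstrap, shared, unique = summarize_heatmap(toy_calls)
```

In this sketch, `shared` corresponds to the designations the caption describes as bold and `unique` to those shown in red.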
The taxonomic classification for 16S rRNA gene sequences improves with the addition of custom databases
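| Taxonomic level | Identical (RDP, GG, SILVA) | Completely divergent (RDP, GG, SILVA) | Congruent with honey bee sequences added |
| --- | --- | --- | --- |
| Kingdom | 4,480 | 0 | 4,480 |
| Phylum | 4,465 | 0 | 4,478 |
| Class | 4,453 | 4 | 4,479 |
| Order | 2,579 | 1,335 | 4,669 |
| Family | 1,870 | 2,784 | 4,216 |
| Genus | 595 | 2,552 | --* |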
*HBDB sequences were not taxonomically assigned to genus so this level of taxonomic classification was excluded.
The number of 16S rRNA gene sequences from honey bee guts with identical or completely divergent classifications across three widely used training sets (RDP, Greengenes, SILVA) is shown. At finer taxonomic levels, discordance and errors in taxonomic placement increase across all three datasets. The addition of honey bee specific sequences greatly improves the congruence across all datasets (last column).
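A tally of this kind can be sketched as follows. The sketch assumes that "identical" means all training sets give the same taxon at a rank and that "completely divergent" means every training set gives a different taxon; the source does not spell out these definitions. The nested-dictionary input, the function name `concordance_by_rank`, and the toy lineages are hypothetical.

```python
from collections import Counter

RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus"]


def concordance_by_rank(assignments, ranks=RANKS):
    """Count identical and completely divergent calls at each rank.

    assignments: {seq_id: {training_set: {rank: taxon}}} (assumed layout).
    Returns {rank: (n_identical, n_divergent)}.
    """
    identical = Counter()
    divergent = Counter()
    for per_set in assignments.values():
        for rank in ranks:
            labels = [lineage.get(rank) for lineage in per_set.values()]
            if None in labels:
                # Unclassified in at least one training set:
                # counted in neither column (an assumption of this sketch).
                continue
            distinct = set(labels)
            if len(distinct) == 1:
                identical[rank] += 1
            elif len(distinct) == len(labels):
                divergent[rank] += 1
    return {rank: (identical[rank], divergent[rank]) for rank in ranks}


# Toy lineages, invented for illustration only.
toy = {
    "seq1": {
        "RDP": {"Order": "Lactobacillales", "Family": "Lactobacillaceae"},
        "GG": {"Order": "Lactobacillales", "Family": "Lactobacillaceae"},
        "SILVA": {"Order": "Lactobacillales", "Family": "Lactobacillaceae"},
    },
    "seq2": {
        "RDP": {"Order": "Rhizobiales", "Family": "Bartonellaceae"},
        "GG": {"Order": "Rhizobiales", "Family": "Brucellaceae"},
        "SILVA": {"Order": "Rhizobiales", "Family": "Rhizobiaceae"},
    },
}
result = concordance_by_rank(toy, ranks=["Order", "Family"])
```

Sequences on which two training sets agree and one differs fall into neither count, so under these definitions the two columns need not sum to the total number of sequences.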
Bacterial isolates with genus and species designations that clade within the bee-specific groups
| Alpha-2.2 | |
| Alpha-2.1 | |
| Alpha-1 | |
| Firm-5 | |
These isolates, and their existing taxonomic information, may inform research into the function of the honey bee gut microbiota.
Diversity of species and unique sequences found within honey bee microbiota
| Family | Unique sequences | OTUs (97% identity) |
| --- | --- | --- |
| Enterobacteriaceae | 1621 | 175 |
| gamma-1 | 436 | 48 |
| beta | 532 | 35 |
| Bifidobacteriaceae | 363 | 32 |
| firm-5 | 929 | 32 |
| firm-4 | 253 | 21 |
| alpha-2.1 | 90 | 15 |
| alpha-1 | 65 | 13 |
| Lactobacillaceae | 86 | 12 |
| Flavobacteriaceae | 2 | 2 |
| Leuconostocaceae | 2 | 2 |
| Moraxellaceae | 6 | 2 |
| Sphingomonadaceae | 2 | 2 |
| Xanthomonadaceae | 2 | 2 |
| Actinomycetaceae | 1 | 1 |
| Aeromonadaceae | 1 | 1 |
| alpha-2.2 | 10 | 1 |
| Clostridiaceae | 2 | 1 |
| Corynebacteriaceae | 1 | 1 |
| Cytophagaceae | 1 | 1 |
| Enterococcaceae | 9 | 1 |
| Incertae_Sedis_XI | 1 | 1 |
| Kineosporiaceae | 1 | 1 |
| Nakamurellaceae | 1 | 1 |
| Oxalobacteraceae | 1 | 1 |
| Prevotellaceae | 1 | 1 |
For each family found within honey bee guts (based on SILVA + bees classification), the number of unique sequences and the number of 97% identical operational taxonomic units (OTUs) are shown.
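A per-family summary like this one can be produced by a simple tally, sketched below. The sketch assumes that each sequence already carries a family label and an OTU identifier from an upstream 97% clustering step, which is not implemented here; the record layout, the function name `summarize_families`, and the toy records are hypothetical.

```python
from collections import defaultdict


def summarize_families(records):
    """Tally unique sequences and OTUs per family.

    records: iterable of (seq_id, family, otu_id) tuples, where otu_id
             comes from a separate 97% clustering step (assumed input).
    Returns a list of (family, n_unique_sequences, n_otus), sorted by
    OTU count (descending) and then by family name.
    """
    seqs = defaultdict(set)
    otus = defaultdict(set)
    for seq_id, family, otu_id in records:
        seqs[family].add(seq_id)
        otus[family].add(otu_id)
    rows = [(fam, len(seqs[fam]), len(otus[fam])) for fam in seqs]
    rows.sort(key=lambda row: (-row[2], row[0]))
    return rows


# Toy records, invented for illustration only.
toy_records = [
    ("s1", "Enterobacteriaceae", "otu1"),
    ("s2", "Enterobacteriaceae", "otu1"),
    ("s3", "Enterobacteriaceae", "otu2"),
    ("s4", "Bifidobacteriaceae", "otu3"),
]
rows = summarize_families(toy_records)
```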