Gemma Henderson, Pelin Yilmaz, Sandeep Kumar, Robert J Forster, William J Kelly, Sinead C Leahy, Le Luo Guan, Peter H Janssen.
Abstract
The taxonomy and associated nomenclature of many taxa of rumen bacteria are poorly defined within databases of 16S rRNA genes. This lack of resolution results in inadequate definition of microbial community structures, with large parts of the community designated as incertae sedis, unclassified, or uncultured within families, orders, or even classes. We have begun resolving these poorly defined groups of rumen bacteria so that they can be named for use in microbial community profiling. We used the previously reported global rumen census (GRC) dataset, consisting of >4.5 million partial bacterial 16S rRNA gene sequences amplified from 684 rumen samples and representing a wide range of animal hosts and diets. Representative sequences from the 8,985 largest operational units (groups of sequences sharing >97% sequence similarity, and covering 97.8% of all sequences in the GRC dataset) were used to identify 241 pre-defined clusters (mainly at genus or family level) of abundant rumen bacteria in the ARB SILVA 119 framework. A total of 99 of these clusters (containing 63.8% of all GRC sequences) either lacked unique taxonomic identifiers or had inadequate ones, and each was given a unique nomenclature. We assessed this improved framework by comparing its taxonomic assignments of bacterial 16S rRNA gene sequence data in the GRC dataset with those made using the original SILVA 119 framework and three other frameworks. The two SILVA frameworks performed best at assigning sequences to genus-level taxa. The SILVA 119 framework allowed 55.4% of the sequence data to be assigned to 751 uniquely identifiable genus-level groups. The improved framework increased this to 87.1% of all sequences, assigned to one of 871 uniquely identifiable genus-level groups. The new designations were included in the SILVA 123 release (https://www.arb-silva.de/documentation/release-123/) and will be perpetuated in future releases.
Keywords: 16S rRNA genes; Next generation sequencing; Rumen bacteria; SILVA; Taxonomic assignment; Working taxonomic framework
Year: 2019; PMID: 30863673; PMCID: PMC6407505; DOI: 10.7717/peerj.6496
Source DB: PubMed; Journal: PeerJ; ISSN: 2167-8359; Impact factor: 2.984
Figure 1. Assignment to different taxonomic ranks with different frameworks.
(A) Number of taxa identified at different taxonomic ranks in the GRC dataset using different taxonomic frameworks. Also shown are the numbers of unique taxonomic strings returned. (B) Assignment (%) of GRC sequences to defined taxa at different taxonomic ranks using different frameworks.
Figure 2. Schematic showing relative abundances of taxa after assignment of sequences using different taxonomic frameworks.
(A) shows assignments at the phylum level. (B) and (C) show taxa affiliated with the phyla (B) Bacteroidetes and (C) Firmicutes that occur at an abundance of at least 0.05% in any one sample. Any genus-level taxa with a relative abundance below 0.05% are grouped together as “Other groups.” Blocks with the same colors represent the same taxonomic designations in the different frameworks. More detailed microbial community compositions are provided in Table S3.
Comparison of the nomenclature of taxonomic assignments made using different databases.
| OTU ID | Average abundance (%) | RDP release 11.4 | Greengenes 13_8 | SILVA 119 | SILVA 119Rum |
|---|---|---|---|---|---|
| 365725 | 0.240 | g_ | f_ | g_ | g_ |
| 722152 | 0.558 | c_ | f_ | f_ | g_ |
| 15480 | 0.265 | c_ | f_ | f_ | g_ |
| 9138 | 0.231 | g_ | g_ | g_ | g_ |
| 142948 | 0.556 | f_ | g_ | g_ | g_ |
| 664059 | 0.243 | g_ | g_ | g_ | g_ |
| 480108 | 0.475 | g_ | g_ | g_ | g_ |
| 90393 | 0.504 | o_ | o_ | f_ | g_ |
| 284365 | 0.269 | g_ | g_ | g_ | g_ |
| 237285 | 0.407 | g_ | g_ | g_ | g_ |
| 301314 | 0.051 | g_ | g_ | g_ | g_ |
| 493059 | 0.348 | f_ | g_ | g_ | g_ |
| 698124 | 0.265 | f_ | f_ | g_ | g_ |
| 732718 | 0.145 | f_ | g_ | g_ | g_ |
| 109054 | 0.049 | f_ | g_ | g_ | g_ |
| 234051 | 0.067 | f_ | g_ | f_ | g_[ |
| 311462 | 0.060 | f_ | o_ | f_ | g_[ |
| 205298 | 0.077 | f_ | o_ | f_ | g_ |
| 605934 | 0.068 | f_ | f_ | f_ | g_ |
| 295461 | 0.148 | f_ | f_ | f_ | g_[ |
| 401207 | 0.124 | f_ | f_ | f_ | g_ |
| 237588 | 0.070 | o_ | f_ | g_ | g_ |
| 580981 | 0.140 | f_ | g_ | g_ | g_ |
| 139212 | 0.180 | f_ | f_ | g_ | g_ |
Note:
Shown are the 24 unique examples among the 25 most abundant OTUs and 10 most abundant Lachnospiraceae and Ruminococcaceae, listing the lowest defined rank with a unique identifiable name. Each name is preceded by a letter giving the rank: c = class, o = order, f = family, g = genus. The average abundance in the GRC dataset is also given. The full dataset of 77 OTUs is given in Table S2.
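The rank prefixes in the table can be tallied directly. The following is a minimal sketch, not part of the original analysis: the names `FRAMEWORKS`, `RANK_DEPTH`, `ROWS`, `genus_counts`, and `deepened_by` are illustrative, and the data are only the prefix letters transcribed from the 24 rows above.

```python
# Lowest uniquely named rank per OTU, transcribed from the table above.
# Each 4-letter string is one table row, in column order:
# RDP release 11.4, Greengenes 13_8, SILVA 119, SILVA 119Rum.
FRAMEWORKS = ["RDP release 11.4", "Greengenes 13_8", "SILVA 119", "SILVA 119Rum"]

# Prefix letters from the table note, ordered from coarse to fine.
RANK_DEPTH = {"c": 0, "o": 1, "f": 2, "g": 3}

ROWS = [
    "gfgg", "cffg", "cffg", "gggg", "fggg", "gggg", "gggg", "oofg",
    "gggg", "gggg", "gggg", "fggg", "ffgg", "fggg", "fggg", "fgfg",
    "fofg", "fofg", "fffg", "fffg", "fffg", "ofgg", "fggg", "ffgg",
]


def genus_counts(rows):
    """Count, per framework, the OTUs whose lowest named rank is genus."""
    return {
        name: sum(1 for row in rows if row[i] == "g")
        for i, name in enumerate(FRAMEWORKS)
    }


def deepened_by(rows, old, new):
    """Count OTUs that framework `new` names at a finer rank than `old`."""
    i, j = FRAMEWORKS.index(old), FRAMEWORKS.index(new)
    return sum(1 for row in rows if RANK_DEPTH[row[j]] > RANK_DEPTH[row[i]])


if __name__ == "__main__":
    print(genus_counts(ROWS))
    print(deepened_by(ROWS, "SILVA 119", "SILVA 119Rum"))
```

For these 24 rows, the tally gives 7, 12, 15, and 24 genus-level names for RDP release 11.4, Greengenes 13_8, SILVA 119, and SILVA 119Rum respectively, with 9 OTUs named at a finer rank by SILVA 119Rum than by SILVA 119.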
Assignment of sequences in the GRC dataset to bacterial families and to named genera within those families.
| Family | Assigned | RDP release 11.4 | Greengenes 13_8 | SILVA 119 | SILVA 119Rum |
|---|---|---|---|---|---|
| | To genera in family | 13.76 (5) | 22.20 (1) | 20.61 (3) | 24.80 (13) |
| | To family | 22.92 (6) | 22.42 (2) | 24.95 (5) | 24.96 (15) |
| | To genera in family | 0 (0) | 0 (0) | 0.05 (1) | 10.64 (1) |
| | To family | 0 (0) | 0.93 (1) | 10.70 (2) | 10.69 (3) |
| | To genera in family | 2.01 (25) | 6.49 (20) | 10.24 (27) | 15.95 (69) |
| | To family | 15.56 (27) | 12.59 (21) | 16.90 (30) | 16.98 (72) |
| | To genera in family | 2.31 (21) | 3.97 (7) | 5.22 (18) | 13.11 (43) |
| | To family | 10.59 (22) | 11.64 (8) | 13.25 (21) | 13.17 (45) |
| | To genera in family | 0.76 (4) | 0.63 (5) | 0.75 (5) | 1.78 (7) |
| | To family | 0.83 (5) | 1.78 (6) | 1.80 (6) | 1.80 (8) |
Note:
The numbers are the average percentage that those sequences make up in samples in the GRC dataset. The numbers in parentheses are the number of genera to which the sequences are assigned or the number of groups within the family (these include subgroups designated as “unclassified” and “uncultured” that have no unique genus-level identifier).
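Each cell in the table combines a percentage with a count in parentheses, so the share of a family's sequences that reach a named genus can be derived from a pair of rows. The sketch below is illustrative only: `parse_cell` and `genus_share` are hypothetical helper names, and the cell format is assumed to be exactly `percentage (count)` as printed above.

```python
import re

# Matches cells such as "24.80 (13)" or "0 (0)".
CELL = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*\((\d+)\)\s*$")


def parse_cell(cell):
    """Split a 'percentage (count)' cell into (float, int)."""
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    return float(match.group(1)), int(match.group(2))


def genus_share(to_genera_cell, to_family_cell):
    """Fraction of a family's sequences assigned to named genera.

    Returns None when no sequences were assigned to the family at all.
    """
    genera_pct, _ = parse_cell(to_genera_cell)
    family_pct, _ = parse_cell(to_family_cell)
    if family_pct == 0:
        return None
    return genera_pct / family_pct


if __name__ == "__main__":
    # First pair of rows in the table, SILVA 119 versus SILVA 119Rum.
    print(genus_share("20.61 (3)", "24.95 (5)"))
    print(genus_share("24.80 (13)", "24.96 (15)"))
```

For the first pair of rows, this gives roughly 83% of family-level sequences reaching a named genus with SILVA 119 and roughly 99% with SILVA 119Rum.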
Genus-level taxa in the family Lachnospiraceae.
| SILVA 119 taxon | SILVA 119 abundance (%) | SILVA 119Rum abundance (%) | SILVA 119Rum taxon |
|---|---|---|---|
| | 1.194 | 1.194 | |
| | 1.574 | 0.558 | |
| | | 1.016 | |
| | 4.197 | 1.479 | |
| | | 2.718 | |
| | 0.331 | 0.161 | |
| | | 0.139 | |
| | | 0.030 | |
| | 0.098 | 0.098 | |
| | 2.552 | 0.031 | ( |
| | | 0.017 | ( |
| | | 0.260 | ( |
| | | 0.013 | ( |
| | | 0.065 | ( |
| | | 0.313 | ( |
| | | 0.267 | ( |
| | | 0.701 | ( |
| | | 0.173 | ( |
| | | 0.651 | ( |
| | | 0.015 | |
| | | 0.046 | |
| | 0.099 | 0.099 | |
| | 0.503 | 0.503 | |
| | 0.163 | 0.163 | |
| | 0.344 | 0.344 | |
| | 1.355 | 1.355 | |
| | 0.610 | 0.610 | |
| | 0.125 | 0.125 | |
| | 0.198 | 0.198 | |
| | 3.692 | 0.965 | |
| | | 0.281 | |
| | | 0.016 | |
| | | 0.049 | |
| | | 0.389 | |
| | | 0.291 | |
| | | 0.052 | |
| | | 0.102 | |
| | | 0.012 | |
| | | 0.138 | |
| | | 0.020 | |
| | | 0.311 | |
| | | 0.087 | |
| | | 0.979 | |
Note:
The taxa are grouped so that the finer divisions using SILVA 119Rum are lined up alongside the original divisions made using SILVA 119. The abundances are the averages in the GRC dataset.
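Because the SILVA 119Rum divisions subdivide the SILVA 119 groups, the finer abundances should add up to the original value within rounding. The sketch below checks this for the five groups that are split in the table. The names `SPLIT_GROUPS` and `is_consistent` and the tolerance of 0.002 are assumptions made for illustration; the abundances are transcribed from the table.

```python
# (SILVA 119 abundance, [SILVA 119Rum abundances it is split into]),
# transcribed from the table above.
SPLIT_GROUPS = [
    (1.574, [0.558, 1.016]),
    (4.197, [1.479, 2.718]),
    (0.331, [0.161, 0.139, 0.030]),
    (2.552, [0.031, 0.017, 0.260, 0.013, 0.065, 0.313,
             0.267, 0.701, 0.173, 0.651, 0.015, 0.046]),
    (3.692, [0.965, 0.281, 0.016, 0.049, 0.389, 0.291, 0.052,
             0.102, 0.012, 0.138, 0.020, 0.311, 0.087, 0.979]),
]


def is_consistent(parent, parts, tolerance=0.002):
    """True if the finer divisions sum to the parent within rounding error."""
    return abs(sum(parts) - parent) <= tolerance


if __name__ == "__main__":
    for parent, parts in SPLIT_GROUPS:
        status = "ok" if is_consistent(parent, parts) else "mismatch"
        print(f"{parent:.3f} vs {sum(parts):.3f} in {len(parts)} parts: {status}")
```

The tolerance allows for values that were rounded to three decimals before being tabulated; the third group, for example, sums to 0.330 against a listed 0.331.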