Construction & assessment of a unified curated reference database for improving the taxonomic classification of bacteria using 16S rRNA sequence data.

Shikha Agnihotry1, Aditya N Sarangi1, Rakesh Aggarwal2.   

Abstract

Background & objectives: For bacterial community analysis, 16S rRNA sequences are subjected to taxonomic classification through comparison with one of the three commonly used databases [Greengenes, SILVA and Ribosomal Database Project (RDP)]. It was hypothesized that a unified database containing fully annotated, non-redundant sequences from all the three databases, might provide better taxonomic classification during analysis of 16S rRNA sequence data. Hence, a unified 16S rRNA database was constructed and its performance was assessed by using it with four different taxonomic assignment methods, and for data from various hypervariable regions (HVRs) of 16S rRNA gene.
Methods: We constructed a unified 16S rRNA database (16S-UDb) by merging non-ambiguous, fully annotated, full-length 16S rRNA sequences from the three databases and compared its performance in taxonomy assignment with that of three original databases. This was done using four different taxonomy assignment methods [mothur Naïve Bayesian Classifier (mothur-nbc), RDP Naïve Bayesian Classifier (rdp-nbc), UCLUST, SortMeRNA] and data from 13 regions of 16S rRNA [seven hypervariable regions (HVR) (V2-V8) and six pairs of adjacent HVRs].
Results: Our unified 16S rRNA database contained 13,078 full-length, fully annotated 16S rRNA sequences. It could assign genus and species to larger proportions (90.05 and 46.82%, respectively, when used with mothur-nbc classifier and the V2+V3 region) of sequences in the test database than the three original 16S rRNA databases (70.88-87.20% and 10.23-24.28%, respectively, with the same classifier and region).
Interpretation & conclusions: Our results indicate that for analysis of bacterial mixtures, sequencing of the V2-V3 region of 16S rRNA followed by analysis of the data using the mothur-nbc classifier and our 16S-UDb database may be preferred.

Keywords:  16S rRNA; bioinformatics; hypervariable regions; metagenomics; microbiota

Year:  2020        PMID: 32134020      PMCID: PMC7055167          DOI: 10.4103/ijmr.IJMR_220_18

Source DB:  PubMed          Journal:  Indian J Med Res        ISSN: 0971-5916            Impact factor:   2.375


Several human body sites contain a variety of organisms, including prokaryotes and eukaryotes, collectively referred to as microbiota. High-throughput genomic sequencing of 16S rRNA is used for profiling of microbiota1. This bacterial gene has nine hypervariable regions (HVRs) interspersed with conserved nucleotide sequences. Sequences of these HVRs differ between bacterial groups and hence can be used to identify different bacteria2. Further, high-throughput platforms allow sequencing of several nucleic acid molecules in parallel. One of the commonly used platforms, namely Illumina, can read DNA sequences beginning at each end of a DNA fragment for up to 300 nucleotides; these paired-end reads can then be merged to obtain sequences up to 575 nucleotides long, enough to cover one or two adjacent HVRs. Sequencing of one or more HVRs using this platform, followed by taxonomic identification of each sequence by matching with a database of known bacterial 16S rRNA gene sequences, is one of the most frequently used methods for determining the type and abundance of various bacteria present in specimens that contain a mixture of several bacteria3. Three databases are commonly used to identify bacterial 16S rRNA sequences, namely SILVA4, Ribosomal Database Project (RDP)5 and Greengenes6. These databases overlap partially, with each containing some entries which are absent in the others. The databases also vary in their coverage; for instance, the most abundant genus in the Greengenes database is Prevotella and that in the SILVA database is Lachnospira. Further, several taxonomy assignment methods such as UCLUST7, mothur-nbc8, rdp-nbc9 and SortMeRNA10 are used for comparing the query sequences to these databases.
Effectiveness of bioinformatic analysis to assign individual reads in a high-throughput 16S rRNA sequence dataset to various bacterial species can be expected to vary with the HVR sequenced, the 16S rRNA reference database used and the taxonomic assignment methods used. A few studies have assessed the effect of varying one factor, e.g., the reference database11, or assignment method12 on the taxonomic assignment process, at a time; however, no study has comprehensively looked at the effect of varying all the three factors together. Therefore, we analysed the composite effect of varying the three factors. Further, it was hypothesized that the use of a hybrid database which included entries from all the three major 16S rRNA sequence databases, might improve the phylogenetic assignment. Hence, a unified 16S rRNA database was constructed by merging non-ambiguous, full-length prokaryotic 16S rRNA sequences with complete annotation up to the species level from the three commonly used databases. The relative performance in taxonomic assignment of this new unified database [referred to as 16S unified database (16S-UDb)] and each of its constituents taken individually was compared, using four different taxonomic assignment methods and using different HVR of 16S rRNA.

Material & Methods

Acquisition and preparation of individual 16S rRNA databases: For the Greengenes and SILVA databases, files containing 16S rRNA reference sequences (pre-clustered at 97% threshold) and the corresponding taxonomy mapping information in QIIME format were downloaded. For RDP (release 11.4), unaligned bacterial 16S rRNA sequences were downloaded and made QIIME-compatible by removing sequences that were shorter than 1200 nucleotides or that contained any ambiguous base, followed by clustering at 97 per cent threshold using VSEARCH13. In addition, a taxonomy mapping file in QIIME-compatible format was created by linking RDP sequence identifiers of the representative sequences with 6-level (phylum, class, order, family, genus and species) taxonomic lineage hierarchy.
Construction of a unified 16S rRNA database (16S-UDb): Bacterial 16S rRNA gene sequences were obtained from three databases - Greengenes (v.13.5), RDP (v11.4) and SILVA ‘All-Species Living Tree’ Project. For each, any sequences shorter than 1200 nucleotides in length, containing any ambiguous base, not classified up to the species level, or labelled as derived from environment or cloned material were purged. Sequences classified up to subspecies or isolate level were grouped to the species level. These curated data from the three databases were merged and clustered at 97 per cent threshold using VSEARCH13. Further, a taxonomy mapping file was created by linking sequence identifiers of the representative sequences with 6-level taxonomic lineage hierarchy.
Construction of test dataset: A test dataset was required for comparing the performance of the unified and the individual 16S rRNA databases. For this, a pre-formatted 16S rRNA reference dataset was downloaded from the NCBI ftp site, and sequences of 1200 nucleotides or longer in it were extracted using the ‘blastdbcmd’ utility of NCBI stand-alone BLAST version 2.2.28.
The identifier of each sequence was used to query the NCBI taxonomy database to obtain its taxonomic information. These data were used to create a taxonomy mapping file containing the reference identifiers and their 6-level taxonomic lineage hierarchies up to the species level; any sequences with incomplete species information were removed, and those with taxonomic lineage defined up to subspecies or isolates were grouped to species level. Any duplicate sequences with 100 per cent identity were removed using CD-HIT14. This provided a test dataset of non-redundant full-length 16S rRNA sequences classified up to species level.
Construction of test datasets of shorter lengths: Primer-pairs flanking seven individual HVRs (V2 to V8) of bacterial 16S rRNA and six pairs of adjacent HVRs [V23 (i.e., V2+V3), V34, V45, V56, V67, V78] were identified (Supplementary Table I). Their sequences were aligned with the test dataset, using the ClustalW program in-built in BioEdit Suite v.7.2.5, and the nucleotide sequences lying between the primers in primer pairs covering each HVR (V2 to V8; 7 sets) or two adjacent HVRs (i.e., V2+V3, V3+V4, V4+V5, V5+V6, V6+V7 and V7+V8; 6 sets) were extracted. This yielded a total of 13 test datasets. The V1 and V9 regions were ignored since these are rarely used for metagenomic analysis because of their incomplete nature15.
Supplementary Table I

Location and sequences of primers used for extracting data on each hypervariable region (HVR) and combinations of adjacent HVRs

HVR     Forward primer                     Reverse primer                      Reference
        Location  Sequence                 Location  Sequence
V2      119       AGYGGCGNACGGGTGAGTAA     338       TGCTGCCTCCCGTAGGAGT       1
V3      341       CCTACGGGAGGCAGCAG        534       ATTACCGCGGCTGCTGG         2
V4      577       AYTGGGYDTAAAGNG          785       TACNVGGGTATCTAATCC        3, 4
V5      784       AGGATTAGATACCCT          907       CCGTCAATTCCTTTGAGTTT      5, 6
V6      961       TCGAtGCAACGCGAAGAA       1085      ACATtTCACaACACGAGCTGACGA  7
V7      1099      GYAACGAGCGCAACCC         1238      GTAGCRCGTGTGTMGCCC        8, 9
V8      1050      ATGGCTGTCGTCAGCT         1385      ACGGGCGGTGTGTAC           10
V2+V3   119       AGYGGCGNACGGGTGAGTAA     518       ATTACCGCGGCTGCTGG         1, 2
V3+V4   341       CCTACGGGRSGCAGCAG        798       GGGGTATCTAATCCC           2, 11
V4+V5   577       AYTGGGYDTAAAGNG          907       CCGTCAATTYYTTTRAGTTT      3, 5, 6
V5+V6   805       GACTACCAGGGTATCTAATCC    1065      AGGTGCTGCATGGCTGT         9
V6+V7   967       CAACGCGAAGAACCTTACC      1238      GTAGCRCGTGTGTMGCCC        9
V7+V8   1046      ACAGCCATGCAGCACCT        1406      GACGGGCGGTGWGTRCA         9

N, any nucleotide; Y, C or T; D, not C; V, not T; R, A or G; M, A or C; lowercase nucleotide, low confidence base
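The degenerate codes in the primer table can be expanded mechanically when locating primer sites. The following is a minimal sketch of extracting the bases lying between a primer pair by exact pattern matching; it is not the ClustalW alignment in BioEdit that the study used. The function names are hypothetical, and the sketch assumes that reverse primers are written 5'→3' on the opposite strand (so they are reverse-complemented before searching) and that sequences are upper-case DNA.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles the degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_regex(primer):
    """Turn a (possibly degenerate) primer into a regex pattern."""
    return "".join(IUPAC[base] for base in primer.upper())


def reverse_complement(seq):
    """Reverse-complement a sequence, keeping IUPAC ambiguity codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def extract_region(sequence, forward, reverse):
    """Return the bases between the two primer sites, or None if either is absent."""
    fwd = re.search(primer_regex(forward), sequence)
    if fwd is None:
        return None
    tail = sequence[fwd.end():]
    rev = re.search(primer_regex(reverse_complement(reverse)), tail)
    if rev is None:
        return None
    return tail[:rev.start()]
```

Exact matching misses primer sites that carry mismatches or indels, which an alignment-based approach tolerates; the sketch only illustrates how the degenerate codes are interpreted.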

Comparison of different approaches to identify an optimal pipeline: The performances of various combinations of four taxonomy assignment methods (UCLUST, SortMeRNA, mothur-nbc and rdp-nbc) and four 16S rRNA databases (Greengenes, SILVA, RDP and the new 16S-UDb) in correctly classifying reads included in the 13 HVR/HVR-pair test datasets, up to the genus and species levels, were determined. This was done using 208 runs (=4×4×13) of the assign_taxonomy.py script of the QIIME software package16. The UCLUST-based classifier uses a consensus taxonomy assignment approach; it was invoked with the parameters min_consensus_fraction: 0.51, similarity: 0.9 and uclust_max_accepts: 3. The SortMeRNA algorithm is based on approximate seeds and allows fast and sensitive analysis of rRNA sequences. SortMeRNA was run with the parameters sortmerna_coverage: 0.9, sortmerna_best_N_alignments: 5 and sortmerna_e_value: 1.0. The two naive Bayes-based classifiers (mothur-nbc and rdp-nbc) were retrained using the reference operational taxonomic units (OTUs) clustered at 97 per cent threshold and the corresponding taxonomy mapping files from the above four databases, individually, with a minimum confidence score (-c) of 0.80. Performance of each classifier-database combination was calculated as the proportion of sequences in the test dataset that were correctly classified.
Assessment of performance of the ‘optimal pipeline’ using a real-life dataset: For this, Illumina 16S rRNA gene V3 sequencing data were used from one of our previous studies17.
It contained 20,508,594 high-quality reads obtained from stool specimens of 14 healthy persons [median number of reads per specimen: 272,327 (range 136,563 to 761,961); total reads 4,728,631] and 33 patients with enthesitis-related arthritis [median 397,351 (102,093 to 1,502,380); total reads 15,871,719]. These data were subjected to taxonomy assignment using the optimum approach identified in the previous step. The results obtained were compared with those obtained using the approach used in the original analysis, i.e., using the UCLUST Consensus Taxonomy Assigner and the sub-sampled open-reference OTU picking protocol of QIIME 1.8 against the Greengenes v.13.8 reference OTUs pre-clustered at 97 per cent threshold with the software's default parameters16. During these analyses, any singleton OTUs (with only one sequence in all specimens taken together), unassigned OTUs and eukaryotic OTUs were removed from the ‘biom’ files generated by each approach. Further, to reduce noise, OTUs that were observed in fewer than 10 per cent of stool specimens or that accounted for fewer than 0.002 per cent of reads in all the specimens taken together were also purged. The relative performance of the two methods was assessed by comparing the BIOM files generated by each.
Comparison of computational performances of various approaches: The time taken for different combinations of databases and classifiers was assessed using an Intel Corporation Xeon E7 Workstation with 6 processors (Intel Corp., CA, USA) using identical parameters, i.e., pre-filtration of sequences at 60 per cent identity and subsample-based open-reference OTU picking method with the use of 10 per cent subsample. The study was carried out at the Biomedical Informatics Centre, Sanjay Gandhi Postgraduate Institute of Medical Sciences, Lucknow, India.
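The noise-reduction thresholds applied to the OTU tables can be sketched as follows. This is an illustrative sketch, not the study's code: the function name and the dictionary layout (OTU identifier mapped to per-specimen read counts) are hypothetical, the total read count is assumed to be taken before filtering, and removal of unassigned and eukaryotic OTUs is omitted because it depends on the taxonomy labels.

```python
def filter_otus(table, min_prevalence=0.10, min_abundance=0.00002):
    """Drop singleton, rarely observed and low-abundance OTUs.

    table maps an OTU id to a list of read counts, one per specimen.
    min_abundance of 0.00002 corresponds to 0.002 per cent of all reads.
    """
    # Assumption: the denominator is the read total before any filtering.
    total_reads = sum(sum(counts) for counts in table.values())
    kept = {}
    for otu, counts in table.items():
        otu_reads = sum(counts)
        if otu_reads <= 1:
            continue  # singleton: one sequence across all specimens
        prevalence = sum(1 for c in counts if c > 0) / len(counts)
        if prevalence < min_prevalence:
            continue  # seen in fewer than 10 per cent of specimens
        if otu_reads / total_reads < min_abundance:
            continue  # fewer than 0.002 per cent of all reads
        kept[otu] = counts
    return kept
```

With 47 specimens, as in the real-life dataset, the prevalence rule keeps only OTUs observed in at least five specimens.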

Results & Discussion

Acquisition of 16S rRNA databases: The Greengenes database (v.13.5), when pre-clustered at 97 per cent threshold as inbuilt in QIIME V1 software, contained 99,322 sequences belonging to 1812 unique genera, and the SILVA database (v.123) with similar pre-clustering contained 216,401 sequences belonging to 3541 unique genera. For RDP, the QIIME-compatible data in a similar format contained 145,925 sequences belonging to 2737 unique genera. Figure A shows the number of taxonomic units (by name) at various ranks, i.e., phylum to genus, between the SILVA, RDP and Greengenes taxonomies. This comparison showed that the three databases varied markedly in the taxa represented in each, with only 19.8 per cent of the phyla, 9.7 per cent of classes, 9 per cent of orders, 14.8 per cent of families and 27.6 per cent of genus ranks present in any of the three databases being shared across all the databases.
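The shared percentages quoted above are the ratio of taxon names present in all three databases to taxon names present in any of them. A minimal sketch of that calculation, using a hypothetical function name and toy genus sets rather than the real database contents:

```python
def shared_fraction(*name_sets):
    """Fraction of taxon names found in every set, out of names found in any set."""
    sets = [set(names) for names in name_sets]
    union = set().union(*sets)
    shared = set.intersection(*sets)
    return len(shared) / len(union)


# Toy example (illustrative names only, not the actual database contents).
greengenes = {"Bacillus", "Prevotella", "Lachnospira"}
silva = {"Bacillus", "Lachnospira", "Vibrio"}
rdp = {"Bacillus", "Vibrio", "Mycoplasma"}
toy_share = shared_fraction(greengenes, silva, rdp)  # 1 shared name out of 5
```

The comparison is by name only, so the same organism recorded under different names in two databases counts as two distinct taxa.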
Figure

Comparison of taxonomies based on taxon names found at each rank from phylum to genus. The three taxonomies, SILVA, Ribosomal Database Project (RDP) and Greengenes (GG), commonly used for 16S rRNA based analyses were compared in detail (Panel A), and then a union of these three databases (labelled as ALL) was compared against the unified 16S-UDb database (Panel B).

Construction of integrated 16S database: Of the 4,345,168 bacterial 16S rRNA gene sequences obtained from the three databases (Greengenes v.13.5: 1,262,986 sequences, RDP v.11.4: 3,070,243 sequences and SILVA ‘All-Species Living Tree’ Project v.123: 11,939 sequences), 2,629,394 (60.5%) were 1200 nucleotides or longer in length and free of ambiguous nucleotides. Of these, 405,538 were fully classified up to the species level. These 405,538 sequences were clustered at 97 per cent threshold using VSEARCH13. For the current version of 16S-UDb v1.0, this yielded 13,078 unified sequences from the bacterial kingdom, belonging to 36 phyla, 94 classes, 187 orders, 414 families, 2453 genera and 4881 unique species-like groups. Proteobacteria was the predominant phylum accounting for 38.24 per cent of all the sequences, followed by Firmicutes (28.91%), Actinobacteria (12.72%), Bacteroidetes (10.44%), Cyanobacteria (2.6%), Tenericutes (1.39%), Spirochaetes (0.83%), Verrucomicrobia (0.47%) and 29 other minor phyla. The numbers of taxonomic units (by name) shared between a union of the SILVA, RDP and Greengenes taxonomies (labelled as ALL) versus 16S-UDb at phylum to genus ranks are shown in Figure B. The top 10 phyla, classes, orders, families and genera in the Greengenes, SILVA, RDP and 16S-UDb databases are shown in Supplementary Table II. Of the taxonomic units at various levels contained in ‘ALL’ but not in 16S-UDb, a large majority were present in only one of the three starting databases, i.e., Greengenes, SILVA or RDP, and very few were represented in two or three of these (Supplementary Table III).
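The per-sequence eligibility criteria behind the counts above (length, absence of ambiguous bases, annotation to species level) can be sketched as a simple predicate. This is an assumption-laden illustration: the function name and the six-element lineage tuple are hypothetical, and the subsequent merging and clustering at 97 per cent with VSEARCH is not shown.

```python
UNAMBIGUOUS = set("ACGTU")


def is_eligible(sequence, lineage, min_length=1200):
    """Apply the length, ambiguity and species-annotation filters.

    lineage is assumed to be a 6-element sequence:
    (phylum, class, order, family, genus, species).
    """
    seq = sequence.upper()
    if len(seq) < min_length:
        return False  # shorter than 1200 nucleotides
    if not set(seq) <= UNAMBIGUOUS:
        return False  # contains an ambiguous base such as N or R
    if len(lineage) < 6 or not all(lineage[:6]):
        return False  # not classified down to the species level
    return True
```

Applying such a predicate to each input record, and then pooling the survivors from the three databases, would give the set that is passed on to clustering.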
Supplementary Table II

Top 10 phyla, classes, orders, families and genera of Greengenes, SILVA, RDP and 16S-UDb databases (clustered at 97% threshold); values are percentages of sequences

Rank    Taxon                Greengenes  SILVA       RDP         16S-UDb
Phylum  Proteobacteria       27.69       31.01       29.22       38.24
        Firmicutes           26.84       29.41       27.03       29.91
        Actinobacteria        5.60        8.37        6.60       12.72
        Bacteroidetes        13.05       11.34       12.80       10.44
        Cyanobacteria         2.13        2.09        1.76        1.93
        Tenericutes           0.74        0.57        0.26        1.39
        Spirochaetes          1.45        1.08        1.03        0.83
        Verrucomicrobia       1.76        1.09        1.61        0.47
        Deinococcus-Thermus    N/A        0.19        0.22        0.45
        Planctomycetes        3.42        2.59        2.40        0.38
Class   Gammaproteobacteria   9.11       11.35       10.19       21.20
        Bacilli               4.55        6.48        5.43       17.57
        Actinobacteria        3.73        6.54        6.53       12.59
        Clostridia           21.73       21.48       19.23       11.16
        Alphaproteobacteria   7.69        8.62        7.84        9.26
        Betaproteobacteria    4.04        5.39        4.49        4.56
        Bacteroidia           7.31        5.28        5.55        3.83
        Flavobacteriia        2.23        2.24        2.24        3.56
        Deltaproteobacteria   5.59        4.50        4.78        1.85
        Sphingobacteriia      0.57        1.99        2.27        1.58
Order   Bacillales            2.87        3.65        3.51       12.70
        Actinomycetales       3.59        0.17        4.43       11.90
        Clostridiales        20.98       21.10       18.77       10.32
        Pseudomonadales       1.08        1.94        1.48        5.28
        Enterobacteriales     0.94        2.32        1.36        4.98
        Lactobacillales       1.60        2.80        1.86        4.79
        Bacteroidales         7.31        5.14        5.55        3.82
        Flavobacteriales      2.22        2.24        2.24        3.56
        Unclassified          8.96        6.53       23.69        3.46
        Burkholderiales       1.99        3.11        2.39        2.97
Family  Enterobacteriaceae    0.94        2.32        1.36        4.97
        Bacillaceae           1.27        1.75        1.23        4.68
        Flavobacteriaceae     1.48        1.86        1.79        3.43
        Pseudomonadaceae      0.62        1.24        0.97        3.36
        Lachnospiraceae       5.78       11.06        7.47        3.30
        Ruminococcaceae       1.27        5.36        5.55        2.58
        Rhodobacteraceae      1.35        1.78        1.49        2.34
        Lactobacillaceae      0.48        0.74        0.50        2.10
        Streptomycetaceae     0.33        0.69        0.41        2.00
        Moraxellaceae         0.46        0.69        0.50        1.90
Genus   Bacillus              0.74        1.28        1.23        5.55
        Pseudomonas           0.49        1.20        0.70        3.27
        Ruminococcus          0.67        0.36        0.30        2.52
        Lactobacillus         0.46        0.72        0.46        1.92
        Prevotella            1.07        0.42        0.98        1.91
        Streptomyces          0.26        0.64        0.46        1.87
        Faecalibacterium      0.28        0.49        0.54        1.83
        Paenibacillus         0.38        0.47        0.32        1.51
        Clostridium           0.80        0.44        0.34        1.43
        Acinetobacter         0.24        0.46        0.23        1.30
Supplementary Table III

Number of taxa contained in a union of the greengenes, silva and RDP databases (i.e. ALL) which were not contained in the newly constructed 16S-UDb

Total: total number of taxa contained in ALL but not in 16S-UDb. Remaining columns: distribution of these taxa by the original databases which contained them.

Level   Total  G     S     R     G⋂S   G⋂R   S⋂R   G⋂S⋂R
Phylum  97     48    24    6     7     3     9     0
Class   345    159   141   21    22    0     1     1
Order   991    252   667   46    24    0     1     1
Family  1176   226   914   18    15    0     2     1
Genus   2341   334   1341  425   21    42    115   63

G, Greengenes; S, silva; R, RDP; ⋂, intersection

Contribution of each input database to the 16S-UDb database: The Greengenes database contributed 3387 sequences belonging to 23 phyla, 45 classes, 79 orders, 150 families, 306 genera and 533 species to the 16S-UDb database. Among these sequences, Firmicutes was the predominant phylum with 43.52 per cent of all the sequences contributed, followed by Proteobacteria (26.28%), Bacteroidetes (13.14%), Actinobacteria (9.86%), Crenarchaeota (1.27%), Spirochaetes (1.18%), Euryarchaeota (1.09%) and Verrucomicrobia (0.83%). RDP contributed 6254 sequences from 30 phyla, 56 classes, 118 orders, 280 families, 1316 genera and 2880 species. Among these sequences, Proteobacteria (45.68%) was the most dominant phylum, followed by Firmicutes (28.06%), Actinobacteria (11.24%), Bacteroidetes (4.88%), Cyanobacteria (4.05%), Tenericutes (1.92%), Spirochaetes (0.80%) and Planctomycetes (0.56%). The SILVA database contributed 3437 sequences from 29 phyla, 58 classes, 137 orders, 293 families, 1576 genera and 2264 species. Of these, Proteobacteria (36.49%) was the most predominant, followed by Firmicutes (19.84%), Actinobacteria (18.21%), Bacteroidetes (17.89%), Tenericutes (1.45%), Deinococcus-Thermus (0.93%), Verrucomicrobia (0.61%) and Chloroflexi (0.52%).
Construction of gold standard test dataset: The NCBI Bacterial 16S Ribosomal RNA dataset contained a total of 18,775 eligible (1200 nucleotides or longer in length) 16S rRNA sequences, of which 2132 were redundant and 5318 lacked taxonomic classification to species level. From the remaining 11,325 full-length 16S rRNA sequences, 13 HVR/HVR-pair test datasets (each with 11,325 members) were constructed.
The 16S rRNA test dataset comprised sequences from only the bacterial kingdom, related to 36 phyla, 86 classes, 162 orders, 383 families, 2453 genera and 4881 unique species-like groups. Proteobacteria was the dominant phylum with 38.89 per cent of all sequences (n=11,325), followed by Actinobacteria (23.69%), Firmicutes (19.51%), Bacteroidetes (10.76%), Tenericutes (1.78%), Spirochaetes (0.91%), Cyanobacteria (0.54%), Verrucomicrobia (0.44%) and 28 other minor phyla. The top 10 phylum, class, order, family and genus groups in this test dataset are shown in Supplementary Table IV.
Supplementary Table IV

Top 10 phyla, classes, orders, families and genera contained in the test database

Phylum               %      Class                %      Order              %      Family              %      Genus          %
Proteobacteria       38.89  Actinobacteria       23.54  Bacillales         8.53   Streptomycetaceae   5.90   Streptomyces   5.61
Actinobacteria       23.69  Gammaproteobacteria  16.63  Streptomycetales   5.89   Flavobacteriaceae   5.06   Bacillus       1.92
Firmicutes           19.51  Alphaproteobacteria  13.25  Micrococcales      5.59   Bacillaceae         4.19   Lactobacillus  1.56
Bacteroidetes        10.76  Bacilli              12.59  Flavobacteriales   5.20   Rhodobacteraceae    3.13   Clostridium    1.52
Tenericutes          1.78   Clostridia           6.00   Clostridiales      5.05   Enterobacteriaceae  2.32   Paenibacillus  1.47
Spirochaetes         0.91   Betaproteobacteria   5.55   Rhizobiales        4.78   Microbacteriaceae   2.23   Pseudomonas    1.46
Deinococcus-Thermus  0.77   Flavobacteriia       5.20   Corynebacteriales  4.14   Paenibacillaceae    2.0    Mycobacterium  1.45
Cyanobacteria        0.54   Deltaproteobacteria  2.48   Lactobacillales    4.06   Pseudonocardiaceae  1.99   Flavobacterium 1.08
Verrucomicrobia      0.44   Sphingobacteriia     2.00   Burkholderiales    3.88   Clostridiaceae      1.98   Mycoplasma     1.06
Thermotogae          0.41   Cytophagia           1.94   Rhodobacterales    3.41   Lactobacillaceae    1.68   Vibrio         0.92
Identification of ‘optimal pipeline’ using the test dataset: The default reference database for QIIME v1 is a subset of Greengenes rRNA sequences pre-clustered at 97 per cent identity16. Several meta-analyses and case-control analyses of the human microbiome have used this 97 per cent identity subset of Greengenes17,18 or SILVA19,20,21 as the reference database. Hence, we used the 16S-UDb clustered at 97 per cent threshold for our comparisons. Table I shows the performance of various combinations of four taxonomy assignment methods and four 16S rRNA databases (including the 16S-UDb clustered at 97 per cent threshold) in correctly classifying the 13 HVR/HVR-pair test datasets at the family, genus and species levels. For UCLUST and SortMeRNA, the SILVA database performed the best, whereas for rdp-nbc and mothur-nbc, 16S-UDb performed the best; this could indicate that different methods work better with databases of different sizes. However, overall, for each of the 13 datasets, the combination of the unified database (16S-UDb) and the mothur-nbc classifier provided the highest correct classification rate at each taxonomic level.
Table I

Accuracy (%) of various combinations of four taxonomy assignment algorithms and four 16S rRNA sequence databases using different hypervariable regions

HVR    Database Family                                  Genus                                   Species
                UCLUST    SortMeRNA RDP       mothur    UCLUST    SortMeRNA RDP       mothur    UCLUST    SortMeRNA RDP       mothur
V2     GG       84.37     82.74     83.71     85.82     57.62     55.34     63.41     69.97      2.85      2.61      9.11     10.08
V2     SILVA    87.43     85.62     86.30     88.25     77.16     73.03     75.38     82.08      2.10      1.80     12.05     14.86
V2     RDPdb    79.35     77.51     81.90     83.21     71.67     66.14     77.16     84.27      3.24      3.96     21.23     24.20
V2     16S-UDb  77.25     81.13     89.06     91.30     60.62     65.64     83.05     88.50      8.71     11.48     42.84     45.61
V3     GG       80.85     79.43     70.35     76.31     53.65     49.55     46.64     57.72      2.11      1.73      6.32      8.00
V3     SILVA    84.97     83.16     73.43     83.33     70.55     66.08     56.19     67.96      1.46      0.85      7.02     10.35
V3     RDPdb    76.15     76.11     71.45     77.86     64.18     60.86     60.06     73.87      1.63      1.37     12.94     17.95
V3     16S-UDb  76.11     77.29     76.31     85.27     54.68     55.76     64.69     76.85      5.63      5.77     30.48     37.49
V4     GG       85.50     85.17     84.47     86.09     59.37     56.60     60.65     66.97      2.24      1.93      6.91      8.39
V4     SILVA    89.07     87.71     87.74     89.24     78.18     74.17     71.72     79.43      1.47      1.17      7.31     10.38
V4     RDPdb    82.18     78.75     82.16     83.47     74.27     67.55     71.73     82.09      1.70      1.70     13.40     17.47
V4     16S-UDb  81.45     80.01     87.67     90.91     59.81     60.49     78.31     85.62      6.30      6.11     34.30     39.83
V5     GG       82.05     81.25     74.17     81.55     51.00     49.18     45.06     57.05      1.34      1.05      3.61      5.84
V5     SILVA    86.00     83.83     77.36     85.05     65.52     60.72     48.69     65.86      0.84      0.50      3.07      0.55
V5     RDPdb    77.10     73.97     72.50     79.48     59.87     54.85     51.17     68.88      1.02      0.70      6.70     11.88
V5     16S-UDb  76.99     79.27     77.62     87.11     52.21     54.27     57.53     73.33      3.53      3.58     18.24     27.57
V6     GG       77.24     76.83     64.85     76.75     47.96     47.34     43.94     57.77      1.72      1.92      6.15      8.40
V6     SILVA    81.59     82.19     71.28     81.47     64.58     63.81     51.80     67.58      1.33      1.62      7.10     11.00
V6     RDPdb    73.65     72.68     67.74     76.30     59.74     55.49     52.27     68.06      1.95      2.75     14.05     19.24
V6     16S-UDb  70.18     78.91     74.76     84.53     50.25     58.63     62.88     75.91      6.63     12.10     27.44     35.06
V7     GG       76.78     77.28     57.45     73.27     47.39     44.68     32.79     48.15      1.17      0.88      2.68      4.78
V7     SILVA    79.54     79.31     62.75     76.68     59.51     56.47     32.95     52.13      0.68      0.26      2.19      5.23
V7     RDPdb    70.08     69.53     60.34     72.39     53.33     51.21     39.13     57.91      0.79      0.52      5.74     10.63
V7     16S-UDb  71.89     74.10     65.15     79.64     46.86     49.10     46.47     63.33      3.41      2.47     14.23     23.45
V8     GG       83.68     80.79     83.22     85.51     56.20     48.11     57.95     65.47      1.97      1.24      6.22      8.07
V8     SILVA    86.70     83.21     85.33     87.90     73.84     62.23     67.47     75.73      1.23      0.48      6.88      9.76
V8     RDPdb    79.05     74.81     80.65     82.72     68.01     57.63     70.20     79.93      1.37      1.29     14.08     18.28
V8     16S-UDb  80.79     80.50     83.99     90.66     56.33     59.08     76.52     83.97      4.53      3.90     34.15     39.50
V2+V3  GG       85.86     83.59     85.81     86.55     61.11     55.79     66.78     70.88      2.85      2.52      9.40     10.23
V2+V3  SILVA    88.73     86.58     88.28     89.25     80.14     73.94     79.43     83.84      2.01      1.60     12.64     15.02
V2+V3  RDPdb    81.76     78.06     83.04     83.75     75.84     67.73     82.47     87.20      2.79      3.13     21.78     24.28
V2+V3  16S-UDb  80.18     80.72     90.43     91.83     63.13     64.82     86.87     90.05      7.85      9.25     45.08     46.82
V3+V4  GG       85.77     85.01     86.00     86.81     60.64     55.96     64.73     69.17      2.48      2.19      8.13      9.21
V3+V4  SILVA    89.29     88.33     88.95     89.71     80.64     75.43     77.47     83.06      1.57      1.21      9.09     11.88
V3+V4  RDPdb    82.94     80.89     83.38     83.88     77.17     68.16     80.91     86.58      1.94      1.66     16.42     20.29
V3+V4  16S-UDb  83.12     80.11     90.19     91.89     62.75     62.12     85.33     89.47      6.13      5.80     41.61     44.81
V4+V5  GG       86.15     85.92     86.04     86.81     61.26     62.13     64.04     68.54      2.30      6.23      7.46      8.74
V4+V5  SILVA    89.47     86.31     89.06     89.64     80.09     72.01     75.86     81.41      1.39      1.11      8.23     10.97
V4+V5  RDPdb    82.97     77.82     83.24     83.81     77.64     65.94     78.08     85.03      1.82      1.92     14.74     18.68
V4+V5  16S-UDb  82.99     78.14     89.70     91.65     61.85     61.02     83.64     88.54      5.90      6.02     38.58     42.81
V5+V6  GG       74.60     72.52     84.77     86.60     50.67     43.00     60.95     68.06      1.76      1.04      7.25      9.19
V5+V6  SILVA    88.57     84.28     87.80     89.27     76.97     63.95     73.63     81.20      1.16      0.61      8.64      9.72
V5+V6  RDPdb    81.26     70.86     82.20     83.47     72.87     56.40     73.69     83.28      1.62      1.49     15.88     20.64
V5+V6  16S-UDb  81.17     67.88     88.52     91.31     59.13     54.15     80.25     87.06      5.36      4.75     37.92     42.77
V6+V7  GG       83.53     79.67     81.62     85.34     55.22     45.43     56.76     66.35      1.83      1.09      7.09      9.03
V6+V7  SILVA    87.11     80.49     85.32     88.22     73.45     59.27     68.53     78.41      1.31      0.48      8.89     12.19
V6+V7  RDPdb    78.78     70.75     80.43     82.48     67.99     53.17     69.76     80.59      1.65      1.90     16.43     21.01
V6+V7  16S-UDb  79.20     74.33     86.35     90.67     56.96     54.06     77.71     85.16      5.24      4.88     37.17     42.09
V7+V8  GG       84.13     80.01     83.31     85.44     55.85     46.07     58.04     65.57      1.92      0.92      6.21      8.07
V7+V8  SILVA    87.31     81.93     85.33     87.95     74.80     57.41     67.01     75.64      1.11      0.42      6.75      9.62
V7+V8  RDPdb    78.87     72.33     80.74     82.69     68.16     54.03     70.08     79.86      1.34      1.67     14.05     18.19
V7+V8  16S-UDb  75.43     80.49     86.75     90.61     56.74     59.14     76.42     83.94      4.49      4.45     34.00     39.37

Values in bold represents the highest values among those obtained using different databases and classifiers for the same HVR and up to the same taxonomic level. GG, greengenes, RDP, Ribosomal Database Project, 16S-UDb, the new 16S unified database; HVR, hypervariable regions
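The accuracy figures in Table I are proportions of correctly classified test sequences over the 208 classifier-database-region runs. The sketch below enumerates those runs and computes the percentage for one run. It is illustrative only: the names are hypothetical, and it assumes that a read counts as correct at a rank when the assigned name at that rank equals the reference name, with unassigned reads counted as incorrect.

```python
from itertools import product

CLASSIFIERS = ["UCLUST", "SortMeRNA", "rdp-nbc", "mothur-nbc"]
DATABASES = ["Greengenes", "SILVA", "RDP", "16S-UDb"]
SINGLE_HVRS = [f"V{i}" for i in range(2, 9)]           # V2 .. V8
HVR_PAIRS = [f"V{i}+V{i + 1}" for i in range(2, 8)]    # V2+V3 .. V7+V8
RUNS = list(product(CLASSIFIERS, DATABASES, SINGLE_HVRS + HVR_PAIRS))

RANKS = ("phylum", "class", "order", "family", "genus", "species")


def accuracy(assigned, truth, rank):
    """Percentage of test sequences correctly classified at the given rank.

    assigned and truth map a sequence id to a lineage tuple ordered as RANKS;
    a missing or truncated assignment is counted as incorrect (assumption).
    """
    idx = RANKS.index(rank)
    correct = 0
    for seq_id, true_lineage in truth.items():
        lineage = assigned.get(seq_id, ())
        if len(lineage) > idx and lineage[idx] == true_lineage[idx]:
            correct += 1
    return 100.0 * correct / len(truth)
```

Each entry of RUNS would correspond to one invocation of the taxonomy assignment script, followed by one accuracy calculation per rank.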

Using this database-classifier combination (16S-UDb and mothur-nbc), we compared the performance of the 13 test datasets. In this analysis, the V2+V3 region performed the best for classification at the genus and species levels, and the V3+V4 region did the best for classification at the family level. The HVR providing the best discrimination depends on the composition of the sample (bacterial mixture)22,23. The strength of our analysis was that we used various tools on both an idealized and a real-life dataset and obtained good performance with this database-classifier pair15.
Assessment of performance of the ‘optimal pipeline’ using a real-life dataset: In data from our previous study17, the original analysis using the Greengenes database and UCLUST classifier (the GG-UCLUST approach) was able to assign genus and species to 17,439,681 (85.0%) and 8,381,405 (40.9%), respectively, of all reads (numbering 20,508,594). By contrast, the 16S-UDb-mothur-nbc approach was able to assign genus and species to larger proportions [17,886,699 (87.2%) and 12,450,839 (60.7%), respectively] of reads in this dataset. The 16S-UDb-mothur-nbc pipeline identified the presence of a larger number of families, genera and species in this dataset (50, 94 and 175, respectively; Supplementary Tables V and VI) than the original GG-UCLUST pipeline (47, 69 and 34, respectively). In the results from these two pipelines, 35 families, 59 genera and 32 species were common.
In the species-level results from the two approaches, 145 species showed discordance, including two (Staphylococcus aureus and Lactobacillus zeae) that were detected only by the GG-UCLUST approach and 143 that were detected only by the 16S-UDb-mothur-nbc approach. These data showed that the 16S-UDb-mothur-nbc approach was more sensitive in identifying the presence of different bacterial species in 16S rRNA sequence datasets, such as those derived from bacterial mixtures.
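The discordance figures follow directly from the counts reported for the two pipelines. A small worked check, using a hypothetical helper name and the species-level numbers from the text (34 species by GG-UCLUST, 175 by 16S-UDb-mothur-nbc, 32 in common):

```python
def discordance(n_first, n_second, n_common):
    """Return (only in first, only in second, total discordant) from set sizes."""
    only_first = n_first - n_common
    only_second = n_second - n_common
    return only_first, only_second, only_first + only_second


# Species level: GG-UCLUST = 34, 16S-UDb-mothur-nbc = 175, common = 32.
species_only_gg, species_only_udb, species_discordant = discordance(34, 175, 32)
```

This reproduces the two species detected only by GG-UCLUST, the 143 detected only by 16S-UDb-mothur-nbc and the 145 discordant species in total.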
Supplementary Table V

Comparison of family-level abundances (%) between healthy subjects and patients with enthesitis-related arthritis (ERA) using 16S-UDb- MOTHUR approach

 Taxonomic groupHealthy personsPatients with ERA
Median MinimumMaximumMedian MinimumMaximum
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Prevotellaceae 69.21 1.00 87.35 9.02 0.03 88.11
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae 5.38 0.96 26.40 4.92 0.11 26.39
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae 5.11 0.53 20.07 10.53 0.09 85.35
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Veillonellaceae2.54 0.01 8.09 0.77 0.01 12.32
p__Firmicutes; c__Clostridia; o__Clostridiales; f__UC Clostridiales 0.70 0.07 3.61 0.56 2.44E-03 14.75
p__Firmicutes; c__Erysipelotrichia; o__Erysipelotrichales; f__Erysipelotrichaceae0.66 0.01 10.85 0.72 3.96E-03 7.54
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Eubacteriaceae 0.59 0.04 7.57 0.74 0.01 7.50
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae 0.57 0.05 1.64 0.23 2.09E-03 16.51
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Veillonellaceae0.33 4.88E-03 2.91 0.15 2.78E-03 3.00
p__Firmicutes; c__UC Firmicutes; o__UC Firmicutes; f__UC Firmicutes 0.26 1.06E-03 1.67 0.26 2.32E-03 5.82
p__Actinobacteria; c__Actinobacteria; o__Bifidobacteriales; f__Bifidobacteriaceae 0.26 0.01 11.97 1.29 0.01 34.53
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Peptostreptococcaceae 0.22 0.01 30.80 0.14 2.09E-03 4.76
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Porphyromonadaceae 0.22 1.98E-03 2.22 0.61 3.48E-03 10.29
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae 0.19 0.01 6.64 1.02 0.01 68.83
p__Proteobacteria; c__Gammaproteobacteria; o__Pasteurellales; f__Pasteurellaceae 0.10 9.75E-04 3.05 0.06 1.85E-03 7.87
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae 0.10 0.01 5.67 0.10 2.10E-03 20.98
p__Proteobacteria; c__Betaproteobacteria; o__Burkholderiales; f__Sutterellaceae 0.09 5.28E-04 0.69 0.23 3.17E-04 22.46
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Streptococcaceae 0.07 0.01 2.77 0.17 0.01 9.86
p__Firmicutes; c__Erysipelotrichia; o__Erysipelotrichales; f__Erysipelotrichaceae 0.07 3.53E-03 0.37 0.04 7.87E-04 0.92
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae 0.06 0.01 3.08 3.01 0.05 64.04
p__Proteobacteria; c__Gammaproteobacteria; o__Aeromonadales; f__Succinivibrionaceae 0.06 0.00 2.46 0.01 0.00 10.62
p__Actinobacteria; c__Coriobacteriia; o__Coriobacteriales; f__Coriobacteriaceae 0.05 7.93E-04 16.20 0.17 3.00E-04 6.71
p__Firmicutes; c__Clostridia; o__Clostridiales; f__UC Clostridiales 0.04 3.17E-03 0.66 0.08 3.48E-04 2.71
p__Bacteroidetes; c__UC Bacteroidetes; o__UC Bacteroidetes; f__UC Bacteroidetes 0.04 0.00 8.04 0.00 0.00 13.74
p__Firmicutes; c__Clostridia; o__UC Clostridia; f__UC Clostridia 0.03 2.64E-04 1.77 0.02 0.00 5.20
p__Proteobacteria; c__Betaproteobacteria; o__Burkholderiales; f__Comamonadaceae 0.03 4.72E-04 0.22 4.52E-03 0.00 7.36
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Acidaminococcaceae 0.02 0.00 0.46 1.58E-03 0.00 0.85
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Rikenellaceae 0.01 0.00 0.35 0.10 0.00 6.78
p__Actinobacteria; c__UC Actinobacteria; o__UC Actinobacteria; f__UC Actinobacteria 0.01 0.00 4.21 0.01 0.00 2.35
p__Actinobacteria; c__Actinobacteria; o__Coriobacteriales; f__Coriobacteriaceae 0.01 0.00 0.89 0.01 0.00 0.59
p__Coriobacteriia; c__UC Coriobacteriia; o__Eggerthellales; f__Eggerthellaceae 0.01 0.00 0.47 0.01 0.00 0.49
p__Proteobacteria; c__Deltaproteobacteria; o__Desulfovibrionales; f__Desulfovibrionaceae 4.86E-03 0.00 1.60 2.72E-04 0.00 0.28
p__Firmicutes; c__Clostridia; o__UC Clostridia; f__UC Clostridia 4.23E-03 0.00 0.11 0.00 0.00 0.16
p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Micrococcaceae 1.31E-03 0.00 0.09 2.98E-03 0.00 0.17
p__Proteobacteria; c__UC Proteobacteria; o__UC Proteobacteria; f__UC Proteobacteria 1.28E-03 0.00 0.44 3.51E-03 0.00 8.33
p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Actinomycetaceae 1.07E-03 0.00 0.03 3.90E-03 0.00 1.15
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae 1 9.28E-04 0.00 0.08 3.00E-04 0.00 0.04
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Carnobacteriaceae 8.53E-04 0.00 0.01 2.08E-03 0.00 0.29
p__Tenericutes; c__Mollicutes; o__UC Mollicutes; f__UC Mollicutes 7.49E-04 0.00 8.69 0.01 0.00 12.06
p__Proteobacteria; c__Epsilonproteobacteria; o__Campylobacterales; f__Campylobacteraceae 6.30E-04 0.00 0.43 9.51E-04 0.00 1.66
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Leuconostocaceae 6.26E-04 0.00 0.09 8.87E-04 0.00 1.63
p__UC Bacteria; c__UC Bacteria; o__UC Bacteria; f__UC Bacteria 3.53E-04 0.00 0.03 2.14E-03 0.00 0.09
p__Firmicutes; c__Bacilli; o__Bacillales; f__Bacillales_Incertae Sedis XI 3.53E-04 0.00 0.01 3.00E-04 0.00 0.17
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Enterococcaceae 2.39E-04 0.00 0.01 0.01 0.00 6.71
p__Proteobacteria; c__Betaproteobacteria; o__Burkholderiales; f__Oxalobacteraceae 2.10E-04 0.00 0.02 0.00 0.00 0.21
p__Proteobacteria; c__Deltaproteobacteria; o__UC Deltaproteobacteria; f__UC Deltaproteobacteria 0.00 0.00 0.07 8.16E-04 0.00 1.43
p__Proteobacteria; c__Alphaproteobacteria; o__UC Alphaproteobacteria; f__UC Alphaproteobacteria 0.00 0.00 2.16 5.09E-04 0.00 3.00E-03
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Aerococcaceae 0.00 0.00 2.12E-03 2.83E-04 0.00 0.12
p__Tenericutes; c__Mollicutes; o__Acholeplasmatales; f__Acholeplasmataceae 0.00 0.00 0.06 2.54E-04 0.00 0.17
p__Firmicutes; c__Bacilli; o__UC Bacilli; f__UC Bacilli 0.00 0.00 0.02 1.93E-04 0.00 0.05
p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Moraxellaceae 0.00 0.00 0.01 1.48E-04 0.00 0.57
p__Proteobacteria; c__Betaproteobacteria; o__Neisseriales; f__Neisseriaceae 0.00 0.00 2.60E-03 1.48E-04 0.00 0.03
p__Tenericutes; c__Mollicutes; o__Anaeroplasmatales; f__Anaeroplasmataceae 0.00 0.00 0.06 0.00 0.00 0.10
p__Coriobacteriia; c__UC Coriobacteriia; o__Coriobacteriales; f__Atopobacteriaceae 0.00 0.00 0.06 0.00 0.00 1.57
p__Spirochaetes; c__Spirochaetia; o__Spirochaetales; f__Brachyspiraceae 0.00 0.00 0.21 0.00 0.00 3.48E-04
p__Coriobacteriia; c__UC Coriobacteriia; o__Coriobacteriales; f__Coriobacteriaceae 0.00 0.00 0.52 0.00 0.00 0.16
p__Elusimicrobia; c__Elusimicrobia; o__Elusimicrobiales; f__Elusimicrobiaceae 0.00 0.00 0.17 0.00 0.00 0.01
p__Fusobacteria; c__Fusobacteriia; o__Fusobacteriales; f__Fusobacteriaceae 0.00 0.00 6.69E-04 0.00 0.00 1.44
p__Proteobacteria; c__Epsilonproteobacteria; o__Campylobacterales; f__Helicobacteraceae 0.00 0.00 0.00 0.00 0.00 0.77
p__Tenericutes; c__Mollicutes; o__Mycoplasmatales; f__Mycoplasmataceae 0.00 0.00 0.26 0.00 0.00 0.28
p__Firmicutes; c__Bacilli; o__Bacillales; f__Planococcaceae 0.00 0.00 3.27E-04 0.00 0.00 0.16
p__Firmicutes; c__Bacilli; o__Bacillales; f__Staphylococcaceae 0.00 0.00 0.00 0.00 0.00 0.05
p__Coriobacteriia; c__UC Coriobacteriia; o__UC Coriobacteriia; f__UC Coriobacteriia 0.00 0.00 4.72E-03 0.00 0.00 0.08
p__Verrucomicrobia; c__Verrucomicrobiae; o__Verrucomicrobiales; f__Verrucomicrobiaceae 0.00 0.00 0.01 0.00 0.00 0.69
p__Proteobacteria; c__Betaproteobacteria; o__Burkholderiales; f__UC Burkholderiales 0.00 0.00 0.33 0.00 0.00 0.11
Supplementary Table VI

Comparison of genus-level abundances (%) between healthy subjects and patients with enthesitis-related arthritis (ERA) using 16S-UDb-MOTHUR approach

Columns: Taxonomic group; Healthy persons (Median, Minimum, Maximum); Patients with ERA (Median, Minimum, Maximum)
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Prevotellaceae; g__Prevotella copri 58.20 0.78 77.53 8.87 0.02 67.81
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Prevotellaceae; g__Prevotella stercorea 3.83 0.03 15.63 0.08 1.16E-03 18.81
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__Faecalibacterium prausnitzii 3.33 0.51 9.19 4.39 0.07 79.55
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae; g__UC Lachnospiraceae 1.97 0.14 5.37 1.28 0.03 6.25
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Prevotellaceae; g__Alloprevotella rava 1.43 0.01 5.73 0.01 0.00 8.27
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Prevotellaceae; g__Prevotella 1.10 4.72E-03 6.27 0.03 6.54E-04 7.13
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae; g__Ruminococcus gnavus 1.02 0.21 5.61 1.82 0.03 16.98
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Prevotellaceae; g__UC Prevotellaceae 0.95 3.27E-04 3.62 4.53E-03 0.00 2.36
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Veillonellaceae; g__Dialister succinatiphilus 0.91 0.01 7.87 0.22 1.62E-03 7.34
p__Firmicutes; c__Clostridia; o__Clostridiales; f__UC Clostridiales; g__UC Clostridiales 0.70 0.07 3.61 0.56 2.44E-03 14.75
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae; g__Roseburia faecis 0.70 0.01 3.23 0.39 0.02 10.65
p__Firmicutes; c__Erysipelotrichia; o__Erysipelotrichales; f__Erysipelotrichaceae; g__UC Erysipelotrichaceae 0.52 1.89E-03 8.76 0.07 3.60E-04 5.05
p__Firmicutes; c__UC Firmicutes; o__UC Firmicutes; f__UC Firmicutes; g__UC Firmicutes 0.26 1.06E-03 1.67 0.26 2.32E-03 5.82
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Eubacteriaceae; g__Eubacterium eligens 0.25 0.01 1.13 0.28 2.46E-04 4.11
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae; g__Clostridium cadaveris 0.25 3.30E-03 1.57 0.04 1.09E-03 15.51
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Veillonellaceae; g__Mitsuokella multacida 0.22 6.02E-04 1.05 1.74E-03 0.00 1.06
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Peptostreptococcaceae; g__Clostridium bifermentans 0.21 0.01 28.80 0.14 2.09E-03 4.68
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae; g__Lachnospira pectinoschiza 0.18 2.44E-03 1.49 0.03 5.18E-04 2.18
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__Oscillospira guilliermondii 0.17 3.70E-03 2.00 0.19 0.00 2.03
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__UC Ruminococcaceae 0.17 2.91E-03 5.07 0.06 1.11E-03 4.01
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__Subdoligranulum variabile 0.14 0.02 3.51 0.67 3.14E-03 12.37
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae; g__Butyrivibrio fibrisolvens 0.13 4.26E-03 0.64 0.09 1.86E-03 2.15
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__Intestinimonas butyriciproducens 0.13 0.01 4.54 0.18 1.24E-03 20.38
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Escherichia coli 0.12 0.01 6.63 0.81 0.01 68.35
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae; g__Coprococcus catus 0.12 0.03 8.42 0.19 2.66E-03 1.93
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__Ruminococcus bromii 0.12 5.28E-04 5.94 0.06 2.21E-04 6.09
p__Firmicutes; c__Erysipelotrichia; o__Erysipelotrichales; f__Erysipelotrichaceae; g__Eubacterium biforme 0.12 3.96E-04 3.91 0.08 0.00 5.59
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Porphyromonadaceae; g__UC Porphyromonadaceae 0.11 0.00 1.47 1.91E-03 0.00 6.02
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae; g__Blautia obeum 0.10 0.01 2.46 0.23 2.51E-03 1.16
p__Proteobacteria; c__Gammaproteobacteria; o__Pasteurellales; f__Pasteurellaceae; g__Haemophilus parainfluenzae 0.10 9.75E-04 2.99 0.06 1.85E-03 7.76
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Veillonellaceae; g__Megasphaera indica 0.10 1.66E-03 1.84 2.59E-03 0.00 1.96
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus ruminis 0.10 4.88E-04 5.67 0.02 0.00 2.39
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae; g__Coprococcus eutactus 0.08 1.32E-03 1.30 3.33E-03 0.00 3.90
p__Actinobacteria; c__Actinobacteria; o__Bifidobacteriales; f__Bifidobacteriaceae; g__Bifidobacterium kashiwanohense 0.06 1.88E-03 2.67 0.32 2.10E-03 10.24
p__Proteobacteria; c__Gammaproteobacteria; o__Aeromonadales; f__Succinivibrionaceae; g__Succinivibrio dextrinosolvens 0.06 0.00 2.46 0.01 0.00 10.62
p__Actinobacteria; c__Coriobacteriia; o__Coriobacteriales; f__Coriobacteriaceae; g__Collinsella aerofaciens 0.05 7.93E-04 16.18 0.16 0.00 6.71
p__Actinobacteria; c__Actinobacteria; o__Bifidobacteriales; f__Bifidobacteriaceae; g__Bifidobacterium adolescentis 0.05 1.84E-04 8.38 0.05 0.00 8.32
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Eubacteriaceae; g__Eubacterium coprostanoligenes 0.05 5.28E-04 6.56 0.09 4.64E-04 2.65
p__Firmicutes; c__Clostridia; o__Clostridiales; f__UC Clostridiales; g__Flavonifractor plautii 0.04 3.17E-03 0.66 0.08 3.48E-04 2.71
p__Bacteroidetes; c__UC Bacteroidetes; o__UC Bacteroidetes; f__UC Bacteroidetes; g__UC Bacteroidetes 0.04 0.00 8.04 3.00E-03 0.00 13.74
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae; g__Dorea formicigenerans 0.04 0.01 2.73 0.06 0.00 1.62
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Eubacteriaceae; g__Eubacterium desmolans 0.03 0.01 1.15 0.08 6.00E-04 2.21
p__Firmicutes; c__Erysipelotrichia; o__Erysipelotrichales; f__Erysipelotrichaceae; g__Turicibacter sanguinis 0.03 9.75E-04 0.37 0.01 0.00 0.88
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae; g__Blautia producta 0.03 2.12E-03 0.45 0.03 3.90E-04 0.21
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Streptococcaceae; g__Streptococcus 0.03 2.12E-03 2.18 0.08 2.75E-03 4.27
p__Firmicutes; c__Clostridia; o__UC Clostridia; f__UC Clostridia; g__UC Clostridia 0.03 2.64E-04 1.77 0.02 0.00 5.20
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__Ruminococcus callidus 0.03 0.00 0.22 1.80E-03 0.00 0.51
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Veillonellaceae; g__Veillonella dispar 0.03 9.75E-04 2.50 0.08 1.60E-03 3.00
p__Actinobacteria; c__Actinobacteria; o__Bifidobacteriales; f__Bifidobacteriaceae; g__Bifidobacterium longum 0.02 1.84E-03 3.12 0.37 1.20E-03 34.32
p__Proteobacteria; c__Betaproteobacteria; o__Burkholderiales; f__Sutterellaceae; g__Sutterella wadsworthensis 0.02 0.00 0.69 0.13 2.84E-04 1.97
p__Firmicutes; c__Erysipelotrichia; o__Erysipelotrichales; f__Erysipelotrichaceae; g__Catenibacterium mitsuokai 0.02 0.00 2.10 2.72E-04 0.00 2.21
p__Proteobacteria; c__Betaproteobacteria; o__Burkholderiales; f__Comamonadaceae; g__UC Comamonadaceae 0.02 4.72E-04 0.22 4.28E-03 0.00 7.36
p__Firmicutes; c__Erysipelotrichia; o__Erysipelotrichales; f__Erysipelotrichaceae; g__Bulleidia p-1630-c5 0.02 0.00 0.22 2.54E-04 0.00 2.59
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides 0.01 2.51E-03 0.19 0.35 0.01 5.95
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae; g__Clostridium 0.01 3.80E-03 0.10 0.01 0.00 2.82
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae; g__Clostridium lactatifermentans 0.01 0.00 0.03 0.01 0.00 0.17
p__Actinobacteria; c__Actinobacteria; o__Bifidobacteriales; f__Bifidobacteriaceae; g__Bifidobacterium angulatum 0.01 0.00 0.12 3.41E-03 0.00 5.22
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides dorei 0.01 1.41E-03 1.46 0.10 1.55E-03 20.71
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__Oscillibacter valericigenes 0.01 0.00 0.87 0.01 0.00 2.61
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides uniformis 0.01 0.00 0.52 0.10 7.87E-04 2.75
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Porphyromonadaceae; g__Parabacteroides merdae 0.01 0.00 0.20 0.01 0.00 1.70
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae; g__UC Clostridiaceae 0.01 0.00 0.53 5.73E-04 0.00 0.78
p__Actinobacteria; c__UC Actinobacteria; o__UC Actinobacteria; f__UC Actinobacteria; g__UC Actinobacteria 0.01 0.00 4.21 0.01 0.00 2.35
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides ovatus 0.01 0.00 0.19 0.13 0.00 48.14
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae; g__Butyrivibrio crossotus 0.01 0.00 1.91 2.73E-03 0.00 8.38
p__Coriobacteriia; c__UC Coriobacteriia; o__Eggerthellales; f__Eggerthellaceae; g__Senegalimassilia anaerobia 0.01 0.00 0.46 4.99E-03 0.00 0.49
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__Ruminococcus bicirculans 0.01 0.00 1.93 2.21E-03 0.00 11.24
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Peptostreptococcaceae; g__Peptoclostridium difficile 0.01 0.00 2.00 3.55E-03 0.00 0.11
p__Actinobacteria; c__Actinobacteria; o__Coriobacteriales; f__Coriobacteriaceae; g__Slackia isoflavoniconvertens 0.01 0.00 0.47 3.44E-03 0.00 0.35
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides acidifaciens 0.01 0.00 0.04 0.08 0.00 6.30
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Streptococcaceae; g__Streptococcus infantis 4.76E-03 1.20E-03 0.17 0.02 1.37E-03 3.31
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Klebsiella pneumoniae 4.24E-03 1.84E-04 0.04 0.02 7.13E-04 50.63
p__Firmicutes; c__Clostridia; o__UC Clostridia; f__UC Clostridia; g__Howardella ureilytica 4.23E-03 0.00 0.11 0.00 0.00 0.16
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Prevotellaceae; g__Prevotella ruminicola 4.07E-03 0.00 0.43 1.24E-03 0.00 1.65
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Porphyromonadaceae; g__Odoribacter splanchnicus 3.75E-03 0.00 0.15 0.02 0.00 2.39
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Streptococcaceae; g__Streptococcus luteciae 3.62E-03 0.00 0.04 0.02 0.00 4.66
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides caccae 3.27E-03 0.00 0.59 0.03 0.00 15.86
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Porphyromonadaceae; g__Parabacteroides distasonis 3.22E-03 0.00 0.57 0.03 1.66E-04 7.90
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus salivarius 2.81E-03 0.00 0.01 1.73E-04 0.00 0.10
p__Proteobacteria; c__Betaproteobacteria; o__Burkholderiales; f__Sutterellaceae; g__Sutterella stercoricanis 2.27E-03 0.00 0.35 3.44E-03 0.00 22.14
p__Proteobacteria; c__Deltaproteobacteria; o__Desulfovibrionales; f__Desulfovibrionaceae; g__Desulfovibrio piger 2.25E-03 0.00 1.29 0.00 0.00 0.26
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Lachnospiraceae; g__Roseburia 2.18E-03 0.00 0.01 1.21E-03 0.00 0.01
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Rikenellaceae; g__Alistipes finegoldii 2.13E-03 0.00 0.22 0.03 0.00 3.05
p__Proteobacteria; c__Betaproteobacteria; o__Burkholderiales; f__Comamonadaceae; g__Comamonas 1.91E-03 0.00 0.02 0.00 0.00 0.09
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Eubacteriaceae; g__Eubacterium ventriosum 1.89E-03 0.00 0.03 0.01 0.00 0.17
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__Ruminococcus 1.71E-03 0.00 0.09 8.16E-04 0.00 2.59
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae; g__Clostridium butyricum 1.58E-03 0.00 0.16 2.29E-04 0.00 0.66
p__Actinobacteria; c__Actinobacteria; o__Coriobacteriales; f__Coriobacteriaceae; g__UC Coriobacteriaceae 1.51E-03 0.00 0.31 2.80E-04 0.00 0.44
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Enterobacter hormaechei 1.36E-03 0.00 0.05 0.01 0.00 5.08
p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Micrococcaceae; g__Rothia mucilaginosa 1.31E-03 0.00 0.09 2.98E-03 0.00 0.17
p__Proteobacteria; c__UC Proteobacteria; o__UC Proteobacteria; f__UC Proteobacteria; g__UC Proteobacteria 1.28E-03 0.00 0.44 3.51E-03 0.00 8.33
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides fragilis 1.25E-03 0.00 0.07 0.01 0.00 17.90
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Eubacteriaceae; g__Eubacterium 1.24E-03 0.00 0.07 2.87E-04 0.00 0.24
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Veillonellaceae; g__Veillonella parvula 1.12E-03 0.00 0.41 2.12E-03 0.00 0.11
p__Actinobacteria; c__Actinobacteria; o__Actinomycetales; f__Actinomycetaceae; g__Actinomyces odontolyticus 1.07E-03 0.00 0.03 3.90E-03 0.00 1.15
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Veillonellaceae; g__Allisonella histaminiformans 1.04E-03 0.00 0.01 0.00 0.00 0.05
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Cronobacter sakazakii 1.03E-03 0.00 0.05 1.45E-03 0.00 1.53
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Streptococcaceae; g__Streptococcus sanguinis 1.00E-03 0.00 0.40 0.01 0.00 0.15
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Prevotellaceae; g__Prevotella salivae 9.32E-04 0.00 0.98 3.48E-04 0.00 0.36
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae 1; g__Clostridium perfringens 9.28E-04 0.00 0.08 3.00E-04 0.00 0.04
p__Firmicutes; c__Erysipelotrichia; o__Erysipelotrichales; f__Erysipelotrichaceae; g__Clostridium spiroforme 9.16E-04 0.00 0.17 3.50E-03 0.00 0.53
p__Proteobacteria; c__Gammaproteobacteria; o__Pasteurellales; f__Pasteurellaceae; g__Actinobacillus parahaemolyticus 8.89E-04 0.00 0.05 1.73E-04 0.00 0.12
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Carnobacteriaceae; g__Granulicatella adiacens 8.53E-04 0.00 0.01 2.08E-03 0.00 0.29
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Rikenellaceae; g__Alistipes putredinis 7.96E-04 0.00 0.08 0.01 0.00 3.50
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Veillonellaceae; g__Megamonas rupellensis 7.90E-04 0.00 4.67 4.18E-03 2.87E-04 4.76
p__Tenericutes; c__Mollicutes; o__UC Mollicutes; f__UC Mollicutes; g__UC Mollicutes 7.49E-04 0.00 8.69 0.01 0.00 12.06
p__Proteobacteria; c__Deltaproteobacteria; o__Desulfovibrionales; f__Desulfovibrionaceae; g__Bilophila wadsworthia 6.88E-04 0.00 0.31 0.00 0.00 0.28
p__Proteobacteria; c__Gammaproteobacteria; o__Pasteurellales; f__Pasteurellaceae; g__Haemophilus pittmaniae 6.78E-04 0.00 0.01 1.48E-04 0.00 0.10
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Leuconostocaceae; g__Weissella confusa 6.26E-04 0.00 0.05 6.00E-04 0.00 1.61
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae; g__Butyricicoccus pullicaecorum 5.88E-04 0.00 0.26 2.80E-04 0.00 0.38
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Porphyromonadaceae; g__Barnesiella intestinihominis 5.71E-04 0.00 0.21 5.42E-04 0.00 2.25
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Enterobacter cloacae 5.60E-04 0.00 0.12 2.22E-03 0.00 0.82
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae; g__Clostridium propionicum 4.37E-04 0.00 0.01 3.90E-04 0.00 0.98
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__Acetanaerobacterium elongatum 4.13E-04 0.00 0.02 2.71E-04 0.00 0.07
p__Firmicutes; c__Bacilli; o__Bacillales; f__Bacillales_Incertae Sedis XI; g__Gemella palaticanis 3.53E-04 0.00 0.01 3.00E-04 0.00 0.17
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__UC Enterobacteriaceae 2.96E-04 0.00 0.01 1.68E-03 0.00 1.12
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides plebeius 2.59E-04 0.00 0.99 0.01 0.00 52.68
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Acidaminococcaceae; g__Phascolarctobacterium succinatutens 2.44E-04 0.00 0.38 2.29E-04 0.00 0.85
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Enterococcaceae; g__Enterococcus faecium 2.39E-04 0.00 0.01 3.54E-03 0.00 6.68
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Rikenellaceae; g__Alistipes senegalensis 2.35E-04 0.00 0.02 9.34E-05 0.00 0.13
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__Ruminococcus champanellensis 2.24E-04 0.00 0.12 0.00 0.00 2.45
p__Proteobacteria; c__Betaproteobacteria; o__Burkholderiales; f__Oxalobacteraceae; g__Oxalobacter formigenes 2.10E-04 0.00 0.02 0.00 0.00 0.21
p__Actinobacteria; c__Coriobacteriia; o__Coriobacteriales; f__Coriobacteriaceae; g__Atopobium rimae 1.64E-04 0.00 0.04 0.00 0.00 1.14
p__Proteobacteria; c__Epsilonproteobacteria; o__Campylobacterales; f__Campylobacteraceae; g__Campylobacter concisus 1.47E-04 0.00 0.01 3.10E-04 0.00 0.11
p__Proteobacteria; c__Epsilonproteobacteria; o__Campylobacterales; f__Campylobacteraceae; g__Campylobacter 1.18E-04 0.00 0.42 2.54E-04 0.00 1.66
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Eubacteriaceae; g__Eubacterium siraeum 9.20E-05 0.00 0.99 1.33E-03 0.00 1.65
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Rikenellaceae; g__Alistipes obesi 9.20E-05 0.00 0.15 1.89E-04 0.00 0.20
p__Actinobacteria; c__Actinobacteria; o__Coriobacteriales; f__Coriobacteriaceae; g__Slackia exigua 9.20E-05 0.00 0.10 0.00 0.00 0.08
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Hafnia alvei 7.35E-05 0.00 0.04 4.85E-04 0.00 0.31
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus delbrueckii 0.00 0.00 0.02 4.53E-03 0.00 3.38
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae; g__Clostridium leptum 0.00 0.00 0.05 3.57E-03 0.00 0.32
p__UC Bacteria; c__UC Bacteria; o__UC Bacteria; f__UC Bacteria; g__Candidatus Saccharibacteria oral taxon 0.00 0.00 0.03 2.14E-03 0.00 0.06
p__Actinobacteria; c__Actinobacteria; o__Bifidobacteriales; f__Bifidobacteriaceae; g__Bifidobacterium bifidum 0.00 0.00 0.61 2.10E-03 0.00 1.01
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides coprocola 0.00 0.00 0.04 1.78E-03 0.00 9.26
p__Actinobacteria; c__Coriobacteriia; o__Coriobacteriales; f__Coriobacteriaceae; g__Eggerthella lenta 0.00 0.00 0.12 1.04E-03 0.00 0.75
p__Proteobacteria; c__Deltaproteobacteria; o__UC Deltaproteobacteria; f__UC Deltaproteobacteria; g__UC Deltaproteobacteria 0.00 0.00 0.07 8.16E-04 0.00 1.43
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides nordii 0.00 0.00 9.82E-04 6.00E-04 0.00 0.74
p__Proteobacteria; c__Alphaproteobacteria; o__UC Alphaproteobacteria; f__UC Alphaproteobacteria; g__UC Alphaproteobacteria 0.00 0.00 2.16 5.09E-04 0.00 0.00
p__Proteobacteria; c__Betaproteobacteria; o__Burkholderiales; f__Sutterellaceae; g__Parasutterella excrementihominis 0.00 0.00 0.03 3.90E-04 0.00 0.54
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Veillonellaceae; g__Dialister invisus 0.00 0.00 0.04 3.90E-04 0.00 12.13
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus mucosae 0.00 0.00 0.02 3.48E-04 0.00 0.46
p__Firmicutes; c__Erysipelotrichia; o__Erysipelotrichales; f__Erysipelotrichaceae; g__Bulleidia moorei 0.00 0.00 9.75E-04 3.31E-04 0.00 0.10
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Rikenellaceae; g__Alistipes indistinctus 0.00 0.00 0.05 2.87E-04 0.00 3.58
p__Firmicutes; c__Erysipelotrichia; o__Erysipelotrichales; f__Erysipelotrichaceae; g__Clostridium ramosum 0.00 0.00 2.93E-03 2.87E-04 0.00 0.15
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Veillonellaceae; g__Megasphaera micronuciformis 0.00 0.00 0.05 2.84E-04 0.00 0.12
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Aerococcaceae; g__Abiotrophia defectiva 0.00 0.00 2.12E-03 2.83E-04 0.00 0.12
p__Firmicutes; c__Erysipelotrichia; o__Erysipelotrichales; f__Erysipelotrichaceae; g__Holdemania massiliensis 0.00 0.00 0.01 2.71E-04 0.00 0.08
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Veillonellaceae; g__Veillonella criceti 0.00 0.00 7.87E-04 2.71E-04 0.00 2.31
p__Firmicutes; c__Erysipelotrichia; o__Erysipelotrichales; f__Erysipelotrichaceae; g__Eubacterium dolichum 0.00 0.00 0.01 2.54E-04 0.00 0.02
p__Tenericutes; c__Mollicutes; o__Acholeplasmatales; f__Acholeplasmataceae; g__UC Acholeplasmataceae 0.00 0.00 0.06 2.54E-04 0.00 0.17
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Streptococcaceae; g__Streptococcus anginosus 0.00 0.00 6.81E-04 2.46E-04 0.00 0.05
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides eggerthii 0.00 0.00 1.00E-03 2.29E-04 0.00 2.49
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Porphyromonadaceae; g__Butyricimonas virosa 0.00 0.00 0.14 2.19E-04 0.00 0.72
p__Firmicutes; c__Bacilli; o__UC Bacilli; f__UC Bacilli; g__Gemella haemolysans 0.00 0.00 0.02 1.93E-04 0.00 0.05
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Shigella sonnei 0.00 0.00 1.95E-03 1.89E-04 0.00 0.05
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Rikenellaceae; g__Alistipes 0.00 0.00 0.02 1.48E-04 0.00 0.05
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus fermentum 0.00 0.00 2.35E-04 1.48E-04 0.00 0.09
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Streptococcaceae; g__Lactococcus lactis 0.00 0.00 0.26 1.48E-04 0.00 1.56
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Veillonellaceae; g__Megamonas 0.00 0.00 1.20 1.48E-04 0.00 0.94
p__Proteobacteria; c__Betaproteobacteria; o__Neisseriales; f__Neisseriaceae; g__Neisseria meningitidis 0.00 0.00 2.60E-03 1.48E-04 0.00 0.03
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Acidaminococcaceae; g__Acidaminococcus fermentans 0.00 0.00 0.46 0.00 0.00 0.00
p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Moraxellaceae; g__Acinetobacter 0.00 0.00 2.01E-03 0.00 0.00 0.04
p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Moraxellaceae; g__Acinetobacter baumannii 0.00 0.00 3.34E-04 0.00 0.00 0.03
p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Moraxellaceae; g__Acinetobacter johnsonii 0.00 0.00 2.34E-03 0.00 0.00 0.09
p__Proteobacteria; c__Gammaproteobacteria; o__Pseudomonadales; f__Moraxellaceae; g__Acinetobacter lwoffii 0.00 0.00 2.29E-03 0.00 0.00 0.55
p__Verrucomicrobia; c__Verrucomicrobiae; o__Verrucomicrobiales; f__Verrucomicrobiaceae; g__Akkermansia muciniphila 0.00 0.00 0.01 0.00 0.00 0.69
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Rikenellaceae; g__Alistipes massiliensis 0.00 0.00 4.47E-04 0.00 0.00 0.13
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Rikenellaceae; g__Alistipes onderdonkii 0.00 0.00 0.08 0.00 0.00 0.29
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Ruminococcaceae; g__Anaerobacterium chartisolvens 0.00 0.00 0.13 0.00 0.00 0.51
p__Tenericutes; c__Mollicutes; o__Anaeroplasmatales; f__Anaeroplasmataceae; g__Asteroleplasma anaerobium 0.00 0.00 0.06 0.00 0.00 0.10
p__Actinobacteria; c__Actinobacteria; o__Bifidobacteriales; f__Bifidobacteriaceae; g__Bifidobacterium breve 0.00 0.00 1.32E-04 0.00 0.00 0.23
p__Spirochaetes; c__Spirochaetia; o__Spirochaetales; f__Brachyspiraceae; g__Brachyspira corvi 0.00 0.00 0.21 0.00 0.00 0.00
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Porphyromonadaceae; g__Butyricimonas paravirosa 0.00 0.00 1.34E-03 0.00 0.00 0.21
p__UC Bacteria; c__UC Bacteria; o__UC Bacteria; f__UC Bacteria; g__Candidatus Saccharimonas aalborgensis 0.00 0.00 0.01 0.00 0.00 0.03
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae; g__Clostridium neonatale 0.00 0.00 2.25E-03 0.00 0.00 0.18
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae; g__Clostridium perfringens 0.00 0.00 7.05E-04 0.00 0.00 0.17
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae; g__Clostridium sporosphaeroides 0.00 0.00 0.08 0.00 0.00 0.35
p__Firmicutes; c__Clostridia; o__Clostridiales; f__Clostridiaceae; g__Clostridium termitidis 0.00 0.00 0.02 0.00 0.00 0.06
p__Coriobacteriia; c__UC Coriobacteriia; o__Coriobacteriales; f__Coriobacteriaceae; g__Collinsella tanakaei 0.00 0.00 0.52 0.00 0.00 0.16
p__Elusimicrobia; c__Elusimicrobia; o__Elusimicrobiales; f__Elusimicrobiaceae; g__Elusimicrobium minutum 0.00 0.00 0.17 0.00 0.00 0.01
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Enterobacter 0.00 0.00 1.00E-03 0.00 0.00 0.09
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Enterobacter asburiae 0.00 0.00 0.08 0.00 0.00 3.22
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Enterococcaceae; g__Enterococcus faecalis 0.00 0.00 0.01 0.00 0.00 2.92
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Enterococcaceae; g__Enterococcus gallinarum 0.00 0.00 0.00 0.00 0.00 0.13
p__Fusobacteria; c__Fusobacteriia; o__Fusobacteriales; f__Fusobacteriaceae; g__Fusobacterium varium 0.00 0.00 6.69E-04 0.00 0.00 1.44
p__Actinobacteria; c__Actinobacteria; o__Coriobacteriales; f__Coriobacteriaceae; g__Gordonibacter pamelaeae 0.00 0.00 0.01 0.00 0.00 0.07
p__Proteobacteria; c__Epsilonproteobacteria; o__Campylobacterales; f__Helicobacteraceae; g__Helicobacter macacae 0.00 0.00 0.00 0.00 0.00 0.77
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Klebsiella oxytoca 0.00 0.00 0.02 0.00 0.00 0.95
p__Firmicutes; c__Bacilli; o__Bacillales; f__Planococcaceae; g__Kurthia gibsonii 0.00 0.00 3.27E-04 0.00 0.00 0.16
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus 0.00 0.00 0.01 0.00 0.00 0.04
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus gasseri 0.00 0.00 1.64E-03 0.00 0.00 0.44
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus oris 0.00 0.00 1.31E-03 0.00 0.00 0.06
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus plantarum 0.00 0.00 0.01 0.00 0.00 20.16
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus pontis 0.00 0.00 4.72E-03 0.00 0.00 0.09
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Morganella morganii 0.00 0.00 1.20E-03 0.00 0.00 0.26
p__Tenericutes; c__Mollicutes; o__Mycoplasmatales; f__Mycoplasmataceae; g__Mycoplasma fermentans 0.00 0.00 0.19 0.00 0.00 0.21
p__Tenericutes; c__Mollicutes; o__Mycoplasmatales; f__Mycoplasmataceae; g__Mycoplasma orale 0.00 0.00 0.07 0.00 0.00 0.07
p__Coriobacteriia; c__UC Coriobacteriia; o__Coriobacteriales; f__Atopobacteriaceae; g__Olsenella scatoligenes 0.00 0.00 0.06 0.00 0.00 1.57
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Porphyromonadaceae; g__Parabacteroides goldsteinii 0.00 0.00 0.02 0.00 0.00 0.44
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Prevotellaceae; g__Paraprevotella clara 0.00 0.00 0.04 0.00 0.00 0.82
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Pediococcus pentosaceus 0.00 0.00 1.34E-03 0.00 0.00 0.18
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Acidaminococcaceae; g__Phascolarctobacterium faecium 0.00 0.00 0.34 0.00 0.00 0.50
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Porphyromonadaceae; g__Porphyromonas asaccharolytica 0.00 0.00 0.01 0.00 0.00 0.07
p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Prevotellaceae; g__Prevotella buccalis 0.00 0.00 6.81E-04 0.00 0.00 0.07
p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Proteus mirabilis 0.00 0.00 0.00 0.00 0.00 0.08
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Veillonellaceae; g__Selenomonas 0.00 0.00 0.24 0.00 0.00 0.00
p__Firmicutes; c__Erysipelotrichia; o__Erysipelotrichales; f__Erysipelotrichaceae; g__Sharpea p-3329-23G2 0.00 0.00 0.17 0.00 0.00 0.00
p__Firmicutes; c__Bacilli; o__Bacillales; f__Staphylococcaceae; g__Staphylococcus gallinarum 0.00 0.00 0.00 0.00 0.00 0.05
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Streptococcaceae; g__Streptococcus salivarius 0.00 0.00 0.01 0.00 0.00 0.10
p__Proteobacteria; c__Betaproteobacteria; o__Burkholderiales; f__Sutterellaceae; g__Sutterella 0.00 0.00 1.79E-03 0.00 0.00 0.08
p__Proteobacteria; c__Betaproteobacteria; o__Burkholderiales; f__UC Burkholderiales; g__UC Burkholderiales 0.00 0.00 0.33 0.00 0.00 0.11
p__Coriobacteriia; c__UC Coriobacteriia; o__UC Coriobacteriia; f__UC Coriobacteriia; g__UC Coriobacteriia 0.00 0.00 4.72E-03 0.00 0.00 0.08
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Veillonellaceae; g__Veillonella 0.00 0.00 0.00 0.00 0.00 0.37
p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Leuconostocaceae; g__Weissella paramesenteroides 0.00 0.00 0.04 0.00 0.00 0.30
p__Coriobacteriia; c__UC Coriobacteriia; o__Eggerthellales; f__Eggerthellaceae; g__UC Eggerthellaceae 0.00 0.00 2.34E-03 0.00 0.00 0.20
p__Firmicutes; c__Negativicutes; o__Selenomonadales; f__Veillonellaceae; g__UC Veillonellaceae 0.00 0.00 0.43 0.00 0.00 2.21E-03
Comparison of family-level abundances (%) between healthy subjects and patients with enthesitis-related arthritis (ERA) using the 16S-UDb-MOTHUR approach

Comparison of genus-level abundances (%) between healthy subjects and patients with enthesitis-related arthritis (ERA) using the 16S-UDb-MOTHUR approach

The relative abundances of the four most frequent phyla (Bacteroidetes, Firmicutes, Proteobacteria and Actinobacteria) and of the 32 species identified by both approaches (Table II) were generally similar. However, for one species, Faecalibacterium prausnitzii (family Ruminococcaceae), the per cent abundances [median (range)] obtained with the 16S-UDb-mothur-nbc approach [healthy: 3.33 (0.51-9.19), ERA: 4.39 (0.07-79.55); P<0.01] were much higher than those obtained with the GG-UCLUST approach [healthy: 0.09 (0.02-0.63), ERA: 0.13 (0.001-1.84); P<0.01]. This indicated that the optimum database and classifier combination identified by us might not only detect a larger number of bacterial species in a mixture but also help to assess their abundances better.
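To make "much higher" concrete, the short sketch below computes the ratio of the median abundances reported for Faecalibacterium prausnitzii under the two approaches. The numbers are the medians quoted in the text; the dictionary layout and the function name `fold_difference` are illustrative choices, not part of the study's pipeline.

```python
# Median relative abundances (%) of Faecalibacterium prausnitzii,
# as quoted in the text for each subject group and approach.
medians = {
    "healthy": {"GG-UCLUST": 0.09, "16S-UDb-mothur-nbc": 3.33},
    "ERA": {"GG-UCLUST": 0.13, "16S-UDb-mothur-nbc": 4.39},
}


def fold_difference(group):
    """Ratio of the 16S-UDb-mothur-nbc median to the GG-UCLUST median."""
    values = medians[group]
    return values["16S-UDb-mothur-nbc"] / values["GG-UCLUST"]


for group in medians:
    ratio = fold_difference(group)
    print(f"{group}: {ratio:.1f}-fold higher with 16S-UDb-mothur-nbc")
```

Both groups show a difference of more than 30-fold in the median, whereas the other species in Table II differ little between the two approaches.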
Table II

Relative abundance (in %) of different taxonomic groups detected in the real-life test dataset using the GG-UCLUST and 16S-UDb-mothur approaches. Data are shown separately for healthy subjects and those with disease

| Taxonomic group | Subject group | Greengenes-UCLUST: Median | Greengenes-UCLUST: Minimum | Greengenes-UCLUST: Maximum | 16S-UDb-mothur: Median | 16S-UDb-mothur: Minimum | 16S-UDb-mothur: Maximum |
|---|---|---|---|---|---|---|---|
| p_Bacteroidetes; f_Prevotellaceae; g_Prevotella; s_copri | Healthy | 58.17 | 0.79 | 77.16 | 58.2 | 0.78 | 77.53 |
| | Patient | 8.92 | 0.02 | 67.53 | 8.87 | 0.02 | 67.81 |
| p_Bacteroidetes; f_Prevotellaceae; g_Prevotella; s_stercorea | Healthy | 3.86 | 0.03 | 15.7 | 3.83 | 0.03 | 15.63 |
| | Patient | 0.08 | 1.16E-03 | 17.93 | 0.08 | 1.16E-03 | 18.81 |
| p_Firmicutes; f_Veillonellaceae; g_Mitsuokella; s_multacida | Healthy | 0.19 | 0 | 1.04 | 0.22 | 6.02E-04 | 1.05 |
| | Patient | 1.75E-03 | 0 | 1 | 1.74E-03 | 0 | 1.06 |
| p_Firmicutes; f_Erysipelotrichaceae; g_Eubacterium; s_biforme | Healthy | 0.12 | 3.99E-04 | 3.85 | 0.12 | 3.96E-04 | 3.91 |
| | Patient | 0.08 | 0 | 5.62 | 0.08 | 0 | 5.59 |
| p_Proteobacteria; f_Pasteurellaceae; g_Haemophilus; s_parainfluenzae | Healthy | 0.1 | 9.65E-04 | 2.98 | 0.1 | 9.75E-04 | 2.99 |
| | Patient | 0.06 | 1.86E-03 | 7.75 | 0.06 | 1.85E-03 | 7.76 |
| p_Firmicutes; f_Lactobacillaceae; g_Lactobacillus; s_ruminis | Healthy | 0.1 | 4.82E-04 | 5.7 | 0.1 | 4.88E-04 | 5.67 |
| | Patient | 0.02 | 0 | 2.4 | 0.02 | 0 | 2.39 |
| p_Firmicutes; f_Ruminococcaceae; g_Faecalibacterium; s_prausnitzii | Healthy | 0.09 | 0.02 | 0.62 | 3.33 | 0.51 | 9.19 |
| | Patient | 0.13 | 1.05E-03 | 1.84 | 4.39 | 0.07 | 79.55 |
| p_Actinobacteria; f_Coriobacteriaceae; g_Collinsella; s_aerofaciens | Healthy | 0.05 | 7.99E-04 | 16.22 | 0.05 | 7.93E-04 | 16.18 |
| | Patient | 0.17 | 0 | 6.72 | 0.16 | 0 | 6.71 |
| p_Actinobacteria; f_Bifidobacteriaceae; g_Bifidobacterium; s_adolescentis | Healthy | 0.05 | 0 | 8.4 | 0.05 | 1.84E-04 | 8.38 |
| | Patient | 0.05 | 0 | 8.31 | 0.05 | 0 | 8.32 |
| p_Firmicutes; f_Veillonellaceae; g_Veillonella; s_dispar | Healthy | 0.03 | 2.35E-03 | 2.85 | 0.03 | 9.75E-04 | 2.5 |
| | Patient | 0.11 | 1.84E-03 | 3.04 | 0.08 | 1.60E-03 | 3 |
| p_Actinobacteria; f_Bifidobacteriaceae; g_Bifidobacterium; s_longum | Healthy | 0.02 | 1.83E-03 | 3.14 | 0.02 | 1.84E-03 | 3.12 |
| | Patient | 0.37 | 1.20E-03 | 34.4 | 0.37 | 1.20E-03 | 34.32 |
| p_Firmicutes; f_Erysipelotrichaceae; g_Bulleidia; s_p-1630-c5 | Healthy | 0.02 | 0 | 0.22 | 0.02 | 0 | 0.22 |
| | Patient | 2.54E-04 | 0 | 2.61 | 2.54E-04 | 0 | 2.59 |
| p_Bacteroidetes; f_Porphyromonadaceae; g_Parabacteroides; s_distasonis | Healthy | 3.12E-03 | 0 | 0.56 | 3.22E-03 | 0 | 0.57 |
| | Patient | 0.03 | 1.67E-04 | 7.87 | 0.03 | 1.66E-04 | 7.9 |
| p_Bacteroidetes; f_Bacteroidaceae; g_Bacteroides; s_caccae | Healthy | 3.08E-03 | 0 | 0.45 | 3.27E-03 | 0 | 0.59 |
| | Patient | 4.28E-03 | 0 | 15.76 | 0.03 | 0 | 15.86 |
| p_Actinobacteria; f_Micrococcaceae; g_Rothia; s_mucilaginosa | Healthy | 1.31E-03 | 0 | 0.09 | 1.31E-03 | 0 | 0.09 |
| | Patient | 2.98E-03 | 0 | 0.18 | 2.98E-03 | 0 | 0.17 |
| p_Bacteroidetes; f_Bacteroidaceae; g_Bacteroides; s_fragilis | Healthy | 1.25E-03 | 0 | 0.07 | 1.25E-03 | 0 | 0.07 |
| | Patient | 0.01 | 0 | 17.96 | 0.01 | 0 | 17.9 |
| p_Firmicutes; f_Clostridiaceae; g_Clostridium; s_perfringens | Healthy | 8.10E-04 | 0 | 0.08 | 0 | 0 | 7.05E-04 |
| | Patient | 2.84E-04 | 0 | 0.03 | 0 | 0 | 0.17 |
| p_Proteobacteria; f_Pasteurellaceae; g_Actinobacillus; s_parahaemolyticus | Healthy | 6.58E-04 | 0 | 0.05 | 8.89E-04 | 0 | 0.05 |
| | Patient | 1.73E-04 | 0 | 0.1 | 1.73E-04 | 0 | 0.12 |
| p_Bacteroidetes; f_Bacteroidaceae; g_Bacteroides; s_plebeius | Healthy | 2.58E-04 | 0 | 0.99 | 2.59E-04 | 0 | 0.99 |
| | Patient | 0.01 | 0 | 52.88 | 0.01 | 0 | 52.68 |
| p_Firmicutes; f_Lactobacillaceae; g_Lactobacillus; s_mucosae | Healthy | 0 | 0 | 0.02 | 0 | 0 | 0.02 |
| | Patient | 3.51E-04 | 0 | 0.45 | 3.48E-04 | 0 | 0.46 |
| p_Actinobacteria; f_Coriobacteriaceae; g_Eggerthella; s_lenta | Healthy | 0 | 0 | 0.02 | 0 | 0 | 0.12 |
| | Patient | 3.29E-04 | 0 | 0.75 | 1.04E-03 | 0 | 0.75 |
| p_Firmicutes; f_Veillonellaceae; g_Veillonella; s_parvula | Healthy | 0 | 0 | 0.07 | 1.12E-03 | 0 | 0.41 |
| | Patient | 2.71E-04 | 0 | 0.07 | 2.12E-03 | 0 | 0.11 |
| p_Firmicutes; f_Erysipelotrichaceae; g_Eubacterium; s_dolichum | Healthy | 0 | 0 | 0.01 | 0 | 0 | 0.01 |
| | Patient | 2.54E-04 | 0 | 0.02 | 2.54E-04 | 0 | 0.02 |
| p_Firmicutes; f_Streptococcaceae; g_Streptococcus; s_anginosus | Healthy | 0 | 0 | 6.81E-04 | 0 | 0 | 6.81E-04 |
| | Patient | 2.47E-04 | 0 | 0.05 | 2.46E-04 | 0 | 0.05 |
| p_Bacteroidetes; f_Bacteroidaceae; g_Bacteroides; s_eggerthii | Healthy | 0 | 0 | 1.01E-03 | 0 | 0 | 1.00E-03 |
| | Patient | 2.30E-04 | 0 | 2.52 | 2.29E-04 | 0 | 2.49 |
| p_Firmicutes; f_Lachnospiraceae; g_Blautia; s_producta | Healthy | 0 | 0 | 0.01 | 0.03 | 2.12E-03 | 0.45 |
| | Patient | 1.94E-04 | 0 | 0.16 | 0.03 | 3.90E-04 | 0.21 |
| p_Firmicutes; f_Clostridiaceae; g_Clostridium; s_neonatale | Healthy | 0 | 0 | 0 | 0 | 0 | 2.25E-03 |
| | Patient | 0 | 0 | 0.18 | 0 | 0 | 0.18 |
| p_Proteobacteria; f_Oxalobacteraceae; g_Oxalobacter; s_formigenes | Healthy | 0 | 0 | 0.01 | 2.10E-04 | 0 | 0.02 |
| | Patient | 0 | 0 | 0.21 | 0 | 0 | 0.21 |
| p_Proteobacteria; f_Enterobacteriaceae; g_Morganella; s_morganii | Healthy | 0 | 0 | 1.22E-03 | 0 | 0 | 1.20E-03 |
| | Patient | 0 | 0 | 0.27 | 0 | 0 | 0.26 |
| p_Proteobacteria; f_Moraxellaceae; g_Acinetobacter; s_johnsonii | Healthy | 0 | 0 | 2.36E-03 | 0 | 0 | 2.34E-03 |
| | Patient | 0 | 0 | 0.09 | 0 | 0 | 0.09 |
| p_Proteobacteria; f_Moraxellaceae; g_Acinetobacter; s_lwoffii | Healthy | 0 | 0 | 2.30E-03 | 0 | 0 | 2.29E-03 |
| | Patient | 0 | 0 | 0.55 | 0 | 0 | 0.55 |
| p_Verrucomicrobia; f_Verrucomicrobiaceae; g_Akkermansia; s_muciniphila | Healthy | 0 | 0 | 0.01 | 0 | 0 | 0.01 |
| | Patient | 0 | 0 | 0.69 | 0 | 0 | 0.69 |
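Table II reports a median, minimum and maximum per cent abundance for each taxon in each subject group. The following is a minimal sketch of how such a summary could be derived from per-sample read counts. It assumes that relative abundance is a taxon's reads divided by the sample's total reads; the read counts, the `"other"` bucket and the function names are hypothetical and are not data or code from the study.

```python
from statistics import median


def relative_abundance(counts):
    """Convert one sample's read counts per taxon into percentages."""
    total = sum(counts.values())
    if total == 0:
        return {taxon: 0.0 for taxon in counts}
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


def summarize(samples, taxon):
    """Return (median, minimum, maximum) % abundance of a taxon across samples."""
    values = [relative_abundance(sample).get(taxon, 0.0) for sample in samples]
    return median(values), min(values), max(values)


# Hypothetical read counts for three samples per group (illustration only).
groups = {
    "Healthy": [
        {"s_copri": 600, "s_prausnitzii": 30, "other": 370},
        {"s_copri": 10, "s_prausnitzii": 90, "other": 900},
        {"s_copri": 750, "s_prausnitzii": 5, "other": 245},
    ],
    "Patient": [
        {"s_copri": 90, "s_prausnitzii": 40, "other": 870},
        {"s_copri": 2, "s_prausnitzii": 700, "other": 298},
        {"s_copri": 650, "s_prausnitzii": 1, "other": 349},
    ],
}

for group, samples in groups.items():
    for taxon in ("s_copri", "s_prausnitzii"):
        med, low, high = summarize(samples, taxon)
        print(f"{taxon:15s} {group:8s} median={med:.2f} min={low:.2f} max={high:.2f}")
```

A taxon that a classifier fails to assign in a sample contributes 0 to that sample, which is why many minima in Table II are 0.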
Computational performances of different approaches: The 16S-UDb-mothur-nbc approach completed in less computational time than the GG-UCLUST approach (59 vs. 84 min, respectively).

For analysis of high-throughput 16S rRNA sequence data for bacterial mixtures, several taxonomy assignment methods, at least three 16S rRNA reference databases and various HVRs of the 16S rRNA gene are in use. In this study, a unified database was constructed by merging non-ambiguous, fully annotated, full-length 16S sequences from the three commonly used databases. The performance of various combinations of 16S sequence databases, HVRs and taxonomy assignment methods was assessed. Our analysis showed that 16S-UDb (clustered at 97% identity), the unified 16S rRNA database that we created, performed better than the currently available 16S rRNA databases in that it was able to assign taxonomic lineage up to the family, genus and species levels to a larger proportion of sequences in a test database and in a real-life dataset. Further, a combination of this database with the mothur-nbc classifier had the best performance among all the database-classifier combinations, as did a region covering the V2 and V3 HVRs compared to the other HVRs.

Our study had a limitation in that we clustered the reference sequences using a 97 per cent threshold. Edgar [24] has reported that the use of a 100 per cent identity threshold may be better than the usual 97 per cent threshold for accurate prediction. This is because the traditional 97 per cent threshold may place sequences from more than one species and/or genus in the same cluster, leading to some erroneous classification at the genus and species levels [25]. Thus, there is a need to repeat this study at a future date using a 100 per cent identity threshold instead of the 97 per cent identity threshold.

In conclusion, our analysis shows that sequencing of the V2-V3 region of the 16S rRNA gene, followed by analysis of the sequence data obtained using the mothur-nbc classifier of QIIME v1 and our unified 16S-UDb database, may be used for analysis of bacterial mixtures, such as those present at various body sites and in environmental specimens.
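The identity-threshold limitation discussed above can be illustrated with a toy example. The sketch below is not UCLUST or any other tool used in the study: it assumes pre-aligned, equal-length sequences, defines identity as the fraction of matching positions, and uses a simple greedy centroid scheme. The sequences and names are invented for illustration.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_cluster(seqs, threshold):
    """Assign each sequence to the first centroid it matches at >= threshold."""
    centroids = []  # (name, sequence) pairs that seeded a cluster
    clusters = {}   # centroid name -> list of member names
    for name, seq in seqs:
        for centroid_name, centroid_seq in centroids:
            if identity(seq, centroid_seq) >= threshold:
                clusters[centroid_name].append(name)
                break
        else:
            # No existing centroid is similar enough: start a new cluster.
            centroids.append((name, seq))
            clusters[name] = [name]
    return clusters


# Toy 100-base sequences: species_B differs from species_A at 2 positions (98% identity).
base = "ACGT" * 25
variant = "TT" + base[2:]
seqs = [("species_A", base), ("species_A_copy", base), ("species_B", variant)]

print(greedy_cluster(seqs, 0.97))  # species_B is absorbed into the species_A cluster
print(greedy_cluster(seqs, 1.00))  # species_B forms its own cluster
```

At the 97 per cent threshold the two hypothetical species share one cluster, so a single representative sequence would stand in for both; at 100 per cent only identical sequences are merged.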
  34 in total

1.  Search and clustering orders of magnitude faster than BLAST.

Authors:  Robert C Edgar
Journal:  Bioinformatics       Date:  2010-08-12       Impact factor: 6.937

2.  Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB.

Authors:  T Z DeSantis; P Hugenholtz; N Larsen; M Rojas; E L Brodie; K Keller; T Huber; D Dalevi; P Hu; G L Andersen
Journal:  Appl Environ Microbiol       Date:  2006-07       Impact factor: 4.792

3.  Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy.

Authors:  Qiong Wang; George M Garrity; James M Tiedje; James R Cole
Journal:  Appl Environ Microbiol       Date:  2007-06-22       Impact factor: 4.792

4.  Comparison of species richness estimates obtained using nearly complete fragments and simulated pyrosequencing-generated fragments in 16S rRNA gene-based environmental surveys.

Authors:  Noha Youssef; Cody S Sheik; Lee R Krumholz; Fares Z Najar; Bruce A Roe; Mostafa S Elshahed
Journal:  Appl Environ Microbiol       Date:  2009-06-26       Impact factor: 4.792

5.  Natural product biosynthetic gene diversity in geographically distinct soil microbiomes.

Authors:  Boojala Vijay B Reddy; Dimitris Kallifidas; Jeffrey H Kim; Zachary Charlop-Powers; Zhiyang Feng; Sean F Brady
Journal:  Appl Environ Microbiol       Date:  2012-03-16       Impact factor: 4.792

6.  SortMeRNA: fast and accurate filtering of ribosomal RNAs in metatranscriptomic data.

Authors:  Evguenia Kopylova; Laurent Noé; Hélène Touzet
Journal:  Bioinformatics       Date:  2012-10-15       Impact factor: 6.937

7.  Conversion of the Amazon rainforest to agriculture results in biotic homogenization of soil bacterial communities.

Authors:  Jorge L M Rodrigues; Vivian H Pellizari; Rebecca Mueller; Kyunghwa Baek; Ederson da C Jesus; Fabiana S Paula; Babur Mirza; George S Hamaoui; Siu Mui Tsai; Brigitte Feigl; James M Tiedje; Brendan J M Bohannan; Klaus Nüsslein
Journal:  Proc Natl Acad Sci U S A       Date:  2012-12-27       Impact factor: 11.205

8.  Specificity and sensitivity of eubacterial primers utilized for molecular profiling of bacteria within complex microbial ecosystems.

Authors:  S A Huws; J E Edwards; E J Kim; N D Scollan
Journal:  J Microbiol Methods       Date:  2007-06-30       Impact factor: 2.363

9.  Characterization of the salivary microbiome in people with obesity.

Authors:  Yujia Wu; Xiaopei Chi; Qian Zhang; Feng Chen; Xuliang Deng
Journal:  PeerJ       Date:  2018-03-16       Impact factor: 2.984

10.  Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences.

Authors:  Robert C Edgar
Journal:  PeerJ       Date:  2018-04-18       Impact factor: 2.984

