Tsute Chen, Yanmei Huang, Isabel F Escapa, Prasad Gajare, Floyd E Dewhirst, Katherine P Lemon.
Abstract
The expanded Human Oral Microbiome Database (eHOMD) is a comprehensive microbiome database for sites along the human aerodigestive tract that revealed new insights into the nostril microbiome. The eHOMD provides well-curated 16S rRNA gene reference sequences linked to available genomes and enables assignment of species-level taxonomy to most next-generation sequences derived from diverse aerodigestive tract sites, including the nasal passages, sinuses, throat, esophagus, and mouth. Using minimum entropy decomposition coupled with the RDP Classifier and our eHOMD V1-V3 training set, we reanalyzed 16S rRNA V1-V3 sequences from the nostrils of 210 Human Microbiome Project participants at the species level, revealing four key insights. First, we discovered that Lawsonella clevelandensis, a recently named bacterium, and Neisseriaceae [G-1] HMT-174, a previously unrecognized bacterium, are common in adult nostrils. Second, just 19 species accounted for 90% of the total sequences from all participants. Third, 1 of these 19 species belonged to a currently uncultivated genus. Fourth, for 94% of the participants, 2 to 10 species constituted 90% of their sequences, indicating that the nostril microbiome may be represented by limited consortia. These insights highlight the strengths of the nostril microbiome as a model system for studying interspecies interactions and microbiome function. Also, in this cohort, three common nasal species (Dolosigranulum pigrum and two Corynebacterium species) showed positive differential abundance when the pathobiont Staphylococcus aureus was absent, generating hypotheses regarding colonization resistance. By facilitating species-level taxonomic assignment to microbes from the human aerodigestive tract, the eHOMD is a vital resource enhancing clinical relevance of microbiome studies. 
IMPORTANCE The eHOMD (http://www.ehomd.org) is a valuable resource for researchers, from basic to clinical, who study the microbiomes and the individual microbes in body sites in the human aerodigestive tract, which includes the nasal passages, sinuses, throat, esophagus, and mouth, and the lower respiratory tract, in health and disease. The eHOMD is an actively curated, web-based, open-access resource. eHOMD provides the following: (i) species-level taxonomy based on grouping 16S rRNA gene sequences at 98.5% identity, (ii) a systematic naming scheme for unnamed and/or uncultivated microbial taxa, (iii) reference genomes to facilitate metagenomic, metatranscriptomic, and proteomic studies and (iv) convenient cross-links to other databases (e.g., PubMed and Entrez). By facilitating the assignment of species names to sequences, the eHOMD is a vital resource for enhancing the clinical relevance of 16S rRNA gene-based microbiome studies, as well as metagenomic studies.
Keywords: 16S; Corynebacterium; Dolosigranulum; Lawsonella; Staphylococcus; microbiota; nares; nasal; respiratory tract; sinus
Year: 2018 PMID: 30534599 PMCID: PMC6280432 DOI: 10.1128/mSystems.00187-18
Source DB: PubMed Journal: mSystems ISSN: 2379-5077 Impact factor: 6.496
The eHOMD outperforms comparable databases for species-level taxonomic assignment to 16S rRNA reads from nostril samples (SKn data set)
| Database | No. of reads identified | % reads identified |
|---|---|---|
| HOMDv14.5 | 22,274 | 50.2 |
| eHOMDv15.1 | 42,197 | 95.1 |
| SILVA128 | 40,597 | 91.5 |
| RDP16 | 38,815 | 87.5 |
| NCBI 16S | 38,337 | 86.4 |
| Greengenes GOLD | 31,195 | 70.3 |
Reads identified via blastn search at 98.5% identity and 98% coverage.
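The identification criterion in the footnote above (a blastn hit at 98.5% identity or higher and 98% coverage or higher) can be written as a filter over tabular blastn output. The following is a minimal sketch, not the authors' pipeline. It assumes the search was run with `-outfmt "6 qseqid sseqid pident qcovs"`, that coverage means query coverage, and that a read counts as identified if any one hit passes both thresholds. The function name and the example hits are hypothetical.

```python
import csv
import io

MIN_IDENTITY = 98.5  # percent identity threshold from the table footnote
MIN_COVERAGE = 98.0  # percent coverage threshold from the table footnote


def identified_reads(blast_tsv):
    """Return the set of read IDs with at least one qualifying hit.

    Assumes tab-separated rows of: qseqid, sseqid, pident, qcovs.
    """
    identified = set()
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        read_id = row[0]
        pident = float(row[2])
        qcov = float(row[3])
        if pident >= MIN_IDENTITY and qcov >= MIN_COVERAGE:
            identified.add(read_id)
    return identified


# Hypothetical hits, for illustration only.
example = io.StringIO(
    "read1\trefA\t99.2\t100\n"   # passes both thresholds
    "read2\trefB\t98.4\t100\n"   # identity too low
    "read3\trefC\t99.0\t97\n"    # coverage too low
    "read3\trefD\t98.5\t98\n"    # second hit for read3 passes
)
hits = identified_reads(example)
```

Dividing `len(hits)` by the total number of reads searched gives the "% reads identified" column.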
Performance of eHOMD and comparable databases for species-level taxonomic assignment to 16S rRNA gene data sets from sites throughout the human aerodigestive tract
| Data set | 16S region | 16S primers | Sequencing technology | Sample type | No. of samples | No. of reads analyzed | Database | No. of reads identified | % reads identified |
|---|---|---|---|---|---|---|---|---|---|
| Laufer- | V1-V2 | 27F, 338R | Roche/454 | Nostril | 108 children | 120,274 | eHOMDv15.1 | 96,233 | 80.0 |
| | | | | | | | SILVA128 | 97,233 | 80.8 |
| | | | | | | | RDP16 | 97,464 | |
| | | | | | | | NCBI 16S | 87,082 | 72.4 |
| Allen- | V1-V3 | 27F, 534R | 454-FLX | Nasal | 10 adults | 75,310 | eHOMDv15.1 | 68,594 | 91.1 |
| | | | | | | | SILVA128 | 69,082 | |
| | | | | | | | RDP16 | 65,028 | 86.4 |
| | | | | | | | NCBI 16S | 63,892 | 84.8 |
| Pei-Blaser | CL | 318F, 1519R, | CL | Esophageal | 4 adults (10 libraries) | 7,414 | eHOMDv15.1 | 7,276 | |
| | | | | | | | SILVA128 | 7,019 | 94.7 |
| | | | | | | | RDP16 | 6,847 | 92.4 |
| | | | | | | | NCBI 16S | 6,686 | 90.2 |
| Harris-Pace | CL | 27F, 907R | CL | Bronchial | 57 children | 3,203 | eHOMDv15.1 | 2,684 | |
| | | | | | | | SILVA128 | 2,633 | 82.2 |
| | | | | | | | RDP16 | 2,500 | 78.1 |
| | | | | | | | NCBI 16S | 2,427 | 75.8 |
| HMPnV1-V3 ( | V1-V3 | 27F, 534R | Roche/454 | Nostril | 227 adults | 2,338,563 | eHOMDv15.1 | 2,133,083 | |
| | | | | | | | SILVA128 | 2,035,882 | 87.1 |
| | | | | | | | RDP16 | 1,965,611 | 84.1 |
| | | | | | | | NCBI 16S | 1,932,732 | 82.6 |
| van der Gast- | CL | 7F, 1510R | CL | Expectorated | 14 adults (CF) | 2,137 | eHOMDv15.1 | 2,123 | |
| | | | | | | | SILVA128 | 2,084 | 97.5 |
| | | | | | | | RDP16 | 2,057 | 96.3 |
| | | | | | | | NCBI 16S | 2,045 | 95.7 |
| Flanagan- | CL | 27F, 1492R | CL | Endotracheal | 6 adults, 1 child | 3,278 | eHOMDv15.1 | 3,193 | 97.4 |
| | | | | | | | SILVA128 | 3,199 | |
| | | | | | | | RDP16 | 3,193 | 97.4 |
| | | | | | | | NCBI 16S | 3,186 | 97.2 |
| Perkins- | CL | 8F, 1391R | CL | Extubated | 8 adults | 1,263 | eHOMDv15.1 | 1,008 | |
| | | | | | | | SILVA128 | 1,000 | 79.2 |
| | | | | | | | RDP16 | 916 | 72.5 |
| | | | | | | | NCBI 16S | 832 | 65.9 |
Reads identified via blastn search at 98.5% identity and 98% coverage.
CL, clone library.
CF, cystic fibrosis.
See Text S1 in the supplemental material.
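The percentage column in the table above is consistent with dividing the number of reads identified by the total number of reads in the data set and rounding to one decimal place. A minimal sketch of that arithmetic follows. The function name is hypothetical, and the rounding rule is inferred from the printed values rather than stated by the authors.

```python
def percent_identified(reads_identified, reads_total):
    """Percentage of reads identified, rounded to one decimal place."""
    if reads_total <= 0:
        raise ValueError("reads_total must be positive")
    return round(100.0 * reads_identified / reads_total, 1)


# Read counts copied from the table (eHOMDv15.1 rows with a printed percentage).
laufer = percent_identified(96233, 120274)
allen = percent_identified(68594, 75310)
flanagan = percent_identified(3193, 3278)
```

The same calculation can be applied to any row where the read counts are present, which offers a way to check a printed percentage against its counts.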
FIG 1 The process for identifying human microbial taxa (HMTs) from the aerodigestive tract to generate the eHOMD. Schematic of the approach used to identify taxa that were added as human microbial taxa (HMT) to generate the eHOMDv15.04. The colored boxes indicate databases (blue), data sets (gray), newly added HMTs (green), and newly added eHOMDrefs for the present HMTs (orange). The performance of blastn searches is indicated by yellow ovals and performance of other tasks is indicated in white rectangles. HMT replaces the old HOMD taxonomy prefix HOT (human oral taxon). (A) Process for generating the provisional eHOMDv15.01 by adding bacterial species from culture-dependent studies. (B and C) Process for generating the provisional eHOMDv15.02 by identifying additional HMTs from a data set of 16S rRNA gene clones from human nostrils. (D and E) Process for generating the provisional eHOMDv15.03 by identifying additional candidate taxa from culture-independent studies of aerodigestive tract microbiomes. (F and G) Process for generating the provisional eHOMDv15.04 by identifying additional candidate taxa from a data set of 16S rRNA gene clones from human skin. Please see Materials and Methods for a detailed description of the processes depicted in panels A to G. Abbreviations: NCBI 16S, NCBI 16S Microbial database; eHOMDref, eHOMD reference sequence; db, database; ident, identity. Data sets included SKns (11–16), Allen et al. (22), Laufer et al. (21), Pei et al. (25, 26), Harris et al. (27), and Kaspar et al. (19).
Number of species-level taxa in eHOMDv15.1 that are indistinguishable at various percent identity thresholds for 16S rRNA regions V1-V3 and V3-V4
| % identity | No. of indistinguishable taxa (V1-V3) | No. of indistinguishable taxa (V3-V4) |
|---|---|---|
| 99 | 37 | 269 |
| 99.5 | 22 | 171 |
| 100 | 14 | 63 |
For nonnasal skin samples, the eHOMD performs best for species-level taxonomic assignment to 16S rRNA reads from oily skin sites (SKs data set)
Reads identified via blastn search at 98.5% identity and 98% coverage. The skin type is indicated in color as follows: oily (blue), dry (red), and moist (green).
Summary of eHOMD data at the phylum level
| Phylum | No. of taxa | No. of eHOMDrefs | No. of genomes |
|---|---|---|---|
| | 5 | 3 | 1 |
| | 118 | 153 | 292 |
| | 125 | 179 | 133 |
| | 1 | 1 | 5 |
| | 3 | 0 | 3 |
| | 3 | 1 | 4 |
| | 1 | 2 | 1 |
| | 1 | 0 | 1 |
| | 266 | 341 | 581 |
| | 37 | 46 | 60 |
| | 5 | 3 | 2 |
| | 141 | 174 | 393 |
| | 19 | 16 | 7 |
| | 50 | 64 | 35 |
| | 8 | 15 | 8 |
| WPS-2 | 1 | 0 | 1 |
Data were compiled at the time of writing this paper; for an updated summary, including summaries at other taxonomic levels, visit the eHOMD web site (http://www.homd.org/index.php?name=HOMD&taxonomy_level=1).
FIG 2 A small number of genera and species account for the majority of taxa in the HMP nares V1-V3 data set at both an overall level and individual level. (A and C) Taxa identified in the reanalysis of the HMP nostril V1-V3 data set graphed based on cumulative relative abundance of sequences at the genus level (A) and species/supraspecies level (C). The top 10 taxa are labeled. Prevalence (Prev) as a percentage is indicated by the color gradient. The genus Cutibacterium includes species formerly known as the cutaneous Propionibacterium species, e.g., Propionibacterium acnes (70). (B and D) The minimum number of taxa at the genus level (B) and species/supraspecies level (D) that accounted for 90% of the total sequences in each person’s sample based on a table of taxa ranked by cumulative abundance from greatest to least. Ten or fewer species/supraspecies accounted for 90% of the sequences in 94% of the 210 HMP participants in this reanalysis. The cumulative relative abundance of sequences does not reach 100% because 1.5% of the reads could not be assigned a genus and because 4.9% of the reads could not be assigned a species/supraspecies.
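The per-person quantity in panels B and D, the minimum number of taxa that account for 90% of a sample's sequences, can be computed by ranking taxa from most to least abundant and accumulating counts until the threshold is reached. The sketch below illustrates that idea under stated assumptions. The function name and the example counts are hypothetical, and the denominator is taken as the sum of the counts supplied, so any unassigned reads would need to be passed in as their own entry to count toward the total.

```python
def min_taxa_for_fraction(counts, fraction=0.90):
    """Smallest number of taxa whose summed counts reach `fraction` of the total.

    `counts` maps taxon name -> number of sequences in one sample.
    """
    total = sum(counts.values())
    if total == 0:
        return 0  # empty sample: no taxa needed
    running = 0
    # Rank taxa from greatest to least abundance, then accumulate.
    for n, count in enumerate(sorted(counts.values(), reverse=True), start=1):
        running += count
        if running >= fraction * total:
            return n
    return len(counts)


# Hypothetical sample, for illustration only.
sample = {"taxon_a": 600, "taxon_b": 250, "taxon_c": 100, "taxon_d": 50}
n_taxa = min_taxa_for_fraction(sample)
```

Applying the function to each participant's count table and tallying how many participants fall at or below a given number of taxa would yield a summary of the kind reported in the caption.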
FIG 3 Three common nasal species/supraspecies exhibit increased differential relative abundance when S. aureus is absent from the nostril microbiome. In contrast, no other species showed differential abundance based on the presence or absence of Neisseriaceae [G-1] bacterium HMT-174 or Lawsonella clevelandensis. (A to C) We used ANCOM to analyze species/supraspecies-level composition of the HMP nares V1-V3 data set when Neisseriaceae [G-1] bacterium HMT-174 (A), L. clevelandensis (Lcl) (B), or S. aureus (Sau) (C) was either absent (−) or present (+). Results were corrected for multiple testing. The dark bar represents the median, and lower and upper hinges correspond to the first and third quartiles. Each gray dot represents the value for a sample, and multiple overlapping dots appear black. Coryne. acc_mac_tub represents the supraspecies Corynebacterium accolens_macginleyi_tuberculostearicum.