Shenghui Li1,2,3, Ruochun Guo2, Yue Zhang2, Peng Li2, Fang Chen1, Xifan Wang3,4, Jing Li5, Zhuye Jie6, Qingbo Lv2, Hao Jin2,7, Guangyang Wang1, Qiulong Yan1.
Abstract
The human oral cavity is a hotspot of numerous, mostly unexplored, viruses that are important for maintaining oral health and microbiome homeostasis. Here, we analyzed 2,792 publicly available oral metagenomes and proposed the Oral Virus Database (OVD) comprising 48,425 nonredundant viral genomes (≥5 kbp). The OVD catalog substantially expanded the known phylogenetic diversity and host specificity of oral viruses, allowing for enhanced delineation of some underrepresented groups such as the predicted Saccharibacteria phages and jumbo viruses. Comparisons of the viral diversity and abundance of different oral cavity habitats suggested strong niche specialization of viromes within individuals. The virome variations in relation to host geography and properties were further uncovered, especially the age-dependent viral compositional signatures in saliva. Overall, the viral genome catalog describes the architecture and variability of the human oral virome, while offering new resources and insights for current and future studies.
Keywords: Microbiome; Virology
Year: 2022 PMID: 35663034 PMCID: PMC9160773 DOI: 10.1016/j.isci.2022.104418
Source DB: PubMed Journal: iScience ISSN: 2589-0042
Figure 1. Overview of the oral virome database
(A) Map of the world showing the number of metagenomic samples per country and the distribution of oral cavity sites. Pie plots show the proportions of oral cavity sites for each country. Bracketed numbers indicate the number of metagenomic samples. Detailed information on all samples is provided in Table S1.
(B) CheckV estimation of completeness (upper panel) and contamination (bottom panel) of the OVD viruses.
(C) Accumulation curves for vOTUs (upper panel), approximately genus-level groups (middle panel), and approximately family-level groups (bottom panel) from the OVD catalog.
(D) UpSet plot shows the number of vOTUs shared by existing virome databases. CHVD, Cenote Human Virome Database (only oral viruses are included); GVD, Gut Virome Database; GPD, Gut Phage Database; MGV, Metagenomic Gut Virus.
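The accumulation curves in panel (C) can be illustrated with a minimal Python sketch. The source does not state how the curves were computed, so the approach here (averaging the cumulative count of distinct groups over random sample orderings) is an assumption, and the function name, parameters, and toy data are hypothetical.

```python
import random

def accumulation_curve(samples, n_permutations=100, seed=0):
    """Mean number of distinct groups seen after adding 1..N samples.

    samples: list of sets, each holding the group IDs (e.g. vOTUs)
             detected in one metagenomic sample.
    Assumption: the curve is averaged over random sample orderings;
    the original study may have used a different procedure.
    """
    rng = random.Random(seed)
    n = len(samples)
    totals = [0.0] * n
    order = list(range(n))
    for _ in range(n_permutations):
        rng.shuffle(order)
        seen = set()
        for position, sample_index in enumerate(order):
            seen |= samples[sample_index]   # add this sample's groups
            totals[position] += len(seen)   # cumulative distinct count
    return [t / n_permutations for t in totals]

# Toy data: three samples with hypothetical vOTU identifiers.
toy_samples = [{"vOTU_a", "vOTU_b"}, {"vOTU_b", "vOTU_c"}, {"vOTU_d"}]
curve = accumulation_curve(toy_samples)
```

The same function applies to genus-level or family-level groups if each sample is represented by the set of group identifiers it contains. A curve that flattens as samples are added suggests that further sampling would recover few new groups.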
Figure 2. Phylogenomic analysis of high-quality vOTUs in OVD
(A) A proteomic tree of 10,931 high-quality vOTUs. The tree was generated using ViPTreeGen (Nishimura et al., 2017). Outer rings display metadata for each vOTU: innermost ring, viral family-level taxonomic assignments; ring 2, phylum-level host assignments; ring 3, lysogenic or lytic types; and outermost ring, the sequence length of the vOTUs. See Table S3 for details.
(B) Distribution of prokaryotic hosts of high-quality vOTUs. The vOTUs are grouped at the family level, and the host taxa are shown at the family level. The number of vOTUs with more than one predicted host is shown in light blue.
Figure 3. Phylogenomic analysis of the predicted Saccharibacteria phages and jumbo viruses
(A) Heatmap shows the pairwise proteomic similarity among 344 Saccharibacteria phages. F1–F2 and G1–G7 represent the approximately family-level and genus-level groups, respectively. Right panel represents the host predictions of vOTUs (gray tile: the phage was predicted to infect the corresponding species). AAI, average amino acid identity. See Table S5 for details.
(B) A proteomic tree of 391 jumbo viruses. The tree was generated using ViPTreeGen (Nishimura et al., 2017). F1–F8 represent the approximately family-level groups. Outer circles display the host assignments at the phylum and family levels. The branch lengths of the tree are drawn on a logarithmic scale (indicated by gray numbers). See also Figure S8.
(C) Pie plots show the functional distribution of Kyoto Encyclopedia of Genes and Genomes (KEGG)-annotated genes for jumbo (upper panel) and non-jumbo (bottom panel) viruses.
(D) The proportions of lysogenic and lytic types for jumbo and non-jumbo viruses. See Table S6 for details of subfigures (B–D).
Figure 4. Comparison of viromes among four oral cavity sites
(A) Boxplots show the viral richness (the number of observed vOTUs, bottom panel) and diversity (Shannon’s index, upper panel) of four sites. Wilcoxon rank-sum tests were performed between pairs of groups.
(B) Principal coordinates analysis (PCoA) based on the Bray-Curtis distance of vOTU profiles. Samples are plotted on the first and second principal coordinates (PC1 and PC2), and the proportion of variance explained by these two PCs is shown. Ellipsoids represent an 80% confidence interval surrounding each group.
(C) Barplots show the proportions of site-enriched viruses grouped by their family-level taxa (left panel) and phylum-level host predictions (right panel). See Table S8 for details.
(D) Heatmap shows the occurrence rates of the top 50 functions in the vOTUs that were significantly enriched in each of the four oral sites. “+” marks the function with the highest occurrence rate in the corresponding vOTU group. The occurrence rate is the ratio of the number of site-enriched viruses carrying the corresponding KO to the total number of site-enriched viruses. Bold font marks several enzymes that are described as examples in Results. The comparison of these functions is shown in Table S9.
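The occurrence rate defined in panel (D) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' script: the input layout (a mapping from each site-enriched vOTU to its set of KO annotations), the function name, and the identifiers in the toy data are all hypothetical.

```python
def occurrence_rates(enriched_votus):
    """Occurrence rate of each KO among site-enriched vOTUs.

    enriched_votus: dict mapping a vOTU ID to the set of KO IDs
                    annotated on that vOTU (hypothetical layout).
    Returns a dict mapping each KO to
    (number of vOTUs carrying the KO) / (total number of vOTUs).
    """
    total = len(enriched_votus)
    if total == 0:
        return {}
    counts = {}
    for kos in enriched_votus.values():
        for ko in kos:                      # each vOTU counts once per KO
            counts[ko] = counts.get(ko, 0) + 1
    return {ko: n / total for ko, n in counts.items()}

# Toy data: four vOTUs enriched at one site, with made-up KO labels.
toy_site = {
    "vOTU_1": {"KO_x", "KO_y"},
    "vOTU_2": {"KO_x"},
    "vOTU_3": set(),          # no annotated function
    "vOTU_4": {"KO_y"},
}
rates = occurrence_rates(toy_site)
```

Note that vOTUs without any annotated function still count toward the denominator, which follows the caption's wording ("the total number of site-enriched viruses").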
Figure 5. Alterations of the salivary virome across geography and host gender, age, and body mass index
(A) Principal coordinates analysis (PCoA) based on the Bray-Curtis distance of vOTU profiles of the salivary virome. Samples are plotted on the first and second principal coordinates (PC1 and PC2), and the proportion of variance explained by these two PCs is shown. Ellipsoids represent a 95% confidence interval surrounding each group. PERMANOVA, permutational multivariate ANOVA.
(B and C) Boxplots show the viral richness (the number of observed vOTUs) and diversity (Shannon’s index) of the salivary samples grouped by their geography (B) and gender (C).
(D and E) Distribution of the viral richness and diversity of the salivary samples at different ages (D) and BMI (E). A smoothed curve was fitted to the diversity index against the age/BMI of the samples using the geom_smooth function in R. Points colored gray indicate samples without available age/BMI data.
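The three quantities used throughout these panels (richness, Shannon's index, and the Bray-Curtis distance underlying the PCoA) can be written out as a minimal Python sketch. The function names and toy abundance vectors are hypothetical, and the use of the natural logarithm for Shannon's index is an assumption, since the source does not state the log base.

```python
import math

def richness(abundances):
    """Number of observed vOTUs (non-zero abundance) in one sample."""
    return sum(1 for a in abundances if a > 0)

def shannon(abundances):
    """Shannon's index H = -sum(p_i * ln p_i) over observed vOTUs.

    Assumption: natural logarithm.
    """
    total = sum(abundances)
    if total == 0:
        return 0.0
    return -sum((a / total) * math.log(a / total)
                for a in abundances if a > 0)

def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two vOTU abundance profiles.

    x and y must list abundances for the same vOTUs in the same order.
    """
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(x) + sum(y)
    return numerator / denominator if denominator else 0.0

# Toy profiles over four hypothetical vOTUs.
sample_1 = [5, 0, 3, 2]
sample_2 = [0, 4, 3, 3]
print(richness(sample_1), shannon(sample_1), bray_curtis(sample_1, sample_2))
```

A PCoA would then be run on the matrix of pairwise Bray-Curtis values between all samples; that step is omitted here.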
Figure 6. Alterations of the dental plaque virome across geography and host gender, age, and body mass index
(A) Principal coordinates analysis (PCoA) based on the Bray-Curtis distance of vOTU profiles of the dental plaque virome. Samples are plotted on the first and second principal coordinates (PC1 and PC2), and the proportion of variance explained by these two PCs is shown. Ellipsoids represent a 95% confidence interval surrounding each group. PERMANOVA, permutational multivariate ANOVA.
(B and C) Boxplots show the viral richness (the number of observed vOTUs) and diversity (Shannon’s index) of the dental plaque samples grouped by their geography (B) and gender (C).
(D and E) Distribution of the viral richness and diversity of the dental plaque samples at different ages (D) and BMI (E). A smoothed curve was fitted to the diversity index against the age/BMI of the samples using the geom_smooth function in R. Points colored gray indicate samples without available age/BMI data.
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| OVD database | This paper | |
| Scripts about virus identification | This paper | |
| CHVD database | ( | |
| GVD database | ( | |
| GPD database | ( | |
| MGV database | ( | |
| NCBI RefSeq (virus) | https://ftp.ncbi.nlm.nih.gov/refseq/release/viral | |
| Virus-Host DB | ( | |
| crAss-phages database | ( | |
| Viral proteins from Benler’s study | ( | |
| human genome GRCh38 | NCBI | |
| BUSCO (bacteria) | ( | |
| Oral SGBs database | ( | |
| KEGG database | ( | |
| ( | NCBI BioSample -see | |
| ( | NCBI BioSample -see | |
| ( | NCBI BioSample -see | |
| ( | NCBI BioSample -see | |
| ( | NCBI BioSample -see | |
| ( | NCBI BioSample -see | |
| ( | NCBI BioSample -see | |
| ( | NCBI BioSample -see | |
| HMP, sequencing reads | ( | NCBI BioSample -see |
| ( | NCBI BioSample -see | |
| ( | NCBI BioSample -see | |
| ( | NCBI BioSample -see | |
| ( | NCBI BioSample -see | |
| ( | NCBI BioSample -see | |
| ( | NCBI BioSample -see | |
| ( | NCBI BioSample -see | |
| fastp v0.20.1 | ( | |
| bowtie2 v2.4.1 | ( | |
| SPAdes v3.14.1 | ( | |
| CheckV v0.7.0 | ( | |
| DeepVirFinder v1.0 | ( | |
| VIBRANT v1.2.1 | ( | |
| VirSorter2 v2.2.2 | ( | |
| hmmsearch v3.3.1 | ( | |
| BLAST v2.9.0 | ( | |
| Prodigal v2.6.3 | ( | |
| DIAMOND v2.0.6.144 | ( | |
| MinCED v0.4.2 | ( | |
| ViPTreeGen v1.1.2 | ( | |
| iTOL v6.3.2 | ( | |
| R v4.0.3 | https:// | |