Kento Tominaga, Daichi Morimoto, Yosuke Nishimura, Hiroyuki Ogata, Takashi Yoshida.
Abstract
Bacteroidetes is one of the most abundant heterotrophic bacterial taxa in the ocean and plays crucial roles in recycling phytoplankton-derived organic matter. Viruses of Bacteroidetes are also expected to have an important role in the regulation of host communities. However, knowledge of marine Bacteroidetes viruses is biased toward cultured viruses from a few species, mainly fish pathogens or Bacteroidetes that are not abundant in marine environments. In this study, we investigated the recently reported 1,811 marine viral genomes to identify putative Bacteroidetes viruses using various in silico host prediction techniques. Notably, we used microbial metagenome-assembled genomes (MAGs) to augment the marine Bacteroidetes reference genomic data. The examined viral genomes and MAGs were derived from simultaneously collected samples. Using nucleotide sequence similarity-based host prediction methods, we detected 31 putative Bacteroidetes viral genomes. The MAG-based method substantially enhanced the predictions (26 viruses) when compared with the method based solely on the reference genomes from NCBI RefSeq (7 viruses). Previously unrecognized genus-level groups of Bacteroidetes viruses were detected only by the MAG-based method. We also developed a host prediction method based on the proportion of Bacteroidetes homologs in viral genomes, which detected 321 putative Bacteroidetes virus genomes, including 81 that were newly recognized as Bacteroidetes virus genomes. The majority of putative Bacteroidetes viruses were detected based on the proportion of Bacteroidetes homologs in both RefSeq and MAGs; however, some were detected in only one of the two datasets. Putative Bacteroidetes virus lineages included not only relatives of known viruses but also those phylogenetically distant from the cultured viruses, such as marine Far-T4 like viruses known to be widespread in aquatic environments.
Our MAG and protein homology-based host prediction approaches enhanced the existing knowledge on the diversity of Bacteroidetes viruses and their potential interaction with their hosts in marine environments.
Keywords: Bacteroidetes; Bacteroidetes virus; computational viral host prediction; environmental viral genomes; metagenome assembled genomes
Year: 2020 PMID: 32411107 PMCID: PMC7198788 DOI: 10.3389/fmicb.2020.00738
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
The number of EVGs assigned to Bacteroidetes viruses according to nucleotide-based methods (i.e., CRISPR, tRNA, BLASTn, and oligonucleotide frequency) using Bacteroidetes genomes.
| Reference dataset | CRISPR | tRNA | BLASTn | Oligonucleotide frequency | Total |
| 3,695 RefSeq Bacteroidetes genomes | 3 | 0 | 16 | 0 | 19 |
| 518 TARA Bacteroidetes MAGs | 1 | 14 | 18 | 5 | 38 |
FIGURE 1. Proportion of Bacteroidetes homologs in viral genomes. (A) Proportion of Bacteroidetes homologs among protein-coding genes. (B) Proportion of Bacteroidetes homologs among cellular organism homologs. The proportions on the right are those calculated from NCBI RefSeq and those on the left are calculated from TARA MAGs. Red and blue boxes represent cultured Bacteroidetes viruses and cultured viruses infecting other prokaryotes, respectively. The boxes represent the first quartile, median, and third quartile. Asterisks denote significance (Mann–Whitney U-test, *P < 0.05, **P < 0.001). (C) Proportion of Bacteroidetes homologs in RefSeq (orange) and other cellular organism homologs in RefSeq (blue) of the 1,811 EVGs. (D) Proportion of Bacteroidetes homologs in TARA MAGs (orange) and other cellular organism homologs in TARA MAGs (blue) of the 1,811 EVGs. (E, F) Scatter plots showing the proportion of Bacteroidetes homologs among protein-coding genes (x-axis) and among cellular organism homologs (y-axis) for the comparison against RefSeq (E) and TARA MAGs (F). Viruses passing the cutoff values for the prediction of Bacteroidetes EVGs are shown as red circles. Viruses that pass the two criteria (i.e., (i) at least 7.9% or 4.2% of genes should be homologs of Bacteroidetes genes in RefSeq or TARA MAGs, respectively; (ii) the Bacteroidetes homologs should account for at least 18.8% or 38.9% of cellular homologs in RefSeq or TARA MAGs, respectively) but have only a few Bacteroidetes homologs (RefSeq: <5 homologous genes; TARA MAGs: <3 homologous genes) are shown as gray triangles. Other viruses that did not pass the cutoff values are shown as blue squares.
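The two cutoff criteria and the minimum-homolog check described in the Figure 1 caption can be illustrated with a small sketch. The numeric thresholds are taken from the caption; everything else (the `HomologCounts` container, the `classify` function, the category labels, and the assumption that homolog counts per genome are already available) is hypothetical and is not the authors' implementation.

```python
from dataclasses import dataclass

# Cutoffs quoted in the Figure 1 caption, keyed by reference dataset.
# The dictionary layout and key names are illustrative assumptions.
CUTOFFS = {
    "RefSeq": {"min_gene_frac": 0.079, "min_cellular_frac": 0.188, "min_homologs": 5},
    "TARA-MAGs": {"min_gene_frac": 0.042, "min_cellular_frac": 0.389, "min_homologs": 3},
}


@dataclass
class HomologCounts:
    """Per-genome counts (hypothetical container)."""
    n_genes: int                   # protein-coding genes in the viral genome
    n_cellular_homologs: int       # genes with a homolog in any cellular organism
    n_bacteroidetes_homologs: int  # genes whose homolog is from Bacteroidetes


def classify(counts: HomologCounts, dataset: str) -> str:
    """Return a plot category analogous to those in panels (E) and (F)."""
    cut = CUTOFFS[dataset]
    # Criterion (i): Bacteroidetes homologs as a fraction of all genes.
    gene_frac = (counts.n_bacteroidetes_homologs / counts.n_genes
                 if counts.n_genes else 0.0)
    # Criterion (ii): Bacteroidetes homologs as a fraction of cellular homologs.
    cellular_frac = (counts.n_bacteroidetes_homologs / counts.n_cellular_homologs
                     if counts.n_cellular_homologs else 0.0)
    if gene_frac < cut["min_gene_frac"] or cellular_frac < cut["min_cellular_frac"]:
        return "below_cutoff"              # blue squares
    if counts.n_bacteroidetes_homologs < cut["min_homologs"]:
        return "passes_with_few_homologs"  # gray triangles
    return "predicted_bacteroidetes"       # red circles
```

For example, `classify(HomologCounts(50, 20, 10), "RefSeq")` evaluates a genome in which 20% of genes and 50% of cellular homologs are Bacteroidetes homologs. The same genome can fall into different categories for RefSeq and TARA MAGs because the cutoffs differ between the two datasets.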
General genomic features of the Bacteroidetes gOTUs identified in this study.
| G160 | 13 | 9 | 37,551 | 38.5 | 2.2% | 11.1% | |
| G178 | 1 | 1 | 40,754 | 32.6 | 11.4% | 0% | |
| G185 | 4 | 2 | 54,812 | 31.7 | 1.9% | 0.5% | |
| G189 | 3 | 1 | 58,769 | 35.4 | 5.3% | 3.7% | |
| G199 | 2 | 1 | 36,245 | 35.8 | 5.9% | 2.5% | |
| G203 | 5 | 2 | 31,173 | 30.7 | 5.7% | 7.0% | |
| G204 | 3 | 3 | 32,490 | 32.4 | 4.4% | 6.4% | |
| G205 | 2 | 2 | 27,613 | 33.8 | 3.5% | 15.1% | |
| G206 | 4 | 3 | 27,672 | 35.8 | 3.2% | 7.4% | |
| G207 | 3 | 1 | 31,013 | 33.2 | 4.7% | 4.7% | |
| G210 | 8 | 5 | 34,852 | 38.2 | 7.5% | 4.6% | |
| G211 | 1 | 1 | 34,002 | 34.9 | 7.8% | 9.8% | |
| G341 | 1 | 1 | 39,514 | 39.3 | 10.2% | 10.2% | |
| G398 | 1 | 1 | 179,949 | 32.0 | 6.3% | 0.4% | T4 like |
| G405 | 1 | 1 | 143,709 | 33.4 | 8.5% | 7.3% | Far-T4 like |
| G493 | 21 | 21 | 32,686 | 33.5 | 31.8% | 5.3% | Novel sub-clade of Flavobacteriaceae group 1 |
| G494 | 3 | 3 | 31,174 | 31.7 | 22.8% | 8.0% | |
| G535 | 1 | 1 | 33,328 | 30.5 | 36.0% | 4.0% | |
| G536 | 1 | 1 | 39,973 | 35.3 | 28.6% | 7.1% | |
| G537 | 1 | 1 | 41,032 | 42.0 | 55.4% | 21.4% | |
| G541 | 4 | 4 | 33,608 | 40.6 | 43.6% | 35.1% | |
| G542 | 1 | 1 | 44,120 | 33.1 | 36.1% | 22.2% | |
| G544 | 1 | 1 | 38,581 | 32.6 | 44.1% | 8.5% | |
| G561 | 1 | 1 | 42,760 | 32.6 | 25.8% | 1.6% | Bacteroidetes viral lineages |
| G563 | 1 | 1 | 51,661 | 49.2 | 3.3% | 4.9% | |
| G790 | 1 | 1 | 58,364 | 33.9 | 34.7% | 26.4% | |
| G794 | 9 | 7 | 12,003 | 31.2 | 0% | 0% | |
| G810 | 3 | 1 | 43,470 | 46.8 | 2.0% | 2.3% | |
| G815 | 3 | 3 | 32,908 | 39.6 | 28.6% | 30.8% | |
FIGURE 2. Approximately-maximum-likelihood phylogenetic tree computed from a multiple alignment of Gp23 (major capsid protein) sequences of TARA_ERS490346_N000037 (G405) and T4-like superfamily viruses. The protein sequences were collected from RefSeq and Lake Pavin viromes (Roux et al., 2015b). Circles indicate nodes with bootstrap support higher than 0.9.
FIGURE 3. Abundance of the Bacteroidetes EVGs in global ocean surface waters. (A) Virome fragment recruitment of Bacteroidetes EVG groups in each oceanic region. Sampling sites of the TARA Oceans expedition used for the analysis are shown as red circles. Bar graphs represent the normalized virome FPKM (fragments per kilobase per million mapped reads) of each Bacteroidetes EVG group at each site. (B) Heatmap showing the normalized virome FPKM of abundant Bacteroidetes EVGs (i.e., gOTUs with an average relative abundance >0.1% and/or a relative abundance >15% in at least one site, within Bacteroidetes EVGs). The scale bar on the left represents FPKM values. Average FPKM values are shown in the right panel. Novel Bacteroidetes EVGs detected in this study are highlighted in red text. Oceanic regions shown in the map in panel (A) are shown under the x-axis.
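As a rough illustration of the quantities named in the Figure 3 caption, the sketch below computes FPKM with its standard definition and applies the two abundance cutoffs from panel (B). The function names and the nested-dictionary input layout are hypothetical, and the caption does not specify how the FPKM values were further normalized, so no such step is attempted here.

```python
def fpkm(fragments: int, length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of sequence per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def relative_abundance(fpkm_by_gotu: dict) -> dict:
    """Share of each gOTU in the summed FPKM of one site."""
    total = sum(fpkm_by_gotu.values())
    if total == 0:
        return {gotu: 0.0 for gotu in fpkm_by_gotu}
    return {gotu: value / total for gotu, value in fpkm_by_gotu.items()}


def abundant_gotus(fpkm_table: dict, mean_cutoff: float = 0.001,
                   site_cutoff: float = 0.15) -> list:
    """Select gOTUs with mean relative abundance > 0.1% and/or > 15% at any site.

    fpkm_table maps site -> {gOTU: FPKM}; only Bacteroidetes EVGs are assumed
    to be included, so abundances are relative to that set.
    """
    rel = {site: relative_abundance(values) for site, values in fpkm_table.items()}
    gotus = {gotu for values in fpkm_table.values() for gotu in values}
    selected = []
    for gotu in sorted(gotus):
        # A gOTU absent from a site counts as zero abundance there.
        per_site = [rel[site].get(gotu, 0.0) for site in rel]
        mean_rel = sum(per_site) / len(per_site)
        if mean_rel > mean_cutoff or max(per_site) > site_cutoff:
            selected.append(gotu)
    return selected
```

Here `fpkm_table` might look like `{"site_1": {"G493": 90.0, "G405": 10.0}, "site_2": {...}}`, where the site names are placeholders rather than real station identifiers.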