Ward Deboutte1, Leen Beller2, Claude Kwe Yinda2,3, Piet Maes2, Dirk C de Graaf4, Jelle Matthijnssens1.
Abstract
Honey bees (Apis mellifera) generate enormous economic value through their pollination activities and play a central role in the biodiversity of entire ecosystems. Recent efforts have revealed the substantial influence that the gut microbiota exert on bee development, food digestion, and homeostasis in general. In this study, deep sequencing was used to characterize the prokaryotic viral communities associated with honey bees, which have been a blind spot in research until now. The vast majority of the prokaryotic viral populations are novel at the genus level, and most of the encoded proteins have unknown functions. Nevertheless, genomes of bacteriophages were predicted to infect nearly every major bee-gut bacterium, and functional annotation and auxiliary metabolic gene discovery imply the potential to influence microbial metabolism. Furthermore, undiscovered genes involved in the synthesis of secondary metabolic biosynthetic gene clusters reflect a wealth of previously untapped enzymatic resources hidden in the bee bacteriophage community.
Keywords: Apis mellifera; bacteriophages; prokaryotic viruses; viral metagenomics
Year: 2020 PMID: 32341166 PMCID: PMC7229680 DOI: 10.1073/pnas.1921859117
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Fig. 1. Bee-associated prokaryotic viruses display a high interindividual diversity and contain a large number of unknown viral proteins. (A) Species accumulation curves as a function of the number of pools sequenced. Vertical lines indicate SDs based on 100 permutations. (B) Swarm plot reflecting putative viral contigs larger than 5 kb that were present in one sample or more (140 in total). Presence is defined as a coverage >10. A dot represents a single contig. The box shows the three quartile values, and the whiskers extend to 1.5 interquartile ranges of the lower and upper quartile. All 140 dots are drawn in the plot. (C) Edge-weighted spring-embedded layout network depicting the samples as nodes and the number of contigs shared between them as edges. Edge thickness reflects the number of shared contigs and ranges from 1 to 15. Green nodes depict pools derived from healthy colonies; red nodes depict pools derived from weak colonies. (D) GC percentage of all representative putative viral contigs as a function of their log10-transformed coverage in the pool from which the representative was derived. Log10-transformed length is indicated by color intensity. (E) Number of pVOGs detected in the putative viral contigs, normalized by the number of predicted ORFs, as a function of their log10-transformed coverage. Log10-transformed length is indicated by color intensity.
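The permutation-based accumulation curve in panel A and the presence rule in panel B can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the input layout (one set of contig IDs per pool), and the use of the population SD are assumptions made only for illustration.

```python
import random
from statistics import mean, pstdev


def present_contigs(coverage_by_contig, threshold=10):
    """Return contigs counted as present, i.e. with coverage above the threshold.

    The caption defines presence as coverage >10; the dict layout is an assumption.
    """
    return {contig for contig, cov in coverage_by_contig.items() if cov > threshold}


def accumulation_curve(pools, n_perm=100, seed=0):
    """Mean and SD of the cumulative contig count after 1..N pools.

    pools: list of sets of contig IDs, one set per sequenced pool (hypothetical layout).
    Each permutation shuffles the pool order and records how many distinct
    contigs have been seen after each added pool.
    """
    rng = random.Random(seed)
    counts = [[] for _ in pools]
    for _ in range(n_perm):
        order = list(pools)
        rng.shuffle(order)
        seen = set()
        for i, pool in enumerate(order):
            seen |= pool
            counts[i].append(len(seen))
    # Population SD is an assumption; the caption does not say which SD was used.
    return [(mean(c), pstdev(c)) for c in counts]
```

Called on a list of per-pool contig sets, `accumulation_curve` returns one `(mean, SD)` pair per number of pools, which is the information that a curve with vertical SD bars displays.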
Fig. 2. Retrieved prokaryotic viruses display a significant difference in genomic variables and infect a wide range of known bee-gut bacteria. (A) Frequency of strand shift as a function of coding density (number of ORFs per kilobase). Data from the bacterial dataset are indicated in blue; data from the viral dataset are indicated in orange. Boxplots for individual parameters are also shown, and asterisks designate significance (Mann–Whitney U test; P value for coding density = 5 × 10^−164; P value for strand shift frequency = 4 × 10^−87). The box shows the three quartile values, and the whiskers extend to 1.5 interquartile ranges of the lower and upper quartile. Dots drawn independently fall outside of this range. (B) Maximum-likelihood phylogenetic tree for bacterial sequences included in the host-calling effort. Gray integers indicate bootstrap values. The tree is colored according to bacterial genera. The number of contigs linked to a specific bacterial species is indicated by the stacked horizontal bar plots (CRISPR-spacer counts and tRNA similarity). Shades of gray indicate the number of specific bacterial species that gave hits to a single contig (CRISPR spacers) or indicate a specific viral contig (tRNA similarity). Single contigs displaying CRISPR-spacer hits to multiple bacteria are indicated with colored tax links between the tips. A single color corresponds to a single contig.
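The two genomic variables plotted in panel A can be sketched as follows. Coding density follows the caption's definition (ORFs per kilobase). The caption does not define how strand shift frequency is normalized, so the version below (strand changes per pair of adjacent ORFs) is an assumption, as are the function names and the input format.

```python
def coding_density(n_orfs, contig_length_bp):
    """Number of ORFs per kilobase of contig, as defined in the caption."""
    return n_orfs / (contig_length_bp / 1000)


def strand_shift_frequency(orf_strands):
    """Fraction of adjacent ORF pairs that lie on opposite strands.

    orf_strands: strands ("+" or "-") of the ORFs in genomic order.
    Normalizing by the number of adjacent pairs is an assumption;
    the source does not state the exact definition.
    """
    if len(orf_strands) < 2:
        return 0.0
    shifts = sum(1 for a, b in zip(orf_strands, orf_strands[1:]) if a != b)
    return shifts / (len(orf_strands) - 1)
```

For example, a hypothetical 8-kb contig with 12 ORFs has a coding density of 1.5 ORFs per kilobase under this definition.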
Fig. 3. The vast majority of retrieved prokaryotic viruses cannot be classified confidently. (A) Counts indicating the classification status of the putative viral contigs using vConTACT2. “Clustered Assigned” denotes retrieved contigs falling into clusters containing reference sequences; “Clustered Not-Assigned” denotes retrieved contigs falling in clusters without reference sequences. (B) Number of clusters that contained both confidently clustered large contigs and reference sequences. (C) Number of clusters that contained both confidently clustered contigs and reference sequences. (D) Scalable force-directed placement layout genome network containing the retrieved clustered putative prokaryotic viruses (red) and the viral family of reference sequences (other colors). The most prevalent viral families are indicated in yellow (Myoviridae), green (Podoviridae), and orange (Siphoviridae).
Fig. 4. Functional annotation reveals a large metabolic overlap between bacterial and prokaryotic virus proteins. (A) Number of viral and bacterial proteins included in the analysis (blue), number of clusters remaining after collapsing on 50% AA identity (orange), and number of protein clusters with a hit through either eggNOG-mapper or InterProScan. (B) Venn diagram depicting the number of protein clusters and the number of putative AMGs identified. (C) Functional network depicting the KEGG pathways represented in the viral protein clusters. Edge weight reflects the number of GO accessions associated with each pathway, ranging from 1 to 43. All edge weights larger than 1 are indicated with a number on the terminal node. Pathways outlined with a red rectangle depict functional pathways found encoded by viral genes that are reflected by the bacterial representatives of the AMGs as well.