Blair G Paul, David Burstein, Cindy J Castelle, Sumit Handa, Diego Arambula, Elizabeth Czornyj, Brian C Thomas, Partho Ghosh, Jeff F Miller, Jillian F Banfield, David L Valentine.
Abstract
Major radiations of enigmatic Bacteria and Archaea with large inventories of uncharacterized proteins are a striking feature of the Tree of Life [1-5]. The processes that led to functional diversity in these lineages, which may contribute to a host-dependent lifestyle, are poorly understood. Here, we show that diversity-generating retroelements (DGRs), which guide site-specific protein hypervariability [6-8], are prominent features of genomically reduced organisms from the bacterial candidate phyla radiation (CPR) and as yet uncultivated phyla belonging to the DPANN (Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanoarchaeota and Nanohaloarchaea) archaeal superphylum. From reconstructed genomes we have defined monophyletic bacterial and archaeal DGR lineages that expand the known DGR range by 120% and reveal a history of horizontal retroelement transfer. Retroelement-guided diversification is further shown to be active in current CPR and DPANN populations, with an assortment of protein targets potentially involved in attachment, defence and regulation. Based on observations of DGR abundance, function and evolutionary history, we find that targeted protein diversification is a pronounced trait of CPR and DPANN phyla compared to other bacterial and archaeal phyla. This diversification mechanism may provide CPR and DPANN organisms with a versatile tool that could be used for adaptation to a dynamic, host-dependent existence.
Year: 2017 PMID: 28368387 PMCID: PMC5436926 DOI: 10.1038/nmicrobiol.2017.45
Source DB: PubMed Journal: Nat Microbiol ISSN: 2058-5276 Impact factor: 17.745
Figure 1. Prevalence of DGRs identified in groundwater metagenomes
a, Schematic of a genomic diversity-generating retroelement cassette and the mutagenic retrohoming mechanism. b, Clusters of reverse transcriptase (RT) protein sequences at >70% global pairwise alignment displayed by filter size, with the numbers of unique/shared RT clusters in parentheses. c, Distribution of DGR occurrence in reconstructed genomes, given as a fraction of 1 Mbp for each discrete interval. d, Incidence of DGRs in the archaeal DPANN superphylum, Parcubacteria (OD1) and Microgenomates (OP11), and other CPR phyla.
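Panel b groups RT protein sequences that exceed 70% identity in a global pairwise alignment. The record does not name the clustering tool, the scoring scheme or the exact identity definition, so the following is only a minimal sketch under stated assumptions: a Needleman-Wunsch alignment with unit match, mismatch and gap scores, identity computed as identical columns divided by alignment length, and a greedy assignment to the first matching cluster representative. The function names `global_identity` and `greedy_cluster` are hypothetical and are not taken from the study.

```python
from typing import List


def global_identity(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -1) -> float:
    """Needleman-Wunsch global alignment of a and b.

    Returns identical columns / total alignment columns.
    The scoring values are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal path, counting identical and total columns.
    i, j = n, m
    identical = columns = 0
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        columns += 1
    return identical / columns if columns else 1.0


def greedy_cluster(seqs: List[str], threshold: float = 0.70) -> List[List[int]]:
    """Greedy clustering: longest sequences first; each sequence joins the
    first cluster whose representative it matches above the threshold."""
    order = sorted(range(len(seqs)), key=lambda k: -len(seqs[k]))
    clusters: List[List[int]] = []  # first index in each list is the representative
    for k in order:
        for members in clusters:
            if global_identity(seqs[k], seqs[members[0]]) > threshold:
                members.append(k)
                break
        else:
            clusters.append([k])
    return clusters


if __name__ == "__main__":
    # Toy peptides, not real RT sequences.
    toy = ["MKVLAAGIVG", "MKVLAAGIVA", "WWPPQQRRTT"]
    print(greedy_cluster(toy))
```

Greedy representative-based clustering is order dependent, which is why the sketch sorts by length first; a dedicated clustering program would normally replace the quadratic-time alignment used here for real metagenomic data sets.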
Figure 2. Phylogeny of diversity-generating retroelements and radiation of novel lineages
a, Phylogeny and taxonomic association of diversifiers. Sequences obtained in this study are highlighted in black on the outer edge of the tree. Shaded slices indicate either candidate superphyla comprising groundwater organisms, or bacterial phyla and phage with previously sequenced RTs. The tree was constructed with 346 sequences and 261 alignment positions. Archaeal sequences from the DPANN superphylum are indicated in either dark red (Pacearchaeota), or light red (Woesearchaeota) shaded slices. Paraphyletic groups of species with closely related RTs are indicated with the associated range of pairwise TR sequence similarity. Symbols (hexagons and triangles) highlight DGRs identified in this study, which are found in close proximity to putative prophage or mobile elements (i.e. transposons or conjugative elements). Diversifiers that have been previously examined are: BPP, a Bordetella pertussis phage; T. denticola, Treponema denticola; L. pneumophila, Legionella pneumophila strain Corby; and ANMV-1, a marine virus of an uncultivated archaeal host. White circles indicate bootstrap values >70% for basal nodes. The scale indicates amino acid substitutions per site. b, Phylogenomic tree of CPR organisms, highlighting major lineages that contain at least one DGR. The phylogenomic tree was originally presented by Hug et al. (2016).
Figure 3. Putative functional classes of DGR variable proteins
a, Functional annotations for variable proteins identified in at least two distinct candidate phyla. Conserved domains and features based on structural homology are shown. Phyre2 confidence values are given for predictions and their closest known structures. b, Distribution of common variable proteins found in candidate phyla, labeled i-iv, based on the examples in (a).