Lakshminarayan M Iyer, Dapeng Zhang, Igor B Rogozin, L Aravind.
Abstract
The deaminase-like fold includes, in addition to nucleic acid/nucleotide deaminases, several catalytic domains such as the JAB domain, and others involved in nucleotide and ADP-ribose metabolism. Using sensitive sequence and structural comparison methods, we develop a comprehensive natural classification of the deaminase-like fold and show that its ancestral version was likely to operate on nucleotides or nucleic acids. Consequently, we present evidence that a specific group of JAB domains are likely to possess a DNA repair function, distinct from the previously known deubiquitinating peptidase activity. We also identified numerous previously unknown clades of nucleic acid deaminases. Using inference based on contextual information, we suggest that most of these clades are toxin domains of two distinct classes of bacterial toxin systems, namely polymorphic toxins implicated in bacterial interstrain competition and those that target distantly related cells. Genome context information suggests that these toxins might be delivered via diverse secretory systems, such as Type V, Type VI, PVC and a novel PrsW-like intramembrane peptidase-dependent mechanism. We propose that certain deaminase toxins might be deployed by diverse extracellular and intracellular pathogens, as well as endosymbionts, as effectors targeting nucleic acids of host cells. Our analysis suggests that these toxin deaminases have been acquired by eukaryotes on several independent occasions and recruited as organellar or nucleo-cytoplasmic RNA modifiers, operating on tRNAs, mRNAs and short non-coding RNAs, and also as mutators of hyper-variable genes, viruses and selfish elements. This scenario potentially explains the origin of mutagenic AID/APOBEC-like deaminases, including novel versions from Caenorhabditis, Nematostella and diverse algae and a large class of fast-evolving fungal deaminases. 
These observations greatly expand the distribution of possible unidentified mutagenic processes catalyzed by nucleic acid deaminases. Published by Oxford University Press.
Year: 2011 PMID: 21890906 PMCID: PMC3239186 DOI: 10.1093/nar/gkr691
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Representative structures of the deaminase fold. All structural cartoons are shown in an approximately similar orientation. The α-helices are colored purple, β-sheets yellow and loops gray. The predicted and known active site residues and substrates and ligands (if known) are labeled. The β-strand which adopts different orientations in the two major deaminase divisions is shown in dark green. Surface diagrams are colored based on their positions relative to the center of the structure (outside to inside: blue to red) to illustrate the binding cleft. For the JAB domain, only the relevant portion of the dimeric Ub-substrate that interacts with the active site is rendered. Similarly, for the AICAR transformylase only the region of the B chain (the other chain of the dimeric unit) that interacts with the active site pocket is rendered.
Figure 2. Reconstructed evolutionary history for the deaminase fold and key structural features. On the left is a reconstructed evolutionary history of the deaminase fold. Individual lineages are listed to the right and grouped according to the classification given in the text and Table 1. The inferred evolutionary depth of the lineages is traced by solid horizontal lines across the relative temporal epochs representing major evolutionary transitional periods shown as vertical lines. Horizontal lines are colored according to their observed phyletic distributions; the key for this coloring scheme is given at the bottom right of the figure. Dashes indicate uncertainty in terms of the origins of a lineage, while gray ellipses group lineages of relatively restricted phyletic distribution with more broadly distributed lineages, indicating that the former likely underwent rapid divergence from the latter. Known and predicted functions of the deaminases are shown next to the clade names. On the right are topologies of the two major divisions within the deaminase superfamily. Insert positions characteristic of various deaminase lineages are marked in both the evolutionary history and topology diagrams. The β-strands and α-helices of the conserved deaminase core are colored yellow and orange respectively. Additional structural elements are colored dark green. Refer to the key for coloring schemes and abbreviations. Additionally, Fu: fungi, Pl: Plants, Na: Naegleria, Oo: Oomycetes.
Phyletic distribution and synapomorphies of deaminase clades
| Clades | Phyletic distribution | Synapomorphies | Additional comments |
|---|---|---|---|
| CDD/CDA cytidine deaminases | Bacteria, sporadic in archaea, eukaryotes | C[H]AE in Hel-2 (H only in minority), PCxxCR motif in Hel-3, E at the end of Str-5 | Involved in pyrimidine salvage pathway; a distinct branch of this clade in oomycetes is fused to SAM and tudor domains. |
| Blasticidin S-deaminase (BSD) (CDD/CDA derived) | Firmicutes, actinobacteria, fungi | Same as above | Produces a modified base that is part of the antibiotic blasticidin S |
| Plant Des/Cda (CDD/CDA derived) | Plants | Same as CDD/CDA for N-terminal domain, C-terminal deaminase domain inactive | Predicted editing deaminase |
| LmjF36.5940-like | Kinetoplastids, stramenopiles, chlorophytes, | Same as CDD/CDA | Kinetoplastid versions are fused to CCCH domains, and also contain a C2C2 insert between Str-1 and Str-2. All other members are fused to a Rossmann fold domain at the N-terminus. |
| PITG_06599-like | Haptophytes, stramenopiles | Same as CDD/CDA, N-terminal deaminase domain lacks the first C of the CxxC motif, C-terminal deaminase domain inactive | Contains two deaminase domains, both of which appear to be inactive |
| DYW-like | Actinobacteria, bacteroidetes, firmicutes, gammaproteobacteria, ascomycetes, | K between Hel-1 and Str-1, insert between Str-2 and Hel-2 with a basic residue, HxEK motif in Hel-2, D at the end of Str-4. The classical DYW family in plants and | Eukaryotic versions are editing deaminases. Associated domains in eukaryotes: PPR, TPR, Ankyrins. Secretion pathways: T2SS, T6SS, T7SS, PrsW related. Repeats: PAAR, RHS. Peptidases involved in delivery: HINT, PrsW. Immunity proteins: Imm5 |
| BURPS668_1122 | Actinobacteria, bacteroidetes, cyanobacteria, firmicutes, β-proteobacteria, γ-proteobacteria, | RxxDxExK in Hel-2; insert between Str-2 and Hel-2; CxxCxS motif in Hel-3; many members are truncated after Hel-3 | Secretory pathways: T2SS, T5SS, T7SS (WxG and LDxD), terminase based, T6SS, SPVB. Repeats: Hemagglutinin, RHS, PAAR, Immunoglobulin. Peptidases involved in delivery: HINT, CPD-like thiol peptidase. Immunity proteins: Imm2, Imm3, SUKH |
| Pput_2613 | | Insert between Hel-1 and Str-1 and between Str-2 and Hel-2; HTE motif in Hel-2; PCxxCK motif in Hel-3 | Secretory pathways: T2SS, T6SS. Repeats: RHS, FN3, Immunoglobulin. Some associated with an inactive transglutaminase |
| SCP1.201 | Actinobacteria, β-proteobacteria | P at the beginning of Hel-1, insert between Str-2 and Hel-2, [HD]xEx[KQ] in Hel-2; N at the end of Str-3, related to the | Secretory pathways: T2SS, T6SS, T7SS. Repeats: PAAR, ALF, RHS. Peptidases involved in delivery: HINT. Immunity proteins: Imm1, Imm4 |
| YwqJ | Actinobacteria, bacteroidetes, cyanobacteria, firmicutes, fusobacteria, planctomycetes, proteobacteria, basidiomycota | Gx[CH]xE in Hel-2; insert between Str-2 and Hel-2 contains a conserved histidine; insert between Str-3 and the CxxC motif; several members are truncated after Hel-3 or Str-4 | Secretory pathways: T2SS, T5SS, T7SS (N-terminal WxG or LDxD domains), SPVB. Repeats: RHS, ALF, PAAR, hemagglutinin. Immunity proteins: SUKH3, Imm6. Associations in polytoxins: HD hydrolase, C2-like peptidase, papain-like peptidase |
| MafB19 | Actinobacteria, cyanobacteria, firmicutes, planctomycetes, proteobacteria | N at the end of Str-2, HxE in Hel-2, V at the end of Str-3, +xxCxxC motif in Hel-3, G at the beginning of Str-4 | Secretory pathways: T2SS, T5SS, T6SS, MafBN-dependent secretion. Repeats: RHS, Hemagglutinin. Peptidases involved in delivery: HINT. Immunity proteins: SUFU, SUKH |
| TadA-Tad2 (ADAT2), Tad3 (ADAT3) | Pan-bacterial, eukaryotic, Tad3 pan-eukaryotes | E before Str-1, N in Str-2, EPClMC motif in Hel-2, basic residue after Str-4, two helices after Str-5, E and F conserved in first C-terminal additional helix | tRNA editing deaminase; in eukaryotes Tad2 and Tad3 form a heterodimer; Tad3 lacks the E in the HxE motif; in several basidiomycetes, Tad3 is fused to a SET domain that might be involved in synthesis of a modified tRNA base or methylation of associated protein |
| Bd3614 | | R before Str-1, lacks the terminal Str-5, HAExN motif in Hel-2; shares M in the CxxC motif with Tad2, CxMxC, acidic residue at the end of Str-4 | In the neighborhood of a gene encoding the 23S rRNA G2445-modifying methylase. Fused to a distinct N-terminal globular domain |
| Tad1, ADAR | Tad1 pan-eukaryotic, ADAR only in metazoans | D two residues before HxE motif, two adjacent arginines in Hel-2 that bind substrate, three-stranded insert in CxxC motif that forms a cap over substrate pocket, DK motif in Hel-3 of which the K binds substrate, R at the end of Hel-4 that contacts D of DH, additional hairpin after Str-5 that packs with Str-2 | Tad1 involved in tRNAAl |
| RibD-like (diamino-hydroxy-phosphoribosyl aminopyrimidine deaminase) | Pan-bacterial, sporadic in euryarchaea, plants, stramenopiles and choanoflagellates, | HxE in Str-2, insert in CxxC motif that contains a conserved H, extended insert between Str-4 and Hel-4 | Riboflavin biosynthesis pathway. Some versions in plants are inactive; usually fused to a C-terminal DHFR reductase domain. In saccharomycete yeasts, the protein is further fused to S4 and pseudouridine synthase domains at the N-terminus |
| Guanine deaminase | Pan-bacterial, sporadic in euryarchaea, eukaryotes | Obligate dimer, insert between Str-2 and Hel-2, strand swapping of Str-5 between dimers, large helical insert between Str-4 and Str-5 | Catabolism of guanine |
| dCMP deaminase and ComE | Pan-bacterial, sporadic in archaea, dsDNA viruses, eukaryotes | Bihelical insert between Str-2 and Hel-2 that contains a Zn-binding motif with two C and a H, C between Hel-1 and Str-1 also contributes to this motif, NXXP at the end of Str-2, NA motif two residues after HxE motif, TxxxT in Str-3, Y between Str-4 and Hel-4 | Uracil biosynthetic pathway; Note: |
| AID/APOBEC | Vertebrates | Extended loop between Hel-1 and Str-1, charged residue at the end of Str-1, W in Str-3, SxS just before the PCxxC motif in Hel-3, APOBEC-4 have a CxxxxxC signature in Hel-3, basic residue in extended loop between Str-4 and Hel-4, M at the end of Str-5, two additional helices after Str-5, F in first additional Helix shared with the Tad2-TadA family, highly conserved W between the terminal helices, several basic residues in second terminal helix | Mutagenic diversification of immunity molecules, mRNA editing, mutagenic anti-viral activity; lamprey PmCDA2 fused to a C-terminal AT-hook domain; |
| Novel AID/APOBEC-like | Nematodes | HxEE motif in Hel-2, insert in the CxxC motif of Hel-4, E in Str-5, residues or elements shared with AID/APOBEC: extended loop between Str-4 and Hel-4; large hydrophobic residue (L/M) at the end of Str-5, two helices after Str-5, Da (a: aromatic) in the first additional C-terminal helix, W in second additional C-terminal helix | Fast evolving homologs of the above deaminases. The |
| Novel AID/APOBEC-like bacterial homologs | | R before Str-1, D at the end of Hel-2, KxxE motif in Hel-6. Residues/elements shared with classical AID/APOBEC deaminases: E in Hel-3, large hydrophobic residue (W) in Str-3, extended loop between Str-4 and Hel-4, V/M in Str-5, two additional helices after Str-5, D in first additional helix | Secretory pathways: SPVB. Repeats: RHS |
| XOO_2897 | Actinobacteria, firmicutes, β-, γ-, δ-proteobacteria | E in insert between Str-3 and Hel-3, aromatic residue between Str-4 and Hel-4 shared with AID/APOBEC deaminases, truncation after Hel-4, Str-5 absent, a subset have an insert between Str-2 and Hel-2, this same subset has a C just before Str-1 | Secretory pathways: T2SS, T6SS, T7SS. Repeats: RHS, PAAR. Immunity proteins: SUKH4 |
| OTT_1508 | Actinobacteria, chloroflexi, cyanobacteria, fibrobacteres/acidobacteria, firmicutes, α- and γ-proteobacteria, Fungi, | GxxK motif before the CxxC motif; extended loop between Str-4 and Hel-4 with a conserved polar (usually H) and axxP (a: aromatic); fungal proteins have a helical insert between Str-2 and Hel-2 | Secretory pathways: T7SS, PVC, T6SS. Peptidases involved in delivery: PVC metallopeptidase. Immunity: SUFU (fused). Polytoxins: HTH, DOC, ColE3, Kinase. Fungal version fused to an N-terminal α + β globular domain; apicomplexan versions fused to tRNA guanine transglycosylase domain; intracellular parasites may have more than one copy; some fungi have lineage-specific expansions of this family |
a Indicates novel clades reported in this study.
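The synapomorphies in the table are written in a compact motif notation (for example HxE, PCxxC, [HD]xEx[KQ], axxP). As a minimal sketch of how such signatures could be scanned for in a sequence, the Python below translates a motif string into a regular expression. It assumes that `x` stands for any residue (the usual convention, not stated explicitly in the table) and that `a` stands for an aromatic residue (FWY), as the table notes. The function names and the toy sequences are hypothetical and are not taken from the study.

```python
import re

# Lowercase class symbols used in the motif notation.
# 'x' = any residue (assumed convention); 'a' = aromatic, as noted in the table.
CLASS_SYMBOLS = {"x": "[A-Z]", "a": "[FWY]"}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a motif such as 'HxE' or '[HD]xEx[KQ]' into a compiled regex."""
    parts = []
    in_bracket = False
    for ch in motif:
        if ch == "[":
            in_bracket = True
            parts.append(ch)
        elif ch == "]":
            in_bracket = False
            parts.append(ch)
        elif not in_bracket and ch in CLASS_SYMBOLS:
            parts.append(CLASS_SYMBOLS[ch])
        else:
            # Uppercase letters are literal residues; bracket contents pass through.
            parts.append(ch)
    return re.compile("".join(parts))


def find_motif(motif: str, sequence: str) -> list[int]:
    """Return 0-based start positions of non-overlapping motif matches."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence.upper())]


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    print(find_motif("HxE", "MKHAEGG"))
    print(find_motif("[HD]xEx[KQ]", "GGDAEVQLL"))
```

A plain regex match only reports where a signature occurs in the primary sequence. Assigning it to a structural element such as Hel-2 or Hel-3 would additionally require an alignment or a secondary structure assignment, which this sketch does not attempt.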
Figure 3. Multiple alignment of the deaminase superfamily. Proteins are denoted by their gene name, species abbreviations and GI (Genbank Index) numbers separated by underscores and are further grouped by their familial associations, shown to the right of the alignment. Secondary structure assignments are shown above the alignment, where the green arrow represents the β-strand and the orange cylinder the α-helix. Helices not part of the universally conserved deaminase core are shown in a different color. Secondary structure was derived from a combination of crystal structures and alignment-based predictions. Inserts are replaced by the corresponding number of residues. Columns in the alignment are colored based on their amino acid conservation at 65% consensus. Residues shared by members of the AID/APOBEC clade are marked in a red box. A temporary id was assigned for the Emiliania huxleyi sequence and its complete sequence is available in the Supplementary Data. Red asterisks are placed at the end of sequences that are truncated and lack terminal secondary structure elements of the conserved deaminase core. The coloring scheme and consensus abbreviations are as follows: h, hydrophobic (ACFILMVWY); l, aliphatic (LIV) and a, aromatic (FWY) residues shaded yellow; b, big residues (LIYERFQKMW), shaded gray; s, small residues (AGSVCDN) and u, tiny residues (GAS), shaded green; p, polar residues (STEDKRNQHC) shaded blue; c, charged residues (DEHKR) shaded magenta and zinc coordinating residues shaded red. Strand-5 of the two distinct deaminase divisions are aligned separately given their independent emergence. 
Species abbreviations are as follows: Aae: Aquifex aeolicus; Acel: Acetivibrio cellulolyticus; Asp.: Actinomyces sp.; Ater: Aspergillus terreus; Atha: Arabidopsis thaliana; BPT4: Enterobacteria phage T4; Bbac: Bdellovibrio bacteriovorus; Bcer: Bacillus cereus; Bdor: Bacteroides dorei; Bpse: Burkholderia pseudomallei; Bsub: Bacillus subtilis; CAmo: Candidatus Amoebophilus; CKor: Candidatus Koribacter; Ccin: Coprinopsis cinerea; Cele: Caenorhabditis elegans; Cneo: Cryptococcus neoformans; Daci: Delftia acidovorans; Ecol: Escherichia coli; Ehux: Emiliania huxleyi; Hsap: Homo sapiens; Lmaj: Leishmania major; Lmon: Listeria monocytogenes; Mkan: Methanopyrus kandleri; Mory: Magnaporthe oryzae; Msp.: Micromonas sp.; Ncra: Neurospora crassa; Ngru: Naegleria gruberi; Nmen: Neisseria meningitidis; Nmul: Nakamurella multipartita; Nvec: Nematostella vectensis; Otsu: Orientia tsutsugamushi; Paer: Pseudomonas aeruginosa; Pbra: Pseudomonas brassicacearum; Pemar: Perkinsus marinus; Petma: Petromyzon marinus; Pinf: Phytophthora infestans; Plum: Photorhabdus luminescens; Pmar: Planctomyces maris; Pput: Pseudomonas putida; Psta: Pirellula staleyi; Psyr: Pseudomonas syringae; Rbel: Rickettsia bellii; Sare: Salinispora arenicola; Scel: Sorangium cellulosum; Scer: Saccharomyces cerevisiae; Scoe: Streptomyces coelicolor; Smoe: Selaginella moellendorffii; Tadh: Trichoplax adhaerens; Tequ: Taylorella equigenitalis; Tgon: Toxoplasma gondii; Tspi: Trichinella spiralis; Wend: Wolbachia endosymbiont; Xcam: Xanthomonas campestris; Xory: Xanthomonas oryzae.
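The residue classes and the 65% consensus threshold given in the Figure 3 legend can be illustrated with a short Python sketch. The class definitions are copied from the legend. Everything else is an assumption made for illustration and may differ from the procedure actually used for the figure: a single residue is reported if it alone reaches the threshold, otherwise the smallest class reaching the threshold wins, and gaps count toward the column height.

```python
from collections import Counter

# Residue classes as listed in the Figure 3 legend.
CLASSES = {
    "h": set("ACFILMVWY"),   # hydrophobic
    "l": set("LIV"),         # aliphatic
    "a": set("FWY"),         # aromatic
    "b": set("LIYERFQKMW"),  # big
    "s": set("AGSVCDN"),     # small
    "u": set("GAS"),         # tiny
    "p": set("STEDKRNQHC"),  # polar
    "c": set("DEHKR"),       # charged
}


def column_consensus(column: str, threshold: float = 0.65) -> str:
    """Return a consensus symbol for one alignment column, or '.' if none.

    Assumed rule (not necessarily the authors' exact procedure): prefer a
    single conserved residue, then the smallest class meeting the threshold.
    Gap characters are counted in the denominator.
    """
    n = len(column)
    if n == 0:
        return "."
    counts = Counter(ch for ch in column.upper() if ch.isalpha())
    if counts:
        residue, top = counts.most_common(1)[0]
        if top / n >= threshold:
            return residue
    # Try the most specific (smallest) classes first.
    for symbol, members in sorted(CLASSES.items(), key=lambda kv: len(kv[1])):
        covered = sum(counts[r] for r in members)
        if covered / n >= threshold:
            return symbol
    return "."


def alignment_consensus(sequences: list[str], threshold: float = 0.65) -> str:
    """Apply column_consensus to every column of equal-length aligned sequences."""
    return "".join(column_consensus("".join(col), threshold) for col in zip(*sequences))


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    print(alignment_consensus(["HAEL", "HVEI", "HCEV", "HAEF"]))
```

Trying the smallest class first means a column of L, I and V is reported as aliphatic (`l`) instead of the broader hydrophobic class (`h`), which matches the intent of a most-specific-label consensus line.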
Figure 4. Representative domain architectures of the deaminase superfamily. Proteins are denoted by their name, species and gi. Architectures are grouped based on the deaminase lineage in which they are present. Domains newly identified in this study are indicated by a blue margin. For the most part, standard domain names were used (as in PFAM). The various families of the SUKH superfamily of anti-toxins (e.g. Smi1, SUKH3 or SUKH4) are individually labeled. Other domain abbreviations: BactIG: a family of immunoglobulin fold domains found in bacteria; Bd3614N: N-terminal domain found in Bd3614-like deaminases; CPD: a Clostridium difficile Toxin A CPD-type thiol peptidase; NT-α: N-terminal α-helical domain limited to firmicutes; PG_binding: peptidoglycan binding; various PT domains are pre-toxin domains; PseudoN: N-terminal domain limited to Pseudomonas; TM: transmembrane; Toxin_PL: predicted papain-like peptidase toxin; SP: signal peptide; Tail_Fiber: a phage tail fiber-like peptidase; Tu: tudor; X: uncharacterized globular domains; Y: novel Rossmann fold domain. MafBN is a Neisseria-specific domain involved in toxin delivery along with the MafA lipoprotein.
Figure 5. Gene neighborhoods and contextual connection network of the deaminase superfamily. (A) Individual genes are represented as arrows pointing from the 5′- to the 3′-end of the coding frame. Genes were named according to their domain architectures. For each operon, the gene name, species name and gi of the deaminase (marked with a star) are indicated. Uncharacterized genes are shown as small gray boxes. Where possible, secretion pathways are indicated. Smi1, SUKH3 and SUKH4 are different clades of immunity proteins belonging to the SUKH superfamily. (B) Domains linked in a polypeptide are indicated by solid lines, whereas contextual linkages between genes in operons are indicated by dashes of different colors. Lines are colored based on the deaminase clade. Black arrows indicate the polarity of domain arrangement in a polypeptide with the arrowhead pointing to the C-terminus, and white arrows show the order of genes in operons from 5′ to 3′. Multiple copies of domains or their direct linkages in operons are shown with arrow cycles. Key protein domains that correspond to diverse secretion systems (T5SS, T2SS, T7SS, T6SS, PVC, PrsW and the terminase system) are grouped together. Different deaminase clades are labeled with deaminase followed by numbers from 1 to 12. Toxin domains that are present in polytoxins are linked with bold lines. For domain abbreviations, please refer to the Figure 4 legend.
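The network in panel B distinguishes two kinds of directed links: domains fused in one polypeptide (pointing toward the C-terminus) and genes adjacent in an operon (pointing 5′ to 3′), each tagged by deaminase clade. One way such a network could be encoded is sketched below in Python. The `Link` type, the function names and the example links are hypothetical placeholders chosen for illustration; they are not data extracted from the figure.

```python
from collections import defaultdict
from typing import NamedTuple


class Link(NamedTuple):
    source: str  # N-terminal domain (fusion) or 5' gene (operon)
    target: str  # C-terminal domain (fusion) or 3' gene (operon)
    kind: str    # "fusion" = same polypeptide; "operon" = neighboring genes
    clade: str   # deaminase clade label used to color the line


def build_network(links: list[Link]) -> dict[str, list[Link]]:
    """Index links by their source node to form a directed, typed multigraph."""
    graph: dict[str, list[Link]] = defaultdict(list)
    for link in links:
        graph[link.source].append(link)
    return graph


def partners(graph: dict[str, list[Link]], node: str, kind: str) -> list[str]:
    """Return the sorted downstream partners of a node for one link type."""
    return sorted({l.target for l in graph.get(node, []) if l.kind == kind})


if __name__ == "__main__":
    # Placeholder links for illustration only, not taken from Figure 5.
    example = [
        Link("PAAR", "RHS", "fusion", "example-clade"),
        Link("RHS", "Deaminase", "fusion", "example-clade"),
        Link("Deaminase", "Immunity", "operon", "example-clade"),
    ]
    net = build_network(example)
    print(partners(net, "RHS", "fusion"))
    print(partners(net, "Deaminase", "operon"))
```

Keeping the link type on each edge lets the same structure answer both architectural questions (what is fused C-terminal to a domain) and gene-neighborhood questions (what follows a gene in its operon).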