Raja Mazumder, Lakshminarayan M Iyer, Sona Vasudevan, L Aravind.
Abstract
2',3' Cyclic nucleotide phosphodiesterases are enzymes that catalyze at least two distinct steps in the splicing of tRNA introns in eukaryotes. Recently, the biochemistry and structure of these enzymes, from yeast and the plant Arabidopsis thaliana, have been extensively studied. They were found to share a common active site, characterized by two conserved histidines, with the bacterial tRNA-ligating enzyme LigT and the vertebrate myelin-associated 2',3' phosphodiesterases. Using sensitive sequence profile analysis methods, we show that these enzymes define a large superfamily of predicted phosphoesterases with two conserved histidines (hence 2H phosphoesterase superfamily). We identify several new families of 2H phosphoesterases and present a complete evolutionary classification of this superfamily. We also carry out a structure-function analysis of these proteins and present evidence for diverse interactions for different families, within this superfamily, with RNA substrates and protein partners. In particular, we show that eukaryotes contain two ancient families of these proteins that might be involved in RNA processing, transcriptional co-activation and post-transcriptional gene silencing. Another eukaryotic family restricted to vertebrates and insects is combined with UBA and SH3 domains suggesting a role in signal transduction. We detect these phosphoesterase modules in polyproteins of certain retroviruses, rotaviruses and coronaviruses, where they could function in capping and processing of viral RNAs. Furthermore, we present evidence for multiple families of 2H phosphoesterases in bacteria, which might be involved in the processing of small molecules with the 2',3' cyclic phosphoester linkages. The evolutionary analysis suggests that the 2H domain emerged through a duplication of a simple structural unit containing a single catalytic histidine prior to the last common ancestor of all life forms. Initially, this domain appears to have been involved in RNA processing and it appears to have been recruited to perform various other functions in later stages of evolution.
Year: 2002 PMID: 12466548 PMCID: PMC137960 DOI: 10.1093/nar/gkf645
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Multiple alignment of a selected set of 2H domains. Proteins are represented by their gene names, species abbreviations and gi numbers. The 85% consensus shown below the alignment was based on the following amino acid classes: h, hydrophobic residues (L,I,Y,F,M,W,A,C,V) and l, aliphatic (L,I,A,V) residues shaded yellow; o, alcohol (S,T) group containing residues, shaded blue. The secondary structure of the Arabidopsis Appr>p cyclic phosphodiesterase is shown above the alignment, where H denotes residues present in helices and E (extended) in strands. Family specific groupings are shown to the right of the alignment. Species abbreviations are as follows: Aae, Aquifex aeolicus; Af, A.fulgidus; Ana, Anabaena sp.PCC 7120; Ap, A.pernix; ARV, Avian rotavirus; At, Arabidopsis thaliana; Atu, Agrobacterium tumefaciens; Bmel, Brucella melitensis; Bs, B.subtilis; Bst, Bacillus stearothermophilus; BV, Berne virus; Ca, Carassius auratus; Cac, Clostridium acetobutylicum; Ccr, Caulobacter crescentus; Ce, Caenorhabditis elegans; Cgl, C.glutamicum; CIV, Chilo iridescent virus; Ddi, D.discoideum; Dm, Drosophila melanogaster; Drad, Deinococcus radiodurans; Ec, E.coli; Feac, Ferroplasma acidarmanus; FPV, Fowlpox virus; HCoV, Human coronavirus; HRV, Human rotavirus; Hs, Homo sapiens; MHV, Mouse hepatitis virus; Mj, M.jannaschii; Mkan, M.kandleri; Mlo, M.loti; Mm, Mus musculus; Mma, M.mazei Goe1; Mta, M.thermautotrophicus; Mtu, Mycobacterium tuberculosis; Pa, P.abyssi; Ph, Pyrococcus horikoshii; Psa, P.aeruginosa; Rsol, Ralstonia solanacearum; Sa, Staphylococcus aureus; Sc, S.cerevisiae; Scoe, Streptomyces coelicolor; Sme, Sinorhizobium meliloti; Sp, S.pombe; Spn, Streptococcus pneumoniae; SRV, Snakehead retrovirus; Sso, S.solfataricus; Ssp, Synechocystis sp. PCC 6803; T4, Bacteriophage T4; Tac, Thermoplasma acidophilum; Tm, Thermotoga maritima; WDSV, Walleye dermal sarcoma virus; WEHV1, Walleye epidermal hyperplasia virus type 1; WEHV2, Walleye epidermal hyperplasia virus type 2; WssV, Shrimp white spot syndrome virus; ZRV, Zebrafish endogenous retrovirus.
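The 85% consensus described in the Figure 1 legend can be illustrated with a minimal sketch. The residue classes (h, l, o) are taken from the legend; everything else is an assumption made for illustration and is not the authors' procedure: the function names, the choice to count gaps against the threshold, the order in which classes are tried (the narrower aliphatic class before the broader hydrophobic one), and the use of `.` for columns without a consensus. The toy alignment is invented and does not come from the paper.

```python
# Residue classes as listed in the Figure 1 legend.
CLASSES = {
    "l": set("LIAV"),       # aliphatic
    "o": set("ST"),         # alcohol group containing
    "h": set("LIYFMWACV"),  # hydrophobic
}

def column_consensus(column, threshold=0.85):
    """Return a consensus symbol for one alignment column (a string).

    Assumptions of this sketch: gaps ('-') count against the threshold,
    a single dominant residue is reported as itself, and classes are
    tried from narrowest to broadest.
    """
    n = len(column)
    if n == 0:
        return "."
    # 1. A single residue present in at least `threshold` of the sequences.
    for residue in sorted(set(column)):
        if residue != "-" and column.count(residue) / n >= threshold:
            return residue
    # 2. Otherwise, the first residue class that reaches the threshold.
    for symbol in ("l", "o", "h"):
        members = CLASSES[symbol]
        if sum(r in members for r in column) / n >= threshold:
            return symbol
    return "."  # no consensus at this position

def consensus(alignment, threshold=0.85):
    """Compute the consensus line for equal-length aligned sequences."""
    return "".join(
        column_consensus("".join(col), threshold) for col in zip(*alignment)
    )

# Hypothetical four-sequence alignment, for illustration only.
toy_alignment = ["HLTL", "HVTI", "HISF", "HASW"]
print(consensus(toy_alignment))
```

With only four sequences, an 85% threshold requires every sequence to agree at a position; in a real alignment of many 2H domains a few exceptions per column would be tolerated.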
Figure 2. Evolutionarily conserved structure of the 2H phosphoesterase domain. (A) Structure of the plant CPDase (PDB id: 1FSI) showing the secondary structure elements conserved across the superfamily. The residues involved in catalysis are shown in the ball-and-stick representation. (B) Schematic representation of the secondary structure topology of the 2H phosphoesterase domain. β strands are represented as arrows, while the α helices are rods. Secondary structural element numbering is based on ascending order from the N-terminal end. Side chains comprising the catalytic core are shown in greater detail. Inserts and sequence synapomorphies are shown with the number of residues in inserts given in brackets. Note the two topologically similar and equivalent structural units.
Classification and phyletic distribution of 2H-phosphoesterases
| Group / family | Bacteria | Archaea | Eukaryotes and viruses |
|---|---|---|---|
| Group I: Archaeo-bacterial LigT | | | |
| Family 1 LigT/2′-5′ RNA ligase | Atu<sup>a</sup>, Aae<sup>b</sup>, Bmel, Bs, Bst, Ccr, Cte, Cgl, Drad, Ec, St, Psa, Rsol, Mlo, Mle, Mtu, Scoe, Sme, Tm<sup>b</sup>, Thte<sup>b</sup>, Xax, Xca, Yp | Ap, Sso, Pyae, Af, Mta, Mac, Mkan, Mma, Pa, Ph, Tac, Tvo | – |
| Family 2 tRNA-ligase-C-terminal domain-like | – | – | Cal, Sp, Sc |
| Divergent members in Group I | | | |
| Bacteriophage T4-like | | | Bacteriophage T4 |
| Group II: Eukaryotic LigT | | | |
| Family 1 Eukaryotic LigT-like family | – | – | Tc, Ag, Cpa, Mm, Nc, Ce, Os, At, Hs, Dm, Sp |
| Family 2 RNA virus LigT-like family | – | – | BV, HCoV, MHV, ARV, HRV |
| Family 3 | – | – | At, Dm, Ehi, Fr, Hs, Sp |
| Group III (YjcG-like) | | | |
| Family 1 YjcG-like | Ana, Bs, Cac, Drad, Sa, Scoe, Tel | – | – |
| Family 2 mll4975-like α-proteobacterial family | Atu, Bmel, Mlo, Sme | – | – |
| Family 3 2H domains in UBASH3A proteins | – | – | Dm, Hs |
| Family 4 At5g40190-like plant-specific family | | | At |
| Group IV (mlr3352-like) | | | |
| Family 1 mlr3352-like family | Atu, Ccr, Mlo, Spn, Ssp | – | CIV |
| Divergent members of the 2H superfamily | | | |
| Family 1 Brain phosphodiesterase | – | – | Vertebrates |
| Family 2 Piscine retrovirus-polyprotein associated | – | – | SRV, WDSV, WEHV1, WEHV2, ZRV |
| Plant CPDases | – | – | At |
| DNA virus | – | – | FPV |
| DNA virus | | | WssV |
| Cpd1p | – | – | Sc |
| Faci_p1766-like | – | Feac | – |
| DD00921-like | – | – | Ddi |
| Cgl1020-like | Cgl | – | – |
| MM1887-like | – | Mma | – |
<sup>a</sup>Organism abbreviations: Aae, A.aeolicus; Af, A.fulgidus; Ag, Anopheles gambiae; Ana, Anabaena sp. PCC 7120; Ap, A.pernix; ARV, Avian rotavirus; At, A.thaliana; Atu, A.tumefaciens; Bmel, B.melitensis; Bs, B.subtilis; Bst, B.stearothermophilus; BV, Berne virus; Ca, C.auratus; Cal, Candida albicans; Cac, C.acetobutylicum; Ccr, C.crescentus; Ce, C.elegans; Cgl, C.glutamicum; CIV, Chilo iridescent virus; Cpa, C.parvum; Cte, Chlorobium tepidum; Ddi, D.discoideum; Dm, D.melanogaster; Drad, D.radiodurans; Ec, E.coli; Ehi, E.histolytica; Feac, F.acidarmanus; Fr, Fugu rubripes; FPV, Fowlpox virus; HCoV, Human coronavirus; HRV, Human rotavirus; Hs, H.sapiens; MHV, Mouse hepatitis virus; Mac, Methanosarcina acetivorans; Mma, M.mazei Goe1; Mj, M.jannaschii; Mkan, M.kandleri; Mle, Mycobacterium leprae; Mlo, M.loti; Mm, M.musculus; Mta, M.thermautotrophicus; Mtu, M.tuberculosis; Nc, Neurospora crassa; Os, Oryza sativa; Pa, P.abyssi; Ph, P.horikoshii; Psa, P.aeruginosa; Pyae, P.aerophilum; Rsol, R.solanacearum; Sa, S.aureus; Sc, S.cerevisiae; Scoe, S.coelicolor; Sme, S.meliloti; Sp, S.pombe; Spn, S.pneumoniae; SRV, Snakehead retrovirus; Sso, S.solfataricus; Ssp, Synechocystis sp. PCC 6803; St, Salmonella typhi; T4, Bacteriophage T4; Tac, T.acidophilum; Tc, Trypanosoma cruzi; Tel, Thermosynechococcus elongatus; Tm, T.maritima; Thte, Thermoanaerobacter tengcongensis; Tvo, Thermoplasma volcanium; WDSV, Walleye dermal sarcoma virus; WEHV1, Walleye epidermal hyperplasia virus type 1; WEHV2, Walleye epidermal hyperplasia virus type 2; WssV, Shrimp white spot syndrome virus; Xax, Xanthomonas axonopodis; Xca, Xanthomonas campestris; Yp, Yersinia pestis; ZRV, Zebrafish endogenous retrovirus.
<sup>b</sup>2H phosphodiesterase of the archaeal LigT family.
Figure 3. Maximum-likelihood phylogenetic tree, domain architectures and operon organization of 2H proteins. All branches with RELL bootstrap support <50% are collapsed and the values that support a node are shown in the remaining cases. Conserved gene neighborhoods (operons) that are discussed in the text are represented by boxed arrows with the gene names written within. Domain abbreviations are as follows: 2H, 2H phosphoesterase; KH, K homology; UBA, ubiquitin associated; SH3, Src homology 3; PGAM, phosphoglycerate mutase; PNK, P-loop nucleotide kinase; ZK, zinc knuckle; rvp, retroviral aspartyl protease; INT, integrase; Hismacro, phosphoesterase domain found in Macro histone 2 and the Appr-1″-p processing enzyme. The Mj1316 domain is a predicted RNA binding domain typified by the MJ1316 protein of Methanococcus. The species abbreviations are as in Figure 1.
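The collapsing of weakly supported branches mentioned in the Figure 3 legend can be sketched as follows. This is only an illustration of the general idea, under stated assumptions: the `Node` class, the `collapse` and `to_newick` helpers and the toy tree with its support values are all hypothetical, and the source does not describe which software or procedure was actually used. An internal node whose support falls below the cutoff is removed and its children are attached directly to its parent, producing a multifurcation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """Hypothetical tree node; `support` refers to the branch above the node."""
    name: Optional[str] = None
    support: Optional[float] = None
    children: List["Node"] = field(default_factory=list)

def collapse(node: Node, min_support: float = 50.0) -> Node:
    """Collapse internal branches whose support is below `min_support`.

    Children are processed first (bottom-up), then any weakly supported
    internal child is replaced by its own children.
    """
    new_children: List[Node] = []
    for child in node.children:
        collapse(child, min_support)
        is_internal = bool(child.children)
        is_weak = child.support is not None and child.support < min_support
        if is_internal and is_weak:
            new_children.extend(child.children)  # splice grandchildren in
        else:
            new_children.append(child)
    node.children = new_children
    return node

def to_newick(node: Node) -> str:
    """Serialize the tree, writing support values as internal node labels."""
    if not node.children:
        return node.name or ""
    inner = ",".join(to_newick(c) for c in node.children)
    label = "" if node.support is None else f"{node.support:g}"
    return f"({inner}){label}"

# Invented example: one clade with 90% support, one with 30% support.
tree = Node(children=[
    Node(support=90, children=[Node("A"), Node("B")]),
    Node(support=30, children=[Node("C"), Node("D")]),
])
print(to_newick(collapse(tree)))
```

In this toy tree the (C,D) grouping falls below the cutoff, so C and D end up attached directly to the root while the well-supported (A,B) clade keeps its support label.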
Figure 4. Surface view of family-specific conserved residues in different 2H phosphoesterase families. The family-specific conserved areas are shown in blue and the catalytic residues are shown in red. Note the pocket forming the active site. (A) Archaeo-bacterial LigT-like group. (B) CGI-18-like eukaryotic LigT proteins. (C) Top view of the same. (D) CG16790 family. (E) YjcB family.