In eukaryotes and the viruses that infect them, the 5' end of mRNA molecules, and of many other functionally important RNAs, is modified to form a so-called cap structure that is important for interactions of these RNAs with many nuclear and cytoplasmic proteins. The RNA cap has multiple roles in gene expression, including enhancement of RNA stability, splicing, nucleocytoplasmic transport, and translation initiation. Apart from guanosine addition to the 5' end in the most typical cap structure common to transcripts produced by RNA polymerase II (in particular mRNA), essentially all cap modifications are due to methylation. The complexity of the cap structure and its formation can range from just a single methylation of the unprocessed 5' end of the primary transcript, as in mammalian U6 and 7SK, mouse B2, and plant U3 RNAs, to an elaborate m(7)Gpppm(6,6)AmpAmpCmpm(3)Um structure at the 5' end of processed RNA in trypanosomes, which is formed by as many as 8 methylation reactions. All enzymes responsible for methylation of the cap structure characterized to date belong to the same evolutionarily related and structurally similar Rossmann Fold Methyltransferase superfamily and use the same methyl group donor, S-adenosylmethionine; nevertheless, the enzymes exhibit interesting differences that are responsible for their distinct functions. This review focuses on the evolutionary classification of enzymes responsible for cap methylation in RNA, with emphasis on the sequence relationships and structural similarities and dissimilarities that provide the basis for understanding the mechanism of biosynthesis of different caps in cellular and viral RNAs. Particular attention is paid to the similarities and differences between methyltransferases from human cells and from human pathogens that may be helpful in the development of antiviral and antiparasitic drugs.
Nascent transcripts produced by RNA polymerases universally carry a 5′ triphosphate (5′ppp). Processed RNA molecules, such as rRNAs and tRNAs, generated from precursors whose 5′ segments were removed by nucleolytic cleavage, carry a 5′ monophosphate (5′p). In several types of cellular and viral RNAs, the 5′ end is further modified enzymatically to introduce various chemical structures that are collectively dubbed “5′ caps”. Such caps are absent in bacterial and archaeal transcripts.
The most typical and widely studied cap modification comprises the addition of an N7-methylguanosine (m7G) linked via an inverted 5′–5′ triphosphate bridge to the 5′-terminal nucleoside of the transcript. This structure, termed cap0, is a characteristic feature of transcripts that are produced by RNA polymerase II, such as messenger RNAs (mRNAs) of all eukaryotic organisms, and of many viral RNAs. It is typically introduced in sequential steps: (1) hydrolysis of the 5′ γ-phosphate of a nascent pre-mRNA to generate a 5′ diphosphate end; (2) transfer of a guanosine monophosphate (GMP) moiety; and (3) methylation of the guanine at the N7 position. The cap0 structure was shown to be essential for cell growth of Saccharomyces cerevisiae and survival of mammalian cells; it is critical for mRNA interactions with many nuclear and cytoplasmic proteins and has multiple important roles in gene expression, including enhancement of RNA stability, splicing, nucleocytoplasmic transport, and translation initiation. Enzymes responsible for cap0 formation have been well characterized in many organisms and viruses.
In many instances, m7G-capped RNAs are modified further, in particular by additional methylation steps at the cap0 guanosine or methylation of the first few transcribed nucleoside residues.
For instance, the cap0 guanosine is modified by addition of 2 methyl groups at the N2 position, yielding a trimethylguanosine (m2,2,7G or TMG) cap, in some small nuclear RNAs (snRNAs) and small nucleolar RNAs (snoRNAs) required for pre-mRNA splicing (e.g., U1, U2, U4, and U5), pre-rRNA processing (U3 and U8), and telomere addition (telomerase RNA), as well as in several selenoprotein mRNAs.
In higher eukaryotes, the 5′ ends of mRNA and snRNA are modified further by ribose 2′-O-methylation on the first and second transcribed nucleosides, yielding cap1 and cap2 modifications, respectively. In humans, cap0 and cap1 methylations are found on all mRNA molecules, while about half of the capped and polyadenylated RNA molecules contain a 2′-O-methylated residue at the second transcribed position. The U1, U2, U4, and U5 snRNAs are methylated at both of the first 2 positions. Cap1 and cap2 methylations in U2 snRNA are required for the formation of the spliceosomal E-complex and, as a consequence, for efficient pre-mRNA splicing. In some organisms, such as trypanosomes, as many as the first 4 residues of the nascent transcript undergo ribose methylation, to generate the cap4 structure. These additional methylation steps are often important for RNA processing, translation and stability, although their role has not been fully elucidated.
Alternative capping pathways have evolved in certain viruses. For instance, in Alphaviruses, the precursor of cap0 is first methylated to an m7G triphosphate and only then connected to the 5′ end of the RNA.
Nonsegmented negative-sense (nsNS) RNA viruses have evolved a different mechanism for mRNA cap formation: the guanylyltransferase transfers GDP rather than GMP onto the 5′ end of the RNA, the resulting cap structure is first monomethylated on the ribose of the first transcribed residue (yielding a GpppAm structure), and only later is the guanosine methylated to m7G.
In addition to ribose 2′-O-methylation, base moieties of the first transcribed nucleosides may be methylated, thereby increasing the catalog of 5′ cap structures. In particular, the first adenine nucleoside of the transcript is often methylated at the N6 position. In trypanosomes, the fourth residue, a uridine, is also methylated at the N3 position. The role of these base methylations is unclear, and the enzymes responsible for these modifications remain to be characterized.
Some small RNAs, including mammalian U6 and 7SK, mouse B2, and plant U3, present a completely different 5′ cap structure, which is chemically minimalistic compared to the elaborate guanosine cap. This alternative cap is generated by methylation of a γ-phosphate oxygen at the unprocessed 5′ end of the primary transcript.
It is clear that apart from guanosine addition to the 5′ end, essentially all cap modifications are due to methylations. The cap structure of mRNAs in trypanosomes, m7Gpppm6,6AmpAmpCmpm3Um, is formed with as many as 8 methylation steps. In all cases that have been experimentally characterized to date, methylations of caps in all organisms and viruses are catalyzed by S-adenosyl-L-methionine (SAM)-dependent methyltransferases (Table 1). For the most common types of methylation reactions implicated in cap modification, the crystal structures of the representative proteins have been determined (Table 2). All cap methyltransferases characterized structurally belong to the Rossmann Fold Methyltransferase (RFM) superfamily.
The topology of the RFM fold is very similar to the typical Rossmann fold (↓6-↓5-↓4-↓1-↓2-↓3), with an additional, 7th β-strand inserted into the sheet in an antiparallel manner (↓6-↑7-↓5-↓4-↓1-↓2-↓3) (Fig. 1). The methyl group donor (SAM) binding site is formed by loops following strands 1, 2, and 3, while the substrate to be methylated is typically bound by loops following strands 4, 5, and 6. Various families of RFM enzymes exhibit fusions with other domains, extensions of termini, and insertions within the conserved RFM domain, in particular following strand 5. These elaborations of the common fold are often involved in substrate binding or in oligomerization.
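The count of 8 methylation steps quoted above for the trypanosome cap can be checked by simple bookkeeping. The following is a minimal sketch in Python, assuming that each methyl group in m7Gpppm6,6AmpAmpCmpm3Um corresponds to one methylation step; the dictionary layout and the function names are illustrative and not taken from the literature.

```python
# Methyl groups in the trypanosome cap4 structure m7Gpppm6,6AmpAmpCmpm3Um.
# Keys are positions (0 = inverted cap guanosine, 1-4 = transcribed residues);
# values list the methylated atoms at that position.
CAP4_METHYLATIONS = {
    0: ["N7"],                # m7G
    1: ["N6", "N6", "2'-O"],  # m6,6Am
    2: ["2'-O"],              # Am
    3: ["2'-O"],              # Cm
    4: ["N3", "2'-O"],        # m3Um
}


def count_methylations(cap):
    """Total number of methyl groups in a cap description."""
    return sum(len(sites) for sites in cap.values())


def count_by_type(cap):
    """Split the methyl groups into ribose (2'-O) and base methylations."""
    ribose = sum(site == "2'-O" for sites in cap.values() for site in sites)
    return {"ribose": ribose, "base": count_methylations(cap) - ribose}


print(count_methylations(CAP4_METHYLATIONS))  # 8
print(count_by_type(CAP4_METHYLATIONS))       # {'ribose': 4, 'base': 4}
```

The 4 ribose methylations correspond to the cap1 to cap4 positions, and the 4 base methylations to N7 of the cap guanosine, the 2 methyl groups on N6 of the first adenosine, and N3 of the fourth residue.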
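Table 1.
Representative cellular and viral cap methyltransferases with experimentally characterized RNA cap methyltransferase activities. The enzymes, for which crystal structures were determined, are shown in bold.
| Species | cap0 (base) | TMG (base) | cap1 (2′-O-ribose) | cap2 (2′-O-ribose) | cap3/cap4 (2′-O-ribose) | γ-phosphate (other) |
| --- | --- | --- | --- | --- | --- | --- |
| Cellular enzymes | | | | | | |
| Homo sapiens | RNMT (refs. 7, 26) | TGS1 (refs. 69, 70) | CMTr1 (ref. 38) | CMTr2 (ref. 46) | | BCDIN3 (ref. 82) |
| Saccharomyces cerevisiae | Abd1 (ref. 2) | Tgs1 (ref. 64) | | | | |
| Encephalitozoon cuniculi | Ecm1 (ref. 25) | | | | | |
| Giardia lamblia | | Tgs1 (ref. 65), Tgs2 (ref. 68) | | | | |
| Trypanosoma brucei | TbCmt1 (ref. 27), TbCgm1 (ref. 83) | TbTgs1 (ref. 84) | TbMTr1 (refs. 41, 42) | TbMTr2 (ref. 27) | TbMTr3 (ref. 47) | |
| Viral enzymes | | | | | | |
| Vaccinia virus | D1/D12 (refs. 28, 29) | | VP39 (ref. 85) | | | |
| Flavivirus | NS5 (ref. 60) | | NS5 (ref. 60) | | | |
| Vesicular stomatitis virus | L protein (ref. 52) | | L protein (ref. 52) | | | |
| Reovirus | lambda 2 (ref. 49) | | lambda 2 (ref. 49) | | | |
| Bluetongue virus | VP4 (ref. 51) | | VP4 (ref. 51) | | | |
| SARS-Coronavirus | nsp14 (ref. 34) | | nsp16/nsp10 (ref. 43) | | | |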
Table 2.
Experimentally determined structures of cap-specific methyltransferases. “cap0 + cap1” indicates 2 activities encoded in separate domains in one polypeptide, while “cap0/cap1” indicates 2 activities associated with one domain. An “a” in the reference column indicates structures available in the PDB, for which no corresponding articles are available in the literature.
| MTase type | Organism / virus | Protein | Ligand 1 | Ligand 2 | PDB | Ref. |
| --- | --- | --- | --- | --- | --- | --- |
| cap0 (m7G) | Homo sapiens | RNMT | sinefungin | — | 3epp | a |
| cap0 (m7G) | Homo sapiens | RNMT | SAH | — | 3bgv | a |
| cap0 (m7G) | Encephalitozoon cuniculi | RNMT | sinefungin | — | 2hv9 | 86 |
| cap0 (m7G) | Encephalitozoon cuniculi | RNMT | AzoSAM | — | 1z3c | 87 |
| cap0 (m7G) | Encephalitozoon cuniculi | RNMT | SAH | m7GpppG | 1ri1 | 25 |
| cap0 (m7G) | Encephalitozoon cuniculi | RNMT | — | m7GpppG | 1ri2 | 25 |
| cap0 (m7G) | Encephalitozoon cuniculi | RNMT | SAH | — | 1ri3 | 25 |
| cap0 (m7G) | Encephalitozoon cuniculi | RNMT | SAM | — | 1ri4 | 25 |
| cap0 (m7G) | Encephalitozoon cuniculi | RNMT | — | — | 1ri5 | 25 |
| cap0 (m7G) | vaccinia virus | D1, D12 | SAH | — | 2vdw | 33 |
| cap0 (m7G) | vaccinia virus | D1, D12 | SAH | — | 4cke | 88 |
| cap0 (m7G) | vaccinia virus | D1, D12 | SAH | GTP | 4ckb | 88 |
| cap0 (m7G) | vaccinia virus | D1, D12 | SAH | — | 4ckc | 88 |
| cap1 (XpppNm) | Homo sapiens | CMTr1 | SAM | m7GpppGAUC | 4n48 | 39 |
| cap1 (XpppNm) | Homo sapiens | CMTr1 | SAM | m7GpppG | 4n49 | 39 |
| cap1 (XpppNm) | Homo sapiens | CMTr1 | — | — | 4n4a | 39 |
| cap1 (XpppNm) | vaccinia virus | VP39 | SAH | m7GpppGAAAAA | 1av6 | 36 |
| cap1 (XpppNm) | vaccinia virus | VP39-dC26 | — | m3Ade | 3mag | 89 |
| cap1 (XpppNm) | vaccinia virus | VP39-dC26 | — | m1Ade | 1b42 | 89 |
| cap1 (XpppNm) | vaccinia virus | VP39-dC26 | — | m3Cyt | 3mct | 89 |
| cap1 (XpppNm) | vaccinia virus | VP39-dC26 | — | m1Cyt | 1bky | 89 |
| cap1 (XpppNm) | vaccinia virus | VP39-D182A | — | m7G | 4dcg | 89 |
| cap1 (XpppNm) | vaccinia virus | VP39-E233Q | — | m7G | 1eqa | 89 |
| cap1 (XpppNm) | vaccinia virus | VP39-E233A | — | — | 1eam | 89 |
| cap1 (XpppNm) | vaccinia virus | VP39 | — | — | 1vp3 | 90 |
| cap1 (XpppNm) | vaccinia virus | VP39-dC26 | — | — | 1vp9 | 90 |
| cap1 (XpppNm) | vaccinia virus | VP39-dC26 | — | m7GpppG | 1v39 | 90 |
| cap1 (XpppNm) | vaccinia virus | VP39-dC26 | — | m7GpppG | 1p39 | 90 |
| cap1 (XpppNm) | vaccinia virus | VP39-dC26 | — | m7GDP | 2vp3 | 90 |
| cap1 (XpppNm) | vaccinia virus | VP39 | SAM | — | 1vpt | 35 |
| cap1 (XpppNm) | vaccinia virus | VP39 | SAH | m7,9G | 1jsz | 91 |
| cap1 (XpppNm) | vaccinia virus | VP39 | SAH | — | 1jte | 91 |
| cap1 (XpppNm) | vaccinia virus | VP39 | SAH | m7GpppG | 1jtf | 91 |
| cap1 (XpppNm) | SARS virus | ns10-ns16 | SAM | — | 3r24 | 43 |
| cap0 + cap1 | reovirus | lambda2 | — | — | 1ej6 | 49 |
| cap0 + cap1 | bluetongue virus | VP4 | — | GpppG | 2jha | 51 |
| cap0 + cap1 | bluetongue virus | VP4 | SAH | — | 2jhp | 51 |
| cap0 + cap1 | bluetongue virus | VP4 | — | m7GDP | 2jh8 | 51 |
| cap0 + cap1 | bluetongue virus | VP4 | — | GTP | 2jh9 | 51 |
| cap0 + cap1 | bluetongue virus | VP4 | — | — | 2jhc | 51 |
| cap0/cap1 | West Nile virus | NS5 | SAH | — | 2oy0 | 60 |
| cap0/cap1 | Wesselsbron virus | wv-MTase | SAM | m7GpppG | 3emb | 61 |
| cap0/cap1 | Wesselsbron virus | wv-MTase | SAM | GpppG | 3elw | 61 |
| cap0/cap1 | Wesselsbron virus | wv-MTase | SAM | — | 3elu | 61 |
| cap0/cap1 | Wesselsbron virus | wv-MTase | SAH | — | 3ely | 61 |
| cap0/cap1 | Wesselsbron virus | wv-MTase | sinefungin | — | 3eld | 61 |
| cap0/cap1 | Wesselsbron virus | wv-MTase | sinefungin | m7GpppG | 3emd | 61 |
| cap0/cap1 | Meaban virus | mvMTase | SAH | — | 2oxt | 62 |
| cap0/cap1 | Murray Valley enc. virus | NS5 | SAH | — | 2px2 | 63 |
| cap0/cap1 | Murray Valley enc. virus | NS5 | SAH | — | 2px4 | 63 |
| cap0/cap1 | Murray Valley enc. virus | NS5 | SAH | — | 2px5 | 63 |
| cap0/cap1 | Murray Valley enc. virus | NS5 | SAH | m7GTP | 2px8 | 63 |
| cap0/cap1 | Murray Valley enc. virus | NS5 | SAH | GpppG | 2pxa | 63 |
| cap0/cap1 | Murray Valley enc. virus | NS5 | SAM | GpppA | 2pxc | 63 |
| cap0/cap1 | Dengue virus | NS5 | — | m7GpppA | 2p3o | 59 |
| cap0/cap1 | Dengue virus | NS5 | — | m7GpppG | 2p40 | 59 |
| cap0/cap1 | Dengue virus | NS5 | — | m7GpppGm | 2p41 | 59 |
| cap0/cap1 | Dengue virus | NS5 | — | GpppA | 2p3l | 59 |
| cap0/cap1 | Dengue virus | NS5 | — | GpppG | 2p3q | 59 |
| cap0/cap1 | Dengue virus type 2 | NS5 | SAH | — | 1l9k | 57 |
| cap0/cap1 | Dengue virus type 2 | NS5 | SAH | GMP | 2p1d | 57 |
| cap0/cap1 | Dengue virus type 2 | NS5 | SAH | ribavirin | 1r6a | 92 |
| TMG (m2,2,7G) | Homo sapiens | TGS1 | SAH | m7GpppG | 3gdh | 69 |
| TMG (m2,2,7G) | Homo sapiens | TGS1 | — | m7GpppA | 3egi | 70 |
| mpppN | Homo sapiens | BCDIN3 | SAM | — | 3g07 | a |
Figure 1.
Schematic representation of the conserved core of Rossmann-fold Methyltransferase (RFM) catalytic domains. The β-sheet, composed of 7 β-strands (gray arrows) and surrounded by 6 α-helices (semi-transparent tubes), forms the fold that is typical for SAM-dependent methyltransferases. All secondary structure elements of the conserved core are labeled as α1, β1, etc. The stars indicate points of most frequent insertions and terminal fusions with other domains.
In this review, we discuss cellular and viral methyltransferases involved in RNA 5′ cap biosynthesis, with emphasis on sequence–structure relationships in the light of the experimentally determined structures of enzymes complexed with their ligands. We focus on comparison of enzymes with similar activities that generate products with chemically similar structures.
Throughout the article, we follow the nomenclature of cap-modifying enzymes and their products commonly used in the literature. We use the terms “capX methylation” and “capX methyltransferase” (where X is a number) to refer to an enzymatic activity or its product at a particular position X. On the other hand, the term “capX structure” is used to refer to a fully modified cap structure. For instance, “cap2 structure” indicates a cap methylated on the inverted guanosine and on the first 2 ribose sugars in the nucleotide sequence; i.e., m7GpppN1mN2m.
It should be emphasized that some of the cap methyltransferases discussed here have not yet been fully characterized and it cannot be ruled out that they act at multiple positions.
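The “capX structure” nomenclature described above can be expressed compactly. The following is a minimal sketch in Python; the function name `cap_structure` is hypothetical and serves only to illustrate how the notation for a fully modified cap is built up from the N7-methylated guanosine and X consecutive 2′-O-methylated residues.

```python
def cap_structure(x):
    """Return the notation for a fully modified 'capX structure'.

    x = 0 gives m7GpppN (cap0); x = 2 gives m7GpppN1mN2m, i.e. a cap
    methylated on the inverted guanosine and on the first 2 riboses.
    """
    if x < 0:
        raise ValueError("x must be a non-negative integer")
    if x == 0:
        return "m7GpppN"
    # Each of the first x transcribed residues carries a 2'-O-methyl group ("m").
    riboses = "".join(f"N{i}m" for i in range(1, x + 1))
    return "m7Gppp" + riboses


for x in range(5):
    print(f"cap{x} structure: {cap_structure(x)}")
```

Note that this notation covers only the N7 and ribose methylations; base methylations of the transcribed residues, such as those found in trypanosomes, are not represented.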
Cap-specific m7G methyltransferases
Cellular and viral RNA cap guanine-N7-methyltransferases methylate RNA with the GpppN 5′ terminus to form an m7GpppN (cap0) structure. Eukaryotic enzymes catalyze this reaction in the nucleus. Many viruses, however, replicate in the cytoplasm of their eukaryotic host, where the cellular capping machinery is not accessible to their RNAs; hence, these viruses have evolved their own capping enzymes to form a cap structure that can be recognized by the cellular translation machinery for gene expression. Examples include Flaviviridae, Nidovirales, Mononegavirales and Poxviridae (reviewed in ref. 12). While the cellular and viral mRNA capping apparatuses are functionally similar, the enzyme organization differs greatly across evolution.
The Abd1 protein from Saccharomyces cerevisiae is a monofunctional cap0 methyltransferase, and biochemically it has been one of the best studied methyltransferases involved in cap structure biosynthesis. Its enzymatic activity is critical for yeast cell growth and the gene ABD1 that encodes the Abd1 protein is essential. Abd1 is a founding member of a protein family that is strongly conserved in eukaryotes as well as in viruses. The crystal structure of S. cerevisiae Abd1 itself could not be determined, but a structure was eventually solved for its homolog, the Ecm1 protein from E. cuniculi (Fig. 2). Purified Ecm1 is a monomeric protein that catalyzes methyl transfer to GpppRNA to form cap0, but also to the free mononucleotides GTP, GDP or dGTP (deoxy-GTP). The methyltransferase domain in Ecm1, and by inference also in other homologous cap0 methyltransferases, exhibits the RFM fold with a characteristic insertion that forms a β-meander structure involved in the formation of the cap-binding site. This insertion is common to methyltransferases that methylate G to m7G in the RNA cap and methyltransferases that N-methylate the amino acid glycine.
This relationship between 2 different types of methyltransferases, as well as the cap0 methyltransferase structure, was correctly predicted using bioinformatics before the first structure of a cap0 methyltransferase was determined.
Figure 2.
Crystal structure of the cap0 methyltransferase from Encephalitozoon cuniculi. A stick representation of the ligands bound to cap0 methyltransferase. The guanosine cap analog position was defined based on the structure deposited as 1RI2 in the PDB, and the methyl group donor position was depicted based on the structure deposited as 1RI4 in the PDB. Secondary structure elements that correspond to elements of the conserved RFM core are labeled. Secondary structure elements outside of the conserved core are not labeled.
In the human capping system, the cap0 methyltransferase (RNMT) consists of a catalytic subunit related to Abd1 and an obligate activating subunit, RAM (RNMT-activating miniprotein). The C-terminal catalytic domain of RNMT has essentially the same structure as Abd1. RNMT also has an N-terminal domain that is conserved in mammals, but not required for catalytic activity. However, it contains 2 nuclear localization signal motifs, and the nuclear localization of RNMT is essential for cell viability. Cap0 methyltransferases belonging to the above-mentioned family have also been identified and characterized in other eukaryotes, for example TbCmt1 in Trypanosoma brucei.
As mentioned above, the viral cap0 methyltransferases possess a catalytic domain that is closely related to the eukaryotic cap0 methyltransferases, but it often functions in the context of other domains. For instance, vaccinia virus possesses an enzyme composed of the D1 and D12 polypeptides that executes all 3 steps in cap0 biosynthesis. The D1 subunit contains triphosphatase and guanylyltransferase activities in the N-terminal domain, and a cap0 methyltransferase domain that forms a heterodimer with the D12 subunit. The methyltransferase active site is located entirely in the D1 subunit and has a weak cap0 modification activity that is stimulated allosterically by D12.
Interestingly, the D12 structure resembles a degenerate cap 2′-O-ribose methyltransferase domain (see below), but it lacks a proper SAM binding site and does not show any methyltransferase activity on its own.
In the SARS coronavirus, nonstructural protein 14 (nsp14) was initially identified as an exoribonuclease (and termed ExoN). Later, it was shown that it also exhibits cap0 methyltransferase activity. Analysis of protein variants with substitutions of conserved residues in the ExoN (N-terminal) and methyltransferase (C-terminal) domains revealed that the 2 active sites are functionally distinct; however, the integrity of the ExoN domain turned out to be essential for the function of the cap0 methyltransferase domain. Nsp14 shows little sequence similarity to known methyltransferases and its structure has not been determined experimentally; hence, its phylogenetic relationships to other enzymes remain unclear.
Cap-specific 2′-O-ribose methyltransferases
A poxvirus cap1-forming enzyme, the VP39 protein from vaccinia virus, was the first methyltransferase involved in cap structure formation for which a crystal structure was determined, and also the first for which the structure of a ternary complex of the enzyme with the cofactor and an RNA substrate was determined. It has become one of the best studied members of a large family of methyltransferases that act on the ribose 2′-OH group in RNA, which also includes enzymes such as RrmJ and fibrillarin. Although they share little sequence identity with each other, these 2′-O-ribose methyltransferases are characterized by the conserved tertiary fold typical of all RFM enzymes and by a conserved K-D-K catalytic triad located between the methyl group donor binding site and the cap binding site. VP39 is a single-domain protein with additional structural elements at both the N- and C-termini, which wrap around the RFM core and form a binding pocket for the cap. In the ternary complex, the m7G base of the cap is sandwiched between 2 aromatic side chains and oriented in such a way that the Hoogsteen edge, modified by addition of the methyl group on N7, faces the protein. This explains the ability of VP39 to sense the methylation status of the substrate, which is the basis of its preference for substrates that already have an N7-methylated cap.
In humans, cap1 formation is catalyzed by the CMTr1 enzyme. It is composed of several domains, including the N-terminal catalytic RFM domain with a conserved K-D-K triad characteristic for 2′-O-ribose methyltransferases and a guanylyltransferase-like domain that lacks catalytic residues. The N-terminal domain of CMTr1 shares a global architecture with the VP39 protein and is sufficient for cap1 activity in vitro.
Interestingly, while the cofactor-binding sites, active sites, and the sites of binding of the nascent RNA chain are similar in the VP39 and CMTr1 enzymes (and likewise the conformations of the respective ligands), their cap-binding sites exhibit large differences in the shape of the m7G-binding pocket. As a result, CMTr1 binds m7G in a different way, in which the sugar edge of the cap guanosine faces the protein, and the methyl group on N7 faces the solvent (Fig. 3). These structural differences explain why CMTr1 is relatively insensitive to the absence of cap0 methylation and therefore is able to act, at least in vitro, on substrates with unmethylated guanosine.
Figure 3.
Comparison of the crystal structures of 2′-O-ribose methyltransferases. (A) Superimposition of the catalytic domain of human CMTr1 methyltransferase (colored black; PDB ID: 4N48), VP39 methyltransferase from the vaccinia virus (colored dark gray; PDB ID: 1AV6) and the NS5 protein from the Wesselsbron virus (colored light gray; PDB ID: 3EMB). The ligands are shown in stick representation and are colored to match the corresponding protein molecule. Secondary structure elements that correspond to elements of the conserved RFM core are labeled (α1, β6, and β7 are hidden behind other elements and their labels have been omitted). Secondary structure elements outside of the conserved core are not labeled. (B) The capped oligoribonucleotide (m7GpppGAUC) located in its binding pocket on the surface of human CMTr1 MTase is shown in stick representation. The side chains of Phe206 and Glu373 that correspond to stacking residues in viral methyltransferases and the 3 catalytic residues are also displayed. (C) The crystal structure of the VP39 methyltransferase from vaccinia virus in complex with m7GpppGAAAAA (shown in stick representation). The methylated guanine ring is stacked between the 2 aromatic rings of Tyr22 and Phe180. (D) A stick representation of the cap0 structure analog m7GpppG bound by the NS5 flaviviral 2′-O-ribose methyltransferase.
Proteins with cap1 methyltransferase activities were also characterized in the alfalfa looper moth Autographa californica nucleopolyhedrovirus (orf69) and in T. brucei (TbMTr1). Both of these enzymes are relatively closely related to the human CMTr1 enzyme.
In the SARS virus, cap1 methylation is catalyzed by a complex of 2 partners: the nsp16 protein, which is clearly related to the above-mentioned cap1 methyltransferases but is inactive on its own, and a small regulatory protein, nsp10, which is required for nsp16 to bind both the SAM methyl group donor and the RNA substrate.
The crystal structure of the nsp10-nsp16 complex showed that, in nsp16, the SAM-binding region is partially degenerated compared to “partner-independent” ribose methyltransferases, and that nsp10 stabilizes the SAM binding pocket and extends the RNA-binding groove of nsp16.
Apart from the enzymes responsible for cap1 methylation, methyltransferases have been characterized that act on additional residues in the nascent RNA chain. Many eukaryotic organisms possess a 2′-O-ribose methyltransferase that methylates the second residue in mRNA and in other RNA molecules. The cap2 methyltransferase has been characterized in T. brucei (TbMTr2) and in humans (CMTr2). Interestingly, while CMTr2 appears to be closely related to its human paralog CMTr1 as well as to TbMTr1, TbMTr2 is more closely related to the vaccinia virus cap1 methyltransferase. In trypanosomes, a third 2′-O-ribose cap methyltransferase was identified and termed TbMTr3; it is responsible for the methylation of the third residue of the cap and is required for the methylation of the fourth residue. TbMTr3 is a close relative of TbMTr2 and of VP39, and is only remotely related to other eukaryotic cap 2′-O-ribose methyltransferases, which suggests that trypanosomes acquired enzymes for “additional” methylation by adapting proteins from viruses. A phylogenetic study of 2′-O-ribose methyltransferases revealed that the relationships between cellular and viral enzymes are quite complex, and that these proteins can vary greatly in number even in closely related organisms. Furthermore, alveolate species were identified that possess as many as 4 2′-O-ribose methyltransferases, suggesting that certain enzymes of this group may act with different substrate specificities or that new cap structures with additional methylation sites remain to be discovered.
Proteins with cap0 and cap1 methyltransferase activities
A number of viral proteins were reported to possess both cap0 and cap1 methyltransferase activities. In most of them, this is due to the presence of multiple domains. For instance, in the human reovirus (a virus with a dsRNA genome), cap structure formation is catalyzed by a large multidomain protein, lambda 2, which in turn is a part of the reovirus core: an assembly with a molecular mass of 52 MDa that synthesizes, modifies and exports viral mRNA. The structure of the human reovirus core has been solved at low resolution, revealing a series of domains that include a putative guanylyltransferase domain and 2 putative methyltransferase (RFM) domains. It has been suggested that the order of the domains in the lambda 2 protein corresponds to the order of the capping reactions: guanosine transfer followed by cap0 and cap1 methylation. However, comparison of domain structures suggested that the functional assignments may be different, as RFM domain 1 shares a putative active site with the structurally characterized 2′-O-ribose methyltransferases, including the cap1 methyltransferase, whereas RFM domain 2 exhibits structural similarity to the cap0 methyltransferases. It should be noted that the putative cap1 methyltransferase domain of reovirus exhibits a cap-binding platform formed by N- and C-terminal extensions similar to that in the VP39 and human CMTr1 enzymes; however, its putative m7G-binding site is more open.
In bluetongue virus, another member of the reoviruses, the structure of the VP4 protein revealed a multi-domain protein with an N-terminal guanylyltransferase domain and 2 RFM domains, of which one is inserted into the other.
The inserted RFM domain exhibited clear similarities to the cap1 methyltransferases and in 3 crystal forms had GpppG, m7GDP, or GTP bound in the position of the cap-binding site, while the other RFM domain exhibited low but significant similarity to known cap0 methyltransferase structures.
In the non-segmented, negative-sense single-stranded RNA viruses [order Mononegavirales (MNV)], which include pathogens such as respiratory syncytial virus, measles, mumps, rabies, parainfluenza, vesicular stomatitis virus (VSV), and Marburg and Ebola viruses, one of the common components of the viral ribonucleoprotein core is the large (L) protein, which encodes multiple functions such as the RNA-dependent RNA polymerase and activities responsible for mRNA capping, cap0 and cap1 methylation, poly(A) polymerase and protein kinase. Using bioinformatics methods, we and others predicted that the C-terminal region of that protein (conserved region VI) encodes a domain homologous to 2′-O-ribose methyltransferases and is likely to function as a cap1 methyltransferase. Later it was found that, in VSV, this region is essential not only for cap1, but also for cap0 methyltransferase activity, and that the same SAM-binding site and part of the K-D-K triad are used for both reactions. The structural basis of this phenomenon remains to be determined.
In Flaviviruses (positive-sense, single-stranded RNA viruses), an RFM domain with a similar dual methyltransferase function was identified. The N-terminal domain of nonstructural protein 5 (NS5) was first unambiguously characterized as a cap1 (2′-O-ribose) methyltransferase. Later, it was shown that this domain also takes part in cap0 (m7G) methylation, using the same SAM-binding site, during cap synthesis. Interestingly, in these viruses, the order of methylation is different from that in Mononegavirales, as cap0 methylation precedes cap1 methylation.
Several structures were determined for the flavivirus cap methyltransferases known or predicted to be bifunctional, including Dengue, West Nile, Wesselbron, Meaban, and Murray Valley encephalitis viruses and they all revealed high similarity to the cap1 methyltransferases, and little if any similarity to the classical cap0 methyltransferases. It should be noted that these methyltransferases share a similar cap-binding platform structure with VP39 and humanCMTr1 enzymes (a platform formed by N- and C-terminal extensions); however, the orientation of the bound guanosine residue suggests that their mode of cap-recognition is different from both poxvirus and human enzymes ().
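The difference in methylation order between flaviviruses and Mononegavirales can be illustrated with a small sketch. This is only a bookkeeping model, not a description of the enzymology: the cap is written as a plain string, the starting substrate `GpppA` is an arbitrary choice of first nucleotide, and the names `ORDERS`, `apply_step`, and `intermediates` are hypothetical helpers introduced here for illustration.

```python
# Illustrative model only: tracks cap notation, not reaction mechanisms.
# Order of the two methylation steps, as described in the text above.
ORDERS = {
    "flavivirus": ("cap0", "cap1"),       # m7G methylation first
    "mononegavirales": ("cap1", "cap0"),  # 2'-O-ribose methylation first
}


def apply_step(cap: str, step: str) -> str:
    """Apply one methylation step to a cap written as '<G>ppp<N>'."""
    guanosine, first_nt = cap.split("ppp")
    if step == "cap0":
        # N7 methylation of the cap guanosine
        guanosine = "m7G"
    elif step == "cap1":
        # 2'-O-ribose methylation of the first transcribed nucleotide
        if not first_nt.endswith("m"):
            first_nt = first_nt + "m"
    else:
        raise ValueError(f"unknown step: {step}")
    return f"{guanosine}ppp{first_nt}"


def intermediates(virus_group: str, start: str = "GpppA") -> list:
    """Return the cap notation after each step, including the start."""
    states = [start]
    for step in ORDERS[virus_group]:
        states.append(apply_step(states[-1], step))
    return states


if __name__ == "__main__":
    for group in ORDERS:
        print(group, " -> ".join(intermediates(group)))
```

In this model both pathways end at the same cap1 structure and differ only in the intermediate (`m7GpppA` versus `GpppAm`).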
Tgs1/Tgs2
The enzyme responsible for trimethylguanosine (m2,2,7G, TMG) synthesis was first identified in yeast and named yTgs1. The Tgs enzymes of budding and fission yeast and Giardia are relatively small polypeptides (239–315 amino acids) consisting of little more than an RFM methyltransferase catalytic domain, whereas metazoan Tgs1 proteins are much larger, because they include an N-terminal extension not found in lower eukaryotes. Tgs1 activity is strictly dependent on prior cap0 (m7G) methylation, which restricts its activity to RNAs that were already methylated by the cap0 methyltransferase. Similar substrate requirements are characteristic of the Giardia Tgs2 enzyme. Interestingly, in contrast to Tgs1 methyltransferases, which are able to catalyze 2 sequential N2 methylation steps leading to TMG cap formation, Tgs2 activity is apparently limited to a single round of N2 methylation, resulting in the synthesis of a 2,7-dimethylguanosine (m2,7G) product. Bioinformatics analyses predicted that the Tgs enzymes are related to a large group of RFM enzymes that act on exocyclic amine groups in nucleic acid bases, generating m6A, m4C, and m2G, and that have a characteristic NPPY-like motif at the active site. The crystal structure of the active C-terminal methyltransferase domain of the human TGS1 enzyme bound to a minimal substrate, m7GTP, as well as the reaction product SAH has been reported, confirming these predictions and revealing the atomic details of these protein-ligand interactions.
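The substrate requirement and the different extent of N2 methylation by Tgs1 and Tgs2 can be summarized in a minimal sketch. It assumes that each reaction runs to completion and ignores kinetics and RNA context; the function `tgs_product` and the two lookup tables are hypothetical names introduced only for this illustration.

```python
# Illustrative model only: assumes reactions run to completion.
# Maximum number of N2 methyl groups each enzyme can leave on the cap.
TGS_MAX_N2_METHYLS = {"Tgs1": 2, "Tgs2": 1}

# Cap guanosine names by number of N2 methyl groups (N7 already methylated).
CAP_NAMES = {0: "m7G", 1: "m2,7G", 2: "m2,2,7G"}


def tgs_product(enzyme: str, n7_methylated: bool, n2_methyls: int = 0) -> str:
    """Return the cap guanosine expected after incubation with the enzyme."""
    if n2_methyls not in CAP_NAMES:
        raise ValueError("n2_methyls must be 0, 1, or 2")
    if not n7_methylated:
        # Without prior cap0 (m7G) methylation the cap is not a substrate.
        return "G"
    final_n2 = max(n2_methyls, TGS_MAX_N2_METHYLS[enzyme])
    return CAP_NAMES[final_n2]


if __name__ == "__main__":
    print(tgs_product("Tgs1", n7_methylated=True))   # two rounds of N2 methylation
    print(tgs_product("Tgs2", n7_methylated=True))   # a single round
    print(tgs_product("Tgs1", n7_methylated=False))  # unmethylated cap is left alone
```

The only difference between the two enzymes in this model is the ceiling on N2 methyl groups, which mirrors the distinction drawn in the text.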
Figure 4.
Crystal structure of the human TGS1 protein. Trimethylguanosine synthase catalyzes hypermethylation of cap0 structure. In a 2-step reaction, 2 methyl groups are transferred to the amine group of m7G and, as a result, the m2,2,7G structure is formed. The crystal structure of human TGS1 methyltransferase in complex with m7Gppp and SAH (shown in stick representation) is deposited in the PDB as 3GDH. Secondary structure elements that correspond to elements of the conserved RFM core are labeled. Secondary structure elements outside of the conserved core are not labeled.
Other methyltransferases involved in cap-specific base modifications
Studies of the cap composition of human mRNAs conducted in the mid-1970s revealed that when the first nucleotide of the transcript is an adenosine, this base can be methylated to m6A. The enzyme that catalyzes the conversion of m7GpppAm ends of mRNA to m7Gpppm6Am has been isolated from a cytoplasmic fraction of HeLa cells. The isolated enzyme showed no activity toward internal adenosines. Recently, Schwartz and coworkers studied the m6A mRNA methylome following depletion of the multiprotein methyltransferase complex components METTL3, METTL14, KIAA1429, and WTAP, and implicated the METTL3, METTL14, and KIAA1429 proteins in m6A formation at internal sites but not at the 5′ sites. The full characterization of the cap-specific m6A methyltransferase activity requires further studies in vitro.
In trypanosomes, the first adenosine of the hypermethylated cap4 structure is not only methylated at the ribose, but also dimethylated at the N6 position, to form m6,6Am. The methyltransferase responsible for the latter reaction remains unknown. Further, the fourth nucleotide in that structure, a uridine, is modified to m3U, and the enzyme responsible for this modification also remains unknown.
Bin3/γ-methyltransferases
The γ-methylphosphate cap structure is unique in that it is an alternative to the guanosine-containing cap. It is formed by a single methyl transfer reaction to a γ-phosphate oxygen at the 5′ end of the primary transcripts of certain small RNA molecules, such as mammalian U6 and 7SK, mouse B2, and plant U3. The enzyme responsible for this reaction, Bicoid-interacting protein 3 (Bin3), is a methyltransferase conserved in eukaryotes. It is, however, absent from S. cerevisiae. A structure of the human Bin3 homolog (BCDIN3) was determined, revealing a conserved RFM core. An enzyme-substrate complex is not yet available, and the details of protein-RNA recognition and the mechanism of discrimination between Bin3 substrates and non-substrates remain to be determined.
Figure 5.
Crystal structure of human BCDIN3 γ-methyltransferase. SAM, shown in stick representation, is the donor of the methyl group that the BCDIN3 enzyme (PDB ID: 3G07) transfers to the 5′ γ-phosphate group of the 7SK snRNA molecule. Secondary structure elements that correspond to elements of the conserved RFM core are labeled (β5 is hidden behind α5 and therefore its label has been omitted). Secondary structure elements outside of the conserved core are not labeled.
Conclusions and Future Perspectives
In recent years, significant progress has been made in understanding the mechanism of formation of different RNA cap structures. This progress has been driven in particular by the identification and characterization of novel methyltransferases that take part in cap biosynthesis, and by the determination of their crystal structures. This knowledge also has a practical dimension, as the capping process is essential for eukaryotic cells as well as for the life cycle of viruses that infect them. In this context, the differences between the structures of the human enzymes and the enzymes from human pathogens could be exploited to develop new drugs. In particular, viruses that evolved alternative enzymes to synthesize the same cap structures as those synthesized by the human cellular machinery are attractive targets for the development of inhibitors that could specifically block viral methyltransferases.
To date, numerous high-resolution structures of viral RNA capping enzymes have been determined, in particular for cap methyltransferases from various flaviviruses, which have been considered an attractive new antiviral target. Based on knowledge of these structures, efforts have been made toward the identification of specific inhibitors of these enzymes. For instance, a structure-based search for new inhibitors was performed for the dengue virus methyltransferase. The development of compounds that specifically inhibit viral methyltransferases will be aided by the recent structure determination of the catalytic domain of the human cap1 methyltransferase, which shares the global architecture of the viral enzymes but exhibits a different cap-binding site. The human cap1 methyltransferase appears to be essential and cannot be knocked out in human cells (our unpublished data); therefore, inhibitors specific to that human enzyme could also be useful as tools to study the cellular function of cap1 methylation.
Studies of SL RNA maturation in trypanosomal parasites could benefit from structural analysis of trypanosomal methyltransferases. While the cap0 and cap1 methyltransferases in trypanosomes are relatively closely related to their human counterparts, bioinformatics analyses identified the cap2 and cap3/4 methyltransferases as close homologs of the vaccinia virus cap1 methyltransferase. While the analysis of protein-RNA interactions and the search for potential regulatory molecules (e.g., inhibitors) could be guided by the homology models developed so far, experimental determination of high-resolution structures of cap methyltransferases in trypanosomes would be particularly useful.
A complete understanding of RNA cap biosynthesis requires not only structure determination of the enzymes that are well characterized biochemically, but also the identification of the genes and proteins that make up the cap methylation machinery. Some of the prominent enzymatic activities known to exist that are still awaiting unequivocal identification of the corresponding proteins include m6A methylation of the first transcribed nucleoside of capped RNAs in humans, and m6,6A and m3U methylation of the first and the fourth residues in the cap4 structure of capped RNAs in trypanosomes. A comprehensive biochemical and structural characterization of these enzymes could further contribute to the development of new drugs against trypanosomal parasites and new tools to study RNA metabolism in human cells.