
Presence and role of cytosine methylation in DNA viruses of animals.

Karin Hoelzer, Laura A Shackelton, Colin R Parrish.

Abstract

Nucleotide composition varies greatly among DNA viruses of animals, yet the evolutionary pressures and biological mechanisms driving these patterns are unclear. One of the most striking discrepancies lies in the frequency of CpG (the dinucleotide CG, linked by a phosphate group), which is underrepresented in most small DNA viruses (those with genomes below 10 kb) but not in larger DNA viruses. Cytosine methylation might be partially responsible, but research on this topic has focused on a few virus groups. For several viruses that integrate their genome into the host genome, the methylation status during this stage has been studied extensively, and the relationship between methylation and viral-induced tumor formation has been examined carefully. However, for actively replicating viruses--particularly small DNA viruses--the methylation status of CpG motifs is rarely known and the effects on the viral life cycle are obscure. In vertebrate host genomes, most cytosines at CpG sites are methylated, which in vertebrates acts to regulate gene expression and facilitates the recognition of unmethylated, potentially pathogen-associated DNA. Here we briefly introduce cytosine methylation before reviewing what is currently known about CpG methylation in DNA viruses.


Year:  2008        PMID: 18367473      PMCID: PMC2396429          DOI: 10.1093/nar/gkn121

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

CpG underrepresentation in vertebrate genomes

The denotation ‘CpG’ is shorthand for a cytosine linked, through a phosphate bond, to a guanine. CpGs are underrepresented in most eukaryote genomes, but the frequency varies widely among species and is negatively correlated with the presence and extent of cytosine methylation in the genome (1,2). In vertebrate genomes, CpGs are present at one-third to one-fourth of the expected frequency, yet the reasons are disputed (1–4). Cytosine and guanine tend to have higher stacking energies than adenine and thymine, so structural constraints may be important in CpG avoidance (5). In many species, the proportion of tRNAs containing CpG in their anticodons is lower than that of tRNAs with other dinucleotides; therefore, the translation efficiency might be higher for codons not containing CpGs (6). Another explanation may lie in the fact that unmethylated CpGs can stimulate innate immune responses, potentially resulting in autoimmune reactions (7–9); therefore, large numbers of CpGs may be detrimental if not all are methylated. Finally, methylated cytosines have a tendency for spontaneous deamination, which may also account for the CpG depletion (10–12). While the deamination of unmethylated cytosines leads to uracil and can be corrected by cellular DNA repair machinery, the transition of methylated cytosines to thymines is irreversible, leading to elevated mutation frequencies in highly methylated genomes. In general, vertebrate genomic CpGs are highly methylated. Sixty to 90% of genomic CpGs are thought to be in a methylated state (13,14), but both CpG frequency and methylation patterns can vary widely across a single vertebrate genome. Notably, some regions are CpG-enriched yet practically devoid of methylation. These sequence stretches, termed ‘CpG islands’, are >500 bp in length and comprise ∼1% of total genomic DNA [e.g. the human genome contains more than 29 000 such islands, with estimates reaching as high as 45 000 islands per haploid genome (15,16)]. The islands are often associated with 5′ promoter regions of housekeeping genes [for reviews see, among others, (15,17,18)].
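The notion of CpGs occurring at a fraction of their "expected frequency" can be made concrete with a short calculation. The following is a minimal Python sketch, not the authors' script: the function name `cpg_obs_exp` is hypothetical, and the expected count assumes that C and G occur independently at their single-strand frequencies, which is one common convention among several.

```python
from collections import Counter


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio for one strand of a DNA sequence.

    Assumption: the expected CpG count is f(C) * f(G) * (N - 1), i.e. C and G
    are treated as independent and N - 1 is the number of dinucleotide slots.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    counts = Counter(seq)
    # Count every position where a C is immediately followed by a G.
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    expected = (counts["C"] / n) * (counts["G"] / n) * (n - 1)
    if expected == 0:
        raise ValueError("sequence lacks C or G; ratio is undefined")
    return observed / expected
```

Under this convention a ratio near 1 indicates no CpG bias, while the vertebrate values quoted above would correspond to ratios of roughly 0.25 to 0.33.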

Vertebrate CpG methyltransferases

Vertebrate genomes are methylated by DNA methyltransferases (DNMTs), which convert cytosine to 5-methylcytosine (5Me-cytosine). DNMTs are functionally divided into ‘de novo’ (DNMT3a and DNMT3b) and ‘maintenance’ (DNMT1) methyltransferases (Table 1). Their catalytic domains appear highly conserved across species and S-adenosyl methionine (SAM) appears to function as the only methyl donor [reviews of mammalian DNA methyltransferases can be found in (19–22)]. DNMT expression levels in nontransformed tissues have rarely been studied explicitly. The available data indicate the expression of all three DNMTs in the majority of tissues, but expression levels seem to vary between tissues, cell differentiation levels, host developmental stages and potentially host species. Robertson et al. (23) analyzed DNMT mRNA levels in various human tissues and found DNMT1, 3a and 3b expressed in nearly all analyzed fetal and adult tissues. In adult tissues, DNMT1 appeared to be generally more highly expressed than DNMT3a and 3b, and DNMT1 mRNA was detected in all analyzed tissues except the small intestine. DNMT3a and 3b mRNAs were detected in large quantities in heart, skeletal muscle, thymus, kidney, liver and peripheral blood mononuclear cells, but were present at low levels in all other analyzed tissues. DNMT1 and 3a seemed to be expressed at high levels in all analyzed fetal tissues, while DNMT3b was expressed at high levels only in fetal liver. The authors proposed fetal hepatic hematopoiesis as the reason for the observed high levels of DNMT3b in the fetal liver. In bovine fetuses, Golding et al. (24) found DNMT3b highly expressed in rumen, kidney, testes and lung, while the highest mRNA levels were again detected in the liver. Messenger RNAs of all three DNMTs were also detected in all of the adult bovine tissues analyzed, with the highest levels in kidney, brain and testes. Mizuno et al. (25) detected mRNAs for all three DNMTs in human neutrophils, monocytes, T lymphocytes, bone marrow cells and CD34-positive immune cells, but mRNA levels varied between cell types. DNMT1 appeared to be expressed at high levels in all cell types except neutrophils, while DNMT3a mRNA levels in neutrophils and T lymphocytes appeared high. DNMT3b mRNA levels were low in differentiated cells such as neutrophils, but high in bone marrow and particularly CD34-positive cells.
Table 1. Overview of vertebrate DNA methyltransferases and their functions

DNMT | Functional role | Reference
De novo DNMTs | |
DNMT3a | Embryonic development, methylation of CpG sites, meiosis induction in sperm(?) | (110–112)
DNMT3b | Embryonic development, spermatogenesis(?) | (110,113)
DNMT3L | Maternal genomic imprinting, silencing of retrotransposons in spermatogonial stem cells | (114)
Maintenance DNMTs | |
DNMT1 | Cellular maintenance methylation, maintenance of imprinting, silencing of mobile elements during genomic demethylation, contribution to histone deacetylases | (115–118)
DNMT2 | Unclear. Methylation in Drosophila melanogaster(?) | (119)

The enzymatic functions attributed to known DNMTs are summarized here, and some key references are provided. Some simplifications were made for the purpose of clarity, and the reader is referred to specific reviews of DNMTs, indicated in the text, for more detail. Cases where the functional role has been proposed, but not yet established conclusively, are indicated by question marks.

Methylation patterns are generally stable, sustained by DNMT1 and are inherited by both daughter DNA molecules during mitosis. However, during meiosis both parental genomes are demethylated prior to fertilization, and cellular methylation patterns become established de novo during embryogenesis. This process usually starts during implantation and is finished by the end of gastrulation (26). The development of site-specific methylation patterns is crucial for normal embryonic development (27–29) and is responsible for genomic imprinting and for X chromosome silencing (30,31).

Methylation-induced gene silencing

CpG methylation acts to suppress transcription in several ways (32). It can directly prevent the binding of transcription factors to the promoter regions of genes, most likely due to steric hindrance. Alternatively, a number of proteins, known as methyl-CpG binding (MeCP) proteins, can selectively bind methylated CpG sites. One of these, Kaiso, recognizes methylated CpG sites through a zinc-finger motif, while another subgroup shares a conserved methyl-CpG-binding domain (MBD). The MBD proteins MBD 1, 2 and MeCP2, as well as Kaiso, all function as transcriptional repressors. MBD 3, however, associates with the nucleosome remodeling and histone deacetylation (NuRD) complex, a co-repressor complex containing histone deacetylases. Deacetylation of histones increases their affinity for DNA, leading to the formation of inactive heterochromatin (30,33,34).

Methylation-induced silencing of foreign DNA

Methylation is involved in the inactivation of integrated foreign DNA, such as retrotransposons, proviral sequences and other transposable elements (TEs) (33). These ‘parasitic’ DNAs constitute a quantitatively significant portion of mammalian genomes, comprising, for example, ∼40% of the human genome. They are notably GC rich and, within mammalian genomes, are hypermethylated at CpG motifs. In several studies, the inhibition of de novo or maintenance methylation was associated with increased transcription levels of TEs, indicating that effective TE silencing requires both de novo and maintenance methylation, likely catalyzed by DNMT3L and DNMT1, respectively (35,36).

Methylation and innate immunity

CpG methylation also plays an important role in immune surveillance and the detection of pathogens, as unmethylated CpGs are a signature of bacterial and other pathogen-associated DNA. CpG-containing oligodeoxynucleotides (ODNs), even as short as 6 nt in length, can be potent stimulators of the vertebrate immune system, resulting in the activation of a wide variety of immune cells such as B cells, NK cells, monocytes/macrophages and dendritic cells (DCs). Stimulation with CpG ODNs directly activates plasmacytoid dendritic cells (PDCs) and B cells. The PDCs secrete large amounts of IFN-α and IL-12, which subsequently stimulate monocytes, myeloid DCs, T cells and type-1 NK cells (37–39). One pathway through which unmethylated CpGs trigger innate immune responses is by binding a member of the Toll-like receptor (TLR) family, TLR9 [see, as one example, (40)]. TLR9 is expressed at high levels on subsets of human and mouse DCs and gene homologs have been found in many other mammalian species. The degree to which particular CpG sequences stimulate immune responses depends upon their flanking nucleotide sequences, and motifs which confer maximal stimulation are at least partially species-specific (41). For example, ODNs containing the motif GACGTT very efficiently stimulate murine and rabbit B cells, but only weakly stimulate human B cells, which are most effectively stimulated by GTCGTT (42). Some similarities in recognition sequences among related species appear likely. For example, GTCGTT motifs are highly immune stimulatory in a variety of vertebrates including humans, dogs, cats, cows, chickens and nonhuman primates (41). ODNs that are thymine rich at the 3′ end and contain a TpC dinucleotide at their 5′ end appear to be generally potent immune stimulators, while those that contain CpG motifs toward the 3′ end appear to be less immunogenic (41). Generalities, however, may be misleading in some cases, as resulting immune effects are thought to be highly cell-type and species specific (43–45). CpG dinucleotides in vertebrate genomes are most commonly preceded by a C and followed by a G (46), a signature which does not appear to be immune stimulatory. Motifs such as TTCCGA, on the other hand, appear to be potent stimulators of the human immune system and may play a role in autoimmune diseases such as lupus erythematosus. Other CpG motifs, for example those containing poly(G) sequences, are thought to inhibit NF-κB-induced immune stimulation (47). The relatively low occurrence of stimulatory motifs in mammalian DNA might explain the lower immunogenicity of unmethylated vertebrate DNA compared to bacterial DNA.
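Because the stimulatory potential of a CpG depends on its flanking bases, a simple tally of the base immediately before and after each CpG can help when inspecting a sequence. The sketch below is illustrative only: the function name `cpg_flank_counts` is hypothetical, only one flanking base on each side is considered, and only the supplied strand is scanned.

```python
from collections import Counter


def cpg_flank_counts(seq: str) -> Counter:
    """Tally (preceding base, following base) pairs around every CpG in seq.

    CpGs at the very start or end of the sequence are skipped because they
    lack a flanking base on one side.
    """
    seq = seq.upper()
    contexts = Counter()
    # A CpG starting at index i needs a base at i - 1 and another at i + 2.
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] == "CG":
            contexts[(seq[i - 1], seq[i + 2])] += 1
    return contexts
```

For a vertebrate sequence, the pattern described above would show up as a high count for the key `("C", "G")`, that is, CpGs embedded in CCGG.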

Stimulatory motifs in DNA virus genomes

The genomes of some, but not all, viruses also lack many such stimulatory sequences. For example, likely stimulatory sequences are highly underrepresented in the genomes of adenovirus serotypes 2 and 5, but not in serotype 12. Adenovirus types 2 and 5 can cause persistent infections; their DNA is not immune stimulatory and might even suppress immune stimulation by DNA from other sources. Type 12, however, does not cause persistent infections, and its DNA appears highly immune stimulatory. Experimental evidence for a correlation between the stimulatory potential of the viral DNA and the number of supposedly stimulatory motifs has been presented (48). Table 2 summarizes the frequencies of certain putatively stimulatory and nonstimulatory motifs in genomes from several members of different viral families. General conclusions are difficult to draw since the biological effects of individual motifs likely differ between virus families and host species. Moreover, the limited genome size and CpG underrepresentation complicate inference for small DNA viruses. However, certain patterns are noticeable even without formal statistical inference. The ratio of total stimulatory to total nonstimulatory motifs is higher than expected based on the viral nucleotide composition for Frog adenovirus, Frog virus 3 and herpes simplex virus type-1 (HSV1), while it is lower than expected for all other viruses for which this value can be calculated. Putatively stimulatory motifs are rarely present more frequently than expected based on the viral nucleotide composition, with the exception of HSV1 and Frog virus 3. Putatively nonstimulatory motifs, on the other hand, are present at higher frequency than expected in several cases, as seen for several adeno- and herpesviruses. The driving forces determining this nucleotide composition are difficult to disentangle. HSV1 induces mainly subclinical, persistent and recurrent infections, but HSV1 DNA has been shown to efficiently stimulate TLR9 responses (49). The mammalian genomes examined contain a functional TLR9 ortholog, but chickens lack an orthologous TLR9 gene, and the same is likely true for most if not all avian species. Unfortunately, data on other avian species are thus far limited (50) and it is not known, in most cases, whether amphibian species have a functional TLR9. Several other factors likely influence the presence and distribution of CpG-containing motifs, but many of these remain to be understood.
Table 2.

Frequency of TLR9 stimulatory and non-stimulatory/inhibitory sequences

GpC and CpG contents, as well as individual nucleotide and motif frequencies, were determined from the publicly available reference sequences (RefSeq) in GenBank (MATLAB script available from the authors upon request). Sequence length, frequencies and percentages of individual nucleotides, GpC and CpG content, as well as the frequencies of potentially stimulating and non-stimulating motifs, are provided. Potentially stimulatory sequences are those described by Rankin et al. (120); values represent the total number of the respective motif present in the genome. The number in parentheses indicates the number of motifs expected in the sequence, given the individual nucleotide composition. This was calculated using the following formula: $E(UVWXYZ) = 10^{-12} \times \text{sequence length} \times (\%U \times \%V \times \%W \times \%X \times \%Y \times \%Z)$, where E(UVWXYZ) = expected value of the motif UVWXYZ; U,V,W,X,Y,Z = nucleotides of which the motif consists, and %U = % of viral sequence consisting of base U. Expected values were rounded to the nearest integer. Likely non-stimulatory and potentially inhibitory sequences are those described by Krieg; values again represent the total number of the respective motif present in the genome, and expected values were calculated as described above. ‘Total stimulatory’ or ‘total non-stimulatory’ values represent the sum of all putatively stimulatory or non-stimulatory motifs in the sequence. ‘Ratio of stimulatory/non-stimulatory sequences’ represents the total number of stimulatory sequences divided by the total number of non-stimulatory sequences. ‘CpG’ or ‘GpC’ represents the total number of the respective dinucleotide CpG or GpC in the viral DNA sequence. ‘CpG/GpC’ represents the number of CpG motifs divided by the number of GpC motifs in the viral DNA. ‘RefSeq’ number represents the GenBank accession number. ‘#’ no RefSeq FPV sequence available.
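The expected-value formula in this legend can be sketched in a few lines of Python. This is not the authors' MATLAB script; the function names are hypothetical, only the supplied strand is scanned, and overlapping occurrences are counted, all of which are assumptions. The factor 10^(-2k) generalizes the 10^-12 of the hexamer formula to a motif of length k, since each base frequency enters as a percentage.

```python
from collections import Counter
from math import prod


def base_percentages(seq: str) -> dict:
    """Percentage (0-100) of each of A, C, G and T in the sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    return {base: 100.0 * counts[base] / len(seq) for base in "ACGT"}


def expected_motif_count(seq: str, motif: str) -> float:
    """Expected motif count from base composition alone.

    For a hexamer this is 10^-12 * length * (%U * %V * %W * %X * %Y * %Z).
    """
    pct = base_percentages(seq)
    scale = 10 ** (-2 * len(motif))  # one factor of 1/100 per percentage term
    return scale * len(seq) * prod(pct[base] for base in motif.upper())


def observed_motif_count(seq: str, motif: str) -> int:
    """Number of (possibly overlapping) occurrences of motif on this strand."""
    seq, motif = seq.upper(), motif.upper()
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)
```

Comparing `observed_motif_count(genome, "GTCGTT")` with `round(expected_motif_count(genome, "GTCGTT"))` would reproduce the "observed (expected)" style of entry the legend describes, with `genome` standing for any sequence string supplied by the reader.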


DNA VIRUSES AND CYTOSINE METHYLATION

To understand the impact of cytosine methylation on the viral life cycle and the evolution of base composition, the particularities of each virus will need to be considered. Differences will inevitably exist between actively replicating viral DNA and that which is integrated into the host genome. The type of viral persistence will also be of importance. The integration of adeno- or polyomavirus DNA into the host genome is usually a terminal process since the viruses cannot liberate their genomes and are therefore no longer infectious. The evolutionary roles of methylation in these cases will likely differ from those seen in other viruses, such as Herpesviruses, which can liberate their genome after periods of latency. But differences may also exist between large and small viruses—with many larger viruses encoding their own replication machinery and additional proteins which modify host cell processes and immune responses. The susceptibility of the viral genome to methylation and immune recognition will also be affected by other factors, such as the location of replication within the cell and the specific intracellular trafficking route (Figure 1).
Figure 1.

Comparison of DNA virus infection pathways. Major intracellular trafficking routes and characteristics of replication are shown for the DNA virus families discussed. The importance of methylation and immune recognition is indicated. Simplifications and generalizations were made for the purpose of clarity.

Viruses that integrate into the host genome and the effects of methylation on their life cycles have long been of particular interest. Many studies have focused on adenoviruses, but comprehensive knowledge of the methylation status of other viruses, such as polyomaviruses and herpesviruses, has also been obtained (Table 3).
Table 3. Overview of GC content, CpG frequency and methylation status of small and large DNA viruses

Virus | Genome size^a (kb) | GC frequency^a | CpG content^a (ρCpG) | Methylation status during active replication | Reference^b | Methylation status during latency (integrated or episomal) | Reference^b | Host species | Effect on host methylation
Large dsDNA viruses | | | | | | | | |
Adenoviridae | 28–45 | 0.3–0.65 | 0.5–1.13 | Un/hypomethylated | (121) | Methylated | (121) | Mammal, bird | DNMT upregulation
Alpha-herpesvirinae | 130–150 | 0.4–0.71 | 0.9–1.17 | Un/hypomethylated | (67) | Un/hypomethylated | (89) | Mammal, bird | DNMT upregulation
Beta-herpesvirinae | 140–240 | 0.4–0.67 | 1.0–1.25 | Unknown | N/A | Unknown | N/A | Mammal | Unknown
Gamma-herpesvirinae | 110–185 | 0.3–0.61 | 0.3–0.66 | Un/hypomethylated | (66) | Methylated | (65) | Mammal | DNMT upregulation (122)
Ranid herpesvirus^c | 220–230 | 0.5–0.55 | 0.8–0.95 | Methylated | (70) | Unknown | N/A | Amphibian | Viral 5-cytosine methyltransferases?
Iridoviridae | 140–383 | 0.2–0.56 | 0.5–0.84 | Methylated | (74) | | | Amphibian, fish | Viral 5-cytosine methyltransferases?
Poxviridae | 130–375 | 0.2–0.64 | 0.8–1.23 | Unknown | N/A | | | Mammal, bird, invertebrate | Unknown
Small dsDNA viruses | | | | | | | | |
Papillomaviridae | 7–8 | 0.4–0.54 | 0.1–0.57 | Partially methylated | (60) | Methylated | (123) | Mammal | DNMT upregulation
Polyomaviridae | 5 | 0.4–0.48 | 0.05–0.78 | Un/hypomethylated | (51) | Methylated | (51) | Mammal, bird | DNMT upregulation
Small ssDNA viruses | | | | | | | | |
Autonomous Parvoviridae | 4–6 | 0.3–0.5 | 0.3–0.71 | Unknown | N/A | | | Mammal | Unknown
Dependovirinae | 4–6 | 0.4–0.58 | 0.6–1.03 | Unknown | N/A | | | Mammal | Unknown
Circoviridae | 2 | 0.5–0.57 | 0.4–0.87 | Unknown | N/A | | | Mammal, bird | Unknown
Anellovirus | 4 | 0.5 | 0.67 | Unknown | N/A | | | Human | Unknown

The GC content, CpG content and, where known, methylation status is shown for viral families/subfamilies. Where applicable, a distinction is made between active replication and latency. If latent, the state of the genome (i.e. integrated or episomal) is specified. Preferred host species are indicated along with any known effect the virus has on host cell methylation. Where relevant, representative references are provided. Dash indicates this form is not known to occur for this virus.

a GC and CpG contents as determined by the authors [here or in (83)] according to the methods described therein; GC content represents the relative frequency of G and C in the sequence. CpG content represents the observed divided by the expected frequency of the dinucleotide CpG in the sequence.

b Reference refers to a representative study, which focuses on one member of the virus family.

c Unpublished results, analysis done by the authors as described above (83).

Early studies of viral methylation focused on DNA from polyomavirus-infected cells (51). The early methods (e.g. quantification of methyl-H3 incorporation into the genome) used to measure methylation were limited in their sensitivity and did not show the genomic distribution of methylated sites. Technical advances later made it possible to study cytosine methylation in a more detailed and site-specific manner. However, the various roles of methylation during persistent viral infections, in the silencing of viral genomes, and in immune evasion and tumor formation are still being uncovered. The impact of methylation during active viral replication is generally incompletely understood, and for many small DNA viruses it is currently not known whether the viral genome is methylated during active replication.

Adenovirus DNA methylation: summarizing technical advances in viral methylation studies

Early studies of adenovirus methylation relied on chromatographic and radioactive techniques to distinguish methylated from unmethylated bases. The techniques were complicated by limited sensitivity, the difficulty of isolating pure viral fractions and the inability to discriminate between host and integrated viral DNA. The investigators nevertheless carefully compared actively replicating and integrated viral DNA and their conclusions have stood the test of time. Genomic DNA from adenovirus type 2 and 12 infected cells was heavily methylated, more than cellular DNA from uninfected control cells (52). Whether the integration of viral DNA altered methylation patterns in the host genome or whether the observed differences were due exclusively to the integrated viral DNA was not determined. However, actively replicating adenovirus type 2 and 12 DNA appeared to have few or no methylated bases (52). More sensitive analysis became possible with pairs of methylation-sensitive and methylation-insensitive restriction endonucleases. In particular, the isoschizomers HpaII and MspI have been used frequently in studies of DNA methylation. Both enzymes recognize the motif CCGG, but the catalytic activity of HpaII is inhibited by CpG methylation while the activity of MspI is not affected. The new technique confirmed the previous results: actively replicating adenovirus type 2 and 12 DNA showed no detectable methylated CCGG sites, and the restriction patterns were consistent among viral DNAs from different cells (53). Inference from restriction enzyme analyses, however, remained limited to those sites recognizable by restriction enzyme pairs. Several subsequent studies focused on adenovirus DNA and the absence of methylated sites could later be conclusively established through bisulfite sequencing (51–53). Bisulfite converts unmethylated cytosines to uracils, which are read as thymines after amplification, and thus allows the sensitive detection of specific methylated sites by sequencing the genome prior to and after treatment. The absence of methylated sites in the replicating adenovirus 2 genome has been shown using this technique (54–56). Many studies of adenovirus DNA have focused on integrated viral genomes or genomic segments, largely because of their oncogenic potential. Integrated adenoviral DNA is known to be hypermethylated, but the regulatory events which determine the methylation status of replicating or integrated sequences are not well understood. A recombinant between adenovirus 12 and host cell DNA (referred to as SYREC) is unmethylated over the whole genome during active replication, including the host-derived sequence, which is methylated in the chromosome. This led to speculations about the existence of adenovirus-encoded de-methylation proteins, or a role for the virus-encoded replication machinery in methylation avoidance (57). However, no specific mechanisms have been defined, and the speed of viral replication or compartmentalization of this process within the host cell have also been proposed as factors contributing to the lack of methylation.
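The logic of the two site-specific assays described in this section can be illustrated with a toy model. The Python sketch below is a simplification under stated assumptions, not a laboratory protocol: the function names are hypothetical, the DNA is treated as a linear top strand, methylation is given as a set of 0-based cytosine positions, both enzymes are assumed to cut between the two cytosines of CCGG, and only methylation of the internal (CpG) cytosine is modeled as blocking HpaII.

```python
def ccgg_fragments(seq: str, methylated_c: set, enzyme: str) -> list:
    """Fragment lengths after digesting linear DNA with HpaII or MspI.

    Simplification: HpaII is blocked when the internal cytosine of CCGG is
    methylated; MspI cuts regardless of that cytosine's methylation state.
    """
    if enzyme not in ("HpaII", "MspI"):
        raise ValueError("enzyme must be 'HpaII' or 'MspI'")
    seq = seq.upper()
    cuts = []
    start = seq.find("CCGG")
    while start != -1:
        internal_c = start + 1  # index of the cytosine that belongs to the CpG
        blocked = enzyme == "HpaII" and internal_c in methylated_c
        if not blocked:
            cuts.append(start + 1)  # cut between the two cytosines (C^CGG)
        start = seq.find("CCGG", start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def bisulfite_read(seq: str, methylated_c: set) -> str:
    """Top-strand read after bisulfite treatment and amplification.

    Unmethylated C is read as T; methylated C is still read as C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_c else base
        for i, base in enumerate(seq.upper())
    )
```

Identical HpaII and MspI fragment patterns suggest that no CCGG site is methylated at the internal cytosine, whereas a cytosine that survives in the bisulfite read marks a methylated position at single-base resolution, including sites outside any CCGG motif.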

Methylation status of polyoma- and papillomaviruses

Studies of papovavirus DNA methylation also focused on integrated sequences, mostly of human papillomavirus strains and SV40 polyomavirus. Polyomavirus and papillomavirus genomes are CpG-depleted (Table 3) and any CpG motifs present are clustered within certain regions of the viral genome (58), complicating the analysis of methylation patterns. The early studies, however, concluded that there was little methylation in replicating polyomaviruses (51), with the possible exception of a CpG site near the early promoter (59). Several other studies concurred and found several human papillomavirus strains to be hypomethylated during active replication, again with methylated sites clustering in specific regions of the viral genome. The integrated viral DNA, in contrast, appeared heavily methylated. However, distinguishing between actively replicating and integrated DNA appears difficult in some cases (60–62).

Methylation status of herpesvirus genomes

Herpesviruses establish latent infections without integrating into the host genome, and reactivate to re-establish active replication. During latency, the circularized viral genome remains quiescent in an episomal state, and can be replicated by the host cell replication machinery. Active (i.e. lytic) viral replication differs in several respects. Production of viral progeny inevitably results in cell lysis, utilizes the viral-encoded replication machinery and involves a separate origin of replication [e.g. in the case of Epstein–Barr Virus (EBV) oriLyt functions as the origin of replication in the lytic cycle while oriP is responsible for replication of episomal EBV DNA (63)]. The methylation status during lytic DNA replication is likely different from that during latency, but most studies have focused on latent infections and the differences have rarely been studied explicitly. Differences might also exist among herpesvirus subfamilies, as CpG motifs are highly underrepresented in gammaherpesvirus genomes, but not in alpha- or betaherpesvirus genomes (64). The reasons for these differences remain elusive. Gammaherpesviruses cause predominantly persistent, lymphoproliferative diseases. Studies of some gammaherpesviruses, such as EBV and Kaposi's Sarcoma-associated herpesvirus (KSHV), indicate that methylation serves an intricate regulatory role during the viral life cycle (see below). The EBV genome appears to be hypomethylated or unmethylated during lytic infection, but highly methylated during latency, and appears to become demethylated during reactivation (65). There is some evidence for methylation during active replication of another gammaherpesvirus, herpesvirus saimiri, since Kaschka-Dierich et al. (66) found both the linear and circular viral DNA to be heavily methylated. The alphaherpesvirus herpes simplex virus (HSV) appears unmethylated during active replication and latency (67–69), while two frog herpesviruses (ranid herpesvirus 1 and 2) appear to be methylated during replication. The ranid herpesvirus genomes also contain putative DNA cytosine-5 methyltransferase sequences located in ORFs 86 and 120 (70). Methylation plays a pivotal role in the life cycle of EBV (71,72), where methylation of specific genes appears to be involved in the transition from lytic to latent infection. Most EBV nuclear antigens (EBNAs) are expressed only for short periods during the lytic infection, after which their transcription is silenced through hypermethylation of the viral promoters Cp and Wp. Production of the indispensable EBNA1 is then achieved by expression from an alternative ‘TATA-less’ promoter Qp. Reinitiation of lytic infection can occur spontaneously or may be triggered by events such as immunoglobulin crosslinking of the host cell and cell proliferation. Reactivation is mediated through the expression of the immediate early genes Zta and Rta, which are regulated by the promoters Zp and Rp, the latter of which is hypermethylated during latency. While Rp hypermethylation is a major regulator of latent infection, Zp hypermethylation is dispensable. EBV therefore seems to use methylation-induced gene silencing as an immune evasion strategy. While a role for methylation in the EBV life cycle is widely accepted, its importance for other herpesviruses is still disputed.

Methylation status of iridoviruses and ascoviruses

The genomes of iridoviruses appear to be heavily methylated during active replication and methylation might again have a regulatory role. Iridoviruses are large DNA viruses that infect fish, amphibians and reptiles. Initial stages of replication occur in the nucleus, after which the nascent viral genomes are transported into the cytoplasm where the second stage of replication occurs (73). Willis and Granoff (74) analyzed the iridovirus frog virus 3 (FV3) by radioactive labeling/restriction enzyme digestion and found the viral DNA to be heavily methylated, with an estimated 20% of cytosines methylated. Time-course experiments later indicated changing methylation patterns over the course of infection. Willis et al. (75) analyzed nuclear and cytoplasmic extracts collected at various times postinfection (p.i.) and found methylated viral DNA after ∼6–7 h p.i. This DNA appeared to be located in the cytoplasm, while nuclear DNA, collected at earlier time points, appeared to be hypomethylated. The authors also provided evidence against demethylation of parental viral genomes upon host cell infection. Further, Schetter et al. (76) showed that nascent FV3 DNA in the nucleus is unmethylated or hypomethylated at early times p.i., but later becomes hypermethylated (after ∼6 h p.i. in their assay). Cytoplasmic viral DNA, on the other hand, appears to be methylated at all times. The authors reported DNA methyltransferase activity in nuclear extracts of FV3-infected cells, beginning at a time p.i. that coincided with the appearance of methylated viral genomes. This activity was absent from cytoplasmic extracts, indicating FV3 might regulate methylation of its genome. It was later discovered that the FV3 genome encodes a putative DNA cytosine-5 methyltransferase sequence, which bears similarities to other known methyltransferases (77). The putative methyltransferase, however, is more homologous to prokaryotic enzymes than to eukaryotic methyltransferases. 
While transcription (from an early promoter) and translation of this FV3 protein have been verified, direct evidence for its methylation activity is still limited. Another iridovirus, fish lymphocystis disease virus (FLDV), was also found to have a highly methylated genome, even though its GC content is considerably lower than that of FV3 (78,79). FLDV also contains a putative 5-cytosine methyltransferase (53.3% identical to that of FV3) (80) but evidence for its function is, thus far, scarce. Besides the frog herpesviruses and iridoviruses, ascoviruses, which infect insects and are closely related to the iridoviruses, are also heavily methylated during replication (81).

Methylation status of small DNA viruses

In contrast to most large DNA viruses, CpG is underrepresented in the majority of small DNA viruses (64,82,83). It is possible that cytosine methylation is partly responsible for this underrepresentation. Small DNA viruses encode only a minimal number of proteins and use the host cell replication machinery for replication. Low CpG frequencies may have been selected to avoid methylation by host methyltransferases, to maximize translation efficiencies, or as a means of reducing CpG-mediated immune responses. However, apart from the small polyomaviruses and papillomaviruses described above, there appear to have been few close examinations of the methylation status of small DNA viruses (particularly those that do not integrate), such as autonomous parvoviruses, circoviruses and anelloviruses. The roles and effects of methylation might be different from those seen in integrating and/or large DNA viruses, but further research is needed to understand the specific effects of methylation on small, nonintegrating DNA viruses. Adeno-associated virus (AAV) integrates into the host genome, and has been examined primarily in the context of guanine methylation. Productive infection with an AAV normally requires co-infection with a helper virus (usually an adenovirus or, less frequently, a herpesvirus). In the absence of a helper virus, AAVs may remain in the cell as persistent dsDNA, or they may be stably integrated into the host genome. Upon infection by a helper virus, the integrated AAV DNA can be rescued and undergo active replication. The CpG contents of AAVs more closely resemble those of their helper viruses than those of the closely related autonomous parvoviruses, which show strong CpG suppression. The mechanisms of integration have been extensively studied in AAV serotype 2 (AAV2), which predominantly integrates in a specific region of chromosome 19 (locus 19q13.3t). 
Integration is directed by AAV-encoded Rep proteins, which likely recognize guanine residues within GCTC repeating motifs (located in the AAV terminal hairpin). In vitro methylation of these guanine residues has been shown to effectively inhibit DNA–protein interactions and likely interferes with integration in vivo (84). Similar results were obtained when examining the integration of AAV4 (which utilizes a slightly different recognition motif) into African green monkey cells (85). Nevertheless, it is not known what role, if any, cytosine-5 methylation plays in the natural AAV life cycle, whether AAV methylation influences integration, or whether guanines or cytosines are methylated during active replication.
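
The CpG underrepresentation discussed in this section is usually expressed as an observed/expected ratio. The sketch below uses one common convention, (number of CG dinucleotides × sequence length) / (number of C × number of G); the studies cited here may have used a different normalization. The function name and the toy sequences are illustrative assumptions, not data from any virus.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio for one strand of a DNA sequence.

    Uses the convention (CG count * length) / (C count * G count).
    A value near 1 suggests no bias; values well below 1 suggest
    CpG depletion. Returns NaN when the ratio is undefined.
    """
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    if n < 2 or c == 0 or g == 0:
        return float("nan")
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    cg = s.count("CG")
    return (cg * n) / (c * g)


if __name__ == "__main__":
    # Toy sequences only (hypothetical, for illustration).
    for name, seq in [("cpg_rich", "ACGTACGT"), ("cpg_free", "CATGCATG")]:
        print(name, cpg_obs_exp(seq))
```

For real genomes the ratio is typically computed over the full sequence or in sliding windows, and ambiguous bases would need to be handled before counting.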

METHYLATION AND INTEGRATED OR LATENT VIRAL DNA

Methylation of integrated viral DNA has been studied extensively and is well covered in the scientific literature. Areas of interest have included changes in methylation patterns during integration, the effects of viral methylation on host cells and the correlation between virally induced methylation and tumor formation. Excellent reviews have been published on these topics and the reader is referred to these articles, indicated in the following discussion, for additional detail. Here we briefly summarize what is known about methylation during integration and latency and the regulatory elements involved, contrasting the level of understanding to what is known about methylation during active replication. As mentioned above, most DNA viruses such as adenoviruses (57,59,86–88) and polyomaviruses (83) appear specifically methylated during latency or when in an integrated form. In contrast, the alphaherpesvirus HSV does not appear to be methylated during latency (89), but the reasons for this difference are unclear. The processes by which integrating viruses are de novo methylated are only partly understood, but it is clear that the extent and pattern of de novo methylation depend upon the time after integration and the site in the host genome (90), the integrated sequences (90–94), and possibly the cell type (57,90). De novo methylation tends to be initiated within confined genomic regions and then spreads throughout the viral genome (87,90,93,95). For example, in a study examining the methylation pattern of adenovirus 12 over the course of integration into hamster cells, methylation began in the center of the viral DNA (between map units 30 and 75) and subsequently spread outward in both directions. Some regions of the viral genome remained unmethylated throughout the study, namely the right terminal repeat and regions near the left terminus (95). 
It is not known which factors initiate methylation of the genome, nor the determinants of the sequential methylation pattern. Methylation patterns appear to be site specific. In many viruses there seems to be an inverse correlation between viral gene expression and degree of methylation, with late viral genes being generally more susceptible to methylation effects than early genes (53,91). Promoter regions of integrated viruses are frequently hypomethylated. For example, promoter regions in adenovirus 12 are hypomethylated at CCGG sites while the viral genome is otherwise heavily methylated (57,59,96). Methylation of viral genes, particularly at promoter sequences, appears to be a reversible event (96) but the mechanism by which viral genomes are demethylated remains to be determined. Demethylation has also been shown to occur after in vitro methylation of several viral DNAs. For example, SV40 DNA recovered 20 h after microinjection into cells retained the artificial methylation pattern under nonpermissive culture conditions, while allowing early gene transcription. The DNA, however, became demethylated after viral replication (97). Loss of methylation has also been observed in adenoviruses, in the case of the SYRECs mentioned above (86,98) which are methylated when covalently linked to the host genome, but unmethylated during active replication. They require the presence of fully functional adenovirus helper for efficient lytic replication, which led some to postulate a role for the viral transcription machinery in the loss of methylation. Unfortunately, experimental data in support of this hypothesis are so far scarce. Infection with integrating (and even some nonintegrating) viruses can induce alterations in the methylation patterns of host DNA, a potential factor in the development of malignant tumors. 
For example, herpesviruses KSHV and EBV, polyomaviruses BK and SV40, papillomavirus HPV, several adenoviruses and hepatitis B virus encode proteins that activate/upregulate DNMT1, 3a, or 3b. This results in the hypermethylation and, therefore, the downregulation of a number of cellular genes. One of the cellular genes frequently hypermethylated in these infections is the cell cycle regulating tumor suppressor gene p16INK4a, which is commonly hypermethylated in many types of cancer (99). Differentiation of host cell tumors and the level of genome methylation of integrated viruses generally appear to be positively correlated, as seen for various papillomaviruses such as Shope papillomavirus (100) and human papillomavirus (HPV) (101,102), the herpesvirus EBV (72,103) and various adenoviruses (96).

METHYLATION AND VIRAL GENE EXPRESSION

For iridoviruses, gene expression at functional levels can be maintained despite genome hypermethylation. Methylated sites, however, are not distributed randomly across the genome and methylation levels differ among CpG motifs. FLDV, for example, has a considerably higher level of methylation at CCGG than CGCG motifs (79). The physiological significance of this, however, is still disputed. In vitro studies of the related FV3 showed that specific methylation (using HpaII methyltransferase) of CCGG motifs in a late promoter, L1140, abolished its function. However, the methylation of all CpG sites by the indiscriminate SssI methyltransferase or the specific methylation of GCGC motifs by HhaI methyltransferase had no dramatic effect. This appears to be a rare example where, counterintuitively, complete methylation of a late promoter does not abolish its function (methylation frequently abolishes the function of late viral promoters, while early promoters tend to be less susceptible to the repressive effect of cytosine methylation). The underlying mechanism has not been elucidated, but steric hindrance in asymmetrically methylated DNA might play a role (104). It seems possible that methylation of CCGG motifs results in complex secondary structures in the promoter region, whereas indiscriminate CpG methylation or methylation of GCGC motifs balances these secondary structures. A similar phenomenon was seen in adenovirus type 2, where methylation of CCGG sites within the (late) E2a promoter inhibited the transcription of viral DNA, while methylation of GCGC motifs had no repressive effect (105,106). The transcription activity of the polyomavirus SV40 was not drastically decreased when the DNA was methylated in vitro with indiscriminate DNMTs from rat liver (107,108). In contrast, upon specific methylation of HpaII sites (CCGG-specific), transcription activity was reduced markedly (59). 
However, the interpretation of these latter results is complicated by the possibility of incomplete methylation by the rat liver extract, as rat liver DNA is a competitive inhibitor of adenovirus methylation (55) and low-level contaminants might likewise have inhibited complete methylation of SV40. In vitro methylation studies involving human papillomavirus type 16 (HPV16) showed that indiscriminate SssI methylation significantly decreased the transcriptional activity of the long control region (LCR), which is devoid of ORFs but contains several cis-acting regulatory elements. Enhancers located in the LCR are responsive to both cellular and viral factors, such as the viral E2 gene products. Transcriptional silencing was measured using an LCR-containing reporter plasmid and inhibition was again believed to be steric, specifically due to interference with the binding of transcription regulator E2. Notably, hypomethylation of HPV16 E2-binding sites has been observed in highly differentiated, but not in undifferentiated, tumor cells (101). However, direct comparison of the HPV16 and SV40 studies is difficult. The effects of indiscriminate methylation on early and late viral gene products in the genomic context are hard to infer from this study. The time of transcription appears to impact the effects of methylation in several viruses. For example, the in vitro HpaII (CCGG-specific) methylation of SV40 early genes (i.e. those coding for the large T-antigen) does not lead to transcriptional repression, and the in vitro methylated sites are subsequently lost during active replication (108). Late viral gene expression, however, can be efficiently inhibited by in vitro methylation of HpaII sites within the 5′ part of this region (109). Site-specific methylation might be involved in the transition to latency in some cases (as in the case of EBV). On the other hand, methylation does not seem to be involved in the switch from early to late gene expression in HSV (89). 
Likewise, the major late promoter of adenovirus 2, which induces the switch from early to late gene expression, is unmethylated in nonintegrated viruses (56), suggesting methylation does not play a major role in the regulation of early versus late gene expression for this virus.
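
The in vitro experiments described in this section compare methylation at different subsets of CpG sites: CCGG (HpaII methyltransferase), GCGC (HhaI methyltransferase) and every CpG (SssI methyltransferase). The following is a minimal sketch of how one might enumerate which CpG cytosines in a sequence fall into each subset. The function name and the example sequence are hypothetical and are not taken from any promoter discussed here.

```python
def cpg_sites_by_motif(seq: str) -> dict:
    """Group CpG cytosine positions (0-based) by their 4-bp context.

    "CpG"  : every CG dinucleotide (the SssI target set)
    "CCGG" : CpGs whose cytosine is the inner C of CCGG (HpaII)
    "GCGC" : CpGs whose cytosine is the inner C of GCGC (HhaI)
    Both motifs are palindromic, so scanning one strand is enough
    to locate the sites on both strands.
    """
    s = seq.upper()
    sites = {"CpG": [], "CCGG": [], "GCGC": []}
    for i in range(len(s) - 1):
        if s[i:i + 2] != "CG":
            continue
        sites["CpG"].append(i)
        # Context is one base upstream through one base downstream of CG.
        context = s[i - 1:i + 3] if i >= 1 else ""
        if context in ("CCGG", "GCGC"):
            sites[context].append(i)
    return sites


if __name__ == "__main__":
    # Hypothetical sequence containing one CCGG, one GCGC and one other CpG.
    print(cpg_sites_by_motif("ACCGGTGCGCATCGA"))
```

Such a tally only shows which sites each enzyme could modify; it says nothing about the resulting effect on promoter activity, which the studies above measured experimentally.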

CONCLUSIONS

A number of studies show that CpG methylation can greatly affect the life cycles of DNA viruses, but its exact role in natural infections remains unclear. Those effects likely differ between different viruses and are dependent upon many factors, such as the stage in the viral life cycle, the host species and infected tissue, the flanking nucleotide motifs, the genes in question and the genomic location. Where it has been studied, major differences have been seen for some viruses between the actively replicating and the latent or integrated viral DNA; the former is often unmethylated or hypomethylated, while the latter often shows specific and regulated methylation. Differences may also exist between large and small DNA viruses, potentially because of their rate and site of replication and differences in their ability to manipulate the host responses to unmethylated DNA. Thus, it is necessary not only to examine individual viruses, but also to examine their dynamic methylation status and the effects of any viral methylation on both the virus and the host cell. The various ways in which mutational bias, codon usage and host selection (e.g. through TLR9-dependent responses) influence CpG frequency and methylation status need to be defined. It will be especially important to determine the unique roles and regulators of methylation, and their divergent effects on the genome evolution of small and large DNA viruses during active or latent infection, and in packaged, integrated or episomal DNA. A deeper knowledge of the relationships among the virus, the host cell and methylation (and de-methylation) machinery will provide essential insights into determinants of methylation status, gene expression, replicative behavior, as well as activation of pattern recognition receptors and immune responses.
  121 in total

Review 1.  Transposable elements and the epigenetic regulation of the genome.

Authors:  R Keith Slotkin; Robert Martienssen
Journal:  Nat Rev Genet       Date:  2007-04       Impact factor: 53.242

2.  A Dnmt2-like protein mediates DNA methylation in Drosophila.

Authors:  Natascha Kunert; Joachim Marhold; Jonas Stanke; Dirk Stach; Frank Lyko
Journal:  Development       Date:  2003-08-27       Impact factor: 6.868

3.  Analysis of DNA (cytosine 5) methyltransferase mRNA sequence and expression in bovine preimplantation embryos, fetal and adult tissues.

Authors:  Michael C Golding; Mark E Westhusin
Journal:  Gene Expr Patterns       Date:  2003-10       Impact factor: 1.224

4.  Herpes simplex virus type 1 DNA is immunostimulatory in vitro and in vivo.

Authors:  Patric Lundberg; Paula Welander; Xiao Han; Edouard Cantin
Journal:  J Virol       Date:  2003-10       Impact factor: 5.103

Review 5.  Viral genes and methylation.

Authors:  Mukesh Verma
Journal:  Ann N Y Acad Sci       Date:  2003-03       Impact factor: 5.691

6.  CpG methylation of human papillomavirus type 16 DNA in cervical cancer cell lines and in clinical specimens: genomic hypomethylation correlates with carcinogenic progression.

Authors:  Vinay Badal; Linda S H Chuang; Eileen Hwee-Hong Tan; Sushma Badal; Luisa L Villa; Cosette M Wheeler; Benjamin F L Li; Hans-Ulrich Bernard
Journal:  J Virol       Date:  2003-06       Impact factor: 5.103

7.  Methylation patterns of papillomavirus DNA, its influence on E2 function, and implications in viral infection.

Authors:  Kitai Kim; Peggy A Garner-Hamrick; Chris Fisher; Denis Lee; Paul F Lambert
Journal:  J Virol       Date:  2003-12       Impact factor: 5.103

8.  In vitro methylation of the BsuRI (5'-GGCC-3') sites in the E2a region of adenovirus type 2 DNA does not affect expression in Xenopus laevis oocytes.

Authors:  L Vardimon; U Günthert; W Doerfler
Journal:  Mol Cell Biol       Date:  1982-12       Impact factor: 4.272

Review 9.  Stealth technology: how Epstein-Barr virus utilizes DNA methylation to cloak itself from immune detection.

Authors:  Qian Tao; Keith D Robertson
Journal:  Clin Immunol       Date:  2003-10       Impact factor: 3.969

Review 10.  Mammalian DNA (cytosine-5) methyltransferases and their expression.

Authors:  Sriharsa Pradhan; Pierre-Olivier Esteve
Journal:  Clin Immunol       Date:  2003-10       Impact factor: 3.969

