Differences in CpG Island Distribution Between Subgenotypes of the Hepatitis B Virus Genotype.

Lin Chen1, Yi Shi1, Wanrong Yang1, Yafei Zhang1, Qinxiu Xie1, Yunsong Li2, Xu Li1, Jun Li3, Zhenhua Zhang1,3.   

Abstract

BACKGROUND Hepatitis B virus (HBV) genotypes show genomic variations, resulting in different CpG islands in each HBV genotype or subgenotype. This study aimed to establish reference sequences for each HBV subgenotype of genotypes A-H and to analyze the characteristics of the CpG islands. MATERIAL AND METHODS A total of 3,037 whole-genome sequences of HBV genotypes A-H were retrieved from GenBank, and 28 subgenotype reference sequences were established for these genotypes. CpG islands were analyzed in the subgenotype reference sequences and in 939 strains selected from the 3,037 genomic sequences. Differences in CpG islands between subgenotypes were compared using the chi-squared and non-parametric tests. RESULTS Of the 28 subgenotype reference sequences established, 11 lacked CpG island I, and only F4 contained a new CpG island. Of all selected strains, 48.35% (454/939) contained the three traditional CpG islands I, II, and III (no new islands); 45.05% (423/939) lacked CpG island I; 38.98% (366/939) contained only CpG islands II and III; and 12.46% (117/939) contained new islands (subgenotypes A1 and D7 and genotype G had no new islands). The proportions of strains with or without CpG island I, or with new islands, differed significantly between subgenotypes of each HBV genotype (P<0.05). The proportions of strains containing CpG islands I, II, and III and new islands differed significantly among subtypes of HBV genotypes A, C, and F (P<0.05). CONCLUSIONS Different HBV genotypes and subgenotypes had characteristic CpG island patterns. The proportions of strains with or without CpG island I, or with new islands, differed significantly among subgenotypes of each HBV genotype.

Year:  2018        PMID: 30253420      PMCID: PMC6180904          DOI: 10.12659/MSM.910049

Source DB:  PubMed          Journal:  Med Sci Monit        ISSN: 1234-1010


Background

Worldwide, hepatitis B virus (HBV) infection has high rates of morbidity and mortality and represents a serious public health issue, with changes in epidemiological features resulting from factors including migration, genetic variation, and the effects of treatment [1-3]. Because HBV DNA polymerase lacks ‘proofreading’ activity, random nucleotide misincorporation into the replicating DNA strand leads to a high mutation rate, estimated at 1.4–3.2×10⁻⁶ substitutions per site per year in the whole HBV genome [4]. Currently, HBV isolates worldwide have been divided into ten well-accepted genotypes (A to J), based on an inter-genotypic sequence difference of greater than 8% [5-7]. The establishment of representative reference sequences might facilitate studies on HBV infection and its pathogenicity. Certain HBV reference sequences have been reported previously. However, these HBV reference sequences are not truly representative, since they were based either simply on the first identified isolates or on small samples of isolates [8–15]. Considering representativeness and consistency, we previously established reference sequences of HBV genotypes A (A1, A2, A3, A5), B2, and C (C1, C2, C5, C6) on the basis of large numbers of HBV sequences, and deposited them in GenBank under accession codes KP234050–KP234053 (A1, A2, A3, A5) and KM999990–KM999993 (C1, C2, C5, C6) [15-17]. The synthesized consensus genome of subgenotype (subtype) B2 is replication-competent upon transfection into hepatoma cells in vitro and is expressed and replicates in mice [18]. Representative reference sequences of other genotypes still need to be established. Viral gene expression is believed to be partially regulated by DNA methylation, which usually occurs in the promoter region and leads to transcriptional silencing and gene repression via multiple mechanisms in human tissues [19].
DNA methylation frequently occurs in the CpG dinucleotide-rich region known as the CpG island. Currently, six CpG islands, including three newly identified CpG islands (IV, V, and VI), are generally accepted to exist. These CpG islands are located at the transcription start site or upstream and downstream of the promoters [20-22]. Host DNA methyltransferase mediates CpG methylation, and methylation of HBV CpG islands limits viral protein expression [23,24]. Agents promoting methylation of nuclear HBV DNA may constitute a new antiviral treatment modality distinct from nucleotide analog drugs and interferons [25]. This study aimed to establish standard sequences for HBV subtypes and to further clarify CpG-enriched sites in the HBV genome in subtypes of HBV genotypes A–H. The location, length, and distribution of CpG islands in the genome of individual subtypes and their representative strains were investigated. Comparative analysis of CpG islands was performed among the reference sequences and representative strains of HBV genotypes A–H. The data from this study might provide insights into the clinicopathological and virological characteristics of distinct HBV genotypes and subtypes.

Material and Methods

Sequence sources and criteria

The GenBank Nucleotide Database was searched up to December 1, 2013, at the National Center for Biotechnology Information, using the keywords, ‘hepatitis B virus,’ ‘genotype,’ and ‘complete genome.’ Complete genome sizes ranging from 3100–3300 bp were included.

Establishment of hepatitis B virus (HBV) subtype references

Reference sequences of eight subtypes of HBV genotypes C and A were created in 2015 and 2016, respectively [16,17]. Using a similar method, phylogenetic analysis was conducted using the Maximum Likelihood statistical method and the Tamura-Nei model, with 1000 bootstrap replicates, in MEGA version 7 software (Kumar, Stecher, and Tamura 2015). All sequences were compared with known HBV references including A1 (KP234050), Aafr (AF297621), Bj (AB073858), C2 (KM999991), C (AB033556), D (X02496), E (X75657), F (X69798), G (AF160501), and H (AY090454), to determine the subtypes to which they belonged. Finally, 1,680 strains belonging to 20 subtypes of HBV genotypes B and D–H were identified. All strains in each subtype were aligned using the AlignX software, a component of Vector NTI Advance 11.5 (Thermo Fisher Scientific, Waltham, MA, USA), with the ClustalW algorithm. The nucleotide at each position of the reference sequence of each subtype was the nucleotide with the highest frequency at the corresponding position of the alignment. Phylogenetic trees of the whole genome and S gene of the 28 HBV subtype reference sequences were constructed using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) with 1000 bootstrap replicates, in MEGA 7 software.
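The highest-frequency rule described above can be illustrated with a minimal Python sketch. It is not the AlignX workflow used in the study. The function name `consensus`, the toy alignment, the tie-breaking behavior, and the choice to drop columns where a gap is the most frequent symbol are all assumptions made for illustration.

```python
from collections import Counter


def consensus(aligned):
    """Majority-rule consensus of equal-length, pre-aligned sequences.

    Hypothetical helper: gap handling and tie-breaking are illustrative
    assumptions, not the procedure reported in the paper.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*aligned):
        # most_common(1) gives the highest-frequency symbol in this column;
        # ties go to the symbol seen first, which is an arbitrary choice.
        base, _ = Counter(column).most_common(1)[0]
        if base != "-":  # skip columns where a gap is the majority
            out.append(base)
    return "".join(out)


# Toy alignment (made-up data, not HBV sequences).
toy_alignment = ["ACGTAC", "ACGTTC", "ACCTAC", "AC-TAC"]
print(consensus(toy_alignment))
```

In practice the input would be the ClustalW alignment of all strains of one subtype, and ambiguous columns would need an explicit rule.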

Computation of CpG islands

A pool of strains was selected from the 3,037 strains on which the reference sequences were based, and all CpG islands in these strains were identified. To construct this pool, 50 strains were randomly selected from a subtype if that subtype had more than 50 strains, and all strains were selected if a subtype had 50 or fewer strains. In total, 939 strains from 28 HBV subtypes were selected. The CpG islands were computed using two online tools, MethPrimer and CpG Plot. The criteria used to define a CpG island were a GC content of ≥50%, an observed/expected CpG dinucleotide ratio of ≥0.6, and a window size of ≥100 bp [23,26]. Information regarding the size, number, location, and other features of CpG islands in the reference sequences of HBV subtypes and in the selected strains was then collected. CpG islands were classified primarily by their positions in the reference sequences. The identities of CpG islands lying close to one another, separated by a boundary, were determined on the basis of the positions where the major regions of those islands were located.
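The three criteria above can be checked for a single candidate window with a short Python sketch. It assumes the commonly used definition observed/expected = (number of CpG × N) / (number of C × number of G), where N is the window length. The function names `cpg_stats` and `meets_island_criteria` are hypothetical, and the window-scanning and island-merging logic of MethPrimer and CpG Plot is not reproduced here.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for one window."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed definition: observed CpG count over (C count * G count / N).
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def meets_island_criteria(seq, min_len=100, min_gc=0.5, min_oe=0.6):
    """Apply the thresholds stated in the text to a single window."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction >= min_gc and obs_exp >= min_oe


# Made-up windows: CpG-rich, GC-balanced but CpG-free, and too short.
print(meets_island_criteria("CG" * 60))
print(meets_island_criteria("CAGT" * 30))
print(meets_island_criteria("CG" * 40))
```

The second window shows why the observed/expected ratio matters: its GC content is exactly 50%, yet it contains no CpG dinucleotide and so fails the test.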

Statistical analysis

The chi-squared (χ2) test was performed using SPSS version 16.0 (IBM, New York, USA) for the composition of the CpG islands among the various subtypes. Non-parametric tests were performed using SPSS version 16.0 (IBM, New York, USA) for the length and position of each CpG island of each genotype to determine whether CpG islands showed significant diversity between different subtypes of the same HBV genotype. Statistical significance was defined as P<0.05.

Results

Characteristics of the strains of hepatitis B virus (HBV)

In total, 3,037 whole-genome sequences of HBV genotypes A–H met the selection criteria and were included in the analysis. Among them, 1,357 belonged to genotypes A and C, and 1,680 belonged to genotypes B and D–H (Table 1). The geographic distribution is summarized in Supplementary Table 1. We established 28 subtype reference sequences of HBV genotypes A–H and deposited them in GenBank (Table 1). To reduce statistical bias resulting from differences in sample size, no more than 50 strains were taken from each subtype. Therefore, 939 HBV strains were further selected for CpG island analysis, with genome size ranging 3117–3225 bp (A), 3128–3227 bp (B), 3104–3220 bp (C), 3101–3212 bp (D), 3185–3215 bp (E), 3129–3227 bp (F), 3234–3251 bp (G), and 3234–3251 bp (H) (Table 2).
Table 1

The features of strains and references for HBV subgenotypes A–H.

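Genotype | Subgenotype | No. of strains | Size range of strains (bp) | GenBank number of reference | Size of reference (bp) | S (nt) | X (nt) | P (nt) | C (nt) | S (aa) | X (aa) | P (aa) | C (aa)
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
A | A1 | 155 | 3117–3253 | KP234050 | 3221 | 681 | 465 | 2538 | 558 | 226 | 154 | 845 | 185
A | A2 | 231 | 3115–3228 | KP234051 | 3221 | – | – | 2538 | 558 | – | – | 845 | 185
A | A3 | 22 | 3117–3221 | KP234052 | 3221 | – | – | 2538 | 558 | – | – | 845 | 185
A | A5 | 25 | 3215–3226 | KP234053 | 3221 | – | – | 2538 | 558 | – | – | 845 | 185
B | B1 | 24 | 3125–3227 | KP341007 | 3215 | – | – | 2532 | 552 | – | – | 843 | 183
B | B2 | 377 | 3101–3248 | KP341008 | – | – | – | – | – | – | – | – | –
B | B3 | 23 | 3128–3218 | KP341009 | – | – | – | – | – | – | – | – | –
B | B4 | 17 | 3200–3215 | KP341010 | – | – | – | – | – | – | – | – | –
B | B6 | 39 | 3215–3218 | KP341011 | – | – | – | – | – | – | – | – | –
B | B7 | 13 | 3215 | KP341012 | – | – | – | – | – | – | – | – | –
B | B9 | 19 | 3179–3215 | KP341013 | – | – | – | – | – | – | – | – | –
C | C1 | 199 | 3110–3239 | KM999990 | – | – | – | – | – | – | – | – | –
C | C2 | 699 | 3101–3254 | KM999991 | – | – | – | – | – | – | – | – | –
C | C5 | 9 | 3215 | KM999992 | – | – | – | – | – | – | – | – | –
C | C6 | 17 | 3119–3220 | KM999993 | – | – | – | – | – | – | – | – | –
D | D1I | 131 | 3110–3215 | KP322599 | 3182 | – | – | 2499 | – | – | – | 832 | –
D | D1M | 301 | 3101–3191 | KP322600 | 3182 | – | – | 2499 | – | – | – | 832 | –
D | D2 | 156 | 3149–3218 | KP322601 | 3182 | – | – | 2499 | – | – | – | 832 | –
D | D3 | 133 | 3110–3182 | KP322602 | 3182 | – | – | 2499 | – | – | – | 832 | –
D | D5 | 31 | 3119–3188 | KP322603 | 3182 | – | – | 2499 | – | – | – | 832 | –
D | D7 | 33 | 3170–3194 | KP322604 | 3182 | – | – | 2499 | – | – | – | 832 | –
E | E | 198 | 3185–3212 | KX186584 | 3212 | – | – | 2529 | – | – | – | 842 | –
F | F1 | 68 | 3161–3217 | KX264496 | – | – | – | – | – | – | – | – | –
F | F2 | 18 | 3182–3215 | KX264497 | – | – | – | – | – | – | – | – | –
F | F3 | 21 | 3131–3219 | KX264498 | – | – | – | – | – | – | – | – | –
F | F4 | 28 | 3214–3227 | KX264499 | – | – | – | – | – | – | – | – | –
G | G | 28 | 3234–3251 | KX264500 | 3248 | – | – | 2529 | 588 | – | – | 842 | 195
H | H | 22 | 3187–3218 | KX264501 | – | – | – | – | – | – | – | – | –

D1I is D1 India; D1M is D1 Middle East. S, X, P, and C denote the S, X, P, and C regions of the HBV DNA genome; nt, nucleotide size of the reference (bp); aa, amino acid size of the reference. ‘–’ The same as the number before the first occurrence of the symbol in this column.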

Table 2

The CpG islands’ features in strains of HBV subgenotypes A–H.

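Genotype | Subgenotype | Selected strains (No.) | CGI No. of each strain | Lack of CGI I (ratio) | Lack of CGI I (number of strains) | Strains containing 3 conventional CGIs | Ratio of new CGIs | New CGIs
--- | --- | --- | --- | --- | --- | --- | --- | ---
A | A1 | 50 | 2–3 | 34% | 17(0*) | 33(0*) | 0 | –
A | A2 | 50 | 2–4 | 20% | 10(1) | 40(2) | 6% | IV
A | A3 | 22 | 2–3 | 82% | 18(2) | 4(0) | 9% | IV
A | A5 | 25 | 2–3 | 76% | 19(1) | 6(0) | 4% | VI
B | B1 | 24 | 2–4 | 4% | 1(0) | 23(3) | 13% | IV, V
B | B2 | 50 | 2–4 | 10% | 5(2) | 45(5) | 14% | IV
B | B3 | 23 | 2–4 | 22% | 5(3) | 18(7) | 43% | IV, V
B | B4 | 17 | 2–4 | 35% | 6(1) | 11(1) | 12% | IV
B | B6 | 39 | 2–5 | 3% | 1(0) | 38(14) | 36% | IV, V
B | B7 | 13 | 3–4 | 0 | 0 | 13(5) | 38% | IV, VI
B | B9 | 19 | 2–4 | 16% | 3(0) | 16(5) | 26% | IV
C | C1 | 50 | 2–4 | 92% | 46(4) | 4(1) | 10% | IV, V, VI
C | C2 | 50 | 2–4 | 80% | 40(4) | 10(3) | 14% | IV, V, VI
C | C5 | 9 | 2–3 | 100% | 9(1) | 0 | 11% | IV
C | C6 | 17 | 2–4 | 12% | 2(1) | 15(1) | 12% | VI
D | D1I | 50 | 2–4 | 22% | 11(1) | 39(4) | 10% | IV
D | D1M | 50 | 2–4 | 22% | 11(0) | 39(2) | 4% | IV
D | D2 | 50 | 2–4 | 48% | 24(1) | 26(3) | 8% | IV
D | D3 | 50 | 2–3 | 14% | 7(1) | 43(0) | 2% | IV
D | D5 | 31 | 3–4 | 3% | 1(1) | 30(3) | 13% | IV, V
D | D7 | 33 | 1–3 | 42% | 14(0) | 18(0) | 0 | –
E | E | 50 | 2–4 | 16% | 8(3) | 42(2) | 10% | IV
F | F1 | 50 | 2–3 | 100% | 50(1) | 0 | 2% | V
F | F2 | 18 | 2–3 | 100% | 18(1) | 0 | 6% | IV
F | F3 | 21 | 2–3 | 100% | 21(1) | 0 | 5% | V
F | F4 | 28 | 2–3 | 100% | 28(22) | 0 | 79% | V
G | G | 28 | 2–3 | 93% | 26(0) | 2(0) | 0 | –
H | H | 22 | 2–4 | 100% | 22(4) | 0 | 18% | IV, V

CGI – CpG island.

* Number of strains containing new CpG islands (values in parentheses). ‘–’ Absence of new CpG island.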

Genome lengths differed among the 28 reference sequences. However, the lengths of the S and X genes were the same, with 681 and 465 bases coding for 226 and 154 amino acids, respectively (Table 1). The phylogenetic relationships of the whole genomes and S genes among the 28 reference sequences are shown in Figure 1A. The phylogenetic relationship of the S region is very similar to that of the whole genome, but with some minor differences. The clustering pattern was similar to that reported previously [21].
Figure 1

Phylogenetic trees and CpG island distribution among hepatitis B virus (HBV) subtype reference sequences. (A) Phylogenetic tree diagram for the whole genome and S gene. (B) CpG island distribution. The horizontal axis denotes HBV genome sequence. The vertical blue strips indicate CpG islands I, II, V, and III, respectively. Vertical red bars below the horizontal axis indicate CpG dinucleotides, and the more intensive the red bars, the higher the enrichment of CpG dinucleotides.

Characteristics of the CpG islands

Of 28 reference sequences, each had 2–3 CpG islands. Among them, 17 (60.71%) contained three traditional CpG islands I, II, and III, while 11 other subtypes (39.29%) lacked a CpG island I (Table 3, Figure 1B). Only subtype F4 contained a new island, CpG island V (located at sites 1933–2036). In most of the reference sequences, the position and size of CpG islands II and III were similar. However, significant differences existed in those of CpG island I (Table 3).
Table 3

The number, size and position of CpG Islands in HBV subgenotype references.

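Reference genotype | No. of CGIs | CGI I size (bp) | CGI I position | CGI II size (bp) | CGI II position | CGI III size (bp) | CGI III position
--- | --- | --- | --- | --- | --- | --- | ---
A1 | 3 | 186 | 99–284 | 417 | 1247–1663 | 161 | 2282–2442
A2 | 3 | 101 | 185–285 | 436 | 1228–1663 | 156 | 2294–2449
A3 | 2 |  |  | 436 | 1228–1663 | 144 | 2299–2442
A5 | 2 |  |  | 436 | 1228–1663 | 146 | 2294–2439
B1 | 3 | 154 | 112–265 | 451 | 1221–1671 | 165 | 2298–2462
B2 | 3 | 101 | 112–212 | 431 | 1221–1664 | 123 | 2333–2455
B3 | 3 | 103 | 110–212 | 450 | 1223–1672 | 156 | 2300–2455
B4 | 3 | 153 | 115–267 | 495 | 1178–1672 | 155 | 2300–2454
B6 | 3 | 176 | 111–286 | 455 | 1211–1665 | 146 | 2298–2443
B7 | 3 | 175 | 112–286 | 445 | 1228–1672 | 156 | 2300–2455
B9 | 3 | 103 | 110–212 | 449 | 1223–1671 | 158 | 2300–2457
C1 | 2 |  |  | 425 | 1247–1671 | 167 | 2280–2446
C2 | 2 |  |  | 451 | 1215–1665 | 149 | 2298–2446
C5 | 2 |  |  | 518 | 1215–1732 | 212 | 2234–2445
C6 | 3 | 158 | 76–233 | 484 | 1247–1730 | 163 | 2294–2456
D1I | 3 | 101 | 186–286 | 444 | 1228–1671 | 158 | 2289–2446
D1M | 3 | 101 | 186–286 | 441 | 1228–1668 | 171 | 2276–2446
D2 | 3 | 101 | 186–286 | 441 | 1228–1668 | 171 | 2276–2446
D3 | 3 | 103 | 184–286 | 434 | 1228–1667 | 167 | 2280–2446
D5 | 3 | 101 | 186–286 | 437 | 1228–1664 | 183 | 2276–2458
D7 | 3 | 101 | 186–286 | 418 | 1239–1667 | 109 | 2334–2442
E | 3 | 100 | 184–283 | 428 | 1240–1667 | 123 | 2334–2456
F1 | 2 |  |  | 421 | 1242–1669 | 158 | 2298–2455
F2 | 2 |  |  | 410 | 1242–1651 | 162 | 2300–2461
F3 | 2 |  |  | 423 | 1243–1667 | 124 | 2335–2458
F4 | 3 | 104* | 1933–2036* | 416 | 1242–1665 | 160 | 2299–2458
G | 2 |  |  | 456 | 1163–1628 | 145 | 2350–2494
H | 2 |  |  | 620 | 1106–1728 | 118 | 2336–2453

CGI – CpG island.

* This is CpG island V. Blank cells indicate the absence of a CpG island. Position 1 is the first T of the EcoRI cleavage site, with genotypes B/C used as the standard sequences.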

Of the 939 selected strains, each strain contained 1–5 CpG islands. Furthermore, 48.35% (454/939) of strains included only CpG islands I, II, and III, but no new islands and 12.46% (117/939) contained new islands (Table 2). Among selected strains, only one strain contained a single CpG island (CpG island II, D7, FJ90442), 366 (38.98%) contained only CpG islands II and III, 62 (6.60%) contained four CpG islands, and only one strain contained five CpG islands (CpG islands I–V, B6, DQ463802). The number of strains containing CpG islands I–VI was 515, 939, 938, 65, 47, and 8, respectively. The distribution of CpG islands I–III is shown in Table 4 and Figure 2. A non-parametric test (rank sum test) was performed for the start sites, end sites, and lengths of CpG islands I, II, and III, respectively, between different subtypes of the same HBV genotype (A–D and F). Differences between CpG island I of each subtype of C (start sites, end sites, and lengths), and between the length of CpG island II in each subtype of A were not significant. All other sites displayed a significant difference (P<0.05) (Supplementary Table 2).
Table 4

CpG islands distribution characteristics in selected sequences.

CGI | Start position | End position | Length
--- | --- | --- | ---
I | 184* (76–187#) | 285 (211–304) | 102 (100–193)
II | 1228 (1109–1248) | 1667 (1624–1725) | 439 (406–560)
III | 2298 (2234–2349) | 2453 (2405–2492) | 157 (108–200)
IV | 330 (257–557) | 436 (374–659) | 112 (100–226)
V | 1933 (1921–1945) | 2036 (2024–2054) | 104 (100–124)
VI | 2877 (2800–2891) | 2986 (2942–2991) | 105 (100–166)

CGI – CpG island.

* Median; # the range of percentiles 2.5–97.5.

Figure 2

Frequency distribution of the start and end sites and the size of CpG islands in selected strains of hepatitis B virus (HBV). The left and middle panels show location frequency distribution of the start and end of each CpG island, respectively. The right panel shows the frequency distribution of the size of the CpG island. The horizontal axis refers to the site or size of CpG islands. The vertical axis refers to the frequency. Different colors represent different hepatitis B virus (HBV) genotypes. Annotations on the histogram indicate the major sites of individual subtypes (criteria: number of strains ≥20 and accounts for ≥50% of all strains from the corresponding subtype). The class interval of the start and end sites for CpG island IV are 50 nt and 20 nt for other CpG islands. N is the sum of all events.

CpG islands I and IV had a wide range of start and end sites, and overlaps frequently occurred between the start and end sites of these CpG islands among different strains. The standard deviation (SD) values of the start sites of CpG islands I–VI, calculated using SPSS version 16.0, were 42.66, 34.49, 27.44, 88.39, 6.01, and 38.87, respectively. Based on the range and SD of the start site, CpG islands I and IV showed the highest dispersion, consistent with the distribution of CpG island I among different reference sequences. The start site of CpG island IV did not differ significantly between strains with and without CpG island I in a rank sum test (P=0.413). Because CpG island IV was downstream of CpG island I, a similar analysis was performed comparing the start sites of CpG island IV with the end sites of CpG island I in all strains containing CpG island IV, which showed a significant difference (P<0.001). Therefore, CpG islands I and IV were considered not to be the same island.

The absence of CpG island I

Among the 939 selected strains, 423 (45.05%) lacked CpG island I. Specifically, CpG island I was lacking in 100% of strains of subtypes C5 and F1–F4 and genotype H, in more than 75% of strains of subtypes A3, A5, C1, and C2 and genotype G, and in no more than 48% of strains of the other (sub)genotypes. B7 was the only subtype in which all strains contained CpG island I (Table 2) (Figure 3A). With respect to genotype, CpG island I was lacking in 43.54% (A), 11.35% (B), 76.98% (C), 25.76% (D), 16% (E), 100% (F), 92.86% (G), and 100% (H) of strains. Overall, CpG island I was most frequently absent in genotypes C, F, G, and H. Strains with new islands accounted for 13.24% (56/423) of those without a CpG island I, and 11.82% (61/516) of those with a CpG island I. A chi-squared test showed that this difference was not statistically significant (P=0.513).
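The last comparison above (56/423 versus 61/516) can be checked with a minimal Python sketch of a Pearson chi-squared test on a 2×2 table. The study used SPSS; the assumption here is that the reported value corresponds to the uncorrected Pearson statistic with one degree of freedom. The function name `chi2_2x2` is hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for [[a, b], [c, d]].

    Assumption: uncorrected test with 1 degree of freedom; the exact SPSS
    option used in the study is not stated in the text.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(stat/2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: strains without / with CpG island I.
# Columns: strains with / without new islands.
stat, p = chi2_2x2(56, 423 - 56, 61, 516 - 61)
print(f"chi2 = {stat:.3f}, P = {p:.3f}")
```

Under these assumptions the computed P-value rounds to 0.513, in agreement with the value reported in the text.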
Figure 3

Characteristics of CpG islands in selected strains from different subtypes of hepatitis B virus (HBV) (A) Percentages of hepatitis B virus (HBV) strains with differences in CpG island I, or new island status between different subtypes. CGI represents the CpG island. Green and blue colors represent strains lacking CpG island I. Orange and brown colors represent strains containing CpG island I. Blue and orange colors represent the presence of new islands. (B) The frequency composition of strains containing CpG islands I, II, III, and new islands, respectively, in each HBV subtype or genotype. The P-value represents the chi-squared analysis of the composition ratios between the subtypes (within the square brackets).

Characteristics of new CpG islands

Of the 939 selected strains, 117 (12.46%) contained 120 new CpG islands. The percentages of strains containing new CpG islands were 4.08% (6/147), 24.86% (46/185), 11.9% (15/126), 6.06% (16/264), 10% (5/50), 21.37% (25/117), 0%, and 18.18% (4/22) in HBV genotypes A–H, respectively. Subtypes A1 and D7 and genotype G displayed no new CpG islands (Table 2, Figure 3). Sixty-five strains of HBV genotypes A–F and H contained CpG island IV. Among genotypes B, C, D, F, and H, 47 strains contained CpG island V, primarily distributed in B6 (12/47) and F4 (22/47). Eight strains from genotypes A–C contained CpG island VI (two in genotypes A5 and B7, six in genotype C). GQ924641 (B3, no CpG island I), DQ463802 (B6, having CpG island I), and AB516395 (H, no CpG island I) each contained two new islands: CpG islands IV and V. The distribution, median, and 95% range of the start site, end site, and length of CpG islands IV, V, and VI are shown in Table 4 and Figures 2 and 4.
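The per-genotype percentages above can be re-derived from the stated counts with a short Python sketch. The counts are taken from the text, with the genotype G total of 28 taken from Table 2. The dictionary name `new_island_counts` and the helper `percent` are hypothetical.

```python
# (strains with new CpG islands, selected strains) per genotype.
new_island_counts = {
    "A": (6, 147), "B": (46, 185), "C": (15, 126), "D": (16, 264),
    "E": (5, 50), "F": (25, 117), "G": (0, 28), "H": (4, 22),
}


def percent(k, n):
    """Percentage of k out of n, rounded to two decimals."""
    return round(100 * k / n, 2)


for genotype, (k, n) in new_island_counts.items():
    print(f"{genotype}: {k}/{n} = {percent(k, n)}%")

# Overall totals across genotypes A-H.
total_new = sum(k for k, _ in new_island_counts.values())
total_strains = sum(n for _, n in new_island_counts.values())
print(f"All: {total_new}/{total_strains} = {percent(total_new, total_strains)}%")
```

The totals sum to 117 of 939 strains, matching the overall figure of 12.46%.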
Figure 4

Distribution of CpG islands in selected strains from different subtypes of hepatitis B virus (HBV). The green rectangular arrows show the four open reading frames of the C, P, S, and X regions of the HBV genome. The green rectangles around the horizontal axis represent promoters (Cp, Sp1, Sp2, and Xp) and enhancers (Enh I and Enh II). Nucleotide positions are numbered according to the reference sequence of HBV genotype C.

Since the differences in CpG island I and new islands were greater than those in CpG islands II and III among each subtype, the chi-squared analysis was performed for the composition ratios of strains with or without CpG island I or new islands between subtypes of the same HBV genotype in this study (Figure 3A). The strains with CpG islands I, II, and III, and new islands among subtypes of each genotype were also analyzed via a similar chi-squared test (Figure 3B). Figure 3A shows that the composition of strains with or without CpG island I or new CpG islands in each subtype of the same genotype was significantly different (P<0.05). However, there was no significant difference between some subtypes (P>0.05). The information provided in Figure 3B is not the same as in Figure 3A. It is evident that there were no significant differences in the composition of strains containing CpG islands I, II, and III and new islands among the subtypes of HBV genotypes B and D (P>0.05), while genotypes A, C, and F showed the opposite trend (P<0.05).

Discussion

Methylation is a major form of epigenetic modification of genomic DNA, which serves as an important means for functional regulation of the genome and is believed to be involved in the cellular resistance to viral DNA invasion into the nucleus [27]. CpG island methylation of hepatitis B virus (HBV) DNA has been shown to play important roles in regulating the adaptability of the virus, silencing transcription, and down-regulating viral replication [23,28-30]. DNA methylation is related to the specific location of CpG in mammals [31], and the distribution of CpG islands might affect HBV genome methylation. Also, studies have shown that CpG islands of HBV DNA are divergent among different genotypes, which might partially account for the difference in clinical outcome among different HBV genotypes [21]. The absence of CpG island I is common in genotypes A, C, F, G, and H, and new CpG islands are common in genotypes B, C, E, F, and H, but rare in A, D, and G [21,22]. However, the findings of the present study showed that there were significant differences between specific subtypes, and that subtypes within the same genotype may each possess a different CpG island distribution (Figure 1B). The results of the chi-squared test also showed that there were significant differences in CpG island composition among the various subtypes of the same genotype. For example, the absence rates of CpG island I were very high in subtypes C1, C2, and C5, but very low in subtype C6. The CpG island I absence rate in genotype B ranged from 0 (B7) to 35% (B4), and in genotype D it ranged from 3% (D5) to 48% (D2). Therefore, the CpG island characteristics of HBV subtypes may be more informative than those of its genotypes. Previously, the synthesized genome of the B2 subtype has been shown to be fully replication-competent by in vitro transfection into hepatoma cells, in addition to expression and replication in mice [18].
Therefore, the large number of features of HBV subtype reference sequences and CpG islands obtained in this study may provide a more substantial theoretical basis for HBV methylation research. Previous studies that compared infections caused by HBV genotypes B and C found that infections with genotype B resulted in earlier hepatitis B e antigen (HBeAg) seroconversion, whereas infections with genotype C carried an increased risk of developing cirrhosis and hepatocellular carcinoma (HCC) [32-34]. Other studies have speculated that this may be related to the increased absence rate of CpG island I in HBV genotype C [21]. According to previously published data and the findings of this study, HBV genotypes B and C are mostly distributed in China, and mainly include the B2, C1, and C2 subtypes (Supplementary Table 1). Consistent with previous findings, the present study showed that the rate of absence of CpG island I in HBV subtypes C1 and C2 was greater than that in subtype B2. Therefore, it is possible to infer that infection with HBV subtypes A3 and A5 and genotypes F, G, and H may result in clinical outcomes similar to those of genotype C, with a higher risk of developing cirrhosis and hepatocellular carcinoma (HCC) (Figure 3A). Two previously published clinical studies have shown an association between HBV genotype F and severe liver disease and HCC, and have shown that the risk of developing HCC associated with HBV infection was significantly greater for genotype F than for genotypes A–D [35,36]. However, the function of CpG island I remains unclear, and how the absence of CpG island I facilitates the development of cirrhosis and HCC in patients with HBV infection requires further research. The data obtained in the present study showed that CpG islands I–III differed significantly among subtypes in almost all HBV genotypes (P<0.05), except for CpG island I of genotype C (Supplementary Table 2).
Figure 2 illustrates the same phenomenon, in which the start sites of HBV subtype B2 and B6 strains are concentrated around the same site, while the end sites are scattered. This phenomenon also appears widely in other CpG islands of other HBV subtypes, and the relationship between these occurrences and the different outcomes of infection with different HBV genotypes, or with the same genotype, remains unclear. This study had several limitations. All data analyzed were downloaded from GenBank, and the number of strains in some HBV genotypes and subtypes was small. Also, this study was a theoretical analysis of the differences between the various genotypes in CpG islands, which are potential methylation sites, and the possible impact of the findings on clinical outcomes was not specifically studied. Although the CpG islands of HBV strains from a specific subtype were quite different, the major distribution tendency of the start and end sites and the lengths of CpG islands were consistent with those of the corresponding subtype reference sequences. This finding indicated that the reference sequences used in this study were reliable, representative of the general characteristics of strains from the respective HBV subtypes, and suitable to serve as models and tools for evaluation of subtype-specific CpG islands as possible targets for methylation. Importantly, these sequences could be used to investigate the potential roles of HBV DNA methylation in the clinical course of HBV infection.

Conclusions

The present study established 28 subtype reference sequences of HBV genotypes A–H. The composition of strains with or without CpG island I, or new islands between different subtypes within the same HBV genotype, showed significant differences. CpG islands I–III among different subtypes in almost all HBV genotypes showed significant differences, except for the CpG island I in genotype C. The findings of this study might provide a foundation for further studies on the role of HBV DNA methylation in determining subtype-specific HBV viral biology, immune reactions against HBV infection, the clinical course of HBV infection, and drug sensitivity.
Supplementary Table 1

The main geographic distribution of each HBV subgenotype strains.

RegionA1A2A3A5B1B2B3B4B6B7B9C1C2C5C6D1ID1MD2D3D5D7EF1F2F3F4GH
Asia4047243622315313191856949179821783693149
 India1662285331
 Japan402127
 China27670605
 Indonesia14131117
 Vietnam12
 Thailand55
 Iran122
 Syria60
Europe31591102621284623910
 Belgium6915
 Poland45
 Turkey84
 Russia16
 Sweden14
Africa64102143211733198
 South Africa38
 Cameroon13
 Tunisia1633
 Guinea70
 Niger66
America481521536531018681821281413
 Haiti3621
 Canada26
 Greenland10
 Argentina1717
 Chile30
 United States16

This table lists only countries with at least 10 strains that account for ≥20% of the total number for that continent. Bold numbers indicate the total number of strains of that subgenotype on that continent. China includes mainland China and Hong Kong, but not the Taiwan region.

Supplementary Table 2

The P values of non-parametric tests of CpG islands among different subtypes of HBV genotypes.

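Genotype | CGI I starting site | CGI I ending site | CGI I length | CGI II starting site | CGI II ending site | CGI II length | CGI III starting site | CGI III ending site | CGI III length
--- | --- | --- | --- | --- | --- | --- | --- | --- | ---
A | 0.000 | 0.002 | 0.000 | 0.000 | 0.029 | 0.547 | 0.025 | 0.000 | 0.000
B | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
C | 0.506 | 0.323 | 0.460 | 0.000 | 0.000 | 0.004 | 0.000 | 0.001 | 0.000
D | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
F |  |  |  | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000

CGI – CpG island. Blank cells indicate the absence of a CpG island.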

  36 in total

Review 1.  Epigenomics: beyond CpG islands.

Authors:  Melissa J Fazzari; John M Greally
Journal:  Nat Rev Genet       Date:  2004-06       Impact factor: 53.242

Review 2.  Epigenetics and gene expression.

Authors:  E R Gibney; C M Nolan
Journal:  Heredity (Edinb)       Date:  2010-05-12       Impact factor: 3.821

Review 3.  DNA methylation and human disease.

Authors:  Keith D Robertson
Journal:  Nat Rev Genet       Date:  2005-08       Impact factor: 53.242

4.  A new genotype of hepatitis B virus: complete genome and phylogenetic relatedness.

Authors:  L Stuyver; S De Gendt; C Van Geyt; F Zoulim; M Fried; R F Schinazi; R Rossau
Journal:  J Gen Virol       Date:  2000-01       Impact factor: 3.891

5.  In vitro and in vivo replication of a chemically synthesized consensus genome of hepatitis B virus genotype B.

Authors:  Zhenhua Zhang; Jianbo Xia; Binghu Sun; Yu Dai; Xu Li; Joerg F Schlaak; Mengji Lu
Journal:  J Virol Methods       Date:  2014-11-27       Impact factor: 2.014

6.  Genotype H: a new Amerindian genotype of hepatitis B virus revealed in Central America.

Authors:  Patricia Arauz-Ruiz; Helene Norder; Betty H Robertson; Lars O Magnius
Journal:  J Gen Virol       Date:  2002-08       Impact factor: 3.891

Review 7.  Hepatitis B virus genotypes.

Authors:  Anna Kramvis; Michael Kew; Guido François
Journal:  Vaccine       Date:  2005-03-31       Impact factor: 3.641

8.  A genetic variant of hepatitis B virus divergent from known human and ape genotypes isolated from a Japanese patient and provisionally assigned to new genotype J.

Authors:  Kanako Tatematsu; Yasuhito Tanaka; Fuat Kurbanov; Fuminaka Sugauchi; Shuhei Mano; Tatsuji Maeshiro; Tomokuni Nakayoshi; Moriaki Wakuta; Yuzo Miyakawa; Masashi Mizokami
Journal:  J Virol       Date:  2009-07-29       Impact factor: 5.103

9.  Evidence that methylation of hepatitis B virus covalently closed circular DNA in liver tissues of patients with chronic hepatitis B modulates HBV replication.

Authors:  Yanhai Guo; Yongnian Li; Shijie Mu; Ju Zhang; Zhen Yan
Journal:  J Med Virol       Date:  2009-07       Impact factor: 2.327

10.  [Establishment of reference sequences of hepatitis B virus genotype B and C in China].

Authors:  Zhen-hua Zhang; Ling Zhang; Meng-ji Lu; Dong-liang Yang; Xu Li
Journal:  Zhonghua Gan Zang Bing Za Zhi       Date:  2009-12