Eight novel hepatitis C virus genomes reveal the changing taxonomic structure of genotype 6.

Hongren Wang¹, Zhiguo Yuan¹, Eleanor Barnes², Manqiong Yuan³, Chunhua Li³, Yongshui Fu⁴, Xueshan Xia⁵, Gang Li¹, Paul N Newton⁶,⁷,⁸, Manivanh Vongsouvath⁶, Paul Klenerman², Oliver G Pybus⁹, Donald Murphy¹⁰, Kenji Abe¹⁰, Ling Lu³.

Abstract

Analysis of partial hepatitis C virus sequences has revealed many novel genotype 6 variants that cannot be unambiguously classified and that obscure the distinctiveness of pre-existing subtypes. To explore this uncertainty, we obtained genomes of 98.0–98.8 % full-length for eight such variants (KM35, QC273, TV257, TV476, TV533, L349, QC271 and DH027) and characterized them using phylogenetic analyses and percentage nucleotide similarities. The first four are closely related phylogenetically to subtype 6k, TV533 and L349 to subtype 6l, QC271 to subtypes 6i and 6j, and DH027 to subtypes 6m and 6n. The first six defined a high-level grouping that comprised subtypes 6k and 6l, plus related strains. The threshold between intra- and inter-subtype diversity in this group was indistinct. We propose that similar results would be seen elsewhere if more intermediate variants like QC271 and DH027 were sampled.

Year: 2012 PMID: 23015745 PMCID: PMC3542719 DOI: 10.1099/vir.0.047506-0

Source DB: PubMed Journal: J Gen Virol ISSN: 0022-1317 Impact factor: 3.891

The hepatitis C virus (HCV) is genetically highly variable and is currently classified into six confirmed genotypes and one provisional genotype. Among them, genotype 6 exhibits the greatest genetic diversity and has been proposed to have an older evolutionary origin than other HCV genotypes (Salemi & Vandamme, 2002). Divergent isolates of genotype 6 have been found exclusively in South-east Asia or among emigrants from there, suggesting that the strains are endemic to that region (Bernier; Mellor; Noppornpanth; Shinji; Stuyver; Simmonds et al., 1996; Thaikruea; Theamboonlers). Taxonomically, as many as 23 subtypes of genotype 6 (6a–6w) have been assigned, and for each at least one full-length genome sequence has been characterized (Kuiken). Whole genome sequences are the gold standard for genetic and evolutionary analysis of HCV and for accurate classification. Measuring the extent of HCV diversity is essential not only for understanding the origin and evolution of HCV, but also for defining new preventive strategies and developing novel therapies and vaccines.

The current HCV nomenclature confirms the designation of genotypes and subtypes based on phylogenetic analysis of full-length genome sequences. In terms of nucleotide identity, a difference of 31–33 % is required to discriminate genotypes, whereas no fixed criterion is proposed for subtypes because they are thought to represent an epidemiological phenomenon associated with their recent spread. However, all currently designated subtypes differ from one another by >15 % at the nucleotide level (Simmonds). Using partial genome sequences, we have previously found a number of novel HCV-6 variants whose nucleotide distances from the currently defined subtypes are around 15 %, making their classification ambiguous.
This ambiguity is reflected in phylogenetic analyses: some subtypes are distinct and separated by long internal branches, whereas others are more closely related and sometimes seem to merge into a single, larger phylogenetic group. Here, we demonstrate this by generating and analysing genome sequences covering 98.0–98.8 % of the full length from six variants related to subtypes 6k and 6l (KM35, QC273, TV257, TV476, TV533 and L349). In addition, we determined such sequences for two other HCV-6 variants (DH027 and QC271) that do not appear to fall within any currently known subtype.

HCV genomes were determined, each from 22–30 overlapping amplicons, for the following 10 strains: KM35, QC273, TV257, TV476, TV533, L349, TV317, TV494, DH027 and QC271. Their lengths ranged from 9412 to 9533 nt, corresponding to the nucleotide numbering of 1 to 9452–9564 in the H77 genome, covering 98.0–98.8 % of the full length. The 5′ UTRs were all 338 nt long, while the 3′ UTRs varied from 23 to 144 nt long. Six isolates (KM35, QC273, TV476, TV533, DH027 and QC271) had their 3′ UTRs amplified through to the poly(U) tract, but for four isolates (TV257, L349, TV317 and TV494) the poly(U) tracts were not obtained. Isolates KM35, QC273 and TV317 each contain a single ORF of 9048 nt. TV257, TV476, TV533, L349, TV494 and QC271 each contain an ORF of 9051 nt, while the ORF of DH027 is 9054 nt long. The sizes of the 10 HCV protein-encoding regions were as follows: core (573 nt/191 aa), E1 (576 nt/192 aa), E2 (1092–1098 nt/364–366 aa), P7 (189 nt/63 aa), NS2 (651 nt/217 aa), NS3 (1893 nt/631 aa), NS4A (162 nt/54 aa), NS4B (783 nt/261 aa), NS5A (1350–1353 nt/450–451 aa) and NS5B (1776 nt/591 aa) (see Table S1, available in JGV Online).

TV317 and TV494 grouped closely with two isolates of subtype 6l: D33 and 537796 (Fig. 1). Since this grouping is unambiguous, the classification of TV317 and TV494 will not be discussed further.
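The reported ORF lengths can be cross-checked against the per-gene sizes by simple addition. The following is a minimal sketch of that arithmetic, not part of the original analysis; the dictionary layout and the function name `orf_length_range` are illustrative assumptions, while the numbers are taken from the text above.

```python
# Gene lengths in nucleotides as reported in the text.
# Each entry is (shortest, longest); only E2 and NS5A vary between isolates.
GENE_NT = {
    "core": (573, 573),
    "E1": (576, 576),
    "E2": (1092, 1098),
    "P7": (189, 189),
    "NS2": (651, 651),
    "NS3": (1893, 1893),
    "NS4A": (162, 162),
    "NS4B": (783, 783),
    "NS5A": (1350, 1353),
    "NS5B": (1776, 1776),
}

# ORF lengths reported for the ten isolates.
REPORTED_ORFS = (9048, 9051, 9054)


def orf_length_range(genes=GENE_NT):
    """Return the (min, max) ORF length implied by summing the gene lengths."""
    shortest = sum(low for low, _ in genes.values())
    longest = sum(high for _, high in genes.values())
    return shortest, longest


if __name__ == "__main__":
    low, high = orf_length_range()
    print(f"ORF length implied by gene sizes: {low}-{high} nt")
    for orf in REPORTED_ORFS:
        in_range = low <= orf <= high
        whole_codons = orf % 3 == 0
        print(orf, "within range:", in_range, "| multiple of 3:", whole_codons)
```

Under these figures, every reported ORF length falls inside the summed range and is a whole number of codons, which is what one would expect if the variation in ORF length comes only from E2 and NS5A.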
Each of the remaining eight variants was compared pairwise with the 54 reference sequences shown in Fig. 1(a). These reference strains represent the 23 subtypes (6a–6w) currently assigned under genotype 6. They included five genomes of subtype 6a, four genomes each of subtypes 6e, 6m, 6n and 6t, three genomes each of subtypes 6f, 6i, 6o, 6u, 6v and 6w, two genomes each of subtypes 6g, 6j and 6l, and one representative each from subtypes 6b, 6c, 6d, 6h, 6k, 6p, 6q, 6r and 6s. When compared to each other, the eight novel variants showed nucleotide similarities of 76.7–83.7 % across the whole genome and of 76.0–83.2 % across the entire ORF (Table S2). When compared to the 54 reference sequences, their nucleotide similarities were 72.2–86.2 % across the whole genome and 71.4–85.7 % across the entire ORF (Table S3). Among the 10 viral genes, core and NS5B showed the highest similarities, whilst P7 and NS2 showed the lowest (Table S4).
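To make the similarity figures above concrete, a pairwise percentage nucleotide similarity can be sketched as the proportion of identical sites between two aligned sequences. This is an illustration only and is not the MEGA5 procedure used in the study; the function name `percent_similarity` and the choice to skip any column containing a gap (pairwise deletion) are assumptions made for this example.

```python
def percent_similarity(seq_a, seq_b, gap_chars="-"):
    """Percentage of identical sites between two aligned nucleotide sequences.

    Assumption: columns in which either sequence has a gap are ignored
    (pairwise deletion). Comparison is case-insensitive.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    identical = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1

    if compared == 0:
        raise ValueError("no comparable sites after removing gaps")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real HCV data.
    print(percent_similarity("ACGTACGT", "ACGTACGA"))
```

A similarity of 85 % on this scale corresponds to the 15 % nucleotide difference that separates the currently designated subtypes, which is why values in the low-to-mid 80s are hard to classify.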

Fig. 1.

Phylogenetic trees estimated from (a) complete nucleotide sequences and (b) predicted amino acid sequences. Each reference HCV sequence is indicated by a subtype name followed by an isolate name. KM35, QC273, TV257, TV476, TV533, L349, DH027 and QC271 represent the eight novel genotype 6 variants completely sequenced in this study and are each indicated with a red circle. TV317 and TV494 are two 6l isolates that were also completely sequenced in this study; each is marked with a green circle. Bootstrap values of ≥70 % are shown in italics. Bars indicate a genetic distance of 0.10 nucleotide or 0.05 amino acid substitutions per site.

Of the eight novel variants, six (KM35, QC273, TV257, TV476, TV533 and L349) were found to be roughly equally similar to subtypes 6k and 6l. The first four (KM35, QC273, TV257 and TV476) are more closely related to 6k (isolate VN405) than to 6l, although they remain somewhat distant from it. These four exhibit nucleotide similarities of 83.2–85.8 % to 6k, and of 80.7–81.4 % to 6l. Conversely, isolates TV533 and L349 exhibit nucleotide similarities of 82.7–86.2 % to 6l, and of 80.5–81.0 % to 6k. Recently, we characterized two variants, KM41 and KM45, that are related to 6k (Lu) and exhibit nucleotide similarities of 83.3–83.4 % to VN405, the prototype isolate of 6k. Likewise, QC271 was roughly equally similar to subtypes 6i and 6j, whilst DH027 was roughly equally similar to subtypes 6m and 6n. QC271 exhibits nucleotide similarities of 85.2–85.5 % to 6j and of 83.0–83.8 % to 6i, whilst DH027 displays nucleotide similarities of 83.9–85.0 % to 6n and of 81.0–81.3 % to 6m. The nucleotide similarities of the genomes described above fall close to the threshold by which different subtypes of HCV are discriminated, making their classification difficult.

A phylogenetic tree was estimated using the obtained genome sequences.
The phylogeny showed that isolates KM35, QC273, TV257 and TV476 formed a loose cluster with VN405, KM41 and KM45. Within this cluster, three subsets can be distinguished. The first contains KM41, KM45 and QC273, the second contains TV257 and TV476, and the third contains KM35 and VN405. Genetic distances among the three subsets (18.2–18.6 %) are comparable to those between subtypes 6f and 6r (19.3–19.8 %), 6i and 6j (18.5–19.4 %) and 6m and 6n (20.8–22.9 %). Isolates TV533 and L349 were loosely grouped in a second cluster with four 6l isolates (537796, D33, TV317 and TV494). Taken together, these two clusters form a larger group that contains 13 isolates related to subtypes 6k and 6l. The internal branch lengths that separate lineages in this group appear smaller than in the remainder of the HCV genotype 6 tree (Fig. 1a).

In addition to subtypes 6k and 6l, there are other well-supported taxonomic groupings above the subtype level: subtypes 6m and 6n cluster strongly together, as do subtypes 6h, 6i and 6j. The isolate DH027 was placed between 6m and 6n, whilst isolate QC271 was placed between 6i and 6j. The addition of DH027 and QC271 clearly interrupts the separation of 6m/6n and 6i/6j (Lu). There was strong bootstrap support for a group comprising subtypes 6k, 6l, 6m, 6n, 6h, 6j, 6i and their related viruses, and all eight novel variants reported here belong to this clade. We estimated a second phylogeny using predicted amino acid sequences (Fig. 1b) and its topology was consistent with the nucleotide phylogeny in Fig. 1(a). Sequences from the ten protein-coding regions were also analysed separately, and similar structures were obtained (data not shown).

It is possible that the phylogenetic tree shape may be affected by recent viral recombination events that occurred between subtypes 6k and 6l, between 6i and 6j, and between 6m and 6n.
To investigate this, pairwise similarity scores were calculated between the eight novel variants and the 54 reference sequences that represent subtypes 6a–6w using the RDP software. In each case, similar plot patterns were observed but no evidence of recent viral recombination events was seen (data not shown).

In this study, HCV genomes of 98.0–98.8 % full-length were determined for eight novel genotype 6 variants (DH027, KM35, L349, QC271, QC273, TV257, TV476 and TV533). All except DH027 and QC271 were classified into a large cluster containing both subtypes 6k and 6l. Each of these six was distant from the prototypic isolates of 6k and 6l. Within this cluster there are several short internal branches above the subtype level; such branches are rare in the rest of the genotype 6 phylogeny, and represent active viral transmission in the distant past. One explanation is that the 6k/6l-related group has been sampled more densely, such that the long internal branches present in other parts of the tree reflect insufficient sampling; the phylogenetic positions of DH027 and QC271 (each of which is roughly equidistant between a pair of subtypes) further support this notion. Other pairs of subtypes that appear to be clearly separated (e.g. 6a/6b, 6c/6d, 6g/6w, 6o/6p, 6q/6t and 6u/6v) may therefore become interrupted and less distinct as further diversity is uncovered. This is likely to be the case once further molecular epidemiology studies of HCV are completed in South-east Asian countries in which there is currently a lack of extensive HCV surveillance. It is interesting to note that a breakdown in subtype distinctiveness has also been described for human immunodeficiency virus type 1 (HIV-1): widespread surveillance and sampling of HIV-1 from central Africa (Vidal) largely eroded the long internal branches that previously had defined highly distinct HIV-1 subtypes (Rambaut).
Analysis of our eight novel variants revealed two features: (i) they are slightly more distinct from subtype prototype sequences than other strains are, making their subtype assignment more difficult; (ii) a larger cluster comprising subtypes 6k, 6l and related viruses exists, representing a more ancient phylogenetic grouping. A similar grouping of 6i/6j and 6m/6n could be defined if more variants like DH027 and QC271 are found. Further groupings of subtypes, specifically 6f/6r and 6a/6b, are strongly suggested by the existence of isolates that appear to be placed between the subtypes in each pair (data not shown); these isolates have yet to be entirely sequenced. We therefore hypothesize that many HCV variants are still unsampled and represent an important missing component of global HCV diversity, within which there may be little or no clear separation of subtypes. If this is the case then there could be an unmanageable profusion of subtype designations in the future.

A total of 10 serum samples were used in this study. KM35 was from a voluntary blood donor and DH027 was from an HIV-1-infected injection drug user; both were originally from Kunming City, Yunnan Province, China (Fu; Xia). Isolates TV257, TV317, TV476, TV494 and TV533 were all from blood donors from Ho Chi Minh City, Vietnam (Pham). L349 was from a patient in Vientiane city, Lao PDR (Laos) (Syhavong; Pybus). QC271 and QC273 were sampled in Quebec, Canada, from individuals originating from Thailand and Cambodia, respectively (Murphy). These samples were selected because our preliminary analyses of their partial core–E1 sequences had shown ambiguous classification between subtypes.

The genome sequence of each HCV isolate was determined from 100 µl of serum using the methods described previously (Li). In brief, RNA was extracted using Tripure (Roche). cDNA was transcribed using AMV reverse transcriptase (Roche) and random hexamers (Promega).
Overlapping fragments were amplified using the Fast Start PCR system (Roche) with the primers listed in Table S5. To avoid PCR false positives, standard precautions were taken (Kwok & Higuchi, 1989). At least one negative control, one positive control and a water blank were included in each of the following steps: RNA extraction, reverse transcription and the first and second rounds of PCR. After PCR, the amplicons were purified using the QIAquick PCR purification kit (Qiagen) according to the manufacturer's protocol. To obtain consensus sequences that reflect the heterogeneity of the viral population within each individual, the purified amplicons were sequenced directly. Sequencing was done in both directions using ABI Prism BigDye 3.0 terminators with an appropriate primer on an ABI Prism 3500 genetic analyser (PE Applied Biosystems). The resulting chromatograms were corrected using SeqMan in the DNASTAR package (DNASTAR Inc.). The finalized sequences were aligned using BioEdit (Tippmann, 2004), followed by manual adjustments and corrections.

Maximum-likelihood phylogenetic trees were estimated using PHYML (Guindon & Gascuel, 2003) under the GTR+I+Γ6 nucleotide substitution model. The transition/transversion rate ratio, the proportion of invariable sites, and the gamma distribution shape parameter were estimated from the alignment. Base frequencies were adjusted to maximize the likelihood. Bootstrap resampling was performed with 500 replicates. For pairwise sequence comparisons, nucleotide similarities were calculated using MEGA5 (Kumar) and genetic distances were obtained from the tree file.

To detect possible virus recombination events, we used RDP3 (Recombination Detection Program, version 3) (Martin).
The program was run under default settings with the following adjustments: (i) window size was set to 40 nt; (ii) the linear sequences option was chosen; (iii) six different methods (RDP, GENECONV, MaxChi, Bootscan, Chimaera and SiScan) were performed simultaneously on the multiple sequence alignment; and (iv) only events detected by more than two methods were listed.
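The last criterion, listing only events detected by more than two methods, amounts to a simple consensus filter. The sketch below illustrates that rule on its own and does not reproduce RDP3 or its output format; the function name `consensus_events`, the `(event_id, method)` input layout and the example event labels are hypothetical.

```python
from collections import defaultdict

# The six detection methods named in the text.
METHODS = ("RDP", "GENECONV", "MaxChi", "Bootscan", "Chimaera", "SiScan")


def consensus_events(detections, min_methods=3):
    """Return event ids supported by at least `min_methods` distinct methods.

    `detections` is an iterable of (event_id, method) pairs (a hypothetical
    layout). "More than two methods" corresponds to min_methods=3.
    """
    support = defaultdict(set)
    for event_id, method in detections:
        if method not in METHODS:
            raise ValueError(f"unknown method: {method}")
        support[event_id].add(method)  # a set ignores repeated calls by one method
    return sorted(e for e, methods in support.items() if len(methods) >= min_methods)


if __name__ == "__main__":
    # Hypothetical detections, for illustration only.
    example = [
        ("event_A", "RDP"), ("event_A", "MaxChi"), ("event_A", "SiScan"),
        ("event_B", "GENECONV"), ("event_B", "Bootscan"),
    ]
    print(consensus_events(example))
```

Requiring agreement among several methods is a common way to reduce false positives, since each method is sensitive to different signals in the alignment.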

References (10 of 27 shown)

1. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood.

Authors: Stéphane Guindon; Olivier Gascuel
Journal: Syst Biol Date: 2003-10 Impact factor: 15.683

2. Analysis for free: comparing programs for sequence analysis.

Authors: Helge-Friedrich Tippmann
Journal: Brief Bioinform Date: 2004-03 Impact factor: 11.622

3. Complete genomic sequences for hepatitis C virus subtypes 6e and 6g isolated from Chinese patients with injection drug use and HIV-1 co-infection.

Authors: Chunhua Li; Yongshui Fu; Ling Lu; Weizhi Ji; Jian Yu; Curt H Hagedorn; Linqi Zhang
Journal: J Med Virol Date: 2006-08 Impact factor: 2.327

4. Unprecedented degree of human immunodeficiency virus type 1 (HIV-1) group M genetic diversity in the Democratic Republic of Congo suggests that the HIV-1 pandemic originated in Central Africa.

Authors: N Vidal; M Peeters; C Mulanga-Kabeya; N Nzilambi; D Robertson; W Ilunga; H Sema; K Tshimanga; B Bongo; E Delaporte
Journal: J Virol Date: 2000-11 Impact factor: 5.103

5. Risk factors for hepatitis C virus infection among blood donors in northern Thailand.

Authors: Lakkana Thaikruea; Satawat Thongsawat; Niwat Maneekarn; Dale Netski; David L Thomas; Kenrad E Nelson
Journal: Transfusion Date: 2004-10 Impact factor: 3.157

6. Analysis of HCV genotypes from blood donors shows three new HCV type 6 subgroups exist in Myanmar.

Authors: Toshiyuki Shinji; Yi Yi Kyaw; Katsunori Gokan; Yasuhito Tanaka; Koji Ochi; Nobuchika Kusano; Takaaki Mizushima; Shin-ichi Fujioka; Hidenori Shiraha; Aye Aye Lwin; Yasushi Shiratori; Masashi Mizokami; Myo Khin; Masayuki Miyahara; Shigeru Okada; Norio Koide
Journal: Acta Med Okayama Date: 2004-06 Impact factor: 0.892

7. The Los Alamos hepatitis C sequence database.

Authors: Carla Kuiken; Karina Yusim; Laura Boykin; Russell Richardson
Journal: Bioinformatics Date: 2004-09-17 Impact factor: 6.937

8. The unique HCV genotype distribution and the discovery of a novel subtype 6u among IDUs co-infected with HIV-1 in Yunnan, China.

Authors: Xueshan Xia; Ling Lu; Kok Keng Tee; Wenhua Zhao; Jianguo Wu; Jing Yu; Xiaojie Li; Yixiong Lin; Muhammad Mahmood Mukhtar; Curt H Hagedorn; Yutaka Takebe
Journal: J Med Virol Date: 2008-07 Impact factor: 2.327

9. Complete genomes for hepatitis C virus subtypes 6f, 6i, 6j and 6m: viral genetic diversity among Thai blood donors and infected spouses.

Authors: Ling Lu; Chunhua Li; Yongshui Fu; Lakkana Thaikruea; Satawat Thongswat; Niwat Maneekarn; Chatchawann Apichartpiyakul; Hak Hotta; Hiroaki Okamoto; Dale Netski; Oliver G Pybus; Donald Murphy; Curt H Hagedorn; Kenrad E Nelson
Journal: J Gen Virol Date: 2007-05 Impact factor: 3.891

10. RDP3: a flexible and fast computer program for analyzing recombination.

Authors: Darren P Martin; Philippe Lemey; Martin Lott; Vincent Moulton; David Posada; Pierre Lefeuvre
Journal: Bioinformatics Date: 2010-08-26 Impact factor: 6.937
