The classification of hepatitis C virus (HCV) genotypes is of clinical importance as it may help to predict drug therapy responses and estimate treatment duration. The classical method of HCV subgenotype classification is whole genome sequencing (WGS). However, the high cost and time-consuming nature of WGS limit its use in clinical practice. A number of studies have been conducted to confirm whether specific regions of HCV could replace WGS in the classification of HCV subgenotypes. In the present study, we used the HCV database to select HCV sequences from different countries. The neighbor-joining method was used to construct phylogenetic trees based on different regions of HCV (core, E1, E2 and NS5B), to confirm which region could replace WGS in subgenotype classification. Our results indicated that the core, E1 and E2 regions could not be used to classify the HCV subgenotype correctly (core failed to separate subgenotypes 1c and 1a, E1 failed to discriminate between subgenotypes 1a and 1b, and E2 failed to separate subgenotypes 1a and 1c). The NS5B region provided the correct subgenotype classification. HCV samples (n=153) collected from patients in Sichuan province (Southwest China) were sequenced and classified based on the NS5B region. The results indicated that the major subgenotype of HCV in patients from Sichuan was 1b (51.6%, n=79); other subgenotypes included 3b (30.1%, n=46), 3a (7.8%, n=12), 6a (8.5%, n=13), 2a (n=2) and 6n (n=1). The data from our analysis may prove to be helpful in future epidemiological investigations of HCV, and may aid in the prevention and clinical treatment of HCV.
Hepatitis C virus (HCV) is a single-stranded RNA virus that belongs to the Flaviviridae family. The HCV genome is approximately 9.6 kb in length, with 5′ and 3′ non-coding regions and a single open reading frame (1). The high mutation frequency of HCV during viral replication is due to the lack of proofreading activity of its RNA polymerase, and this contributes to the genotypic diversity of HCV. There are 6 genotypes and >90 subgenotypes of HCV with regional differences (2). The geographical distribution of the main HCV genotypes is as follows (3): 1a in the USA, 2a and 2b in North America and Japan, 3 in India, 4 in the Middle East and North Africa, 5 and 6 in Hong Kong, and 1b and 2a in China (4–10). Differences in routes of HCV transmission associated with regional genotypes may aid in epidemiological investigations and source tracing (11).
HCV genotypes greatly influence the effects of antiviral therapy and the sustained virological response (SVR) (12). HCV genotyping is therefore important for designing an antiviral therapy plan and predicting its effects. Pegylated interferon (PEG-IFN)-α in combination with ribavirin (RBV) is the standard therapy for chronic HCV infection. Previous studies have indicated that patients with genotypes 2 and 3 have a higher SVR to PEG-IFN-α/RBV therapy than patients with genotype 1 (13,14).
The differences between HCV genotypes contribute significantly to immune escape and sustained infection, and have greatly impeded the development of HCV vaccines (15). Thus, the identification of HCV genotypes and subgenotypes in different regions is critical to epidemiological investigation, diagnosis, vaccine development and clinical therapy. There are currently 5 methods used for the identification of the HCV genotype: nucleotide sequence analysis (16), specific primer amplification (17), probe hybridization (18), restriction fragment length polymorphism (19) and phylogenetic analysis (20).
According to the consensus rules for HCV genotype nomenclature published in 2005 (20), phylogenetic analysis is considered the most accurate method for the identification of HCV genotypes. In this approach, PCR amplification, fragment sequencing and phylogenetic analysis are conducted on specific regions of the HCV genome to identify HCV genotypes and subgenotypes.
In the present study, sequences from the core, E1, E2 and NS5B regions reported previously (21) were selected and examined by phylogenetic analysis to confirm whether subgenotype classification based on specific regions could replace whole genome analysis in the identification of HCV genotypes. The established region-based classification was then used in the analysis of sequences from patients in Sichuan province.
Materials and methods
Construction of phylogenetic trees based on sequences from the HCV sequence database
The sequences of the core, E1, E2 and NS5B regions and of the whole HCV genome, which have been reported in different countries, were selected from the HCV sequence database (http://hcv.lanl.gov/content/sequence/NEWALIGN/align.html). All sequences were compared and phylogenetic trees were constructed by the neighbor-joining method using MEGA 5.0; the reliability of the trees was evaluated by the bootstrap method with 1,000 replications.
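The neighbor-joining step can be illustrated with a minimal Python sketch. This is not the MEGA 5.0 implementation: the function name `neighbor_joining`, the string-based node labels and the 5-taxon distance matrix are assumptions made purely for illustration, and the toy distances are not derived from any HCV sequence. The sketch repeatedly joins the pair of nodes that minimizes the standard Q criterion and records the branch lengths of each join.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return the joins (node_a, node_b, len_a, len_b) of an unrooted NJ tree.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the pairwise distance.
    """
    nodes = list(names)
    d = dict(dist)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all other current nodes.
        total = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Choose the pair with the smallest Q value.
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - total[p[0]] - total[p[1]],
        )
        dab = d[frozenset((a, b))]
        # Branch lengths from a and b to the new internal node.
        len_a = 0.5 * dab + (total[a] - total[b]) / (2 * (n - 2))
        len_b = dab - len_a
        new = f"({a},{b})"
        joins.append((a, b, len_a, len_b))
        # Distances from the new internal node to every remaining node.
        for k in nodes:
            if k not in (a, b):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((a, k))] + d[frozenset((b, k))] - dab
                )
        nodes = [k for k in nodes if k not in (a, b)] + [new]
    # Connect the last two nodes with a single branch.
    a, b = nodes
    joins.append((a, b, d[frozenset((a, b))], 0.0))
    return joins


# Toy distance matrix (hypothetical values, for illustration only).
toy_names = ["a", "b", "c", "d", "e"]
toy_matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
toy_dist = {
    frozenset((toy_names[i], toy_names[j])): toy_matrix[i][j]
    for i in range(5)
    for j in range(i + 1, 5)
}
example_joins = neighbor_joining(toy_names, toy_dist)
# First join: a and b, with branch lengths 2.0 and 3.0.
```

In practice the distance matrix would come from an alignment of real sequences, and bootstrap support would be estimated by resampling alignment columns and rebuilding the tree for each replicate.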
Samples
A total of 153 blood samples were obtained from HCV-positive patients with HCV RNA >10⁴ IU/ml at Chengdu Infectious Diseases Hospital, the Affiliated Hospital of Luzhou Medical College or the Second People's Hospital of Yibin (all in Sichuan, Southwest China). All patients provided written informed consent. Patients with hepatitis A, hepatitis B or autoimmune hepatitis were excluded from this study.
RNA extraction and RT-PCR
The Hepatitis C virus nucleotide quantitative detection kit (20131201/1; Qiagen Shenzhen Co., Ltd., Shenzhen, China) was used to isolate and measure the HCV RNA levels. HCV RNA was amplified using a reverse transcription kit (AK5401; Takara Biotechnology Co., Ltd., Dalian, China) according to the manufacturer's instructions. The primers used and the amplification procedure were as previously described (11). Sequencing of the target fragments was performed by BioSune Biotechnology Co., Ltd. (Shanghai, China).
HCV sequence analysis
All HCV sequences were edited using BioEdit software. To avoid potential laboratory errors, both the Basic Local Alignment Search Tool (BLAST) and phylogenetic analysis were used to identify HCV genotypes. All the nucleotide sequences obtained in this study were screened using the online BLAST search tool (http://blast.ncbi.nlm.nih.gov/Blast.cgi) to search for similarities to previously reported sequences in the database. HCV genotypes and subgenotypes were classified according to the consensus proposals for a unified system of nomenclature of hepatitis C virus genotypes (20). The phylogenetic tree was constructed using MEGA 5.0 software. The reference sequences were: 1b: CN.BJ.HQ639947; CN.BJ.JX961151; CN.HN.JX961093; CN.js.JQ303617; CN.js.JQ303531; CN.KC844051; CN.YN.FJ462981; JP.x.JT.D11355; 1b: USA.EU256090; 2a: JP.x.AY746460; JP.x.HC-J6.D00944; 3a: CN.bj.HQ639941; CN.hb.KF292145; CN.SH.HQ912953; CN.ZJ.HQ318890; Creteil.AM423014;.FR.FJ872277; JP.NC_009824; 3b: 3b.CN.EU081507; 3b.CN.FJ462985; 3b.CN.HK.KC441470; CN.js.JQ303583; CN.sh.JQ065709; CN.yn.EU081464; CN.YN.FJ462950; CN.YN.FJ462973; CN.YN.FJ462968; CN.YN.FJ462999; 6a: CN.HK.DQ480520; CN.hk.KC441477; CN.hk.KC441481; CN.js.JQ303556; JCN.js.JQ303557; 6a.CN.KC844037; HK.x.6a33.AY859526; 6b: x.x.Th580.NC_009827; 6n: USA.DQ835768.
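The idea of comparing a query sequence against labeled references can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not the BLAST search or the MEGA tree-based classification used in the study: it assigns the label of the reference with the smallest p-distance (proportion of differing aligned sites). The function names and the short toy fragments are hypothetical and are not real HCV sequences.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences, ignoring gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (x, y)
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x != "-" and y != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def closest_reference(query, references):
    """Return (label, distance) for the reference nearest to the query."""
    label = min(references, key=lambda name: p_distance(query, references[name]))
    return label, p_distance(query, references[label])


# Toy aligned fragments, invented for illustration (not real HCV sequences).
toy_references = {
    "1b": "ACGTACGTAC",
    "3a": "ACGTTCGAAC",
    "6a": "TCGTACCTAG",
}
toy_query = "ACGTACGTAT"
assigned, distance = closest_reference(toy_query, toy_references)
# The query differs from the "1b" fragment at one of ten sites.
```

A nearest-reference rule of this kind is only a screening aid; a tree with bootstrap support remains the basis for the classification described here.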
Results
NS5B could replace whole genome sequencing (WGS) to classify HCV subgenotypes
Phylogenetic trees were constructed to identify a region suitable for the classification of HCV subgenotypes. All analyzed sequences were from the HCV database, including whole genome sequences (n=470), core region sequences (n=781), E1 region sequences (n=899), E2 region sequences (n=627) and NS5B region sequences (n=466). The results indicated that the phylogenetic tree of the whole genome sequences divided all subgenotypes correctly (Fig. 1). The analysis of the NS5B region sequences revealed that subgenotype classification based on the NS5B-tree could replace the WGS method, as all analyzed sequences were separated correctly (Fig. 2). The sub-trees of the 1b, 1a, 4, 2 and 6 strains are not shown, as they did not contain any misplaced sequences.
Figure 1
Phylogenetic tree of 470 HCV whole genome sequences. Sequences were obtained from the HCV database (http://hcv.lanl.gov/content/sequence/NEWALIGN/align.html). Ο1b, Ο1a, Ο4, Ο6, Ο2 represent 1b, 1a, 4, 6, 2 subgenotypes, respectively.
Figure 2
Phylogenetic tree of 466 HCV NS5B region sequences. Sequences were obtained from the HCV database (http://hcv.lanl.gov/content/sequence/NEWALIGN/align.html). Ο1b, Ο1a, Ο6, Ο2 represent 1b, 1a, 4, 2 subgenotypes, respectively.
However, the phylogenetic trees of the other regions, including the core, E1 and E2 regions, failed to distinguish all subgenotypes. The core-tree failed to separate subgenotypes 1c and 1a: the 1c sequence AY651061 was located in 1a (Fig. 3B). In the E1-tree, the 1b sequence AY746882 was located in 1a (Fig. 4B) and the 1a sequence AY388455 was located in 1b (Fig. 4C). E2 also could not replace WGS for subgenotype classification, as the E2-tree failed to separate subgenotypes 1c and 1a: the 1a sequence EU529679 was located in 1c (Fig. 5). The sub-trees of the 4, 3, 2, 5 and 6 strains are not shown, as they did not contain any misplaced sequences.
Figure 3
Phylogenetic tree of 781 HCV core region sequences. Sequences were obtained from the HCV database (http://hcv.lanl.gov/content/sequence/NEWALIGN/align.html). (A) The whole phylogenetic tree. Ο1b, Ο1a, Ο4, Ο3, Ο2, Ο6 represent 1b, 1a, 4, 3, 2, 6 subgenotypes, respectively. (B) The partial sub-tree of HCV subgenotype 1a, which included misplaced sequences that should have been located among non-1a strains.
Figure 4
Phylogenetic tree of 899 HCV E1 region sequences. Sequences were obtained from the HCV database (http://hcv.lanl.gov/content/sequence/NEWALIGN/align.html). (A) The whole phylogenetic tree. Ο1a, Ο1b, Ο3, Ο5, Ο4, Ο2, Ο6 represent 1a, 1b, 3, 5, 4, 2, 6 subgenotypes, respectively. (B) The partial sub-tree of HCV subgenotype 1a, which included misplaced sequences that should have been located among non-1a strains. (C) The partial sub-tree of HCV subgenotype 1b, which included misplaced sequences that should have been located among non-1b strains.
Figure 5
Phylogenetic tree of 627 HCV E2 region sequences. Sequences were obtained from the HCV database (http://hcv.lanl.gov/content/sequence/NEWALIGN/align.html). Ο1b, Ο1a, Ο4, Ο2, Ο6 represent 1b, 1a, 4, 2, 6 subgenotypes, respectively.
HCV subgenotype analysis in infected patients from Sichuan province
A total of 153 HCV samples obtained from infected patients from Sichuan province were isolated and amplified by RT-PCR. After sequencing the PCR products, all samples were subjected to subgenotype classification using the neighbor-joining method. Our results indicated that the dominant HCV subgenotype in infected patients from Sichuan province was 1b (n=79, 51.6%), which was closely related to sequences from patients in Yunnan, Henan, Jiangsu and Zhejiang (Fig. 6E). Subgenotype 3b was identified in 46 cases (30.1%) and was closest to the sequences from patients in Yunnan, Fujian, Jiangsu and Shanghai in the phylogenetic tree (Fig. 6B). There were 12 cases (7.8%) of subgenotype 3a, which were closely related to sequences from patients in Shanghai, Beijing and Henan, as well as in France and Japan (Fig. 6C). There were 13 cases (8.5%) of subgenotype 6a, which were closely related to sequences from patients in Jiangsu and Hong Kong (Fig. 6D). There were also 2 cases of subgenotype 2a and 1 case of subgenotype 6n, which were closely related to sequences from Japan and the USA, respectively (Fig. 6A).
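The reported proportions follow directly from the case counts, as the short Python check below shows. The variable names are illustrative; the counts are those given in the text, and the percentages for 2a and 6n are computed here rather than quoted from the results.

```python
# Case counts per subgenotype, as reported for the 153 Sichuan samples.
counts = {"1b": 79, "3b": 46, "3a": 12, "6a": 13, "2a": 2, "6n": 1}

total = sum(counts.values())

# Percentage of all samples, rounded to one decimal place.
percentages = {
    subgenotype: round(100 * n / total, 1) for subgenotype, n in counts.items()
}

for subgenotype, pct in percentages.items():
    print(f"{subgenotype}: n={counts[subgenotype]}, {pct}%")
```

The counts sum to 153, and the rounded values for 1b, 3b, 3a and 6a agree with the 51.6%, 30.1%, 7.8% and 8.5% stated above.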
Figure 6
Phylogenetic tree of 153 HCV NS5B sequences from patients in Sichuan province. (A) Phylogenetic tree constructed based on NS5B sequences. Ο3b, Ο3a, Ο6a, Ο1b represent 3b, 3a, 6a, 1b subgenotypes, respectively. (B) The sub-tree of HCV 3b subgenotypes. (C) The sub-tree of HCV 3a subgenotypes. (D) The sub-tree of HCV 6a subgenotypes. (E) The sub-tree of HCV 1b subgenotypes. The reference sequences of HCV variants were cited from GenBank.
Discussion
HCV is one of the main pathogens that induce chronic liver disease, eventually leading to cirrhosis and hepatocellular carcinoma (22). Almost 170 million people are infected with HCV, accounting for 3% of the total global population (23). In China, there are 25–50 million HCV-positive individuals (24), and 3–4 million new cases are diagnosed annually worldwide. Apart from the 20% of patients who experience viral clearance, the majority of HCV-infected individuals will remain positive for HCV for life, and they are more likely to develop chronic hepatitis, liver cirrhosis and hepatocellular carcinoma than hepatitis B virus-infected patients; this poses a serious public health concern (25).
The different subgenotypes of HCV have various biological and molecular epidemiological characteristics (26). Correct subgenotype classification plays a very important role in diagnosis, clinical therapy (27) and vaccine development (28). WGS is the most accurate method for the classification of HCV genotypes; however, WGS cannot be widely used in clinical diagnosis as it is expensive and time-consuming. Thus, a more practical choice is to use specific regions instead of the whole genome.
HCV whole genome sequences that were isolated from patients in different countries and regions, and that had an annotated source of mature peptides, were selected from the HCV database and used for phylogenetic analysis by the neighbor-joining method. The results of the analysis of the core, E1, E2 and NS5B regions indicated that NS5B could replace WGS in genotype classification. By contrast, the core region did not separate the 1c and 1a subgenotypes (1c AY651061 was located in 1a), E1 did not distinguish subgenotypes 1a and 1b (1b AY746882 was located in 1a and 1a AY388455 in 1b), and E2 failed to separate subgenotypes 1a and 1c (1a EU529679 was located in 1c). The results confirmed the role of the NS5B region in HCV subgenotype classification.
Attempts (9,17,21,29) have previously been made to replace WGS with simpler and more rapid methods for genotype classification. However, when the core or E1 region is used, the classification results may be confounded by the 1a and 1c subgenotypes.
Research into HCV subgenotype distribution may prove to be helpful in epidemiological studies (30). The distribution of genotypes and subgenotypes differs between regions owing to population mobility and the increased number of injection drug users. The dominant HCV subgenotype in China is 1b, but there are differences between the northern and southern regions (2). The genotype distribution among patients from southern China is complex, possibly because the number of injection drug users has greatly increased. The genotyping of 60 patients with chronic HCV from Kunming, a city in southern China, indicated that the major subgenotypes were 3b, 3a and 1b (31).
In the present study, the amplification, sequencing and classification of the NS5B region were conducted in 153 samples from patients from Sichuan province. The results indicated that 6 HCV subgenotypes, namely 1b, 3b, 3a, 6a, 2a and 6n, were circulating in patients from Sichuan province. The dominant subgenotype was 1b (79 cases, 51.6%) and the other subgenotypes were 3b (46 cases, 30.1%), 3a (12 cases, 7.8%) and 6a (13 cases, 8.5%), while 2a and 6n accounted for only 2 cases and 1 case, respectively. Sichuan and Yunnan are neighboring provinces, which is consistent with the finding that the sequences from Sichuan were closely related to those from patients in Yunnan province. Some of the sequences were related to those from patients in Jiangsu, Shanghai and Fujian, possibly due to the return of migrant workers. The fact that some sequences were also related to those from patients in the USA, Japan and Hong Kong indicates that population mobility has broken down the regional limits on HCV subgenotype distribution.
The results of our research have highlighted new issues regarding the prevention and management of HCV. The clinical use of antiviral drugs could be guided by the HCV genotype (32). However, a limitation of the present study is that classification based on the NS5B region can guide the use of drugs that directly target NS5B residues, as well as drugs with other targets in non-recombinant HCV genotypes, but it cannot guide treatment in the case of recombinant genotypes.