Tomomi Ando, Hideki Aizaki, Masaya Sugiyama, Tomoko Date, Kazuhiko Hayashi, Masatoshi Ishigami, Yoshiaki Katano, Hidemi Goto, Masashi Mizokami, Masamichi Muramatsu, Makoto Kuroda, Takaji Wakita.
Abstract
The quasispecies composition of the hepatitis C virus (HCV) genome could have important implications for viral pathogenesis and resistance to antiviral treatment. The purpose of the present study was to profile the HCV RNA quasispecies. We developed a strategy to determine the full-length HCV genome sequences co-existing within a single patient serum by using next-generation sequencing technologies. The isolated viral clones were divided into groups distinguished by the amino acid at core position 70. Subsequently, we determined the full-length HCV genome sequences of three independent dominant species co-existing in sequential sera collected 7 years apart. Phylogenetic analysis indicated that these dominant species evolved independently. Our study demonstrated that multiple dominant species co-existed in patient sera and evolved independently.
Keywords: Core 70 polymorphism; E, envelope; HCV; HVR, hyper variable region; IRRDR, IFN and ribavirin resistance-determining region; ISDR, IFN sensitivity-determining region; NGS, next-generation sequencing; NS, nonstructural; Next-generation sequencing; PEG-IFN, pegylated interferon; Q, glutamine; Quasispecies; R, arginine; RBV, ribavirin; SVR, sustained virological response; core70, the core amino acid 70 substitutions
Year: 2022 PMID: 36072891 PMCID: PMC9441305 DOI: 10.1016/j.bbrep.2022.101327
Source DB: PubMed Journal: Biochem Biophys Rep ISSN: 2405-5808
Fig. 1 Population analysis of core70 and the surrounding region. Core70 (nucleotide positions 549–551) and a part of the E1 region (500–1040) were amplified. The PCR products were subjected to TA cloning. Twelve clones were isolated from each PCR product and sequenced. The sequences were grouped based on the core70 sequence. The top numbers show the nucleotide positions that varied in more than 10% of all clones, and the left numbers show the clone numbers. The nucleotides of clone 1 (which has core70 Q) are shown in the top line of white boxes. The black boxes represent nucleotides different from clone 1. Non-synonymous substitutions are shown only if the nucleotides were completely different between two groups. (A) sPt1-98, (B) sPt1-05
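The grouping described in the Fig. 1 legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, and the coordinates in the example are hypothetical. It also assumes the clone sequences are already aligned and of equal length, and that a position counts as variable when more than 10% of clones differ from the reference clone at that position.

```python
from collections import defaultdict

# Partial standard genetic code: only glutamine (Q) and arginine (R) codons,
# the two core70 residues discussed in the legend. Anything else is "other".
CODON_TO_AA = {
    "CAA": "Q", "CAG": "Q",
    "CGT": "R", "CGC": "R", "CGA": "R", "CGG": "R", "AGA": "R", "AGG": "R",
}


def group_by_codon(clones, amplicon_start, codon_start):
    """Group clone names by the amino acid encoded at one codon.

    clones: dict mapping clone name -> aligned nucleotide sequence.
    amplicon_start: genome coordinate of the first base of each sequence.
    codon_start: genome coordinate of the first base of the codon.
    """
    offset = codon_start - amplicon_start
    groups = defaultdict(list)
    for name, seq in clones.items():
        codon = seq[offset:offset + 3].upper()
        groups[CODON_TO_AA.get(codon, "other")].append(name)
    return dict(groups)


def variable_positions(clones, reference, amplicon_start, threshold=0.10):
    """Return genome coordinates where more than `threshold` of all clones
    differ from the reference clone (assumed definition of "variable")."""
    ref = clones[reference]
    total = len(clones)
    positions = []
    for i, ref_nt in enumerate(ref):
        differing = sum(1 for seq in clones.values() if seq[i] != ref_nt)
        if differing / total > threshold:
            positions.append(amplicon_start + i)
    return positions


# Toy data (hypothetical): four 5-nt "clones" starting at coordinate 548,
# with the codon of interest at 549-551.
toy_clones = {
    "clone1": "ACAGT",
    "clone2": "ACAGT",
    "clone3": "ACGGC",
    "clone4": "ACGGC",
}
print(group_by_codon(toy_clones, amplicon_start=548, codon_start=549))
print(variable_positions(toy_clones, reference="clone1", amplicon_start=548))
```

For the real amplicon, the corresponding call would presumably use `amplicon_start=500` and `codon_start=549`, following the coordinates given in the legend.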
Fig. 2 Flowchart of the method for determining full-length sequences in patient serum, Pt1-98 and Pt1-05. Full-length HCV sequences were determined by the method described in this flowchart. Illumina GAIIx is suitable for determining the consensus sequence and generating a list of substitutions because library preparation for Illumina analysis does not require sequence-specific amplification. 454 GS FLX is suitable for identifying the combination of substitutions in small regions because the read length is relatively long (approximately 500 mer), although library preparation for 454 analysis requires sequence-specific amplification. Using primers that avoid substituted positions, we could amplify the viral sequences while maintaining the diversity detected by Illumina analysis, to generate the 454 library. After using 454 analysis to identify small regions in which reads can be separated into groups depending on the substitution pattern, we designed species-specific primers that include the substitutions. Using these primers, we could amplify each viral sequence specifically. The PCR products were sequenced using capillary sequencing. The determined sequences were verified against all data from the Illumina and 454 analyses, and then we determined the full-length viral genomic RNA sequences, pt1-98_Q1, Q2, R and pt1-05_Q1, Q2, R.
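The step in which long reads are separated into groups by their substitution pattern can be sketched as follows. This is an illustrative Python example only, not the software used in the study. The function name and the read representation are hypothetical, and it assumes gap-free reads already aligned to the consensus, each given as a 0-based start offset plus its sequence.

```python
from collections import Counter


def substitution_patterns(reads, positions):
    """Count the combinations of bases seen at the given consensus positions.

    reads: list of (start, sequence) tuples; `start` is the 0-based offset of
           the read on the consensus (gap-free alignment is assumed).
    positions: non-empty list of 0-based consensus positions of interest.
    Only reads that span every listed position are counted, because only
    those reads show which substitutions occur together on one molecule.
    """
    first, last = min(positions), max(positions)
    counts = Counter()
    for start, seq in reads:
        end = start + len(seq)  # exclusive end coordinate of the read
        if start <= first and last < end:
            pattern = "".join(seq[p - start] for p in positions)
            counts[pattern] += 1
    return counts


# Toy reads (hypothetical): two linked patterns at positions 1 and 4.
toy_reads = [
    (0, "ACGTAC"),
    (1, "CGTAC"),
    (0, "ATGTGC"),
    (2, "GTA"),      # does not cover position 1, so it is skipped
    (0, "ATGTGC"),
]
print(substitution_patterns(toy_reads, positions=[1, 4]))
```

Each distinct pattern with substantial read support is a candidate group for which a species-specific primer could then be designed.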
Fig. 3 The position of substitutions in each dominant species against the consensus sequence. From each serum, sPt1-98 (A) and sPt1-05 (B), we determined the positions of substitutions (pt1-98_MUT and pt1-05_MUT) against the consensus sequence determined by Illumina analysis, and three independent dominant species (pt1-98_Q1, Q2, R and pt1-05_Q1, Q2, R) by the method in Fig. 2. The upper and lower horizontal boxes show the nucleotide (nt) and amino acid (aa) sequences of the open reading frame in the HCV genome. The vertical lines show the position of each substitution against the Illumina consensus sequence. The red vertical lines indicate the positions of mutations in the species with core70 R, and the green and blue lines indicate those in the species with core70 Q (Q1 and Q2, respectively). Black vertical lines in pt1-98_MUT and pt1-05_MUT show the mutations not found in the species with core70 R, Q1 and Q2. Dotted lines mark the boundaries between viral proteins. pt1-98_MUT and pt1-05_MUT show every position of sequence variation determined by Illumina analysis. *, position of core70; heavy black line, position of IRRDR regions. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)
Fig. 4 Phylogenetic trees of subtype 1b strains estimated from (A) core, (B) NS5B, and (C) full-genome sequences. Bootstrap values based on 2000 replicates each are shown for branches with more than 50% bootstrap support. The scale bars are in units of nucleotide substitutions per site. The trees were rooted with subtype 1a strains (H77; not shown). The red lines indicate the sequences identified from serum samples sPt1-98 and sPt1-05. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)