
Analyzing the Effects of Pretreatment Diversity on HCV Drug Treatment Responsiveness Using Bayesian Partition methods.

Yao Fu1, Gang Chen2, Xuan Guo3, Jing Zhang4, Yi Pan.   

Abstract

Traditional therapies for Hepatitis C Virus (HCV) often yield unsatisfactory results. The reason may lie in the mechanism of HCV drug resistance. Rather than making a plain comparison between treated and untreated groups, this paper investigates the mechanism of resistance to interferon plus ribavirin combination therapy by comparing pretreatment sequence data from response and non-response patients in the NS5A region of genotype 1a HCV. We use Bayesian probabilistic models to detect single mutations or mutation combinations, and to infer the interaction structures among these mutations, in order to identify how resistance-associated combinations differ between the two patient groups. We hope to explain, at least partially, the unsatisfactory results of interferon plus ribavirin therapy.

AUTHOR SUMMARY: HCV treatment results have historically been suboptimal[1-3]. HCV drug resistance, which further hinders treatment, is caused by mutations in viral proteins that disrupt drug binding but do not impair viral survival. Because HCV replicates at a high rate and with low fidelity, resistant strains quickly become dominant in a viral population under the selection pressure of a drug. M.J. Donlin et al. indicate that pretreatment sequence diversity correlates with treatment response[15]. We build on this idea and use a Bayesian approach to examine the pretreatment sequence diversity of HCV in response and non-response groups under combined interferon and ribavirin treatment.


Year:  2015        PMID: 26457332      PMCID: PMC4597793     

Source DB:  PubMed          Journal:  J Bioinform Proteom Rev


Introduction

Hepatitis C virus (HCV) is a single-stranded RNA virus that has been categorized into at least six genotypes, each with several subtypes. The genotypes are distributed across different regions and show dissimilar response patterns to interferon (IFN)-based therapy. In the clinic, fewer than 20% of chronic patients show a sustained response to interferon monotherapy, while combination therapy with IFN and ribavirin has significantly improved response rates. An accurate interpretation of the mechanism behind antiviral resistance to IFN therapy will be a key factor in developing better treatment strategies. Many studies have found that variations in HCV sequences play a role in the response to IFN-based therapies, especially variations in the NS5A region. NS5A is a nonstructural protein that contributes to IFN resistance by blocking the function of dsRNA-dependent protein kinase (PKR), an important mediator of the IFN response. In 1995, Enomoto et al. defined an “interferon sensitivity determining region” (ISDR; amino acids [aa] 2209–2248) in NS5A that is enriched with mutations related to IFN resistance. This finding has been confirmed by several studies. Moreover, other work showed evidence that mutations in the PKR-binding domain (PKRbd; aa 2209–2274) of NS5A can hamper viral replication. However, conflicting results about these two regions have also been reported, which complicates the role of NS5A in the response to IFN. In addition to the ISDR and PKRbd, other domains in NS5A have also been implicated in resistance, such as the proline-rich region (PRR; aa 2283–2327), AR1 (aa 2144–2185) and AR2 (aa 2221–2272) of the nuclear localization signal (NLS), and the variable region 3 (V3; aa 2356–2379) in the C terminus. This study concentrates on the NS5A region of HCV genotype 1a. The NS5A region spans 1344 base pairs, which encode 448 amino acids.
The goal of this paper is to compare the pretreatment sequence patterns of patients who respond positively to the treatment with those of patients who do not, and then to infer the mutation positions that may affect the treatment outcome. To facilitate the discussion of our findings, we first introduce the internal structure of the NS5A region, using NS5A-local numbering. It consists of the following regions: the membrane attachment region (aa 1–236); the carboxyl region (aa 237–448); and the regions within the carboxyl end, namely PKRbd (aa 237–302), variable region 4 (V4; aa 310–330), variable region 3 (V3; aa 381–409), the region between V4 and V3 (aa 331–380), and the region downstream of V3 (aa 410–448). In this study, we employed Bayesian models that were originally proposed by Zhang et al. for investigating mutation interactions in HIV under a given drug treatment. Based on the Bayesian variable partition (BVP) model, we first used the Metropolis-Hastings algorithm on the interferon treatment data to identify mutations associated with drug resistance, and then applied a recursive model selection (RMS) procedure to the selected mutation positions to infer the dependence structure of the interacting effects.
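As an aid for reading the position numbers reported later, the region boundaries listed above can be encoded as a small lookup table. The following is a minimal sketch and not part of the original analysis: the function names `region_of` and `to_local`, the label used for positions outside any named subregion, and the offset between polyprotein and NS5A-local numbering (inferred here only from the PKRbd starting at aa 2209 in one numbering and at aa 237 in the other) are assumptions made for illustration.

```python
# NS5A-local region boundaries as listed in the text (1-based, inclusive).
REGIONS = [
    ("membrane attachment region", 1, 236),
    ("PKRbd", 237, 302),
    ("V4", 310, 330),
    ("region between V4 and V3", 331, 380),
    ("V3", 381, 409),
    ("downstream of V3", 410, 448),
]

# 1344 base pairs at three bases per codon give 448 amino acids.
NS5A_LENGTH = 1344 // 3

# Assumption: offset inferred from PKRbd starting at 2209 (polyprotein)
# and at 237 (NS5A-local numbering).
POLYPROTEIN_OFFSET = 2209 - 237


def to_local(polyprotein_pos):
    """Convert a polyprotein coordinate to NS5A-local numbering (assumed offset)."""
    return polyprotein_pos - POLYPROTEIN_OFFSET


def region_of(pos):
    """Return the name of the NS5A region that contains an NS5A-local position."""
    if not 1 <= pos <= NS5A_LENGTH:
        raise ValueError(f"position {pos} is outside NS5A (1-{NS5A_LENGTH})")
    for name, start, end in REGIONS:
        if start <= pos <= end:
            return name
    # Positions 303-309 are in the carboxyl region but in no named subregion.
    return "carboxyl region (no named subregion)"


if __name__ == "__main__":
    for p in (49, 199, 209, 242, 349, 398):
        print(p, region_of(p))
```

A lookup like this makes it easy to check which region a reported position falls in without rereading the boundary list.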

Methods

We first employed the Bayesian variable partition (BVP) model to search for mutation positions associated with treatment response. After detecting the interacting mutation positions, we applied Recursive Model Selection (RMS) to the positions selected by BVP to infer a more detailed dependence structure among them.

Bayesian Variable Partition Model

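The response group and the non-response group can be represented as two data matrices $A=[A_1,\ldots,A_m]$ (of dimension $N_A\times m$) and $B=[B_1,\ldots,B_m]$ (of dimension $N_B\times m$), respectively. Each row is a sequence and each column is a position in the NS5A protein. Here $N_A$ and $N_B$ denote the number of sequences in the response and non-response groups, and $m$ denotes the number of positions. The $m$ positions can be partitioned into four sets: set 1 contains positions that are mutually independent and share the same distribution in the response and non-response groups; set 2 contains positions that are mutually independent but have different distributions in the two groups; set 3 contains dependent positions with the same distribution in the two groups; set 4 contains dependent positions with different distributions in the two groups. These four sets correspond to the four hypotheses in the Results section. Let $I=(I_1,\ldots,I_m)$ indicate the set membership of the positions, with $I_j\in\{1,2,3,4\}$, and let $A^{(l)}$ and $B^{(l)}$ denote the columns of the two groups that belong to set $l$. Our goal is to infer the positions whose distributions differ between the two groups (that is, $I_j=2$ or $4$).

Assume that there are $c_j$ possible values (amino acid types) at position $j$, and let $\Theta_1=\{(\theta_{j1},\theta_{j2},\ldots,\theta_{jc_j}): I_j=1\}$ be the amino acid frequencies of each position in set 1, shared by both groups. The likelihood of $(A^{(1)},B^{(1)})$ is

$$P(A^{(1)},B^{(1)}\mid\Theta_1)=\prod_{j:I_j=1}\prod_{k=1}^{c_j}\theta_{jk}^{n_{jk}},$$

where $\{n_{j1},\ldots,n_{jc_j}\}$ are the numbers of sequences taking the $k$th value at position $j$ in $(A^{(1)},B^{(1)})$. Assume a Dirichlet prior on $\Theta_1$, that is, $\Theta_1\sim\mathrm{Dirichlet}(\alpha)$ with $\alpha=(\alpha_1,\ldots,\alpha_{c_j})$. Integrating out $\Theta_1$ gives the marginal probability

$$P(A^{(1)},B^{(1)})=\prod_{j:I_j=1}\left[\left(\prod_{k=1}^{c_j}\frac{\Gamma(n_{jk}+\alpha_k)}{\Gamma(\alpha_k)}\right)\frac{\Gamma(|\alpha|)}{\Gamma(N+|\alpha|)}\right],$$

where $|\alpha|$ is the sum of all elements of $\alpha$ and $N=N_A+N_B$. In contrast to positions in set 1, two priors, $\mathrm{Dirichlet}(\beta^A)$ and $\mathrm{Dirichlet}(\beta^B)$, are used for the amino acid frequencies of each position in set 2 in groups A and B, respectively. Writing $n^A_{jk}$ and $n^B_{jk}$ for the counts of the $k$th value at position $j$ in each group and integrating out the frequencies, we obtain

$$P(A^{(2)},B^{(2)})=\prod_{j:I_j=2}\;\prod_{g\in\{A,B\}}\left[\left(\prod_{k=1}^{c_j}\frac{\Gamma(n^{g}_{jk}+\beta^{g}_k)}{\Gamma(\beta^{g}_k)}\right)\frac{\Gamma(|\beta^{g}|)}{\Gamma(N_g+|\beta^{g}|)}\right].$$

Positions in sets 3 and 4 influence the resistance status through interactions. Thus, each amino acid combination over set 3 or set 4 represents a potential mutation pattern. Assume there are $c^{(3)}$ and $c^{(4)}$ possible value combinations for sets 3 and 4, respectively. We use a $\mathrm{Dirichlet}(\gamma)$ prior on the combination frequencies in set 3, and two priors, $\mathrm{Dirichlet}(\delta^A)$ and $\mathrm{Dirichlet}(\delta^B)$, on the combination frequencies in set 4 for the response and non-response groups. Writing $n^{(3)}_k$ for the number of sequences in both groups that carry the $k$th combination over set 3, and $n^{(4),g}_k$ for the number of sequences in group $g$ that carry the $k$th combination over set 4, and integrating out the frequencies, we obtain

$$P(A^{(3)},B^{(3)})=\left(\prod_{k=1}^{c^{(3)}}\frac{\Gamma(n^{(3)}_k+\gamma_k)}{\Gamma(\gamma_k)}\right)\frac{\Gamma(|\gamma|)}{\Gamma(N+|\gamma|)},$$

$$P(A^{(4)},B^{(4)})=\prod_{g\in\{A,B\}}\left[\left(\prod_{k=1}^{c^{(4)}}\frac{\Gamma(n^{(4),g}_k+\delta^{g}_k)}{\Gamma(\delta^{g}_k)}\right)\frac{\Gamma(|\delta^{g}|)}{\Gamma(N_g+|\delta^{g}|)}\right].$$

Combining the marginal probabilities of the four sets, we have the posterior distribution of $I$ as

$$P(I\mid A,B)\propto P(I)\prod_{l=1}^{4}P(A^{(l)},B^{(l)}),\qquad P(I)=\prod_{j=1}^{m}P(I_j).$$

In this study, we assume a priori that most positions are in set 1 or set 3 (that is, unassociated with drug resistance): $P(I_j=2)=P(I_j=4)=0.01$, with the remaining prior probability (0.98) assigned to sets 1 and 3. We further set the parameters of all Dirichlet priors to 0.5. A Markov chain Monte Carlo (MCMC) algorithm can be designed to sample from this posterior distribution so as to infer which positions are associated with the treatment status. More details on BVP can be found in the original description of the method by Zhang et al.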
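To make the Dirichlet-multinomial marginals above concrete, the following minimal sketch compares, for a single position, the pooled marginal (set 1) with the product of the two group-specific marginals (set 2). It is an illustration under simplifying assumptions, not the authors' implementation: only sets 1 and 2 are considered, so the prior probability of set 1 is taken as one minus that of set 2; no MCMC is run; and the function names and the example counts (derived by rounding the position 199 percentages in Table 4 to whole sequences) are assumptions.

```python
from math import exp, lgamma


def log_marginal(counts, alpha=0.5):
    """Log Dirichlet-multinomial marginal of the amino acid counts at one
    position, under a symmetric Dirichlet(alpha) prior."""
    total_alpha = alpha * len(counts)
    out = lgamma(total_alpha) - lgamma(sum(counts) + total_alpha)
    for n in counts:
        out += lgamma(n + alpha) - lgamma(alpha)
    return out


def posterior_set2(counts_a, counts_b, prior_set2=0.01, alpha=0.5):
    """Posterior probability that one position belongs to set 2 rather than
    set 1, in a simplified model that contains only these two sets."""
    pooled = [a + b for a, b in zip(counts_a, counts_b)]
    log_m1 = log_marginal(pooled, alpha)  # same distribution in both groups
    log_m2 = log_marginal(counts_a, alpha) + log_marginal(counts_b, alpha)
    prior_odds_set1 = (1.0 - prior_set2) / prior_set2
    return 1.0 / (1.0 + prior_odds_set1 * exp(log_m1 - log_m2))


if __name__ == "__main__":
    # Assumed counts for position 199 (amino acids L, V):
    # response 41 and 6 of 47, non-response 29 and 0 of 29.
    print(posterior_set2([41, 6], [29, 0]))
```

Working with `lgamma` keeps the computation in log space, which avoids overflow of the Gamma function for the sample sizes used here.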

Recursive Model Selection

We applied the RMS procedure to infer the detailed dependence structure among the interacting positions identified by BVP. Our strategy is to recursively apply model selection between two classes of cruder models, the chain-dependence model and the V-dependence model, until the data no longer support more detailed models. We say that a group of variables $X_G$ follows a chain-dependence model if the index set $G$ can be partitioned into three subsets $U$, $V$, and $W$ such that $X_U$ and $X_W$ are independent given $X_V$, that is, $X_U\to X_V\to X_W$. The joint distribution of a chain-dependence model is

$$P(X_G)=P(X_U)\,P(X_V\mid X_U)\,P(X_W\mid X_V).$$

We say that a group of variables $X_G$ follows a V-dependence model if $X_U$ and $X_W$ are mutually independent, that is, $X_U\to X_V\leftarrow X_W$. The joint distribution of a V-dependence model is

$$P(X_G)=P(X_U)\,P(X_W)\,P(X_V\mid X_U,X_W).$$

In these two models, only set $W$ is allowed to be empty, in which case both models reduce to the saturated model. We use a model indicator $I_{CV}$ to denote the model type (chain-dependence or V-dependence) for the $L$ positions under consideration. Let $\mathfrak{P}$ denote the set partition of $G$ into $U$, $V$, and $W$. The posterior distribution of $I_{CV}$ and $\mathfrak{P}$ is

$$P(I_{CV},\mathfrak{P}\mid X_G)\propto P(X_G\mid I_{CV},\mathfrak{P})\,P(I_{CV},\mathfrak{P}).$$

We set equal prior probabilities for the two model types; the prior on $\mathfrak{P}$ follows the original RMS formulation. An MCMC algorithm is designed to simulate from this posterior and to find the optimal model type and variable partition. More details on RMS can be found in the original description of the method. The procedure is applied recursively until only single-variable nodes remain.

The BVP model and the RMS procedure were applied sequentially to the data of response and non-response patients. For the comparison, we applied BVP to 47 response sequences versus 29 non-response sequences and examined the differences between the two groups, recognizing that their pretreatment patterns may differ. The accession numbers for all 76 sequences are shown in Tables 1 and 2. More information about these sequences is available in the corresponding reference.
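The two factorizations compared by RMS can be illustrated on a toy example. The sketch below is not the RMS procedure itself: it only builds a joint distribution under each model and checks the independence property that defines it. The V-dependence model is interpreted here as $X_U$ and $X_W$ being marginally independent with $X_V$ as their common child, which is an assumption about the intended structure; the binary alphabet and all probability tables are hypothetical.

```python
from itertools import product

STATES = (0, 1)  # hypothetical binary alphabet for each variable


def chain_joint(p_u, p_v_given_u, p_w_given_v):
    """Joint P(u, v, w) under the chain X_U -> X_V -> X_W."""
    return {(u, v, w): p_u[u] * p_v_given_u[u][v] * p_w_given_v[v][w]
            for u, v, w in product(STATES, repeat=3)}


def v_joint(p_u, p_w, p_v_given_uw):
    """Joint P(u, v, w) under the V-structure X_U -> X_V <- X_W."""
    return {(u, v, w): p_u[u] * p_w[w] * p_v_given_uw[(u, w)][v]
            for u, v, w in product(STATES, repeat=3)}


def marginal(joint, keep):
    """Sum the joint over all variables whose index is not in `keep`."""
    out = {}
    for state, p in joint.items():
        key = tuple(state[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def u_w_independent_given_v(joint, tol=1e-12):
    """Check P(u, v, w) * P(v) == P(u, v) * P(v, w) for every state."""
    p_v, p_uv, p_vw = marginal(joint, (1,)), marginal(joint, (0, 1)), marginal(joint, (1, 2))
    return all(abs(p * p_v[(v,)] - p_uv[(u, v)] * p_vw[(v, w)]) < tol
               for (u, v, w), p in joint.items())


def u_w_independent(joint, tol=1e-12):
    """Check P(u, w) == P(u) * P(w) for every pair of states."""
    p_u, p_w, p_uw = marginal(joint, (0,)), marginal(joint, (2,)), marginal(joint, (0, 2))
    return all(abs(p_uw[(u, w)] - p_u[(u,)] * p_w[(w,)]) < tol
               for u, w in product(STATES, repeat=2))


# Hypothetical probability tables, chosen only for illustration.
P_U = {0: 0.7, 1: 0.3}
P_W = {0: 0.6, 1: 0.4}
P_V_GIVEN_U = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
P_W_GIVEN_V = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}
P_V_GIVEN_UW = {(0, 0): {0: 0.9, 1: 0.1}, (0, 1): {0: 0.5, 1: 0.5},
                (1, 0): {0: 0.4, 1: 0.6}, (1, 1): {0: 0.1, 1: 0.9}}

chain = chain_joint(P_U, P_V_GIVEN_U, P_W_GIVEN_V)
vee = v_joint(P_U, P_W, P_V_GIVEN_UW)
```

With these tables the chain model satisfies conditional independence of $X_U$ and $X_W$ given $X_V$ but not marginal independence, while the V-structure shows the opposite pattern, which is the distinction the model selection step relies on.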
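Table 1

Accession numbers for 47 response sequences.

Response Sequences

| No. | Accession number | No. | Accession number | No. | Accession number |
|---|---|---|---|---|---|
| 1 | AF265047 | 17 | AM600927 | 33 | EF407419 |
| 2 | AF265111 | 18 | AM600936 | 34 | EF407420 |
| 3 | AF265129 | 19 | AM600919 | 35 | EF407421 |
| 4 | AF265132 | 20 | AM600923 | 36 | EF407422 |
| 5 | AF265138 | 21 | AM600950 | 37 | EF407423 |
| 6 | AM600953 | 22 | AM600914 | 38 | EF407424 |
| 7 | AM600938 | 23 | AM600930 | 39 | EF407425 |
| 8 | AM600925 | 24 | AM600944 | 40 | EF407411 |
| 9 | AM600932 | 25 | AM600951 | 41 | EF407412 |
| 10 | AM600929 | 26 | AM600942 | 42 | FJ958369 |
| 11 | AM600921 | 27 | EF407413 | 43 | FJ958414 |
| 12 | AM600955 | 28 | EF407414 | 44 | FJ958465 |
| 13 | AM600934 | 29 | EF407415 | 45 | FJ958543 |
| 14 | AM600946 | 30 | EF407416 | 46 | FJ958850 |
| 15 | AM600948 | 31 | EF407417 | 47 | FJ958939 |
| 16 | AM600940 | 32 | EF407418 | | |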
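Table 2

Accession numbers for 29 non-response sequences.

Non-response Sequences

| No. | Accession number | No. | Accession number | No. | Accession number |
|---|---|---|---|---|---|
| 1 | EF407432 | 11 | AF265105 | 21 | EF407434 |
| 2 | EF407437 | 12 | AF265009 | 22 | EF407435 |
| 3 | EF407445 | 13 | AF265028 | 23 | EF407436 |
| 4 | EF407427 | 14 | EF407427 | 24 | EF407437 |
| 5 | EF407430 | 15 | EF407428 | 25 | EF407438 |
| 6 | EF407436 | 16 | EF407429 | 26 | EF407439 |
| 7 | AF265141 | 17 | EF407430 | 27 | EF407440 |
| 8 | AF265135 | 18 | EF407431 | 28 | EF407441 |
| 9 | AF265121 | 19 | EF407432 | 29 | EF407442 |
| 10 | AF265117 | 20 | EF407433 | | |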

Results

As M.J. Donlin et al. indicated, pretreatment sequence diversity correlates with treatment response[15]. Building on this finding, our analysis tests the following four hypotheses. Hypothesis 1: the positions are independent of each other, and the probability distribution of the pretreatment sequences is the same in the response and non-response groups; Hypothesis 2: the positions are independent of each other, and the pretreatment sequences of the response and non-response groups have different probability distributions; Hypothesis 3: the positions are dependent, and the distribution of the pretreatment sequences is the same in the response and non-response groups; Hypothesis 4: the positions are dependent, and the pretreatment sequences of the response and non-response groups have different probability distributions. We applied the Bayesian Variable Partition (BVP) model and the Recursive Model Selection (RMS) procedure to the pretreatment sequences of response (47 sequences) and non-response (29 sequences) samples, as described in detail in the Methods section. We ran the proposed method with multiple random restarts, and different chains converged to different local modes. Because we did not apply any heuristic or subjective selection, we treat the results from all chains as meaningful. Table 3 shows the results of the analysis, namely the positions whose posterior probability of Hypothesis 2 (H2) or Hypothesis 4 (H4) exceeds 0.95. As Table 3 shows, the results are not uniform across Markov chains; the limited sample size of our analysis may be the reason for this inconsistency. However, the positions that recur across the 20 chains still give a useful picture of the mutation mechanism: positions 49 and 349 appear under H2 in almost all chains, and positions 199, 209, 242, and 398 are among the most frequently recurring positions under H4. Positions 49 and 349 differ statistically between response and non-response patients and are independent of other positions. Positions 199, 209, 242, and 398 are dependent and show a significant difference between response and non-response patients. Position 49 is in the membrane attachment region; position 349 is in the region between V4 and V3; positions 199 and 209 are in the membrane attachment region; position 242 is in the ISDR; position 398 is in the V3 region. These positions may have some biological influence on resistance to IFN and ribavirin.
Table 3

Positions whose posterior probabilities of H2 or H4 are larger than 0.95.

| Chain | H2 (P > 0.95) | H4 (P > 0.95) |
|---|---|---|
| 1 | 49, 349 | 23, 64, 71, 245, 269, 276, 288, 291, 382, 388, 445 |
| 2 | 49, 349 | 13, 390, 391 |
| 3 | 49, 349 | 34, 242, 377, 398 |
| 4 | Null | 44, 133, 226, 269, 276, 285, 288, 296, 304, 305, 382 |
| 5 | 49, 349 | 61, 64, 135, 296, 423, 430 |
| 6 | 49, 349 | 16, 23, 61, 135, 226, 255, 388 |
| 7 | 49, 349 | 199, 209, 242, 398 |
| 8 | 49, 349 | 24, 44, 107, 133, 135, 226, 304, 305, 422 |
| 9 | 49, 349 | 23, 44, 61, 64, 71, 135, 245, 269, 276, 285, 291, 382, 388, 439, 445 |
| 10 | 49, 349 | 133, 264, 280, 305, 392 |
| 11 | 49 | 16, 95, 131, 383, 410, 439 |
| 12 | 49, 349 | 34, 199, 209, 242, 377, 398 |
| 13 | 49, 349 | 71, 127, 245, 269, 276, 285, 288, 353, 439, 445 |
| 14 | 49, 349 | 23, 64, 71, 245, 269, 276, 285, 288, 291, 382, 388, 445 |
| 15 | 349 | 242, 390, 391, 398 |
| 16 | 49, 349 | Null |
| 17 | 49, 349 | 34, 95, 135, 255, 377 |
| 18 | 49, 349 | 199, 209, 242, 398 |
| 19 | 49, 349 | 133, 269, 276, 285, 288, 304, 305, 382 |
| 20 | 49, 349 | 199, 209, 242, 398 |
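Because different chains report different position sets, one simple way to summarize Table 3 is to count how many of the 20 chains report each position. The sketch below is an illustrative tally, not part of the original analysis; the dictionary is a transcription of Table 3 in which the split of each flattened row into chain number, H2 positions, and H4 positions is an assumption (for example, chain 11 is read as H2 = 49 and H4 starting at 16).

```python
from collections import Counter

# Table 3 transcribed as: chain -> (H2 positions, H4 positions).
# "Null" entries are represented by empty lists.
TABLE3 = {
    1: ([49, 349], [23, 64, 71, 245, 269, 276, 288, 291, 382, 388, 445]),
    2: ([49, 349], [13, 390, 391]),
    3: ([49, 349], [34, 242, 377, 398]),
    4: ([], [44, 133, 226, 269, 276, 285, 288, 296, 304, 305, 382]),
    5: ([49, 349], [61, 64, 135, 296, 423, 430]),
    6: ([49, 349], [16, 23, 61, 135, 226, 255, 388]),
    7: ([49, 349], [199, 209, 242, 398]),
    8: ([49, 349], [24, 44, 107, 133, 135, 226, 304, 305, 422]),
    9: ([49, 349], [23, 44, 61, 64, 71, 135, 245, 269, 276, 285, 291,
                    382, 388, 439, 445]),
    10: ([49, 349], [133, 264, 280, 305, 392]),
    11: ([49], [16, 95, 131, 383, 410, 439]),
    12: ([49, 349], [34, 199, 209, 242, 377, 398]),
    13: ([49, 349], [71, 127, 245, 269, 276, 285, 288, 353, 439, 445]),
    14: ([49, 349], [23, 64, 71, 245, 269, 276, 285, 288, 291, 382, 388, 445]),
    15: ([349], [242, 390, 391, 398]),
    16: ([49, 349], []),
    17: ([49, 349], [34, 95, 135, 255, 377]),
    18: ([49, 349], [199, 209, 242, 398]),
    19: ([49, 349], [133, 269, 276, 285, 288, 304, 305, 382]),
    20: ([49, 349], [199, 209, 242, 398]),
}

# Number of chains that report each position under H2 and under H4.
h2_counts = Counter(pos for h2, _ in TABLE3.values() for pos in h2)
h4_counts = Counter(pos for _, h4 in TABLE3.values() for pos in h4)

if __name__ == "__main__":
    print("H2:", h2_counts.most_common())
    print("H4:", h4_counts.most_common(10))
```

Sorting the counts with `most_common` gives a quick view of which positions recur across chains and which appear only in isolated runs.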
While analyzing single positions as above is helpful, many positions do not mutate independently. Figure 3 shows the interacting positions detected by BVP in response samples, and Figure 4 shows the interacting positions detected by BVP in non-response samples. Table 4 and Table 5 show in detail the dependence structure inferred by RMS for the response and non-response groups, respectively. At position 285, the frequency of amino acid E is 13.8% in non-response samples and 8.5% in response samples. A more pronounced difference was found at position 199, where the frequency of amino acid L decreases from 100% in non-response samples to 87.2% in response samples. A similar but weaker pattern was found at position 226, where the frequency of amino acid M decreases from 20.75% to 14.9% from non-response samples to response samples. For dependent positions, we observed similar results, as shown jointly in Table 4 and Table 5. At positions 107, 226, 388, 410, and 439, the amino acid combination EMIAE does not occur in response samples, which indicates that these positions in combination may be a distinguishing factor between response and non-response patients. Other combinations at these positions are also absent from response samples. For instance, KEIAG, TMVAG, and TLIAE are all combinations that occur only in non-response samples.
Figure 3

Flowchart of detected mutation positions and position combinations in the pretreatment sequence of patients who respond to the treatment.

Figure 4

Flowchart of detected mutation positions and position combinations in the pretreatment sequence of patients who don’t respond to the treatment.

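Table 4

Detailed position interaction relations for the pretreatment sequences of patients who respond to the treatment.

| Positions | Amino acids | Non-response | Response |
|---|---|---|---|
| 285 | E | 13.80% | 8.50% |
| 285 | D | 86.20% | 89.40% |
| 285 | V | 0.00% | 2.10% |
| 199 | L | 100.00% | 87.20% |
| 199 | V | 0.00% | 12.80% |
| 245 | A | 44.80% | 29.80% |
| 245 | T | 48.30% | 63.80% |
| 245 | V | 6.90% | 2.10% |
| 245 | N | 0.00% | 2.10% |
| 245 | Y | 0.00% | 2.10% |
| 107/226/388/410/439 | E+M+I+A+E | 6.90% | 0.00% |
| 107/226/388/410/439 | K+L+I+A+E | 6.90% | 2.10% |
| 107/226/388/410/439 | T+V+I+A+E | 10.30% | 8.50% |
| 107/226/388/410/439 | T+L+I+A+G | 6.90% | 8.50% |
| 107/226/388/410/439 | K+E+I+A+G | 10.30% | 0.00% |
| 107/226/388/410/439 | T+M+V+A+G | 6.90% | 0.00% |
| 107/226/388/410/439 | T+L+I+A+E | 10.30% | 0.00% |
| 107/226/388/410/439 | T+V+I+A+G | 6.90% | 19.10% |
| 107/226/388/410/439 | T+E+I+V+G | 3.40% | 0.00% |
| 107/226/388/410/439 | T+M+I+A+G | 3.40% | 0.00% |
| 107/226/388/410/439 | K+V+V+A+G | 6.90% | 0.00% |
| 107/226/388/410/439 | K+M+I+A+G | 3.40% | 0.00% |
| 107/226/388/410/439 | M+E+I+A+E | 3.40% | 0.00% |
| 107/226/388/410/439 | K+V+I+A+E | 3.40% | 2.10% |
| 107/226/388/410/439 | T+E+I+A+G | 3.40% | 14.90% |
| 107/226/388/410/439 | K+V+I+A+G | 3.40% | 6.40% |
| 107/226/388/410/439 | K+V+V+A+E | 3.40% | 0.00% |
| 107/226/388/410/439 | T+M+I+A+E | 0.00% | 2.10% |
| 107/226/388/410/439 | K+M+I+A+E | 0.00% | 6.40% |
| 107/226/388/410/439 | K+L+I+D+E | 0.00% | 2.10% |
| 107/226/388/410/439 | S+V+I+A+E | 0.00% | 2.10% |
| 107/226/388/410/439 | T+M+V+A+E | 0.00% | 2.10% |
| 107/226/388/410/439 | T+L+T+A+E | 0.00% | 2.10% |
| 107/226/388/410/439 | K+V+I+G+E | 0.00% | 2.10% |
| 107/226/388/410/439 | T+W+T+A+D | 0.00% | 2.10% |
| 107/226/388/410/439 | T+M+V+S+G | 0.00% | 2.10% |
| 107/226/388/410/439 | T+V+I+T+E | 0.00% | 4.30% |
| 107/226/388/410/439 | E+M+I+A+G | 0.00% | 2.10% |
| 107/226/388/410/439 | K+E+A+A+E | 0.00% | 2.10% |
| 107/226/388/410/439 | T+E+I+A+E | 0.00% | 2.10% |
| 107/226/388/410/439 | K+L+V+A+G | 0.00% | 2.10% |
| 107/226/388/410/439 | T+E+V+A+G | 0.00% | 2.10% |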
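Table 5

Detailed position interaction relations for the pretreatment sequences of patients who don’t respond to the treatment.

| Positions | Amino acids | Non-response | Response |
|---|---|---|---|
| 226 | M | 21.70% | 24.00% |
| 226 | L | 11.50% | 18.00% |
| 226 | V | 49.30% | 38.00% |
| 226 | E | 0.70% | 6.00% |
| 226 | W | 0.70% | 0.00% |
| 107/245/392 | E+A+N | 6.90% | 2.10% |
| 107/245/392 | K+A+N | 17.20% | 8.50% |
| 107/245/392 | T+A+N | 6.90% | 12.80% |
| 107/245/392 | T+T+D | 6.90% | 17.00% |
| 107/245/392 | K+T+N | 10.30% | 8.50% |
| 107/245/392 | T+T+N | 27.60% | 34.00% |
| 107/245/392 | T+V+S | 3.40% | 0.00% |
| 107/245/392 | T+V+D | 3.40% | 0.00% |
| 107/245/392 | M+A+D | 3.40% | 0.00% |
| 107/245/392 | K+T+S | 3.40% | 2.10% |
| 107/245/392 | T+A+D | 3.40% | 2.10% |
| 107/245/392 | K+A+D | 6.90% | 2.10% |
| 107/245/392 | T+T+V | 0.00% | 2.10% |
| 107/245/392 | S+A+N | 0.00% | 2.10% |
| 107/245/392 | T+N+N | 0.00% | 2.10% |
| 107/245/392 | K+V+N | 0.00% | 2.10% |
| 107/245/392 | K+Y+N | 0.00% | 2.10% |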
To show the predictive power of the mutations reported by our method, an SVM classifier was used to classify the non-response and response sequences. The ROC curves and corresponding AUC values are shown in Figure 5. The SVM was implemented with libsvm, and the hyperparameters were tuned by grid search. The kernel function was the radial basis function, which gave the best results compared with the linear, polynomial, and sigmoid kernels. We chose four combinations of mutation positions to illustrate the predictive power of our findings: (A) 44, 133, 226, 269, 276, 285, 288, 296, 304, 305, 382; (B) 24, 44, 107, 133, 135, 226, 304, 305, 422; (C) 23, 44, 61, 64, 71, 135, 245, 269, 276, 285, 291, 382, 388, 439, 445; (D) 107, 226, 388, 410, 439. Judging from the ROC curves and AUC values, these mutation positions show considerable discriminative and predictive power even with a small sample size (47 response sequences versus 29 non-response sequences).
Figure 5

ROC curve by using SVM with selected mutations as features. (A) Mutation positions: 44, 133, 226, 269, 276, 285, 288, 296, 304, 305, 382; (B) Mutation positions: 24, 44, 107, 133, 135, 226, 304, 305, 422; (C) Mutation positions: 23, 44, 61, 64, 71, 135, 245, 269, 276, 285, 291, 382, 388, 439, 445; (D) Mutation positions: 107, 226, 388, 410, 439.
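The classification step described above takes the amino acids at the selected positions as features and evaluates the classifier by AUC. The sketch below illustrates only those two pieces, in plain Python: a one-hot encoding of selected positions and a rank-based AUC. It does not train an SVM and does not reproduce the libsvm setup; the toy sequences, labels, and scores are hypothetical, and the function names are assumptions.

```python
def one_hot(sequences, positions):
    """One-hot encode the amino acids at the given 1-based positions.

    The alphabet of each position is the sorted set of residues observed
    at that position in `sequences`.
    """
    alphabet = {p: sorted({s[p - 1] for s in sequences}) for p in positions}
    return [[1 if s[p - 1] == aa else 0
             for p in positions for aa in alphabet[p]]
            for s in sequences]


def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: three short sequences, features from positions 1 and 2.
    toy = ["KLA", "KVA", "TLG"]
    print(one_hot(toy, [1, 2]))
    # Hypothetical decision scores from some classifier (1 = response).
    print(auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))
```

In practice the encoded feature vectors would be passed to the classifier, and the classifier's decision values would be passed to an AUC routine like the one above.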

Single positions that are significant under the Fisher test also reveal differences in amino acid frequency between response and non-response samples. At position 48, the frequency of amino acid R is 100% in non-response samples but only 70.2% in response samples. At position 81, the frequency of amino acid R is about 15% higher in response samples. These results, combined with the more reliable evidence from Table 4 and Table 5, give a relatively complete picture of the differences between non-response and response samples.
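The single-position comparison can be illustrated with a two-sided Fisher exact test on a 2x2 table of counts. The sketch below is an illustration under stated assumptions, not the authors' procedure: it assumes the test in question is Fisher's exact test, and the counts for position 48 are reconstructed from the reported percentages (29 of 29 non-response sequences and 33 of 47 response sequences carrying R), which involves rounding.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the first row contributes x items to column 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Assumed counts at position 48: rows are non-response and response,
    # columns are "amino acid R" and "other".
    print(fisher_exact_two_sided(29, 0, 33, 14))
```

The exact test is a natural choice here because several cells are zero or very small, where a chi-square approximation would be unreliable.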

Discussion

Using the method proposed by Zhang et al., which employs Bayesian statistical modeling, we were able to detect and analyze complex interactions among mutations in the HCV NS5A region. While this analysis presents a relatively comprehensive picture of the different pretreatment structures of non-response and response patients, it admittedly omits many other factors that may influence HCV mutations. Nevertheless, this study not only supports earlier findings on HCV drug resistance but also sheds light on the long-puzzling selection pattern behind HCV drug treatment effects. We expect that the method and results presented here will stimulate new and more accurate ways to decipher the mechanisms behind drug resistance in HCV and other related diseases.
References (19 in total)

1.  Prospective characterization of full-length hepatitis C virus NS5A quasispecies during induction and combination antiviral therapy.

Authors:  J Nousbaum; S J Polyak; S C Ray; D G Sullivan; A M Larson; R L Carithers; D R Gretch
Journal:  J Virol       Date:  2000-10       Impact factor: 5.103

2.  Predictive factors in eradicating hepatitis C virus using a relatively small dose of interferon.

Authors:  M Fukuda; K Chayama; A Tsubota; M Kobayashi; M Hashimoto; Y Miyano; H Koike; M Kobayashi; I Koida; Y Arase; S Saitoh; N Murashima; K Ikeda; H Kumada
Journal:  J Gastroenterol Hepatol       Date:  1998-04       Impact factor: 4.029

3.  Mutations in the nonstructural protein 5A gene and response to interferon in patients with chronic hepatitis C virus 1b infection.

Authors:  N Enomoto; I Sakuma; Y Asahina; M Kurosaki; T Murakami; C Yamamoto; Y Ogura; N Izumi; F Marumo; C Sato
Journal:  N Engl J Med       Date:  1996-01-11       Impact factor: 91.245

4.  Interferon alfa-2b alone or in combination with ribavirin as initial treatment for chronic hepatitis C. Hepatitis Interventional Therapy Group.

Authors:  J G McHutchison; S C Gordon; E R Schiff; M L Shiffman; W M Lee; V K Rustgi; Z D Goodman; M H Ling; S Cort; J K Albrecht
Journal:  N Engl J Med       Date:  1998-11-19       Impact factor: 91.245

5.  Clinical relevance of hepatitis C virus quasispecies. (Review)

Authors:  N Enomoto; C Sato
Journal:  J Viral Hepat       Date:  1995       Impact factor: 3.728

6.  Evidence for sequence selection within the non-structural 5A gene of hepatitis C virus type 1b during unsuccessful treatment with interferon-alpha.

Authors:  M Gerotto; F Dal Pero; D G Sullivan; L Chemello; L Cavalletto; S J Polyak; P Pontisso; D R Gretch; A Alberti
Journal:  J Viral Hepat       Date:  1999-09       Impact factor: 3.728

7.  Evidence that hepatitis C virus resistance to interferon is mediated through repression of the PKR protein kinase by the nonstructural 5A protein.

Authors:  M J Gale; M J Korth; N M Tang; S L Tan; D A Hopkins; T E Dever; S J Polyak; D R Gretch; M G Katze
Journal:  Virology       Date:  1997-04-14       Impact factor: 3.616

8.  Impact of NS5A sequences of Hepatitis C virus genotype 1a on early viral kinetics during treatment with peginterferon- alpha 2a plus ribavirin.

Authors:  Francesca Dal Pero; Kwok H Tang; Martina Gerotto; Gladis Bortoletto; Emma Paulon; Eva Herrmann; Stefan Zeuzem; Alfredo Alberti; Nikolai V Naoumov
Journal:  J Infect Dis       Date:  2007-08-21       Impact factor: 5.226

9.  Control of PKR protein kinase by hepatitis C virus nonstructural 5A protein: molecular mechanisms of kinase regulation.

Authors:  M Gale; C M Blakely; B Kwieciszewski; S L Tan; M Dossett; N M Tang; M J Korth; S J Polyak; D R Gretch; M G Katze
Journal:  Mol Cell Biol       Date:  1998-09       Impact factor: 4.272

10.  Hepatitis C virus nonstructural protein 5A contains potential transcriptional activator domains.

Authors:  K M Chung; O K Song; S K Jang
Journal:  Mol Cells       Date:  1997-10-31       Impact factor: 5.034
