Hong Thai1, James Lara2, Xiaojun Xu2,3, Kathryn Kitrinos4,5, Anuj Gaggar4, Henry Lik Yuen Chan6, Guo-Liang Xia2, Lilia Ganova-Raeva2, Yury Khudyakov2.
Abstract
Tenofovir disoproxil fumarate (TDF) is one of the nucleotide analogs capable of inhibiting the reverse transcriptase (RT) activity of HIV and hepatitis B virus (HBV). There is no known HBV resistance to TDF. However, detectable variation in the duration of HBV persistence in patients on TDF therapy suggests the existence of genetic mechanisms of on-drug persistence that reduce TDF efficacy for some HBV strains without affording actual resistance. Here, the whole genome of intra-host HBV variants (N = 1,288) was sequenced from patients with rapid (RR, N = 5) and slow response (SR, N = 5) to TDF. Association of HBV genomic and protein polymorphic sites with RR and SR was assessed using phylogenetic analysis and Bayesian network methods. We show that, in contrast to resistance to nucleotide analogs, which is mainly associated with a few specific mutations in RT, HBV on-TDF persistence is defined by genetic variations across the entire HBV genome. Analysis of the inferred 3D structures indicates no difference in the affinity of TDF binding by RT encoded by intra-host HBV variants that rapidly decline or persist in the presence of TDF. This finding suggests that effectiveness of TDF recognition and binding does not contribute significantly to on-drug persistence. Differences in patterns of genetic associations with TDF response between HBV genotypes B and C, and the lack of a single pattern of mutations among intra-host variants sensitive to TDF, indicate a complex genetic encoding of the trait. We hypothesize that there are many genetic mechanisms of on-drug persistence, which are differentially available to HBV strains. These pervasive mechanisms are insufficient to prevent viral inhibition completely but may contribute significantly to the robustness of actual resistance. On-drug persistence may reduce the overall effectiveness of therapy and should be considered in the development of more potent drugs.
Year: 2020 PMID: 32968103 PMCID: PMC7511938 DOI: 10.1038/s41598-020-72467-9
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Heterogeneity and phylogeny of HBV quasispecies strains of genotype (GT) C, B and E. Shown are the median joining networks (MJN) of the full HBV genomic quasispecies sequences sampled from (A) five SR patients (P1-P3, P8 and P10) and from (B) five RR patients (P4-P7 and P9), and (C) the phylogenetic tree. Sequences were sampled at three time points: baseline and weeks 4 and 40 of TDF therapy. GT C sequences are organized into two clades or clusters (C1 and C2). Nodes in the MJN and in the phylogenetic tree represent HBV variants. Nodes are colored by time point of sampling (as denoted in the color legend).
Table 1. Polymorphic nt sites associated with RR/SR response in HBV strains of GT C, B and E.
| Genome position | Gene ᵃ | Protein domain (codon position) ᵃ | Overlapping gene ᵃ |
|---|---|---|---|
| 61 | P 324 | Sp 147 (1st) | S 143 (3rd) |
| 706 | P 539 | RT 193 (1st) | S 358 (3rd) |
| 886 | P 599 | RT 253 (1st) | – |
| 1122 | P 667 | RT 331 (3rd) | – |
| 1221 | P 710 | RNAse H 20 (3rd) | – |
| 1320 | P 743 | RNAse H 53 (3rd) | – |
| 1499 | P 803 | RNAse H 113 (2nd) | X 42 (3rd) |
| 1786 | X 413 | X 138 (2nd) | – |
| 1856 | pre-C 43 | pre-C 15 (1st) | – |
| 1946 | C 133 | C 45 (1st) | – |
| 1976 | C 163 | C 55 (1st) | – |
| 2012 | C 199 | C 67 (1st) | – |
| 2075 | C 262 | C 88 (1st) | – |
| 2095 | C 282 | C 94 (3rd) | – |
| 2441 | P 45 | Terminal protein 45 (3rd) | C 210 (1st) |
| 2573 | P 89 | Terminal protein 89 (3rd) | – |
Listed are the positions of the CFS-derived subset of 16 nt polymorphic sites found across several regions of the HBV genome that were strongly associated (Merit = 0.755) to the response rate characteristics of TDF-treated patients. The CFS algorithm[23] was applied to the dataset of unique full-length HBV quasispecies sequences (N = 954) sampled from ten patients at three time points: baseline and week 4 and 40 during treatment. In total, 1,443 candidate subsets were evaluated by CFS (details in SI).
Position numbering is based on reference sequence: GenBank accession number AY233278.
ᵃ Gene, protein or protein domain abbreviations: polymerase gene (P), spacer (Sp), reverse transcriptase (RT), ribonuclease H (RNAse H), terminal protein, core gene (C), pre-core (pre-C), X gene (X), S gene (S). First, second or third codon positions are noted in parentheses. Non-overlapping positions are denoted by a dash (“–”).
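The Merit score reported for the CFS-derived subset can be illustrated with the standard CFS merit heuristic, which rewards features that correlate with the class and penalizes features that correlate with each other. The sketch below assumes the usual formulation, Merit = k·r_cf / sqrt(k + k(k−1)·r_ff); the function name and the input values are hypothetical, and the correlation measure and search settings actually used in the study are described only in its SI.

```python
import math


def cfs_merit(k, mean_feature_class_corr, mean_feature_feature_corr):
    """Merit of a k-feature subset under the standard CFS heuristic.

    k: number of features in the subset
    mean_feature_class_corr: average feature-to-class correlation (r_cf)
    mean_feature_feature_corr: average feature-to-feature correlation (r_ff)
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    numerator = k * mean_feature_class_corr
    denominator = math.sqrt(k + k * (k - 1) * mean_feature_feature_corr)
    return numerator / denominator


# Illustrative values only (not taken from the study):
# four fully redundant features score no better than a single one.
print(cfs_merit(4, 0.5, 1.0))
```

With fully redundant features (r_ff = 1) the merit collapses to the single-feature correlation, which is why a subset search of this kind tends to favor sites spread across weakly correlated genome regions.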
Figure 2. Relevant nt sites associated with RR/SR. The BN was generated using full HBV genomic quasispecies sequences (N = 954) of GT C, B and E sampled from ten patients at three time points: baseline and weeks 4 and 40 of TDF therapy. Round nodes in the graph represent the 16 polymorphic nt sites (Table 1) and the square node represents the response (“target”) variable. Round nodes are colored by genomic region (see legend of Fig. 4). Dependencies (relationships) between the response and nt sites are displayed as blue arcs and inter-dependencies between the sites as black arcs. The average strength of the relationship between a node and the target was small but significant (KL = 0.19, P < 0.05). However, four relationships in the network, namely the arcs between the target and the nodes representing genome positions (p) 886, 1946, 2075 and 2441, could not be statistically supported (P > 0.05). Nonetheless, this BN was found useful for prediction of RR/SR association (Tables S1 and S2 in SI).
Figure 4. Genome-wide dependencies among polymorphic sites. The BN was generated using 799 whole-genome quasispecies from 6 HBV/C-infected patients treated with TDF (see “BN Section” in SI). Round nodes in the graph represent polymorphic nt sites and arcs represent significant (SC = 1.0, P < 0.001) dependency relationships. The BN comprises a major 215-variable component (212 nt sites plus the response, time-point and phylogenetic cluster variables, the latter three shown as square yellow nodes) and 11 minor components representing 30 nt sites. Nodes are colored by nine regions (genome positions in parentheses): overlapping S–P genes (1–837), RT domain (838–1163), RNAse H domain (1164–1375), overlapping RNAse H–X (1376–1625), X gene (1626–1815), overlapping X–C genes (1816–1840), C gene (1841–2308), overlapping C–P genes (2309–2454) and Terminal protein domain (2455–3215). Position numbering is based on reference sequence: GenBank accession number AY233278.
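The nine-region coloring scheme in this legend amounts to a simple interval lookup. As a minimal sketch, the following maps a genome position to its region using only the coordinate ranges listed above (AY233278 numbering); the names `REGIONS` and `region_of` are hypothetical and are not part of the authors' pipeline.

```python
# Region boundaries copied from the Figure 4 legend (AY233278 numbering).
# REGIONS and region_of are illustrative names, not from the study.
REGIONS = [
    (1, 837, "overlapping S-P genes"),
    (838, 1163, "RT domain"),
    (1164, 1375, "RNAse H domain"),
    (1376, 1625, "overlapping RNAse H-X"),
    (1626, 1815, "X gene"),
    (1816, 1840, "overlapping X-C genes"),
    (1841, 2308, "C gene"),
    (2309, 2454, "overlapping C-P genes"),
    (2455, 3215, "Terminal protein domain"),
]


def region_of(position):
    """Return the legend region containing a 1-based genome position."""
    for start, end, name in REGIONS:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside 1-3215")


# Look up a few of the Table 1 sites.
for pos in (61, 886, 1499, 2441):
    print(pos, region_of(pos))
```

Applied to the Table 1 sites, the lookup places position 1499 in the overlapping RNAse H–X region and position 2441 in the overlapping C–P region, in line with the overlaps listed in that table.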
Figure 3. Physicochemical clustering of HBV sequences. Shown is the SOTA-based[24,25] grouping, by (A) response and by (B) genotype, of physicochemical profiles representing full HBV quasispecies sequences (N = 954) collected from ten TDF-treated patients at three time points. Group 1 (neuron 3, in red) was mostly (97.4%) comprised of HBV GT C (clusters C1 and C2; see Fig. 1C), B and E strains sampled from SR patients. Group 2 (neuron 2, in cyan) was comprised of strains derived solely from RR patients infected with HBV/C or HBV/B. Only thirteen of the 47 unique variants sampled from one HBV/B-infected RR patient were found to be members of Group 1. The red arrow denotes the boundary between the two groups. The 16-nt-based physicochemical profile representation of HBV sequences was generated using a scale of five physicochemical properties of DNA nucleotides[26].
Table 2. Frequency distributions of the RR/SR state given the observation of specific nt/aa states.
| Genome nt position ᵇ | nt state ᵃ | Response: SR (53.4%) | Response: RR (46.6%) | P aa position ᵇ | RNAse H aa position ᵇ | aa state ᵃ | Response: SR (47.2%) | Response: RR (52.8%) |
|---|---|---|---|---|---|---|---|---|
| 1501 | G (45.9%) | | | 803 ᶜ | 113 ᶜ | R (48.1%) | | |
| | A (54.1%) | | | | | H (51.9%) | | |
| 1322 | C (53.4%) | | | 743 | 53 | N (47.2%) | | |
| | A (46.6%) | | | | | K (52.4%) | | |
| | | | | | | T (0.4%) | | |
| 1231 | A (53.7%) | | | 713 | 23 | R (51.9%) | | |
| | T (0.1%) | | | | | S (0.2%) | | |
| | G (46.2%) | | | | | Q (47.6%) | | |
| | | | | | | L (0.2%) | | |
| 1223 | A (31.9%) | | | 710 ᵈ | 20 ᵈ | I (86.0%) | | |
| | G (21.5%) | | | | | M (13.7%) | | |
| | T (30.2%) | | | | | V (0.2%) | | |
| | C (16.4%) | | | | | | | |
Frequency distributions of the SR/RR state associated with specific nt and aa polymorphisms (i.e., posterior conditional probabilities in the BN) are shown in italicized cells.
Frequency distributions, in the data, of nt and aa states and of SR and RR phenotypes (i.e., prior conditional probabilities in the BN) are shown in parentheses.
BN analysis (details of analysis in SI) was performed on the dataset of unique full-length HBV GT C quasispecies nt sequences (N = 799) and polymerase aa sequences (N = 422) sampled from six patients (P1-P6) at three time points: baseline and week 4 and 40 during treatment.
ᵃ One-letter symbols denote the nt states: guanine (G), adenine (A), cytosine (C) and thymine (T); and the aa states: arginine (R), histidine (H), asparagine (N), lysine (K), threonine (T), serine (S), glutamine (Q), leucine (L), isoleucine (I), methionine (M) and valine (V).
ᵇ Position numbering in polymerase protein (P) and ribonuclease H (RNAse H) is based on reference GenBank sequence AF458665.1.
ᶜ Four variants sampled at week 4 from RR patients (three from P5 and one from P4) had R at this position.
ᵈ All HBV quasispecies sequences from SR P1 and one variant sampled at week 4 from SR P3 had M at this position, and one variant sampled at baseline from RR P6 had V at this position.
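The relation between a state's frequency in the data and the response distribution given that state can be illustrated with a plain conditional-frequency count. This is a simplified sketch, not the BN inference used in the study (whose posteriors come from the fitted network, see SI); the function name and the toy observations are invented for illustration and are not study data.

```python
from collections import Counter


def response_given_state(observations):
    """Estimate P(response | state) from (state, response) pairs.

    Returns a nested dict: {state: {response: conditional frequency}}.
    """
    observations = list(observations)
    joint = Counter(observations)
    state_totals = Counter(state for state, _ in observations)
    result = {}
    for (state, response), count in joint.items():
        result.setdefault(state, {})[response] = count / state_totals[state]
    return result


# Toy observations (hypothetical, not from the paper):
# each pair is (nt state at one site, patient response class).
toy = [("G", "SR")] * 3 + [("G", "RR")] + [("A", "RR")] * 4
print(response_given_state(toy))
```

In this toy set, state G is seen mostly in SR sequences and state A only in RR sequences, which is the kind of skew the posterior columns of the table are meant to report.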
Figure 5. Mapping of HBV/C RT polymorphic aa sites relevant to SR/RR association. Shown is the predicted 3D structure of the HBV-RT/DNA-RNA/TFV-DP complex representative of the 345-aa-long RT protein of HBV GT C strains[34]. The 22 aa sites represented in the RT-BNC (Table S3 in SI) are shown as sticks (in purple). Potential effectors (N = 4) of the ligand–protein interaction are marked with the corresponding RT positions. RT 3D-structure coloring scheme: fingers in cyan and gold, palm in green, and thumb in red. TDF and DNA/RNA ligands are depicted with ball-and-stick (CPK colors) and cartoon (grey) representations, respectively. Rendering was done using the VMD software[35]. Position numbering is based on reference sequence: GenBank accession number AF458665.1.