Vincent Montoya, Anita Y M Howe, Weiyan Y Dong, Winnie Dong, Chanson J Brumme, Andrea D Olmstead, Kanna Hayashi, P Richard Harrigan, Jeffrey B Joy.
Abstract
Most individuals chronically infected with hepatitis C virus (HCV) are asymptomatic during the initial stages of infection, so the precise timing of infection is often unknown. Retrospective estimation of infection duration would improve existing surveillance data and help guide treatment. While intra-host viral diversity measures such as Shannon entropy have previously been used to estimate duration of infection, these studies characterized the viral population from only a relatively short segment of the HCV genome. In this study, intra-host diversities were examined across the HCV genome in order to identify the region most reflective of time and the degree to which these estimates are influenced by high-risk activities, including those associated with HCV acquisition. Shannon diversities were calculated for all regions of HCV from 78 longitudinally sampled individuals with known seroconversion timeframes. While the region of the HCV genome most accurately reflecting time resided within the NS3 gene, the gene region with the highest capacity to differentiate acute from chronic infections was identified within the NS5b region. Multivariate models predicting duration of infection from viral diversity improved significantly upon incorporation of variables associated with recent public, unsupervised drug use. These results could assist the development of strategic population treatment guidelines for high-risk individuals infected with HCV and offer insights into variables associated with a likelihood of transmission.
Year: 2021 PMID: 33976241 PMCID: PMC8113533 DOI: 10.1038/s41598-021-88132-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Behavioural variables measured in this study for 78 participants.
| Clinical variable | Individuals | Samples |
|---|---|---|
| HIV positive | 9 (11.5) | 10 (7.9) |
| Homelessness | 29 (37.2) | 36 (28.6) |
| Daily injection drug use in L6M | 54 (69.2) | 74 (58.7) |
| Injection drug use in L6M | 71 (91) | 106 (84.1) |
| Daily heroin use in L6M | 35 (44.9) | 49 (38.9) |
| Heroin in L6M | 59 (75.6) | 84 (66.7) |
| Daily cocaine use in L6M | 22 (28.2) | 25 (20) |
| Cocaine in L6M | 52 (66.7) | 70 (55.6) |
| Daily meth in L6M | 4 (5.1) | 4 (3.2) |
| Meth in L6M | 14 (17.9) | 16 (12.7) |
| Daily prescription opioid use in L6M | 6 (7.7) | 8 (6.3) |
| Prescription opioid use in L6M | 25 (32.1) | 32 (25.4) |
| Daily non-injection crack use in L6M | 16 (20.5) | 21 (16.7) |
| Non-injection crack use in L6M | 36 (46.2) | 53 (42.1) |
| Heavy alcohol use in L6M | 5 (21.7) | 6 (18.2) |
| Daily alcohol in L6M | 10 (12.8) | 11 (8.7) |
| Syringe sharing in L6M | 25 (32.1) | 31 (24.8) |
| Syringe borrowing in L6M | 20 (25.6) | 24 (19) |
| Syringe lending in L6M | 18 (23.1) | 22 (17.5) |
| Public drug use in L6M | 50 (64.1) | 66 (52.4) |
| Sex work in L6M | 14 (17.9) | 22 (17.5) |
| Any opioid use in L6M | 8 (34.8) | 9 (27.3) |
| Methadone treatment in L6M | 27 (34.6) | 34 (27) |
| Jail in L6M | 26 (33.8) | 32 (25.6) |
| History of mental illness | 27 (44.3) | 37 (42) |
| Mental illness in L6M | 7 (12.1) | 7 (8.2) |
| Daily marijuana use in L6M | 18 (23.1) | 24 (19) |
| Marijuana use in L6M | 48 (61.5) | 78 (61.9) |
| Living in the downtown eastside | 39 (50) | 58 (46) |
| Any resistance mutation* | 26 (33.3) | 42 (33.3) |
| Q80K resistance* | 21 (26.9) | 34 (27) |
Counts for all behavioural variables available for analysis in this study for each individual/sample as well as the percent positive (in parentheses).
*Measured from deep sequencing analyses. L6M represents last 6 months prior to sample collection. Positives for each variable are defined as positive at any point during sampling.
Figure 1. Phylogenetic tree of HCV consensus sequences and genotype counts. Maximum likelihood phylogenetic tree of HCV WGS. Reference genomes for each genotype are shown with black points and different coloured tips signify those included in this study. This tree was rooted on a genotype 7 reference sequence. Stars indicate mixed infections which were removed from subsequent analyses, red for sample 33 and blue for sample 983 (two longitudinal samples). Inset bar chart depicts counts of genotypes for each sample (including mixed and genotype switches) separated by time point.
Figure 2. Shannon diversity across the HCV genome and over time. Read depth normalized Shannon diversities for each gene grouped by timeframe of infection: less than 6 months (< 6 mo), between 6 months and 1 year (< 1 yr), between 1 and 2 years (< 2 yr), between 1 and 3 years (< 3 yr), between 3 and 4 years (< 4 yr), and greater than 5 years (> 5 yrs).
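As a rough illustration of the diversity measure named in this caption, the sketch below computes per-site Shannon entropy from base counts and averages it over a genome window. The function names, the toy counts, the use of natural logarithms, and the averaging over sites are assumptions made for illustration; the record does not describe how the study normalized for read depth, so that step is not reproduced here.

```python
import math

def site_shannon(counts):
    """Shannon entropy (natural log) of one alignment column.

    `counts` maps each base to the number of reads supporting it.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total
            entropy -= p * math.log(p)
    return entropy

def window_diversity(columns, start, end):
    """Mean per-site entropy over 1-based inclusive positions start..end.

    Averaging over sites is an assumption, not the study's stated method.
    """
    window = columns[start - 1:end]
    if not window:
        raise ValueError("empty window")
    return sum(site_shannon(c) for c in window) / len(window)

# Hypothetical read counts for three alignment columns (made-up data).
columns = [
    {"A": 100, "C": 0, "G": 0, "T": 0},    # no variation
    {"A": 50, "C": 50, "G": 0, "T": 0},    # two equally frequent bases
    {"A": 25, "C": 25, "G": 25, "T": 25},  # four equally frequent bases
]

print(window_diversity(columns, 1, 3))
```

A window written as `499:598` in the tables would correspond to `window_diversity(columns, 499, 598)` under this sketch's 1-based, inclusive convention, which is itself an assumption about the table notation.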
Akaike information criteria for models predicting time from diversity.
| Gene | Window | lm AIC | MM AIC | lm R2 |
|---|---|---|---|---|
| NS3 | 499:598 | 279.90 | 281.66 | 0.27 |
| NS3 | 500:599 | 279.67 | 281.44 | 0.27 |
| NS3 | 1561:1660 | 285.05 | 286.49 | 0.23 |
| E1 | 263:462 | 284.81 | 284.33 | 0.24 |
| NS3 | 526:825 | 286.25 | 287.87 | 0.23 |
| NS5a | 817:1116 | 291.80 | 292.96 | 0.19 |
| NS5b | 702:901 | 290.16 | 292.16 | 0.20 |
| NS5a | 750:1149 | 292.74 | 293.78 | 0.18 |
| NS3 | 1205:1604 | 291.12 | 292.56 | 0.19 |
| NS3 | 1215:1414 | 291.98 | 293.52 | 0.19 |
| NS5b | 657:756 | 292.84 | 294.46 | 0.18 |
| E2 | 1:81 | 301.41 | 303.34 | 0.12 |
Shown are the linear mixed model AICs (MM AIC), the linear model AICs (lm AIC), and the linear model R2 values (lm R2). The models with the best AICs are shown for each region. Additional overlapping regions within each 100, 200, 300, 400, and 500 bp window had similar AIC values; unique regions are preferentially displayed.
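To make the AIC and R2 columns concrete, here is a minimal sketch of how both quantities can be computed for a single-predictor least-squares model. It assumes a Gaussian likelihood with three estimated parameters (intercept, slope, residual variance); the helper names and the toy data are hypothetical, and the mixed-model AIC is not covered.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residual_ss(x, y):
    """Residual sum of squares of the fitted line."""
    intercept, slope = fit_line(x, y)
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

def gaussian_aic(x, y):
    """AIC = 2k - 2 log L at the maximum-likelihood estimates.

    Assumes k = 3 (intercept, slope, residual variance) and rss > 0.
    """
    n = len(x)
    rss = residual_ss(x, y)
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    return 2 * 3 - 2 * log_lik

def r_squared(x, y):
    """Proportion of variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    total_ss = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - residual_ss(x, y) / total_ss

# Made-up example: diversity (x) against years since infection (y).
diversity = [0.0, 1.0, 2.0, 3.0]
years = [0.0, 1.0, 1.0, 2.0]
print(gaussian_aic(diversity, years), r_squared(diversity, years))
```

Lower AIC indicates a better trade-off between fit and parameter count, which is how the windows in the table are ranked against each other.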
Significant changes in viral diversity due to behavioural variables.
| Gene | Variable | 1st sample | Last sample |
|---|---|---|---|
| Core | Heroin in L6M | ||
| E1 | Heroin in L6M | ||
| NS3 | Heroin in L6M | ||
| NS4a | Heroin in L6M | 0.07 | |
| NS4b | Heroin in L6M | 0.07 | |
| NS5a | Heroin in L6M | 0.06 | |
| NS5b | Heroin in L6M | ||
| Core | Injection drug use in L6M | ||
| NS3 | Injection drug use in L6M | 0.17 | |
| NS5b | Injection drug use in L6M | 0.19 | |
| Core | Cocaine in L6M | 0.17 | |
| E1 | Cocaine in L6M | 0.28 | |
| NS4b | Cocaine in L6M | 0.27 | |
| NS5b | Cocaine in L6M | 0.37 | |
| Core | Daily cocaine in L6M | 0.13 | |
| NS4b | Daily cocaine in L6M | 0.32 | |
| NS5b | Daily cocaine in L6M | 0.5 |
Diversities were compared for each individual's first and final time points for each group using a Mann–Whitney–Wilcoxon test. L6M—last 6 months. Significance was determined with p values less than 0.05 (in bold).
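The Mann–Whitney–Wilcoxon comparison mentioned above can be sketched in plain Python as follows. This version computes the U statistics with average ranks for ties and a two-sided p value from the normal approximation, without tie or continuity correction; those simplifications, the function names, and the sample values are assumptions, and a statistics package would normally be used instead.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two independent samples."""
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    return u_a, len(a) * len(b) - u_a

def normal_approx_p(a, b):
    """Two-sided p value from the normal approximation (no corrections)."""
    u_a, u_b = mann_whitney_u(a, b)
    n1, n2 = len(a), len(b)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (min(u_a, u_b) - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))

# Made-up diversities for individuals without and with a reported behaviour.
without_variable = [0.10, 0.12, 0.15]
with_variable = [0.20, 0.22, 0.30]
print(mann_whitney_u(without_variable, with_variable))
print(normal_approx_p(without_variable, with_variable))
```

With samples this small an exact test would be preferable; the normal approximation is shown only because it fits in a few lines.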
Linear models with and without behavioural variables.
| Gene | Window | Raw AIC | Step AIC | Step AIC R2 | Variables |
|---|---|---|---|---|---|
| NS3 | 499:598 | 279.90 | 259.17 | 0.46 | Public drug use |
| NS3 | 500:599 | 279.67 | 259.22 | 0.45 | Public drug use |
| NS3 | 1561:1660 | 285.05 | 260.94 | 0.45 | Cocaine + public drug use + any res |
| E1 | 263:462 | 284.81 | 265.83 | 0.42 | Homelessness |
| NS3 | 526:825 | 286.25 | 266.09 | 0.42 | Public drug use |
| NS5a | 817:1116 | 291.80 | 272.12 | 0.39 | Homelessness + public drug use |
| NS5b | 702:901 | 290.16 | 272.41 | 0.38 | Homelessness + public drug use |
| NS5a | 750:1149 | 292.74 | 272.57 | 0.39 | Public drug use |
| NS3 | 1205:1604 | 291.12 | 273.05 | 0.39 | Homelessness + public drug use |
| NS3 | 1215:1414 | 291.98 | 274.07 | 0.37 | Homelessness + public drug use |
| NS5b | 657:756 | 292.84 | 274.48 | 0.37 | Homelessness |
| NS5b | 658:757 | 292.99 | 275.00 | 0.37 | Homelessness |
| E2 | 1:81 | 301.41 | 284.15 | 0.31 | Homelessness |
Models were built for each full gene as well as for each specified region. AICs are shown for linear models without clinical data generated from the data set (Raw AIC) as well as when all variables were included (step AIC) along with the R2 values for the step AIC models (stepAIC R2). Any Res. are samples with any resistant positions detected.
Figure 3. Measuring the time to infection prediction capacity of diversities from each gene/region. Shown are the root mean square error (RMSE) values used to measure the capacity of diversities from each region to predict duration of infection. The top 17 models determined by their median RMSE values are shown for all regions tested.
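For readers unfamiliar with the metric, the sketch below shows how an RMSE and a median RMSE over several evaluation rounds can be computed. The caption does not say how the rounds were generated, so the list of (observed, predicted) pairs here is a hypothetical stand-in, as are the function names.

```python
import math
import statistics

def rmse(observed, predicted):
    """Root mean square error between observed and predicted durations."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def median_rmse(rounds):
    """Median RMSE over an iterable of (observed, predicted) pairs."""
    return statistics.median(rmse(o, p) for o, p in rounds)

# Made-up evaluation rounds: infection duration in years.
rounds = [
    ([0.5, 2.0, 4.0], [1.0, 2.5, 3.0]),
    ([1.0, 3.0, 5.0], [1.5, 2.0, 4.5]),
    ([0.5, 1.5, 2.5], [0.5, 2.5, 2.0]),
]
print(median_rmse(rounds))
```

A lower median RMSE means the diversity in that region predicts duration of infection more closely, in the same units as the duration itself.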
Area under the receiver operating characteristic curves for the differentiation of acute from chronic infections.
| Gene | Window | Total samples | Single sample acute group |
|---|---|---|---|
| NS5b | 657:756 | 0.85 | 0.85 |
| NS3 | 1215:1414 | 0.84 | 0.85 |
| NS5b | 702:901 | 0.84 | 0.84 |
| NS5a | 750:1149 | 0.84 | 0.83 |
| NS3 | 470:969 | 0.81 | 0.81 |
| NS3 | 499:598 | 0.80 | 0.81 |
| NS3 | 1561:1660 | 0.80 | 0.81 |
| E2 | 1:81 | 0.79 | 0.81 |
| E1 | 263:462 | 0.78 | 0.79 |
Total samples are those AUC values obtained from analyses using the total samples from this dataset, whereas 'single sample acute group' removes longitudinal data and retains the earliest acute and latest chronic samples of each individual.
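The AUC values in this table can be read as the probability that a randomly chosen chronic sample has a higher diversity than a randomly chosen acute sample. The sketch below computes that quantity directly by pairwise comparison; treating chronic as the higher-scoring class, the function name, and the toy diversities are assumptions for illustration only.

```python
def auc(higher_class_scores, lower_class_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (higher, lower) pairs ranked correctly,
    with ties counted as half a correct pair.
    """
    if not higher_class_scores or not lower_class_scores:
        raise ValueError("both classes need at least one score")
    correct = 0.0
    for h in higher_class_scores:
        for l in lower_class_scores:
            if h > l:
                correct += 1.0
            elif h == l:
                correct += 0.5
    return correct / (len(higher_class_scores) * len(lower_class_scores))

# Made-up window diversities for chronic and acute samples.
chronic = [0.30, 0.45, 0.25, 0.50]
acute = [0.10, 0.28, 0.15]
print(auc(chronic, acute))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, so the values of 0.78 to 0.85 reported above indicate useful but imperfect separation of acute from chronic infections.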