Louisa A Carlisle, Teja Turk, Karin J Metzner, Herbert A Mbunkah, Cyril Shah, Jürg Böni, Michael Huber, Dominique L Braun, Jan Fehr, Luisa Salazar-Vizcaya, Andri Rauch, Sabine Yerly, Aude Nguyen, Matthias Cavassini, Marcel Stoeckle, Pietro Vernazza, Enos Bernasconi, Huldrych F Günthard, Roger D Kouyos.
Abstract
HIV-1 genetic diversity can be used to infer time since infection (TSI) and infection recency. We adapted this approach for HCV and identified genomic regions with informative diversity. We included 72 HCV/HIV-1 coinfected participants of the Swiss HIV Cohort Study for whom reliable estimates of infection date and viral sequences were available. Average pairwise diversity (APD) was calculated over each codon position for the entire open reading frame of HCV. Using cross validation, we evaluated the correlation of APD with TSI and its ability to infer TSI via a linear model. We additionally studied the ability of diversity to classify infections as recent (infected for <1 year) or chronic, using the receiver-operator-characteristic area under the curve (ROC-AUC) in the 50 patients whose infection could be unambiguously classified as either recent or chronic. Measuring HCV diversity over third or all codon positions performed similarly, and notably better than measuring it over first or second codon positions. APD calculated over the entire genome enabled classification of infection recency (ROC-AUC = 0.76). Additionally, APD correlated with TSI (R² = 0.33) and could predict TSI (mean absolute error = 1.67 years). Restricting the region over which APD was calculated to E2-NS2 further improved accuracy (ROC-AUC = 0.85, R² = 0.54, mean absolute error = 1.38 years). Genetic diversity in HCV correlates with TSI and is a proxy for infection recency and TSI, even several years post-infection.
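As an illustration of the diversity measure described in the abstract, the following Python sketch computes an APD value from per-site base counts, optionally restricted to one codon position. The exact formula, coverage requirements, and frequency thresholds used in the study are not given in this record, so the definition here (per-site probability that two randomly drawn reads differ, averaged over sites) is an assumption, and the function names and toy counts are hypothetical.

```python
def site_diversity(counts):
    """Probability that two reads drawn at random (with replacement) differ at a site.

    `counts` maps each base to its read count at that site.
    Returns None for a site without coverage.
    """
    total = sum(counts.values())
    if total == 0:
        return None
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def average_pairwise_diversity(site_counts, codon_position=None):
    """Mean per-site diversity over an open reading frame.

    `site_counts` is a list of count dicts; index 0 is the first nucleotide
    of the reading frame. `codon_position` is 1, 2, 3, or None for all sites.
    (Assumed definition; the study's exact formula is not in this record.)
    """
    values = []
    for index, counts in enumerate(site_counts):
        if codon_position is not None and index % 3 != codon_position - 1:
            continue
        diversity = site_diversity(counts)
        if diversity is not None:
            values.append(diversity)
    if not values:
        raise ValueError("no covered sites in the requested selection")
    return sum(values) / len(values)


# Toy data for one codon (hypothetical counts, not from the study).
toy_sites = [{"A": 100}, {"C": 90, "T": 10}, {"G": 50, "A": 50}]
apd_third = average_pairwise_diversity(toy_sites, codon_position=3)
apd_all = average_pairwise_diversity(toy_sites)
```

In this toy codon, the third position carries most of the diversity, which mirrors the general expectation that synonymous changes accumulate more freely there.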
Keywords: genetic variation; hepatitis C virus infection; infection recency; sequence analysis; viral genomics
Year: 2020 PMID: 33142675 PMCID: PMC7692400 DOI: 10.3390/v12111241
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
Patient characteristics.
| Characteristic | Category | Value |
|---|---|---|
| Total number | | 72 |
| Gender, n (%) | Female | 2 (3) |
| | Male | 70 (97) |
| Age when sample taken (years), median (IQR) | | 45 (39, 52) |
| Ethnicity, n (%) | Asian | 2 (3) |
| | Black | 2 (3) |
| | Hispanic | 4 (4) |
| | White | 64 (89) |
| HIV transmission group, n (%) | HET | 3 (4) |
| | MSM | 67 (93) |
| | IDU | 1 (1) |
| | Unclear/unknown | 1 (1) |
| Recorded history of intravenous drug use ever, n (%) | Yes | 9 (13) |
| | No | 63 (88) |
| HCV viral subtype, n (%) | 1A | 50 (69) |
| | 1B | 5 (7) |
| | 2C | 1 (1) |
| | 3A | 2 (3) |
| | 4D | 14 (19) |
| Time since HCV infection (years), median (IQR) | | 0.82 (0.47, 2.5) |
| Clearly recent or chronic a, n (%) | True | 50 (69) |
| | False | 22 (31) |
| Full coverage of gene at all codon positions, n (%) | | |
| | | 69 (96) |
| | | 69 (96) |
| | | 70 (97) |
| | | 70 (97) |
| | | 70 (97) |
| | | 71 (99) |
| | | 71 (99) |
| | | 70 (97) |
| | | 69 (96) |
| | | 65 (90) |

IQR = Interquartile range, HET = heterosexual contacts, MSM = men who have sex with men, IDU = injection drug use. a Sample collection less than 12 months after last negative HCV test or more than 12 months after first positive HCV test.
Figure 1. Average pairwise diversity (APD) against time since infection for APD calculated over each of the codon positions in turn, and over all three codon positions. Linear regression models are shown as solid lines.
Figure 2. Receiver operator characteristics (ROC) curves comparing the ability of average pairwise diversity (APD) calculated over each and all codon positions to infer whether infections are recent (<1 year post-infection) or chronic. APD was calculated across the whole HCV open reading frame. All 50 patients who could be clearly classified as recent or chronic are included. Recent infection is taken as the positive outcome. AUC = area under the ROC curve.
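The AUC reported for such curves can be read as the probability that a randomly chosen recent infection is ranked ahead of a randomly chosen chronic one. A minimal Python sketch of that rank-based calculation follows. It assumes that lower APD points towards recent infection, so APD is negated to score the positive ("recent") class; the function names and the toy values are hypothetical and not taken from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    `labels` are True for the positive class. Ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def recency_auc(apd_values, is_recent):
    """AUC for classifying recency from APD, with recent as the positive outcome.

    Assumption: diversity grows with time since infection, so a lower APD
    is treated as stronger evidence of a recent infection.
    """
    return roc_auc([-apd for apd in apd_values], is_recent)


# Toy example (hypothetical APD values and labels).
toy_apd = [0.001, 0.005, 0.002, 0.004, 0.006]
toy_recent = [True, True, True, False, False]
toy_auc = recency_auc(toy_apd, toy_recent)
```

In the toy data, five of the six recent/chronic pairs are ordered correctly, so the AUC is 5/6.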
Figure 3. Area under the ROC curve, R², and mean absolute error across the HCV open reading frame, all codon positions. The HCV open reading frame was split into 11 overlapping regions of approximately 500 amino acid codons, and average pairwise diversity (APD) was calculated over individual regions, using all codon positions. Regions were tested for their ability to categorize infection as recent (<1 year) or chronic (top), their correlation with time since infection (middle), and their ability to infer time since infection (bottom). Black dashed lines show the respective values for APD calculated over the whole open reading frame. A similar analysis was performed with diversity calculated over each gene in turn. The HCV genome is shown along the x-axis, with genes colour-coded for a composite (z-score sum, see Supplementary Equation S1) of all three outcome scores. Darker red indicates a better overall performance. Numbers along the x-axis refer to amino acid positions of the H77 reference genome.
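The windowing and composite score described in this caption can be sketched in Python as below. Several details are assumptions made for illustration: an open reading frame of 3011 codons, a window width of 502 codons (chosen so that one window is 503–1004), evenly spaced window starts, and a composite that adds the z-scores of AUC and R² and subtracts the z-score of the mean absolute error. Supplementary Equation S1 is not part of this record, so the composite here may differ from the one used in the study.

```python
from statistics import mean, pstdev


def overlapping_windows(orf_length, width, n_windows):
    """Evenly spaced, overlapping codon windows as 1-based inclusive (start, end) pairs."""
    span = orf_length - width
    windows = []
    for k in range(n_windows):
        # Integer round-half-up of k * span / (n_windows - 1).
        offset = (2 * k * span + (n_windows - 1)) // (2 * (n_windows - 1))
        windows.append((offset + 1, offset + width))
    return windows


def zscores(values):
    """Standardise values using the population standard deviation."""
    centre, spread = mean(values), pstdev(values)
    return [(v - centre) / spread if spread > 0 else 0.0 for v in values]


def composite_scores(aucs, r2s, maes):
    """Sum of z-scores per region; MAE is negated because lower error is better.

    Assumed form of the composite, not the study's Supplementary Equation S1.
    """
    return [a + r - e for a, r, e in zip(zscores(aucs), zscores(r2s), zscores(maes))]


# Assumed ORF length and window width (illustrative values).
windows = overlapping_windows(orf_length=3011, width=502, n_windows=11)

# Toy outcome scores for three regions (hypothetical numbers).
toy_composite = composite_scores([0.7, 0.8, 0.9], [0.2, 0.4, 0.6], [2.0, 1.5, 1.0])
best_region = toy_composite.index(max(toy_composite))
```

Under these assumptions the third window spans codons 503–1004 and the last window ends at codon 3011.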
Figure 4. Time since infection against estimated time since infection as calculated from average pairwise diversity (APD). APD calculated over the recommended region of amino acid codons 503–1004 (left), and the recommended genes E2, p7, and NS2 (right).
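Estimates of this kind come from a linear model evaluated by cross validation, with the mean absolute error as the accuracy measure. The Python sketch below shows one way this could work, assuming a simple least-squares regression of TSI on APD and leave-one-out cross validation. The record does not state the cross-validation scheme or the model details, so both are assumptions, and the function names and toy data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_mae(apd, tsi):
    """Mean absolute error of TSI predictions, each made by a model
    fitted without the sample being predicted (assumed scheme)."""
    errors = []
    for i in range(len(apd)):
        train_x = apd[:i] + apd[i + 1:]
        train_y = tsi[:i] + tsi[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predicted = slope * apd[i] + intercept
        errors.append(abs(predicted - tsi[i]))
    return sum(errors) / len(errors)


# Toy data (hypothetical): APD values and known TSI in years.
toy_apd = [0.001, 0.002, 0.003, 0.004, 0.005]
toy_tsi = [0.4, 1.1, 1.4, 2.2, 2.4]
toy_mae = leave_one_out_mae(toy_apd, toy_tsi)
```

Holding out each sample in turn keeps the error estimate from being flattered by a model that has already seen the sample it is asked to predict.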