Mona Riemenschneider, Kieran Y Cashin, Bettina Budeus, Saleta Sierra, Elham Shirvani-Dastgerdi, Saeed Bayanolhagh, Rolf Kaiser, Paul R Gorry, Dominik Heider.
Abstract
Antiretroviral treatment of Human Immunodeficiency Virus type-1 (HIV-1) infections with CCR5-antagonists requires the co-receptor usage prediction of viral strains. Currently available tools are mostly designed based on subtype B strains and thus are in general not applicable to non-B subtypes. However, HIV-1 infections caused by subtype B only account for approximately 11% of infections worldwide. We evaluated the performance of several sequence-based algorithms for co-receptor usage prediction applied to subtype A V3 sequences, including circulating recombinant forms (CRFs), and subtype C strains. We further analysed sequence profiles of gp120 regions of subtype A, B and C to explore functional relationships to entry phenotypes. Our analyses clearly demonstrate that state-of-the-art algorithms are not useful for predicting co-receptor tropism of subtype A and its CRFs. Sequence profile analysis of gp120 revealed molecular variability in subtype A viruses. In particular, the V2 loop region could be associated with co-receptor tropism, which might indicate a unique pattern that determines co-receptor tropism in subtype A strains compared to subtype B and C strains. Thus, our study demonstrates that there is a need for the development of novel algorithms facilitating tropism prediction of HIV-1 subtype A to improve effective antiretroviral treatment in patients.
Year: 2016 PMID: 27126912 PMCID: PMC4850382 DOI: 10.1038/srep24883
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Overview of computational tools.
| Tool | Prediction method | Subtype |
|---|---|---|
| T-CUP 2.0 | Uses structural information of the V3 loop by modelling the electrostatic potential and hydrophobicity; combination of results by stacking | subtype B (primarily) and C with 1351 sequences (200 X4 and 1151 R5) |
| Geno2pheno[coreceptor] | Support vector machine for binary classification | subtype B with 1100 sequences (769 R5, 210 X4, 131 R5X4) from 332 patients |
| PhenoSeq | Evaluation of HIV-1 V3 amino acid length, net amino acid charge, number of N-linked glycosylation sites and the frequency of site-specific amino acid alterations | A, A1, A2, B, C, D, CRF01_AE, CRF02_AG |
| WebPSSM x4r5 | Scoring matrices, reflecting the difference in abundance of a particular amino acid at a particular site | subtype B |
| WebPSSM sinsi | Scoring matrices, reflecting the difference in abundance of a particular amino acid at a particular site | subtype B |
| WebPSSM sinsiC | Scoring matrices, reflecting the difference in abundance of a particular amino acid at a particular site | subtype C |
| Genotypic rule (Raymond) | 11/25 rule in combination with a net charge rule | subtype C |
| Genotypic rule (Esbjörnsson) | Rules based on Raymond | subtype A |
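The sequence features behind the rule-based entries in the table can be illustrated with a minimal Python sketch. It computes the V3 net charge, counts N-linked glycosylation sequons (N-X-S/T, X not proline), and applies the classic 11/25 rule, in which a basic residue at V3 position 11 or 25 suggests X4 usage. The function names, the gap-free 35-residue input, the example sequence and the `charge_cutoff` default are assumptions made for illustration; they are not the exact criteria of the Raymond or Esbjörnsson rules, nor of PhenoSeq.

```python
import re

BASIC = set("KR")   # positively charged residues
ACIDIC = set("DE")  # negatively charged residues


def net_charge(v3: str) -> int:
    """Net charge as (#K + #R) - (#D + #E)."""
    return sum(aa in BASIC for aa in v3) - sum(aa in ACIDIC for aa in v3)


def count_nglyc_sites(v3: str) -> int:
    """Count N-X-S/T sequons (X != P); the lookahead allows overlapping hits."""
    return len(re.findall(r"(?=N[^P][ST])", v3))


def rule_11_25(v3: str) -> bool:
    """True if V3 position 11 or 25 (1-based) carries a basic residue.

    Assumes a gap-free V3 loop of at least 25 residues.
    """
    return v3[10] in BASIC or v3[24] in BASIC


def predict_x4(v3: str, charge_cutoff: int = 5) -> bool:
    """Hypothetical combination: 11/25 rule OR net charge >= cutoff.

    The cutoff value is an illustrative assumption, not a published threshold.
    """
    return rule_11_25(v3) or net_charge(v3) >= charge_cutoff


# Illustrative 35-residue V3 sequence and a variant with R at position 11.
r5_like = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
x4_like = r5_like[:10] + "R" + r5_like[11:]

print(len(r5_like), net_charge(r5_like), count_nglyc_sites(r5_like))
print(predict_x4(r5_like), predict_x4(x4_like))
```

Keeping the feature functions separate from the decision function makes it easy to swap in a different rule set without touching the feature code.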
Prediction performance of co-receptor usage models.
| Subtype | Method | Sens | Spec | Acc | PPV | NPV | FPR | FDR | F1 | DOR |
|---|---|---|---|---|---|---|---|---|---|---|
| C | T-CUP | 91.07 | 98.60 | 97.59 | 91.07 | 98.61 | 1.39 | 8.93 | 91.07 | 513 |
| C | g2p | 87.50 | 97.77 | 96.39 | 85.96 | 98.04 | 2.23 | 14.04 | 86.73 | 244 |
| C | PhenoSeq | 91.38 | 92.48 | 92.33 | 75.71 | 98.56 | 4.74 | 24.29 | 82.81 | 172 |
| C | X4R5 | 75.00 | 94.49 | 91.89 | 67.74 | 96.08 | 5.51 | 32.26 | 71.19 | 47 |
| C | SINSI | 71.43 | 98.90 | 95.23 | 90.91 | 95.73 | 1.10 | 9.09 | 80.00 | 174 |
| C | SINSI.C | 89.29 | 91.46 | 91.17 | 61.73 | 98.22 | 8.54 | 38.27 | 72.99 | 76 |
| C | Raymond | 89.29 | 99.45 | 98.09 | 96.15 | 98.37 | 0.55 | 3.85 | 92.59 | 879 |
| A | T-CUP | 18.18 | 96.32 | 55.39 | 84.44 | 51.69 | 3.68 | 15.56 | 29.92 | 5 |
| A | g2p | 15.79 | 97.89 | 54.89 | 89.19 | 51.38 | 2.11 | 10.81 | 26.83 | 7 |
| A | PhenoSeq | 17.70 | 94.74 | 54.39 | 78.72 | 51.14 | 5.26 | 21.28 | 28.91 | 4 |
| A | X4R5 | 15.31 | 93.94 | 53.56 | 72.73 | 51.24 | 6.06 | 27.27 | 25.30 | 3 |
| A | SINSI | 11.54 | 97.98 | 53.69 | 85.71 | 51.32 | 2.02 | 14.29 | 20.34 | 5 |
| A | SINSI.C | 37.80 | 58.59 | 47.91 | 49.07 | 47.15 | 41.41 | 50.93 | 42.70 | 1 |
| A | Raymond | 11.00 | 98.48 | 53.56 | 88.46 | 51.18 | 1.52 | 11.54 | 19.57 | 6 |
| A | Esbjörnsson | 13.40 | 99.49 | 55.28 | 96.55 | 52.12 | 0.51 | 3.45 | 23.53 | 16 |
For subtype C each algorithm achieved a sensitivity of around 90%. Prediction performance for subtype A generally resulted in sensitivities lower than 20%.
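The column abbreviations in the performance table correspond to standard confusion-matrix measures. The following Python sketch shows how such values could be derived from raw counts, assuming X4 is treated as the positive class and that no cell of the confusion matrix is zero. The function name and the example counts are hypothetical and do not come from the study.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return the table's measures; all in percent except DOR (a plain ratio).

    tp/fn: X4 sequences predicted X4/R5; tn/fp: R5 sequences predicted R5/X4.
    Assumes every count is non-zero so that no division fails.
    """
    sens = tp / (tp + fn)             # sensitivity (recall)
    spec = tn / (tn + fp)             # specificity
    ppv = tp / (tp + fp)              # positive predictive value (precision)
    npv = tn / (tn + fn)              # negative predictive value
    acc = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sens / (ppv + sens)
    dor = (tp * tn) / (fp * fn)       # diagnostic odds ratio
    return {
        "Sens": 100 * sens,
        "Spec": 100 * spec,
        "Acc": 100 * acc,
        "PPV": 100 * ppv,
        "NPV": 100 * npv,
        "FPR": 100 * (1 - spec),      # false positive rate
        "FDR": 100 * (1 - ppv),       # false discovery rate
        "F1": 100 * f1,
        "DOR": dor,
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=18, fp=5, tn=75, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With strongly imbalanced or near-chance predictions, accuracy alone can be misleading, which is why the table reports sensitivity, F1 and DOR side by side.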
Figure 1. ROC curve of the best-performing descriptor.
The ROC curve with confidence intervals of the best-performing descriptor on subtype A V3 sequences is shown. A random forest model was used to classify sequences as X4 vs. R5. The sequences were encoded with the Zimm-Bragg parameter sigma.
Figure 2. Association plot of subtype A.
The x-axis shows alignment positions of the gp120 region; the y-axis shows the associated p-values from SeqFeatR. Significant changes in amino acid composition between X4 and R5 sequences (p-value < 0.01) are marked with asterisks. The variable regions V1–V5 of gp120 are indicated. The strongest associations were found in the V3 and V2 regions. The V4 region also shows statistically significant associations with co-receptor tropism.
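A per-position association scan of this kind can be sketched in Python as follows. The sketch assumes a two-sided Fisher's exact test on a 2x2 table (residue present or absent versus X4 or R5) for each amino acid at each alignment column, and reports the smallest p-value per column. This is an illustrative approximation, not SeqFeatR's actual implementation; the function names and toy sequences are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def position_pvalues(x4_seqs: list, r5_seqs: list) -> list:
    """Smallest per-residue p-value for each column of two aligned sets."""
    pvals = []
    for i in range(len(x4_seqs[0])):
        x4_col = [s[i] for s in x4_seqs]
        r5_col = [s[i] for s in r5_seqs]
        best = 1.0
        for aa in set(x4_col) | set(r5_col):
            a, c = x4_col.count(aa), r5_col.count(aa)
            p = fisher_exact_two_sided(a, len(x4_col) - a, c, len(r5_col) - c)
            best = min(best, p)
        pvals.append(best)
    return pvals


# Toy aligned sequences (hypothetical): column 0 is invariant, column 1 differs.
x4 = ["AR"] * 5
r5 = ["AS"] * 5
pvals = position_pvalues(x4, r5)
significant = [i for i, p in enumerate(pvals) if p < 0.01]
print(pvals, significant)
```

In a real analysis, the p-values would typically be corrected for the number of positions and residues tested before applying a threshold such as 0.01.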
Figure 3. Significant changes in subtype A sequences.
Consensus sequences of V2, V3 and V4 are shown for subtype A. Significant changes in amino acid composition detected with SeqFeatR are highlighted in grey and marked with asterisks. The loops are shown in bold. The regions around V3 and V2 show the strongest statistical differences in amino acid composition.
Figure 4. Association plot of subtype B and subtype C.
The x-axis shows alignment positions of the gp120 region for subtype B (left) and C (right); the y-axis shows the associated p-values from SeqFeatR. In both subtypes, the strongest significant differences in amino acid composition between X4 and R5 sequences occur in the V3 region. In addition, regions around V1, V2 and V5 show significant differences.