Matthias Döring1, Pedro Borrego2,3, Joachim Büch4, Andreia Martins2, Georg Friedrich4, Ricardo Jorge Camacho5, Josef Eberle6, Rolf Kaiser7, Thomas Lengauer4, Nuno Taveira2,8, Nico Pfeifer9.
Abstract
BACKGROUND: CCR5-coreceptor antagonists can be used for treating HIV-2-infected individuals. Before initiating treatment with coreceptor antagonists, viral coreceptor usage should be determined to ensure that the virus can use only the CCR5 coreceptor (R5) and cannot evade the drug by using the CXCR4 coreceptor (X4-capable). However, until now, no online tool for the genotypic identification of HIV-2 coreceptor usage had been available. Furthermore, there is a lack of knowledge on the determinants of HIV-2 coreceptor usage. Therefore, we developed a data-driven web service for the prediction of HIV-2 coreceptor usage from the V3 loop of the HIV-2 glycoprotein and used the tool to identify novel discriminatory features of X4-capable variants.
Keywords: Chemokine receptor; Coreceptor; Coreceptor antagonists; HIV-2; Human immunodeficiency virus type 2; Prediction; Statistical learning; V1; V2; V3
Year: 2016 PMID: 27998283 PMCID: PMC5168878 DOI: 10.1186/s12977-016-0320-7
Source DB: PubMed Journal: Retrovirology ISSN: 1742-4690 Impact factor: 4.602
Classifier AUCs per run of cross validation
| CV Run | RBF (σ = 0.001) | Linear | Polynomial (degree = 2) | Edit Kernel (γ = 0.005, PAM70) |
|---|---|---|---|---|
| 1 | 0.9475 | 0.9459 | 0.941 | 0.8629 |
| 2 | 0.9509 | 0.9506 | 0.9452 | 0.851 |
| 3 | 0.9504 | 0.9579 | 0.9444 | 0.8655 |
| 4 | 0.9449 | 0.947 | 0.9379 | 0.8634 |
| 5 | 0.9472 | 0.9467 | 0.9413 | 0.8744 |
| 6 | 0.9467 | 0.9467 | 0.9457 | 0.8689 |
| 7 | 0.9532 | 0.9535 | 0.9475 | 0.8377 |
| 8 | 0.9522 | 0.9532 | 0.9306 | 0.8623 |
| 9 | 0.9524 | 0.9524 | 0.9478 | 0.9012 |
| 10 | 0.9441 | 0.9431 | 0.9384 | 0.8672 |
| μ | 0.949 | 0.9497 | 0.942 | 0.8654 |
| σ | 0.0033 | 0.0045 | 0.0053 | 0.0162 |
The column names indicate the kernel function corresponding to each SVM, and kernel parameters are given in brackets. Only the results for the best-performing kernel function (in terms of average AUC across all CV runs) for each set of evaluated parameters are shown. All of the classifiers performed best with a setting of ν = 0.3.
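The μ and σ rows summarize the ten per-run AUCs in each column. A minimal sketch of that summary, using only the values printed in the table, is shown below. It assumes σ is the sample standard deviation (n − 1 denominator), which is consistent with the RBF and linear columns but is not stated in the source; the names `aucs`, `summarize`, and `summary` are illustrative only.

```python
from statistics import mean, stdev

# Per-run AUCs copied from the table above (10 cross-validation runs).
aucs = {
    "RBF": [0.9475, 0.9509, 0.9504, 0.9449, 0.9472,
            0.9467, 0.9532, 0.9522, 0.9524, 0.9441],
    "Linear": [0.9459, 0.9506, 0.9579, 0.9470, 0.9467,
               0.9467, 0.9535, 0.9532, 0.9524, 0.9431],
    "Polynomial": [0.9410, 0.9452, 0.9444, 0.9379, 0.9413,
                   0.9457, 0.9475, 0.9306, 0.9478, 0.9384],
    "Edit kernel": [0.8629, 0.8510, 0.8655, 0.8634, 0.8744,
                    0.8689, 0.8377, 0.8623, 0.9012, 0.8672],
}


def summarize(values):
    """Return (mean, sample standard deviation) of a list of AUCs."""
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return mean(values), stdev(values)


summary = {kernel: summarize(values) for kernel, values in aucs.items()}

for kernel, (mu, sigma) in summary.items():
    print(f"{kernel:12s} mean={mu:.4f} sd={sigma:.4f}")
```

Comparing the means alongside the standard deviations makes it easier to see that the RBF, linear, and polynomial kernels lie close together, while the edit kernel is both lower and more variable across runs.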
Fig. 1 X4-probabilities predicted by geno2pheno[coreceptor-hiv2] for V3 amino-acid sequences exhibiting the established discriminatory features indicative of X4-capability listed on the x-axis. The left-hand panel shows the predicted X4-probabilities for sequences labeled as R5, while the right-hand panel shows the predicted X4-probabilities for sequences labeled as X4-capable. The bottom line of a box indicates the 1st quartile (Q1) of predicted X4-probabilities, the bar inside the box indicates the median, and the top line indicates the 3rd quartile (Q3). The whiskers extending from a box indicate predicted X4-probabilities that lie within 1.5 × IQR (interquartile range, IQR = Q3 − Q1). Outlier values that are not within the whisker region are shown as dots. Note that some of the sequence characteristics indicated on the x-axis do not have a predicted X4-probability, because no sequences exhibiting the corresponding feature and phenotype were available.
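The box-plot elements described in the caption can be computed as in the following sketch. It is an illustration under stated assumptions, not the plotting code behind the figure: the quartile convention (`method="inclusive"`) is one of several in common use, the input values are made-up toy numbers rather than predictions from the web service, and `box_stats` is a hypothetical helper name.

```python
from statistics import quantiles


def box_stats(values):
    """Compute quartiles, whisker ends, and outliers for one box."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles; plotting tools may differ slightly.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme values still inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Toy X4-probabilities (hypothetical values, for illustration only).
toy_probabilities = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.30]
stats = box_stats(toy_probabilities)
print(stats)
```

In this toy input the value 0.30 lies beyond Q3 + 1.5 × IQR and would therefore be drawn as a dot rather than covered by the upper whisker.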
Features in the model with the strongest impact on predicted viral coreceptor usage
| Position | R5 feature | X4 feature | R5 weights | X4 weights |
|---|---|---|---|---|
| 18 | L | H, Q, F, M | 0.69 | −0.23, −0.15, −0.12, −0.1 |
| Insertion after position 24 | – | I, V | 0.45 | −0.22, −0.21 |
| 19 | I | R, K, V | 0.19 | −0.25, −0.23, −0.19 |
|  | – | H, Y | 0.36 | −0.18, −0.18 |
| 24 | P |  | 0.17 |  |
| 23 | Q | R | 0.14 | −0.14 |
|  | Q | K | 0.09 | −0.12 |
|  | T | R | 0.11 | −0.07 |
|  |  | N |  | −0.09 |
|  | A | K | 0.09 | −0.07 |
|  | I | L | 0.08 | −0.08 |
| 22 | S |  | 0.08 |  |
|  | A | G | 0.08 | −0.07 |
|  | K | S | 0.07 | −0.07 |
Positions of discriminatory features that were not described previously are shown in bold italics.
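To illustrate how per-residue weights of this kind combine in a linear model, the sketch below sums the weights for a few residues. It is a simplified illustration only: it uses just the rows of the table above that carry a position label, assumes the X4 weights are listed in the same order as the X4 features, and ignores any intercept, scaling, or remaining features of the actual model. The names `WEIGHTS` and `feature_contribution` are hypothetical.

```python
# Weights copied from the table rows that carry a position label.
# Assumption: X4 weights are paired with X4 features in listed order.
WEIGHTS = {
    (18, "L"): 0.69, (18, "H"): -0.23, (18, "Q"): -0.15,
    (18, "F"): -0.12, (18, "M"): -0.10,
    (19, "I"): 0.19, (19, "R"): -0.25, (19, "K"): -0.23, (19, "V"): -0.19,
    (22, "S"): 0.08,
    (23, "Q"): 0.14, (23, "R"): -0.14,
    (24, "P"): 0.17,
}


def feature_contribution(residues):
    """Sum the tabulated weights for a {position: amino acid} mapping.

    Residues without a tabulated weight contribute 0. A positive total
    points towards R5, a negative total towards X4-capable. This is not
    the full decision function of the published model.
    """
    return sum(WEIGHTS.get((pos, aa), 0.0) for pos, aa in residues.items())


r5_like = {18: "L", 19: "I", 23: "Q"}
x4_like = {18: "H", 19: "K", 23: "R"}

print(feature_contribution(r5_like))
print(feature_contribution(x4_like))
```

The sign convention follows the table: R5 features carry positive weights and X4 features negative ones, so the two example mappings land on opposite sides of zero.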
Fig. 2 Visualization of the model coefficients for the V3 loop of the mutant ROD10 isolate (H18L + K29T). Amino acids with positive coefficients are associated with R5-tropic viruses, while negative coefficients are associated with X4-capable variants. The legend on the right indicates the color-coded amino acids and gives the FPR of the prediction. Because the predicted FPR is below the selected cutoff of 5%, the sequence is predicted to be X4-capable, which is indicated by the dark color of the X4-capable label in the bottom left corner. The labels of the x-axis refer to the positions and amino acids of the HIV-2 reference strain. Note that since the input sequence contains two insertions relative to the reference (H and Y after position 22), the 29T mutation is visualized at the x-axis tick with the D27 label.
Results from the validation of the web service on nine additional V3 sequences
| Isolate | FPR | Major markers | Minor markers | Visseaux prediction | geno2pheno[coreceptor-hiv2] prediction | Phenotype |
|---|---|---|---|---|---|---|
| ROD10 (Wildtype) | 0.01 | L18X, V3 net charge >6 |  | X4-capable | X4-capable | X4-capable |
| ROD10 (K29T) | 0.01 | L18X |  | X4-capable | X4-capable | X4-capable |
| ROD10 (H18L) | 0.03 | V3 net charge >6 |  | X4-capable | X4-capable | X4-capable |
| ROD10 (H23Δ + Y24Δ) | 0.01 | L18X |  | X4-capable | X4-capable | X4-capable |
| ROD10 (H18L + K29T) | 0.03 | NA |  | R5* | X4-capable | X4-capable |
| ROD10 (H18L + H23Δ + Y24Δ) | 0.11 | V3 net charge >6 |  | X4-capable* | R5 | R5 |
| ROD10 (H18L + H23Δ + Y24Δ + K29T) | 0.15 | NA |  | R5 | R5 | R5 |
| 15PTHSJIG | 0.36 | NA |  | R5 | R5 | R5 |
| 15PTHCEC | 0.01 | L18X, V19K/R, Insertion24, V3 net charge >6 | Q23R, R28K | X4-capable | X4-capable | X4-capable |
Incorrect predictions are marked with an asterisk. ROD10 refers to the HIV-2 group A reference strain, which uses both CCR5 and CXCR4 as entry coreceptors. Mutations relative to the ROD10 wildtype sequence are indicated in brackets, where Δ indicates deletions.
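The geno2pheno[coreceptor-hiv2] prediction column is consistent with the 5% FPR cutoff mentioned in the caption of Fig. 2: an FPR below the cutoff yields an X4-capable call, otherwise R5. The sketch below applies that rule to the FPR values from the table. The function name `call_coreceptor_usage` is hypothetical, and the handling of an FPR exactly equal to the cutoff is an assumption, since the source only says "below".

```python
FPR_CUTOFF = 0.05  # 5% cutoff, as stated in the Fig. 2 caption


def call_coreceptor_usage(fpr, cutoff=FPR_CUTOFF):
    """Return 'X4-capable' if the FPR is below the cutoff, else 'R5'."""
    # Assumption: an FPR exactly equal to the cutoff is called R5.
    return "X4-capable" if fpr < cutoff else "R5"


# (isolate, FPR, reported geno2pheno[coreceptor-hiv2] prediction)
validation = [
    ("ROD10 (Wildtype)", 0.01, "X4-capable"),
    ("ROD10 (K29T)", 0.01, "X4-capable"),
    ("ROD10 (H18L)", 0.03, "X4-capable"),
    ("ROD10 (H23Δ + Y24Δ)", 0.01, "X4-capable"),
    ("ROD10 (H18L + K29T)", 0.03, "X4-capable"),
    ("ROD10 (H18L + H23Δ + Y24Δ)", 0.11, "R5"),
    ("ROD10 (H18L + H23Δ + Y24Δ + K29T)", 0.15, "R5"),
    ("15PTHSJIG", 0.36, "R5"),
    ("15PTHCEC", 0.01, "X4-capable"),
]

# Isolates where the cutoff rule disagrees with the reported prediction.
mismatches = [
    name for name, fpr, reported in validation
    if call_coreceptor_usage(fpr) != reported
]
print(mismatches)
```

A lower cutoff would make the X4-capable call more conservative; for example, the two isolates with an FPR of 0.03 would be called R5 at a hypothetical cutoff of 2%.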