Joke Snoeck, Jacques Fellay, István Bartha, Daniel C Douek, Amalio Telenti.
Abstract
BACKGROUND: The HIV-1 genome is subject to pressures that target the virus, resulting in escape and adaptation. On the other hand, there is a requirement for sequence conservation because of functional and structural constraints. Mapping the sites of selective pressure and conservation on the viral genome generates a reference for understanding the limits to viral escape, and can serve as a template for the discovery of sites of genetic conflict with known or unknown host proteins.
Year: 2011 PMID: 22044801 PMCID: PMC3229471 DOI: 10.1186/1742-4690-8-87
Source DB: PubMed Journal: Retrovirology ISSN: 1742-4690 Impact factor: 4.602
Figure 1. Map of the HIV-1 genome. For clarity, the genome is represented as linear, with the genes represented as a concatemer (top bar). The following layers of data are shown. Conservation: black = amino acid conservation less than 95%. RNA: dark blue = extensively structured RNA (SHAPE parameter < 0.25), light purple = flexible RNA (SHAPE > 0.5). Protein structure: blue = structured (β-sheet or α-helix), grey = no structural information available for vif, vpr, tat, rev, vpu and nef. Overlapping region: green = sites in overlapping reading frames. Positive selection: dark purple = sites under positive selection. CD8 T cell epitope: pink = CD8 T cell epitope. CD4 T cell epitope: light pink = CD4 T cell epitope. AB epitope: red = antibody epitope. AA and AG enrichment: orange = regions enriched in AA and AG dinucleotide motifs.
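The SHAPE thresholds quoted in the caption can be expressed as a small classifier. This is a minimal sketch, assuming one SHAPE reactivity value per position is available as a float; the function name `classify_shape` and the "intermediate" label for values between the two thresholds are illustrative assumptions, not taken from the paper.

```python
def classify_shape(reactivity):
    """Label one SHAPE reactivity using the thresholds from the Figure 1 caption."""
    if reactivity < 0.25:
        return "structured"   # dark blue layer: extensively structured RNA
    if reactivity > 0.5:
        return "flexible"     # light purple layer: flexible RNA
    # Assumption: the caption assigns no colour to this range,
    # so the label below is purely illustrative.
    return "intermediate"


# Hypothetical reactivities, for illustration only.
example = [0.1, 0.3, 0.8]
labels = [classify_shape(r) for r in example]
print(labels)
```

Values exactly equal to 0.25 or 0.5 fall in the unlabelled range here because the caption uses strict inequalities.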
Results of the univariate statistics for association with (A) conservation or (B) positive selection.
| (A) Univariate - conservation | ||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Genome | | gag | | pro | | rt | | int | | vif | | vpr | | tat | | rev | | vpu | | gp120 | | gp41 | | nef | |
| | OR (95% CI) | p | OR (95% CI) | p | OR (95% CI) | p | OR (95% CI) | p | OR (95% CI) | p | OR (95% CI) | p | OR (95% CI) | p | OR (95% CI) | p | OR (95% CI) | p | OR (95% CI) | p | OR (95% CI) | p | OR (95% CI) | p | OR (95% CI) | p |
| Flexible RNA regions | 0.93 (0.69-1.26) | NS | na | | na | | 1.27 (0.55-3.45) | NS | 1.38 (0.58-3.82) | NS | 0.92 (0.27-3.62) | NS | na | | 5.42E+06 (8.53E-123-∞) | NS | 5.65E+06 (8.53E-123-∞) | NS | 2.5 (0.86-7.47) | NS | 0.58 (0.33-0.99) | 0.05 | 1.69 (0.16-3.66) | NS | 0.58 (0.074-2.12) | NS |
| Structured RNA regions | 1.29 (1.05-1.6) | 0.02 | 0.54 (0.34-0.87) | 0.01 | 1.05 (0.37-3.23) | NS | 3.19 (1.46-8.43) | 0.008 | na | | 0.58 (0.23-1.46) | NS | 1.07 (0.38-3.31) | NS | 1.63E-07 (∞-1.04E+122) | NS | 0.98 (0.18-5.51) | NS | 3.47E-07 (∞-2.13E+122) | NS | 1.73 (0.93-3.31) | NS | 2.87 (1.75-4.82) | 4E-05 | 1.19 (0.67-2.12) | NS |
| α-helix structures | 1.52 (1.17-1.98) | 0.002 | 2 (1.17-3.42) | 0.01 | 1.06 (0.14-2.20) | NS | 0.9 (0.55-1.49) | NS | 0.9 (0.42-1.9) | NS | na | | na | | na | | na | | na | | 1.1 (0.57-2.11) | NS | 3.93 (0.48-2.45) | NS | na | |
| β-sheet structures | 0.74 (0.56-0.97) | 0.03 | na | | 0.38 (0.12-1.11) | NS | 1.14 (0.63-2.14) | NS | 1.94 (0.7-6.24) | NS | na | | na | | na | | na | | na | | 1.05 (0.66-1.66) | NS | na | | na | |
| Overlapping regions | 0.55 (0.45-0.68) | 7.3E-09 | 0.51 (0.3-0.89) | 0.02 | 0.63 (0.18-2.5) | NS | na | | 2.78 (0.55-50.9) | NS | 1.55 (0.74-3.48) | NS | 0.84 (0.32-2.33) | NS | 0.25 (0.1-0.58) | 0.002 | na | | 1.10 (0.42-2.8) | NS | na | | 0.53 (0.33-0.86) | 0.01 | na | |
| OR (95% CI) | OR (95% CI) | OR (95% CI) | OR (95% CI) | OR (95% CI) | OR (95% CI) | OR (95% CI) | OR (95% CI) | OR (95% CI) | OR (95% CI) | OR (95% CI) | OR (95% CI) | OR (95% CI) | ||||||||||||||
| CD8 T cell epitope | 0.85 (0.64-1.12) | NS | 1.21E-07 (∞-1.03E+88) | NS | 0.69 (0.2-1.85) | NS | 2.46 (0.13-1.48) | NS | 1.44 (0.21-6.11) | NS | 1.23E-07 (∞-8.03E+38) | NS | 0.29 (0.015-1.68) | NS | 1.22 (0.36-3.599 | NS | na | 0.76 (0.36-1.52) | NS | 0.84 (0.35-1.95) | NS | |||||
| CD4 T cell epitope | 1.03 (0.82-1.28) | NS | 0.84 (0.42-1.79) | NS | na | 0.81 (0.13-2.86) | NS | 1.06 (0.28-3.37) | NS | na | na | 1.54 (0.99-2.37) | NS | 0.66 (0.28-1.59) | NS | |||||||||||
| AB epitope | 6.14E-07 (∞-1.78E+15) | NS | na | 0.62 (0.034-3.09) | NS | na | na | na | 0.44 (0.065-1.76) | NS | na | na | 0.87 (0.44-1.64) | NS | 3.98 (0.8-162) | NS | ||||||||||
The odds ratio (95% confidence interval) and the p-value are shown for the entire genome and for each gene separately. Results in bold are statistically significant. NS: not significant; na: not applicable.
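To make the "OR (95% CI)" entries concrete, here is a minimal sketch of how an odds ratio and a Wald-type 95% confidence interval can be computed from a 2×2 table of sites (feature present or absent versus conserved or not). This record does not state how the authors estimated their odds ratios, so the method, the function name `odds_ratio_wald`, and the counts below are assumptions for illustration only, not data from the study.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval from a 2x2 table.

    a: sites with the feature that are conserved
    b: sites with the feature that are not conserved
    c: sites without the feature that are conserved
    d: sites without the feature that are not conserved
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        # With an empty cell the estimate and its standard error are undefined.
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to illustrate the arithmetic.
or_, lo, hi = odds_ratio_wald(30, 20, 40, 60)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

An interval that excludes 1 corresponds to an association that is significant at roughly the 5% level, which is how the OR and p columns in the table relate to each other.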
Figure 2. Multivariate analysis of constraining and diversifying forces shaping the viral genome. For each variable, the odds ratio and 95% confidence interval are shown for a multivariate model predicting conservation (A) or positive selection (B). The vertical line shows the null hypothesis (OR = 1). Two separate analyses were performed: one including all variables (blue) and one excluding protein structure, which is not available for 45% of the genome (black). OR = odds ratio, CI = confidence interval.