Ben Murrell, Tulio de Oliveira, Chris Seebregts, Sergei L Kosakovsky Pond, Konrad Scheffler.
Abstract
The evolution of substitutions conferring drug resistance to HIV-1 is both episodic, occurring when patients are on antiretroviral therapy, and strongly directional, with site-specific resistant residues increasing in frequency over time. While methods exist to detect episodic diversifying selection and continuous directional selection, no evolutionary model combining these two properties has been proposed. We present two models of episodic directional selection (MEDS and EDEPS) which allow the a priori specification of lineages expected to have undergone directional selection. The models infer the sites and target residues that were likely subject to directional selection, using either codon or protein sequences. Compared to its null model of episodic diversifying selection, MEDS provides a superior fit to most sites known to be involved in drug resistance, and neither a test for episodic diversifying selection nor a test for constant directional selection is able to detect as many true positives as MEDS and EDEPS while maintaining acceptable levels of false positives. This suggests that episodic directional selection is a better description of the process driving the evolution of drug resistance.
Year: 2012 PMID: 22589711 PMCID: PMC3349733 DOI: 10.1371/journal.pcbi.1002507
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Summary of models described in this manuscript.
| Model | Data | Baseline model | Site variation | Lineage variation | Selection test | Citation |
| MEDS | Codon | MG94 | Fixed effects | Episodic | Directional | This paper |
| FEEDS | Codon | MG94 | Fixed effects | Episodic | Diversifying | [29] |
| DEPS | Protein | HIV-Between | Random effects | Constant | Directional | [37] |
| EDEPS | Protein | HIV-Between | Random effects | Episodic | Directional | This paper |
FEEDS has the same structure as a model called IFEL in [29], but the use here is novel.
Figure 1. The maximum-likelihood phylogeny for the protease dataset.
Foreground branches are marked in red. All terminal foreground branches lead to sequences obtained from patients who had been receiving antiretroviral therapy. See text for details of how we determined which internal branches were assigned to foreground. MEDS and EDEPS allow the presence of a directional component along the foreground branches where antiretroviral therapy exerts selective pressure.
Sites under episodic directional and episodic diversifying selection in reverse transcriptase.
| Site | Target | MEDS p-value | CI lower bound | FEEDS p-value | EDEPS Bayes Factor | Resistance |
| | L | | | - | - | NRTI |
| | V | - | - | - | 313 | NRTI |
| | K | | | | - | |
| | L | - | - | - | 211 | NRTI |
| | S | | | - | - | |
| | I | | | - | | NNRTI |
| | | - | - | 0.0025 | - | |
| | N | | | | | NNRTI |
| | Y | | | - | - | |
| | F | - | - | - | | NRTI |
| | Y | | | - | - | NRTI |
| | M | | | - | | NRTI |
| | Q | | | - | - | |
| | S | - | - | - | 1772 | |
| | L | | | - | 2245 | |
| | R | - | - | - | 105 | |
| | I | | | - | | NNRTI |
| | V | | | - | | NRTI |
| | L | | | | | NNRTI |
| | Y | | | - | - | |
| | S | | | - | | NNRTI |
| | | - | - | | - | |
| | F | | | - | 2727 | NRTI |
| | T | | | - | - | |
| | R | | | - | 1401 | NRTI accessory |
| | L | | | - | | NNRTI |
| | | - | - | 0.0006 | - | |
| | A | | | - | - | |
MEDS p-value: likelihood ratio test (LRT) of MEDS versus FEEDS, testing for directional selection.
Lower bound of the approximate confidence interval, calculated from the profile likelihood.
FEEDS p-value: LRT, testing for diversifying selection.
EDEPS Bayes Factor: empirical Bayes analysis, testing for directional selection on protein data.
‘-’: not significant.
NRTI: nucleoside reverse-transcriptase inhibitor.
NNRTI: non-nucleoside reverse-transcriptase inhibitor.
Rows with no target residue: detected only by FEEDS, which does not identify a target AA.
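The MEDS p-values in this table come from a likelihood ratio test of MEDS against its null model, FEEDS. The sketch below shows how such a p-value can be obtained from two fitted log-likelihoods. It assumes one extra free parameter in the alternative model and the usual chi-squared approximation with one degree of freedom; this record does not state which reference distribution the authors used. The function name and the log-likelihood values in the test are hypothetical.

```python
import math


def lrt_p_value(lnl_null: float, lnl_alt: float) -> float:
    """P-value of a likelihood ratio test for nested models.

    Assumption (not taken from the source): the alternative model has one
    extra free parameter, so the statistic is compared with a chi-squared
    distribution with 1 degree of freedom.
    """
    # Test statistic: twice the log-likelihood gain of the alternative model.
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0.0:
        # No improvement in fit: no evidence against the null model.
        return 1.0
    # Survival function of chi-squared(1): P(X > stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))
```

When many sites and target residues are tested, the resulting p-values would normally also need a multiple-testing correction, which this sketch leaves out.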
Sites under episodic directional and episodic diversifying selection in protease.
| Site | Target | MEDS p-value | CI lower bound | FEEDS p-value | EDEPS Bayes Factor | Resistance |
| | | - | - | 0.0005 | - | PI |
| | T | | | - | - | |
| | V | | | - | 145 | PI accessory |
| | D | | | - | - | |
| | | - | - | 0.0026 | - | PI |
| | E | | | - | - | PI accessory |
| | E | | | - | - | |
| | V | - | - | 0.0011 | 257 | PI accessory |
| | S | | | 0.0013 | - | PI accessory |
| | A | - | - | | | PI |
| | V | | | - | | PI |
| | M | | | | | PI |
| | L | | | - | - | PI accessory |
MEDS p-value: likelihood ratio test (LRT) of MEDS versus FEEDS, testing for directional selection.
Lower bound of the confidence interval, calculated from the likelihood profile.
FEEDS p-value: LRT, testing for diversifying selection.
EDEPS Bayes Factor: empirical Bayes analysis, testing for directional selection on protein data.
Rows with no target residue: detected only by FEEDS, which does not identify a target AA.
‘-’: not significant.
PI: protease inhibitor.
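The EDEPS column reports Bayes factors from an empirical Bayes analysis. As a reminder of what such a number measures, the sketch below uses the standard definition of a Bayes factor as the ratio of posterior odds to prior odds. How EDEPS obtains the prior and posterior probabilities is not described in this record, so the function and its inputs are illustrative assumptions only.

```python
def bayes_factor(prior: float, posterior: float) -> float:
    """Bayes factor as posterior odds divided by prior odds.

    Both arguments are probabilities strictly between 0 and 1 that a site
    belongs to the class of interest (here, hypothetically, the class of
    sites under directional selection).
    """
    for name, p in (("prior", prior), ("posterior", posterior)):
        if not 0.0 < p < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = posterior / (1.0 - posterior)
    return posterior_odds / prior_odds
```

Under this definition, a Bayes factor in the hundreds or thousands means the data shifted the odds in favour of directional selection by that factor relative to the prior.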
Sites under episodic directional and episodic diversifying selection in integrase.
| Site | Target | MEDS p-value | CI lower bound | FEEDS p-value | EDEPS Bayes Factor | Resistance |
| | I | | | - | - | INI |
| | A | | | | | INI accessory |
| | S | | | | | INI |
| | R | | | | | INI |
| | H | | | | | INI |
| | H | | | | | INI |
| | R | - | - | - | | INI accessory |
| | Q | - | - | - | | |
| | | - | - | 0.0064 | - | |
| | | - | - | 0.0048 | - | INI accessory |
MEDS p-value: likelihood ratio test (LRT) of MEDS versus FEEDS, testing for directional selection.
Lower bound of the confidence interval, calculated from the likelihood profile.
FEEDS p-value: LRT, testing for diversifying selection.
EDEPS Bayes Factor: empirical Bayes analysis, testing for directional selection on protein data.
‘-’: not significant.
INI: integrase inhibitor.
Rows with no target residue: detected only by FEEDS, which does not identify a target AA.
Single target power simulations: power as a function of .
| # FG branches | 2 | 5 | 10 | 100 | 1000 |
| 4 | 0 (8) | 0 (16) | 0 (37) | 0.31 (110) | 0.79 (155) |
| 8 | 0 (11) | 0 (18) | 0.04 (62) | 0.51 (129) | 0.73 (170) |
| 16 | 0 (31) | 0.018 (54) | 0.036 (83) | 0.59 (177) | 0.71 (201) |
| 32 | 0.02 (62) | 0.03 (71) | 0.16 (116) | 0.68 (223) | 0.80 (282) |
Numbers in brackets are the number of times at least one substitution towards the target occurred along foreground branches: i.e. the denominator for the proportion of detections.
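Each cell in the table above has the form "proportion (denominator)". The sketch below parses such a cell and recovers the approximate number of detections behind it. Because the printed proportions are rounded, the recovered counts are approximate. The function names are hypothetical.

```python
import re
from typing import Tuple

# Matches cells such as "0.31 (110)" or "0 (8)".
CELL = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*\((\d+)\)\s*$")


def parse_cell(cell: str) -> Tuple[float, int]:
    """Split a 'proportion (denominator)' cell into its two numbers."""
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognised cell: {cell!r}")
    return float(match.group(1)), int(match.group(2))


def approx_detections(cell: str) -> int:
    """Approximate number of detections implied by a table cell."""
    power, denominator = parse_cell(cell)
    return round(power * denominator)
```

For the cell `0.31 (110)` (4 foreground branches, fourth column), this gives roughly 34 detections out of 110 cases with at least one substitution towards the target.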
Single target power simulations: power as a function of number of substitutions to target AA along foreground branches, pooling over .
| # FG branches | # substitutions to target AA | | | | | |
| | 0 | 1 | 2 | 3 | 4 | |
| 4 | 0 (1674) | 0 (119) | 0.2 (58) | 0.77 (48) | 0.99 (111) | N/A |
| 8 | 0 (1610) | 0 (146) | 0.23 (53) | 0.69 (26) | 1 (21) | 0.99 (144) |
| 16 | 0 (1454) | 0 (200) | 0.34 (92) | 0.49 (39) | 0.79 (34) | 0.97 (181) |
| 32 | 0 (1246) | 0.03 (234) | 0.4 (107) | 0.41 (70) | 0.70 (46) | 0.97 (297) |
Numbers in brackets are the number of times that many substitutions towards the target occurred along foreground branches: i.e. the denominator for the proportion of detections.
Dual target power simulations: power as a function of number of substitutions to two target AAs.
| Substitutions to both targets | | | | | | | | |
| MEDS detects at least one target: | 0.64 | 0.81 | 0.89 | 0.92 | 0.95 | 0.98 | 1 | 1 |
| MEDS detects both targets: | 0.19 | 0.36 | 0.48 | 0.52 | 0.63 | 0.76 | 0.78 | 0.81 |
| Total sites: | 538 | 288 | 214 | 179 | 132 | 99 | 69 | 32 |
Substitutions along foreground branches. Each target has 8 foreground branches along which changes towards it were accelerated.
False positives with site-specific equilibrium frequencies as a function of the concentration parameter and the nominal p-value of the test.
| | 0.005 | 0.05 | 0.5 | 5 |
| | 0.005 | 0.0025 | 0.0025 | 0.0075 |
| | 0.02 | 0.0175 | 0.02 | 0.015 |
| | 0.0325 | 0.0325 | 0.035 | 0.0375 |