Samuel Alizon, Viktor von Wyl, Tanja Stadler, Roger D Kouyos, Sabine Yerly, Bernard Hirschel, Jürg Böni, Cyril Shah, Thomas Klimkait, Hansjakob Furrer, Andri Rauch, Pietro L Vernazza, Enos Bernasconi, Manuel Battegay, Philippe Bürgisser, Amalio Telenti, Huldrych F Günthard, Sebastian Bonhoeffer.
Abstract
HIV virulence, i.e. the time of progression to AIDS, varies greatly among patients. As for other rapidly evolving pathogens of humans, it is difficult to know if this variance is controlled by the genotype of the host or that of the virus because the transmission chain is usually unknown. We apply the phylogenetic comparative approach (PCA) to estimate the heritability of a trait from one infection to the next, which indicates the control of the virus genotype over this trait. The idea is to use viral RNA sequences obtained from patients infected by HIV-1 subtype B to build a phylogeny, which approximately reflects the transmission chain. Heritability is measured statistically as the propensity for patients close in the phylogeny to exhibit similar infection trait values. The approach reveals that up to half of the variance in set-point viral load, a trait associated with virulence, can be heritable. Our estimate is significant and robust to noise in the phylogeny. We also check for the consistency of our approach by showing that a trait related to drug resistance is almost entirely heritable. Finally, we show the importance of taking into account the transmission chain when estimating correlations between infection traits. The fact that HIV virulence is, at least partially, heritable from one infection to the next has clinical and epidemiological implications. The difference between earlier studies and ours comes from the quality of our dataset and from the power of the PCA, which can be applied to large datasets and accounts for within-host evolution. The PCA opens new perspectives for approaches linking clinical data and evolutionary biology because it can be extended to study other traits or other infectious diseases.
Year: 2010 PMID: 20941398 PMCID: PMC2947993 DOI: 10.1371/journal.ppat.1001123
Source DB: PubMed Journal: PLoS Pathog ISSN: 1553-7366 Impact factor: 6.823
Figure 1. Combining phylogenies and trait values in the MSM strict dataset.
A) A phylogeny based on HIV sequences obtained from patients with known set-point viral loads, B) Phylogenetic distance between two tips versus difference in trait value between these two tips (slope and p-value) and C) Distribution of log(spVL) values in the MSM strict dataset. Panel A shows the maximum likelihood phylogenetic tree built with the MSM strict dataset. Squares on the tips of the tree correspond to infected patients. The colour of the squares and the graph on the right indicate the set-point viral load (colours range from blue to red for increasing log(spVL)). The PCA tests the correlation between proximity in the phylogeny and trait values (log(spVL)). The circles on the tree nodes indicate bootstrap values: black is greater than 90%, grey is between 50 and 90% and white is lower than 50%.
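The comparison described for panel B (phylogenetic distance between two tips versus their difference in trait value) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy distance matrix and the toy log(spVL) values are all hypothetical, and the permutation test shown is a generic Mantel-style test chosen for illustration.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def distance_trait_correlation(dist, traits):
    """Correlate pairwise tip distances with absolute trait differences."""
    pairs = list(combinations(range(len(traits)), 2))
    d = [dist[i][j] for i, j in pairs]
    t = [abs(traits[i] - traits[j]) for i, j in pairs]
    return pearson(d, t)


def permutation_p_value(dist, traits, n_perm=999, seed=0):
    """Shuffle trait values among tips to build a null distribution."""
    rng = random.Random(seed)
    observed = distance_trait_correlation(dist, traits)
    shuffled = list(traits)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if distance_trait_correlation(dist, shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: four tips forming two closely related pairs.
toy_dist = [
    [0, 1, 4, 4],
    [1, 0, 4, 4],
    [4, 4, 0, 1],
    [4, 4, 1, 0],
]
toy_traits = [4.0, 4.1, 5.0, 5.2]  # made-up log(spVL) values

r, p = permutation_p_value(toy_dist, toy_traits)
print(f"correlation = {r:.3f}, permutation p-value = {p:.3f}")
```

A positive correlation means that closely related tips tend to carry similar trait values, which is the pattern the caption describes. With only four tips the p-value is coarse, so the toy data serve only to show the mechanics.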
Phylogenetic signal in the 4 datasets for the three different traits.
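| Dataset | Sample size | Randomisation-tested estimator (one column per trait) | | | Bayesian estimator, median (SD) (one column per trait) | | |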
| MSM strict | 134 | 0.59 | n.s. | 0.91 | 0.51 (0.27) | 0 | 1.07 (0.12) |
| all strict | 230 | n.s. | n.s. | n.s. | 0.17 | 0 | 0.88 (0.06) |
| MSM liberal | 404 | 0.09 | n.s. | 0.82 | 0.13 (0.05) | 0 | 1.07 (0.015) |
| all liberal | 661 | n.s. | n.s. | 0.71 | — | — | — |
We use two estimators of phylogenetic signal that lead to similar results. For the estimator assessed with a randomisation test, ‘n.s.’ indicates that the signal does not differ from that found on a random tree. Values for the other estimator are medians over 161 trees (see the Methods), with the standard deviation in brackets. ‘—’ indicates that the largest tree could not be computed with the Bayesian method because of the large number of patients. The second column gives the sample size of each dataset.
Figure 2. Phylogenetic signal estimated for evolutionary processes with known heritability.
Twenty phylogenies are simulated to model the evolution of an infection trait in a case where heritability is set to a given value. Phylogenetic signal is then estimated on each tree with both estimators (one shown in black, the other in red), using only 128 leaves to account for incomplete sampling. The box plot shows the median values, the quartiles and the outliers. A dashed reference line is drawn, and the regression slope of each estimator is given with its p-value and adjusted R².
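The kind of simulation this caption refers to can be sketched as follows. This is an illustration under stated assumptions, not the authors' simulation procedure. It assumes a balanced binary tree with unit branch lengths, a heritable component that evolves by Brownian motion, and an independent environmental component at the tips, scaled so that the expected share of heritable variance equals `h2`. The names `simulate_tips` and `subsample` and all parameter values are hypothetical.

```python
import random


def simulate_tips(depth=9, h2=0.5, rate=1.0, seed=0):
    """Simulate tip trait values on a balanced binary tree with 2**depth tips.

    Returns (genetic, phenotype), both in left-to-right tip order.
    Assumption: unit branch lengths, Brownian variance `rate` per branch.
    """
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must be in (0, 1]")
    rng = random.Random(seed)
    values = [0.0]  # heritable value at the root
    for _ in range(depth):
        # Each lineage splits in two; each child drifts by a Gaussian step.
        values = [v + rng.gauss(0.0, rate ** 0.5)
                  for v in values for _ in range(2)]
    genetic_var = rate * depth  # expected variance of a tip's heritable value
    env_var = genetic_var * (1.0 - h2) / h2  # gives expected heritability h2
    phenotype = [g + rng.gauss(0.0, env_var ** 0.5) for g in values]
    return values, phenotype


def subsample(values, k, seed=0):
    """Keep k tips chosen at random to mimic incomplete sampling."""
    rng = random.Random(seed)
    kept = sorted(rng.sample(range(len(values)), k))
    return [values[i] for i in kept]


# 512 simulated tips, of which 128 are kept, as in the caption's sample size.
genetic, phenotype = simulate_tips(depth=9, h2=0.5, seed=1)
sampled = subsample(phenotype, 128, seed=2)
print(len(phenotype), len(sampled))
```

Repeating the simulation over several seeds and several values of `h2`, then estimating phylogenetic signal on each subsample, would give the kind of comparison between known heritability and estimated signal that the figure reports.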
Regressions between life-history traits with and without correction for phylogenetic signal.
| Traits and dataset | Test | Slope (SE) | Y-intercept (SE) | ln(likelihood) | AIC |
| log(spVL) vs. dsCD4 ( | OLS | −1.7e-3 | 4.26 | −56 | 119 |
| | RegBM | −2.7e-3 | 4.1 | −57 | 121 |
| Trait variation among risk groups ( | OLS | log(spVL) | | | |
| | RegBM | log(spVL) | | | |
The first row reports a regression between log(spVL) and dsCD4. The second row tests whether values for a given trait vary across risk groups. OLS is ordinary least squares regression without phylogenetic correction (a generalised linear model yielded similar results); RegBM indicates a correction based on the tree assuming Brownian motion. SE stands for ‘Standard Error’. For further details, see the Supplementary Methods. ‘n.s.’ stands for non-significant.
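The difference between a regression without phylogenetic correction and one that assumes Brownian motion on the tree can be sketched as generalised least squares (GLS) with two different error covariance matrices. This is a minimal illustration, not the RegBM implementation used in the paper. It assumes that the Brownian covariance between two tips equals the branch length they share from the root. The function names, the toy covariance matrix and the toy data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


def gls_line(x, y, v):
    """Return (intercept, slope) of y ~ x under error covariance matrix v."""
    n = len(x)
    vi_one = solve(v, [1.0] * n)  # V^-1 applied to the column of ones
    vi_x = solve(v, list(x))      # V^-1 applied to the predictor
    # Entries of X' V^-1 X and X' V^-1 y for the design matrix X = [1, x].
    a11 = sum(vi_one)
    a12 = sum(vi_x)
    a22 = sum(xi * wi for xi, wi in zip(x, vi_x))
    b1 = sum(yi * wi for yi, wi in zip(y, vi_one))
    b2 = sum(yi * wi for yi, wi in zip(y, vi_x))
    return tuple(solve([[a11, a12], [a12, a22]], [b1, b2]))


# Hypothetical toy data: four tips in two sister pairs, tree depth 2.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
bm_cov = [
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
]
x = [1.0, 2.0, 3.0, 5.0]
y = [5.5, 7.6, 11.2, 16.9]  # made-up values scattered around 2 + 3x

print("no correction:  ", gls_line(x, y, identity))
print("Brownian motion:", gls_line(x, y, bm_cov))
```

With the identity matrix the estimate reduces to ordinary least squares. With the Brownian covariance, closely related tips are treated as partly redundant observations, so the fitted line can change once relatedness is taken into account.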