
Inferring the age difference in HIV transmission pairs by applying phylogenetic methods on the HIV transmission network of the Swiss HIV Cohort Study.

Katharina Kusejko1,2, Claus Kadelka1,2, Alex Marzel1,2, Manuel Battegay3, Enos Bernasconi4, Alexandra Calmy5, Matthias Cavassini6, Matthias Hoffmann7, Jürg Böni2, Sabine Yerly5, Thomas Klimkait8, Matthieu Perreau6, Andri Rauch9, Huldrych F Günthard1,2, Roger D Kouyos1,2.   

Abstract

Age-mixing patterns are of key importance for understanding the dynamics of human immunodeficiency virus (HIV) epidemics and for targeting public health interventions. We use the densely sampled Swiss HIV Cohort Study (SHCS) resistance database to study the age difference at infection in HIV transmission pairs using phylogenetic methods. In addition, we investigate whether the mean age difference of pairs in the phylogenetic tree is influenced by sampling as well as by additional distance thresholds for including pairs. HIV-1 pol-sequences of 11,922 SHCS patients and approximately 240,000 Los Alamos background sequences were used to build a phylogenetic tree. Using this tree, 100 per cent down to 1 per cent of the tips were sampled repeatedly to generate pruned trees (N = 500 for each sample proportion), from which pairs of SHCS patients were extracted. The mean of the absolute age differences of the pairs, measured as the absolute difference of the birth years, was analyzed with respect to this sample proportion and a distance criterion for inclusion of the pairs. In addition, the transmission groups men having sex with men (MSM), intravenous drug users (IDU), and heterosexuals (HET) were analyzed separately. Considering the tree with all 11,922 SHCS patients, 2,991 pairs could be extracted, with 954 (31.9 per cent) MSM-pairs, 635 (21.2 per cent) HET-pairs, 414 (13.8 per cent) IDU-pairs, and 352 (11.8 per cent) HET/IDU-pairs. For all transmission groups, the age difference at infection was significantly (P < 0.001) smaller for pairs in the tree compared with randomly assigned pairs, meaning that patients of similar age are more likely to be pairs. The mean age difference in the phylogenetic analysis, using a fixed distance of 0.05, was 9.2, 9.0, 7.3 and 5.6 years for MSM-, HET-, HET/IDU-, and IDU-pairs, respectively. Decreasing the cophenetic distance threshold from 0.05 to 0.01 significantly decreased the mean age difference. Similarly, repeated sampling of 100 per cent down to 1 per cent of the tips revealed an increased age difference at lower sample proportions. HIV transmission is age-assortative, but the age difference of transmission pairs detected by phylogenetic analyses depends on both sampling proportion and distance criterion. The mean age difference decreases when using more conservative distance thresholds, implying an underestimation of age-assortativity when using liberal distance criteria. Similarly, overestimation of the mean age difference occurs for pairs from sparsely sampled trees, as is often the case in sub-Saharan Africa.


Keywords:  HIV; age structure; cophenetic distance; phylogenies; sampling

Year:  2018        PMID: 30250751      PMCID: PMC6143731          DOI: 10.1093/ve/vey024

Source DB:  PubMed          Journal:  Virus Evol        ISSN: 2057-1577


1. Introduction

Human immunodeficiency virus (HIV) infection and acquired immunodeficiency syndrome (AIDS) is a major health threat with approximately 1.8 million new HIV infections and 1 million AIDS-related deaths worldwide in 2016 (UNAIDS) (Fact sheet—Latest statistics on the status of the AIDS epidemic, 2017). Targeting public health interventions for the prevention of new infections in subpopulations at risk is therefore crucial to curb the epidemic. In this context, age mixing and its impact on HIV transmission was studied in different settings in the past. For example, in sub-Saharan Africa, the region that carries the highest burden of the HIV epidemic, there is evidence that older men infecting younger women drive the HIV epidemic (Ott et al., 2011; Oliveira et al., 2017; Schaefer et al., 2017). Many public health interventions therefore target young women, e.g., by teaching them cautiousness around so-called ‘sugar daddies’, i.e., older men. In the USA, African Americans carry a disproportionate burden of the HIV epidemic, in particular young men who have sex with men (MSM). Age patterns, in particular differences between black and white MSM, were analyzed by Grey et al. (2015). Black MSM exhibited a slightly more disassortative age mixing compared with white MSM, but this difference was too weak to explain the higher HIV prevalence of black MSM in their model. Doherty, Schoenbach, and Adimora (2009) investigated sexual mixing of heterosexual African Americans and found an overall strong assortativity with respect to illicit drug use and assortative mixing with respect to education and incarceration primarily for males. In Hurt et al. (2010), the age differences of the three most recent sexual partners of young MSM in the USA were used to quantify how the odds of acquiring HIV increase with the age of the sexual partners. 
In addition, a modeling study by Wilson (2009) showed that for Australian MSM, despite the increasing mean age of HIV-infected MSM, the epidemic is likely to be sustained due to frequent age-disparate mixing. Another Australian study, by Chow et al. (2016), shows that sexual mixing is assortative with respect to age and condom use in MSM and heterosexual relationships. Most of the above-mentioned studies were based on questionnaires about the age of sexual partners of the study participants. Such studies rely heavily on correct reporting by the study participants, but also on their ability to estimate the age of their sexual partners correctly. Phylogenetic analysis of HIV sequences can overcome these potential biases introduced by incorrect reports. The underlying assumption of analyses of phylogenetic trees is that two patients whose HIV sequences are clustered in a tree share a social network or even form an HIV transmission pair. Calculating the mean of the absolute differences in birth years of patients clustered in the tree hence gives information about the age differences at infection in the HIV transmission network. This method is, however, sensitive to the choice of certain parameters. Several drawbacks of phylogenetic cluster methods, such as the potential bias introduced by the time since infection, were pointed out in a simulation study by Le Vu et al. (2018). Novitsky et al. (2014) studied the impact of sample density on the proportion of HIV sequences in phylogenetic clusters. The performance of different phylogenetic methods in challenging, i.e., poorly sampled, settings was analyzed by Ratmann et al. (2017) based on simulated HIV-1 epidemics with a focus on recent transmission dynamics. Phylogenetic analyses of demographic and social patterns depend on the sample proportion of the whole population of people living with HIV (PLWH), on the distance thresholds used for inclusion of clusters and, of course, on the scientific question of interest itself. 
It is expected that any significant pattern detected in HIV transmission networks, e.g., clustering of patients of similar age or same ethnicity, will be underestimated if only few patients are sampled. With a small sample proportion, the phylogeny might not reflect the HIV transmission network well and the chances of obtaining HIV transmission pairs in the phylogeny are small. Inclusion of a large number of pairs, which form a pair in the phylogenetic tree only because the intermediate links of the transmission chain are not sampled, will therefore underestimate how strong patterns are pronounced in the HIV transmission network (see Fig. 1 for the underlying idea).
Figure 1.

‘Heuristic’ example of the possible impact of the sample proportion: we start with sixteen patients: the tips are labeled with the birth year. The left tree has sixteen tips and six pairs (in blue). For the middle tree, eight tips are randomly sampled from the left tree (the red tips). The middle tree has three pairs (in blue). For the right tree, four tips are randomly sampled from the middle tree (the red tips). The right tree has two pairs (in blue). For each tree, the mean age difference of the pairs is calculated: 1.2 years for the left tree, 2 years for the middle tree and 3 years for the right tree.

In our study, we use the Swiss HIV Cohort Study (SHCS) resistance database to analyze the age difference in pairs of patients in the HIV transmission network. In particular, we study the age difference at infection for the three most frequent transmission groups of HIV, namely in MSM, heterosexuals (HET), and intravenous drug users (IDU). Moreover, we use this dataset as an example to better understand the impact of sample proportion and distance thresholds on the age difference of pairs, measured by the difference in birth years, in the phylogenetic tree.

2. Methods

2.1 Swiss HIV Cohort Study

The SHCS is a prospective multicenter study, launched in 1988, that includes PLWH aged sixteen years or older in Switzerland. It is estimated that the SHCS covers at least 45 per cent of all PLWH and 69 per cent of all AIDS patients in Switzerland (Schoeni-Affolter et al. 2010). Baseline demographic information, such as birth year, gender, most likely route of HIV infection and ethnicity, is collected at study entry. Clinical and laboratory information, such as CD4 cell counts and HIV viral load, is collected in two to four follow-up visits per year. The genotypic-resistance-test database of the SHCS contains HIV-1 pol-sequences of 11,922 patients, which is 60 per cent of all patients enrolled up to 2016. Considering only patients enrolled between 1996 and 2016, the database contains at least one sequence for 77 per cent of the patients, due to considerable retrospective sequencing based on the biobank. Combining the sample proportion of the SHCS of at least 45 per cent of the whole Swiss epidemic with the 60 per cent of SHCS patients who have at least one sequence in the database (0.45 × 0.60 = 0.27), we can deduce that the sequences available in the SHCS cover at least 27 per cent of the whole Swiss HIV epidemic, again with considerably higher coverage for recent years. The SHCS further contains sequences for an estimated 69 per cent of MSM diagnosed between 1996 and 2009 in Switzerland (Drescher et al. 2014), and a recent study by Shilaih et al. (2016) showed good coverage of hard-to-reach subpopulations, suggesting no systematic exclusion of marginalized populations from either the cohort or the sequence database.

2.2 The phylogenetic tree

For the construction of the maximum-likelihood phylogenetic tree, we included HIV-1 pol-sequences stored in the SHCS database. Sequencing was routinely performed for the pol region from the nucleotide positions 2,253–3,870 in the HIV genome. Only sequences with a minimal length of 250 nucleotides in the protease and a minimal length of 500 nucleotides in the reverse transcriptase were included into our analysis. If more than one sequence per patient was available, the earliest sequence was considered. In a first step, the sequences were aligned to the reference genome HXB2 (accession number: K03455.1). In addition, the SHCS sequences were compared with approximately 240,000 sequences from the Los Alamos database by using Basic Local Alignment Search tool (BLAST) (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Los Alamos sequences with at least 90 per cent identity to an SHCS sequence, called hits, were included, but at most the 10 closest hits per SHCS sequence. These criteria led to the inclusion of 11,922 SHCS sequences and 11,390 Los Alamos sequences. The median coverage of the protease was 297 nucleotides and of the reverse transcriptase 1,005 nucleotides. The phylogenetic tree including the SHCS sequences and the Los Alamos hits was built with FastTree (Price, Dehal, and Arkin 2009), by using the generalized time-reversible model of nucleotide evolution and the CAT approximation for rate variation across sites. This approach of building a tree was already verified and used in other SHCS projects (Bachmann et al. 2017; Turk et al. 2017).

2.3 Sampling from the tree

We used the phylogenetic tree containing the SHCS sequences and the included Los Alamos hits and constructed new trees by keeping only a certain percentage of the tips of the original tree and dropping the other tips. We call the resulting trees pruned trees, i.e., the trees after dropping tips as well as the corresponding edges that connected these tips to the rest of the tree. These pruned trees keep the topology of the original tree, with the desired tips removed, as illustrated in Fig. 1. The pruning procedure was performed by using the drop.tip function in the R-package Analyses of Phylogenetics and Evolution (ape) (Paradis, Claude, and Strimmer 2004). We repeatedly sampled 1, 2, 3, 4, and 5 per cent of the tips, and thereafter in steps of 5 per cent, i.e., 10, 15, …, 95 per cent, to generate pruned trees. For each fixed sample percentage, 500 pruned trees were sampled, resulting in 23 × 500 = 11,500 pruned trees, plus the original tree. For each tree, all clusters of size 2 (pairs) with both patients being SHCS patients were extracted and saved together with the corresponding cophenetic distance. For each sub-analysis, pairs with a cophenetic distance below the threshold of interest were used. The birth year, sex, and most likely route of HIV transmission (referred to as transmission group) of the patients were mapped on the tips of the tree. In addition, pairs were grouped by transmission group, namely pairs with both patients being MSM (MSM-pairs), both patients being IDU (IDU-pairs), both patients being HET (HET-pairs) and pairs with one patient being HET and one patient being IDU (HET/IDU-pairs).
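The pruning and pair-extraction step can be illustrated with a small sketch. The study itself used drop.tip from the R package ape; the Python code below is not that implementation but a minimal stand-in under stated assumptions: the tree is a nested tuple whose leaves are tip labels, and the function names (tips, prune, pairs, sample_pruned_pairs) are hypothetical. It ignores branch lengths and the restriction to SHCS tips.

```python
import random

def tips(tree):
    """List all tip labels of a nested-tuple tree (a tip is a plain string)."""
    if isinstance(tree, str):
        return [tree]
    return [t for child in tree for t in tips(child)]

def prune(tree, keep):
    """Restrict the tree to the tips in `keep`; return None if no tip survives."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    children = [c for c in (prune(child, keep) for child in tree) if c is not None]
    if not children:
        return None
    if len(children) == 1:
        return children[0]  # collapse a node that is left with a single child
    return tuple(children)

def pairs(tree):
    """Return all clusters of size 2: internal nodes whose two children are tips."""
    if tree is None or isinstance(tree, str):
        return []
    if len(tree) == 2 and all(isinstance(c, str) for c in tree):
        return [tuple(sorted(tree))]
    found = []
    for child in tree:
        found.extend(pairs(child))
    return found

def sample_pruned_pairs(tree, proportion, rng):
    """Keep a random proportion of the tips and extract pairs from the pruned tree."""
    all_tips = tips(tree)
    k = max(1, round(proportion * len(all_tips)))
    keep = set(rng.sample(all_tips, k))
    return pairs(prune(tree, keep))

# Toy tree with seven tips (hypothetical labels)
tree = ((("A", "B"), "C"), (("D", "E"), ("F", "G")))
print(pairs(tree))                               # [('A', 'B'), ('D', 'E'), ('F', 'G')]
print(pairs(prune(tree, {"A", "C", "D", "F"})))  # [('A', 'C'), ('D', 'F')]
print(sample_pruned_pairs(tree, 0.5, random.Random(0)))
```

In the toy example, dropping B, E, and G creates the pairs (A, C) and (D, F), which were not pairs in the full tree. This is the mechanism by which sparse sampling can link patients who are further apart in the transmission chain.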

2.4 Cophenetic distance

The cophenetic distance between two tips from a phylogenetic tree is the sum of the branch lengths connecting the two tips (Sokal and Rohlf 1962). Different thresholds ranging from 0.01 to 0.05 of this distance were considered for transmission pairs. Pairs that exceeded the distance threshold were not included in the respective analysis.
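As an illustration of this definition, the following sketch computes the cophenetic distance as the sum of branch lengths on the path between two tips and then filters pairs by a threshold. The tree encoding (a dictionary mapping each node to its parent and branch length), the branch lengths, and the function names are assumptions made for this example only.

```python
def distances_to_ancestors(tip, parent):
    """Map the tip and each of its ancestors to the path length from the tip."""
    out = {tip: 0.0}
    node, dist = tip, 0.0
    while node in parent:  # the root has no entry in `parent`
        node_parent, length = parent[node]
        dist += length
        out[node_parent] = dist
        node = node_parent
    return out

def cophenetic(a, b, parent):
    """Sum of the branch lengths connecting tips a and b."""
    up_a = distances_to_ancestors(a, parent)
    up_b = distances_to_ancestors(b, parent)
    # The most recent common ancestor is the shared node with the shortest path
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)

def within_threshold(pair_list, parent, threshold):
    """Keep only pairs whose cophenetic distance does not exceed the threshold."""
    return [p for p in pair_list if cophenetic(p[0], p[1], parent) <= threshold]

# Hypothetical tree: node -> (parent, branch length in substitutions per site)
parent = {
    "A": ("n1", 0.004), "B": ("n1", 0.003), "n1": ("root", 0.02),
    "C": ("n2", 0.03), "D": ("n2", 0.01), "n2": ("root", 0.01),
}
candidate_pairs = [("A", "B"), ("C", "D")]
print(within_threshold(candidate_pairs, parent, 0.05))  # both pairs pass
print(within_threshold(candidate_pairs, parent, 0.01))  # only ('A', 'B') passes
```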

2.5 Calculating the mean age difference

We used the absolute age difference measured as the age difference of the birth years of the patients in all cases. For each fixed sample proportion and each fixed distance threshold, the mean of the absolute age differences of the pairs of interest, i.e., either all or transmission group specific, was calculated separately for each of the 500 corresponding pruned trees and then averaged. We used the age difference by birth year, since this measure stays constant over time and is independent of sample date or age at infection of the patients. If, for example, Patient A was born in 1960, but diagnosed and sequenced in 2000 and Patient B born in 1970, diagnosed in 2005 and sequenced in 2008, their age difference will be 1970−1960 = 10 years at any time point.
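A minimal sketch of this calculation follows, assuming pairs are given as tuples of patient identifiers and birth years as a dictionary. The identifiers, the birth years of patients C and D, and the function names are hypothetical; patients A and B match the example in the text.

```python
from statistics import mean

def mean_abs_age_difference(pair_list, birth_year):
    """Mean of the absolute birth-year differences over the pairs of one tree."""
    return mean(abs(birth_year[a] - birth_year[b]) for a, b in pair_list)

def averaged_over_trees(pairs_per_tree, birth_year):
    """Compute the mean per pruned tree, then average those means."""
    per_tree = [mean_abs_age_difference(p, birth_year) for p in pairs_per_tree if p]
    return mean(per_tree)

birth_year = {"A": 1960, "B": 1970, "C": 1965, "D": 1968}
tree_1 = [("A", "B"), ("C", "D")]  # differences of 10 and 3 years
tree_2 = [("A", "C")]              # difference of 5 years

print(mean_abs_age_difference(tree_1, birth_year))        # 6.5
print(averaged_over_trees([tree_1, tree_2], birth_year))  # 5.75
```

Trees without any pair are skipped in this sketch; how such trees were handled in the study is not stated in the text.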

2.6 Analysis of the age difference at infection

To assess whether HIV transmission is more likely in pairs of similar age, we used random pairs as a comparison. For that, we randomly assigned pairs of patients and computed the resulting mean age difference of all pairs. This process was repeated one hundred times and the mean of the resulting mean age differences was used as the reference value. In addition, we wanted to assess by how much the age difference of pairs was overestimated when varying the distance threshold and the sample density, denoted by ‘assortativity lost’. For that, we used the minimum age difference (min) and the maximum age difference (max) obtained by varying the distance threshold and sample percentage. In particular, we compare ‘min’ and ‘max’ with the mean age difference obtained by random pairing (random). The assortativity lost is then defined as: 1− (random−max)/(random−min).
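The two quantities defined above can be sketched as follows. The random-pairing scheme shown (drawing two distinct patients uniformly at random for each pair) is an assumption, since the text does not specify how pairs were assigned; the function names are hypothetical. The worked number uses the HET/IDU values reported in Table 1 (random = 10.36, min = 6.1, max = 8).

```python
import random
from statistics import mean

def random_pair_reference(birth_years, n_pairs, repeats=100, seed=0):
    """Mean age difference of randomly assigned pairs, averaged over repeats.

    Assumption: each random pair is two distinct patients drawn uniformly.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(repeats):
        diffs = []
        for _ in range(n_pairs):
            a, b = rng.sample(birth_years, 2)
            diffs.append(abs(a - b))
        means.append(mean(diffs))
    return mean(means)

def assortativity_lost(random_ref, min_diff, max_diff):
    """1 - (random - max) / (random - min), as defined in the text."""
    return 1 - (random_ref - max_diff) / (random_ref - min_diff)

# HET/IDU-pairs, values taken from Table 1
lost = assortativity_lost(10.36, 6.1, 8.0)
print(f"{100 * lost:.0f} per cent")  # 45 per cent
```

A value of 0 means the least favorable setting loses none of the signal seen in the best setting, and a value of 1 means the pairs look no more age-assortative than random pairs.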

3. Results

3.1 Study population

Sequences of 11,922 SHCS patients were included in the phylogenetic tree. Of them, 8,554 (71.75 per cent) were males and 3,368 (28.25 per cent) were females. Moreover, 4,738 (39.74 per cent) of the patients were MSM, 4,246 (35.61 per cent) HET and 2,430 (20.38 per cent) IDU. The whole tree with all 11,922 patients contained 2,991 potential SHCS transmission pairs, i.e., clusters of size 2 with both tips belonging to SHCS patients. Of these, 954 (31.9 per cent) were MSM-pairs, 635 (21.23 per cent) HET-pairs, 414 (13.84 per cent) IDU-pairs, and 352 (11.77 per cent) HET/IDU-pairs. In addition, there were 310 (10.4 per cent) pairs with one patient being MSM and one patient being HET, 105 (3.5 per cent) pairs with one patient being MSM and the other patient being IDU and 221 (7.4 per cent) with at least one patient not belonging to one of the three main transmission groups MSM, HET, or IDU. For further analyses, we either looked at all pairs together or concentrated on the four epidemiologically most relevant categories of pairs, namely MSM-, HET-, IDU-, and HET/IDU-pairs.

3.2 Age structure and age difference at infection

The median birth year of all included patients was 1965 with a standard deviation of 11.3 years. MSM had the median birth year 1965 (SD =12.2 years), HET the year 1966 (SD=11.9 years), and IDU the year 1963 (SD=6.4 years) (see Fig. 2A). Without including a distance threshold, the mean age difference between all pairs was 9.1 years with a median of 7 years. IDU-pairs had the smallest age difference with a mean of 5.6 years (median=4 years), followed by HET/IDU-pairs with a mean of 7.2 years (median=5 years), MSM-pairs with a mean of 9.3 years (median=7 years), and HET-pairs with a mean of 9.6 years (median=7 years), visualized in Fig. 2B. The age difference observed in pairs in the tree was indeed significantly smaller compared with the average age difference of two patients on the tree, namely 9.1 years compared with 12.2 years (P < 0.001). Random reassignment of MSM-pairs led to a mean age difference of 13.2 years (13.1 years for HET-pairs, 7.1 years for IDU-pairs and 10.4 years for HET/IDU-pairs). For each considered category, the mean age difference was significantly smaller (P < 0.001) compared with randomly assigned pairs, indicating that HIV transmission occurs between people of similar age, for each transmission group. In Fig. 2C, we showed the number of pairs for each combination of birth years (grouped by 5 years) and in Fig. 2D we normalized the number of pairs for each combination of birth years by the number of patients born in these categories. With Fig. 2D, we could also visualize that the age difference in pairs in the phylogenetic tree is smaller compared with random pairs, i.e., the diagonal is darker than the off-diagonal.
Figure 2.

(A) Distribution of the birth years by route of HIV transmission: MSM, HET, and IDU; (B) distribution of age differences (in years) for the four categories of pairs: MSM-, HET-, IDU-, and HET/IDU-pairs; (C) number of pairs by birth year; (D) number of pairs by birth year, normalized by the number of patients in each birth year.


3.3 Impact of distance criterion

The impact of the distance criterion on the mean age difference of the pairs was analyzed by only including pairs with a distance smaller than a given threshold, which varied from 0.01 to 0.05 (see Fig. 3). We observed a strong effect for HET-pairs, where the mean age difference ranged from 8.3 to 9 years. This represents a significant increase (P < 0.001) in age difference, with a regression slope of 0.013 years per 0.01 increase in the distance criterion. Similarly, the mean age difference for HET/IDU-pairs ranged from 6.1 to 7.4 years, with a slope of 0.019 years per 0.01 increase in the distance criterion (P < 0.001). For MSM-pairs, the mean age difference ranged from 8.7 to 9.2 years, with a significant (P < 0.001) slope of 0.011 years per 0.01 increase in the distance criterion. For IDU-pairs, no significant correlation between the mean age difference and the distance criterion was observed (see Fig. 3).
Figure 3.

Analysis of the observed mean age difference of MSM-, IDU-, HET-, and HET/IDU-pairs for various distance thresholds. For each distance threshold, only pairs with a smaller distance are included. The number of pairs for each distance threshold is shown as well (in blue).


3.4 Impact of sampling

The impact of sampling on the mean age difference of the pairs was analyzed by generating 500 pruned trees for various sample percentages between 1 and 100 per cent, and a liberal distance threshold of 0.05. Strong effects of the sample proportion were observed for HET-pairs with the mean age difference increasing from 9 to 10.3 years (P < 0.001), for MSM-pairs with an increase from 9.2 to 10.1 years (P < 0.001) and for IDU-pairs with an increase from 5.6 to 6.1 years (P < 0.001) (Fig. 4).
Figure 4.

Analysis of the mean age difference of MSM-, IDU-, HET-, and HET/IDU-pairs for different sample proportions, averaging over 500 pruned trees for each sample proportion. The number of pairs for each sample proportion (again, averaged over 500 pruned trees) is shown as well (in blue).

For HET/IDU-pairs, the mean age difference increased significantly (P < 0.001) from 7.3 to 7.9 years. The total difference in the mean age difference was rather small, e.g., 9.2 compared with 10.1 years for MSM-pairs. A relative comparison to the randomly expected mean age difference revealed, however, a noteworthy underestimation of the clustering of patients of similar age at low sample proportions. In Table 1, we compared the mean age difference between two patients in the tree with the mean age difference obtained from pairs in the trees. We showed that the clustering of patients of similar age was underestimated by up to 45 per cent: for the full tree with a distance threshold of 0.01, the mean age difference of HET/IDU-pairs was 4.3 years smaller than that of randomly assigned HET/IDU-pairs, but for the distance threshold of 0.05 in the 4 per cent pruned tree it was only 2.4 years smaller.
Table 1.

Comparison of the mean age difference (in years) for varying distance threshold and varying sample percentage, for all pairs, as well as stratified by transmission groups, including the expected mean age difference for randomly assigned pairs.

                                   ALL          MSM          HET          IDU          HET/IDU
Random                             12.21        13.24        13.11        7.08         10.36
100 per cent sampling, distance
  ≤0.05                            8.7          9.2          9            5.6          7.3
  ≤0.04                            8.6          9.1          8.8          5.5          7.2
  ≤0.03                            8.4          9            8.6          5.5          7.2
  ≤0.02                            8.4          8.9          8.5          5.6          7.2
  ≤0.01                            8.3          8.8          8.3          5.9          6.1
Distance ≤0.05, sampling
  100 per cent                     8.7          9.2          9            5.6          7.3
  80 per cent                      8.8          9.2          9.1          5.6          7.5
  60 per cent                      8.8          9.3          9.3          5.6          7.6
  40 per cent                      8.9          9.4          9.6          5.7          7.7
  20 per cent                      9            9.5          9.8          5.8          7.8
  10 per cent                      9.1          9.7          10.3         5.9          7.9
  5 per cent                       9.2          10           10.4         5.9          7.7
  4 per cent                       9.3          10.1         10.4         6            8
  3 per cent                       8.8          9.6          10.3         5.7          7.1
  2 per cent                       8.8          10           10.9         6            7.8
  1 per cent                       8.8          11.6         9.5          5.6          7.2
Assortativity lost                 26 per cent  20 per cent  31 per cent  32 per cent  45 per cent
Moreover, in Fig. 5, we show the impact of sampling for varying distance thresholds. We see a trend that sampling has more impact, i.e., a larger difference in age, for more liberal distance thresholds.
Figure 5.

Top: mean age difference of all pairs in years (by color, scale bar on the right) derived from the phylogenetic tree, stratified by distance threshold (ranging from 0.01 to 0.05) and sample percentage (ranging from 5 to 100 per cent). Bottom: for each fixed distance threshold above, we show the difference in ‘mean age difference’ between 100 and 5 per cent sampling.


3.5 A closer look at HET-pairs

In the above analysis, we treated HET-pairs in the same way as MSM- and IDU-pairs in the sense that we considered the two patients in the pair, regardless of their gender, to be interchangeable. While MSM-pairs are by definition pairs with both patients being male, all combinations of gender, i.e., male–male, male–female, and female–female, were considered in the analysis of IDU- and HET-pairs. In contrast to HET-pairs, all combinations of gender are plausible in IDU-pairs due to needle sharing. In the fully sampled tree without including a distance criterion, 414 (65.1 per cent) of the HET-pairs were male–female, but 140 (22 per cent) were female–female and 81 (12.8 per cent) were male–male pairs. In the male–female HET-pairs, the median birth year of male patients was 1963, 5 years earlier than the median birth year of female patients, which was 1968. In 277 (66.9 per cent) of the male–female HET-pairs the male patient was older, in 116 (28.0 per cent) pairs the female patient was older and in 21 (5.1 per cent) pairs the two patients had the same birth year. While male–male HET-pairs could still be true transmission pairs due to incorrect reporting of sexual preference, sexual transmission of HIV between two female persons is very rare. Therefore, a large number of the female–female HET-pairs are most likely not real transmission pairs. The impact of the distance threshold and the sample density on the percentage of female–female HET-pairs is shown in Fig. 6. As expected, the percentage of female–female HET-pairs decreases with higher sample proportion and with stricter distance criterion.
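The share of female–female HET-pairs can be computed with a few lines of code and tracked across distance thresholds and sample proportions. The sketch below is only an illustration; the data layout (pairs as tuples of patient identifiers, sex as a dictionary) and the function name are assumptions. The counts used in the example are those reported above for the fully sampled tree.

```python
from collections import Counter

def sex_composition(pair_list, sex):
    """Percentage of pairs per sex combination ('F-F', 'F-M', 'M-M')."""
    counts = Counter("-".join(sorted((sex[a], sex[b]))) for a, b in pair_list)
    total = sum(counts.values())
    return {combo: 100 * n / total for combo, n in counts.items()}

# Rebuild the reported HET-pair counts with hypothetical patient identifiers
reported = {("M", "F"): 414, ("F", "F"): 140, ("M", "M"): 81}
het_pairs, sex = [], {}
for (sex_a, sex_b), n in reported.items():
    for i in range(n):
        a, b = f"{sex_a}{sex_b}{i}a", f"{sex_a}{sex_b}{i}b"
        sex[a], sex[b] = sex_a, sex_b
        het_pairs.append((a, b))

composition = sex_composition(het_pairs, sex)
print(round(composition["F-F"], 1))  # 22.0
```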
Figure 6.

Percentage of HET-pairs (indicated by color, scale bar on the right) where both patients are female for various distance thresholds (ranging from 0.01 to 0.05) and sample proportions (ranging from 20 to 100 per cent).


4. Discussion

Phylogenetic analysis of HIV transmission is an efficient tool for obtaining a better understanding of the dynamics of the epidemic, but it needs to be executed with caution. In this study, we aimed to highlight the influence of the sample proportion of PLWH and of the distance threshold for genetically linked pairs on the mean age difference of pairs, defined by the birth years of the patients in the pairs. Similar effects are, however, expected when extracting other traits by the same method, as could, for example, be done for the assortativity by ethnicity, body-mass index or behavioral aspects such as smoking. All these scientific questions addressed by phylogenetic methods face the same underlying problem: the lower the sample proportion, the lower the probability of obtaining real transmission pairs or at least two patients who indeed share a social network. We use the SHCS resistance database, which is densely sampled with sequences covering at least 27 per cent of the whole Swiss epidemic and even better coverage for recent years, to point out problems associated with phylogenetic analyses of demographic traits within HIV transmission networks. As an example, we use the mean age difference of pairs to demonstrate changes in the observed age difference at infection obtained by varying the sample proportion and the distance threshold for including pairs in the analysis. In all transmission groups considered, i.e., MSM-, HET-, IDU-, and HET/IDU-pairs, we find that the age difference in pairs in the tree is slightly, but significantly, smaller compared with randomly assigned pairs. The mean age difference in MSM-pairs is around 9 years, which might be unexpectedly high, but is not implausible as studies by Hurt et al. (2010) and Morris, Zavisca, and Dean (1995) on young MSM in the USA identified young men who have sex with older men as the drivers of the epidemic. 
In particular, partners of young, primary HIV-infected MSM were on average 6 years older than partners of uninfected MSM of the same age class. One simple explanation of the high age difference in pairs of PLWH is certainly that the whole population of PLWH is ageing. Because of that, the odds of acquiring HIV when having an older partner are higher compared with having a younger partner (Morris, Zavisca, and Dean 1995; Hurt et al. 2010). On the other hand, some HIV transmissions might occur due to prostitution of young MSM with the client being much older, or the young MSM coming from a high prevalence country. We want to emphasize that the age difference calculated in this project reflects the age difference in MSM-pairs where an HIV transmission event happened, but does not necessarily reflect the sexual contact network in general. Moreover, it should be noted that we only consider the age difference of patients in pairs. Although it might be relevant, we do not distinguish different absolute ages leading to the same age difference, e.g., we do not distinguish between pairs with ages 18 and 25 years or 38 and 45 years at infection (both correspond to an age difference of 7 years). The main hurdle for distinguishing between different absolute ages leading to the same age difference is that the exact date of infection is unknown for most transmission events. The same holds for the large age difference of more than 9 years observed for HET-pairs: it reflects not the age difference of heterosexual pairs in Switzerland, but the absolute age difference in pairs for which an HIV transmission event happened, with the most likely route being heterosexual contact. Cases of HET-pairs with a large age difference might be pairs of patients with mixed ethnicity, as shown by Marzel et al. (2017). In their study, among thirty-three validated transmission pairs, twelve pairs were of mixed ethnicity with a large median age difference of 17.5 years. 
In addition, the mean age difference might still be an overestimate: we show that the mean age difference of pairs clearly increases with a smaller sample proportion, so we would most likely observe a decrease if the analysis were performed with an even higher sample proportion than provided by the SHCS. This raises a problem for phylogenetic studies performed on sparsely sampled populations, such as the study by Oliveira et al. (2017). In that study, the authors wanted to highlight the ‘sugar daddy’-phenomenon for KwaZulu-Natal, South Africa, by understanding the age structure of transmission pairs using phylogenetic analysis. Their study area, a part of the uMgungundlovu district of KwaZulu-Natal, has about 445,000 inhabitants, of whom around 40 per cent are infected with HIV. Using a genetic distance threshold of 4.5 per cent, they identified 90 phylogenetic clusters in a tree of 1,589 sequences, all sampled from individuals who were not virally suppressed at the time the study was undertaken. They found high age differences within these clusters. Our study shows that their low sample proportion may (at least partially) explain these high age differences. Of course, many other studies that do not use phylogenetic methods report that high age differences in Sub-Saharan Africa, i.e., between young women and older men, drive the HIV epidemics (Schaefer et al. 2017). Also, even with only 1 per cent sample density of the SHCS sequences (the lowest sample percentage we analyzed) and a liberal distance threshold, we still observed a lower age difference in the pairs in the trees compared with the average age difference of patients in the tree. However, the magnitude of the age difference reported by Oliveira et al. (2017) might be overestimated due to the low sampling density and a liberal genetic distance threshold.
Kiwuwa-Muyingo et al. (2017) found high within-household and within-community HIV transmission in five fishing communities with approximately 15,000 inhabitants at Lake Victoria, Uganda, by using 238 sequences. To compensate for this low sample proportion, an extensive sensitivity analysis was performed by using genetic distance thresholds ranging from 0.5 to 5 per cent. Moreover, they used not only HIV-1 pol-sequences but also env-sequences for their analysis. An important question is therefore whether a conservative distance criterion for including pairs or clusters in phylogenetic analyses can compensate for sparse sampling. In this context, it is important to note that we usually do not observe a random sample of the population of interest but a biased one, for several reasons. For example, some transmission groups are more likely to be sampled early in their HIV infection, as is the case for MSM in Switzerland. This leads to a problem when choosing a distance threshold, as transmission pairs that are sequenced during the first few months of the HIV infection have a much smaller distance compared with pairs sampled many years after infection. With too conservative a distance threshold, recent transmission pairs are preferentially selected, introducing another bias into the phylogenetic analysis, as shown, for example, by Marzel et al. (2016). Another bias concerning the distance threshold could be introduced by differences in within-host evolution, for example due to HIV infection by multiple founder viruses, HIV superinfection, or simply different host- or viral-genetic factors. The impact of these factors on phylogenies has hardly been studied, making it difficult to deduce the ideal distance criterion. For low sample proportions, there may not even be an ideal distance criterion; the mean age difference of pairs in the transmission tree is overestimated for all criteria at low sample proportions (Fig. 5).
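The selection effect of a strict threshold can be made concrete with a deliberately simplified calculation. The sketch below assumes a strict molecular clock, so that the expected genetic distance within a pair is the substitution rate times the summed time both viral lineages have evolved since they diverged (approximately, since transmission). Neither the clock assumption nor the rate of 0.002 substitutions per site per year is taken from this study; both are illustrative, and the function names are hypothetical.

```python
# Illustrative only: strict-clock model with an ASSUMED substitution rate.
ASSUMED_RATE = 0.002  # substitutions per site per year (hypothetical value)

def expected_distance(years_a, years_b, rate=ASSUMED_RATE):
    """Expected pairwise distance when the two sequences were sampled
    years_a and years_b after their lineages diverged."""
    return rate * (years_a + years_b)

def retained_pairs(pairs, threshold, rate=ASSUMED_RATE):
    """Keep the pairs whose expected distance does not exceed the threshold."""
    return [p for p in pairs if expected_distance(*p, rate=rate) <= threshold]

# Three hypothetical pairs: (years since divergence for patient A, patient B)
pairs = [(0.5, 0.5), (4.0, 6.0), (10.0, 12.0)]
strict = retained_pairs(pairs, threshold=0.01)   # only the recently infected pair
liberal = retained_pairs(pairs, threshold=0.05)  # all three pairs
```

Under these assumptions, a threshold of 0.01 retains only pairs with a combined evolution time of at most about 5 years, whereas 0.05 retains pairs with up to about 25 years, which is one way to see why a very strict threshold favours recently infected pairs.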
To conclude, a good way of dealing with the problem of a sparsely sampled HIV population is to perform an extensive sensitivity analysis on the distance criterion, and also to resample the available sequences and to understand the impact of sample density on the specific scientific question. Moreover, combining phylogenetic analyses of different regions of the genome, as done for env and pol by Kiwuwa-Muyingo et al. (2017), could be a promising method, which, however, needs further investigation. Depending on the scientific question of interest, it is important to decide whether working with true transmission pairs is crucial or whether transmission pairs reflecting the underlying social network suffice to understand the underlying dynamics. Our results show that pairs defined by a conservative distance threshold are more robust to sparse sampling of sequences from a patient population. This suggests that if the sampling proportion is low and if it is important that phylogenetic pairs reflect true pairs (as might be the case for quantifying the prevalence of transmission in pairs with large age differences), the pitfalls induced by the low sample proportion can be alleviated by choosing a strict distance threshold. In addition, measures that estimate the proportion of pairs that are most likely not real transmission pairs could help to determine suitable parameters for the phylogenetic analysis. One example of such a measure is the percentage of pairs with both patients being female heterosexual (see Fig. 6). Of course, female–female HET-pairs could still be true transmission pairs due to wrong classification of the transmission group, i.e., they could have shared needles, or they could reflect rare events of sexual female-to-female HIV transmission (Chan et al. 2014). Nevertheless, in the case of a high number of female–female HET-pairs, further investigation of the chosen parameters should be considered.
Finally, combining phylogenetic analysis with other clinical and demographic properties, as done, for example, by Marzel et al. (2017), who looked at shared clinic visits to detect true transmission pairs, could increase the credibility of phylogenetic studies.
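The recommended sensitivity analysis (resampling at several sample proportions, crossed with several distance thresholds) can be sketched as follows. This is a minimal illustration and not the procedure used in this study: instead of pruning a phylogenetic tree, it treats mutual nearest neighbours in a pairwise distance function as a crude proxy for pairs in the pruned tree. All function names and the toy data are hypothetical.

```python
import random
from statistics import mean

def mutual_nearest_pairs(ids, dist):
    """Pairs (a, b), a < b, that are each other's nearest neighbour.
    A crude proxy for pairs in a pruned tree, NOT the tree-based method."""
    nearest = {a: min((b for b in ids if b != a), key=lambda b: dist(a, b))
               for a in ids}
    return [(a, b) for a, b in nearest.items() if a < b and nearest[b] == a]

def sensitivity_grid(birth_year, dist, proportions, thresholds, n_rep=50, seed=1):
    """Mean absolute birth-year difference per (proportion, threshold),
    averaged over n_rep random subsamples of the patients."""
    rng = random.Random(seed)
    ids = sorted(birth_year)
    results = {(p, t): [] for p in proportions for t in thresholds}
    for p in proportions:
        k = max(2, round(p * len(ids)))  # at least two patients per subsample
        for _ in range(n_rep):
            pairs = mutual_nearest_pairs(rng.sample(ids, k), dist)
            for t in thresholds:
                diffs = [abs(birth_year[a] - birth_year[b])
                         for a, b in pairs if dist(a, b) <= t]
                if diffs:  # replicates without any retained pair are skipped
                    results[(p, t)].append(mean(diffs))
    return {key: (mean(v) if v else None) for key, v in results.items()}

def female_female_het_fraction(pairs, sex, group):
    """Share of pairs in which both patients are female heterosexuals."""
    if not pairs:
        return 0.0
    hits = sum(1 for a, b in pairs
               if sex[a] == sex[b] == "F" and group[a] == group[b] == "HET")
    return hits / len(pairs)

# Toy data (hypothetical): six patients placed on a line; distance = gap.
pos = {0: 0.0, 1: 0.01, 2: 0.5, 3: 0.52, 4: 1.0, 5: 1.04}
birth_year = {0: 1970, 1: 1972, 2: 1980, 3: 1990, 4: 1960, 5: 1961}
dist = lambda a, b: abs(pos[a] - pos[b])
grid = sensitivity_grid(birth_year, dist,
                        proportions=[1.0, 0.5], thresholds=[0.015, 0.05])
```

In this toy data set, tightening the threshold from 0.05 to 0.015 at full sampling lowers the mean absolute age difference from about 4.3 to 2 years, because only the closest pair is retained. The `female_female_het_fraction` helper illustrates the kind of plausibility check described above; a high value would suggest revisiting the chosen parameters.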

Summary

The representativeness of the SHCS resistance database, which covers at least 27 per cent of the whole Swiss HIV epidemic, allowed us to analyze the impact of the sample proportion and the distance threshold on the age difference of observed pairs in the phylogenetic tree. Both factors proved to influence the mean age difference of HIV transmission pairs, which was measured by the absolute difference in birth years. The age difference decreased almost monotonically both with a stricter (i.e., smaller) distance threshold and with a higher sample proportion. Especially for low sample proportions, deriving the age difference of pairs in the phylogenetic tree, or similar quantitative measures, can be misleading and requires extensive sensitivity analysis (see Fig. 5).
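As an illustration of the measure used throughout, the sketch below computes the mean absolute birth-year difference for a set of pairs and compares it with randomly assigned pairs through a simple resampling test. How the random pairs were drawn in this study is not specified here, so the uniform drawing of two distinct patients per pair, the function names, and the toy cohort are all assumptions.

```python
import random
from statistics import mean

def mean_abs_birth_year_diff(pairs, birth_year):
    """Mean of |birth year A - birth year B| over all pairs."""
    return mean(abs(birth_year[a] - birth_year[b]) for a, b in pairs)

def random_pair_test(pairs, birth_year, n_perm=1000, seed=0):
    """Compare the observed mean with that of randomly assigned pairs.
    Returns the observed mean and a one-sided resampling p-value."""
    rng = random.Random(seed)
    ids = sorted(birth_year)
    observed = mean_abs_birth_year_diff(pairs, birth_year)
    as_small = 0
    for _ in range(n_perm):
        # Assumption: each random pair is two distinct patients drawn uniformly.
        random_pairs = [rng.sample(ids, 2) for _ in pairs]
        if mean_abs_birth_year_diff(random_pairs, birth_year) <= observed:
            as_small += 1
    return observed, (as_small + 1) / (n_perm + 1)

# Hypothetical cohort: 40 patients born 1950..1989, paired with close-in-age partners.
birth_year = {i: 1950 + i for i in range(40)}
pairs = [(i, i + 1) for i in range(0, 20, 2)]
observed, p_value = random_pair_test(pairs, birth_year, n_perm=200)
```

A small p-value indicates that the observed pairs are closer in age than randomly assigned pairs from the same patient population, which is the comparison underlying the statement that patients of similar age are more likely to form pairs.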

Ethical approval and consent to participate

The SHCS was approved by the ethics committees of the participating institutions (Kantonale Ethikkommission Bern, Ethikkommission des Kantons St. Gallen, Comité Départemental d’Éthique des Spécialités Médicales et de Médecine Communautaire et de Premier Recours, Kantonale Ethikkommission Zürich, Repubblica e Cantone Ticino–Comitato Etico Cantonale, Commission Cantonale d’Éthique de la Recherche sur l’Être Humain, Ethikkommission beider Basel for the SHCS and Kantonale Ethikkommission Zürich for the ZPHI), and written informed consent was obtained from all participants.

Data availability

All clinical, demographic, and genetic data were obtained as part of the Swiss HIV Cohort Study (SHCS). Due to privacy reasons, the sensitivities associated with HIV infections, and the representativeness of the dataset, a deposition of the sequence data in an open database is not possible. Our data would in principle allow the reconstruction of transmission events and could thereby endanger the patients’ privacy, which is especially problematic because HIV-1 sequences have frequently been used in court cases. From a scientific point of view, the consequences of open and uncontrolled access to such densely sampled sequences could jeopardize the future publication (and thus the investigation) of similarly complete datasets and thereby be counterproductive even from an ‘open-data’ perspective. However, data can be made available on a confidential basis for checking the results, and a subsample of 10 per cent of the sequences from the SHCS has been uploaded to GenBank in the context of previously published work. Moreover, all data in the SHCS can be used for well-defined projects that are in accordance with the guidelines of the SHCS, if a corresponding project proposal is approved by the SHCS scientific board.

References

1.  Age-gaps in sexual partnerships: seeing beyond 'sugar daddies'.

Authors:  Miles Q Ott; Till Bärnighausen; Frank Tanser; Mark N Lurie; Marie-Louise Newell
Journal:  AIDS       Date:  2011-03-27       Impact factor: 4.177

2.  Mining for pairs: shared clinic visit dates identify steady HIV-positive partnerships.

Authors:  A Marzel; M Shilaih; T Turk; N K Campbell; W-L Yang; J Böni; S Yerly; T Klimkait; V Aubert; H Furrer; A Calmy; M Battegay; M Cavassini; E Bernasconi; P Schmid; K J Metzner; H F Günthard; R D Kouyos
Journal:  HIV Med       Date:  2017-04-04       Impact factor: 3.180

3.  Social and sexual networks: their role in the spread of HIV/AIDS among young gay men.

Authors:  M Morris; J Zavisca; L Dean
Journal:  AIDS Educ Prev       Date:  1995

4.  Transmission networks and risk of HIV infection in KwaZulu-Natal, South Africa: a community-wide phylogenetic study.

Authors:  Tulio de Oliveira; Ayesha B M Kharsany; Tiago Gräf; Cherie Cawood; David Khanyile; Anneke Grobler; Adrian Puren; Savathree Madurai; Cheryl Baxter; Quarraisha Abdool Karim; Salim S Abdool Karim
Journal:  Lancet HIV       Date:  2016-12-01       Impact factor: 12.767

5.  HIV-1 Transmission During Recent Infection and During Treatment Interruptions as Major Drivers of New Infections in the Swiss HIV Cohort Study.

Authors:  Alex Marzel; Mohaned Shilaih; Wan-Lin Yang; Jürg Böni; Sabine Yerly; Thomas Klimkait; Vincent Aubert; Dominique L Braun; Alexandra Calmy; Hansjakob Furrer; Matthias Cavassini; Manuel Battegay; Pietro L Vernazza; Enos Bernasconi; Huldrych F Günthard; Roger D Kouyos; V Aubert; M Battegay; E Bernasconi; J Böni; H C Bucher; C Burton-Jeangros; A Calmy; M Cavassini; G Dollenmaier; M Egger; L Elzi; J Fehr; J Fellay; H Furrer; C A Fux; M Gorgievski; H F Günthard; D Haerry; B Hasse; H H Hirsch; M Hoffmann; I Hösli; C Kahlert; L Kaiser; O Keiser; T Klimkait; R D Kouyos; H Kovari; B Ledergerber; G Martinetti; B Martinez de Tejada; K Metzner; N Müller; D Nadal; D Nicca; G Pantaleo; A Rauch; S Regenass; M Rickenbach; C Rudin; F Schöni-Affolter; P Schmid; J Schüpbach; R Speck; P Tarr; A Trkola; P L Vernazza; R Weber; S Yerly
Journal:  Clin Infect Dis       Date:  2015-09-19       Impact factor: 9.079

6.  Sex with older partners is associated with primary HIV infection among men who have sex with men in North Carolina.

Authors:  Christopher B Hurt; Derrick D Matthews; Molly S Calabria; Kelly A Green; Adaora A Adimora; Carol E Golin; Lisa B Hightow-Weidman
Journal:  J Acquir Immune Defic Syndr       Date:  2010-06       Impact factor: 3.731

7.  Age-disparate relationships and HIV incidence in adolescent girls and young women: evidence from Zimbabwe.

Authors:  Robin Schaefer; Simon Gregson; Jeffrey W Eaton; Owen Mugurungi; Rebecca Rhead; Albert Takaruza; Rufurwokuda Maswera; Constance Nyamukapa
Journal:  AIDS       Date:  2017-06-19       Impact factor: 4.177

8.  Phylogenetic Tools for Generalized HIV-1 Epidemics: Findings from the PANGEA-HIV Methods Comparison.

Authors:  Oliver Ratmann; Emma B Hodcroft; Michael Pickles; Anne Cori; Matthew Hall; Samantha Lycett; Caroline Colijn; Bethany Dearlove; Xavier Didelot; Simon Frost; A S Md Mukarram Hossain; Jeffrey B Joy; Michelle Kendall; Denise Kühnert; Gabriel E Leventhal; Richard Liang; Giacomo Plazzotta; Art F Y Poon; David A Rasmussen; Tanja Stadler; Erik Volz; Caroline Weis; Andrew J Leigh Brown; Christophe Fraser
Journal:  Mol Biol Evol       Date:  2016-10-07       Impact factor: 16.240

9.  HIV-1 transmission networks in high risk fishing communities on the shores of Lake Victoria in Uganda: A phylogenetic and epidemiological approach.

Authors:  Sylvia Kiwuwa-Muyingo; Jamirah Nazziwa; Deogratius Ssemwanga; Pauliina Ilmonen; Harr Njai; Nicaise Ndembi; Chris Parry; Paul Kato Kitandwe; Asiki Gershim; Juliet Mpendo; Leslie Neilsen; Janet Seeley; Heikki Seppälä; Fred Lyagoba; Anatoli Kamali; Pontiano Kaleebu
Journal:  PLoS One       Date:  2017-10-12       Impact factor: 3.240

10.  Likely female-to-female sexual transmission of HIV--Texas, 2012.

Authors:  Shirley K Chan; Lupita R Thornton; Karen J Chronister; Jeffrey Meyer; Marcia Wolverton; Cynthia K Johnson; Raouf R Arafat; Patricia M Joyce; William M Switzer; Walid Heneine; Anupama Shankar; Timothy Granade; Michele S Owen; Patrick Sprinkle; Vickie Sullivan
Journal:  MMWR Morb Mortal Wkly Rep       Date:  2014-03-14       Impact factor: 17.586
