Mattia C F Prosperi, Massimo Ciccozzi, Iuri Fanti, Francesco Saladini, Monica Pecorari, Vanni Borghi, Simona Di Giambenedetto, Bianca Bruzzone, Amedeo Capetti, Angela Vivarelli, Stefano Rusconi, Maria Carla Re, Maria Rita Gismondo, Laura Sighinolfi, Rebecca R Gray, Marco Salemi, Maurizio Zazzi, Andrea De Luca.
Abstract
Understanding the determinants of virus transmission is a fundamental step in designing effective screening and intervention strategies to control viral epidemics. Phylogenetic analysis can be a valid approach for identifying transmission chains, and very large data sets can be analysed through parallel computation. Here we propose and validate a new methodology for partitioning large-scale phylogenies and inferring transmission clusters. This approach, based on a depth-first search algorithm, combines the evaluation of node reliability, tree topology and patristic distance analysis. The method has been applied to identify transmission clusters in a phylogeny of 11,541 human immunodeficiency virus-1 subtype B pol gene sequences from a large Italian cohort. Molecular transmission chains were characterized by means of different clinical/demographic factors, such as the interaction between male homosexuals and male heterosexuals. Our method takes advantage of a flexible notion of transmission cluster and can become a general framework to analyse other epidemics.
Year: 2011 PMID: 21610724 PMCID: PMC6045912 DOI: 10.1038/ncomms1325
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Figure 1. Automated partition of a phylogenetic tree.
Graphical example of a depth-first tree search for automated phylogenetic tree partition. The method considers nodes/sub-trees with a reliability ≥ 90% and ≥ 2 distinct patients, and recognizes a sub-tree as a cluster when its median pairwise patristic distance is below a percentile threshold of the whole-tree patristic distance distribution (here, the 10th percentile). (a) An example of a phylogenetic tree, where each patient/sequence is identified by a letter (A–Z) and each tree node has an associated reliability value (for example, bootstrap support). (b) Histogram of the whole-tree patristic distance distribution. The vertical black line corresponds to the 10th percentile distance threshold. The partition method identifies three clusters (yellow, red and green) and discards the grey sub-tree.
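The partition rule described in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Node` class, the function names, the nearest-rank percentile and the choice to stop descending once a sub-tree is accepted are all hypothetical. The toy tree has only six leaves, so the example uses the 25th percentile rather than the 10th.

```python
import math
from dataclasses import dataclass, field
from itertools import combinations
from statistics import median
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node (not the authors' data structure)."""
    length: float = 0.0             # branch length to the parent
    support: float = 0.0            # node reliability in [0, 1], e.g. bootstrap
    patient: Optional[str] = None   # patient id, set on leaves only
    children: List["Node"] = field(default_factory=list)


def leaf_depths(node):
    """Return [(patient, distance from node down to the leaf)] for a sub-tree."""
    if not node.children:
        return [(node.patient, 0.0)]
    return [(p, d + c.length) for c in node.children for p, d in leaf_depths(c)]


def pairwise_distances(node):
    """All leaf-to-leaf patristic distances inside the sub-tree rooted at node."""
    if not node.children:
        return []
    per_child = [[(p, d + c.length) for p, d in leaf_depths(c)]
                 for c in node.children]
    dists = []
    # Pairs whose path passes through this node: leaves under different children.
    for a, b in combinations(per_child, 2):
        dists += [da + db for _, da in a for _, db in b]
    # Pairs contained entirely in one child sub-tree.
    for c in node.children:
        dists += pairwise_distances(c)
    return dists


def partition(root, percentile=0.10, min_support=0.90):
    """Depth-first partition; returns (clusters, distance threshold)."""
    all_d = sorted(pairwise_distances(root))
    # Nearest-rank percentile of the whole-tree distribution (an assumption).
    threshold = all_d[max(0, math.ceil(percentile * len(all_d)) - 1)]
    clusters = []

    def visit(node):
        if not node.children:
            return
        patients = {p for p, _ in leaf_depths(node)}
        if (node.support >= min_support and len(patients) >= 2
                and median(pairwise_distances(node)) < threshold):
            clusters.append(sorted(patients))
            return  # assumption: an accepted sub-tree is not split further
        for child in node.children:
            visit(child)

    visit(root)
    return clusters, threshold


def leaf(patient, length):
    return Node(length=length, patient=patient)


# Toy tree: one tight, well-supported pair; one long pair; one poorly supported pair.
ab = Node(0.05, 0.95, children=[leaf("A", 0.01), leaf("B", 0.01)])
cd = Node(0.05, 0.99, children=[leaf("C", 0.10), leaf("D", 0.10)])
ef = Node(0.05, 0.50, children=[leaf("E", 0.01), leaf("F", 0.01)])
root = Node(support=1.0, children=[ab, cd, ef])

clusters, threshold = partition(root, percentile=0.25)
print(clusters, round(threshold, 2))
```

In this toy tree only the A/B pair passes both filters: C/D is well supported but too distant, and E/F is close but falls under the reliability cut-off. The all-pairs distance computation is quadratic in the number of leaves, so a sketch like this would need a more economical distance strategy for a phylogeny of thousands of sequences.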
Figure 2. Phylogeny of Italian HIV-1 subtype B pol isolates.
Maximum-likelihood phylogenetic tree of 11,541 HIV-1 subtype B pol gene sequences from the Italian ARCA cohort. The tree is rooted on subtype J and depicted using three-dimensional hyperbolic geometry. Nodes and leaves are highlighted by yellow points.
Figure 3. Tree topology.
Topological analysis of a maximum-likelihood phylogenetic tree composed of 11,541 HIV-1 subtype B pol sequences from the Italian ARCA cohort, rooted on subtype J. Median (interquartile range) branch length (blue) and number of nodes (red) for each tree level are depicted.
Study population.
| | n | % | | |
|---|---|---|---|---|
| Sequences | 11,541 | 100 | | |
| Patients | 7,350 | 63.69 | | |
| Patients with known seroconversion date | 258 | 3.51 | | |
| Patients with known date of first HIV-1 positive test | 2,720 | 37.01 | | |
| | | | | |
| Central Italy | 3,274 | 44.54 | | |
| Northern Italy | 3,528 | 48.00 | | |
| Southern Italy | 443 | 6.03 | | |
| Unknown | 105 | 1.43 | | |
| | | | | |
| Injecting drug user | 1,418 | 19.29 | | |
| Heterosexual | 1,439 | 19.58 | | |
| Male homosexual | 1,313 | 17.86 | | |
| Other/unknown | 3,180 | 43.27 | | |
| | | | | |
| Female | 1,677 | 22.82 | | |
| Male | 5,111 | 69.54 | | |
| Unknown | 562 | 7.65 | | |
| | | | | |
| Italy | 4,210 | 57.28 | | |
| Other countries | 229 | 3.12 | | |
| Unknown | 2,911 | 39.60 | | |
| | | | | |
| ART-naive | 1,287 (1,116) | 11.15 (15.18) | | |
| ART-experienced | 6,962 (3,603) | 60.32 (49.02) | | |
| Unknown ART status | 3,292 (2,631) | 28.52 (35.80) | | |
| Presence of at least one resistance mutation for a specific drug class (one sequence per patient) | Considering mixtures of resistant mutants and wild types at specific positions | | Not considering mixtures of resistant mutants and wild types at specific positions | |
| | | | | |
| ART-naive | 246 | 22.04% | 164 | 14.69% |
| ART-experienced | 2,911 | 80.79% | 2,657 | 73.74% |
| | | | | |
| ART-naive | 125 | 11.20% | 109 | 9.77% |
| ART-experienced | 2,712 | 75.27% | 2,512 | 69.72% |
| | | | | |
| ART-naive | 145 | 12.99% | 76 | 6.81% |
| ART-experienced | 1,485 | 41.21% | 1,147 | 31.83% |
| | | | | |
| ART-naive | 69 | 6.18% | 55 | 4.93% |
| ART-experienced | 1,312 | 36.41% | 1,135 | 31.50% |
| | Median | IQR | | |
| Sequence year | 2004 | 2002–2007 | | |
| Number of sequences per patient | 1 | 1–2 | | |
| | | | | |
| ART-naive | 4.64 | 4.04–5.23 | | |
| ART-experienced | 3.98 | 3.32–4.65 | | |
| | | | | |
| ART-naive | 361 | 196–556 | | |
| ART-experienced | 334 | 191–504 | | |
| | | | | |
| ART-naive | 37 | 31–44 | | |
| ART-experienced | 41 | 37–46 | | |
| | | | | |
| ART-naive | 0 | 0–1 | | |
| ART-experienced | 10 | 6–15 | | |
ART, antiretroviral therapy; IQR, interquartile range; NNRTI, non-nucleoside reverse transcriptase inhibitors; NRTI, nucleoside/nucleotide reverse transcriptase inhibitors; PI, protease inhibitors.
Characteristics of subtype B HIV-1 infected patients enrolled in the Italian ARCA cohort.
Demographic factors in transmission clusters.
| | n | % | n | % | P value |
|---|---|---|---|---|---|
| Northern Italy | 183 | 31.28 | 96 | 17.30 | <0.0001 |
| Central Italy | 146 | 24.96 | 90 | 16.22 | 0.0009 |
| Southern Italy | 8 | 1.37 | 7 | 1.26 | 0.8985 |
| Northern and Central Italy | 158 | 27.01 | 230 | 41.44 | <0.0001 |
| Northern and Southern Italy | 19 | 3.25 | 13 | 2.34 | 0.4504 |
| Central and Southern Italy | 9 | 1.54 | 12 | 2.16 | 0.5302 |
| Northern, Central and Southern Italy | 62 | 10.60 | 107 | 19.28 | 0.0001 |
| | | | | | |
| Male homosexual | 135 | 27.05 | 77 | 15.88 | <0.0001 |
| Heterosexual | 93 | 18.64 | 72 | 14.85 | 0.1794 |
| IDU | 56 | 11.22 | 79 | 16.29 | 0.0415 |
| Male homosexual and heterosexual | 99 | 19.84 | 48 | 9.90 | <0.0001 |
| Male homosexual and IDU | 13 | 2.61 | 46 | 9.48 | <0.0001 |
| Heterosexual and IDU | 50 | 10.02 | 42 | 8.66 | 0.5564 |
| Male homosexual, heterosexual and IDU | 53 | 10.62 | 121 | 24.95 | <0.0001 |
| | | | | | |
| Naive | 99 | 19.08 | 48 | 9.78 | 0.0001 |
| Experienced | 227 | 43.74 | 231 | 47.05 | 0.3921 |
| Naive and experienced | 193 | 37.19 | 212 | 43.18 | 0.0883 |
| | | | | | |
| Yes | 153 | 26.15 | 102 | 18.65 | 0.0069 |
| No | 156 | 26.67 | 87 | 15.90 | <0.0001 |
| Yes and no | 276 | 47.18 | 358 | 65.45 | <0.0001 |
| | | | | | |
| Yes | 67 | 11.45 | 58 | 10.34 | 0.5998 |
| No | 243 | 41.54 | 183 | 32.62 | 0.0051 |
| Yes and no | 275 | 47.01 | 320 | 57.04 | 0.0020 |
| | | | | | |
| Yes | 61 | 10.43 | 34 | 6.09 | 0.0181 |
| No | 300 | 51.28 | 225 | 40.32 | 0.0007 |
| Yes and no | 224 | 38.29 | 299 | 53.58 | <0.0001 |
| | | | | | |
| Yes | 188 | 32.14 | 161 | 28.50 | 0.2574 |
| No | 119 | 20.34 | 58 | 10.27 | <0.0001 |
| Yes and no | 278 | 47.52 | 346 | 61.24 | <0.0001 |
| | | | | | |
| Below 3 | 143 | 34.71 | 76 | 18.86 | <0.0001 |
| Between 3 and 9 | 53 | 12.86 | 48 | 11.91 | 0.7235 |
| Between 9 and 14 | 33 | 8.01 | 49 | 12.16 | 0.0851 |
| Above 14 | 26 | 6.31 | 37 | 9.18 | 0.1964 |
| Mixtures with ≥2 factors | 157 | 38.11 | 193 | 47.89 | 0.0121 |
| | | | | | |
| Below 1,000 | 19 | 3.85 | 28 | 6.38 | 0.1300 |
| Between 1,000 and 10,000 | 55 | 11.16 | 61 | 13.90 | 0.2834 |
| Above 10,000 | 212 | 43.00 | 151 | 34.40 | 0.0169 |
| Mixtures with ≥2 factors | 207 | 41.99 | 199 | 45.33 | 0.4017 |
| | | | | | |
| Male homosexual | 130 | 26.69 | 76 | 15.90 | 0.0001 |
| Male heterosexual and homosexual | 66 | 13.55 | 19 | 3.97 | <0.0001 |
| Male heterosexual | 45 | 9.24 | 40 | 8.37 | 0.6846 |
| Male IDU | 36 | 7.39 | 48 | 10.04 | 0.2173 |
| Female and male heterosexual | 22 | 4.52 | 10 | 2.09 | 0.0649 |
| Female and male heterosexual, male homosexual | 21 | 4.31 | 8 | 1.67 | 0.0338 |
| Other mixtures | 167 | 34.29 | 277 | 57.95 | <0.0001 |
| | | | | | |
| Naive and no-resistance | 78 | 15.00 | 30 | 5.93 | <0.0001 |
| Naive and resistance | 12 | 2.31 | 11 | 2.17 | 0.8985 |
| Naive, treated and no-resistance | 28 | 5.38 | 11 | 2.17 | 0.0169 |
| Naive, treated and resistance | 22 | 4.23 | 24 | 4.74 | 0.7247 |
| Other mixtures | 380 | 73.08 | 430 | 84.98 | <0.0001 |
ART, antiretroviral therapy; ARV, antiretroviral; IDU, injecting drug user; NNRTI, non-nucleoside reverse transcriptase inhibitors; NRTI, nucleoside/nucleotide reverse transcriptase inhibitors; PI, protease inhibitors.
Cluster composition by different demographic factors, using a clustering threshold of 0.07 nucleotide substitutions per site. Observed proportions were compared with those from a data randomization.
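As a rough illustration of how two such proportions could be compared, the sketch below runs a pooled two-proportion z-test on two of the geographic rows. The statistical test actually used is not stated in this record, so the z-test is an assumption and its p-values need not match the table. The denominators (585 and 555) are also assumptions, back-calculated from the reported counts and percentages of the geographic rows; `two_proportion_p` is a hypothetical helper name.

```python
import math


def two_proportion_p(x1, n1, x2, n2):
    """Two-sided p-value of a pooled two-proportion z-test (normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Denominators back-calculated as count / percentage (an assumption, see above).
N_FIRST, N_SECOND = 585, 555

p_north = two_proportion_p(183, N_FIRST, 96, N_SECOND)  # "Northern Italy" row
p_south = two_proportion_p(8, N_FIRST, 7, N_SECOND)     # "Southern Italy" row
print(f"Northern Italy: p = {p_north:.2e}")
print(f"Southern Italy: p = {p_south:.3f}")
```

Other row groups in the table have different denominators, so the same back-calculation would have to be repeated per group before reusing the helper.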
Factors associated with transmission clustering.
| | OR | CI | P | OR | CI | P | OR | CI | P |
|---|---|---|---|---|---|---|---|---|---|
| Calendar year of genotyping (per more recent) | 1.04 | (1.01–1.06) | 0.0011 | 0.98 | (0.96–1) | 0.1163 | 0.98 | (0.95–1.01) | 0.2626 |
| | | | | | | | | | |
| Northern | 0.52 | (0.46–0.6) | <0.0001 | 0.73 | (0.63–0.85) | <0.0001 | 1.09 | (0.85–1.39) | 0.4936 |
| Southern | 0.58 | (0.45–0.75) | <0.0001 | 0.78 | (0.6–1.01) | 0.0578 | 1.07 | (0.68–1.68) | 0.7860 |
| Unknown | 0.86 | (0.56–1.34) | 0.5151 | 1.05 | (0.63–1.76) | 0.8583 | 0.65 | (0.32–1.32) | 0.2341 |
| | | | | | | | | | |
| Heterosexual | 1.64 | (1.29–2.09) | <0.0001 | 1.08 | (0.86–1.36) | 0.4993 | 0.27 | (0.17–0.44) | <0.0001 |
| Male homosexual | 1.58 | (1.24–2.01) | 0.0002 | 1.05 | (0.82–1.33) | 0.7112 | 0.16 | (0.1–0.25) | <0.0001 |
| Other/unknown | 1.52 | (1.19–1.94) | 0.0007 | 1.05 | (0.84–1.32) | 0.6658 | 0.33 | (0.21–0.52) | <0.0001 |
| | | | | | | | | | |
| Male | 1.88 | (1.56–2.26) | <0.0001 | 1.00 | (0.83–1.19) | 0.9564 | 0.54 | (0.4–0.73) | <0.0001 |
| Unknown | 2.27 | (1.73–2.97) | <0.0001 | 1.08 | (0.82–1.42) | 0.5801 | 0.52 | (0.32–0.83) | 0.0058 |
| | | | | | | | | | |
| Other than Italy | 0.64 | (0.45–0.91) | 0.0142 | 0.53 | (0.37–0.75) | 0.0004 | 0.28 | (0.18–0.43) | <0.0001 |
| Unknown | 1.03 | (0.86–1.23) | 0.7390 | 1.13 | (0.93–1.36) | 0.2096 | 1.12 | (0.82–1.53) | 0.4867 |
| | | | | | | | | | |
| ≤36 | 1.49 | (1.23–1.81) | <0.0001 | 1.20 | (0.98–1.47) | 0.0790 | 1.49 | (1.1–2.04) | 0.0111 |
| >36 and ≤41 | 1.06 | (0.88–1.26) | 0.5617 | 1.00 | (0.83–1.21) | 0.9656 | 1.70 | (1.26–2.29) | 0.0006 |
| >41 and ≤46 | 0.99 | (0.84–1.18) | 0.9366 | 1.00 | (0.84–1.19) | 0.9965 | 1.44 | (1.1–1.9) | 0.0082 |
| Unknown | 1.02 | (0.82–1.26) | 0.8910 | 1.10 | (0.87–1.38) | 0.4303 | 1.50 | (1.01–2.21) | 0.0424 |
| | | | | | | | | | |
| ART-experienced | 0.68 | (0.56–0.84) | 0.0003 | 1.08 | (0.85–1.37) | 0.5361 | 1.18 | (0.82–1.7) | 0.3708 |
| Unknown | 0.72 | (0.59–0.89) | 0.0020 | 1.04 | (0.82–1.31) | 0.7447 | 0.93 | (0.65–1.32) | 0.6677 |
| | | | | | | | | | |
| ≤3 years | 2.56 | (1.87–3.51) | <0.0001 | 1.78 | (1.3–2.45) | 0.0004 | 1.33 | (0.75–2.36) | 0.3261 |
| >3 and ≤9 years | 1.70 | (1.27–2.26) | 0.0003 | 1.14 | (0.87–1.5) | 0.3524 | 0.87 | (0.52–1.45) | 0.6022 |
| >9 and ≤14 years | 1.30 | (1–1.7) | 0.0537 | 1.18 | (0.93–1.51) | 0.1804 | 0.99 | (0.62–1.57) | 0.9519 |
| Unknown | 1.38 | (1.04–1.81) | 0.0232 | 0.90 | (0.7–1.17) | 0.4394 | 0.77 | (0.47–1.24) | 0.2809 |
| NRTI | 0.68 | (0.6–0.78) | <0.0001 | 0.72 | (0.62–0.82) | <0.0001 | 0.99 | (0.8–1.24) | 0.9378 |
| NNRTI | 0.93 | (0.83–1.04) | 0.2271 | 0.99 | (0.88–1.12) | 0.8875 | 0.95 | (0.78–1.16) | 0.6196 |
| PI | 0.93 | (0.82–1.07) | 0.3092 | 0.80 | (0.7–0.92) | 0.0012 | 0.77 | (0.61–0.97) | 0.0274 |
| HIV-1 RNA per one Log10 copies per ml higher | 1.08 | (1.01–1.15) | 0.0344 | 1.01 | (0.94–1.09) | 0.7574 | 1.04 | (0.92–1.17) | 0.5722 |
| CD4+ count per 50 cells/mm3 higher | 1.03 | (1.01–1.04) | 0.0002 | 1.02 | (1.01–1.04) | 0.0057 | 1.03 | (1–1.06) | 0.0616 |
ART, antiretroviral therapy; CI, confidence interval; IDU, injecting drug user; NNRTI, non-nucleoside reverse transcriptase inhibitors; NRTI, nucleoside/nucleotide reverse transcriptase inhibitors; ns s−1, nucleotide substitutions per site; OR, odds-ratio; PI, protease inhibitors.
*Mixtures of wild-type and resistant mutants are classified as resistant
Adjusted OR of transmission clustering evidence (clustered versus un-clustered isolates) from fitting a multivariable logistic generalized-estimating-equations model, by considering different percentile thresholds.
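For readers less familiar with how odds ratios and their intervals relate to a logistic model, the sketch below converts a log-odds coefficient and its standard error into an OR, a Wald confidence interval and a two-sided p-value. It does not fit the generalized-estimating-equations model itself. The coefficient and standard error are hypothetical values chosen for illustration, and the 95% level (`z=1.96`) is an assumption, since the confidence level is not stated in this record.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Turn a log-odds coefficient and its standard error into (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def wald_p(beta, se):
    """Two-sided Wald p-value under a normal approximation."""
    return math.erfc(abs(beta / se) / math.sqrt(2))


# Hypothetical coefficient and standard error, for illustration only.
beta, se = math.log(1.5), 0.10
or_, lower, upper = odds_ratio_ci(beta, se)
p_value = wald_p(beta, se)
print(f"OR {or_:.2f} ({lower:.2f}-{upper:.2f}), p = {p_value:.4f}")
```

An interval that excludes 1 corresponds to a p-value below the matching significance level, which is how the OR, CI and P columns of the table can be read together.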