Hongshuo Song, Elena E Giorgi, Vitaly V Ganusov, Fangping Cai, Gayathri Athreya, Hyejin Yoon, Oana Carja, Bhavna Hora, Peter Hraber, Ethan Romero-Severson, Chunlai Jiang, Xiaojun Li, Shuyi Wang, Hui Li, Jesus F Salazar-Gonzalez, Maria G Salazar, Nilu Goonetilleke, Brandon F Keele, David C Montefiori, Myron S Cohen, George M Shaw, Beatrice H Hahn, Andrew J McMichael, Barton F Haynes, Bette Korber, Tanmoy Bhattacharya, Feng Gao.
Abstract
Recombination in HIV-1 is well documented, but its importance in the low-diversity setting of within-host diversification is less understood. Here we develop a novel computational tool, RAPR (Recombination Analysis PRogram), to enable a detailed view of in vivo viral recombination during early infection, and we apply it to near-full-length HIV-1 genome sequences from longitudinal samples. Recombinant genomes rapidly replace transmitted/founder (T/F) lineages, with a median half-time of 27 days, increasing the genetic complexity of the viral population. We identify recombination hot and cold spots that differ from those observed in inter-subtype recombinants. Furthermore, RAPR analysis of longitudinal samples from an individual with well-characterized neutralizing antibody responses shows that recombination helps carry forward resistance-conferring mutations in the diversifying quasispecies. These findings provide insight into molecular mechanisms by which viral recombination contributes to HIV-1 persistence and immunopathogenesis and have implications for studies of HIV transmission and evolution in vivo.
Year: 2018 PMID: 29765018 PMCID: PMC5954121 DOI: 10.1038/s41467-018-04217-5
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Highlighter plot and phylogenetic tree of 3′ half genome sequences at screening time for CH0010. Six distinct T/F viruses (labeled a–f) were identified. Each line represents a 3′ half genome sequence, and mutations from the major T/F (first sequence at the top) are color coded according to nucleotide
Fiebig stage, transmission route, days since infection, and number of T/F viruses of heterogeneously infected individuals
| Subject | Fiebig stage | Transmission route | Days from infection (95% CI) | No. of T/F viruses (5′ half) | No. of T/F viruses (3′ half) |
|---|---|---|---|---|---|
| CH0010 | I/II | Heterosexual | 9 (7, 11) | 3 | 6 |
| CH0078 | I/II | Heterosexual | 11 (7, 14) | 2 | 2 |
| CH0200 | I/II | Heterosexual | 15 (13, 17) | 3 | 5 |
| CH0047 | III | Heterosexual | 25 (21, 29) | 2 | 2 |
| CH0228 | III | Heterosexual | 24 (21, 28) | 2 | 3 |
| CH0275 | I/II | Heterosexual | 12 (8, 16) | 1 | 2 |
| CH0654 | I/II | MSM | 15 (13, 17) | 9 | 8 |
| CH1244 | IV | Heterosexual | 12 (11, 13) | NA | 2 |
| CH1754 | III | Heterosexual | 14 (12, 15) | 6 | 7 |
Days from infection were calculated using previously published methods[33]. These estimates generally correlate strongly with Fiebig stage, though occasional deviations occur, especially under strong selection. In our present dataset, CH1244 is one such exception: it is at Fiebig stage IV, yet its diversity is what would be expected, under a model of random accumulation of mutations, roughly 2 weeks into the infection.
MSM: men who have sex with men
Fig. 2 Identification of recombinants by RAPR. a Phylogenetic tree of 3′ half genome sequences from longitudinal samples from CH0010. Shaded regions show clonal expansion events and are color coded to indicate their closest T/F lineage. In each cluster, the solid circle represents the originating (de novo) recombinant for that cluster and open circles represent its recombinant descendants. b Recombination breakpoints. Each line represents a sequence, and colored lines represent recombinants. Colored intervals in each recombinant sequence indicate the parental T/F lineage. Black boxes indicate the interval where the breakpoints are most likely to have occurred, and when the breakpoints are statistically significant, the box is shaded in gray. Shaded regions show clonal expansion events and are color coded to indicate their closest T/F lineage. Dashed lines indicate recombinant descendants derived from a particular de novo recombinant, which is shown as a solid line in the same shaded group. The black stars on the right indicate the sequence lineage that RAPR identified as descendants of a single recombination event, while the program RECCO failed to recognize it as such
Fig. 3 Proportion of recombinants and their descendants in the viral population over time. The 3′ half genome (a) and 5′ half genome (b) sequences were analyzed separately. At each time point, colored dots indicate the percentages of T/F sequences and their lineages, red dots indicate all recombinants, red squares de novo recombinants, red circles descendants of recombinants, and open diamonds superinfection variants
Fig. 4 Clustering graph of the 3′ half genome sequences from CH0010. Each circle represents a sequence, and circles of the same color represent a T/F lineage. Gray circles represent de novo recombinant sequences and open circles represent descendants of recombinant sequences. Clusters are calculated using a greedy algorithm, and they originate either from a T/F lineage or through recombination across lineages. Recombination clusters are shaded and color coded to indicate their closest T/F lineage. At most one sequence per recombination cluster is called a de novo recombinant; the rest are assumed to be descendants of this recombinant sequence. Light blue lines point to recombinant parents and green lines to the recombinant child
Estimated rates of coinfection, the percentage of coinfected cells, and the half replacement time by recombinants during the infection (3′ half genome)
| Subject | Coinfection rate, /day (95% CI) | % Coinfected cells | T1/2, days (95% CI) |
|---|---|---|---|
| CH0010 | 0.067 (0.051, 0.093) | 3.3 | 23.3 (19.3, 27.6) |
| CH0078 | 0.088 (0.081, 0.095) | 4.2 | 47.3 (44.6, 50.5) |
| CH0200 | 0.056 (0.033, 0.094) | 2.7 | 29.5 (21.9, 41.8) |
| CH0047 | 0.014 (0.008, 0.022) | 0.7 | 39.2 (31.2, 55.1) |
| CH0228 | 0.103 (0.068, 0.153) | 4.9 | 27.0 (24.3, 31.0) |
| CH1754 | 0.149 (0.123, 0.177) | 6.9 | 19.8 (18.9, 21.0) |
| CH0654 | 0.035 (0.027, 0.044) | 1.7 | 27.4 (24.8, 30.6) |
| CH1244 | 0.175 (0.121, 0.27) | 8.0 | 19.5 (16.8, 22.8) |
| CH0275 | 0.024 (0.014, 0.039) | 1.2 | 114.0 (73.9, 172.0) |
| Median | 0.067 | 3.3 | 27.0 |
The rate of coinfection is given as γβI, and the percentage of coinfected cells during the infection is Fc = γβI/(γβI + γβT), with γβT = 2/day. T1/2 indicates the predicted time at which the frequency of recombinants reaches 50% of the viral population. The coinfection rate was estimated by fitting the 3′ and 5′ data simultaneously using maximum likelihood (see Methods for more detail). In CH0654, all sequences were recombinants at day 84, before the initiation of ART at day 112. Therefore, ART in CH0654 did not affect the analysis
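The footnote's formula for the fraction of coinfected cells can be checked against the table with a short calculation. The sketch below is illustrative only: the function and variable names are hypothetical, and the fixed value γβT = 2/day is taken from the footnote. Because the published rates are rounded, a recomputed percentage can differ from the tabulated one in the last digit.

```python
# Illustrative check of Fc = γβI / (γβI + γβT) from the table footnote.
# GAMMA_BETA_T is the fixed value stated in the footnote (2 per day);
# all names here are hypothetical and not taken from the RAPR software.
GAMMA_BETA_T = 2.0


def coinfected_fraction(coinfection_rate, gamma_beta_t=GAMMA_BETA_T):
    """Return the fraction of coinfected cells for a coinfection rate (per day)."""
    if coinfection_rate < 0 or gamma_beta_t <= 0:
        raise ValueError("rates must be non-negative, and gamma_beta_t positive")
    return coinfection_rate / (coinfection_rate + gamma_beta_t)


# Point estimates of the coinfection rate (per day) copied from the table.
rates = {
    "CH0078": 0.088,
    "CH0047": 0.014,
    "CH1754": 0.149,
    "CH1244": 0.175,
}

for subject, rate in rates.items():
    percent = 100 * coinfected_fraction(rate)
    print(f"{subject}: {percent:.1f}% coinfected cells")
```

The formula treats coinfection as one of two competing infection routes, so the percentage rises with the coinfection rate but saturates rather than growing linearly.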
Fig. 5 Comparison of new breakpoint and mutation rates. Each graph shows the accumulation rate of new breakpoints per sequence per nucleotide per day (blue lines), new mutations per sequence per nucleotide per day (red lines), and log-scale viral loads (green lines) in both half genomes over time for all nine participants. The upper viral load (VL) detection limit is 750,000 copies/ml. The gray vertical line in the CH0200 3′ half highlights the time point at which a superinfecting T/F was observed. The sequences from the last time point (day 432) in CH0654 were excluded from the analysis due to the initiation of ART at day 112
Fig. 6 Identification of recombination hotspots in the HIV-1 genome. Recombination hotspots are shown in dark orange and cold spots in blue across the 3′ half genome. Position numbering is relative to HXB2. Each line represents a position, and the thickness of the colored regions represents consecutive positions. The sequences from the last time point (day 432) in CH0654 were excluded from the analysis due to the initiation of ART at day 112
Fig. 7 Genetic p-distances within lineages, between lineages, and within recombinants. Pairwise genetic distances (p-distance) were calculated among recombinant viruses, variants evolved from the same T/F virus (intra-T/F), as well as variants from different T/F viruses (inter-T/F) for 3′ half (a) and 5′ half (b) genomes. For intra-T/F diversity, T/F lineages with fewer than three sequences were excluded. Pairwise genetic distances are shown as black dots, and their means as blue lines. All data were calculated by combining the viral sequences from all time points up to the time at which all of the viruses were recombinants. The middle line indicates the median of the diversity; the box shows the 25th to 75th percentiles of the data; and the whiskers show the 10th to 90th percentiles. The number of sequences for each pairwise comparison group is indicated at the top of the plot
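For readers unfamiliar with the measure used in Fig. 7, the p-distance between two aligned sequences is the proportion of compared sites at which they differ. The following is a minimal sketch, not the authors' pipeline: the function names are hypothetical, and skipping sites with a gap or ambiguous base (pairwise deletion) is an assumption, since the caption does not say how such sites were handled.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, skip_chars="-N"):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a character in skip_chars are ignored
    (pairwise deletion; an assumption, not stated in the source).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def pairwise_p_distances(sequences):
    """p-distance for every unordered pair within a group of aligned sequences."""
    return [p_distance(a, b) for a, b in combinations(sequences, 2)]


# Toy aligned sequences standing in for one comparison group (e.g., one lineage).
group = ["ACGT", "ACGA", "TCGA"]
distances = pairwise_p_distances(group)
print(distances)
print(sum(distances) / len(distances))  # group mean, as summarized per group in the figure
```

Applying `pairwise_p_distances` separately to intra-T/F, inter-T/F, and recombinant groups would give the per-group distributions that the box plots summarize.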
Fig. 8 Sequence and neutralization analysis of viruses from homogeneously infected CH0505. a Syn/nonsyn highlighter plot shows the synonymous (green) and nonsynonymous (red) mutations compared to the T/F sequences. b Autologous neutralization analysis. Multiple Env pseudoviruses were generated from seven time points from CH0505 over 2 years of infection. Neutralization susceptibility of 123 pseudoviruses was determined with 13 CH103-lineage antibodies (left panel), and neutralization susceptibility of 33 pseudoviruses was determined with 16 CH235-lineage antibodies (right panel). Neutralization magnitude is color coded by warm colors, with red being the strongest. Blue indicates that no neutralization activity was detected with neutralizing antibodies at the 50 µg/ml concentration. c Recombinant env genes from days 538 and 692 after infection. Recombinants and their breakpoints, as determined by RAPR, are shown. Recombination breakpoints are shown as black boxes, and they are filled when the breakpoints are significant