Heather E Grant, Emma B Hodcroft, Deogratius Ssemwanga, John M Kitayimbwa, Gonzalo Yebra, Luis Roger Esquivel Gomez, Dan Frampton, Astrid Gall, Paul Kellam, Tulio de Oliveira, Nicholas Bbosa, Rebecca N Nsubuga, Freddie Kibengo, Tsz Ho Kwan, Samantha Lycett, Rowland Kao, David L Robertson, Oliver Ratmann, Christophe Fraser, Deenan Pillay, Pontiano Kaleebu, Andrew J Leigh Brown.
Abstract
Recombination is an important feature of HIV evolution, occurring both within and between the major branches of diversity (subtypes). The Ugandan epidemic is primarily composed of two subtypes, A1 and D, that have been co-circulating for 50 years, frequently recombining in dually infected patients. Here, we investigate the frequency of recombinants in this population and the location of breakpoints along the genome. As part of the PANGEA-HIV consortium, 1,472 consensus genome sequences over 5 kb have been obtained from 1,857 samples collected by the MRC/UVRI & LSHTM Research unit in Uganda, 465 (31.6 per cent) of which were near full-length sequences (>8 kb). Using the subtyping tool SCUEAL, we find that of the near full-length dataset, 233 (50.1 per cent) genomes contained only one subtype, 30.8 per cent A1 (n = 143), 17.6 per cent D (n = 82), and 1.7 per cent C (n = 8), while 49.9 per cent (n = 232) contained more than one subtype (including A1/D (n = 164), A1/C (n = 13), C/D (n = 9), A1/C/D (n = 13), and 33 complex types). K-means clustering of the recombinant A1/D genomes revealed that a section of envelope (C2gp120-TMgp41) is often inherited intact, whilst a generalized linear model was used to demonstrate significantly fewer breakpoints in the gag-pol and envelope C2-TM regions compared with accessory gene regions. Despite similar recombination patterns in many recombinants, no clearly supported circulating recombinant form (CRF) was found; there was limited evidence of the transmission of breakpoints, and the vast majority (153/164; 93 per cent) of the A1/D recombinants appear to be unique recombinant forms. Thus, recombination is pervasive with clear biases in breakpoint location, but CRFs are not a significant feature, characteristic of a complex and diverse epidemic.
Keywords: HIV; breakpoints; genome; phylogenetics; recombination; subtypes
Year: 2020 PMID: 32395255 PMCID: PMC7204518 DOI: 10.1093/ve/veaa004
Source DB: PubMed Journal: Virus Evol ISSN: 2057-1577
Figure 1. Subtype distribution in the genomes of 5,000 bp and above (n = 1,472) and in the near full-length dataset of 8,000 bp and above (n = 465).
Figure 2. Maximum-likelihood reconstruction of the A1/D recombinants using IQ-TREE and their SCUEAL subtype (right). One triplet (Rec-105 to Rec-107) and a few cherries can be seen (e.g. Rec-153 and Rec-154). Some examples of convergent recombination patterns include Rec-116 and Rec-147, Rec-8 and Rec-160, Rec-29 and Rec-158.
Figure 3. Pairs of genomes linked by a genetic distance (TN93) of less than 2 per cent in two or more 300 bp windows along the genome. The matching windows are shown with open clear boxes, and the SCUEAL subtyping results for the genome pairs are in colour (blue for subtype A1 and orange for subtype D).
Figure 4. Recombination pattern of the A1/D recombinant genomes (n = 164). Genome position is on the x-axis and each horizontal bar is an individual genome recombination pattern. Segments of orange colour represent subtype D, while blue colouration represents subtype A1.
Figure 5. (a) Distribution of inter-subtype recombination breakpoints divided into 300 bp bins in A1/D recombinants (n = 164) and all other inter-subtype recombinant genomes (n = 68). Genome position numbering corresponds to the alignment as described in Section 2. (b) Distribution of breakpoints in the envelope region. Breakpoints have been binned into 100 bp regions and the finer sub-structure of gp120 and gp41 is shown.
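
The binning described for Figure 5 can be illustrated with a minimal sketch. It assumes 1-based alignment coordinates and labels each bin by its start position; the function name `bin_breakpoints` and the example positions are hypothetical and are not taken from the study.

```python
from collections import Counter


def bin_breakpoints(positions, bin_width=300):
    """Count breakpoints per fixed-width bin, keyed by the bin's start coordinate."""
    counts = Counter()
    for pos in positions:
        if pos < 1:
            raise ValueError("alignment positions are assumed to be 1-based")
        # Bin 1 covers positions 1..bin_width, bin 2 the next bin_width, and so on.
        bin_start = ((pos - 1) // bin_width) * bin_width + 1
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical breakpoint positions, for illustration only.
example_positions = [150, 290, 301, 5400, 5450, 5699, 8100]

coarse = bin_breakpoints(example_positions, bin_width=300)  # whole-genome view, as in panel (a)
fine = bin_breakpoints(example_positions, bin_width=100)    # finer view, as in panel (b)
```

Narrower bins resolve finer sub-structure but leave fewer breakpoints per bin, which is presumably why the 100 bp bins are used only for the envelope region.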
Beta estimates for the GLM on the log-odds scale.
| Parameter | Estimate | SE | z | P |
|---|---|---|---|---|
| Intercept (gene region = accessory) | −1.61635 | 0.05886 | −27.462 | <0.001 |
| Gene region = gag–pol | −0.92597 | 0.09147 | −10.123 | <0.001 |
| Gene region = env C2-TM | −1.40804 | 0.16688 | −8.437 | <0.001 |
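
Because the estimates are on the log-odds scale, they are easier to read once exponentiated. The sketch below converts the two region effects into odds ratios relative to the accessory-gene baseline (about 0.40 for gag-pol and 0.24 for env C2-TM) and, assuming a binomial GLM with a logit link, back-transforms each linear predictor into a fitted probability. The link function and the unit of observation are assumptions, since the table does not state them; the variable names are illustrative.

```python
import math

# Coefficients copied from the table above (log-odds scale).
INTERCEPT = -1.61635  # baseline level: accessory gene regions
EFFECTS = {
    "gag-pol": -0.92597,
    "env C2-TM": -1.40804,
}


def inv_logit(x):
    """Inverse of the logit link: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Odds of a breakpoint in each region relative to accessory gene regions.
odds_ratios = {region: math.exp(beta) for region, beta in EFFECTS.items()}

# Fitted probability per region (assumes a logit link, see lead-in).
fitted = {"accessory": inv_logit(INTERCEPT)}
for region, beta in EFFECTS.items():
    fitted[region] = inv_logit(INTERCEPT + beta)

for region, p in fitted.items():
    print(f"{region}: fitted probability = {p:.4f}")
for region, ratio in odds_ratios.items():
    print(f"{region} vs accessory: odds ratio = {ratio:.3f}")
```

Both odds ratios are below 1, which matches the abstract's statement that the gag-pol and envelope C2-TM regions carry significantly fewer breakpoints than the accessory gene regions.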