Paballo Nkone, Shayne Loubser, Thomas C Quinn, Andrew D Redd, Oliver Laeyendecker, Caroline T Tiemessen, Simnikiwe H Mayaphi.
Abstract
HIV-1 incidence is an important parameter for assessing the impact of HIV-1 interventions. The aim of this study was to evaluate HIV-1 polymerase (pol) gene sequence diversity for the prediction of recent HIV-1 infections. Complete pol Sanger sequences obtained from 45 participants confirmed to have recent or chronic HIV-1 infection were used. Shannon entropy was calculated for amino acid (aa) sequences for the entire pol and for sliding windows consisting of 50 aa each. Entropy scores for the complete HIV-1 pol were significantly higher in chronic compared to recent HIV-1 infections (p < 0.0001) and the same pattern was observed for some sliding windows (p-values ranging from 0.011 to <0.001), leading to the identification of some aa mutations that could discriminate between recent and chronic infection. Different aa mutation groups were assessed for predicting recent infection and their performance ranged from 64.3% to 100% but had a high false recency rate (FRR), which was decreased to 19.4% when another amino acid mutation (M456) was included in the analysis. The pol-based molecular method identified in this study would not be ideal for use on its own due to high FRR; however, this method could be considered for complementing existing serological assays to further reduce FRR.
Keywords: HIV-1 polymerase; Shannon entropy; chronic HIV-1 infection; false recency rate; recent HIV-1 infection; Sanger sequencing; sequence diversity
Year: 2022 PMID: 35891568 PMCID: PMC9324365 DOI: 10.3390/v14071587
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.818
Table 1. Demographics and summary of the amino acid mutations found to have a different distribution between recent and chronic infection sequences.
| Pt ID | HIV Stage | Sex | Age (years) | HIV VL (copies/mL) | CD4 Count (cells/µL) | I234 | I241 (IT) | M456 (IT) | T499 (IT) | E531 (ER) | E548 (ER) | E628 (ER) | R629 (ER) | E670 (EL) | E684 (EL) | I690 (EL) | L704 (EL) | L733 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 261 | R | M | 33 | | 386 | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2504 | R | F | 24 | | 287 | X | - | X | - | - | - | - | - | - | - | X | - | X |
| 5041 | R | M | 23 | | n/a | - | - | - | - | - | - | - | - | - | - | - | - | |
| 6512 | R | F | 23 | | 215 | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 6727 | R | F | 28 | | 706 | X | - | - | - | - | - | - | - | - | - | - | - | - |
| 6638 | R | F | 28 | | 457 | X | - | - | - | - | - | - | - | - | - | - | - | - |
| 6582 | R | F | 24 | | 818 | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 9498 | R | F | 34 | | 964 | - | - | - | X | - | - | - | - | - | X | - | - | - |
| 9049 | R | F | 20 | | n/a | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 8575 | R | F | 27 | | 668 | - | - | X | X | - | - | - | - | - | X | - | - | - |
| 8047 | R | M | 31 | | n/a | X | - | - | - | - | - | - | - | - | - | X | - | - |
| 7084 | R | F | 28 | | 411 | - | - | X | - | - | - | - | - | X | - | - | - | - |
| 6743 | R | F | 26 | | 638 | - | - | - | - | - | - | - | - | - | - | - | - | X |
| 6737 | R | F | 24 | | n/a | X | - | - | - | - | - | - | - | - | - | - | - | X |
| 639 | C | F | 30 | | 392 | X | - | - | - | X | - | - | - | - | - | - | X | - |
| 843 | C | F | 21 | | 348 | - | X | X | - | - | - | - | X | - | - | - | - | - |
| 1121 | C | F | 27 | | 127 | - | - | - | - | - | X | - | - | X | - | - | - | - |
| 1213 | C | F | 36 | | n/a | - | - | - | - | - | X | - | - | - | - | - | - | - |
| 1475 | C | F | 32 | | 255 | - | - | X | X | - | X | X | - | - | - | X | - | - |
| 2340 | C | F | 22 | | n/a | - | - | - | - | - | - | - | X | - | X | X | - | - |
| 2696 | C | F | 18 | | n/a | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 3253 | C | F | 28 | | n/a | - | - | - | - | X | - | - | X | X | X | X | - | - |
| 3387 | C | F | 37 | | n/a | - | - | X | - | - | X | - | - | - | - | - | - | - |
| 3474 | C | F | 21 | | 675 | - | - | X | - | - | - | - | - | X | - | - | - | - |
| 9986 | C | F | 28 | | 160 | - | X | X | - | - | - | - | - | - | - | X | - | - |
| 3606 | C | F | 24 | | 529 | - | - | - | - | - | - | - | - | - | - | - | X | - |
| 3869 | C | F | 32 | | n/a | - | - | X | - | - | - | - | - | - | - | X | - | - |
| 3880 | C | F | 30 | | 575 | - | - | - | - | - | - | - | - | X | - | - | X | - |
| 3910 | C | F | 33 | | n/a | - | - | - | - | X | - | X | - | - | - | - | X | - |
| 3912 | C | F | 27 | | 287 | - | - | - | - | - | - | - | X | X | - | - | - | - |
| 3920 | C | F | 20 | | 306 | - | - | X | - | - | - | - | X | - | - | - | - | - |
| 4351 | C | F | 35 | | n/a | - | X | X | - | X | - | - | - | - | - | - | - | - |
| 4198 | C | F | 19 | | n/a | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 5054 | C | F | 26 | | 72 | - | - | X | - | - | - | - | - | - | - | - | - | - |
| 6380 | C | F | 25 | | 385 | - | X | - | - | - | - | - | - | - | - | - | X | - |
| 6565 | C | F | 28 | | 371 | - | - | - | - | - | - | - | - | - | X | - | - | - |
| 6671 | C | F | 35 | | n/a | - | - | - | X | - | X | - | - | - | - | - | - | - |
| 6649 | C | F | 32 | | 164 | - | - | X | X | - | - | - | - | - | - | X | - | - |
| 6640 | C | F | 37 | | 407 | - | - | - | - | - | - | X | - | - | - | X | X | - |
| 6596 | C | F | 31 | | 576 | X | - | - | - | X | - | - | X | - | - | X | - | - |
| 9915 | C | F | 30 | | 382 | - | - | - | X | - | - | - | - | - | X | - | - | - |
| 9895 | C | F | 31 | | 607 | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 9854 | C | F | 20 | | n/a | - | - | X | - | - | - | - | - | X | - | - | X | - |
| 7959 | C | F | 40 | | 343 | - | - | X | X | - | - | X | - | X | X | - | - | - |
| 6990 | C | F | 26 | | 269 | - | - | - | X | - | - | - | - | - | - | - | - | - |
Different groups of amino acid mutations (ER, EL and IT) were named based on the first and last amino acid of the group. The amino acid mutations I234 and L733 had a higher frequency in recent infection but were not used for further analysis, as only a few of these mutations were identified during the recent HIV stage. There was no statistical significance (p > 0.05) for amino acid mutations with high frequency during chronic HIV-1 infection. Pt ID = participant identity; R = recent HIV; C = chronic HIV; F = female; M = male; n/a = not available.
Figure 1. Shannon entropy score analysis of the complete HIV polymerase gene amino acid sequences obtained during recent and chronic HIV-1 disease stages. (A) Higher diversity (entropy) was observed during chronic HIV-1 infection with site-by-site entropy analysis. (B) A scatterplot indicating a significant difference in the median entropy scores between the two stages of infection. Graphs were plotted using GraphPad Prism. HIV = Human immunodeficiency virus.
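As a rough illustration of the site-by-site analysis in panel A, the sketch below computes Shannon entropy for each column of an amino acid alignment. The toy sequences, the function names and the use of the natural logarithm are assumptions made for illustration; the caption does not state the log base or the tool used to compute entropy.

```python
import math
from collections import Counter

def site_entropy(column):
    """Shannon entropy (natural log, an assumption) of one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    # H = sum over residues of p * ln(1/p); a fully conserved site gives 0.0
    return sum((n / total) * math.log(total / n) for n in counts.values())

def alignment_entropy(sequences):
    """Per-site entropy for aligned amino acid sequences of equal length."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [site_entropy(col) for col in zip(*sequences)]

# Hypothetical toy alignments, not study data
recent = ["MKVLE", "MKVLE", "MKVLE", "MKVLD"]
chronic = ["MKVLE", "MRVLD", "MKILE", "MRVLD"]

h_recent = alignment_entropy(recent)
h_chronic = alignment_entropy(chronic)
print([round(h, 3) for h in h_recent])
print([round(h, 3) for h in h_chronic])
```

In this toy input the chronic group varies at more sites, so its summed entropy is higher, which is the direction of the difference the figure reports.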
Figure 2. Scatterplots of sliding windows (containing 50 amino acids each) used to screen for informative areas that can differentiate recent from chronic infection. The following windows had significantly different median entropy scores between recent and chronic HIV disease stages: 51–100, 151–200, 201–250, 301–350, 351–400, 401–450, 451–500, 501–550, 601–650, 651–700 and 701–750. Scatterplots were created using GraphPad Prism. HIV = Human immunodeficiency virus.
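The window labels in the caption (51–100, 151–200 and so on) suggest consecutive, non-overlapping blocks of 50 residues. Under that assumption, a minimal sketch of summarising per-site entropy scores by window could look like this; the function name and the input scores are hypothetical, and the statistical test used to compare the two stages is not specified in the caption, so none is shown.

```python
from statistics import median

def window_medians(site_scores, size=50):
    """Median score for consecutive, non-overlapping windows.

    Returns {(start, end): median} with 1-based inclusive positions.
    """
    result = {}
    for i in range(0, len(site_scores), size):
        chunk = site_scores[i:i + size]
        result[(i + 1, i + len(chunk))] = median(chunk)
    return result

# Hypothetical per-site entropy scores for a 120-residue stretch
scores = [0.0] * 50 + [0.2] * 50 + [0.5] * 20
wm = window_medians(scores)
for (start, end), value in wm.items():
    print(f"{start}-{end}: {value}")
```

The final window is allowed to be shorter than 50 residues, so a sequence whose length is not a multiple of the window size is still covered.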
Table 2. HIV-1 pol CTL epitopes and location of amino acid mutations with different distribution between recent and chronic infection stages.
| Amino Acid Site | CTL Epitope Mapped to | HIV-1 Protein (Epitope Position) | Subtype Identified for |
|---|---|---|---|
| I241 | NETPG | RT (137–145) | B |
| M456 | R | RT (356–366) | B, C |
| T499 | PIQKETWE | RT (392–401) | B |
| E531 | | RT (432–441) | C |
| E548 | R | RT (448–457) | -- |
| E628 | IKK | RT (526–534) | B |
| R629 | IKKE | RT (526–534) | B |
| E670 | QE | IN (9–19) | B, C |
| E684 | RAMAS | IN (20–28) | -- |
| I690 | FNLPP | IN (26–36) | A |
| L704 | VASCDKCQ | IN (37–45) | C |
CTL = cytotoxic T-lymphocyte; RT = reverse transcriptase; IN = integrase; -- = HIV-1 subtype for this epitope has not yet been defined. Underlined = mutant amino acids within these epitopes.
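The site numbers and epitope ranges in the table use different coordinate systems: sites are numbered along pol, while epitopes are numbered within RT or IN. The sketch below converts one to the other under two assumptions that the table does not state: pol numbering starts at the first protease residue, and the proteins have their HXB2 reference lengths (protease 99, RT 560, integrase 288 amino acids). The function and variable names are hypothetical.

```python
# Assumed lengths (amino acids) of the mature pol proteins in the HXB2 reference
POL_REGIONS = [("PR", 99), ("RT", 560), ("IN", 288)]

def pol_to_protein(pol_pos):
    """Map a 1-based pol position to (protein, 1-based position within it)."""
    if pol_pos < 1:
        raise ValueError("pol position must be >= 1")
    offset = 0
    for name, length in POL_REGIONS:
        if pol_pos <= offset + length:
            return name, pol_pos - offset
        offset += length
    raise ValueError("position lies beyond the end of pol")

# Epitope ranges copied from the table: pol site -> (protein, start, end)
EPITOPES = {
    241: ("RT", 137, 145), 456: ("RT", 356, 366), 499: ("RT", 392, 401),
    531: ("RT", 432, 441), 548: ("RT", 448, 457), 628: ("RT", 526, 534),
    629: ("RT", 526, 534), 670: ("IN", 9, 19), 684: ("IN", 20, 28),
    690: ("IN", 26, 36), 704: ("IN", 37, 45),
}

def inside_epitope(pol_pos):
    """True if the converted position falls inside the listed epitope range."""
    protein, pos = pol_to_protein(pol_pos)
    ep_protein, start, end = EPITOPES[pol_pos]
    return protein == ep_protein and start <= pos <= end

all_consistent = all(inside_epitope(p) for p in EPITOPES)
print(pol_to_protein(241), pol_to_protein(670), all_consistent)
```

With these assumed offsets, every listed site lands inside its stated epitope range, which supports the numbering assumption but does not prove it is the one the authors used.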
Table 3. Comparison of the frequency of highly informative amino acid mutations between study sequences and reference sequences.
| Study Sequences | n | I241 (IT) | M456 (IT) | T499 (IT) | E531 (ER) | E548 (ER) | E628 (ER) | R629 (ER) | E670 (EL) | E684 (EL) | I690 (EL) | L704 (EL) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Recent | 14 | 0 | 3 | 2 | 0 | 0 | 0 | 0 | 1 | 2 | 2 | 0 |
| Chronic | 30 | 4 | 12 | 6 | 5 | 5 | 4 | 6 | 7 | 5 | 8 | 7 |
| Reference sequences (recent infection) | | | | | | | | | | | | |
| SA [ | 21 | 2 | 5 | 1 | 1 | 0 | 0 | 4 | 5 | 3 | 5 | 3 |
| SA [ | 29 | 2 | 6 | 16 | 1 | 8 | 0 | 2 | 2 | 3 | 8 | 1 |
| Malawi [ | 25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 15 | 9 | 15 |
| India [ | 21 | 1 | 3 | 2 | 6 | 1 | 2 | 1 | 3 | 21 | 8 | 2 |
| Reference sequences (chronic infection) | | | | | | | | | | | | |
| SA [ | 75 | 3 | 23 | 21 | 7 | 8 | 5 | 5 | 7 | 12 | 15 | 12 |
| Malawi [ | 38 | 0 | 6 | 0 | 0 | 9 | 0 | 5 | 9 | 7 | 0 | 10 |
| Botswana [ | 23 | 7 | 12 | 10 | 11 | 6 | 3 | 14 | 6 | 15 | 19 | 0 |
Different groups of amino acid mutations (ER, EL and IT) were named based on the first and last amino acid of the group. Highly informative amino acid mutations are mutations that could discriminate between recent and chronic HIV-1 infections.
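One way to check whether a mutation's frequency differs between the recent and chronic study sequences is a two-sided Fisher's exact test on the counts in the table. The source does not name the test that was used, so the choice of Fisher's exact test is an assumption, as are the function and variable names. The example uses the E531 row values: 0 of 14 recent and 5 of 30 chronic sequences.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x mutation carriers in the first row
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is as likely as, or less likely than, the observed one
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# E531: recent 0 with / 14 without, chronic 5 with / 25 without (from the table)
p_e531 = fisher_exact_two_sided(0, 14, 5, 25)
print(round(p_e531, 3))
```

For these counts the p-value is roughly 0.16, so a mutation absent from all recent sequences can still fail to reach significance on its own at this sample size.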
Figure 3. Amino acid mutations (E531, E548, E628 and R629) identified in study sequences as highly conserved during the recent HIV-1 stage compared to the chronic HIV-1 stage. (A) Comparison of these amino acid mutations with subtype C reference sequences obtained during recent HIV-1 infection [21,22,23,24]. (B) Comparison of these amino acid mutations with subtype C reference sequences obtained during chronic HIV-1 infection [23,25,26]. Image created in WebLogo 3.7.4. HIV = human immunodeficiency virus.
Table 4. Performance of the different combinations of amino acid mutations for detecting recent HIV-1 infection.
| Stage | n | ER | ERM | EL | ELM | IT |
|---|---|---|---|---|---|---|
| <6 months | 14 | 14 (100%) | 11 (78.6%) | 9 (64.3%) | 9 (64.3%) | 10 (71.4%) |
| >6 months | 31 | 15 (48.4%) | 9 (29.0%) | 11 (35.5%) | 6 (19.4%) | 15 (48.4%) |
Different groups of amino acid mutations (ER, EL and IT) were named based on the first and last amino acid of the group. See Table 1 for the distribution of mutations in recent compared to chronic infection. The M456 mutation was used in combination with other amino acid mutations outside its group (see Table 1); this led to new groups designated ERM and ELM. <6 months = recent infection; >6 months = chronic infection.
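The percentages in the table can be reproduced with a simple rule, sketched below: a sequence is called recent when it carries none of the mutations in a group. This rule and the group memberships (IT = I241, M456, T499; ER = E531, E548, E628, R629; EL = E670, E684, I690, L704) are assumptions inferred from the naming convention and the mutation table, not stated explicitly in the source. The mutation sets are the X marks for the 14 recent participants in Table 1; the chronic counts are taken directly from the table above rather than recomputed.

```python
GROUPS = {
    "ER": {"E531", "E548", "E628", "R629"},
    "EL": {"E670", "E684", "I690", "L704"},
    "IT": {"I241", "M456", "T499"},
}
GROUPS["ERM"] = GROUPS["ER"] | {"M456"}
GROUPS["ELM"] = GROUPS["EL"] | {"M456"}

# Mutations marked X for each recent-infection participant (Pt ID -> set)
RECENT = {
    "261": set(), "2504": {"I234", "M456", "I690", "L733"}, "5041": set(),
    "6512": set(), "6727": {"I234"}, "6638": {"I234"}, "6582": set(),
    "9498": {"T499", "E684"}, "9049": set(), "8575": {"M456", "T499", "E684"},
    "8047": {"I234", "I690"}, "7084": {"M456", "E670"}, "6743": {"L733"},
    "6737": {"I234", "L733"},
}

def called_recent(mutations, group):
    """Assumed rule: recent if no mutation from the group is present."""
    return not (mutations & GROUPS[group])

def percent(count, n):
    return round(100 * count / n, 1)

recent_calls = {g: sum(called_recent(m, g) for m in RECENT.values())
                for g in GROUPS}
sensitivity = {g: percent(c, len(RECENT)) for g, c in recent_calls.items()}

# Chronic sequences called recent (n = 31), copied from the table
CHRONIC_CALLED_RECENT = {"ER": 15, "ERM": 9, "EL": 11, "ELM": 6, "IT": 15}
frr = {g: percent(c, 31) for g, c in CHRONIC_CALLED_RECENT.items()}

print(sensitivity)
print(frr)
```

Adding M456 to a group can only remove sequences from the "recent" call, which is why ERM lowers the false recency rate relative to ER at the cost of some recent infections being missed.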