Madara Hetti Arachchilage, Helen Piontkivska.
Abstract
The replication of human immunodeficiency virus-1 (HIV-1) requires reverse transcription of the viral RNA genome and integration of newly synthesized pro-viral DNA into the host genome. This is mediated by the viral proteins reverse transcriptase (RT) and integrase (IN). The formation and stabilization of the pre-integration complex (PIC), which is an essential step for reverse transcription, nuclear import, chromatin targeting, and subsequent integration, involves direct and indirect modes of interaction between RT and IN proteins. While epitope-based treatments targeting IN-viral DNA and IN-RT complexes appear to be a promising combination for an anti-HIV treatment, the mechanisms of IN-RT interactions within the PIC are not well understood due to the transient nature of the protein complex and the intrinsic flexibility of its components. Here, we identify potentially interacting regions between the IN and RT proteins within the PIC through the coevolutionary analysis of amino acid sequences of the two proteins. Our results show that specific regions in the two proteins have strong coevolutionary signatures, suggesting that these regions either experience direct and prolonged interactions between them that require high affinity and/or specificity or that the regions are involved in interactions mediated by dynamic conformational changes and, hence, may involve both direct and indirect interactions. Other regions were found to exhibit weak, but positive correlations, implying interactions that are likely transient and/or have low affinity. We identified a series of specific regions of potential interactions between the IN and RT proteins (e.g., specific peptide regions within the C-terminal domain of IN were identified as potentially interacting with the Connection domain of RT). Coevolutionary analysis can serve as an important step in predicting potential interactions, thus informing experimental studies. 
These studies can be integrated with structural data to gain a better understanding of the mechanisms of HIV protein interactions.
Keywords: HIV-1 integrase; HIV-1 reverse transcriptase; molecular coevolution; pre-integration complex; protein–protein interaction
Year: 2016 PMID: 27152230 PMCID: PMC4854294 DOI: 10.1093/ve/vew002
Source DB: PubMed Journal: Virus Evol ISSN: 2057-1577
Epitope clusters and non-epitope segments in HIV-1 RT and HIV-1 IN used in the study (residue coordinates are given per HXB2 amino acid coordinates).
| HIV-1 IN epitope cluster/non-epitope segment | IN start position | IN end position | IN fragment size (aa) | HIV-1 RT epitope cluster/non-epitope segment | RT start position | RT end position | RT fragment size (aa) |
|---|---|---|---|---|---|---|---|
| IN-EP1 | 1 | 8 | 8 | RT-EP1 | 1 | 13 | 13 |
| IN-NE1 | 9 | 15 | 7 | RT-EP2 | 18 | 26 | 9 |
| IN-EP2 | 16 | 43 | 28 | RT-NE1 | 27 | 32 | 6 |
| IN-NE2 | 44 | 65 | 22 | RT-EP3 | 33 | 53 | 21 |
| IN-EP3 | 66 | 93 | 28 | RT-NE2 | 54 | 72 | 19 |
| IN-EP4 | 96 | 121 | 26 | RT-EP4 | 73 | 82 | 10 |
| IN-EP5 | 123 | 132 | 10 | RT-NE3 | 83 | 92 | 10 |
| IN-EP6 | 135 | 143 | 9 | RT-EP5 | 93 | 115 | 23 |
| IN-NE3 | 144 | 164 | 21 | RT-EP6 | 118 | 135 | 18 |
| IN-EP7 | 165 | 234 | 70 | RT-EP7 | 137 | 187 | 51 |
| IN-NE4 | 235 | 241 | 7 | RT-NE4 | 188 | 194 | 7 |
| IN-EP8 | 242 | 271 | 30 | RT-EP8 | 195 | 210 | 16 |
| IN-NE5 | 272 | 288 | 17 | RT-NE5 | 211 | 243 | 33 |
| | | | | RT-EP9 | 244 | 318 | 75 |
| | | | | RT-NE6 | 319 | 332 | 14 |
| | | | | RT-EP10 | 333 | 350 | 18 |
| | | | | RT-EP11 | 354 | 366 | 13 |
| | | | | RT-NE7 | 367 | 374 | 8 |
| | | | | RT-EP12 | 375 | 401 | 27 |
| | | | | RT-NE8 | 402 | 410 | 9 |
| | | | | RT-EP13 | 411 | 457 | 47 |
| | | | | RT-NE9 | 458 | 494 | 37 |
| | | | | RT-EP14 | 495 | 505 | 11 |
| | | | | RT-NE10 | 506 | 519 | 14 |
| | | | | RT-EP15 | 520 | 544 | 25 |
| | | | | RT-NE11 | 545 | 552 | 8 |
| | | | | RT-EP16 | 553 | 560 | 8 |
a A cluster of overlapping best-defined CTL, T-helper, or Ab epitopes is defined as one epitope cluster.
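The fragment sizes in the table follow from inclusive HXB2 coordinates (size = end - start + 1). A minimal Python sketch of that arithmetic is given here, using the first six IN rows of the table; the `Segment` class and the `uncovered_gaps` helper are hypothetical names introduced for illustration and are not part of the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One epitope cluster or non-epitope segment (hypothetical container)."""
    name: str
    start: int  # first residue, HXB2 numbering, 1-based and inclusive
    end: int    # last residue, inclusive

    @property
    def size(self) -> int:
        # Inclusive coordinates: residues start..end contain end - start + 1 positions.
        return self.end - self.start + 1


# First six HIV-1 IN rows, copied from the table above.
IN_SEGMENTS = [
    Segment("IN-EP1", 1, 8),
    Segment("IN-NE1", 9, 15),
    Segment("IN-EP2", 16, 43),
    Segment("IN-NE2", 44, 65),
    Segment("IN-EP3", 66, 93),
    Segment("IN-EP4", 96, 121),
]


def uncovered_gaps(segments):
    """Return (first, last) residue ranges that lie between consecutive segments."""
    ordered = sorted(segments, key=lambda s: s.start)
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if nxt.start <= prev.end:
            raise ValueError(f"{prev.name} and {nxt.name} overlap")
        if nxt.start > prev.end + 1:
            gaps.append((prev.end + 1, nxt.start - 1))
    return gaps


if __name__ == "__main__":
    for seg in IN_SEGMENTS:
        print(seg.name, seg.size)
    print(uncovered_gaps(IN_SEGMENTS))
```

For these six rows the helper reports residues 94 to 95, which lie between IN-EP3 and IN-EP4 and are not assigned to any segment in the table.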
Figure 1. The distribution of the top coevolving region pairs in major domains of HIV-1 IN and RT proteins. Dotted lines represent potential interactions among thirty-seven region pairs that have a combined correlation coefficient of at least 0.5.
Top coevolving IN-RT epitope/non-epitope segment pairs, with corresponding structural domains of the two proteins and references to experimental studies of interactions.
| RT epitope cluster/non-epitope segment | Relevant RT domain | IN epitope cluster/non-epitope segment | Relevant IN domain | Combined correlation coefficient | Lower CI at 95% confidence level | Upper CI at 95% confidence level | Sample size | Identified by analysis of phylogenetically independent sister pairs | If top half of sliding windows in RT | If top half of sliding windows in IN | Experimental evidence (if available) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RT-EP1 | Finger-Palm | IN-NE5 | CTD | 0.622 | 0.224 | 0.843 | 997 | ||||
| RT-EP1 | Finger-Palm | IN-EP2 | NTD | 0.499 | 0.293 | 0.660 | 999 | | + | + | |
| RT-EP11 | Connection | IN-NE5 | CTD | 0.505 | 0.153 | 0.743 | 950 | | + | + | ( |
| RT-EP12 | Connection | IN-NE5 | CTD | 0.585 | 0.066 | 0.855 | 965 | + | ( | ||
| RT-EP12 | Connection | IN-EP2 | NTD | 0.529 | 0.145 | 0.774 | 941 | Yes | + | + | |
| RT-EP13 | Connection and RNaseH | IN-EP2 | NTD | 0.531 | 0.264 | 0.722 | 956 | Yes | + | + | ( |
| RT-EP15 | RNaseH | IN-EP2 | NTD | 0.532 | 0.373 | 0.661 | 1,000 | Yes | + | ||
| RT-EP15 | RNaseH | IN-NE5 | CTD | 0.522 | 0.236 | 0.725 | 999 | | + | + | |
| RT-EP15 | RNaseH | IN-EP4 | CCD | 0.503 | 0.322 | 0.648 | 1,000 | Yes | + | + | |
| RT-EP15 | RNaseH | IN-EP7 | CCD and CTD | 0.495 | 0.295 | 0.653 | 1,000 | Yes | + | ( | |
| RT-EP15 | RNaseH | IN-NE2 | NTD and CCD | 0.495 | 0.328 | 0.632 | 1,000 | | + | + | ( |
| RT-EP16 | RNaseH | IN-NE5 | CTD | 0.601 | 0.131 | 0.851 | 996 | | + | + | |
| RT-EP16 | RNaseH | IN-EP2 | NTD | 0.544 | 0.309 | 0.716 | 1,000 | | + | + | |
| RT-EP2 | Finger-Palm | IN-NE5 | CTD | 0.507 | 0.059 | 0.785 | 801 | | + | + | |
| RT-EP7 | Finger-Palm | IN-EP7 | CCD and CTD | 0.498 | 0.209 | 0.707 | 994 | Yes | + | + | ( |
| RT-EP8 | Finger-Palm | IN-NE5 | CTD | 0.658 | 0.135 | 0.894 | 997 | + | ( | ||
| RT-EP8 | Finger-Palm | IN-EP2 | NTD | 0.555 | 0.273 | 0.749 | 994 | + | |||
| RT-EP9 | Thumb | IN-EP2 | NTD | 0.598 | 0.466 | 0.705 | 1,000 | Yes | + | ( | |
| RT-EP9 | Thumb | IN-EP7 | CCD and CTD | 0.576 | 0.377 | 0.724 | 1,000 | Yes | + | + | |
| RT-EP9 | Thumb | IN-EP4 | CCD | 0.552 | 0.400 | 0.675 | 1,000 | Yes | + | + | |
| RT-EP9 | Thumb | IN-EP8 | CTD | 0.542 | 0.335 | 0.700 | 1,000 | Yes | + | ||
| RT-EP9 | Thumb | IN-NE5 | CTD | 0.530 | 0.257 | 0.725 | 998 | | + | + | |
| RT-EP9 | Thumb | IN-EP3 | CCD | 0.529 | 0.394 | 0.642 | 1,000 | Yes | + | + | |
| RT-EP9 | Thumb | IN-NE2 | NTD and CCD | 0.526 | 0.389 | 0.640 | 1,000 | Yes | + | ||
| RT-NE10 | RNaseH | IN-NE5 | CTD | 0.708 | 0.039 | 0.939 | 991 | | + | + | |
| RT-NE10 | RNaseH | IN-EP6 | CCD | 0.527 | 0.065 | 0.803 | 760 | | + | + | ( |
| RT-NE4 | Finger-Palm | IN-NE5 | CTD | 0.518 | 0.026 | 0.808 | 992 | | + | + | ( |
| RT-NE6 | Thumb and Connection | IN-NE5 | CTD | 0.736 | 0.283 | 0.921 | 998 | + | ( | ||
| RT-NE6 | Thumb and Connection | IN-EP6 | CCD | 0.540 | 0.132 | 0.792 | 781 | | + | + | |
| RT-NE6 | Thumb and Connection | IN-EP2 | NTD | 0.520 | 0.242 | 0.719 | 999 | | + | + | |
| RT-NE9 | RNaseH | IN-NE5 | CTD | 0.604 | 0.281 | 0.804 | 998 | | + | + | |
| RT-NE9 | RNaseH | IN-EP2 | NTD | 0.585 | 0.428 | 0.708 | 1,000 | Yes | + | + | |
| RT-NE9 | RNaseH | IN-EP4 | CCD | 0.565 | 0.358 | 0.718 | 1,000 | Yes | + | + | |
| RT-NE9 | RNaseH | IN-EP7 | CCD and CTD | 0.545 | 0.338 | 0.702 | 1,000 | Yes | + | ||
| RT-NE9 | RNaseH | IN-EP3 | CCD | 0.541 | 0.378 | 0.671 | 1,000 | Yes | + | + | |
| RT-NE9 | RNaseH | IN-EP8 | CTD | 0.508 | 0.255 | 0.696 | 998 | Yes | + | ||
| RT-NE9 | RNaseH | IN-NE2 | NTD and CCD | 0.497 | 0.318 | 0.642 | 1,000 | Yes | + | + | |
For each pair of regions, the combined correlation coefficient (with confidence intervals) is given. Only the top 15 per cent of pairs that have a combined correlation coefficient at or above the 0.5 threshold are listed.
a Sample size refers to the number of samples (out of 1,000) that had a significant Pearson correlation value in at least two-thirds of the total samples (i.e., 750 samples out of 1,000).
b IN/RT regions that are identified by analysis of phylogenetically independent sequence pairs.
c '+' designates whether the specific region has been identified as part of the top half of coevolving sliding windows by the number of coevolving regions (e.g., sliding windows in RT that coevolve with 5–8 IN regions; see Section 2 for further details).
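The confidence limits in the table are asymmetric around each coefficient, as is typical of intervals built on Fisher's z scale. The Python sketch below illustrates a Pearson correlation with such an interval under that assumption; it is not the authors' documented procedure. The names `pearson_r` and `fisher_ci` are hypothetical, and `n` is the number of paired observations behind one coefficient, which is not the same quantity as the "Sample size" column (a count of samples out of 1,000).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a Pearson r via Fisher's z-transformation.

    Assumption: n paired observations, n > 3, and |r| < 1.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                      # transform r to the z scale
    half_width = z_crit / math.sqrt(n - 3)  # standard error of z is 1/sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


if __name__ == "__main__":
    # Hypothetical values for illustration only; not data from the study.
    lower, upper = fisher_ci(0.5, 28)
    print(round(lower, 3), round(upper, 3))
```

Because the interval is symmetric on the z scale and then mapped back with `tanh`, the lower limit lies farther from a positive coefficient than the upper limit does, which matches the general shape of the intervals listed in the table.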