Mamuka Kvaratskhelia, Amit Sharma, Ross C Larue, Erik Serrao, Alan Engelman.
Abstract
Retroviral replication proceeds through an obligate integrated DNA provirus, making retroviral vectors attractive vehicles for human gene therapy. Though most of the host cell genome is available for integration, the process of integration site selection is not random. Retroviruses differ in their choice of chromatin-associated features and also prefer particular nucleotide sequences at the point of insertion. Lentiviruses including HIV-1 preferentially integrate within the bodies of active genes, whereas the prototypical gammaretrovirus Moloney murine leukemia virus (MoMLV) favors strong enhancers and active gene promoter regions. Integration is catalyzed by the viral integrase protein, and recent research has demonstrated that HIV-1 and MoMLV targeting preferences are in large part guided by integrase-interacting host factors (LEDGF/p75 for HIV-1 and BET proteins for MoMLV) that tether viral intasomes to chromatin. In each case, the selectivity of epigenetic marks on histones recognized by the protein tether helps to determine the integration distribution. In contrast, nucleotide preferences at integration sites seem to be governed by the ability of the integrase protein to locally bend the DNA duplex for pairwise insertion of the viral DNA ends. We discuss approaches to alter integration site selection that could potentially improve the safety of retroviral vectors in the clinic.
Year: 2014 PMID: 25147212 PMCID: PMC4176367 DOI: 10.1093/nar/gku769
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Features and domain organization of LEDGF/p75, HRP2 and BET proteins. (A) The N-terminal region of LEDGF/p75, which contains a PWWP domain, charged regions (CR) 1–3, nuclear localization signal (NLS), and AT-hooks, interacts with chromatin. Similar to LEDGF/p75, HRP2 contains an N-terminal PWWP domain and AT-hooks. HRP2 has an additional domain, termed the homology region III (HR3), that is conserved in multiple HRP2 homologs as well as in LEDGF/p75. The C-terminal regions of both proteins exhibit the IBD that directly interacts with lentiviral INs. (B) The BET proteins consist of BRD2, 3, 4 and T (not pictured). Whereas BRD3 is expressed as a single isoform, BRD2 is expressed as four isoforms (isoform 1 is pictured) and BRD4 as three isoforms (isoforms A and C are pictured; as compared to isoform C, isoform B harbors a unique 75 amino acid C-terminal tail that interacts with condensin II complexes; 183). Known domains and their respective start and end amino acid numbers are indicated. Two N-terminal bromodomains (BD I and II) and motifs A and B collectively contribute to high affinity chromatin binding. In the C-terminal region of the BET proteins, the conserved ET domain interacts with multiple proteins including MoMLV IN. Other domains include the SEED domain, which is present in all BET proteins, the BID, which is present in all BRD4 isoforms, and the CTM, which is unique to BRD4 isoform A.
Figure 2. HIV-1 and MoMLV IN similarities and differences. (A) Features and domain organization of HIV-1 and MoMLV INs. Retroviral INs consist of three conserved domains, the N-terminal domain (NTD, yellow), the catalytic core domain (CCD, gray) and the C-terminal domain (CTD, blue). Shown in red are the conserved amino acids of the catalytic triad (DDE) that coordinates Mg²⁺ and is responsible for 3′ processing and strand transfer activities. Also shown in blue letters is the Zn binding motif (HH-CC type) that helps to mediate IN multimerization. (B) Sequence alignment of mid region sections of retroviral IN CCDs from HIV-1 strain NL4–3 (GenBank accession code M19921.2), HIV-2 strain ROD (M15390), feline immunodeficiency virus (FIV, M25381.1), equine infectious anemia virus (EIAV, M16575.1) and MoMLV (NC_001501.1). Invariant residues across retroviral INs are shown in red (glutamic acid of the DDE catalytic triad) and blue (lysine that mediates binding to viral DNA; 103,184). Residues highlighted in black are identical across this alignment, whereas those highlighted in gray are conserved in minimally three of the sequences based on the following chemical groupings: G, A, S, T, P; M, V, L, I; F, Y, W; D, E, N, Q; K, R, H; C (185). IN residues that interact with the LEDGF/p75 IBD are highlighted by the nature of the contact: s for side chain and b for backbone (64). The rectangle highlights residues that compose the α4/5 connector region that lies between CCD α helices 4 and 5 and mediates several key contacts with LEDGF/p75 (108). (C) Ribbon diagram of the crystal structure of a dimer of the HIV-1 IN CCD (cyan and green) bound to the LEDGF/p75 IBD (gray). The carboxylate side chain of LEDGF/p75 residue Asp366 hydrogen bonds with the backbone amides of IN residues Glu170 and His171. The adjacent LEDGF/p75 residue Ile365 (not shown) predominantly interacts with the cyan IN molecule through hydrophobic contacts.
(D) Sequence alignment of the C-terminal tail regions of gammaretroviral INs from the following full-length molecular clones: MoMLV, MLV from the AKV mouse strain (J01998.1), feline leukemia virus (FeLV, NC_001940.1), gibbon ape leukemia virus (GaLV, NC_001885.2) and reticuloendotheliosis virus (REV) strain GD1210 (KF709431.1). Among these viruses, MoMLV and FeLV have been shown to favor TSSs during integration (7,186). These IN proteins have also been shown to bind BET proteins in vitro (19–21). Below the alignment is the consensus WxΦxxpxxPLbΦbΦxR sequence, where p stands for a small polar residue (S or T), b stands for a basic residue (R, K, or H), Φ stands for a small hydrophobic residue (M, V, I, or L) and x refers to a position that is not conserved across the alignment. (E) Ribbon diagram of the NMR structure of the BRD4 ET domain, with residues in red implicated in interacting with the MoMLV IN CTD as determined by chemical shift perturbations. These interactions were predominantly observed in helices 2 and 3 and the short loop connecting them (indicated by an arrow).
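The consensus notation in panel D can be read as a simple pattern over protein sequence. The following minimal Python sketch is an illustration only, not part of the original analysis: it translates the consensus string into a regular expression using the residue classes given in the caption. The names `CLASSES`, `consensus_to_regex` and `find_motif` are hypothetical, and the test sequence in the usage note is synthetic rather than a real IN sequence.

```python
import re

# Residue classes exactly as defined in the caption.
# Assumption: "x" (not conserved) is treated as any single uppercase letter.
CLASSES = {
    "x": "[A-Z]",
    "p": "[ST]",    # small polar
    "b": "[RKH]",   # basic
    "Φ": "[MVIL]",  # small hydrophobic
}

CONSENSUS = "WxΦxxpxxPLbΦbΦxR"


def consensus_to_regex(consensus):
    """Translate the consensus string into a compiled regular expression.

    Characters listed in CLASSES become character classes; every other
    character (W, P, L, R here) is matched literally.
    """
    return re.compile("".join(CLASSES.get(ch, re.escape(ch)) for ch in consensus))


MOTIF = consensus_to_regex(CONSENSUS)


def find_motif(seq):
    """Return (start_index, matched_text) for each non-overlapping match in seq."""
    return [(m.start(), m.group()) for m in MOTIF.finditer(seq.upper())]
```

For example, `find_motif("AAWAVAASAAPLKVRIARGG")` reports one match starting at index 2 of this synthetic string, whereas changing the invariant proline to glycine removes the match.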
Figure 3. Heatmaps depicting relationships between retroviral integration frequencies and histone post-translational modifications. For both panels (A and B), the integration site data sets are shown in columns with the histone post-translational modifications in the rows labeled to the left. The relationship between the integration site frequencies relative to matched random controls for each of the annotated histone post-translational modifications was quantified by the receiver operating characteristic (ROC) curve area method. The color key depicts enrichment or depletion of the annotated feature near integration sites. P-values are for individual integration site data sets compared to matched random controls, ***P < 0.001; **P < 0.01; *P < 0.05. (A) Integration frequencies of different retroviruses including MoMLV, HIV-1 and ASLV. (B) Integration frequencies of MoMLV with respect to histone post-translational modifications following treatment with either DMSO or the JQ-1 (500 nM) inhibitor. Figure adapted from (87).
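To make the ROC area statistic concrete, the sketch below computes it in its simplest pooled form: the probability that a randomly chosen integration site scores higher on a feature than a randomly chosen control site, with ties counted as one half. This is a simplification offered for illustration; it pools all controls rather than pairing each site with its own matched controls, and it is not the pipeline used to generate the figure. The function name `roc_area` and the per-site feature values are hypothetical.

```python
def roc_area(site_values, control_values):
    """Pooled ROC curve area for one feature.

    site_values:    feature value at each integration site (hypothetical
                    input, e.g. count of a histone mark within a window)
    control_values: the same feature at random control sites

    Returns a value in [0, 1]: above 0.5 indicates enrichment near
    integration sites, below 0.5 depletion, and 0.5 no difference.
    """
    if not site_values or not control_values:
        raise ValueError("both inputs must be non-empty")
    wins = 0.0
    for s in site_values:
        for c in control_values:
            if s > c:
                wins += 1.0
            elif s == c:
                wins += 0.5  # ties contribute one half
    return wins / (len(site_values) * len(control_values))
```

A value computed this way corresponds to one cell of such a heatmap; significance testing against the controls is a separate step that this sketch does not cover.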
Figure 4. Model depicting the bimodal interactions of LEDGF/p75 and BET proteins with corresponding HIV-1 and MoMLV intasomes and mononucleosomes containing select histone marks. (A) LEDGF/p75 (depicted in blue) is able to bind selectively and with high affinity to mononucleosomes through the cooperative binding of the PWWP domain with the H3K36me3 histone tail and the three charged regions (CR1–3) with the DNA (shown in red) wrapped around the histones (shown in gray). The C-terminal IBD of LEDGF/p75 is able to directly engage the HIV-1 intasome (depicted with a tetramer of HIV-1 IN in orange and viral DNA in a dark red single line). (B) A BRD protein (depicted in green) is able to bind selectively and with high affinity to mononucleosomes through the cooperative binding of the dual bromodomains with acetylated H3 and H4 histone tails (H4 acetylation depicted here) and motifs A and B with DNA (shown in red) wrapped around the histones (shown in gray). The C-terminal region of the BET protein is able to engage the MoMLV intasome (depicted with a tetramer of MoMLV IN in purple and viral DNA in a dark red single line) through its extra terminal (ET) domain, which binds to the C-terminal tail of MoMLV IN. The SEED domain does not directly contribute to these interactions but may play an accessory role in complex stability (87).