Dhanusha Yesudhas, Maria Batool, Muhammad Ayaz Anwar, Suresh Panneerselvam, Sangdun Choi.
Abstract
Transcription factors (TFs) are proteins that bind to specific DNA sites to regulate cell growth, differentiation, and development. The interactions between proteins and DNA are important for maintaining and expressing genetic information. Without knowing the structures and DNA-binding properties of TFs, it is difficult to fully understand the mechanisms by which genetic information is transferred between DNA and proteins. The increasing availability of structural data on protein-DNA complexes and recognition mechanisms provides deeper insight into the nature of protein-DNA interactions and therefore allows their manipulation. TFs recognize their cognate DNA through different mechanisms (direct and indirect readouts). In this review, we focus on these recognition mechanisms and on the analysis of the DNA-binding domains of stem cell TFs, discussing the relative roles of various amino acids in facilitating such interactions. Unveiling such mechanisms will improve our understanding of the molecular pathways through which TFs repress and activate gene expression.
Keywords: TF domain family; base and shape readouts; protein-DNA interaction; protein-DNA recognition
Year: 2017 PMID: 28763006 PMCID: PMC5575656 DOI: 10.3390/genes8080192
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. Protein-DNA recognition mechanisms. The three main protein-DNA recognition mechanisms are shown. In sliding (top), the transcription factor (pink ring) moves along the DNA from one base pair to the next without dissociating. In hopping (center), the transcription factor moves by dissociating from one site and re-associating with another. In intersegmental transfer (bottom), DNA bending or the formation of a DNA loop allows the protein to bind transiently to both sites and then move from one site to the other.
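The sliding and hopping modes described for Figure 1 can be illustrated with a toy one-dimensional model. The sketch below is not taken from the review: the function name `simulate_search`, the hop probability `p_hop`, and the choice to re-associate at a uniformly random site after a hop are all illustrative assumptions, and intersegmental transfer is not modeled.

```python
import random


def simulate_search(n_sites, target, start, p_hop, max_steps, seed=0):
    """Toy 1D target search on a DNA lattice of n_sites positions.

    Assumptions (hypothetical, for illustration only):
    - at each step the protein hops with probability p_hop, else slides;
    - a hop re-associates at a uniformly random site;
    - a slide moves one base pair left or right, clamped to the lattice.
    """
    rng = random.Random(seed)
    pos = start
    trajectory = [pos]
    for _ in range(max_steps):
        if pos == target:
            break  # cognate site found
        if rng.random() < p_hop:
            # Hopping: dissociate and re-associate at another site.
            pos = rng.randrange(n_sites)
        else:
            # Sliding: move to a neighbouring base pair without dissociating.
            pos = min(max(pos + rng.choice((-1, 1)), 0), n_sites - 1)
        trajectory.append(pos)
    return trajectory


if __name__ == "__main__":
    for p_hop in (0.0, 0.1):
        steps = len(simulate_search(200, 150, 10, p_hop, 100_000)) - 1
        print(f"p_hop={p_hop}: {steps} steps taken")
```

Setting `p_hop` to 0 gives pure sliding, and raising it mixes in hops. The step counts depend on the random seed and on the assumptions above, so they should not be read as quantitative predictions.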
Figure 2. Representative structures of transcription factor binding domains. The figure shows the crystal structures of different types of TF domains (PDB IDs 3l1p, 4m9e, 5d5v, 1lbg, 1gt0, and 1nkp). The structures were obtained from the Protein Data Bank (PDB) and redrawn using Chimera. The respective domains and important regions are labeled. HTH stands for helix-turn-helix domain. bHLH stands for basic helix-loop-helix motif. HD and HMG stand for homeodomain and high-mobility group box domain, respectively.
Different domain family proteins and their domain architectures.
| No. | Superfamily Proteins | Domain Motifs | Architecture of DNA-Binding Domains | Representative Protein |
|---|---|---|---|---|
| 1 | Winged HTH proteins | Helix-turn-helix | mainly α | hRFX1 |
| 2 | GCM domain | β-sheet | mixed α/β | WRKY transcription factor |
| 3 | Zinc-coordinating proteins | Zinc finger | mixed α/β | SIP1, FOG, Msn2p, A20, Klf4 |
| 4 | ββα Zinc-finger family | Zinc finger | mixed α/β | Egr-1 |
| 5 | Loop-sheet-helix family | Helix-turn-helix | mainly α | p53 |
| 6 | Leucine zipper family | Helix-loop-helix | mainly α | Jun, Fos |
| 7 | POU domain | Helix-turn-helix | mainly α | Oct1, Oct2, Oct4 |
| 8 | Copper-fist | Zinc finger | mixed α/β | Mac1 |
| 9 | Histone-fold | NA | mainly α | TBP, TAF proteins, HuCHRAC |
| 10 | ETS domain | Helix-turn-helix | mainly α | pointed-P2 |
| 11 | Bet v1-like | NA | mixed α/β | VASt |
| 12 | P-loop domain | NA | multidomain, mixed α/β | ARTS |
| 13 | TEA domain | NA | NA | Simian virus 40 (SV40) enhancer factor TEF-1 |
| 14 | LytTR domain | NA | NA | AlgR/AgrA/LytR family of transcription factors |
| 15 | Steroid receptor | Zinc finger | mixed α/β | NA |
| 16 | p53-like transcription factors, E-set domains, and Runt domain proteins | Immunoglobulin-like β-sandwich motif | mainly β | NF-κB and Rel |
| 17 | TATA-box binding protein-like | TBP (TATA-binding protein) β-sheet | mainly β | HMGB1, HMGB2 |
| 18 | DNA/RNA polymerases | NA | multidomain, mixed α/β | RNA polymerase I, II, III, IV and V |
| 19 | Ribbon-helix-helix | Ribbon-helix-helix | mixed α/β | CopG, NikR, ParG |
| 20 | HMG-box | Helix-turn-helix | mainly α | TCF-1, SRY |
| 21 | IHF-like DNA-binding proteins | NA | mixed α/β | HBsu |
| 22 | RNase A-like | NA | mixed α/β | Train A |
| 23 | TrpR-like | Helix-turn-helix | mainly α | TrpR like proteins |
| 24 | T4 endonuclease V | Helix-turn-helix | mainly α | RuvC protein |
| 25 | ARID-like | Helix-turn-helix | mainly α | SWI-SNF complex protein p270 |
The superfamily proteins were taken from the Structural Classification of Proteins (SCOP) database, and the information was retrieved and updated from Rohs’s work [13]. NA: not available.
Figure 3. Domain architectures of stem cell transcription factors. The arrangement of functional domains in the stem cell transcription factors Oct4, Sox2, Nanog, c-Myc, and Klf4 is shown. Each domain is marked with the length of its amino acid sequence. TAD stands for transactivation domain; HMG, HD, WR, HLH, and LZ stand for high-mobility group, homeodomain, tryptophan repeats, helix-loop-helix, and leucine zipper, respectively. NLS stands for nuclear localization sequence. MBI and MBII stand for Myc Boxes I and II, respectively. ND, CD1, and CD2 stand for N-terminal domain, C-terminal domain 1, and C-terminal domain 2, respectively. POU is derived from the names of the transcription factors that define the domain: the pituitary-specific Pit-1, the octamer-binding proteins Oct-1 and Oct-2, and the neural Unc-86 from Caenorhabditis elegans.
Figure 4. Cooperative binding of stem cell factors. The figure illustrates the cooperative binding of Sox2 and Oct4, as well as Sox2 and Nanog, to the enhancers/promoters of their target genes. The Oct4/Sox2 crystal structure was obtained from the PDB (1gt0), whereas the Sox2/Nanog structure was modeled using Chimera.