Shane C Dillon, Xing Zhang, Raymond C Trievel, Xiaodong Cheng.
Abstract
The SET-domain protein methyltransferase superfamily includes all but one of the proteins known to methylate histones on lysine. Histone methylation is important in the regulation of chromatin and gene expression.
Year: 2005 PMID: 16086857 PMCID: PMC1273623 DOI: 10.1186/gb-2005-6-8-227
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Table 1. Sites and functions of histone lysine methylation
| Histone lysine | Function(s) | Histone lysine methyltransferases* |
| H1 K26 | Transcriptional silencing | |
| H3 K4 | Transcriptional activation | |
| | Transcriptional activation and elongation | |
| | Transcriptional activation | |
| | Transcriptional activation (in conjunction with ASH1-mediated methylation of H3 K9 and H4 K20) | |
| H3 K9 | Heterochromatic and euchromatic silencing; DNA methylation | |
| | Euchromatic silencing; DNA methylation | |
| | Euchromatic silencing | |
| | Heterochromatic silencing; DNA methylation | |
| | Heterochromatic silencing; DNA methylation | |
| | Transcriptional activation (in conjunction with ASH1-mediated methylation of H3 K4 and H4 K20) | |
| H3 K27 | Euchromatic silencing | |
| | Euchromatic silencing | |
| H3 K36 | Transcriptional elongation and silencing | |
| | Transcriptional regulation | |
| H3 K79 | Demarcation of euchromatin | |
| H4 K20 | Cell cycle-dependent silencing, mitosis and cytokinesis [52,53] | |
| | Heterochromatic silencing | |
| | Transcriptional regulation | |
| | Transcriptional activation (in conjunction with ASH1-mediated methylation of H3 K4 and H3 K9) | |
| | Recruitment of checkpoint protein Crb2 to sites of DNA damage | |
Histone lysine methylation sites, functions, and associated histone lysine methyltransferases, which are listed according to the lysine that they methylate. *Species abbreviations: At, Arabidopsis thaliana; Dm, Drosophila melanogaster; Hs, Homo sapiens; Mm, Mus musculus; Nc, Neurospora crassa; Sc, Saccharomyces cerevisiae; Sp, Schizosaccharomyces pombe. Adapted from [54-56]; additional references listed in the table are those not cited in these reviews.
Figure 1. A protein sequence alignment of the SET domains of several representative histone lysine methyltransferases (HKMTs), grouped according to their histone-lysine specificity. All sequences are human, with the exceptions of Saccharomyces cerevisiae SET1 and SET2, Schizosaccharomyces pombe CLR4, and Neurospora DIM-5. See Table 2 for the family designations of each human protein shown. The alignment between SET7/9 and DIM-5 is based on their structures [16]. White text on a black background denotes invariant residues; black text on a gray background indicates conserved residues. Also indicated are the invariant residues involved in binding AdoMet and the target lysine, in catalysis, in the structural pseudoknot (see Figure 3), and in an intramolecular salt bridge, as well as the F/Y switch that controls whether the product is a mono-, di- or tri-methylated histone [57].
Table 2. Properties of some human SET-domain proteins
| Protein | Chromosomal location | Gene size (kb) | Number of coding exons | Protein size (amino acids) | Domains common to the family in addition to the SET domain | Domains unique to particular members | GenBank accession number |
| | | | | | Pre-SET (9 Cys, 3 Zn), post-SET (CXCX4C) | | |
| SUV39H1 | Xp11.23 | 12.4 | 6 | 412 | | 4 Cys, chromo | |
| SUV39H2 | 10p13 | 24 | 6 | 477 ( | | 4 Cys, chromo | |
| G9a | 6p21.33 | 17.3 | 28 | 1,210 | | E/KR-rich, NRSF-binding, ankyrin repeats | |
| GLP1 (EuHMT1) | 9q34.3 | 120 | 25 | 1,267 | | Same as G9a | |
| ESET (SETDB1) | 1q21.2 | 37 | 21 | 1,291 | | Tudor, MBD | |
| CLLL8 (SETDB2) | 13q14.2 | 40 | 14 | 719 | | MBD | |
| | | | | | Post-SET (CXCX4C) | | |
| MLL1 (HRX, ALL1) | 11q23.3 | 86 | 36 | 3,969 | | AT hook, bromo, PHD, CXXC | |
| HRX2 (MLL4) | 19q13.12 | 20 | 37 | 2,715 | | Same as above | |
| ALR (MLL2) | 12q13.12 | 34 | 54 | 5,262 | | PHD, ring finger | |
| MLL3 | 7q36.1 | 299 | 58 | 4,911 | | PHD, ring finger | |
| SET1 (ASH2) | 16p11.2 | 26 | 18 | 1,707 | | RRM, poly-S/E/P | |
| SET1L | 12q24.31 | 14 | 11 | 1,092 ( | | RRM, poly-S/E/P | |
| | | | | | Pre-SET (7-9 Cys); post-SET (CXCX4C) | | |
| WHSC1 (NSD2) | 4p16.3 | 79 | 21 | 1,365 | | PWWP, PHD, HMG, ring finger | |
| WHSCL1 (NSD3) | 8p12 | 73 | 23 | 1,437 | | PWWP, PHD, ring finger | |
| NSD1 | 5q35.3 | 160 | 23 | 2,696 | | PWWP, PHD, ring finger | |
| HIF1 (HYPB) | 3p21.31 | 106 | 19 | 2,061 | | WW | |
| ASH1 | 1q22 | 184 | 27 | 2,969 | | AT hook, bromo, BAH, PHD | |
| RIZ (PRDM2) | 1p36.21 | 86 | 9 | 1,719 | | C2H2 zinc finger | |
| BLIMP1 (PRDM1) | 6q21 | 19 | 6 | 789 | | C2H2 zinc finger | |
| | | | | | Post-SET (CXCX2C) | | |
| SMYD3 | 1q44 | 758 | 12 | 428 | | Zf-MYND | |
| SMYD1 | 2p11.2 | 43 | 9 | 490 | | Zf-MYND | |
| | | | | | Pre-SET (~15 Cys) | | |
| EZH1 | 17q21.2 | 26 | 19 | 747 | | 2 SANT | |
| EZH2 | 7q36.1 | 40 | 19 | 746 | | 2 SANT | |
| | | | | | Post-SET (CXCX2C) | | |
| SUV4-20H1 | 11q13.2 | 57 | 9 | 876 | | | |
| SUV4-20H2 | 19q13.42 | 8 | 8 | 462 | | | |
| SET7/9 | 4q31.1 | 45 | 8 | 366 | | MORN | |
| SET8 (PR-SET7) | 12q24.31 | 26 | 8 | 393 | | | |
The seven families of SET-domain proteins are classified according to the sequences surrounding their SET domain. *Complete human SUV39H2 and SET1L cDNAs are not available in current databases, but partial cDNA and genomic sequences corresponding to the mouse sequences (Mm) are present. For the pre-SET and post-SET domains, the number and (if known) the arrangement of cysteines in the domain are given. Domain abbreviations and definitions: ankyrin repeats, tandemly repeated modules of about 33 amino acids; AT hook, DNA-binding motif with a preference for A/T-rich regions; BAH, bromo-adjacent homology domain; bromo, bromodomain, which can interact specifically with acetylated lysines; chromo, chromatin organization modifier domain; CXXC, domain with two cysteines separated by two amino acids; E/KR-rich, glutamate- or lysine/arginine-rich domains; HMG, high mobility group domain; MBD, methyl-binding domain; MORN, membrane occupation and recognition nexus repeat; NRSF-binding, binds neuron-restrictive silencing factor/repressor element 1 silencing transcription factor; PHD, folds into an interleaved type of Zn-finger chelating two Zn ions; poly-S/E/P, runs of serine, glutamate or proline; PWWP, domain including a conserved Pro-Trp-Trp-Pro motif; RRM, RNA recognition motif; SANT, DNA-binding domain that specifically recognizes the sequence YAAC(G/T)G; Tudor, domain of unknown function present in several RNA-binding proteins; WW, contains two highly conserved tryptophans and binds proline-rich peptide motifs; Zf-MYND, 'myeloid, Nervy, DEAF-1' domain consisting of a cluster of cysteine and histidine residues.
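The cysteine arrangements quoted for the post-SET domains (CXCX4C and CXCX2C) use a compact shorthand in which X stands for any residue and a trailing number gives a repeat count. As a minimal sketch of how that shorthand reads, the Python below expands it into a regular expression and scans a sequence for matches. The function names and the toy sequence are hypothetical and are not taken from the article; domain databases generally rely on profile-based models rather than a single pattern, so this illustrates the notation only and is not a domain-detection method.

```python
import re


def motif_to_regex(shorthand):
    """Expand shorthand such as 'CXCX4C' into a regex such as 'C.C.{4}C'.

    X means any residue; a number after a letter is a repeat count.
    """
    parts = []
    for residue, count in re.findall(r"([A-Z])(\d*)", shorthand):
        token = "." if residue == "X" else residue
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)


def find_motif(sequence, shorthand):
    """Return 0-based start positions of all matches, overlapping ones included."""
    # A lookahead makes each match zero-width so overlapping hits are reported.
    pattern = re.compile(f"(?=({motif_to_regex(shorthand)}))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


# Made-up toy sequence for illustration; not a real protein.
toy_sequence = "MKACNCSPDLCGG"
print(motif_to_regex("CXCX4C"))
print(find_motif(toy_sequence, "CXCX4C"))
print(find_motif(toy_sequence, "CXCX2C"))
```

In the toy sequence the three cysteines are spaced to fit CXCX4C but not CXCX2C, so only the first search reports a position.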
Figure 2. Schematic representations of the gene and primary protein structures of two pairs of related SET-domain histone methyltransferases in the SUV39 family. (a) Human SUV39H1 (gene, mRNA and protein); (b) human SUV39H2 (gene and mRNA for comparison with SUV39H1); (c) human G9a (gene, mRNA and protein); (d) human GLP1 (EuHMT1; mRNA for comparison with G9a; the gene structure is not shown because of the large size of the intron). Black boxes in the genes and white boxes in the mRNAs denote exons; numbers above each gene are exon numbers; numbers within exons indicate their sizes in nucleotides; thin lines in the genes indicate the introns and the untranslated regions of the first and last exons (these are shown to scale with the length of the exons except where lines are broken). (a,c) Protein structures are shown on the same scale as the coding region of the corresponding mRNA, so that the corresponding exons for each protein domain can be directly aligned. Domains are indicated above protein structures, and the number of conserved cysteines (Cys) in each domain is also shown. Abbreviations: Chromo, chromodomain; E-rich, glutamate-rich domain; KR-rich, domain rich in lysine and arginine; NRSF-binding, a domain involved in binding neuron-restrictive silencing factor/repressor element 1 silencing transcription factor. (c,d) The intron-exon junctions indicated with asterisks are those that differ between G9a and GLP1.
Figure 3. Representative examples of SET-domain-containing structures. (a) Neurospora crassa DIM-5 (Protein Data Bank (PDB) code 1PEG.pdb); (b) human SET7/9 (1O9S.pdb). The pre-SET, SET, and post-SET domains in DIM-5 and the N-SET, SET, and C-SET domains in SET7/9 are indicated. The pseudoknot formed by two conserved SET motifs and the bound histone H3 peptide are also illustrated. The reaction byproduct AdoHcy is in stick representation and the zinc ions are shown as balls. N, amino terminus; C, carboxyl terminus.
Figure 4. Structures of the active sites of SET-domain protein methyltransferases. Hydrogen bonds are rendered as dashed lines, and residues are labeled with their single-letter amino-acid code and residue number. (a-c) The carbon atoms of substrates and products are illustrated in purple to distinguish them from protein residues (gray). (a) The cofactor-binding site of SET7/9 with bound AdoMet (PDB code 1N6A.pdb). (b) The lysine-binding pocket of SET7/9 in complex with methylated K4 (MeK4) of histone H3 from the crystal structure of the ternary complex SET7/9:AdoHcy:histone H3 MeK4 peptide (1O9S.pdb). (c) The lysine-binding channel of DIM-5 bound to K9 of histone H3 from the structure of the ternary complex DIM-5:AdoHcy:histone H3 peptide (1PEG.pdb). (d,e) Protein-substrate-binding clefts of SET-domain protein methyltransferases. The binding sites are rendered as transparent molecular surfaces. For clarity, the carbon atoms of histone H3 are depicted in cyan and the enzymes in green. (d) The protein-substrate-binding site of SET7/9 in complex with a histone H3 peptide from the structure of the ternary complex (1O9S.pdb). The white area is the methyltransfer pore. (e) Substrate-binding cleft of DIM-5 bound to a histone H3 peptide from the ternary complex structure (1PEG.pdb).