Manjeet Kumar1, Marc Gouw1, Sushama Michael1, Hugo Sámano-Sánchez1,2, Rita Pancsa3, Juliana Glavina4, Athina Diakogianni1, Jesús Alvarado Valverde1, Dayana Bukirova1,5, Jelena Čalyševa1,2, Nicolas Palopoli6, Norman E Davey7, Lucía B Chemes4, Toby J Gibson1.
Abstract
The eukaryotic linear motif (ELM) resource is a repository of manually curated, experimentally validated short linear motifs (SLiMs). Since the initial release almost 20 years ago, ELM has become an indispensable resource for the molecular biology community for investigating functional regions in many proteins. In this update, we have added 21 novel motif classes, made major revisions to 12 motif classes and added >400 new instances mostly focused on DNA damage, the cytoskeleton, SH2-binding phosphotyrosine motifs and motif mimicry by pathogenic bacterial effector proteins. The current release of the ELM database contains 289 motif classes and 3523 individual protein motif instances manually curated from 3467 scientific publications. ELM is available at: http://elm.eu.org.
Year: 2020 PMID: 31680160 PMCID: PMC7145657 DOI: 10.1093/nar/gkz1030
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. (A) Progression of the motif classes and instances integrated in the ELM resource. (B) Pie chart showing the count and proportion of new instances added from each motif class type in the current ELM release. (C) Bar plot showing the motif classes grouped according to the coverage of their instances by PDB structures; only one structure per instance was considered when calculating the coverage. In total, 164 ELM classes are covered by at least one structure. (D) Top 20 motif classes in terms of the number of representative PDB structures. The plots were generated using plotly chart studio (https://chartstudio.plot.ly).
Overview of the data stored in the ELM database
| | Functional sites | ELM classes | ELM instances | GO terms | PDB structures | ELM instances with affinity values | PubMed links |
|---|---|---|---|---|---|---|---|
| Total | 176 | 289 | 3523 | 791 | 516 | 265 | 3467 |
| By category | | LIG: 163 | Human: 2090 | Biological process: 430 | | | |
| | | MOD: 37 | Mouse: 341 | | | | |
| | | DOC: 31 | Rat: 150 | Cellular component: 163 | | | |
| | | DEG: 25 | Yeast: 110 | | | | |
| | | TRG: 22 | Fly: 98 | Molecular function: 198 | | | |
| | | CLV: 11 | Others: 734 | | | | |
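The per-category figures in the overview table can be checked against the "Total" row with simple arithmetic. The following is a minimal sketch of that check; the dictionary and function names are illustrative only, and the numbers are transcribed from the table rather than fetched from the ELM database.

```python
# Category breakdowns transcribed from the overview table.
elm_classes = {"LIG": 163, "MOD": 37, "DOC": 31, "DEG": 25, "TRG": 22, "CLV": 11}
elm_instances = {"Human": 2090, "Mouse": 341, "Rat": 150,
                 "Yeast": 110, "Fly": 98, "Others": 734}
go_terms = {"Biological process": 430, "Cellular component": 163,
            "Molecular function": 198}

# Values from the "Total" row of the same table.
totals = {"ELM classes": 289, "ELM instances": 3523, "GO terms": 791}


def breakdown_matches(breakdown, total):
    """Return True when the category counts add up to the reported total."""
    return sum(breakdown.values()) == total


checks = {
    "ELM classes": breakdown_matches(elm_classes, totals["ELM classes"]),
    "ELM instances": breakdown_matches(elm_instances, totals["ELM instances"]),
    "GO terms": breakdown_matches(go_terms, totals["GO terms"]),
}

if __name__ == "__main__":
    for column, ok in checks.items():
        print(f"{column}: {'consistent' if ok else 'MISMATCH'}")
```

All three columns with a breakdown (ELM classes, ELM instances, GO terms) add up to their reported totals; the remaining columns have no per-category figures to check.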
Novel and revised ELM classes since the last ELM publication
| ELM class identifier | Number of instances | ELM class (short) description |
|---|---|---|
| Novel ELM classes | | |
| LIG_SH2_CRK | 34 | CRK family SH2 domain binding motif |
| LIG_PDZ_Wminus1_1 | 27 | The C-terminal Trp-1 PDZ-binding motif is represented by a pattern like W(ACGILV)$. |
| LIG_SH2_STAP1 | 22 | STAP1 Src Homology 2 (SH2) domain Class 2 binding motif |
| LIG_SH2_NCK_1 | 17 | NCK Src Homology 2 (SH2) domain binding motif |
| LIG_PROFILIN_1 | 16 | The polyproline profilin-binding motif is found in regulators of actin cytoskeleton. |
| LIG_PCNA_yPIPBox_3 | 12 | The PCNA binding motifs include the PIP Box, PIP degron and the APIM motif, and are found in proteins involved in DNA replication, repair, methylation and cell cycle control. This is the variant for the yeast PIPbox. |
| LIG_REV1ctd_RIR_1 | 10 | Several DNA repair proteins interact with the C-terminal domain of the Rev1 translesion synthesis scaffold through the Rev1-Interacting Region RIR motif that is centered around two neighboring Phe residues. |
| LIG_IBAR_NPY_1 | 7 | A short NPY motif present in the bacterial effector protein Tir binds the I-BAR domain and is involved in actin polymerization. |
| LIG_MLH1_MIPbox_1 | 6 | Proteins involved in DNA repair and replication employ conserved MIP-box motifs to bind the C-terminal domain of mismatch repair protein MLH1. |
| LIG_FXI_DFP_1 | 5 | The DFP motif enables binding to the 2nd apple domain of coagulation factor XI (FXI) and plasma kallikrein heavy chain. |
| LIG_deltaCOP1_diTrp_1 | 5 | Tryptophan-based motifs enable targeting of the tethering and (dis)assembly factors to the C-terminal mu homology domain (MHD) of the coatomer subunit delta, delta-COP. |
| LIG_CaM_NSCaTE_8 | 3 | Short motif recognized by CaM that is only present in the Cav1.2 and Cav1.3 L-type calcium channels. |
| LIG_ARL_BART_1 | 2 | The ligand motif present in the N-terminal region of ARL2 and ARL3 proteins ensures GTP-dependent binding to BART and BARTL1. |
| LIG_PCNA_APIM_2 | 2 | The PCNA-binding APIM motif is found in proteins involved in DNA repair and cell cycle control. |
| MOD_PRMT_GGRGG_1 | 24 | A GGRGG motif recognized by the arginine methyltransferase for arginine methylation. |
| MOD_DYRK1A_RPxSP_1 | 22 | Serine/Threonine residue phosphorylated by Arginine and Proline directed DYRK1A kinase. |
| DOC_PP4_FxxP_1 | 15 | The FxxP-like docking motif recognized by the EVH1 domains of the PPP4R3 regulatory subunits of the PP4 holoenzyme. |
| DOC_PP4_MxPP_1 | 2 | The MxPP-like docking motif recognized by the EVH1 domains of the PPP4R3 regulatory subunits of the PP4 holoenzyme. |
| DOC_MAPK_GRA24_9 | 2 | A kinase docking motif that mediates interaction toward the ERK1/2 and p38 subfamilies of MAP kinases. |
| TRG_Pf-PMV_PEXEL_1 | 24 | Plasmodium Export Element, PEXEL, is a trafficking signal for protein cleavage by PMV protease and export from Plasmodium parasites to infected host cells. |
| TRG_ER_FFAT_2 | 7 | A variant of the classic MSP-domain binding FFAT (diphenylalanine [FF] in an Acidic Tract) motif. |
| ELM classes with major revisions | | |
| LIG_CaM_IQ_9 | 75 | Helical peptide motif responsible for Ca2+-independent binding of the CaM. |
| LIG_SH2_GRB2like | 35 | GRB2-like Src Homology 2 (SH2) domain binding motif. |
| LIG_LIR_Gen_1 | 21 | Canonical LIR motif that binds to Atg8 protein family members to mediate processes involved in autophagy. |
| LIG_PCNA_PIPBox_1 | 19 | The PCNA-binding PIP Box motif is found in proteins involved in DNA repair and cell cycle control. |
| LIG_Vh1_VBS_1 | 15 | An amphipathic α-helix recognized by the head domain of vinculin that is required for vinculin activation and actin filament attachment. |
| LIG_IRF3_LxIS_1 | 7 | A binding site for IRF-3 protein present in various innate adaptor proteins and the viral protein NSP1 to trigger the innate immune responsive pathways. |
| MOD_CK2_1 | 34 | Casein kinase 2 (CK2) phosphorylation site. |
| MOD_CK1_1 | 27 | CK1 phosphorylation site. |
| MOD_CDK_SPxK_1 | 26 | Canonical version of the CDK phosphorylation site that shows specificity toward a lysine/arginine residue at the [ST]+3 position. |
| MOD_CAAXbox | 17 | Generic CAAX box prenylation motif. |
| DOC_CyclinA_RxL_1 | 28 | This motif is mainly based on cyclin A binding peptides and may not apply to all cyclins. |
| TRG_ER_FFAT_1 | 29 | MSP-domain binding FFAT (diphenylalanine [FF] in an Acidic Tract) motif. |
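Several descriptions in the table express a motif as a short sequence pattern, for example "W(ACGILV)$" for the C-terminal Trp-1 PDZ-binding motif and a lysine/arginine at the [ST]+3 position for the CDK site. The sketch below shows how such patterns could be scanned against a protein sequence with regular expressions. It rests on several assumptions: the parenthesized residue set is read as a character class, the CDK pattern is a simplification derived only from the one-line description, and the sequence is a made-up toy peptide. These are not the curated regular expressions stored in ELM, and a match is only a candidate, not evidence of function.

```python
import re

# Illustrative patterns only (assumptions, not ELM's curated regexes):
# - Trp at position -1 followed by one of ACGILV at the C-terminus.
# - Ser/Thr, then Pro, any residue, then Lys/Arg at the [ST]+3 position.
ILLUSTRATIVE_PATTERNS = {
    "LIG_PDZ_Wminus1_1": re.compile(r"W[ACGILV]$"),
    "MOD_CDK_SPxK_1": re.compile(r"[ST]P.[KR]"),
}


def find_motifs(sequence, patterns=ILLUSTRATIVE_PATTERNS):
    """Return (class, start, end, matched peptide) tuples, 1-based inclusive.

    Uses re.finditer, so overlapping matches of the same pattern are not
    reported.
    """
    hits = []
    for name, pattern in patterns.items():
        for match in pattern.finditer(sequence):
            hits.append((name, match.start() + 1, match.end(), match.group()))
    return hits


if __name__ == "__main__":
    toy_sequence = "MAGSPEKLLDDWV"  # hypothetical peptide, not a real protein
    for hit in find_motifs(toy_sequence):
        print(hit)
```

A pattern match on its own says little: short patterns occur frequently by chance, which is why the cell compartment and species context used when querying the ELM server matter for filtering candidates.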
Figure 2. Structural information on representative DNA damage and repair motif instances and classes added in the current ELM update. (A) Structure of the PCNA trimer in complex with the PIP box of ZRANB3 [PDB ID: 5MLO] (77). (B) Close-up of the structure of the PCNA PIP-binding pocket in complex with the PIP box of p21 [PDB ID: 1AXC] (34). (C) Close-up of the structure of the PCNA PIP-binding pocket in complex with the APIM of ZRANB3 [PDB ID: 5YD8] (33). The blue residue in panels (B) and (C) shows the rearrangement of leucine 126 in the PIP-binding pocket to accommodate the APIM peptide. (D) Close-up of the structure of the Rev1 C-terminal domain with the RIR motif of DNA polymerase kappa [PDB ID: 4FJO] (78). (E) Close-up of the structure of the C-terminal domain of the yeast MUTL alpha (MLH1/PMS1) bound to the MIP box motif of Exo1 [PDB ID: 4FMO] (79). (F) Peptides from the structures of panels (A–E) aligned around their core hydrophobic residues. Underlined residues define the motif consensus residues in the peptide. Structural figures were prepared using the UCSF Chimera software (80).
Figure 3. Setting up the ELM server correctly to query bacterial effectors for SLiM candidates, using as an example the IDP-rich TarP effector from Chlamydophila caviae, whose natural host is the guinea pig. TarP is extracellular for the bacterium, but the correct cell compartment to use is the cytosol of the host cell. The correct species is the host, Cavia porcellus. In the output, the three recently added VBS motifs (41) are shown as red ovals. All other motif matches are hypothetical.
Figure 4. Motif-mediated interactions of the Actin Cytoskeleton network. The KEGG resource network for Regulation of Actin Cytoskeleton (KEGG:hsa04810) is color-coded by ELM motif classes. Proteins of the pathway have a light mint green color by default. Motif-containing proteins are re-colored as follows: DOC class (docking sites) - moderate blue; LIG class (ligand binding motifs) - vivid orange; MOD class (modification sites) - soft pink; DEG class (degradation sites) - yellow; CLV class (cleavage sites) - very soft blue; TRG class (targeting sites) - pure orange. Proteins with motifs belonging to multiple classes are marked with the respective colors, as described in the bottom right of the figure. ELM has instances for pathogen hijack of actin polymerization at VCL, IRSp53, NWASP and Actin itself. The pathogen proteins affecting these hotspots are shown in the rounded boxes with a light orange background.