Holger Dinkel, Sushama Michael, Robert J Weatheritt, Norman E Davey, Kim Van Roey, Brigitte Altenberg, Grischa Toedt, Bora Uyar, Markus Seiler, Aidan Budd, Lisa Jödicke, Marcel A Dammert, Christian Schroeter, Maria Hammer, Tobias Schmidt, Peter Jehl, Caroline McGuigan, Magdalena Dymecka, Claudia Chica, Katja Luck, Allegra Via, Andrew Chatr-Aryamontri, Niall Haslam, Gleb Grebnev, Richard J Edwards, Michel O Steinmetz, Heike Meiselbach, Francesca Diella, Toby J Gibson.
Abstract
Linear motifs are short, evolutionarily plastic components of regulatory proteins and provide low-affinity interaction interfaces. These compact modules play central roles in mediating every aspect of the regulatory functionality of the cell. They are particularly prominent in mediating cell signaling, controlling protein turnover and directing protein localization. Given their importance, our understanding of motifs is surprisingly limited, largely as a result of the difficulty of discovery, both experimentally and computationally. The Eukaryotic Linear Motif (ELM) resource at http://elm.eu.org provides the biological community with a comprehensive database of known experimentally validated motifs, and an exploratory tool to discover putative linear motifs in user-submitted protein sequences. The current update of the ELM database comprises 1800 annotated motif instances representing 170 distinct functional classes, including approximately 500 novel instances and 24 novel classes. Several older motif class entries have also been revisited, improving annotation and adding novel instances. Furthermore, the addition of full-text search capabilities, an enhanced interface and simplified batch download has improved the overall accessibility of the ELM data. The motif discovery portion of the ELM resource has added conservation, and structural attributes have been incorporated to help users discriminate biologically relevant motifs from stochastically occurring non-functional instances.
Year: 2011 PMID: 22110040 PMCID: PMC3245074 DOI: 10.1093/nar/gkr1064
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Summary of data stored in the ELM database
| | Functional site entries | ELM motif classes | ELM motif instances | Links to PDB structures | GO terms | PubMed links |
|---|---|---|---|---|---|---|
| Totals | 115 | 170 | 1840 | 195 | 340 | 1561 |
| By category | | LIG: 111 | Human: 1004 | | Biological process: 173 | From ELM motif: 787 |
| | | MOD: 30 | Mouse: 160 | | Cell compartment: 74 | From instance: 1071 |
| | | TRG: 21 | Rat: 102 | | Molecular function: 93 | |
| | | CLV: 8 | Fly: 67 | | | |
| | | | Yeast: 90 | | | |
| | | | Other: 417 | | | |
As of October 2011.
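As a quick consistency check on the summary table, the per-category rows can be added up and compared with the totals. The sketch below reads the breakdown as classes by type, instances by organism, GO terms by ontology, and PubMed links by source; that column assignment and the variable names are assumptions made for illustration.

```python
# Counts copied from the summary table; the grouping into columns is an
# assumed reading of the table layout.
classes_by_type = {"LIG": 111, "MOD": 30, "TRG": 21, "CLV": 8}
instances_by_organism = {
    "Human": 1004, "Mouse": 160, "Rat": 102,
    "Fly": 67, "Yeast": 90, "Other": 417,
}
go_terms_by_ontology = {
    "Biological process": 173, "Cell compartment": 74, "Molecular function": 93,
}
pubmed_links_by_source = {"From ELM motif": 787, "From instance": 1071}

totals = {"classes": 170, "instances": 1840, "go_terms": 340, "pubmed_links": 1561}

breakdown_sums = {
    "classes": sum(classes_by_type.values()),
    "instances": sum(instances_by_organism.values()),
    "go_terms": sum(go_terms_by_ontology.values()),
    "pubmed_links": sum(pubmed_links_by_source.values()),
}

for key, total in totals.items():
    # A positive difference means the breakdown rows add up to more than the total.
    print(f"{key}: breakdown={breakdown_sums[key]} total={total} "
          f"difference={breakdown_sums[key] - total}")
```

Under this reading the class, instance and GO term breakdowns add up exactly to their totals, while the two PubMed rows add up to 1858, which is 297 more than the stated 1561. One plausible explanation, not stated in the table, is that some publications are linked both from a motif class and from an instance.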
Figure 1. ELM start page. The user can submit a query sequence to the motif detection pipeline either as a UniProt accession number or in FASTA format. Filtering criteria such as taxonomic range or cellular compartment should be activated to limit the resulting list of short linear motif (SLiM) instances.
Figure 2. ELM motif detection pipeline output page. The top legend explains the different colors and symbols used. The graphical output of ELM combines the output of multiple sequence classification algorithms: phosphorylation sites from Phospho.ELM, protein domains detected by SMART/Pfam, disorder predictions by GlobPlot and IUPred, and secondary structure (18). The lower part contains the annotated and putative ELM instances for the given protein sequence (Epsin1, UniProt accession Q9Y6I3). The background is colored according to the structural information available. Each box represents one ELM instance, the color of which indicates the likelihood that this instance is functional: grey instances are buried within structured regions, while shades of blue represent instances outside of structured regions and indicate sequence conservation, from pale blue for weak conservation to dark blue for strong conservation. Red ellipses mark instances that are annotated in the query sequence, and red boxes mark instances that are annotated in a homologous sequence.
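The color rule described in the Figure 2 caption can be sketched as a small function. This is only an illustration and not ELM's implementation: the function name, the 0-to-1 conservation score and the 0.7 cutoff are assumptions, and the two shades stand in for what the caption describes as a range of blues.

```python
def instance_color(in_structured_region: bool,
                   conservation: float,
                   strong_cutoff: float = 0.7) -> str:
    """Return a display color for a putative motif instance.

    in_structured_region: True if the match is buried within a structured region.
    conservation: hypothetical score in [0, 1]; higher means more conserved.
    strong_cutoff: hypothetical threshold separating weak from strong conservation.
    """
    if not 0.0 <= conservation <= 1.0:
        raise ValueError("conservation must be between 0 and 1")
    if in_structured_region:
        # Buried matches are shown in grey regardless of conservation.
        return "grey"
    return "dark blue" if conservation >= strong_cutoff else "pale blue"


print(instance_color(True, 0.9))
print(instance_color(False, 0.9))
print(instance_color(False, 0.2))
```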
List of novel ELM classes
| Identifier | Description |
|---|---|
| LIG_Actin_WH2_1, LIG_Actin_WH2_2, LIG_Actin_RPEL_3 | Motifs, present as several repeats in proteins, that mediate binding to the hydrophobic cleft created by subdomains 1 and 3 of G-actin |
| LIG_AGCK_PIF_1, LIG_AGCK_PIF_2, LIG_AGCK_PIF_3 | The AGCK docking motif mediates intramolecular interactions to the PDK1 Interacting Fragment (PIF) pocket, serving as a |
| LIG_BIR_II_1, LIG_BIR_III_1, LIG_BIR_III_2, LIG_BIR_III_3, LIG_BIR_III_4 | IAP-binding motifs are found in pro-apoptotic proteins and function in the abrogation of caspase inhibition by inhibitor of apoptosis proteins in apoptotic cells |
| LIG_eIF4E_1, LIG_eIF4E_2 | Motif binding to the dorsal surface of eIF4E |
| LIG_EVH1_3 | A proline-rich motif binding to EVH1/WH1 domains of WASP and N-WASP proteins |
| LIG_HCF-1_HBM_1 | The DHxY Host Cell Factor-1 binding motif interacts with the N-terminal kelch propeller domain of the cell cycle regulator HCF-1 |
| LIG_Integrin_isoDGR_1 | Present in extracellular matrix proteins; upon deamidation it forms the biologically active isoDGR motif, which binds to various members of the integrin family |
| LIG_LYPXL_L_2, LIG_LYPXL_S_1 | The LYPxL motif binds the V-domain of Alix, a protein involved in endosomal sorting |
| LIG_PAM2_1 | Peptide ligand motif that directly interacts with the MLLE/PABC domain found in poly(A) binding proteins and HYD E3 ubiquitin ligases |
| LIG_PIKK_1 | Motif located in the C terminus of Nbs1 and its homologues that interacts with PIKK family members |
| LIG_Rb_pABgroove_1 | The LxxLFD motif binds in a deep groove between pocket A and pocket B of the Retinoblastoma protein |
| LIG_SCF_FBW7_1, LIG_SCF_FBW7_2 | The TPxxS phospho-dependent degron binds the FBW7 F box proteins of the SCF (Skp1-Cullin-Fbox) complex |
| LIG_SPAK-OSR1_1 | The SPAK/OSR1 kinase binding motif acts as a docking site that aids interaction with binding partners, including upstream activators and phosphorylated substrates |
As of October 2011.
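Several descriptions in the table name a consensus such as DHxY, LYPxL, LxxLFD or TPxxS. A minimal sketch of scanning a sequence for such consensus strings is given below, reading `x` as "any residue". These simplified patterns are assumptions made for illustration and are not the curated ELM class definitions, which are more specific; the test sequence is invented, and a plain sequence scan cannot tell whether a phospho-dependent site is actually phosphorylated.

```python
import re

# Consensus strings quoted in the class descriptions above; "x" is read as
# "any residue". These are simplified stand-ins, not the curated ELM patterns.
CONSENSUS = ["DHxY", "LYPxL", "LxxLFD", "TPxxS"]


def consensus_to_regex(consensus: str) -> str:
    """Turn a consensus like 'TPxxS' into the regex 'TP..S'."""
    return consensus.replace("x", ".")


def scan(sequence: str):
    """Return (consensus, start, end, matched_text) tuples, 1-based inclusive."""
    hits = []
    for consensus in CONSENSUS:
        # A lookahead with a capture group reports overlapping matches too.
        lookahead = re.compile(f"(?=({consensus_to_regex(consensus)}))")
        for match in lookahead.finditer(sequence):
            start = match.start(1) + 1
            end = match.start(1) + len(match.group(1))
            hits.append((consensus, start, end, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Invented test sequence, not a real protein.
for hit in scan("MADHAYKLYPELTTPGGSAAL"):
    print(hit)
```

Matches found this way are only putative; the structural and conservation filters described for the motif detection pipeline exist precisely because short patterns match frequently by chance.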
Figure 3. ELM detail page showing information about the ELM class TRG_AP2beta_CARGO_1.
Main cellular compartments used in ELM annotation
| Count | GO Id | GO term |
|---|---|---|
| 98 | GO:0005829 | Cytosol |
| 69 | GO:0005634 | Nucleus |
| 17 | GO:0005576 | Extracellular |
| 12 | GO:0005794 | Golgi apparatus |
| 10 | GO:0005886 | Plasma membrane |
| 9 | GO:0009898 | Internal side of plasma membrane |
| 9 | GO:0005783 | Endoplasmic reticulum |
| 6 | GO:0005739 | Mitochondrion |
| 5 | GO:0005643 | Nuclear pore |
| 5 | GO:0045334 | Clathrin-coated endocytic vesicle |
Figure 4. ELM instances browse page. A full-text search assists in finding annotated instances (here, the search term ‘AP2’, filtered for ‘true positive’ instances in the taxon ‘Homo sapiens’, yields 58 instances). A search can be restricted to a particular taxonomy or instance logic (top) or ELM class type (buttons on the left). The list can also be exported to TSV or FASTA format for further processing.
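The same kind of filtering can be applied offline to a TSV export. The sketch below assumes a hypothetical column layout (`elm_class`, `accession`, `organism`, `instance_logic`) and uses placeholder rows; the real export's column names and contents may differ, so the field names would need adjusting.

```python
import csv
import io

# Placeholder rows in an assumed column layout; accessions are made up.
SAMPLE_TSV = (
    "elm_class\taccession\torganism\tinstance_logic\n"
    "TRG_AP2beta_CARGO_1\tACC0001\tHomo sapiens\ttrue positive\n"
    "TRG_AP2beta_CARGO_1\tACC0002\tMus musculus\ttrue positive\n"
    "LIG_PAM2_1\tACC0003\tHomo sapiens\ttrue positive\n"
    "TRG_AP2beta_CARGO_1\tACC0004\tHomo sapiens\tfalse positive\n"
)


def filter_instances(tsv_text, term=None, organism=None, logic=None):
    """Keep rows matching an optional class-name term, organism and instance logic."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    selected = []
    for row in reader:
        if term is not None and term.lower() not in row["elm_class"].lower():
            continue
        if organism is not None and row["organism"] != organism:
            continue
        if logic is not None and row["instance_logic"] != logic:
            continue
        selected.append(row)
    return selected


rows = filter_instances(SAMPLE_TSV, term="AP2",
                        organism="Homo sapiens", logic="true positive")
for row in rows:
    print(row["elm_class"], row["accession"])
```

Unlike the full-text search on the browse page, this sketch only matches the term against the class identifier.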
Figure 5. Schema of the ELM resource and data life cycle. Annotated ELM classes, and instances thereof, can be searched by database query. Via sequence search by the motif detection pipeline, annotated ELM classes yield putative instances in query sequences. By adding experimental evidence and references, these putative instances become candidate instances for annotation, and, with further curation, ultimately become fully annotated instances.
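The life cycle in the Figure 5 caption (putative, then candidate, then fully annotated) can be modeled as a small state machine. The sketch below is an illustration only: the class, field and method names are hypothetical and do not reflect the ELM database schema, and the reference string is a placeholder.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PUTATIVE = "putative"      # found by the motif detection pipeline
    CANDIDATE = "candidate"    # experimental evidence or references attached
    ANNOTATED = "annotated"    # curated and accepted


@dataclass
class MotifInstance:
    elm_class: str
    start: int
    end: int
    status: Status = Status.PUTATIVE
    references: list = field(default_factory=list)

    def add_evidence(self, reference: str) -> None:
        """Attach a reference; a putative instance becomes a candidate."""
        self.references.append(reference)
        if self.status is Status.PUTATIVE:
            self.status = Status.CANDIDATE

    def curate(self) -> None:
        """Promote a candidate to a fully annotated instance."""
        if self.status is not Status.CANDIDATE:
            raise ValueError("only candidate instances can be curated")
        self.status = Status.ANNOTATED


instance = MotifInstance("TRG_AP2beta_CARGO_1", start=10, end=15)
instance.add_evidence("placeholder reference")
instance.curate()
print(instance.status.value)
```

Rejecting `curate()` on a putative instance mirrors the caption's ordering, in which evidence is added before curation.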