
DOMINO: a database of domain-peptide interactions.

Arnaud Ceol, Andrew Chatr-aryamontri, Elena Santonico, Roberto Sacco, Luisa Castagnoli, Gianni Cesareni.

Abstract

Many protein interactions are mediated by small protein modules binding to short linear peptides. DOMINO (http://mint.bio.uniroma2.it/domino/) is an open-access database comprising more than 3900 annotated experiments describing interactions mediated by protein-interaction domains. DOMINO can be searched with a versatile search tool and the interaction networks can be visualized with a convenient graphic display applet that explicitly identifies the domains/sites involved in the interactions.

Substances：
Ligands
Peptides

Year: 2006 PMID: 17135199 PMCID: PMC1751533 DOI: 10.1093/nar/gkl961

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Cell function is governed by an intricate web of physical and functional links between proteins. Information about the details of this interaction network is dispersed in the scientific literature in a format that is not easily accessible for large scale analysis. Over the past few years, a number of protein-interaction databases have made an effort to retrieve interaction information from published experiments (1–6). The stored information is freely available and can be downloaded and conveniently represented as graphs where interacting proteins are nodes connected by edges. This mode of representation, however, does not allow the extraction of important information such as the number of partners that any given protein is capable of binding to simultaneously. This question is particularly relevant for proteins (hubs) that have a large number of putative partners and where it is not clear, from a simple protein-interaction graph representation, whether all the partners compete for the same binding site on the hub protein or rather bind in a noncompetitive manner to different domains/sites (7). This limitation can be overcome by taking into account the modular nature of proteins and by mapping each interaction to the binding domains/sites on the partner proteins (8). A few databases have focused on domain–domain interactions. Although they differ somewhat in scope, InterDom and DIMA aim at integration of multiple data sources and prediction techniques to assemble a domain interaction graph linking domains that are likely to interact (9,10). iPfam is a resource that describes domain–domain interactions that are observed in protein complexes whose 3D structure is known (11). None of these resources, however, aim at collecting all experimental observations of interactions mediated by protein-interaction domains. 
A fairly large fraction of the links in a protein-interaction network is supported by families of small conserved modular domains binding to relatively short peptides in an extended conformation (12). Although the peptide ligands of most domains within a family (for instance SH3, SH2 or PDZ) share specific sequence/structure characteristics, each member of the family displays some degree of specificity (8). For instance, SH3 domains bind to proline-rich peptides, most of which contain the motif PxxP; however, while the SH3 domain of the yeast protein RVS167 has affinity for peptides containing an Arg at position P–3 (RxxPxxP), the SH3 domain of SHO1 prefers a Lys at the same position (13). Over the past 15 years, the preferred targets of several members of these domain families have been studied and reported in the scientific literature, thus allowing one to infer the physiological network mediated by these relatively low-affinity interactions. In this report, we present DOMINO, a relational database designed to store protein interactions mediated by protein recognition modules (8). PDZBase has a similar scope, although it is limited to the PDZ domain (14). All the PDZ-mediated interactions stored in DOMINO have been freshly curated to meet the Proteomics Standards Initiative Molecular Interactions (PSI-MI) standards (15).
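The position-specific preferences described above can be written as simple sequence patterns. The following is a minimal sketch, not part of DOMINO: the regular expressions are a literal reading of the motifs PxxP and RxxPxxP, the pattern KxxPxxP is our assumption for "a Lys at the same position", and the two peptides are invented for illustration.

```python
import re

# Literal readings of the motifs named in the text; 'x' means any residue.
# "KxxPxxP" is an assumed pattern for the SHO1 preference (Lys instead of Arg).
MOTIFS = {
    "PxxP": re.compile(r"P..P"),
    "RxxPxxP (RVS167-like)": re.compile(r"R..P..P"),
    "KxxPxxP (SHO1-like)": re.compile(r"K..P..P"),
}


def matching_motifs(peptide: str) -> list[str]:
    """Return the names of all motifs found anywhere in the peptide."""
    return [name for name, pattern in MOTIFS.items() if pattern.search(peptide)]


# Hypothetical peptides, invented for this example (not DOMINO entries).
print(matching_motifs("AARPLPPLPAA"))
print(matching_motifs("AAKPLPPLPAA"))
```

Both peptides contain the generic PxxP core, but only one of the two family-member patterns matches each of them, which is the kind of within-family specificity the text describes.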

DATABASE STRUCTURE

The data model of DOMINO is based on IntAct (1), an open source database, and runs on the PostgreSQL relational database system. The IntAct data model has been extended to provide convenient and faster access to information about interacting domains. Moreover, new tables have been added for storing annotations retrieved from Pfam. These are used to display the information about interacting modules in the context of the structure of the protein partners. The IntAct API was used as a library for the development of DOMINO applications and web tools. The web interface was developed using the Struts framework. The applications and the web interface were developed with Java 5. To limit compatibility problems, the Viewer applet has been compiled for Java 1.

STORED DATA

DOMINO aims at annotating all the available information about domain–peptide and domain–domain interactions. As of July 24, 2006, the core of DOMINO consists of more than 3900 interactions extracted from peer-reviewed articles and annotated by expert biologists. A total of 717 manuscripts have been processed, thus covering a large fraction of the published information about domain–peptide interactions. The curation effort has focused on the following domains: SH3, SH2, 14-3-3, PDZ, PTB, WW, EVH, VHS, FHA, EH, FF, BRCT, Bromo, Chromo and GYF. However, interactions mediated by as many as 150 different domain families are stored in DOMINO. The pie chart in Figure 1A reports the fraction of interactions mediated by each of the major domain families.

Figure 1

DOMINO statistics. (A) The pie chart represents the number of interactions mediated by each domain family in the DOMINO database. Only the five domains with the largest number of annotated interactions are shown in detail, while the remaining domains are grouped under ‘others’. (B) Number of annotated protein interactions supported by each method. (C) Number of protein interactions between proteins in different proteomes.

More than 75% of the annotated entries describe interactions between mammalian domains and their target peptides, while most of the remaining entries (22%) involve yeast proteins (see Figure 1C for detailed statistics). The interactions deposited in DOMINO are annotated according to the PSI-MI 2.5 (15) standard and can be easily analyzed in the context of the global protein-interaction network as downloaded from major interaction databases like MINT (3), BIND (16), IntAct (1), DIP (5) and MPact (6). The curation process follows the PSI-MI 2.5 standard but with special emphasis on the mapping of the interaction to specific protein domains of both participating proteins. This is achieved by paying special attention to the shortest protein fragment that was experimentally verified as sufficient for the interaction. Whenever the authors report only the name of the domain mediating the interaction (e.g. SH3 or SH2), without stating the coordinates of the experimental binding range, the curator may choose to enter the coordinates of the Pfam domain match in the protein sequence. Finally, whenever the information is available, any mutation or post-translational modification affecting the interaction affinity is noted in the database.
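The range-mapping rule used during curation can be illustrated with a short sketch. It is a simplification under stated assumptions: the names `Range` and `binding_range` are hypothetical, the coordinates are invented, and the fallback to the Pfam match is applied automatically here, whereas in DOMINO it is left to the curator's judgment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Range:
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    source: str  # "experimental" or "pfam"


def binding_range(sufficient_fragments, pfam_match=None) -> Optional[Range]:
    """Choose the range to annotate for one participant of an interaction.

    sufficient_fragments: (start, end) tuples of fragments that were
        experimentally shown to be sufficient for the interaction.
    pfam_match: (start, end) of the Pfam domain match, or None.
    """
    if sufficient_fragments:
        # Prefer the shortest fragment verified as sufficient for binding.
        start, end = min(sufficient_fragments, key=lambda f: f[1] - f[0])
        return Range(start, end, "experimental")
    if pfam_match is not None:
        # Only the domain name was reported: fall back to the Pfam coordinates.
        return Range(pfam_match[0], pfam_match[1], "pfam")
    return None  # binding range undetermined


# Hypothetical coordinates, for illustration only.
print(binding_range([(1, 217), (150, 217), (156, 199)]))
print(binding_range([], pfam_match=(159, 215)))
```

Recording the source of each range alongside its coordinates keeps experimentally mapped sites distinguishable from ranges inferred from a domain match.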

WEB INTERFACE

DOMINO is accessible through a web interface. The search page offers the possibility of searching either for any given protein of interest or for all the proteins in the DOMINO database containing a specific domain. The protein search can be carried out by entering identifiers from the main protein databases (UniProt, SGD, FlyBase and WormBase); gene names or synonyms can also be used. A list of all domains included in DOMINO is also provided to facilitate the search. For domain-restricted searches, only proteins that contain the query domain, and for which the domain has been shown to mediate an interaction stored in DOMINO, will be displayed. If desired, all types of queries can be restricted to a given organism. The result of the search is an HTML page containing all the proteins matching the query terms and the list of the corresponding InterPro domains (Figure 2A). By clicking the check boxes corresponding to the specific protein of interest or to a specific protein domain, one can direct the search either to the partners of the selected proteins or limit it to the partners binding to the selected domain(s). For instance, in the case of the growth factor receptor-bound protein 2 (GRB2), which contains two SH3 domains and one SH2 domain, it is possible to restrict the search to ligands of the second SH3 domain, or to exclude them. Searches can also be limited to interactions discovered by a specific experimental method. A choice of six main method categories is given (multiple selection is possible), and each category also includes all of its ‘children’ techniques, as defined in the PSI controlled vocabulary hierarchy. Among other applications, this filtering tool can be used to exclude results of large-scale experiments, if so desired.
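The method filter relies on expanding each selected category to all of its descendant terms. A minimal sketch of that expansion follows; the parent-to-children map is a toy hierarchy made up for this example and does not reproduce the actual PSI controlled vocabulary, and the function names and interaction records are hypothetical.

```python
# Toy parent -> children map. Illustrative only: this is NOT the real
# PSI controlled vocabulary tree.
CHILDREN = {
    "biochemical": ["affinity technology", "peptide array"],
    "affinity technology": ["pull down", "coimmunoprecipitation"],
    "protein complementation assay": ["two hybrid"],
}


def with_descendants(term, children=CHILDREN):
    """Return the term together with every term below it in the hierarchy."""
    found = {term}
    stack = [term]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def filter_by_method(interactions, selected_categories):
    """Keep interactions whose method falls under any selected category."""
    allowed = set()
    for category in selected_categories:
        allowed |= with_descendants(category)
    return [i for i in interactions if i["method"] in allowed]


# Hypothetical interaction records.
interactions = [
    {"pair": ("A", "B"), "method": "pull down"},
    {"pair": ("A", "C"), "method": "two hybrid"},
    {"pair": ("B", "C"), "method": "peptide array"},
]
print(filter_by_method(interactions, ["biochemical"]))
```

Selecting a single top-level category therefore also retrieves interactions annotated with more specific techniques, without the user having to list them.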

Figure 2

DOMINO web interface. (A) A typical result of a protein search. In this case, the search term was GRB2 and the search was restricted to Homo sapiens. (B) A partial view of the results of an interaction search for ligands of the GRB2 SH3 and SH2 domains. By clicking the check boxes on the right it is possible to remove irrelevant interactions from the list. (C) A selected number of interactions in the output in B were displayed using the Viewer applet.

Once the appropriate choice is made, after clicking the ‘search interaction’ button, an HTML page is shown displaying all pairs of relevant interacting proteins and a summary of the interaction details. A full description of the entry, including experimental procedures or biological features such as required post-translational modifications or defective mutations, is displayed after pressing the ‘evidence’ button. The HTML page can be edited by removing interactions that are deemed irrelevant to the specific query. The edited interaction list can be exported either as a tab-delimited file or as a PSI-MI document (PSI-MI version 1 or 2.5). Finally, interactions can be displayed in a graph representation through the Viewer applet (Figure 2C). In the DOMINO Viewer applet, proteins are represented as rectangles. The protein domain structure is illustrated with a colored background (one color for each domain family). Interactions are represented as edges in the graph. Whereas most protein-interaction display tools only link entire proteins, the DOMINO viewer uses the information stored in the database to link the partner domains involved in the interaction. The extent of the binding site is made clear by drawing a line under the protein fragment involved in the interaction. This representation permits an immediate visualization of the proteins that compete for binding to the same partner (Figure 2C). Whenever the interaction range in one of the two partners has not been determined experimentally, edges are drawn in grey.
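The idea that mapped binding ranges reveal potential competitors can be sketched in a few lines. This is an illustration under assumptions, not the Viewer applet's logic: partner names and coordinates are invented, and overlapping ranges are treated only as a hint that two partners may compete for the same site.

```python
from itertools import combinations


def overlaps(a, b):
    """True if two inclusive (start, end) ranges share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


def competing_partners(sites):
    """Return sorted pairs of partners whose binding ranges on the hub overlap.

    sites: dict mapping partner name -> (start, end) range on the hub protein.
    """
    return sorted(
        (p, q)
        for (p, a), (q, b) in combinations(sorted(sites.items()), 2)
        if overlaps(a, b)
    )


# Hypothetical hub protein with three partners; coordinates are made up.
sites = {
    "partnerA": (1, 58),
    "partnerB": (10, 55),
    "partnerC": (159, 215),
}
print(competing_partners(sites))
```

Partners whose ranges do not overlap, such as `partnerC` in this example, could in principle bind the hub simultaneously, which a protein-level graph cannot show.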

DATA ACCESS

Data stored in DOMINO are released under the Creative Commons Attribution License. According to this license, it is possible to copy, distribute, display and make commercial use of all data if appropriate credit is given. Data can be downloaded either as a tab-delimited file that can be imported directly into spreadsheet applications, or as PSI-MI 1 or PSI-MI 2.5 XML documents. Users can download either a file containing the full dataset or files containing only the interactions mediated by specific domains (SH3, SH2, PDZ, 14-3-3 and WW). As stated above, any result of an interaction search can also be downloaded in these two file formats.
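A tab-delimited export of this kind can be processed with standard tools. The sketch below reads such a file with Python's `csv` module and counts interactions per domain; the column names and rows are hypothetical stand-ins, since the actual layout of the DOMINO file is not described here.

```python
import csv
import io
from collections import Counter

# Inline stand-in for a downloaded file. Column names and rows are
# hypothetical; the real DOMINO export may use a different layout.
SAMPLE = (
    "protein_a\tprotein_b\tdomain\tmethod\n"
    "P1\tP2\tSH3\tpull down\n"
    "P1\tP3\tSH2\tpeptide array\n"
    "P4\tP2\tSH3\ttwo hybrid\n"
)


def count_by_domain(handle):
    """Count rows per value of the (assumed) 'domain' column."""
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row["domain"] for row in reader)


counts = count_by_domain(io.StringIO(SAMPLE))
print(counts.most_common())
```

To use a real download, the `io.StringIO(SAMPLE)` argument would be replaced by an open file handle and the column name adjusted to the file's header.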

FUTURE DIRECTIONS

The long-term goal of DOMINO is to develop into a stable repository of interactions mediated by protein domains, thus offering a unique tool for interpreting protein-interaction networks. We are committed to making the database more comprehensive by entering new data as they become available. Finally, we plan to use the sequence fragments that have been shown to bind specific domains to automatically identify the consensus ligand peptide for any domain for which sufficient experimental information is available.
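One very simple way to derive a consensus from a set of binding fragments is a position-wise majority vote. The sketch below is not the method planned for DOMINO, which the text does not specify; it assumes the peptides are already aligned and of equal length, uses an arbitrary 60% threshold, and the example peptides are invented.

```python
from collections import Counter


def consensus(peptides, threshold=0.6):
    """Position-wise consensus of pre-aligned, equal-length peptides.

    A residue is reported at a position if it occurs in at least
    `threshold` of the peptides; otherwise 'x' (any residue) is used.
    """
    if not peptides:
        raise ValueError("need at least one peptide")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("peptides must be pre-aligned to equal length")
    out = []
    for column in zip(*peptides):
        residue, count = Counter(column).most_common(1)[0]
        out.append(residue if count / len(peptides) >= threshold else "x")
    return "".join(out)


# Hypothetical aligned ligand peptides.
print(consensus(["RALPPLP", "RSLPALP", "RTQPELP"]))
```

A practical approach would also need to align fragments of different lengths and weight positions by the amount of supporting evidence, which this sketch leaves out.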

15 in total

1. A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules.

Authors: Amy Hin Yan Tong; Becky Drees; Giuliano Nardelli; Gary D Bader; Barbara Brannetti; Luisa Castagnoli; Marie Evangelista; Silvia Ferracuti; Bryce Nelson; Serena Paoluzi; Michele Quondam; Adriana Zucconi; Christopher W V Hogue; Stanley Fields; Charles Boone; Gianni Cesareni
Journal: Science Date: 2001-12-13 Impact factor: 47.728

Review 2. Assembly of cell regulatory systems through protein interaction domains.

Authors: Tony Pawson; Piers Nash
Journal: Science Date: 2003-04-18 Impact factor: 47.728

3. IntAct: an open source molecular interaction database.

Authors: Henning Hermjakob; Luisa Montecchi-Palazzi; Chris Lewington; Sugath Mudali; Samuel Kerrien; Sandra Orchard; Martin Vingron; Bernd Roechert; Peter Roepstorff; Alfonso Valencia; Hanah Margalit; John Armstrong; Amos Bairoch; Gianni Cesareni; David Sherman; Rolf Apweiler
Journal: Nucleic Acids Res Date: 2004-01-01 Impact factor: 16.971

4. The Database of Interacting Proteins: 2004 update.

Authors: Lukasz Salwinski; Christopher S Miller; Adam J Smith; Frank K Pettit; James U Bowie; David Eisenberg
Journal: Nucleic Acids Res Date: 2004-01-01 Impact factor: 16.971

5. InterDom: a database of putative interacting protein domains for validating predicted protein interactions and complexes.

Authors: See-Kiong Ng; Zhuo Zhang; Soon-Heng Tan; Kui Lin
Journal: Nucleic Acids Res Date: 2003-01-01 Impact factor: 16.971

6. The DIMA web resource--exploring the protein domain network.

Authors: Philipp Pagel; Matthias Oesterheld; Volker Stümpflen; Dmitrij Frishman
Journal: Bioinformatics Date: 2006-02-15 Impact factor: 6.937

7. DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions.

Authors: Ioannis Xenarios; Lukasz Salwínski; Xiaoqun Joyce Duan; Patrick Higney; Sul-Min Kim; David Eisenberg
Journal: Nucleic Acids Res Date: 2002-01-01 Impact factor: 16.971

Review 8. MINT: a Molecular INTeraction database.

Authors: Andreas Zanzoni; Luisa Montecchi-Palazzi; Michele Quondam; Gabriele Ausiello; Manuela Helmer-Citterich; Gianni Cesareni
Journal: FEBS Lett Date: 2002-02-20 Impact factor: 4.124

9. The HUPO PSI's molecular interaction format--a community standard for the representation of protein interaction data.

Authors: Henning Hermjakob; Luisa Montecchi-Palazzi; Gary Bader; Jérôme Wojcik; Lukasz Salwinski; Arnaud Ceol; Susan Moore; Sandra Orchard; Ugis Sarkans; Christian von Mering; Bernd Roechert; Sylvain Poux; Eva Jung; Henning Mersch; Paul Kersey; Michael Lappe; Yixue Li; Rong Zeng; Debashis Rana; Macha Nikolski; Holger Husi; Christine Brun; K Shanker; Seth G N Grant; Chris Sander; Peer Bork; Weimin Zhu; Akhilesh Pandey; Alvis Brazma; Bernard Jacq; Marc Vidal; David Sherman; Pierre Legrain; Gianni Cesareni; Ioannis Xenarios; David Eisenberg; Boris Steipe; Chris Hogue; Rolf Apweiler
Journal: Nat Biotechnol Date: 2004-02 Impact factor: 54.908

10. The Biomolecular Interaction Network Database and related tools 2005 update.

Authors: C Alfarano; C E Andrade; K Anthony; N Bahroos; M Bajec; K Bantoft; D Betel; B Bobechko; K Boutilier; E Burgess; K Buzadzija; R Cavero; C D'Abreo; I Donaldson; D Dorairajoo; M J Dumontier; M R Dumontier; V Earles; R Farrall; H Feldman; E Garderman; Y Gong; R Gonzaga; V Grytsan; E Gryz; V Gu; E Haldorsen; A Halupa; R Haw; A Hrvojic; L Hurrell; R Isserlin; F Jack; F Juma; A Khan; T Kon; S Konopinsky; V Le; E Lee; S Ling; M Magidin; J Moniakis; J Montojo; S Moore; B Muskat; I Ng; J P Paraiso; B Parker; G Pintilie; R Pirone; J J Salama; S Sgro; T Shan; Y Shu; J Siew; D Skinner; K Snyder; R Stasiuk; D Strumpf; B Tuekam; S Tao; Z Wang; M White; R Willis; C Wolting; S Wong; A Wrong; C Xin; R Yao; B Yates; S Zhang; K Zheng; T Pawson; B F F Ouellette; C W V Hogue
Journal: Nucleic Acids Res Date: 2005-01-01 Impact factor: 16.971
