HINT: a database of annotated protein-protein interactions and their homologs.

Ashwini Patil, Haruki Nakamura

Abstract

Despite the abundance of protein-protein interaction databases currently available online, a source that identifies and lists similar interactions in different species is lacking. The Homologous Interactions (HINT) database is such a collection of protein-protein interactions and their homologs in one or more species. The interactions and their homologs are annotated with Eukaryotic Cluster of Orthologous Groups (KOG) IDs, InterPro domains, Gene Ontology (GO) terminology and Protein Data Bank (PDB) structures. HINT is available as an interactive Web server at http://helix.protein.osaka-u.ac.jp/hint/.

Keywords:  homologous interactions; interaction families

Year:  2005        PMID: 27857549      PMCID: PMC5036632          DOI: 10.2142/biophysics.1.21

Source DB:  PubMed          Journal:  Biophysics (Nagoya-shi)        ISSN: 1349-2942


Protein-protein interactions in various organisms are increasingly the focus of studies that aim to identify the cellular functions of proteins. For any given interaction, it is of significant interest to find similar interactions in different species. Such a comparative study helps transfer annotations between interactions from better annotated species to poorly annotated ones. It also aids the identification of likely true interactions in error-prone high-throughput datasets since, intuitively, an interaction found in more than one species is likely to be universal. It has recently been estimated that the total number of interaction types is limited to about 10,000 [1]. Grouping similar interactions on the basis of sequence homology would help classify them into distinct interaction types or families.

A number of protein-protein interaction databases available online give information about experimentally determined interactions. Among these are the Database of Interacting Proteins (DIP) [2], IntAct [3] and the Biomolecular Interaction Network Database (BIND) [4]. Although these databases provide considerable information about the interaction of interest, they do not provide any information about interactions similar or homologous to it. With these goals in mind, we present HINT, a database of homologous interactions with various annotations for the interacting proteins. HINT is available online at http://helix.protein.osaka-u.ac.jp/hint/.

Methods

Two interactions are considered homologous if the interacting proteins of one interaction are homologous to the interacting proteins for the other interaction (Fig. 1). Homologous interactions include, but are not limited to, orthologous interactions (similar interactions found in different species) and paralogous interactions (similar interactions in the same species).

Figure 1

Homologous interactions. Proteins P1, P2 and P3 are the sequence homologs of protein P. Similarly, proteins Q1 and Q2 are the sequence homologs of protein Q. Interactions P1–Q2 and P2–Q1 are homologous to the interaction P–Q.
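
The definition illustrated in Figure 1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the HINT implementation: the function name, the data structures and the extra pair P3–X are hypothetical, and it assumes that a homolog set does not contain the protein itself and that interactions are unordered pairs.

```python
from itertools import product


def homologous_interactions(query, homologs, known_interactions):
    """Return the known interactions that are homologous to `query`.

    query: (a, b) tuple of protein IDs.
    homologs: dict mapping a protein ID to the set of its sequence homologs.
    known_interactions: iterable of (x, y) tuples of observed interactions.
    """
    a, b = query
    # Store pairs as frozensets because an interaction A-B equals B-A.
    known = {frozenset(pair) for pair in known_interactions}
    found = set()
    # Try every combination of a homolog of `a` with a homolog of `b`.
    for x, y in product(homologs.get(a, set()), homologs.get(b, set())):
        candidate = frozenset((x, y))
        if candidate in known and candidate != frozenset(query):
            found.add(candidate)
    return found


# Example data mirroring Figure 1 (P3-X is an invented non-homologous pair).
homologs = {"P": {"P1", "P2", "P3"}, "Q": {"Q1", "Q2"}}
known = [("P", "Q"), ("P1", "Q2"), ("P2", "Q1"), ("P3", "X")]

result = homologous_interactions(("P", "Q"), homologs, known)
print(sorted(tuple(sorted(pair)) for pair in result))
# prints [('P1', 'Q2'), ('P2', 'Q1')]
```

Only pairs in which both partners are homologs of the corresponding query proteins are reported, which is why P3–X is excluded.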

We use protein-protein interaction data for different model organisms from DIP (July 2004 version) and IntAct (September 2004 version). For each interaction, the sequence homologs of the interacting proteins are determined using PSI-BLAST [5] with 5 iterations and an E-value cutoff of 10⁻⁸. If an interaction is found that involves any of the homologs of the interacting proteins, then it is deemed homologous to the interaction under consideration. Figure 1 illustrates this concept. We thus generate groups of homologous interactions that have been determined by small-scale or high-throughput experiments. We determine whether any two interactions are orthologous or paralogous by assigning the interacting proteins to clusters from the Eukaryotic Cluster of Orthologous Groups (KOG) database [6]. The interacting proteins for each interaction are also annotated with domain definitions from InterPro [7], Gene Ontology (GO) terms [8] and Protein Data Bank (PDB) structures, where available.

The interactions from DIP and IntAct were parsed from XML files in the Proteomics Standards Initiative Molecular Interaction (PSI-MI) XML format [9]. This allows for easy extension of the database by incorporating protein-protein interaction data from various other databases that use this format. HINT is implemented as a relational database hosted on a PostgreSQL server and can be accessed over the Internet through an HTML web interface.
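
As a rough illustration of two of the steps above, the following Python sketch filters sequence hits at the stated E-value cutoff and labels a homologous interaction as orthologous or paralogous. It makes several assumptions that are not taken from the text: the hits are supplied as simple (query, subject, E-value) tuples rather than actual PSI-BLAST output, the KOG cluster IDs and protein names are invented, and the classification rule (both protein pairs must share a KOG cluster; a different species gives 'O', the same species gives 'P') is one plausible reading of the method, not a confirmed description of it.

```python
E_VALUE_CUTOFF = 1e-8  # cutoff stated in the text


def build_homolog_map(hits, cutoff=E_VALUE_CUTOFF):
    """Map each query protein to the subjects that pass the E-value cutoff.

    hits: iterable of (query_id, subject_id, e_value) tuples (assumed format).
    """
    homologs = {}
    for query_id, subject_id, e_value in hits:
        # Skip self-hits and hits weaker than the cutoff.
        if subject_id != query_id and e_value <= cutoff:
            homologs.setdefault(query_id, set()).add(subject_id)
    return homologs


def classify(query, homolog, species, kog):
    """Return 'O' (ortholog), 'P' (paralog) or None under the assumed rule."""
    (a, b), (x, y) = query, homolog
    same_clusters = (
        kog.get(a) is not None and kog.get(a) == kog.get(x)
        and kog.get(b) is not None and kog.get(b) == kog.get(y)
    )
    if not same_clusters:
        return None  # no shared KOG assignment, so no label is given
    same_species = species[a] == species[x] and species[b] == species[y]
    return "P" if same_species else "O"


# Hypothetical example data.
hits = [("P", "P", 0.0), ("P", "P1", 1e-30), ("P", "P9", 1e-3)]
species = {"P": "Sc", "Q": "Sc", "P1": "Hs", "Q2": "Hs", "P2": "Sc", "Q1": "Sc"}
kog = {"P": "KOG0001", "P1": "KOG0001", "P2": "KOG0001",
       "Q": "KOG0002", "Q1": "KOG0002", "Q2": "KOG0002"}

print(build_homolog_map(hits))                         # {'P': {'P1'}}
print(classify(("P", "Q"), ("P1", "Q2"), species, kog))  # O
print(classify(("P", "Q"), ("P2", "Q1"), species, kog))  # P
```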

Results

Of the 45,840 interactions in HINT, 11,103 (24%) have one or more homologs. Table 1 shows the distribution of the homologs across species. The web interface can be used to search interactions using various identifiers such as SwissProt accession numbers, PIR IDs, GenBank accession numbers, RefSeq IDs or descriptions of the interacting proteins. An interaction of interest can be selected from the search results to obtain detailed information about it. Figure 2 shows a snapshot of the Interactions web page. The homologs of the selected interaction are shown in graphical as well as tabular form and are sorted according to the score of the protein hits, with the best hits shown first. The graphical form helps the user to visualize the regions and domains that are common to the proteins of the selected interaction and those of the homologous interactions. The tabular form gives the E-values and the percent identity reported by PSI-BLAST. Further details about the usage of the web interface are provided in an online Help document.

Table 1

Species distribution of homologous protein-protein interactions in HINT

Organism          Two-letter code   Interactions   Homologous interactions
D. melanogaster   Dm                20581          3521
S. cerevisiae     Sc                14178          3879
C. elegans        Ce                4553           1165
H. sapiens        Hs                3933           2126
H. pylori         Hp                1409           128
E. coli           Ec                554            150
M. musculus       Mm                483            292
A. thaliana       At                76             65
R. norvegicus     Rn                67             43
S. pombe          Sp                6              4

Total                               45840          11103

Figure 2

Web interface of the Database of Homologous Interactions with interaction details and GO, InterPro and KOG annotations of the interacting proteins. Also shown is a graph of the homologs, with the query interaction in yellow, the InterPro domains of the interacting proteins in blue and the homologs of the proteins colored according to their hit scores, similar to BLAST results [10]. The species in which the homolog is found is indicated by the two-letter code given in Table 1. If the interaction is an ortholog or a paralog of the query interaction, this is indicated by an 'O' or a 'P' alongside each homolog. Clicking on the '+' gives further details of the homolog such as E-value, percent identity, hit score and KOG, where available. Tool tips are provided in various places.
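
The proportions implied by Table 1 can be recomputed with a short script. The sketch below simply transcribes the table values and the stated totals; the variable and function names are illustrative only.

```python
# (two-letter code, interactions, homologous interactions) from Table 1
TABLE_1 = [
    ("Dm", 20581, 3521),
    ("Sc", 14178, 3879),
    ("Ce", 4553, 1165),
    ("Hs", 3933, 2126),
    ("Hp", 1409, 128),
    ("Ec", 554, 150),
    ("Mm", 483, 292),
    ("At", 76, 65),
    ("Rn", 67, 43),
    ("Sp", 6, 4),
]

# Totals as stated in the last row of Table 1.
TOTAL_INTERACTIONS = 45840
TOTAL_HOMOLOGOUS = 11103


def percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Share of each species' interactions that have at least one homolog.
for code, interactions, homologous in TABLE_1:
    print(f"{code}: {homologous}/{interactions} = "
          f"{percent(homologous, interactions):.1f}%")

# Overall share quoted in the Results text (24%).
overall = percent(TOTAL_HOMOLOGOUS, TOTAL_INTERACTIONS)
print(f"All: {overall:.0f}%")
```

For example, 2126 of the 3933 H. sapiens interactions (about 54%) have a homolog, compared with about 17% for D. melanogaster.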

Discussion

HINT is a database of homologous protein-protein interactions that researchers can use to obtain detailed information about similar interactions in different species. It provides a graphical view of the homologous interactions and the various annotations of the interacting proteins and can be accessed over the Internet at http://helix.protein.osaka-u.ac.jp/hint/. For a given interaction, HINT provides a list of similar interactions found in the same or in different species. This is of considerable use in comparative genomic analyses. In the future, we plan to use HINT to identify true positives in high-throughput interaction datasets and to form interaction families.

References

1.  Ten thousand interactions for the molecular biologist.

Authors:  Patrick Aloy; Robert B Russell
Journal:  Nat Biotechnol       Date:  2004-10

2.  The Database of Interacting Proteins: 2004 update.

Authors:  Lukasz Salwinski; Christopher S Miller; Adam J Smith; Frank K Pettit; James U Bowie; David Eisenberg
Journal:  Nucleic Acids Res       Date:  2004-01-01

3.  IntAct: an open source molecular interaction database.

Authors:  Henning Hermjakob; Luisa Montecchi-Palazzi; Chris Lewington; Sugath Mudali; Samuel Kerrien; Sandra Orchard; Martin Vingron; Bernd Roechert; Peter Roepstorff; Alfonso Valencia; Hanah Margalit; John Armstrong; Amos Bairoch; Gianni Cesareni; David Sherman; Rolf Apweiler
Journal:  Nucleic Acids Res       Date:  2004-01-01

4.  BIND: the Biomolecular Interaction Network Database.

Authors:  Gary D Bader; Doron Betel; Christopher W V Hogue
Journal:  Nucleic Acids Res       Date:  2003-01-01

5.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Authors:  S F Altschul; T L Madden; A A Schäffer; J Zhang; Z Zhang; W Miller; D J Lipman
Journal:  Nucleic Acids Res       Date:  1997-09-01

6.  The COG database: an updated version includes eukaryotes.

Authors:  Roman L Tatusov; Natalie D Fedorova; John D Jackson; Aviva R Jacobs; Boris Kiryutin; Eugene V Koonin; Dmitri M Krylov; Raja Mazumder; Sergei L Mekhedov; Anastasia N Nikolskaya; B Sridhar Rao; Sergei Smirnov; Alexander V Sverdlov; Sona Vasudevan; Yuri I Wolf; Jodie J Yin; Darren A Natale
Journal:  BMC Bioinformatics       Date:  2003-09-11

7.  The InterPro Database, 2003 brings increased coverage and new features.

Authors:  Nicola J Mulder; Rolf Apweiler; Teresa K Attwood; Amos Bairoch; Daniel Barrell; Alex Bateman; David Binns; Margaret Biswas; Paul Bradley; Peer Bork; Phillip Bucher; Richard R Copley; Emmanuel Courcelle; Ujjwal Das; Richard Durbin; Laurent Falquet; Wolfgang Fleischmann; Sam Griffiths-Jones; Daniel Haft; Nicola Harte; Nicolas Hulo; Daniel Kahn; Alexander Kanapin; Maria Krestyaninova; Rodrigo Lopez; Ivica Letunic; David Lonsdale; Ville Silventoinen; Sandra E Orchard; Marco Pagni; David Peyruc; Chris P Ponting; Jeremy D Selengut; Florence Servant; Christian J A Sigrist; Robert Vaughan; Evgueni M Zdobnov
Journal:  Nucleic Acids Res       Date:  2003-01-01

8.  The Gene Ontology (GO) database and informatics resource.

Authors:  M A Harris; J Clark; A Ireland; J Lomax; M Ashburner; R Foulger; K Eilbeck; S Lewis; B Marshall; C Mungall; J Richter; G M Rubin; J A Blake; C Bult; M Dolan; H Drabkin; J T Eppig; D P Hill; L Ni; M Ringwald; R Balakrishnan; J M Cherry; K R Christie; M C Costanzo; S S Dwight; S Engel; D G Fisk; J E Hirschman; E L Hong; R S Nash; A Sethuraman; C L Theesfeld; D Botstein; K Dolinski; B Feierbach; T Berardini; S Mundodi; S Y Rhee; R Apweiler; D Barrell; E Camon; E Dimmer; V Lee; R Chisholm; P Gaudet; W Kibbe; R Kishore; E M Schwarz; P Sternberg; M Gwinn; L Hannick; J Wortman; M Berriman; V Wood; N de la Cruz; P Tonellato; P Jaiswal; T Seigfried; R White
Journal:  Nucleic Acids Res       Date:  2004-01-01

9.  The HUPO PSI's molecular interaction format: a community standard for the representation of protein interaction data.

Authors:  Henning Hermjakob; Luisa Montecchi-Palazzi; Gary Bader; Jérôme Wojcik; Lukasz Salwinski; Arnaud Ceol; Susan Moore; Sandra Orchard; Ugis Sarkans; Christian von Mering; Bernd Roechert; Sylvain Poux; Eva Jung; Henning Mersch; Paul Kersey; Michael Lappe; Yixue Li; Rong Zeng; Debashis Rana; Macha Nikolski; Holger Husi; Christine Brun; K Shanker; Seth G N Grant; Chris Sander; Peer Bork; Weimin Zhu; Akhilesh Pandey; Alvis Brazma; Bernard Jacq; Marc Vidal; David Sherman; Pierre Legrain; Gianni Cesareni; Ioannis Xenarios; David Eisenberg; Boris Steipe; Chris Hogue; Rolf Apweiler
Journal:  Nat Biotechnol       Date:  2004-02