Yosvany López, Kenta Nakai, Ashwini Patil.
Abstract
HitPredict is a consolidated resource of experimentally identified, physical protein-protein interactions with confidence scores to indicate their reliability. The study of genes and their inter-relationships using methods such as network and pathway analysis requires high quality protein-protein interaction information. Extracting reliable interactions from most of the existing databases is challenging because they either contain only a subset of the available interactions, or a mixture of physical, genetic and predicted interactions. Automated integration of interactions is further complicated by varying levels of accuracy of database content and lack of adherence to standard formats. To address these issues, the latest version of HitPredict provides a manually curated dataset of 398 696 physical associations between 70 808 proteins from 105 species. Manual confirmation was used to resolve all issues encountered during data integration. For improved reliability assessment, this version combines a new score derived from the experimental information of the interactions with the original score based on the features of the interacting proteins. The combined interaction score performs better than either of the individual scores in HitPredict as well as the reliability score of another similar database. HitPredict provides a web interface to search proteins and visualize their interactions, and the data can be downloaded for offline analysis. Data usability has been enhanced by mapping protein identifiers across multiple reference databases. Thus, the latest version of HitPredict provides a significantly larger, more reliable and usable dataset of protein-protein interactions from several species for the study of gene groups.
Database URL: http://hintdb.hgc.jp/htp
Year: 2015 PMID: 26708988 PMCID: PMC4691340 DOI: 10.1093/database/bav117
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Improvements in HitPredict version 4 over version 3
| Property | HitPredict version 3 | HitPredict version 4 |
|---|---|---|
| Data sources | 3 (IntAct, BioGRID, HPRD) | 5 (IntAct, BioGRID, HPRD, DIP, MINT) |
| Data coverage | 9 species, 50 200 proteins, 245 409 interactions | 105 species, 70 808 proteins, 398 696 interactions |
| Scoring schema | Annotation-based | Annotation-based, method-based, combined |
| Score coverage | Interactions from high-throughput experiments | All interactions |
| Manual curation | No | Yes |
| Data visualization | Static network layout | Flexible network layout |
| Reference mapping | None | UniProt IDs mapped to Entrez and Ensembl IDs |
| Data download | Entire dataset only | Entire dataset or for a particular protein |
Figure 1. HitPredict database content in all updates from 2005 to 2015.
Figure 2. HitPredict interaction data assembly and curation (orange boxes indicate manual curation). PPIs: protein-protein interactions.
Figure 3. HitPredict experimental information integration and curation. This flowchart shows the process used to combine experimental information from all the source databases for all interactions (orange boxes indicate manual curation). PPIs: protein-protein interactions.
Figure 4. Distribution of physical protein–protein interactions in HitPredict by species.
Figure 5. Number of publications supporting the protein–protein interactions in HitPredict.
Figure 6. Evaluation and comparison of the HitPredict annotation, method and combined interaction scores with the MINT score in mentha.