J M Villaveces, R C Jiménez, P Porras, N Del-Toro, M Duesbury, M Dumousseau, S Orchard, H Choi, P Ping, N C Zong, M Askenazi, B H Habermann, Henning Hermjakob.
Abstract
The evidence that two molecules interact in a living cell is often inferred from multiple different experiments. Experimental data are captured in multiple repositories, but there is no simple way to assess the evidence of an interaction occurring in a cellular environment. Merging and scoring of data are commonly required operations after querying for the details of specific molecular interactions, to remove redundancy and assess the strength of the accompanying experimental evidence. We have developed both a merging algorithm and a scoring system for molecular interactions based on the Proteomics Standards Initiative molecular interaction (PSI-MI) standards. In this manuscript, we introduce these two algorithms, provide community access to the tool suite, describe examples of how these tools can be used to selectively present molecular interaction data, and demonstrate a case where the algorithms were successfully used to identify a systematic error in an existing dataset.
Year: 2015 PMID: 25652942 PMCID: PMC4316181 DOI: 10.1093/database/bau131
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Schematic of the merging of interactions between molecules M1–M3, described in publications P1–P3 by interaction detection methods D1–D3 and with interaction types T1 and T2.
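The merging step sketched in Figure 1 can be pictured as grouping evidence records by unordered molecule pair and collecting the distinct publications, detection methods and interaction types reported for each pair. The following is a minimal Python sketch of that idea only; the `Evidence` record and the `merge_evidence` function are hypothetical names chosen for illustration and are not the API of the published tool suite, and counting every input record as one evidence is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One binary interaction evidence (hypothetical record layout)."""
    molecule_a: str
    molecule_b: str
    publication: str
    method: str
    interaction_type: str


def merge_evidence(evidences):
    """Group evidences by unordered molecule pair and collect distinct annotations."""
    merged = defaultdict(lambda: {"publications": set(), "methods": set(),
                                  "types": set(), "evidence_count": 0})
    for ev in evidences:
        # A-B and B-A describe the same interaction, so sort the pair.
        pair = tuple(sorted((ev.molecule_a, ev.molecule_b)))
        entry = merged[pair]
        entry["publications"].add(ev.publication)
        entry["methods"].add(ev.method)
        entry["types"].add(ev.interaction_type)
        # Assumption: every input record counts as one evidence.
        entry["evidence_count"] += 1
    return dict(merged)


evidences = [
    Evidence("M1", "M2", "P1", "D1", "T1"),
    Evidence("M2", "M1", "P2", "D2", "T1"),
    Evidence("M2", "M3", "P3", "D3", "T2"),
]
merged = merge_evidence(evidences)
```

Here the two records for M1 and M2 collapse into a single merged interaction that keeps both publications and both detection methods, which is the kind of non-redundant summary a scoring step would then consume.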
Table 1. Merging and scoring evidences of the interaction between AKTIP_HUMAN and HOOK2_HUMAN
| PSICQUIC service | Interaction evidences | Publications | Interaction types | Detection methods | MIscore |
|---|---|---|---|---|---|
| STRING | 3 | 1* | – | Experimental interaction detection; Inferred by curator; Predictive text mining | 0.20 |
| VirHostNet | 1 | 1 | Physical association | Two hybrid | 0.37 |
| Spike | 1 | 1 | Direct interaction | Coimmunoprecipitation | 0.44 |
| IntAct | 2 | 2 | Physical association | Two hybrid pooling approach; Two hybrid fragment pooling approach | 0.35 |
| APID | 1 | 1 | Association | Two hybrid pooling approach | 0.31 |
| Mentha | 7 | 3 | Physical association; Direct interaction | Affinity chromatography technology; Two hybrid; Two hybrid pooling approach; Two hybrid fragment pooling approach | 0.76 |
| Spike, IntAct, VirHostNet | 4 | 2 | Direct interaction; Physical association | Two hybrid; Coimmunoprecipitation; Two hybrid pooling approach; Two hybrid fragment pooling approach | 0.68 |
| APID, Mentha, Spike, IntAct, VirHostNet | 12 | 3 | Direct interaction; Physical association; Association | Two hybrid; Coimmunoprecipitation; Two hybrid pooling approach; Two hybrid fragment pooling approach; Affinity chromatography technology | 0.81 |
| Spike, IntAct, VirHostNet, APID, Mentha, STRING | 15 | 3 | Direct interaction; Physical association; Association; – | Two hybrid; Experimental interaction detection; Inferred by curator; Predictive text mining; Coimmunoprecipitation; Affinity chromatography technology; Two hybrid pooling approach; Two hybrid fragment pooling approach | 0.81 |
MIQL query: “identifier:(Q9H8T0) AND identifier:(Q96ED9)”. *Predicted data from STRING have no publications assigned, so the publication count given here covers only the experimentally derived data, which are imported from other databases.
Figure 2. MIscore calculates a normalized composite score for an interaction based on the number of publications reporting the interaction, the reported interaction detection methods and the reported interaction types.
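One way to picture a normalized composite score of this kind is as a weighted mean of three sub-scores (publications, detection methods, interaction types), each confined to the range 0 to 1. The Python sketch below illustrates that idea under stated assumptions: the function names, the logarithmic saturation of the publication sub-score, the default saturation point and the equal default weights are all illustrative choices, not values or formulas taken from the text above.

```python
import math


def publication_score(n_publications, saturation=7):
    """Illustrative saturating score in [0, 1]; reaches 1 at `saturation` publications.

    The logarithmic form and the default saturation point are assumptions
    made for this sketch.
    """
    if n_publications <= 0:
        return 0.0
    return min(1.0, math.log(n_publications + 1, saturation + 1))


def composite_score(pub_score, method_score, type_score, weights=(1.0, 1.0, 1.0)):
    """Weighted mean of three sub-scores, each expected in [0, 1]."""
    scores = (pub_score, method_score, type_score)
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("sub-scores must lie in [0, 1]")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Dividing by the sum of weights keeps the result in [0, 1].
    return sum(w * s for w, s in zip(weights, scores)) / total_weight
```

Because the result is divided by the sum of the weights, raising one weight shifts the emphasis between publications, methods and types without pushing the composite score outside the 0 to 1 range.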
Figure 3. True-positive rate vs. false-positive rate of MIscore and Mentha for different score cutoffs.
Table 2. Performance measures used to evaluate MIscore and Mentha scores
| Score | Accuracy | Precision | Recall | MCC | Cutoff |
|---|---|---|---|---|---|
| | 0.755 | 0.701 | 0.978 | 0.541 | 0.485 |
| | 0.673 | 0.660 | 0.854 | 0.474 | 0.343 |
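The measures in this table, and the true-positive and false-positive rates plotted against score cutoffs, all follow from a confusion matrix built at a chosen cutoff. The Python sketch below uses the standard definitions of accuracy, precision, recall and the Matthews correlation coefficient (MCC); the function names are hypothetical, and treating a score equal to the cutoff as a positive call is an assumption of the sketch.

```python
import math


def confusion_at_cutoff(scores, labels, cutoff):
    """Count TP, FP, TN, FN when scores >= cutoff are predicted positive.

    Treating the cutoff as inclusive is an assumption of this sketch.
    """
    tp = fp = tn = fn = 0
    for score, is_true in zip(scores, labels):
        predicted = score >= cutoff
        if predicted and is_true:
            tp += 1
        elif predicted:
            fp += 1
        elif is_true:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Standard measures derived from a 2x2 confusion matrix."""
    def ratio(num, den):
        # Return 0.0 instead of dividing by zero for degenerate matrices.
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),  # also the true-positive rate
        "fpr": ratio(fp, fp + tn),     # false-positive rate
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }
```

Sweeping `cutoff` over the observed score range and recording `recall` against `fpr` at each step yields the points of a curve like the one described for Figure 3.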
Figure 4. MImerge results for DIP, IntAct and MINT. Only 1.54% of the interactions are shared among all three databases, 10.86% are shared between two databases and 87.6% are found in a single database only.
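Percentages of this kind can be obtained by counting, for every distinct interaction, how many databases report it. The Python sketch below shows one way to do that; `sharing_profile` is a hypothetical helper, and it assumes each database has already been reduced to a set of comparable interaction keys (for example, sorted pairs of identifiers).

```python
from collections import Counter


def sharing_profile(databases):
    """Return {k: percentage of distinct interactions found in exactly k databases}.

    `databases` maps a database name to a set of interaction keys.
    """
    occurrences = Counter()
    for interactions in databases.values():
        # Each set contributes at most one count per interaction.
        occurrences.update(interactions)
    total = len(occurrences)
    if total == 0:
        return {k: 0.0 for k in range(1, len(databases) + 1)}
    per_k = Counter(occurrences.values())
    return {k: 100.0 * per_k.get(k, 0) / total
            for k in range(1, len(databases) + 1)}


# Toy data, not taken from DIP, IntAct or MINT.
toy = {
    "A": {("p1", "p2"), ("p1", "p3"), ("p2", "p3")},
    "B": {("p1", "p2"), ("p1", "p3")},
    "C": {("p1", "p2"), ("p4", "p5")},
}
profile = sharing_profile(toy)
```

In the toy data, one of four distinct interactions appears in all three sets, one in two sets and two in a single set, so the percentages sum to 100 just as the three figures in the caption do.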
Table 3. Merging and scoring evidences of interaction databases in PSICQUIC: number of interactions per MIscore interval
| PSICQUIC service | Category | >0–0.1 | >0.1–0.2 | >0.2–0.3 | >0.3–0.4 | >0.4–0.5 | >0.5–0.6 | >0.6–0.7 | >0.7–0.8 | >0.8–0.9 | >0.9–1 | Total | Redundancy % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| APID | I | 0 | 0 | 26 821 | 259 405 | 21 110 | 5480 | 4948 | 2406 | 1781 | 628 | 322 579 | 22.48 |
| iRefIndex | I | 277 | 15 703 | 67 538 | 163 034 | 39 933 | 16 547 | 7566 | 3761 | 2659 | 1817 | 318 835 | 51.58 |
| Mentha | I | 0 | 0 | 49 645 | 274 628 | 76 109 | 22 851 | 11 860 | 5074 | 2751 | 1847 | 444 765 | 37.31 |
| BIND | IC | 0 | 3590 | 85 153 | 20 947 | 4891 | 1481 | 520 | 196 | 87 | 24 | 116 889 | 39.41 |
| BindingDB | IC | 0 | 0 | 957 | 67 369 | 267 | 4440 | 1052 | 477 | 429 | 502 | 75 493 | 26.10 |
| BioGrid | IC | 0 | 0 | 271 005 | 173 966 | 21 372 | 19 477 | 8107 | 5194 | 4805 | 2679 | 506 605 | 31.61 |
| ChEMBL | IC | 0 | 0 | 30 020 | 437 441 | 1957 | 22 259 | 4005 | 1538 | 1205 | 1136 | 499 561 | 20.52 |
| HPIDb | IC | 0 | 0 | 15 | 712 | 109 | 52 | 15 | 6 | 1 | 0 | 910 | 37.20 |
| InnateDB | IC | 0 | 1 | 803 | 12 234 | 980 | 926 | 350 | 170 | 126 | 62 | 15 652 | 36.54 |
| Spike | IC | 0 | 1 | 18 923 | 15 399 | 610 | 145 | 0 | 0 | 0 | 0 | 35 078 | 3.23 |
| TopFind | IC | 1437 | 3334 | 178 | 5 | 2 | 0 | 0 | 0 | 0 | 0 | 4956 | 48.06 |
| VirHostNet | IC | 0 | 0 | 707 | 8629 | 953 | 355 | 129 | 56 | 29 | 14 | 10 872 | 21.26 |
| Reactome | IC | 0 | 0 | 0 | 141 996 | 0 | 0 | 0 | 0 | 0 | 0 | 141 996 | 0.00 |
| bhf-ucl | IM | 0 | 0 | 16 | 278 | 60 | 31 | 7 | 0 | 0 | 0 | 392 | 45.02 |
| DIP | IM | 0 | 0 | 42 864 | 34 785 | 6817 | 1496 | 426 | 224 | 67 | 2 | 86 681 | 19.46 |
| I2D-IMEx | IM | 0 | 0 | 61 | 308 | 105 | 52 | 3 | 0 | 0 | 0 | 529 | 52.43 |
| InnateDB-IMEx | IM | 0 | 0 | 14 | 274 | 56 | 24 | 3 | 0 | 0 | 0 | 371 | 45.44 |
| IntAct | IM | 0 | 8 | 4946 | 220 853 | 11 302 | 4713 | 1957 | 492 | 147 | 29 | 244 447 | 20.67 |
| MatrixDB | IM | 0 | 0 | 326 | 129 | 64 | 22 | 4 | 2 | 0 | 0 | 547 | 35.27 |
| MBInfo | IM | 0 | 0 | 33 | 272 | 58 | 31 | 9 | 0 | 0 | 0 | 403 | 36.83 |
| MINT | IM | 0 | 0 | 3620 | 52 217 | 3739 | 3010 | 936 | 349 | 92 | 19 | 63 982 | 46.87 |
| MolCon | IM | 0 | 0 | 12 | 230 | 52 | 6 | 1 | 0 | 0 | 0 | 301 | 39.19 |
| MPIDB | IM | 0 | 0 | 93 | 723 | 160 | 70 | 19 | 4 | 0 | 0 | 1069 | 39.23 |
| UniProt | IM | 0 | 0 | 247 | 4316 | 1064 | 552 | 180 | 32 | 14 | 0 | 6405 | 45.46 |
| BAR | P | 29 | 6765 | 74 013 | 23 442 | 23 | 0 | 0 | 0 | 0 | 0 | 104 272 | 0.53 |
| Interoporc | P | 0 | 0 | 208 558 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 208 558 | 0.00 |
| Reactome-FIs | P | 0 | 0 | 209 988 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 209 988 | 0.00 |
| STRING | P | 0 | 16 335 859 | 4 110 825 | 373 086 | 31 888 | 2353 | 64 | 1 | 0 | 0 | 20 854 076 | 19.93 |
Databases have been grouped into four categories based on the type of evidence provided: imported (I), internally curated (IC), IMEx curated (IM) and predicted (P). The scoring was performed with parameter values reflecting the IntAct ethos.
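The interval columns of this table are half-open bins of width 0.1 that exclude the lower bound and include the upper bound. The Python sketch below shows one way to count scores into such bins; `interval_counts` is a hypothetical helper, and silently skipping scores outside the range above 0 and up to 1 is an assumption of the sketch. It does not attempt to reproduce the redundancy column, whose definition is not given here.

```python
from bisect import bisect_left

# Upper bounds of the ten intervals: 0.1, 0.2, ..., 1.0.
UPPER_BOUNDS = [i / 10 for i in range(1, 11)]


def interval_counts(scores):
    """Count scores per interval (lower, upper], e.g. 0.4 falls in >0.3-0.4."""
    counts = [0] * len(UPPER_BOUNDS)
    for score in scores:
        if not 0.0 < score <= 1.0:
            continue  # assumption: scores outside (0, 1] are ignored
        # bisect_left finds the first upper bound that is >= score.
        counts[bisect_left(UPPER_BOUNDS, score)] += 1
    return counts


counts = interval_counts([0.35, 0.4, 0.41, 0.81, 1.0, 0.0])
```

Using `bisect_left` against the upper bounds places a score that sits exactly on a boundary, such as 0.4, into the lower interval, which matches the `>0.3–0.4` style of the column headers.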
Figure 5. Proportional MIscore distribution for the molecular interaction databases in Table 3. Databases have been grouped into four categories based on the type of evidence provided: imported (I), internally curated (IC), IMEx curated (IM) and predicted (P).
Figure 6. Distribution of IntAct MIscores for the pairwise interactions reported in Ref. 23. A clear and statistically significant difference in score distribution is evident between the 54% of the interactions that were correctly reported and the 46% that were effectively randomized. A Mood test for comparison of non-normally distributed samples was used to compare the two groups.