Maria Persico, Arnaud Ceol, Caius Gavrila, Robert Hoffmann, Arnaldo Florio, Gianni Cesareni.
Abstract
BACKGROUND: The application of high throughput approaches to the identification of protein interactions has offered for the first time a glimpse of the global interactome of some model organisms. Until now, however, such genome-wide approaches have not been applied to the human proteome.
Year: 2005 PMID: 16351748 PMCID: PMC1866386 DOI: 10.1186/1471-2105-6-S4-S21
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. HomoMINT as a web tool. HomoMINT can be searched and analyzed with the tools developed for MINT. A) A search can be carried out in the protein table by entering one of the following in the form: a protein name, a UniProt or PDB identifier, a keyword, an InterPro domain or a Gene Ontology term (top part of the form). Alternatively, the search can be carried out on the interaction table (centre). Finally (lower part), a BLAST search can be carried out by entering a protein sequence. B) Search output listing on the right the partners of the query protein and on the left the experimental evidence supporting the interactions. C) The Mint Viewer is an applet that permits the graphic display of interaction networks. Edges marked by small blue circles indicate that the corresponding interactions were inferred from experiments carried out in model organisms, while yellow circles mark interactions supported by direct experimental results. Interactions that are inferred from model organisms but are also supported by direct experiments are marked by yellow circles with a blue contour. A series of check boxes makes it possible to visualize interactions inferred from any combination of model organism interactomes.
Intersection of human interactomes in public databases
| Database | Nr. of edges | MINT | DIP | BIND | Intact | React. | HPRD | MIPS |
| MINT | 3679 | x | 315 | 340 | 1350 | 101 | 429 | 54 |
| DIP | 990 | | x | 158 | 22 | 67 | 209 | 26 |
| BIND | 4671 | | | x | 356 | 229 | 733 | 50 |
| Intact | 2860 | | | | x | 103 | 208 | 16 |
| Reactome | 15068 | | | | | x | 269 | 16 |
| HPRD | 6891 | | | | | | x | 84 |
| MIPS | 777 | | | | | | | x |
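A pairwise intersection table of this kind can be sketched in a few lines. The following is a minimal illustration, not the procedure used by the authors: it assumes each database has already been reduced to a list of protein-identifier pairs in a common namespace, and the database names and identifiers in the toy data are hypothetical.

```python
from itertools import combinations


def normalize(edges):
    """Treat each interaction as an unordered pair, so (A, B) equals (B, A)."""
    return {frozenset(pair) for pair in edges}


def pairwise_intersections(databases):
    """Return {(db_a, db_b): number of interactions present in both}."""
    edge_sets = {name: normalize(edges) for name, edges in databases.items()}
    return {
        (a, b): len(edge_sets[a] & edge_sets[b])
        for a, b in combinations(edge_sets, 2)
    }


# Toy data: hypothetical database names and protein identifiers.
toy = {
    "DB_A": [("P1", "P2"), ("P2", "P3"), ("P4", "P4")],
    "DB_B": [("P2", "P1"), ("P3", "P5")],
    "DB_C": [("P3", "P2"), ("P1", "P2")],
}
shared = pairwise_intersections(toy)
for (a, b), count in shared.items():
    print(f"{a} / {b}: {count} shared interactions")
```

Storing each pair as a `frozenset` makes the comparison insensitive to the order in which the two partners are listed, which matters when databases record the same interaction in opposite orientations.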
Inferred and experimental networks compared in this work
| Dataset | Number of interactions | Description or reference |
| OPHID | 23359 | [15] |
| Sanger | 37007 | [13] |
| Sanger H.C. | 5647 | [13] |
| HomoMINT | 9749 | This work |
| HomoMINT_filtered | 5203 | HomoMINT filtered for domain architecture conservation. |
| HMINT_2 int | 290 | Inferred from interactions confirmed by at least two experiments. |
| HMINT_2 org | 126 | Inferred from interactions supported by experiments in at least two model organisms. |
| HM_LT | 543 | Inferred from interactions discovered by low throughput experiments. |
| HEN | 28531 | Compilation of interactions between human proteins |
| iHOP | 278452 | [28] |
Overlap between inferred and experimental human networks
| Dataset | Nr. of interactions | OPHID | Sanger | HEN | % overlap | LLR |
| | | 23359 | 37007 | 28531 | | |
| HomoMINT | 9749 | 3501 | 2794 | 694 | 7.1 | 4.2 |
| OPHID | 23359 | | 7067 | 1632 | 7.0 | 4.1 |
| Sanger | 37007 | | | 1504 | 4.1 | 3.6 |
| Sanger H.C. | 5647 | | | 841 | 14.9 | 5.0 |
| HM_filtered | 5203 | 1818 | 1391 | 453 | 8.7 | 4.4 |
| HMINT_2int | 810 | 290 | 227 | 218 | 26.9 | 5.7 |
| HMINT_2org | 126 | 70 | 75 | 60 | 47.6 | 6.6 |
| HM_LT | 543 | 69 | 63 | 131 | 24.1 | 5.6 |
For this comparison we mapped all the proteins to Uniprot ids. In this process proteins (and their interactions) that could not be confidently mapped were eliminated from the networks.
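The "% overlap" column is not defined in the text reproduced here. Judging from the numbers alone, it appears to be the share of each network's interactions that is also found in HEN. The short check below recomputes that ratio from the table under this assumption; it does not attempt to reproduce the LLR column, whose definition is not given here.

```python
def percent_overlap(shared_with_hen, network_size):
    """Share of a network's interactions also present in HEN, in percent."""
    return round(100.0 * shared_with_hen / network_size, 1)


# (network size, interactions shared with HEN), copied from the table above.
rows = {
    "HomoMINT": (9749, 694),
    "OPHID": (23359, 1632),
    "Sanger": (37007, 1504),
    "Sanger H.C.": (5647, 841),
    "HM_filtered": (5203, 453),
    "HMINT_2int": (810, 218),
    "HMINT_2org": (126, 60),
    "HM_LT": (543, 131),
}

recomputed = {
    name: percent_overlap(shared, size) for name, (size, shared) in rows.items()
}
for name, value in recomputed.items():
    print(f"{name}: {value}%")
```

Under this reading, every recomputed value matches the reported column to one decimal place.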
Overlap of the inferred and experimental human networks with iHOP
| Network | Nr. of edges | H_MINT | H_MINT ctrl. | Sanger | Sanger ctrl. | OPHID | OPHID ctrl. | HEN | HEN ctrl. |
| Nr. of edges | | 7658 | 7658 | 26590 | 26590 | 12887 | 12887 | 23332 | 23332 |
| iHOP* (sentence) | 278,452 | 522 (6.8) | 57 (0.7) | 857 (3.2) | 233 (0.8) | 941 (7.3) | 88 (0.7) | 5293 (22.6) | 615 (2.7) |
| iHOP (pattern) | 47,807 | 229 (3) | 9 (0.1) | 254 (1) | 53 (0.2) | 468 (3.6) | 14 (0.1) | 2675 (11.5) | 176 (0.7) |
*The iHOP (sentence) network includes interactions between proteins whose names are found in the same sentence in an abstract. iHOP (pattern) is a subnetwork linking proteins found in a pattern of type gene_name_A/verb/gene_name_B. The networks that are compared with iHOP are described in the main text. The corresponding 'ctrl' networks are scrambled networks containing the same nodes and the same number of edges. For this comparison we mapped all the proteins to Locus Link ids. In this process proteins (and their interactions) that could not be confidently mapped were eliminated from the networks.
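The control comparison described in the footnote can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' implementation: the scrambled network is built by drawing random pairs of distinct nodes until the original number of edges is reached (the text does not say whether node degrees were preserved), self-interactions are ignored, and the function names and toy identifiers are hypothetical.

```python
import random


def scramble(edges, seed=0):
    """Random network on the same nodes with the same number of distinct edges.

    Assumption: node pairs are drawn uniformly; degrees are not preserved.
    """
    rng = random.Random(seed)
    real = {frozenset(e) for e in edges if e[0] != e[1]}
    nodes = sorted({node for edge in real for node in edge})
    scrambled = set()
    while len(scrambled) < len(real):
        scrambled.add(frozenset(rng.sample(nodes, 2)))
    return scrambled


def overlap(network, reference):
    """Return (shared edge count, percent of `network` found in `reference`)."""
    net = {frozenset(e) for e in network}
    ref = {frozenset(e) for e in reference}
    shared = len(net & ref)
    percent = 100.0 * shared / len(net) if net else 0.0
    return shared, percent


# Toy data with hypothetical identifiers.
inferred = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("a", "e")]
literature = [("b", "a"), ("c", "d"), ("x", "y")]

control = scramble(inferred, seed=1)
print("inferred vs literature:", overlap(inferred, literature))
print("control  vs literature:", overlap(control, literature))
```

Comparing the real network and its scrambled counterpart against the same reference gives a baseline for the overlap expected by chance, which is the role the 'ctrl' columns play in the table.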
Figure 2. Degree of common annotation in interacting protein pairs in experimental and inferred networks. A) Schematic representation of the algorithm used to evaluate the relatedness of Gene Ontology annotation. The Gene Ontology graph induced by protein 'i' is in green, while the one induced by protein 'j' is in blue. Dij is the number of edges that the two induced graphs have in common. B) For each network we derived a 'scrambled network' containing the same protein nodes linked by the same number of edges, with their connections rearranged at random. For each interacting protein pair in which both proteins have a GO annotation, we then calculated Dij. Finally, we plotted, as a function of Dij, the difference between the percentage of nodes having a specific Dij in the inferred and in the scrambled network.
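The Dij measure in panel A can be sketched as below. The caption does not spell out what "induced" means, so this example assumes that the graph induced by a protein consists of its annotated GO terms, all of their ancestors, and the child-to-parent edges connecting them. The tiny ontology, the term names and the function names are hypothetical.

```python
def induced_edges(terms, parents):
    """Collect all (child, parent) edges on the paths from `terms` to the root."""
    edges, seen, stack = set(), set(), list(terms)
    while stack:
        term = stack.pop()
        if term in seen:
            continue
        seen.add(term)
        for parent in parents.get(term, ()):
            edges.add((term, parent))
            stack.append(parent)
    return edges


def d_ij(terms_i, terms_j, parents):
    """Number of ontology edges shared by the two induced graphs."""
    return len(induced_edges(terms_i, parents) & induced_edges(terms_j, parents))


# Hypothetical mini-ontology: each term maps to its parent terms.
parents = {
    "A": ["root"],
    "B": ["root"],
    "C": ["A"],
    "D": ["A"],
    "E": ["B"],
}

print(d_ij(["C"], ["D"], parents))  # share only the A -> root edge
print(d_ij(["C"], ["E"], parents))  # different branches, nothing shared
print(d_ij(["C"], ["C"], parents))  # identical annotation
```

Pairs annotated in the same branch share the edges leading to the root, so Dij grows with the depth of the deepest common part of the two annotations.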
Figure 3. Degree distribution of the HomoMINT network compared with different biological networks. Frequency of nodes with k links for A) the model organism experimental networks in the MINT database; B) the assembled Human experimental network (HEN), the Human inferred (HomoMINT) data set and the Mammalian data set in MINT; and C) a random network of similar size and a scale-free network assembled according to Barabasi [31].
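The quantity plotted in Figure 3, the frequency of nodes with k links, can be computed from an edge list as in the following sketch. It assumes undirected interactions, counts duplicate pairs once and ignores self-interactions; the function name and toy identifiers are hypothetical.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {k: fraction of nodes with exactly k links}."""
    unique = {frozenset(e) for e in edges if e[0] != e[1]}
    degree = Counter(node for edge in unique for node in edge)
    counts = Counter(degree.values())
    n_nodes = len(degree)
    return {k: counts[k] / n_nodes for k in sorted(counts)}


# Toy network with hypothetical identifiers; ("b", "a") duplicates ("a", "b").
toy_edges = [("a", "b"), ("b", "a"), ("a", "c"), ("a", "d"), ("b", "c")]
distribution = degree_distribution(toy_edges)
print(distribution)
```

Plotting these frequencies against k on log-log axes is the usual way to compare a network with random and scale-free references.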
Graph analysis
| Data set | Nodes (N) | Edges* (L) | Clust. coeff.¹ | MPL² | <k>³ | d_LCC⁴ |
| HomoMINT | 4067 | 9132 | 0.04 | 4.9 | 4.73 | 12 |
| HEN | 4933 | 22124 | 0.16 | 4.5 | 9.4 | 15 |
| C. elegans | 2834 | 4406 | 0.02 | 4.8 | 3.2 | 13 |
| D. melanogaster | 7005 | 20282 | 0.01 | 4.4 | 5.8 | 11 |
| S. cerevisiae | 4584 | 12055 | 0.07 | 4.4 | 5.3 | 12 |
| Random2000 # | 1989 | 5047 | 0.002 | 4.8 | 5.0 | 11 |
| Random5000 # | 4893 | 9935 | 0.001 | 6.2 | 4.0 | 13 |
*Number of edges may be different from those reported in Table 2 because in this analysis we have neglected interactions leading to homodimerization.
#Random2000 and Random5000 are random networks with approximately 2000 and 5000 nodes.
¹ Average of the clustering coefficient of the nodes in the network.
² MPL is the average of the minimal path length between two nodes of the graph.
³ <k> is the average number of links (k) per node.
⁴ d_LCC is the diameter of the largest connected component of the graph.
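The quantities in this table can be computed from an edge list with plain breadth-first search. The sketch below is an illustration, not the software used for the paper, and it rests on several assumptions: self-interactions are dropped (as the homodimerization footnote suggests), nodes with fewer than two neighbours contribute a clustering coefficient of 0, the mean path length is averaged over connected node pairs only, and the average degree is taken as 2L/N. All function names and toy identifiers are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Undirected adjacency map; self-interactions (homodimers) are dropped."""
    adj = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def bfs_distances(adj, source):
    """Shortest path length from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def summarize(edges):
    adj = build_graph(edges)
    n_nodes = len(adj)
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2

    # Local clustering: fraction of neighbour pairs that are themselves linked.
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)  # assumption: low-degree nodes count as 0
            continue
        linked = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        coeffs.append(2.0 * linked / (k * (k - 1)))

    # All-pairs shortest paths; adequate for networks of a few thousand nodes.
    dist = {s: bfs_distances(adj, s) for s in adj}
    lengths = [d for s in dist for t, d in dist[s].items() if t != s]
    largest = max((set(d) for d in dist.values()), key=len)

    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "clustering": sum(coeffs) / n_nodes,
        "mpl": sum(lengths) / len(lengths),
        "avg_degree": 2.0 * n_edges / n_nodes,
        "d_lcc": max(dist[s][t] for s in largest for t in largest),
    }


# Toy network: a triangle with a tail, a separate pair, and one homodimer.
toy_network = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
               ("x", "y"), ("e", "e")]
stats = summarize(toy_network)
print(stats)
```

Different conventions for isolated or low-degree nodes and for disconnected pairs change the averages slightly, so values computed this way need not match a published table exactly.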