Paolo Marcatili, Giovanni Bussotti, Anna Tramontano.
Abstract
BACKGROUND: Protein-protein interactions underlie most cellular processes and are crucial for many biotechnological applications. In the last few years, high-throughput technologies have produced several large-scale protein-protein interaction data sets for various organisms. It is important to develop tools that dissect their content and analyse the information they embed through data integration and computational methods.
Year: 2008 PMID: 18387199 PMCID: PMC2323660 DOI: 10.1186/1471-2105-9-S2-S11
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Snapshot of the input page of the MoVIN server. The user can upload a protein-protein interaction map in any of the accepted formats (tab- or comma-separated), with or without merging it with existing datasets. The minimum and maximum cluster size and the threshold E-values for MEME and MAST can also be selected.
Dataset Summary. The number of interactions in each dataset ranges from 942 (Uetz) to 51,086 (BioGRID); the number of clusters containing more than 4 proteins ranges from 100 (Uetz) to 3,963 (BioGRID). The average number of proteins in each cluster ranges from 6.15 (Uetz) to 24.96 (BioGRID).
| Dataset | # of interactions | # of clusters | average cluster size |
| BIND | 8847 | 1271 | 9.60 |
| BioGRID | 51086 | 3963 | 24.96 |
| Gavin02 | 3500 | 522 | 9.78 |
| Gavin06 | 19973 | 1738 | 20.94 |
| Krogan | 6699 | 1042 | 10.44 |
| Uetz | 942 | 100 | 6.15 |
Figure 2. Motif P value distributions for yeast datasets. We report the number of interactions with a Motif P value lower than the threshold (shown on the x axis in logarithmic scale) for the experimental datasets (blue) and for their randomized versions (red). Interactions for which no motif was found are reported as bars at the origin. The Motif P value distribution for the experimental datasets contains a larger fraction of interactions than that of the random datasets and is shifted towards lower P values. (A) BIND, (B) BioGRID, (C) Gavin06 and (D) Krogan datasets.
Figure 3. Alignment of the motif present in 5 out of the 7 proteins binding to YKL074C. The first protein (YBR172C) is annotated as “Cytoskeleton organization and biogenesis”; the second (YPL105C) does not have any GO annotation. The last three (YML046W, YLR117C and YLR357W) are annotated with terms related to spliceosome activity.
Statistical Significance of the Motif P value Distributions. The table reports the Z-score of the mean of the experimental distribution with respect to the random distribution of the means. The latter was obtained by computing the means of 100,000 distributions, each of the same size as the experimental one, built by randomly extracting interactions from the original and the randomized distributions.
| BIND | Motif | Process | Component | Domain |
| 33.17 | 127.70 | 26.46 | 82.85 | 32.31 |
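The resampling procedure described in the caption above can be illustrated with a minimal sketch. It is not the authors' code: the function name `resampled_mean_z_score`, the sampling without replacement from the pooled values, and the sign convention of the Z-score are all assumptions made for illustration, and the default number of resamples is kept far below the 100,000 reported in the source so that the example runs quickly.

```python
import random
import statistics


def resampled_mean_z_score(experimental, randomized, n_resamples=1000, seed=0):
    """Z-score of the experimental mean against a null distribution of means.

    Hypothetical helper: the null distribution is built by drawing samples,
    each the size of the experimental set, from the pooled experimental and
    randomized values (sampling without replacement is an assumption).
    """
    rng = random.Random(seed)
    pooled = list(experimental) + list(randomized)
    size = len(experimental)

    # Mean of each resampled set of the same size as the experimental one.
    null_means = [
        statistics.fmean(rng.sample(pooled, size)) for _ in range(n_resamples)
    ]

    mu = statistics.fmean(null_means)
    sigma = statistics.stdev(null_means)
    return (statistics.fmean(experimental) - mu) / sigma


# Toy log10 P values (invented for illustration, not taken from the paper).
experimental = [-5.0, -4.0, -6.0, -5.5, -4.5] * 4
randomized = [-1.0, -0.5, -1.5, -2.0, -0.8] * 4
z = resampled_mean_z_score(experimental, randomized)
print(f"Z-score: {z:.2f}")
```

With this sign convention, an experimental distribution shifted towards lower P values gives a negative Z-score; the table may report magnitudes or use the opposite convention.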
Correlation coefficients between Motif, Process, Component and Domain P values
| BIND | Motif | Process | Component |
| Motif | 1.00 | 0.30 | 0.26 |
| Process | 0.30 | 1.00 | 0.80 |
| Component | 0.26 | 0.80 | 1.00 |
| Domain | 0.57 | 0.40 | 0.37 |
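A table of pairwise correlation coefficients like the one above could be computed along the following lines. This is a sketch under assumptions: the source does not say which correlation coefficient was used, so Pearson correlation is assumed here, and the helper names `pearson` and `correlation_matrix` and the toy values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict mapping name -> list of values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Toy per-interaction scores (invented for illustration, not from the paper).
scores = {
    "Motif": [0.1, 0.4, 0.2, 0.9, 0.7],
    "Process": [0.2, 0.3, 0.1, 0.8, 0.9],
    "Component": [0.3, 0.2, 0.2, 0.7, 0.8],
}
matrix = correlation_matrix(scores)
for name, row in matrix.items():
    print(name, " ".join(f"{value:.2f}" for value in row.values()))
```

The matrix is symmetric with ones on the diagonal, which is consistent with the source table omitting the Domain column.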