Zichen Wang, Neil R Clark, Avi Ma'ayan.
Abstract
BACKGROUND: Thousands of biological and biomedical investigators study the functional roles of single genes and their protein products in normal physiology and in disease. The findings from these studies are reported in research articles that stimulate new research. It is now established that a complex regulatory network controls human cellular fate, and this community of researchers is continually unraveling the topology of this network. Attempts to integrate this accumulated knowledge have resulted in literature-based protein-protein interaction networks (PPINs) and pathway databases. These databases are widely used by the community to analyze new data collected from emerging genome-wide studies, with the assumption that the data within these literature-based databases are the ground truth and contain no biases. While suspicion of research focus biases is growing, concrete proof of such biases is still missing. They are difficult to prove because the real PPINs are mostly unknown.
Year: 2015 PMID: 26048415 PMCID: PMC4456804 DOI: 10.1186/s12918-015-0173-z
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Mammalian PPIN resources
| PPI databases | PMID | Publication coverage | PPIs | Latest publication time |
|---|---|---|---|---|
| BIND | 12519993 | 10069 | 15895 | 2010 Aug. |
| BioCarta | NA | 1 | 189 | 1994 Jun. |
| BioGrid | 16381927 | 22277 | 131438 | 2013 Nov. |
| DIP | 10592249 | 491 | 873 | 2004 Feb. |
| Ewing et al. | 17353931 | 1 | 3585 | 2007 Jan. |
| HPRD | 14681466 | 18515 | 35433 | 2010 Aug. |
| InnateDB | 18766178 | 3028 | 6052 | 2011 Jun. |
| IntAct | 14681455 | 3300 | 54248 | 2013 Jun. |
| KEA | 19176546 | 6790 | 16193 | 2010 Jun. |
| KEGG | 18077471 | 1 | 7207 | 2000 Jan. |
| MINT | 17135203 | 1265 | 11750 | 2009 Oct. |
| MIPS | 14681354 | 170 | 323 | 2004 Jan. |
| PDZBase | 15513994 | 141 | 234 | 2003 Jul. |
| PPID | 21516116 | 1980 | 2904 | 2003 May |
| SNAVI | 16099987 | 1059 | 1156 | 2006 Jan. |
| Stelzl et al. | 16169070 | 1 | 1560 | 2005 Sep. |
| Rual et al. | 16189514 | 1 | 4225 | 2005 Oct. |
| Total | NA | 37015 | 185068 | 2013 Nov. |
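The Total row is smaller than the sum of the per-database rows, which is consistent with overlapping interactions being counted once when the resources are merged. The sketch below checks that arithmetic against the table and shows one way such a merge could work. It assumes each PPI is an unordered pair of protein identifiers; the `merge_ppis` helper and the toy identifiers `P1` to `P3` are hypothetical and not taken from the source.

```python
# PPI counts per database, copied from the table above.
PPI_COUNTS = {
    "BIND": 15895, "BioCarta": 189, "BioGrid": 131438, "DIP": 873,
    "Ewing et al.": 3585, "HPRD": 35433, "InnateDB": 6052, "IntAct": 54248,
    "KEA": 16193, "KEGG": 7207, "MINT": 11750, "MIPS": 323, "PDZBase": 234,
    "PPID": 2904, "SNAVI": 1156, "Stelzl et al.": 1560, "Rual et al.": 4225,
}
REPORTED_TOTAL = 185068  # "Total" row of the table


def merge_ppis(*sources):
    """Union several PPI lists, treating (a, b) and (b, a) as the same edge.

    Hypothetical helper: assumes interactions are undirected pairs.
    """
    merged = set()
    for source in sources:
        for a, b in source:
            merged.add(frozenset((a, b)))
    return merged


# Toy data (not real interactions): the two lists share one edge.
db_one = [("P1", "P2"), ("P2", "P3")]
db_two = [("P2", "P1"), ("P1", "P3")]
merged = merge_ppis(db_one, db_two)

row_sum = sum(PPI_COUNTS.values())
print(row_sum, REPORTED_TOTAL, len(merged))
```

The row sum exceeds the reported total, as expected if many interactions appear in more than one resource.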
Properties of the artificial network models
| Networks | Nodes | Edges | Clustering coefficient | Power-law exponent | Connected components |
|---|---|---|---|---|---|
| BA graph | 25000 | 649324 | 0.011 | 1.9 | 1 |
| BA cluster graph | 25000 | 649304 | 0.182 | 2 | 1 |
| Duplication-Divergence | 25000 | 655271 | 0 | 1.7 | 1 |
| Erdős-Rényi | 25000 | 650069 | 0.002 | NA | 1 |
| Complete graph | 1000 | 499500 | 1 | NA | 1 |
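Several values in this table can be cross-checked with closed-form expressions. The sketch below assumes a Barabási-Albert growth process in which each new node attaches with `m` edges to a seed of `m` nodes, giving `m * (n - m)` edges. The value `m = 26` is inferred from the edge count and is not stated in the source. For the Erdős-Rényi model, the expected clustering coefficient equals the edge probability.

```python
def complete_graph_edges(n: int) -> int:
    """Number of edges in a complete graph on n nodes."""
    return n * (n - 1) // 2


def ba_edges(n: int, m: int) -> int:
    """Edge count of a BA graph grown from m seed nodes, m edges per new node.

    Assumption: this growth rule is one common convention, not confirmed
    by the source.
    """
    return m * (n - m)


def er_edge_probability(n: int, edges: int) -> float:
    """Edge probability p of an Erdős-Rényi graph with the given edge count."""
    return edges / complete_graph_edges(n)


complete_edges = complete_graph_edges(1000)   # table lists 499500
ba_edge_count = ba_edges(25000, 26)           # table lists 649324
er_p = er_edge_probability(25000, 650069)     # table lists clustering 0.002

print(complete_edges, ba_edge_count, round(er_p, 3))
```

Under these assumptions, the complete-graph and BA edge counts reproduce the table exactly, and the Erdős-Rényi edge probability rounds to the listed clustering coefficient.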
Fig. 1. Discovery of the mammalian and yeast LC-PPINs over time. Accumulation of discovered proteins (dotted line) and their interactions (solid line) and the discovery rate of interactions and proteins in the mammalian (a, b) and yeast (c, d) literature-based PPINs. The accumulation of discovered proteins (red dots) and their interactions (blue dots) is plotted with respect to the ranking index of time for mammalian (e) and yeast (f) PPINs.
Fig. 2. The dynamics of individual proteins in the discovery of mammalian and yeast LC- and combined PPINs. (a-b) The distribution of growth exponents of the degrees of individual proteins; super-linear growth corresponds to an acceleration in the rate of discovery of PPIs involving the protein in question. (c-d) The normalized entropy plotted against the mean degree, for the actual PPI discovery process in the real network and for reshuffled versions. (e-h) The distribution of time intervals between PPI discoveries involving each protein, for the real PPI discovery process and for randomly reshuffled data, in LC-PPINs (e-f) and in combined PPINs made from both high-content and low-content studies (g-h).
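The two per-protein quantities in this caption can be illustrated with a short sketch. This excerpt does not give the exact definitions, so both functions below are assumptions: the growth exponent is taken as the least-squares slope of log degree against log time (so that degree grows roughly as time raised to that exponent), and the normalized entropy is taken as the Shannon entropy of a protein's PPI discoveries across time bins, divided by its maximum possible value.

```python
import math


def growth_exponent(times, degrees):
    """Least-squares slope of log(degree) versus log(time).

    Assumed definition. Requires positive times and degrees.
    A slope above 1 indicates super-linear (accelerating) growth.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(k) for k in degrees]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def normalized_entropy(counts):
    """Shannon entropy of discovery counts per time bin, scaled to [0, 1].

    Assumed definition: 1 means discoveries are spread evenly over the
    bins, 0 means they all fall in a single bin.
    """
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = sum(p * math.log(1 / p) for p in probs)
    return entropy / math.log(len(counts))


# Toy series (not from the paper): degree grows as time squared.
beta = growth_exponent([1, 2, 4, 8], [1, 4, 16, 64])
even = normalized_entropy([5, 5, 5, 5])
bursty = normalized_entropy([20, 0, 0, 0])
print(beta, even, bursty)
```

In this toy case the fitted exponent is approximately 2, the evenly spread series has normalized entropy close to 1, and the single-burst series has normalized entropy 0.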
Fig. 3. Three model realizations with the scale-free (BA) clustered underlying artificial PPIN. (a) Distribution of degree growth exponents; (b) distribution of time intervals between PPI discoveries involving each protein; (c) the normalized entropy of PPI discoveries for each protein averaged over each degree.
Fig. 4. Relationship between community structure and PPI discovery rates in PPINs. (a) Connected components; (b) communities; (c) modularity, which is a quantity that measures the strength of a community partition compared to random [30]. (d) Clusters with significant over-representation of proteins with accelerating or decelerating PPI discovery rates. (e, f) Subnetworks connecting proteins from two representative cold clusters, where proteins are connected through their known interactions with other members of the cluster.
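As an illustration of the modularity mentioned in panel (c), the sketch below computes it for a small undirected graph. It assumes the standard Newman-Girvan form, in which each community contributes its fraction of within-community edges minus the fraction expected at random given node degrees. The excerpt does not say which variant reference [30] uses, and the toy graph is hypothetical.

```python
def modularity(edges, community):
    """Newman-Girvan modularity of a partition of a simple undirected graph.

    edges: list of (u, v) pairs, no self-loops or duplicates.
    community: dict mapping each node to a community label.
    """
    m = len(edges)
    intra_edges = {}   # edges with both ends in the same community
    degree_sum = {}    # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra_edges[cu] = intra_edges.get(cu, 0) + 1
    return sum(
        intra_edges.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "A" for node in range(6)}

q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
print(q_split, q_single)
```

Splitting the toy graph into its two triangles gives a modularity of 5/14, while placing all nodes in one community gives 0, the value expected for a partition no better than random.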