Svetlana Bulashevska, Alla Bulashevska, Roland Eils.
Abstract
BACKGROUND: We present a statistical method for analysing biological networks based on an exponential random graph model, namely the p2-model, in contrast to previous descriptive approaches. The model can capture generic and structural properties of a network as they emerge from local interdependencies, and it uses a limited number of parameters. Here, we consider one global parameter capturing the density of edges in the network, and local parameters representing each node's contribution to the formation of edges in the network. The modelling suggests a novel definition of important nodes in the network, namely social nodes, identified from the local sociality parameters of the model. Moreover, the sociality parameters help to reveal organizational principles of the network. An inherent advantage of our approach is the possibility of hypothesis testing: a priori knowledge about biological properties of the nodes can be incorporated into the statistical model to investigate its influence on the structure of the network.
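The roles of the global density parameter and the per-node sociality parameters can be illustrated with a minimal sketch. It assumes an undirected network and a logistic link in which the log-odds of an edge between nodes i and j is `mu + alpha[i] + alpha[j]`; the abstract does not spell out this formula, and the function names below are hypothetical, so this is an illustration rather than the authors' implementation.

```python
import math
import random


def edge_probability(mu, alpha_i, alpha_j):
    """Assumed logistic link: log-odds of an edge = mu + alpha_i + alpha_j."""
    return 1.0 / (1.0 + math.exp(-(mu + alpha_i + alpha_j)))


def simulate_network(mu, alpha, seed=0):
    """Draw each undirected edge (i, j), i < j, independently given the parameters."""
    rng = random.Random(seed)
    n = len(alpha)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < edge_probability(mu, alpha[i], alpha[j]):
                edges.append((i, j))
    return edges


def degrees(n, edges):
    """Count how many edges touch each node."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


# Illustrative values only: a sparse network (negative mu) with one
# highly social node (index 0) and one weakly social node (index 4).
mu = -2.0
alpha = [2.0, 0.0, 0.0, 0.0, -1.0]
edges = simulate_network(mu, alpha, seed=1)
print(degrees(len(alpha), edges))
```

Under this assumed form, a more negative `mu` makes the whole network sparser, while a larger `alpha[i]` raises the probability of every edge that involves node i, which is why high sociality tends to go together with high degree.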
Year: 2010 PMID: 20100321 PMCID: PMC2831004 DOI: 10.1186/1471-2105-11-46
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Social proteins in the human protein interaction network.
| Protein | Symbol | Degree | α mean | α sd | 2.5% quantile | 97.5% quantile | Disorder |
|---|---|---|---|---|---|---|---|
| NP_473376.1 | UNC119 | 56 | 3.48 | 0.13 | 3.14 | 3.70 | 1 |
| CAD39125.1 | RIF1 | 53 | 3.41 | 0.13 | 3.06 | 3.63 | 1 |
| AAH12509.1 | EEF1A1 | 49 | 3.30 | 0.14 | 3.03 | 3.57 | 0 |
| AAH13918.1 | EEF1G | 41 | 3.06 | 0.14 | 2.77 | 3.34 | 0 |
| NP_000537.2 | TP53 | 36 | 2.89 | 0.15 | 2.51 | 3.13 | 1 |
| AAK55500.1 | CRMP1 | 36 | 2.89 | 0.15 | 2.59 | 3.18 | 0 |
| BAA92615.1 | KIAA1377 | 35 | 2.85 | 0.15 | 2.47 | 3.10 | 1 |
| NP_002816.1 | PTN | 34 | 2.81 | 0.15 | 2.42 | 3.06 | 1 |
| NP_036564.1 | SETDB1 | 32 | 2.73 | 0.16 | 2.34 | 2.99 | 1 |
| XP_351098.1 | CHD3 | 30 | 2.65 | 0.16 | 2.25 | 2.91 | 1 |
| NP_005068.2 | TLE1 | 29 | 2.61 | 0.16 | 2.21 | 2.87 | 1 |
| NP_057144.1 | CGI125 | 28 | 2.56 | 0.17 | 2.23 | 2.88 | 0 |
| NP_009107.1 | C14orf1 | 26 | 2.47 | 0.17 | 2.13 | 2.80 | 0 |
| NP_874368.1 | HTATIP | 25 | 2.42 | 0.17 | 2.08 | 2.74 | 0 |
| NP_002961.1 | SAT | 23 | 2.31 | 0.18 | 1.96 | 2.65 | 0 |
| NP_003371.1 | VIM | 22 | 2.26 | 0.18 | 1.82 | 2.56 | 1 |
| NP_004146.1 | SERPINB9 | 20 | 2.14 | 0.19 | 1.76 | 2.49 | 0 |
| AAH33561.1 | DKFZP564O0523 | 19 | 2.08 | 0.19 | 1.69 | 2.44 | 0 |
| NP_006824.2 | COPS6 | 18 | 2.01 | 0.20 | 1.62 | 2.39 | 0 |
| O00231 | PSMD11 | 18 | 2.01 | 0.19 | 1.62 | 2.38 | 0 |
| NP_009153.2 | ZHX1 | 17 | 1.95 | 0.20 | 1.48 | 2.28 | 1 |
| NP_002564.1 | PAFAH1B3 | 17 | 1.94 | 0.20 | 1.55 | 2.32 | 0 |
| NP_008896.1 | ZNF24 | 16 | 1.87 | 0.20 | 1.39 | 2.22 | 1 |
| NP_001060.1 | TUBB | 15 | 1.79 | 0.21 | 1.37 | 2.19 | 0 |
| NP_000994.1 | RPLP1 | 14 | 1.72 | 0.21 | 1.21 | 2.08 | 1 |
| NP_006640.2 | SDCCAG16 | 13 | 1.63 | 0.22 | 1.11 | 2.00 | 1 |
| NP_777576.1 | UBR1 | 13 | 1.63 | 0.22 | 1.18 | 2.04 | 0 |
| P04183 | TK1 | 13 | 1.63 | 0.22 | 1.17 | 2.05 | 0 |
| NP_002937.1 | RPA2 | 13 | 1.62 | 0.22 | 1.18 | 2.05 | 0 |
| NP_006312.1 | ARIH2 | 12 | 1.53 | 0.23 | 1.07 | 1.97 | 0 |
| NP_002281.1 | LAMA4 | 12 | 1.53 | 0.23 | 1.06 | 1.96 | 0 |
| NP_060267.2 | BTBD2 | 11 | 1.44 | 0.24 | 0.96 | 1.88 | 0 |
| NP_003940.2 | HAP1 | 11 | 1.43 | 0.24 | 0.88 | 1.83 | 1 |
| Q9Y2X7 | GIT1 | 11 | 1.43 | 0.23 | 0.96 | 1.88 | 0 |
| NP_002037.2 | GAPD | 11 | 1.43 | 0.24 | 0.95 | 1.88 | 0 |
| NP_004630.2 | BAT3 | 10 | 1.33 | 0.25 | 0.75 | 1.74 | 1 |
| Q13332 | PTPRS | 10 | 1.32 | 0.24 | 0.83 | 1.78 | 0 |
| NP_005878.1 | DLEU1 | 10 | 1.32 | 0.25 | 0.81 | 1.79 | 0 |
| BAB14293.1 | ASC1p100 | 10 | 1.32 | 0.24 | 0.82 | 1.79 | 0 |
| NP_001316.1 | CSTF2 | 9 | 1.21 | 0.26 | 0.62 | 1.65 | 1 |
| NP_002148.1 | HSPE1 | 9 | 1.21 | 0.25 | 0.68 | 1.69 | 0 |
| NP_061730.1 | PCDHA4 | 9 | 1.20 | 0.26 | 0.68 | 1.69 | 0 |
| CAD97612.1 | IMMT | 8 | 1.08 | 0.27 | 0.45 | 1.54 | 1 |
| NP_057103.1 | LUC7L2 | 8 | 1.08 | 0.27 | 0.45 | 1.53 | 1 |
| CAB72445.1 | BRD7 | 8 | 1.08 | 0.27 | 0.45 | 1.53 | 1 |
| NP_061960.1 | ARFRP2 | 8 | 1.08 | 0.27 | 0.53 | 1.58 | 0 |
| NP_002203.1 | ITGB4BP | 8 | 1.07 | 0.27 | 0.52 | 1.57 | 0 |
| NP_005997.2 | ZNF145 | 8 | 1.07 | 0.27 | 0.52 | 1.58 | 0 |
| NP_008998.1 | MYST2 | 7 | 0.93 | 0.29 | 0.28 | 1.42 | 1 |
| NP_060719.3 | CDK5RAP2 | 7 | 0.93 | 0.29 | 0.28 | 1.41 | 1 |
| AAB96331.1 | APLP1 | 7 | 0.93 | 0.29 | 0.27 | 1.41 | 1 |
| NP_001253.1 | CDKN2C | 7 | 0.93 | 0.29 | 0.34 | 1.46 | 0 |
| NP_036569.1 | SNAPAP | 7 | 0.93 | 0.28 | 0.34 | 1.46 | 0 |
| NP_000362.1 | TTR | 7 | 0.93 | 0.29 | 0.34 | 1.46 | 0 |
| Q96RU7 | C20orf97 | 7 | 0.93 | 0.28 | 0.34 | 1.46 | 0 |
| AAH33094.1 | IKBKAP | 7 | 0.93 | 0.29 | 0.34 | 1.46 | 0 |
| NP_004441.1 | ERH | 7 | 0.93 | 0.29 | 0.33 | 1.46 | 0 |
| NP_005774.2 | APACD | 7 | 0.93 | 0.28 | 0.35 | 1.45 | 0 |
| NP_001782.1 | CDC42 | 7 | 0.93 | 0.29 | 0.33 | 1.46 | 0 |
| NP_002613.2 | PFDN1 | 6 | 0.76 | 0.31 | 0.07 | 1.28 | 1 |
| NP_004302.1 | ARL3 | 6 | 0.76 | 0.30 | 0.14 | 1.33 | 0 |
| NP_002680.2 | POLA2 | 6 | 0.76 | 0.30 | 0.13 | 1.32 | 0 |
| NP_001460.1 | G22P1 | 6 | 0.76 | 0.30 | 0.13 | 1.33 | 0 |
| NP_444252.1 | PFN2 | 6 | 0.76 | 0.30 | 0.13 | 1.33 | 0 |
| NP_006077.1 | TUBB4 | 6 | 0.76 | 0.31 | 0.13 | 1.33 | 0 |
| NP_005251.1 | GDF9 | 6 | 0.76 | 0.30 | 0.13 | 1.32 | 0 |
| NP_002936.1 | RPA1 | 6 | 0.76 | 0.30 | 0.13 | 1.32 | 0 |
| NP_071921.1 | FTS | 6 | 0.76 | 0.30 | 0.13 | 1.32 | 0 |
| AAH08720.1 | CRELD1 | 6 | 0.76 | 0.30 | 0.13 | 1.31 | 0 |
Table 1 lists the 69 social proteins revealed by the statistical analysis. NCBI identifiers (RefSeqIds) and symbols (as used by [36]) of the proteins are displayed. For each protein, the degree of the respective node in the network is shown. Columns 4-7 present the estimated mean, standard deviation, and 2.5% and 97.5% quantiles of the sociality parameter α. The last column contains the disorder group of each protein. (The proteins are sorted in decreasing order of the mean of the sociality parameter.)
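Every row of Table 1 has a 2.5% quantile above zero, which suggests one plausible reading of "social": the interval between the 2.5% and 97.5% quantiles of α lies entirely above zero. The text here does not state the selection rule, so the threshold and the helper `is_social` below are assumptions. The first three rows are copied from Table 1; the last row is invented purely to show a protein that would fail the check.

```python
from typing import NamedTuple


class PosteriorSummary(NamedTuple):
    symbol: str
    mean: float
    q025: float  # 2.5% quantile of alpha
    q975: float  # 97.5% quantile of alpha


def is_social(row, threshold=0.0):
    """Assumed rule: the whole interval lies above the threshold."""
    return row.q025 > threshold


rows = [
    PosteriorSummary("UNC119", 3.48, 3.14, 3.70),        # from Table 1
    PosteriorSummary("TP53", 2.89, 2.51, 3.13),          # from Table 1
    PosteriorSummary("PFDN1", 0.76, 0.07, 1.28),         # from Table 1
    PosteriorSummary("HYPOTHETICAL", 0.40, -0.25, 0.98),  # invented example
]

social = [r.symbol for r in rows if is_social(r)]
print(social)
```

Using the lower quantile rather than the mean makes the rule account for the uncertainty of the estimate, which grows as the degree falls (compare the sd column for high-degree and low-degree proteins).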
Figure 1. Heatmap and cluster diagram demonstrating the clustering of proteins based on their sociality parameters. Figure 1 presents the heatmap and the cluster diagram depicting groups of structurally similar proteins in the protein interaction network obtained by clustering the sociality parameters of the proteins.
Clusters of structurally similar proteins.
| Cluster | Size | Degrees | GO Terms enriched (Molecular Function, "MF") |
|---|---|---|---|
| Cluster 1 | 16 | 22-56 | GO:0000739 DNA strand annealing activity |
| | | | GO:0003682 chromatin binding |
| | | | GO:0003746 translation elongation factor activity |
| | | | GO:0004145 diamine N-acetyltransferase activity |
| | | | GO:0004157 dihydropyrimidinase |
| | | | GO:0004864 protein phosphatase inhibitor activity |
| | | | GO:0016455 RNA polymerase II transcription mediator activity |
| | | | GO:0018024 histone-lysine N-methyltransferase activity |
| | | | GO:0050681 androgen receptor binding |
| Cluster 2 | 32 | 8-20 | GO:0008270 zinc ion binding |
| Cluster 3 | 86 | 4-7 | GO:0003899 DNA-directed RNA polymerase activity |
| | | | GO:0005125 cytokine activity |
| | | | GO:0019900 kinase binding |
| | | | GO:0042562 hormone binding |
| Cluster 4 | 131 | 2-3 | GO:0003777 microtubule motor activity |
| | | | GO:0005088 Ras guanyl-nucleotide exchange factor activity |
| Cluster 5 | 136 | 1 | GO:0003700 transcription factor activity |
| | | | GO:0003723 RNA binding |
| | | | GO:0003779 actin binding |
| | | | GO:0005198 structural molecule activity |
Table 2 presents the groups of proteins obtained by clustering the respective sociality parameters. The number of proteins in each cluster, their degrees and molecular functions found by GO-significance analysis are displayed.
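Grouping proteins by their sociality parameters amounts to clustering one number per protein. The following is a minimal sketch of agglomerative clustering on such values; the text does not name the algorithm or linkage used, so average linkage and the function names are assumptions. The input values are a handful of α means taken from Table 1.

```python
def average_linkage(a, b):
    """Mean absolute difference over all cross-cluster pairs."""
    return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))


def agglomerate(values, k):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    clusters = [[v] for v in values]
    while len(clusters) > k:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    # Report clusters from most to least social, values sorted within each.
    ordered = [sorted(c, reverse=True) for c in clusters]
    return sorted(ordered, key=lambda c: -sum(c) / len(c))


# A few alpha means from Table 1 (UNC119, RIF1, EEF1A1, COPS6, ZHX1, PFDN1, ARL3).
alpha_means = [3.48, 3.41, 3.30, 2.01, 1.95, 0.76, 0.76]
groups = agglomerate(alpha_means, k=3)
print(groups)
```

Because the sociality parameter rises with degree in Table 1, clusters formed this way correspond to contiguous degree ranges, which matches the Degrees column of Table 2.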
Figure 2. Network diagram. Figure 2 presents the diagram summarizing the protein interaction network as five groups of structurally similar proteins with interaction flows between them. The size and grey color intensity of the nodes reflect the degrees of proteins contained in each group. The width and labeling of the ties reflect the number of interactions between proteins belonging to two groups (shown as a percentage of the total network interactions). The arrows depict the self-interaction of each group.
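The tie labels described for Figure 2 can be computed from an edge list and a group assignment. The sketch below counts interactions for each unordered pair of groups, including within-group pairs (the self-interaction arrows), and expresses each count as a percentage of all interactions. The node names, group labels, and the function `group_flows` are hypothetical toy inputs, not data from the paper.

```python
from collections import Counter


def group_flows(edges, group_of):
    """Percentage of all interactions falling between each pair of groups."""
    counts = Counter()
    for u, v in edges:
        # Sort the two group labels so (1, 2) and (2, 1) count as the same tie.
        pair = tuple(sorted((group_of[u], group_of[v])))
        counts[pair] += 1
    total = len(edges)
    if total == 0:
        return {}
    return {pair: 100.0 * c / total for pair, c in counts.items()}


# Toy network: four proteins in two groups (illustrative only).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
group_of = {"a": 1, "b": 1, "c": 2, "d": 2}

flows = group_flows(edges, group_of)
for pair, pct in sorted(flows.items()):
    print(pair, f"{pct:.1f}%")
```

A pair such as `(1, 1)` is a within-group tie, and the percentages sum to 100 because every interaction is assigned to exactly one pair of groups.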