Takeshi Hase, Yoshihito Niimura, Tsuguchika Kaminuma, Hiroshi Tanaka.
Abstract
Protein-protein interaction networks (PINs) are scale-free networks with a small-world property. In a small-world network, the average cluster coefficient (<C>) is much higher than in a random network, but the average shortest path length (<L>) is similar between the two networks. To understand the evolutionary mechanisms shaping the structure of PINs, simulation studies using various network growth models have been performed. It has been reported that the heterodimerization (HD) model, in which a new link is added between duplicated nodes with a uniform probability, could reproduce scale-freeness and a high <C>. In this paper, however, we show that the HD model is unsatisfactory, because (i) to reproduce the high <C> in the yeast PIN, a much larger number (n(HI)) of HD links (links between duplicated nodes) are required than the estimated number of n(HI) in the yeast PIN and (ii) the spatial distribution of triangles in the yeast PIN is highly skewed but the HD model cannot reproduce the skewed distribution. To resolve these discrepancies, we here propose a new model named the non-uniform heterodimerization (NHD) model. In this model, an HD link is preferentially attached between duplicated nodes when they share many common neighbors. Simulation studies demonstrated that the NHD model can successfully reproduce the high <C>, the low n(HI), and the skewed distribution of triangles in the yeast PIN. These results suggest that the survival rate of HD links is not uniform in the evolution of PINs, and that an HD link between high-degree nodes tends to be evolutionarily conservative. The non-uniform survival rate of HD links can be explained by assuming a low mutation rate for a high-degree node, and thus this model appears to be biologically plausible.
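As a rough illustration of the two quantities named in the abstract, the following Python sketch computes <C> and <L> for a small undirected graph stored as a dictionary of neighbor sets. The function names, the toy graph, and the convention that nodes with fewer than two neighbors contribute zero to <C> are assumptions made for this example, not details taken from the paper.

```python
from collections import deque


def average_clustering(adj):
    """Mean local cluster coefficient; nodes with degree < 2 count as 0 (assumed)."""
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        nbr_list = list(nbrs)
        # Count links among the neighbors of `node`; each one closes a triangle.
        links = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbr_list[j] in adj[nbr_list[i]]
        )
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_shortest_path_length(adj):
    """Mean BFS distance over all ordered pairs of distinct, connected nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
c_avg = average_clustering(toy)
l_avg = average_shortest_path_length(toy)
```

For this toy graph, <C> = 7/12 and <L> = 4/3. A small-world network in the sense used above would show a <C> far above that of a random graph with the same number of nodes and links, with a comparable <L>.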
Year: 2008 PMID: 18301745 PMCID: PMC2253498 DOI: 10.1371/journal.pone.0001667
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Simulation.
(A) HD model. Node A is duplicated to generate node A′. Each of the links to node A′ is removed with a uniform probability α (left). Note that this method is based on completely asymmetric divergence [44], in which only one (A′) of the duplicated nodes is the target of link removal. An HD link between node A and node A′ is attached with a uniform probability β (middle). (B) Evolutionary distance. When a node is duplicated, the evolutionary distance between each of the duplicated nodes and each of the other nodes in the network is assumed to increase by one, owing to mutations occurring in the duplicated nodes during the divergence process. Suppose that the evolutionary distance between node A and node B is d (left). After node A is duplicated to generate node A′ and the two nodes diverge, the evolutionary distances between nodes A and B and between nodes A′ and B both become d+1, whether or not the links between A and B and between A′ and B are present (middle). (A dashed line indicates the absence of a link.) The evolutionary distance between nodes A and A′ is defined to be 1 regardless of the presence of a link between them. If node A′ is then duplicated to create node A″, the evolutionary distance between nodes A and B remains d+1, while the evolutionary distances between nodes A and A′, A and A″, B and A′, and B and A″ become 2, 2, d+2, and d+2, respectively (right). (C) NHD model. In this model, the probability that a link is added between A and A′ is proportional to the number (n_N) of common neighbors shared by these nodes.
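The steps in this caption can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the dictionary-of-sets graph, the `slope` parameter, the cap of the NHD probability at 1, and the absence of self-interactions are all choices made for this example.

```python
import random


def duplicate_node(adj, node, new_node, alpha, hd_probability, rng):
    """One duplication-divergence step on an undirected graph (dict of sets).

    `hd_probability(n_common)` gives the chance of adding the HD link, given
    the number of common neighbors of `node` and `new_node`.
    """
    # Duplication: the copy inherits every link of the original.
    adj[new_node] = set(adj[node])
    for nbr in adj[new_node]:
        adj[nbr].add(new_node)
    # Completely asymmetric divergence: only the copy loses links,
    # each independently with probability alpha.
    for nbr in list(adj[new_node]):
        if rng.random() < alpha:
            adj[new_node].discard(nbr)
            adj[nbr].discard(new_node)
    # Heterodimerization: possibly link the original and its copy.
    n_common = len(adj[node] & adj[new_node])
    if rng.random() < hd_probability(n_common):
        adj[node].add(new_node)
        adj[new_node].add(node)
    return n_common


def hd_rule(beta):
    """HD model: uniform probability beta, independent of common neighbors."""
    return lambda n_common: beta


def nhd_rule(slope):
    """NHD model: probability proportional to n_common (capped at 1, assumed)."""
    return lambda n_common: min(1.0, slope * n_common)


def update_distances(dist, node, new_node):
    """Update evolutionary distances after `node` is duplicated into `new_node`.

    `dist` maps frozenset({x, y}) to the distance of every existing node pair.
    """
    others = {x for pair in dist for x in pair} - {node}
    for other in others:
        d = dist[frozenset((node, other))] + 1
        dist[frozenset((node, other))] = d
        dist[frozenset((new_node, other))] = d
    dist[frozenset((node, new_node))] = 1


rng = random.Random(0)
graph = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}}
# alpha = 0 keeps every copied link, so A and A' share two neighbors and
# the NHD probability min(1, 0.5 * 2) = 1 always adds the HD link.
duplicate_node(graph, "A", "A'", alpha=0.0, hd_probability=nhd_rule(0.5), rng=rng)

dist = {frozenset(("A", "B")): 3}      # d = 3 before any duplication
update_distances(dist, "A", "A'")      # d(A,B) = d(A',B) = 4, d(A,A') = 1
update_distances(dist, "A'", "A''")    # d(A,A') = d(A,A'') = 2, d(B,A') = d(B,A'') = 5
```

Passing `hd_rule(beta)` instead of `nhd_rule(slope)` switches the same step to the uniform HD model, so the two models differ only in how the HD link probability depends on the number of common neighbors.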
Statistics of the networks by the HD and NHD models and the yeast PIN
| Model | d_T | α | β | Homologous pairs | n_HI | n_HI / homologous pairs | <k> | <C> | <L> |
|---|---|---|---|---|---|---|---|---|---|
| HD model | 1 | 0.725 | 0.061 | 1,312 (11) | 140 (12) | 0.107 (0.009) | 3.73 (0.09) | 0.066 (0.006) | 6.45 (0.14) |
| | 2 | − | − | 3,031 (27) | 269 (19) | 0.089 (0.006) | − | − | − |
| | 3 | − | − | 5,309 (43) | 395 (25) | 0.074 (0.005) | − | − | − |
| | 4 | − | − | 8,337 (65) | 514 (31) | 0.062 (0.004) | − | − | − |
| | 5 | − | − | 12,363 (92) | 628 (42) | 0.051 (0.003) | − | − | − |
| NHD model | 1 | 0.745 | 0.028 | 1,308 (11) | 52 (6) | 0.040 (0.005) | 3.74 (0.07) | 0.066 (0.006) | 6.23 (0.12) |
| | 2 | − | − | 3,030 (22) | 105 (11) | 0.035 (0.004) | − | − | − |
| | 3 | − | − | 5,315 (42) | 157 (17) | 0.029 (0.003) | − | − | − |
| | 4 | − | − | 8,351 (61) | 208 (21) | 0.025 (0.003) | − | − | − |
| | 5 | − | − | 12,373 (86) | 259 (28) | 0.021 (0.002) | − | − | − |
| Yeast PIN | | | | 6,544 | 175 | 0.027 | 3.74 | 0.066 | 4.85 |
| Random | | | | | | | 3.74 | 0.00096 | 6.27 |
The number in parentheses represents the standard deviation calculated from 100 networks generated by simulations. −, the same as above.
Parameters used in the simulations. See Materials and Methods.
The number of homologous pairs. Two nodes are defined to be homologous when the evolutionary distance between the two nodes is d_T or less.
n_HI: the number of interactions between homologous proteins.
<k>: the average degree.
<C>: the average cluster coefficient.
<L>: the average shortest path length.
The yeast PIN without self-interactions.
A random network that has the same
Figure 2. Properties of the networks generated by the HD and NHD models.
Black squares, red diamonds, and green crosses show the values for the yeast PIN, the network generated by the NHD model, and the network by the HD model, respectively. The results for the HD and NHD models were obtained by taking the average among 100 networks generated by simulations. (A) Degree distribution P(k). The dashed line represents $(k_0 + k)^{-\gamma} e^{-k/k_c}$ with γ = 2.7, k_0 = 3.4, and k_c = 50. (B) Distribution of the average cluster coefficient
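The dashed curve in panel (A) appears to be a power law with an exponential cutoff, (k_0 + k)^(−γ) e^(−k/k_c). Under that reading, the short Python sketch below evaluates the curve with the quoted parameter values. The function name is hypothetical, and the curve is left unnormalized because the caption gives no normalization constant.

```python
import math


def degree_curve(k, gamma=2.7, k0=3.4, kc=50.0):
    """Unnormalized (k0 + k)^(-gamma) * exp(-k / kc)."""
    return (k0 + k) ** (-gamma) * math.exp(-k / kc)


# Relative height of the curve at degree 10 compared with degree 1.
ratio = degree_curve(10) / degree_curve(1)
```

For degrees well below k_c the power-law factor dominates, and the exponential factor bends the curve downward as k approaches and exceeds k_c.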
Figure 3. HD links in the yeast PIN and in the networks generated by simulations.
Black squares, red diamonds, and green crosses show the values for the yeast PIN, the network generated by the NHD model, and the network by the HD model, respectively. (A) Distribution of P_HD(n_N), the probability that an HD link exists between two homologous proteins when they share n_N common neighbors (for d_T = 3). The slopes of the dashed lines are 0.028 (red) and 0 (green). The result for d_T = 4 is nearly identical to this result (data not shown). (B) Distribution of