Donglei Du, Connie F Lee, Xiu-Qing Li.
Abstract
Most protein PageRank studies do not use signal flow direction information in protein interactions because this information was not readily available in large protein databases until recently. Therefore, four questions have yet to be answered: A) What is the general difference between signal emitting and receiving in a protein interactome? B) Which proteins are among the top ranked in directional ranking? C) Are high-ranked proteins more evolutionarily conserved than low-ranked ones? D) Do proteins with similar ranking tend to have similar subcellular locations? In this study, we address these questions using the forward, reverse, and non-directional PageRank approaches to rank an information-directional network of human proteins and study their evolutionary conservation. The forward ranking gives credit to information receivers, the reverse ranking to information emitters, and the non-directional ranking mainly to the number of interactions. The protein lists generated by the forward and non-directional rankings are highly correlated, but those generated by the reverse and non-directional rankings are not. The results suggest that the signal emitting/receiving system in the human protein interactome is characterized by a few key emitters and relatively evenly distributed receivers. Signaling pathway proteins are frequent among the top-ranked proteins. Eight proteins are both top information emitters and top information receivers. Top-ranked proteins, except for a few with species-related novel functions, are evolutionarily well conserved. Protein-subunit ranking position reflects subunit function. These results demonstrate the usefulness of different PageRank approaches in characterizing protein networks and provide insights into protein interaction in the cell.
Year: 2012 PMID: 23028653 PMCID: PMC3446998 DOI: 10.1371/journal.pone.0044872
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Ranking position distribution in the forward and reverse methods when proteins are sorted by their non-directional ranking positions from top to bottom.
A. Forward vs. non-directional. B. Reverse vs. non-directional. Node size represents each protein's percentage of the total ranking probability of all the proteins.
Figure 2. The relative sizes (reflected by % of ranking probability) of the 50 largest nodes (proteins) from each type of ranking of 2,249 human proteins in the interaction network.
Fwd: Forward ranking, in which a larger node usually receives information from more sources. Rev: Reverse ranking, in which a larger node can usually regulate more proteins. Non-D: Non-directional ranking, in which a larger node usually has more interactions (connections in the network) with others. Note that the first few large regulator nodes (Rev) are much larger than those of the top information receivers (Fwd), but the remaining Rev nodes are smaller than the corresponding receiver nodes.
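The three ranking directions described in the abstract and in the Figure 2 legend can be illustrated with a minimal sketch. It is not the authors' implementation: the power-iteration routine, the damping factor of 0.85, the uniform redistribution of rank from dead-end nodes, and the five-node cascade with its node names are all assumptions chosen for illustration.

```python
def pagerank(edges, damping=0.85, tol=1e-12, max_iter=1000):
    """Power-iteration PageRank over a list of directed (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    targets = {node: [] for node in nodes}
    for source, target in edges:
        targets[source].append(target)

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread over all nodes
        # (an assumed, commonly used convention).
        dangling = sum(rank[node] for node in nodes if not targets[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source in nodes:
            if targets[source]:
                share = damping * rank[source] / len(targets[source])
                for target in targets[source]:
                    new_rank[target] += share
        change = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical toy cascade; an edge (a, b) means "a sends a signal to b".
signal_edges = [
    ("LIGAND", "RECEPTOR"),
    ("RECEPTOR", "KINASE_A"),
    ("RECEPTOR", "KINASE_B"),
    ("KINASE_A", "TF"),
    ("KINASE_B", "TF"),
]
reversed_edges = [(target, source) for source, target in signal_edges]

forward = pagerank(signal_edges)                          # credits receivers
reverse = pagerank(reversed_edges)                        # credits emitters
non_directional = pagerank(signal_edges + reversed_edges) # ignores direction

for label, ranks in [("forward", forward),
                     ("reverse", reverse),
                     ("non-directional", non_directional)]:
    top = max(ranks, key=ranks.get)
    print(f"{label:16s} top node: {top} ({100 * ranks[top]:.1f}% of total rank)")
```

In this toy network the forward ranking puts the terminal receiver on top, the reverse ranking the upstream emitter, and the non-directional ranking the node with the most interactions.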
Categories of top 50 ranked proteins.
| Ranking method | Ranked proteins | Proteins in known pathways | Proteins without pathway assignment | Signalling proteins (no.) | Signalling/known-pathway proteins (%) | MAPK pathway proteins (no.) | MAPK/all proteins (%) |
| Forward | Top 50 | 43 | 7 | 37 | 86.0 | 15 | 34.9 |
| Reverse | Top 50 | 46 | 4 | 39 | 85.0 | 14 | 30.4 |
| Non-directional | Top 50 | 45 | 5 | 37 | 80.4 | 11 | 24.4 |
| Forward | Bottom 49 | 17 | 32 | 7 | 14.29 | 1 | 5.88 |
| Reverse | Bottom 50 | 21 | 29 | 11 | 22.00 | 2 | 9.52 |
| Non-directional | Bottom 49 | 17 | 32 | 7 | 14.29 | 1 | 5.88 |
The ChiTEST analysis comparing the forward and reverse top ranks with the non-directional top ranks showed no significant difference at the P<0.05 level in either the number of signalling proteins or the number of MAPK proteins.
One of the bottom 50 proteins had no BLASTp target in C. elegans.
One of the bottom 50 proteins could not be found in the KEGG database.
Comparison of the ranking methods in terms of BLASTp results and the average length of the 50 top ranked proteins.
| Ranking method | No. of top ranked proteins | BLAST bits average | Protein length average (no. of amino acids) |
| Forward | 50 | 322.58 A | 686 A |
| Reverse | 50 | 329.58 A | 699 A |
| Non-directional | 50 | 280.54 A | 506 B |
Values labelled with the same letter in the same column are not significantly different at the P<0.05 level based on ANOVA and Duncan’s multiple-range test.
Ranking position, degree of evolutionary conservation in terms of BLASTp hit bits in a search against Caenorhabditis elegans proteins, and protein length.
| Ranking method | Ranking degree | No. of proteins | BLAST bits average | Average protein length (no. of amino acids) |
| Forward | Top | 50 | 322.58 A | 686 A |
| Forward | Mid | 49 | 200.16 B | 712 A |
| Forward | Low | 49 | 197.90 B | 734 A |
| Reverse | Top | 50 | 329.58 A | 699 A |
| Reverse | Mid | 50 | 200.04 B | 628 A |
| Reverse | Low | 49 | 156.80 B | 604 A |
| Non-directional | Top | 50 | 280.54 A | 506 B |
| Non-directional | Mid | 50 | 292.70 A | 872 A |
| Non-directional | Low | 50 | 203.06 A | 734 AB |
Values labelled with the same letter in the same column, within each top, middle or bottom panel, are not significantly different at the P<0.05 level based on ANOVA and Duncan’s multiple-range test.
Comparison of protein location grouping by the PIDS approach and the protein ranking approach using feedback pathway proteins*.
| Sorting criterion | No. of proteins | Nucleus proteins (n) |
| Highest PIDS | 50 | 25 |
| Lowest PIDS | 50 | 20 |
| Top rank (forward) | 50 | 0 |
| Low rank (forward) | 50 | 31 |
| Top rank (reverse) | 50 | 40 |
| Low rank (reverse) | 50 | 0 |
| Top rank (non-directional) | 50 | 3 |
| Low rank (non-directional) | 50 | 31 |
*The protein database analyzed is the database of 379 feedback pathway proteins in the previous publication [13]. The PIDS values and subcellular location information were counted according to the same publication [13]. The ranking data are from the present study.
Subcellular locations of top- and bottom-ranked proteins from the 2,249-protein database, which contains both feedback and non-feedback pathway proteins (a).
| | Subcellular location | | | |
| Ranked position | Nucleus (%) | Cytoplasm (%) | Membrane (%) | In both nucleus and another (%) |
| Forwardly ranked top 50 | 38 | 58 | 46 | 36 |
| Forwardly ranked bottom 50 | 18 | 28 | 58 | 12 |
| Reversely ranked top 50 | 48 | 78 | 50 | 42 |
| Reversely ranked bottom 50 | 52 | 56 | 32 | 28 |
| ChiTEST | 0.0026** | 0.1068 NS | 0.0309* | 0.0404* |
(a) The 2,249-protein database [13] analyzed contains both feedback and non-feedback pathway proteins.
Protein location is based on http://www.uniprot.org/. Each protein can be in more than one location. Each ranked position group (top 50 or bottom 50) contains 50 proteins whose subcellular locations can be identified or suggested by the UniProt database. Where 1 or 2 proteins lacked subcellular location information, the proteins at the 51st and 52nd positions were used as replacements.
The ChiTEST compared forward ranking with reverse ranking. The top/bottom ratio of protein numbers in the reverse ranking was tested using the top/bottom ratio of forwardly ranked proteins as the reference ratio.
*: Significant. **: Highly significant. NS: Non-significant.
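One possible reading of the ChiTEST footnote is a chi-square goodness-of-fit test with one degree of freedom. The reverse-ranking top and bottom counts are the observed values, and the expected values are obtained by splitting the same total according to the forward-ranking top/bottom ratio. The sketch below follows that reading. It is an assumption about the procedure, not a documented one, and the function and variable names are hypothetical. Counts are recovered from the table percentages using the stated group size of 50.

```python
import math

GROUP_SIZE = 50  # proteins per ranked group, as stated in the footnotes

# (top %, bottom %) per location, copied from the table above.
percentages = {
    "Nucleus": {"forward": (38, 18), "reverse": (48, 52)},
    "Cytoplasm": {"forward": (58, 28), "reverse": (78, 56)},
    "Membrane": {"forward": (46, 58), "reverse": (50, 32)},
    "Nucleus and another": {"forward": (36, 12), "reverse": (42, 28)},
}


def to_counts(pair):
    """Convert (top %, bottom %) into protein counts."""
    return [pct * GROUP_SIZE / 100 for pct in pair]


def ratio_chi_square_p(observed, reference):
    """P value for two observed counts against a reference ratio (1 d.f.)."""
    if len(observed) != 2 or len(reference) != 2:
        raise ValueError("this sketch handles exactly two categories")
    total = sum(observed)
    expected = [total * ref / sum(reference) for ref in reference]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2))


p_values = {
    location: ratio_chi_square_p(to_counts(p["reverse"]), to_counts(p["forward"]))
    for location, p in percentages.items()
}

for location, p in p_values.items():
    print(f"{location:20s} P = {p:.4f}")
```

Under these assumptions the computed P values come out close to those in the ChiTEST row, which suggests, but does not prove, that the test was set up this way.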