Johannes Tuikkala, Heidi Vähämaa, Pekka Salmela, Olli S Nevalainen, Tero Aittokallio.
Abstract
BACKGROUND: Graph drawing is an integral part of many systems biology studies, enabling visual exploration and mining of large-scale biological networks. While a number of layout algorithms are available in popular network analysis platforms, such as Cytoscape, it remains poorly understood how well their solutions reflect the underlying biological processes that give rise to the network connectivity structure. Moreover, visualizations obtained using conventional layout algorithms, such as those based on the force-directed drawing approach, may become uninformative when applied to larger networks with dense or clustered connectivity structure.
Year: 2012 PMID: 22448851 PMCID: PMC3342218 DOI: 10.1186/1756-0381-5-2
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 2.522
Figure 1. Graph coarsening for multilevel organization. The coarsening process is visualized in a sub-network of Ito-Core (8 nodes, 10 edges). The arrows and highlighting indicate the nodes that are combined into a new metanode (MN). At the first level of the coarsening (1), nodes N2 and N3; nodes N4 and N8; and nodes N6 and N7 are merged into metanodes MN1, MN3, and MN2, respectively (2). At the next level (3), node N1 and metanode MN1 are merged into a new metanode MN4; and node N5 and metanode MN2 are merged into metanode MN5. Finally, after merging metanodes MN5 and MN3 into metanode MN6 (4), only two nodes are left and the coarsening process is finished (5).
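The coarsening described in the Figure 1 caption can be sketched as repeated pairwise merging of adjacent nodes until only two nodes remain. The following is a minimal Python sketch under stated assumptions: the pairing rule (each unmatched node is merged with its first unmatched neighbour), the toy edge list, and the names `coarsen_once` and `coarsen` are illustrative and are not taken from the authors' implementation or from the actual Ito-Core sub-network.

```python
from itertools import count


def coarsen_once(adj, ids):
    """Merge pairs of adjacent nodes into metanodes (one coarsening level).

    adj maps node -> set of neighbours (undirected, no self-loops).
    ids is an iterator that supplies fresh metanode numbers.
    Returns the adjacency of the coarser graph.
    """
    merged_into = {}
    for node in sorted(adj):
        if node in merged_into:
            continue
        # Assumed rule: pick the first neighbour that is still unmatched.
        partner = next((n for n in sorted(adj[node]) if n not in merged_into), None)
        if partner is None:
            merged_into[node] = node  # no free neighbour: carried over as-is
        else:
            meta = f"MN{next(ids)}"
            merged_into[node] = meta
            merged_into[partner] = meta

    # Rebuild edges between the new (meta)nodes, dropping internal edges.
    coarse = {m: set() for m in merged_into.values()}
    for node, nbrs in adj.items():
        for nbr in nbrs:
            a, b = merged_into[node], merged_into[nbr]
            if a != b:
                coarse[a].add(b)
                coarse[b].add(a)
    return coarse


def coarsen(adj, min_nodes=2):
    """Return the list of graphs from the finest to the coarsest level."""
    ids = count(1)
    levels = [adj]
    while len(levels[-1]) > min_nodes:
        coarse = coarsen_once(levels[-1], ids)
        if len(coarse) == len(levels[-1]):
            break  # nothing could be merged
        levels.append(coarse)
    return levels


# Hypothetical 8-node, 10-edge toy graph (not the real Ito-Core sub-network).
edges = [("N1", "N2"), ("N2", "N3"), ("N1", "N3"), ("N3", "N4"), ("N4", "N8"),
         ("N4", "N5"), ("N5", "N6"), ("N6", "N7"), ("N5", "N7"), ("N7", "N8")]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

levels = coarsen(graph)
print([len(level) for level in levels])
```

Because the pairing rule here is an assumption, the metanode labels and the number of levels will generally differ from those shown in the figure; only the overall idea of halving the graph level by level carries over.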
Interaction networks used in the study
| Network name | Type | Screen | Nodes | Edges | D | MND | MNDSD | MCC |
|---|---|---|---|---|---|---|---|---|
| Ito-Core | PI | PPI | 426 | 568 | 0.006 | 2.667 | 3.919 | 0.093 |
| VonMering | PI | PPI | 573 | 2097 | 0.013 | 7.319 | 9.017 | 0.450 |
| Schwikowski | PI | PPI | 1297 | 1862 | 0.002 | 2.871 | 3.109 | 0.125 |
| Y2H-CCSB | PI | PPI | 964 | 1598 | 0.003 | 3.315 | 5.456 | 0.095 |
| Y2H-Union | PI | PPI | 1647 | 2682 | 0.002 | 3.257 | 5.334 | 0.086 |
| AP/MS-Combined | PI | CCA | 1004 | 8319 | 0.017 | 16.57 | 18.63 | 0.648 |
| LC-Multiple | MT | LCI | 1213 | 2621 | 0.004 | 4.322 | 4.533 | 0.337 |
| Secretory-Map | GI | E-MAP | 409 | 4175 | 0.050 | 20.42 | 23.82 | 0.251 |
| Chromosome-Map | GI | E-MAP | 735 | 17,185 | 0.064 | 46.76 | 43.61 | 0.233 |
| Costanzo | GI | SGA | 4319 | 74,984 | 0.007 | 29.96 | 41.86 | 0.062 |
| Costanzo-Stringent | GI | SGA | 3811 | 35,924 | 0.004 | 16.07 | 22.92 | 0.046 |
Network types: PI, physical interactions; GI, genetic interactions; MT, mixed type. Screening methods: PPI, protein-protein interaction screen; CCA, protein co-complex association mapping; LCI, literature-curated interactions; E-MAP, epistatic miniarray profiling; SGA, synthetic genetic array mapping. Topological parameters: D, density; MND, mean node degree; MNDSD, standard deviation of the node degree; MCC, mean clustering coefficient of the network. The Costanzo-Stringent sub-network was constructed using the interaction score cut-offs ε < -0.17 or ε > 0.21. From each network, we extracted the largest connected component and used it in the evaluations.
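The topological parameters in the table follow from standard graph definitions. The sketch below, in plain Python, shows how D and MND can be recomputed from the node and edge counts, and how the largest connected component and the mean clustering coefficient could be obtained from an adjacency structure. The function names and the toy graph are hypothetical. Two conventions are assumptions, because the source does not state them: nodes with fewer than two neighbours contribute a clustering coefficient of 0, and the degree spread is computed as a population standard deviation.

```python
from statistics import mean, pstdev


def density(n_nodes, n_edges):
    """Fraction of all possible undirected edges that are present (D)."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def mean_node_degree(n_nodes, n_edges):
    """Each undirected edge contributes to the degree of two nodes (MND)."""
    return 2 * n_edges / n_nodes


def largest_component(adj):
    """Return the sub-graph induced by the largest connected component."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:  # depth-first traversal from `start`
            for nbr in adj[stack.pop()]:
                if nbr not in comp:
                    comp.add(nbr)
                    stack.append(nbr)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return {v: adj[v] & best for v in best}


def mean_clustering(adj):
    """Average local clustering coefficient (MCC), assuming 0 for degree < 2."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        # Each edge among the neighbours is seen twice, hence the division by 2.
        links = sum(1 for a in nbrs for b in adj[a] if b in nbrs) / 2
        coeffs.append(2 * links / (k * (k - 1)))
    return mean(coeffs)


# D and MND recomputed from the Ito-Core row (426 nodes, 568 edges).
print(round(density(426, 568), 3), round(mean_node_degree(426, 568), 3))

# Hypothetical toy graph: a triangle with a pendant node, plus a separate pair.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"},
       "e": {"f"}, "f": {"e"}}
lcc = largest_component(toy)
degree_sd = pstdev(len(nbrs) for nbrs in lcc.values())
print(len(lcc), round(mean_clustering(lcc), 3), round(degree_sd, 3))
```

With the counts from the table, the first two functions reproduce the tabulated D and MND values for Ito-Core (0.006 and 2.667) and the MND for VonMering (7.319).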
Figure 2. Biological evaluation of the layout algorithms. The black and grey bars show the average semantic similarity over the networks of physical interactions (PI) and genetic interactions (GI), respectively. Error bars show the standard error of the mean (SEM). Only one of the GI networks could be laid out using the SEL algorithm in less than one hour. The performance of the other algorithms was compared against the multilevel layout with the clustering option on (MLL-C): *** p < 0.0005; ** p < 0.005; * p < 0.05, two-sided paired t-test.
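The error bars and significance marks in Figure 2 rest on two standard quantities: the standard error of the mean and the paired t statistic. The sketch below shows how both could be computed with the Python standard library. The score lists are made-up placeholder values for illustration only, not results from the study, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return stdev(values) / sqrt(len(values))


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / sem(diffs), len(diffs) - 1


# Placeholder per-network scores for two layout algorithms (NOT real data).
scores_mllc = [0.50, 0.60, 0.70, 0.80]
scores_other = [0.40, 0.55, 0.60, 0.70]

t_stat, dof = paired_t(scores_mllc, scores_other)
print(round(sem(scores_mllc), 4), round(t_stat, 3), dof)
```

Turning the t statistic into a two-sided p-value requires the Student's t distribution, which the standard library does not provide; `scipy.stats.ttest_rel` is one option that returns both the statistic and the p-value.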
Figure 3. Example layouts for the Y2H-CCSB network. (A) yFiles Organic layout (ORL). (B) Cytoscape's Spring-embedded layout (SEL). (C) Cytoscape's Force-directed layout (FDL). (D) Multilevel layout with the clustering option (MLL-C). The Y2H-CCSB network has 964 nodes and 1598 edges, with an average clustering coefficient of 0.095. The node sizes and edge widths were standardized in Cytoscape to make the layout displays comparable in accuracy (the same zooming resolution was used in the export).
Figure 4. Computation times of the layout algorithms. The running times (in seconds) are plotted as a function of the network size (number of nodes). The number of nodes correlated significantly with the running times of the MLL, MLL-C (with or without the M-tree option) and FDL algorithms (p < 10⁻⁵). The running time of the proprietary ORL implementation converged after 1000 nodes. The SEL algorithm, which was systematically the slowest, was omitted from the illustration, since it could draw only 7 of the 11 test networks in less than one hour. The arrows point to the two largest SGA genetic interaction networks (Costanzo and Costanzo-Stringent).
Figure 5. Example layouts for the HPRD network. (A) yFiles Organic layout (ORL). (B) Cytoscape's Spring-embedded layout (SEL). (C) Cytoscape's Force-directed layout (FDL). (D) Multilevel layout with the clustering option (MLL-C). The HPRD network consists of 5699 protein nodes and 19,779 literature-curated protein-protein interactions. A high-resolution version of the layout solutions of the four layout algorithms is provided in Additional File 8.