Arnold Kuzniar, Somdutta Dhir, Harm Nijveen, Sándor Pongor, Jack A M Leunissen.
Abstract
Multi-netclust is a simple tool that allows users to extract connected clusters of data represented by different networks given in the form of matrices. The tool uses user-defined threshold values to combine the matrices, and uses a straightforward, memory-efficient graph algorithm to find clusters that are connected in all or in either of the networks. The tool is written in C/C++ and is available either as a form-based or as a command-line-based program running on Linux platforms. The algorithm is fast: processing a network of more than 10^6 nodes and 10^8 edges takes only a few minutes on an ordinary computer. AVAILABILITY: http://www.bioinformatics.nl/netclust/
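The cluster extraction step described in the abstract can be illustrated with a short sketch. The abstract does not name the graph algorithm, so the union-find (disjoint-set) approach below is an assumption, chosen because it stores only one integer per node and can consume edges one at a time. The names `find` and `connected_clusters` are hypothetical and are not part of Multi-netclust.

```python
def find(parent, x):
    """Return the representative of x, shortening the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def connected_clusters(n_nodes, edges):
    """Group nodes 0..n_nodes-1 into connected clusters.

    `edges` is any iterable of (u, v) pairs that survived thresholding,
    so it can be streamed from disk without building a full matrix.
    """
    parent = list(range(n_nodes))
    for u, v in edges:
        root_u, root_v = find(parent, u), find(parent, v)
        if root_u != root_v:
            parent[root_v] = root_u  # merge the two clusters

    clusters = {}
    for node in range(n_nodes):
        clusters.setdefault(find(parent, node), []).append(node)
    return sorted(clusters.values())


# Nodes 0-1-2 form one cluster, 3-4 another, and 5 is a singleton.
print(connected_clusters(6, [(0, 1), (1, 2), (3, 4)]))
# → [[0, 1, 2], [3, 4], [5]]
```

Because only the `parent` list is held in memory, the cost grows with the number of nodes rather than the number of edges, which is one plausible way to handle networks of the size quoted above.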
Year: 2010 PMID: 20679333 PMCID: PMC2944197 DOI: 10.1093/bioinformatics/btq435
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. The principle of Multi-netclust is illustrated on a two-parameter network. Thick and thin edges correspond to distinct similarity data (A). Dotted lines denote edges that are below the respective threshold, and hence can be omitted from the networks. Two different aggregation rules are implemented: weighted arithmetic averaging (‘sum rule’) gives clusters that are connected within either of the two networks (B); weighted geometric averaging (‘product rule’) gives clusters that are connected within both networks (C). M denotes the value assigned to the edges, w is the weighting factor (‘alpha’) of the two matrices (hence n = 2) and Mmix refers to the aggregated matrix.
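The two aggregation rules in the caption can be made concrete with a small sketch. The figure's formula is not reproduced in this text, so the code assumes the standard two-network forms: a weighted arithmetic mean, alpha * M1 + (1 - alpha) * M2, and a weighted geometric mean, M1^alpha * M2^(1 - alpha). It also assumes positive scores, that an edge is kept when its score is at or above the threshold, and that an omitted edge counts as 0. The names `apply_threshold` and `aggregate` are hypothetical and are not the tool's interface.

```python
def apply_threshold(edges, threshold):
    """Drop edges whose score is below the threshold.

    `edges` maps a node pair (i, j) to a similarity score.
    """
    return {pair: score for pair, score in edges.items() if score >= threshold}


def aggregate(edges_a, edges_b, alpha=0.5, rule="sum"):
    """Combine two thresholded networks into one aggregated network (Mmix)."""
    if rule == "sum":
        # Arithmetic mean: an edge survives if it is present in either network.
        pairs = edges_a.keys() | edges_b.keys()
        return {
            p: alpha * edges_a.get(p, 0.0) + (1 - alpha) * edges_b.get(p, 0.0)
            for p in pairs
        }
    if rule == "product":
        # Geometric mean: an edge survives only if it is present in both networks.
        pairs = edges_a.keys() & edges_b.keys()
        return {p: edges_a[p] ** alpha * edges_b[p] ** (1 - alpha) for p in pairs}
    raise ValueError("rule must be 'sum' or 'product'")


# Made-up scores for illustration only.
net_a = apply_threshold({(0, 1): 0.9, (0, 2): 0.2}, threshold=0.5)
net_b = apply_threshold({(0, 1): 0.8, (0, 2): 0.6}, threshold=0.5)

print(sorted(aggregate(net_a, net_b, rule="sum")))      # edges (0, 1) and (0, 2)
print(sorted(aggregate(net_a, net_b, rule="product")))  # edge (0, 1) only
```

In this toy input the edge (0, 2) passes the threshold in only one network, so it appears under the sum rule and disappears under the product rule, matching panels B and C of the figure.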
Protein classification results obtained for the individual and combined similarity networks
| Dataset | Correct | Incorrect | Singletons |
|---|---|---|---|
| SW × DALI1 (251) | 910 | 0 | 447 |
| BLAST (0.1) × DALI2 (0.4) | 888 | 0 | 469 |
| BLAST (0.4) + DALI2 (0.4) | 803 | 469 | 85 |
| SW (251) | 316 | 0 | 1041 |
| DALI1 (251) | 56 | 1266 | 35 |
| DALI2 (0.4) | 790 | 475 | 92 |
| BLAST (0.4) | 36 | 0 | 1321 |
| BLAST (0.1) | 66 | 1101 | 190 |
Numbers in parentheses denote the threshold used. Symbols ‘×’ and ‘+’ refer to the product and sum aggregation rules, respectively. The results were obtained for ‘alpha’ weighting factor 0.5.
DALI1, matrix of raw scores; DALI2, matrix of diagonally normalized scores; correct, proteins connected only to members of the same SCOP superfamily; incorrect, proteins connected to members of other SCOP superfamilies.
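The three table columns can be reproduced from a clustering with a few lines of code. This is a sketch under one stated assumption: "connected" is read as "placed in the same cluster", so every member of a cluster that mixes superfamilies is counted as incorrect. The source does not say whether direct edges or cluster membership were used. The function name `score_clusters` and the toy labels are hypothetical.

```python
def score_clusters(clusters, superfamily):
    """Count proteins as correct, incorrect or singletons.

    `clusters` is a list of lists of protein identifiers and
    `superfamily` maps each identifier to its SCOP superfamily label.
    """
    counts = {"correct": 0, "incorrect": 0, "singletons": 0}
    for members in clusters:
        if len(members) == 1:
            counts["singletons"] += 1
        elif len({superfamily[m] for m in members}) == 1:
            # Every neighbour in the cluster shares the same superfamily.
            counts["correct"] += len(members)
        else:
            # The cluster mixes superfamilies (assumption: all members count).
            counts["incorrect"] += len(members)
    return counts


# Toy labels for illustration only, not data from the paper.
labels = {"a": "X", "b": "X", "c": "X", "d": "Y", "e": "Y", "f": "Z"}
print(score_clusters([["a", "b"], ["c", "d", "e"], ["f"]], labels))
# → {'correct': 2, 'incorrect': 3, 'singletons': 1}
```

Under this reading the three counts always add up to the number of proteins, which agrees with the table: every row sums to 1357.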