Xianjun Shen, Yanli Zhao, Yanan Li, Tingting He, Jincai Yang, Xiaohua Hu.
Abstract
BACKGROUND: In recent years, many protein complex mining algorithms, such as the classical clique percolation method (CPM) and the Markov clustering (MCL) algorithm, have been developed for protein-protein interaction networks. However, most of the available algorithms concentrate primarily on mining dense protein subgraphs as protein complexes and fail to take into account the inherent organizational structure within protein complexes. Thus, there is a critical need to study the possibility of mining protein complexes using the topological information hidden in edges. Moreover, recent large-scale experimental analyses reveal that protein complexes have their own intrinsic organization.
Year: 2014 PMID: 25474367 PMCID: PMC4255745 DOI: 10.1186/1471-2105-15-S12-S7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Converting an undirected edge to directed, weighted edges. (a) An interaction graph (undirected and unweighted) containing node s and node t. (b) The undirected, unweighted edge between node s and node t. (c) The directed, weighted edges between node s and node t after conversion.
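The conversion described in the Figure 1 caption can be sketched in a few lines of Python. This is an illustration only: the record does not state how the edge weights are computed, so the default weighting rule (weight of s->t equals 1 / degree(s)) and the names `to_directed_weighted` and `weight_fn` are hypothetical placeholders, not the authors' method.

```python
from collections import defaultdict


def to_directed_weighted(edges, weight_fn=None):
    """Replace each undirected edge {s, t} with two directed, weighted edges.

    Returns a dict mapping (source, target) -> weight.
    """
    # Build the neighbour sets of the undirected, unweighted graph.
    neighbours = defaultdict(set)
    for s, t in edges:
        neighbours[s].add(t)
        neighbours[t].add(s)

    if weight_fn is None:
        # ASSUMPTION: placeholder weight, 1 / degree(source).
        # The actual weighting scheme is not given in this record.
        def weight_fn(source, target):
            return 1.0 / len(neighbours[source])

    directed = {}
    for s, t in edges:
        directed[(s, t)] = weight_fn(s, t)  # edge s -> t
        directed[(t, s)] = weight_fn(t, s)  # edge t -> s
    return directed


# Toy graph: node s has three neighbours, node t has one.
edges = [("s", "t"), ("s", "a"), ("s", "b")]
directed = to_directed_weighted(edges)
print(directed[("s", "t")], directed[("t", "s")])
```

The point of the sketch is that the two directions of one interaction can carry different weights once the weight depends on the source node's neighbourhood; any other rule can be passed in through `weight_fn`.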
Figure 2. The average size of predicted complexes under different values of the extended level parameter α. The impact of the extended level parameter α on the Krogan and Collins datasets.
Various performance indicators of different algorithms on the Krogan and Collins datasets
| Dataset | Method | clusters | matched | Sn | PPV | Acc |
|---|---|---|---|---|---|---|
| Krogan | CFinder | 121 | 34 | 0.611 | 0.162 | 0.315 |
| | MCL | 483 | 68 | 0.411 | 0.408 | 0.409 |
| | | 114 | 66 | 0.661 | 0.367 | 0.492 |
| | | 183 | 88 | 0.587 | 0.409 | 0.536 |
The figures for CFinder and MCL are taken from reference [17].
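The record does not define the Acc column. In complex-prediction benchmarks, accuracy is commonly taken as the geometric mean of sensitivity (Sn) and positive predictive value (PPV). Assuming that definition applies here, the following sketch recomputes Acc for the CFinder and MCL rows on Krogan; the function name `accuracy` is illustrative.

```python
import math


def accuracy(sn, ppv):
    """Geometric mean of sensitivity and positive predictive value.

    ASSUMPTION: this is the usual definition of Acc in complex-prediction
    evaluations; the record itself does not spell it out.
    """
    return math.sqrt(sn * ppv)


# (Sn, PPV, reported Acc) for the two labelled Krogan rows of the table.
rows = {
    "CFinder": (0.611, 0.162, 0.315),
    "MCL": (0.411, 0.408, 0.409),
}

for method, (sn, ppv, reported) in rows.items():
    print(f"{method}: computed {accuracy(sn, ppv):.3f}, reported {reported}")
```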
The five protein complexes with the smallest p-values found by the MKE mining algorithm
| Dataset | ID | P-value | Identification | Gene Ontology term |
|---|---|---|---|---|
| Krogan | 1 | 4.78e-47 | 26 out of 26 genes, 100.0% | RNA splicing, via transesterification reactions with bulged adenosine as nucleophile |
| | 2 | 9.26e-41 | 15 out of 15 genes, 100.0% | chromatin disassembly |
| | 3 | 2.47e-30 | 19 out of 21 genes, 90.5% | mitochondrial translation |
| | 4 | 7.92e-26 | 19 out of 23 genes, 82.6% | modification-dependent protein catabolic process |
| | 5 | 3.80e-19 | 17 out of 17 genes, 100.0% | transcription from RNA polymerase II promoter |
| | 1 | 1.98e-91 | 67 out of 93 genes, 72.0% | cytoplasmic translation |
| | 2 | 1.07e-28 | 25 out of 25 genes, 100.0% | transcription from RNA polymerase II promoter |
| | 3 | 1.42e-41 | 24 out of 24 genes, 100.0% | mitochondrial translation |
| | 4 | 6.79e-28 | 14 out of 15 genes, 93.3% | mRNA 3'-end processing |
| | 5 | 5.17e-44 | 16 out of 16 genes, 100.0% | chromatin disassembly |
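P-values of this kind are usually obtained from a hypergeometric test: the probability of seeing at least the observed number of genes annotated with a GO term in a cluster of a given size, when drawing from the whole annotated genome. The record does not say which test or background set was used, so the sketch below is a generic illustration under that assumption; the function name and the toy numbers are hypothetical and do not reproduce the table.

```python
from math import comb


def enrichment_p_value(population, annotated, cluster_size, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population   -- number of genes in the background set
    annotated    -- background genes carrying the GO term
    cluster_size -- number of genes in the predicted complex
    hits         -- genes in the complex carrying the GO term
    """
    upper = min(cluster_size, annotated)
    total = comb(population, cluster_size)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, cluster_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Toy example (made-up numbers): 10 genes, 4 annotated, cluster of 3, 2 hits.
p = enrichment_p_value(population=10, annotated=4, cluster_size=3, hits=2)
print(p)
```

Exact integer binomials keep the tail sum free of rounding error until the final division, which matters when the p-value is very small.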
GO semantic similarity and co-localization enrichment analysis of the MKE algorithm
| Dataset | Method | GO semantic | Co-localization | Arithmetic |
|---|---|---|---|---|
| Krogan | | 0.482 | 0.448 | 0.465 |
| | | 0.429 | 0.682 | 0.556 |
| | | 0.725 | 0.616 | 0.671 |
| | | 0.783 | 0.900 | 0.842 |
| | | 0.984 | 0.768 | 0.876 |
The figures for the Reference dataset are taken from reference [21].
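The "Arithmetic" column appears to be the arithmetic mean of the GO semantic and co-localization scores. That reading is an assumption, but it agrees with every row of the table to within rounding, as the following check shows; the names `arithmetic_mean` and `rows` are illustrative.

```python
def arithmetic_mean(go_semantic, co_localization):
    """Arithmetic mean of the two scores (assumed meaning of the column)."""
    return (go_semantic + co_localization) / 2


# (GO semantic, Co-localization, reported Arithmetic), in table order.
rows = [
    (0.482, 0.448, 0.465),
    (0.429, 0.682, 0.556),
    (0.725, 0.616, 0.671),
    (0.783, 0.900, 0.842),
    (0.984, 0.768, 0.876),
]

for go, coloc, reported in rows:
    mean = arithmetic_mean(go, coloc)
    # Reported values are rounded to three decimals, so allow a small tolerance.
    consistent = abs(mean - reported) <= 6e-4
    print(f"{go} {coloc} -> {mean:.4f} (reported {reported}, consistent: {consistent})")
```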