Eileen Marie Hanna, Nazar Zaki, Amr Amin.
Abstract
Developing suitable methods for detecting protein complexes in protein interaction networks remains an active area of research. The objective matters because protein complexes are key players in most cellular processes: the more complexes we identify, the better we can understand normal as well as abnormal molecular events. To date, various computational methods have been designed for this purpose. Despite their notable performance, questions remain about how to improve them and about guidelines for introducing novel approaches. A closer look suggests that the way in which protein interaction networks are initially viewed should be adjusted: these networks are dynamic in reality, and this fact must be considered to enhance the detection of protein complexes. In this paper, we present "DyCluster", a framework that models the dynamic aspect of protein interaction networks by incorporating gene expression data, through biclustering techniques, prior to applying complex-detection algorithms. The experimental results show that DyCluster leads to higher numbers of correctly detected complexes with better evaluation scores. The high accuracy achieved by DyCluster in detecting protein complexes is a valid argument in favor of the proposed method. DyCluster is also able to detect biologically meaningful protein groups. The code and datasets used in the study are downloadable from https://github.com/emhanna/DyCluster.
Year: 2015 PMID: 26641660 PMCID: PMC4671556 DOI: 10.1371/journal.pone.0144163
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Snapshots of a hypothetical PPI network, showing its dynamics through different temporal, spatial and/or contextual settings.
Nodes and edges of the same color belong to the same protein complex.
Fig 2. An outline of the DyCluster framework developed for the detection of protein complexes in dynamic PPI networks modeled as gene expression biclusters.
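The idea outlined in the framework can be illustrated with a minimal sketch. It assumes that each bicluster is a set of gene names that map one-to-one onto protein names in the PPI network, and that the sub-network induced by each bicluster is passed to a detection method. The helper names (`induced_subnetwork`, `detect_in_biclusters`) are hypothetical, the toy data are invented, and the connected-components detector is only a stand-in for methods such as ProRank or ClusterONE; this is not the authors' implementation.

```python
from collections import defaultdict


def induced_subnetwork(edges, proteins):
    """Keep only interactions whose two endpoints are both in `proteins`."""
    return [(a, b) for a, b in edges if a in proteins and b in proteins]


def connected_components(edges, min_size=3):
    """Stand-in detector (assumption): connected components with >= min_size nodes."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        if len(component) >= min_size:
            components.append(frozenset(component))
    return components


def detect_in_biclusters(edges, biclusters, detector=connected_components):
    """Run `detector` on the sub-network induced by each bicluster and
    collect the resulting complexes, dropping exact duplicates."""
    complexes = set()
    for genes in biclusters:
        complexes.update(detector(induced_subnetwork(edges, set(genes))))
    return complexes


# Toy data, invented for illustration only.
ppi_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"),
             ("D", "E"), ("E", "F"), ("D", "F")]
biclusters = [{"A", "B", "C"}, {"D", "E", "F"}, {"A", "B", "C", "X"}]
complexes = detect_in_biclusters(ppi_edges, biclusters)
```

On the static toy network the stand-in detector would return a single six-protein component; restricting it to each bicluster separates the two triangles, which is the kind of effect the framework aims for.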
Parameter settings of the applied biclustering algorithms.
| Algorithm | Parameter setting |
|---|---|
| CC | upper limit of MSR: |
| CC | threshold for multiple node deletion: |
| CC | number of output biclusters = 10 |
| OPSM | number of passed models for each iteration: |
| ISA | threshold of genes: |
| ISA | threshold of chips: |
| ISA | number of starting points = 100 |
| | distance measure: Pearson’s correlation |
| | number of clusters = 10 |
| | number of iterations = 100 |
| | number of replications = 1 |
The formulas of the quality scores used to evaluate our approach.
| Evaluation Score | Equation |
|---|---|
| Overlap score between two protein complexes | |
| Clustering-wise sensitivity | |
| Clustering-wise positive predictive value | |
| Accuracy | |
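The equations themselves are not present in this copy. For orientation, the forms below are the definitions commonly used for these scores in the complex-detection literature; they are an assumption here, not a transcription of the authors' formulas. In this notation (also assumed), $A$ and $B$ are two complexes viewed as protein sets, $T_{ij}$ is the number of proteins shared by reference complex $i$ and detected complex $j$, $N_i$ is the size of reference complex $i$, and there are $n$ reference and $m$ detected complexes.

```latex
\begin{align}
\mathrm{OS}(A, B) &= \frac{|A \cap B|^{2}}{|A|\,|B|} \\
\mathrm{Sn} &= \frac{\sum_{i=1}^{n} \max_{j} T_{ij}}{\sum_{i=1}^{n} N_{i}} \\
\mathrm{PPV} &= \frac{\sum_{j=1}^{m} \max_{i} T_{ij}}{\sum_{j=1}^{m} \sum_{i=1}^{n} T_{ij}} \\
\mathrm{Acc} &= \sqrt{\mathrm{Sn} \times \mathrm{PPV}}
\end{align}
```

The geometric-mean form of Acc agrees with the reported ProRank scores: $\sqrt{0.3072 \times 0.7237} \approx 0.4715$.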
Experimental results of matching the detected sets of protein complexes by various detection methods against the CYC2008 reference catalogue.
| Method | No. of matched complexes | No. of detected complexes | Acc | Sn | MMR | PPV |
|---|---|---|---|---|---|---|
| ProRank | 41 | 230 | 0.4715 | 0.3072 | 0.1032 | 0.7237 |
| ProRank+ | 46 | 274 | 0.4788 | 0.3371 | 0.1161 | 0.6801 |
| ClusterONE | 76 | 365 | 0.6008 | 0.511 | 0.2349 | 0.7064 |
| CMC | 114 | 4292 | 0.6587 | 0.6517 | 0.347 | 0.6658 |
| MCODE | 62 | 168 | 0.55 | 0.4271 | 0.149 | 0.7082 |
| CFinder | 116 | 6381 | 0.6143 | 0.5641 | 0.3776 | 0.669 |
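The relationship between the score columns can be checked numerically. The sketch below assumes that the four columns after the complex counts are Acc, Sn, MMR and PPV in that order, and that accuracy is the geometric mean of Sn and PPV; both are inferences from the reported values rather than statements in the text. The values are copied from the table above.

```python
import math

# (method, Acc, Sn, PPV) copied from the results table; MMR is not needed here.
rows = [
    ("ProRank", 0.4715, 0.3072, 0.7237),
    ("ProRank+", 0.4788, 0.3371, 0.6801),
    ("ClusterONE", 0.6008, 0.511, 0.7064),
    ("CMC", 0.6587, 0.6517, 0.6658),
    ("MCODE", 0.55, 0.4271, 0.7082),
    ("CFinder", 0.6143, 0.5641, 0.669),
]


def accuracy(sn, ppv):
    """Geometric mean of sensitivity and positive predictive value (assumed form)."""
    return math.sqrt(sn * ppv)


# Largest absolute difference between the recomputed and the reported accuracy.
max_gap = max(abs(accuracy(sn, ppv) - acc) for _, acc, sn, ppv in rows)

for method, acc, sn, ppv in rows:
    print(f"{method:<11} reported={acc:.4f} recomputed={accuracy(sn, ppv):.4f}")
```

Every row is reproduced to within rounding of the four-decimal table entries, which supports the assumed column order.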
Experimental results of matching the detected sets of protein complexes by our proposed framework against the CYC2008 reference catalogue in comparison to ProRank, ProRank+, ClusterONE, CMC, MCODE and CFinder.
| Method | Biclustering Algorithm | No. of matched complexes | No. of detected complexes | Acc | Sn | MMR | PPV |
|---|---|---|---|---|---|---|---|
| ProRank | OPSM | 78 | 335 | 0.5911 | 0.4627 | 0.2103 | 0.755 |
| ProRank | CC | 63 | 252 | 0.5658 | 0.4296 | 0.1804 | 0.7451 |
| ProRank | ISA | 71 | 320 | 0.564 | 0.4332 | 0.195 | 0.7342 |
| ProRank | | 71 | 331 | 0.556 | 0.4222 | 0.1896 | 0.7322 |
| ProRank+ | OPSM | 81 | 397 | 0.5982 | 0.5116 | 0.225 | 0.6995 |
| ProRank+ | CC | 65 | 305 | 0.5668 | 0.4724 | 0.1947 | 0.6802 |
| ProRank+ | ISA | 78 | 392 | 0.5677 | 0.4719 | 0.2231 | 0.683 |
| ProRank+ | | 78 | 424 | 0.5687 | 0.4782 | 0.2196 | 0.6764 |
| ClusterONE | OPSM | 89 | 929 | 0.6426 | 0.5758 | 0.2469 | 0.7172 |
| ClusterONE | CC | 78 | 578 | 0.6267 | 0.5465 | 0.2036 | 0.7186 |
| ClusterONE | ISA | 87 | 890 | 0.6015 | 0.5506 | 0.2499 | 0.6571 |
| ClusterONE | | 83 | 862 | 0.6153 | 0.533 | 0.2334 | 0.7102 |
| CMC | OPSM | 100 | 1207 | 0.6159 | 0.5566 | 0.2903 | 0.6816 |
| CMC | CC | 95 | 1145 | 0.5983 | 0.5264 | 0.2844 | 0.6801 |
| CMC | ISA | 100 | 1843 | 0.6041 | 0.5518 | 0.3071 | 0.6614 |
| CMC | | 94 | 1126 | 0.6088 | 0.5542 | 0.2913 | 0.6689 |
| MCODE | OPSM | 71 | 475 | 0.5695 | 0.4602 | 0.1835 | 0.7049 |
| MCODE | CC | 60 | 285 | 0.545 | 0.4058 | 0.1581 | 0.7321 |
| MCODE | ISA | 63 | 315 | 0.5529 | 0.4232 | 0.171 | 0.7222 |
| MCODE | | 74 | 448 | 0.5658 | 0.4583 | 0.1947 | 0.6986 |
| CFinder | OPSM | 94 | 2079 | 0.6187 | 0.525 | 0.2925 | 0.7291 |
| CFinder | CC | 98 | 1236 | 0.5977 | 0.559 | 0.3005 | 0.6391 |
| CFinder | ISA | 99 | 2119 | 0.5738 | 0.5393 | 0.3021 | 0.6104 |
| CFinder | | 99 | 1352 | 0.5988 | 0.5455 | 0.3098 | 0.6574 |
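To read the two result tables side by side, the change in the number of matched complexes can be computed per detection method. The sketch below uses only the OPSM rows as an example; the counts are copied from the tables, and the variable names are illustrative.

```python
# Matched complexes on the static network (first results table).
baseline_matched = {
    "ProRank": 41, "ProRank+": 46, "ClusterONE": 76,
    "CMC": 114, "MCODE": 62, "CFinder": 116,
}

# Matched complexes with the framework using OPSM biclusters (second results table).
opsm_matched = {
    "ProRank": 78, "ProRank+": 81, "ClusterONE": 89,
    "CMC": 100, "MCODE": 71, "CFinder": 94,
}

# Positive values mean more reference complexes were matched with the framework.
delta = {m: opsm_matched[m] - baseline_matched[m] for m in baseline_matched}

for method, change in delta.items():
    print(f"{method:<11}{change:+d}")
```

With these counts, the OPSM variant matches more complexes for ProRank, ProRank+, ClusterONE and MCODE, and fewer for CMC and CFinder, whose baseline runs detect far more candidate complexes overall.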
Fig 3. Statistical significance of score differences between pairs of protein-complex detection methods without and with gene expression data, based on the proposed framework.
Only p-values less than or equal to 0.1 are displayed; they reflect improvements in the scores, i.e. in the matching quality of the detected protein complexes.
The biological components detected by our framework, listed by types, along with their matching percentages.
| Type | Detected Component | Matching Percentage |
|---|---|---|
| InterPro-Domains | Chemokine receptor family | 100 |
| InterPro-Domains | G protein-coupled receptor, rhodopsin-like | 100 |
| InterPro-Domains | GPCR, rhodopsin-like, 7TM | 100 |
| InterPro-Domains | BLC2 family | 83.3 |
| InterPro-Domains | BLC2-like | 83.3 |
| InterPro-Domains | Death effector domain | 66.7 |
| InterPro-Domains | Interleukin-6 receptor alpha, binding | 50 |
| InterPro-Domains | Death domain | 100 |
| InterPro-Domains | Apoptosis regulator, Bcl-2, BH2 motif, conserved site | 75 |
| InterPro-Domains | Chemokine interleukin-8-like domain | 60 |
| KEGG Pathway | Chemokine signaling pathway | 40 |
| KEGG Pathway | Cytokine-cytokine receptor interaction | 32.8 |
| KEGG Pathway | NOD-like receptor signaling pathway | 31.3 |
| KEGG Pathway | Apoptosis | 34.4 |
| KEGG Pathway | Autoimmune thyroid disease | 71.4 |
| KEGG Pathway | Huntington’s disease | 66.7 |
| KEGG Pathway | Systemic lupus erythematosus | 40 |
| KEGG Pathway | Asthma | 50 |
| KEGG Pathway | Intestinal immune network for IgA production | 25 |
| KEGG Pathway | Cell adhesion molecules | 50 |
| KEGG Pathway | Pathways in cancer | 70 |
| Molecular Function | Peptide receptor activity | 58.3 |
| Molecular Function | Receptor activity | 52.2 |
| Molecular Function | Growth factor activity | 60 |
| Molecular Function | C-C chemokine binding | 66.7 |
| Molecular Function | Tumor necrosis factor receptor superfamily binding | 40 |
| Molecular Function | Death effector domain binding | 66.7 |
| Molecular Function | Growth factor binding | 50 |
| Molecular Function | Nucleic acid binding transcription factor activity | 75 |
| Molecular Function | Chemokine activity | 77.8 |
| Pfam Domains | 7 transmembrane receptor, rhodopsin family | 100 |
| Pfam Domains | Apoptosis regulator proteins, Bcl-2 family | 83.3 |
| Pfam Domains | Death effector domain | 66.7 |
| Pfam Domains | Interleukin-6 receptor alpha chain, binding | 50 |
| Pfam Domains | Small cytokines (intecrine/chemokine), interleukin-8 like | 53.3 |
| Pfam Domains | Death domain | 100 |
| Reactome Pathway | Activation of DNA fragmentation factor | 66.7 |
| Reactome Pathway | Interleukin-1 family precursors are cleaved by caspase-1 | 100 |
| Reactome Pathway | Downstream TCR signaling | 100 |
| Reactome Pathway | FasL/CD95L signaling | 100 |
| Reactome Pathway | Exocytosis of platelet alpha granule contents | 100 |
| Reactome Pathway | IRAK4 is activated by autophosphorylation | 75 |
| Reactome Pathway | Beta defensins | 66.7 |
| Reactome Pathway | TRAIL signaling | 66.7 |
| Reactome Pathway | Interleukin-1 processing | 75 |
| Reactome Pathway | FASL:FAS Receptor Trimer, FADD complex | 100 |
Fig 4. The number of matched (in green) and detected (in blue) complexes per detection method.