Antonino Fiannaca, Massimo La Rosa, Alfonso Urso, Riccardo Rizzo, Salvatore Gaglio.
Abstract
BACKGROUND: We introduce a Knowledge-based Decision Support System (KDSS) to address the Protein Complex Extraction problem. Using a Knowledge Base (KB) that encodes the expertise about the proposed scenario, our KDSS suggests both strategies and tools according to the features of the input dataset. The system provides a navigable workflow for the current experiment and also supports the configuration and execution of every processing component of that workflow. This last feature makes the system a crossover between classical DSS and Workflow Management Systems.
Year: 2013 PMID: 23368995 PMCID: PMC3548703 DOI: 10.1186/1471-2105-14-S1-S5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. System architecture. The system is built upon a rule-based expert system. The KB contains the expertise about the application domain in the form of facts and rules. The Reasoner, an inference engine, decides which strategies to follow and which tools to run according to the user's requests, the input data and the available knowledge, and suggests them to the user. The Executor runs the executable processing tools and updates the KB with the processing results, which can then be used to make new inferences.
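The interplay between KB, Reasoner and Executor described in this caption can be illustrated with a minimal forward-chaining sketch. It is not the authors' implementation: the `Rule` class, the `reason` and `execute` functions, and the fact strings such as `"suggest:preprocessing"` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Rule:
    """A hypothetical KB rule: if all conditions are known facts, add the suggestion."""
    name: str
    conditions: FrozenSet[str]
    suggestion: str


def reason(facts: Iterable[str], rules: List[Rule]) -> Tuple[Set[str], List[str]]:
    """Reasoner stand-in: fire rules repeatedly until no new fact can be derived."""
    known = set(facts)
    fired = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conditions <= known and rule.suggestion not in known:
                known.add(rule.suggestion)
                fired.append(rule.name)
                changed = True
    return known, fired


def execute(step: str, known: Set[str]) -> Set[str]:
    """Executor stand-in: record that a step ran, so the KB holds a new fact."""
    return known | {f"done:{step}"}


# Hypothetical rules; the real KB content is not given in the source.
RULES = [
    Rule("needs-preprocessing",
         frozenset({"task:complex-extraction", "data:ppi-network"}),
         "suggest:preprocessing"),
    Rule("then-clustering",
         frozenset({"done:preprocessing"}),
         "suggest:clustering"),
]

facts_before, fired_before = reason(
    {"task:complex-extraction", "data:ppi-network"}, RULES)
facts_after, fired_after = reason(execute("preprocessing", facts_before), RULES)
```

In this sketch the clustering suggestion only appears after the Executor has written `done:preprocessing` back into the fact set, which mirrors the caption's point that processing results enable new inferences.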
Figure 2. Data-Problem-Solver ontology for knowledge-based expert systems. An overview of the Data-Problem-Solver paradigm used for building a complete and exhaustive Knowledge Base.
Input dataset.
| Protein_A | Protein_B | PPI_ID |
|---|---|---|
| act1 | abp1 | DIP-10439E |
| app1 | abp1 | DIP-9959E |
| cla4 | abp1 | DIP-3499E |
| sla1 | abp1 | DIP-2452E |
| srv2 | abp1 | DIP-1139E |
| yor284w | abp1 | DIP-3500E |
| act1 | act1 | DIP-1145E |
| bni1 | act1 | DIP-1155E |
| cof1 | act1 | DIP-1157E |
| las17 | act1 | DIP-1158E |
| pfy1 | act1 | DIP-1143E |
| sla2 | act1 | DIP-1175E |
| act1 | aip1 | DIP-1140E |
| srv2 | aip1 | DIP-3502E |
| hsl7 | app1 | DIP-3683E |
| rvs167 | app1 | DIP-3907E |
| sla2 | app1 | DIP-3966E |
| ysc84 | app1 | DIP-11282E |
| cdc42 | bni1 | DIP-1154E |
| cap2 | cap1 | DIP-3546E |
| gic2 | cap1 | DIP-3547E |
| cla4 | cdc42 | DIP-2580E |
| gic2 | cdc42 | DIP-2583E |
| gic2 | cla4 | DIP-3639E |
| aip1 | cof1 | DIP-1346E |
| app1 | cof1 | DIP-14613E |
| las17 | cof1 | DIP-1161E |
| app1 | crn1 | DIP-3604E |
| cof1 | crn1 | DIP-11816E |
| crn1 | crn1 | DIP-4127E |
| hsl7 | hsl7 | DIP-9812E |
| swe1 | hsl7 | DIP-7787E |
| cap2 | las17 | DIP-1160E |
| las17 | las17 | DIP-11092E |
| rvs167 | las17 | DIP-3699E |
| sla1 | las17 | DIP-1162E |
| sla2 | las17 | DIP-15438E |
| ysc84 | las17 | DIP-11095E |
| bni1 | pfy1 | DIP-1164E |
| bnr1 | pfy1 | DIP-1166E |
| srv2 | pfy1 | DIP-3762E |
| app1 | rvs161 | DIP-4047E |
| las17 | rvs161 | DIP-4048E |
| ybr108w | rvs161 | DIP-1781E |
| abp1 | rvs167 | DIP-1138E |
| acf2 | rvs167 | DIP-3900E |
| act1 | rvs167 | DIP-1146E |
| rvs161 | rvs167 | DIP-1780E |
| rvs167 | rvs167 | DIP-3901E |
| sla2 | rvs167 | DIP-10013E |
| ybr108w | rvs167 | DIP-3902E |
| ygr268c | rvs167 | DIP-3903E |
| yjr083c | rvs167 | DIP-3904E |
| ypr171w | rvs167 | DIP-10016E |
| ysc84 | rvs167 | DIP-10017E |
| app1 | sla1 | DIP-10020E |
| rvs167 | sla1 | DIP-10011E |
| srv2 | sla1 | DIP-10018E |
| ygr268c | sla1 | DIP-10019E |
| yor284w | sla1 | DIP-11232E |
| ypr171w | sla1 | DIP-3964E |
| abp1 | sla2 | DIP-2453E |
| cla4 | sla2 | DIP-3965E |
| sla2 | sla2 | DIP-3144E |
| act1 | srv2 | DIP-1144E |
| cof1 | srv2 | DIP-11822E |
| rvs167 | srv2 | DIP-3029E |
| srv2 | srv2 | DIP-1177E |
| trm5 | srv2 | DIP-4014E |
| crn1 | svl3 | DIP-3603E |
| app1 | swe1 | DIP-4050E |
| ygr268c | ygr268c | DIP-2272E |
| ysc84 | ygr268c | DIP-2243E |
| las17 | yhr133c | DIP-3700E |
| yjr083c | yjr083c | DIP-4186E |
| ysc84 | yjr083c | DIP-11280E |
| rvs167 | ynl086w | DIP-3906E |
| rvs167 | yor284w | DIP-10015E |
| sla2 | yor284w | DIP-3967E |
| yor284w | yor284w | DIP-6160E |
| ysc84 | yor284w | DIP-11283E |
| las17 | ypl246c | DIP-3702E |
| sla1 | ypl246c | DIP-11231E |
| cap1 | ypr171w | DIP-9981E |
| ysc84 | ypr171w | DIP-11285E |
| abp1 | ysc84 | DIP-11370E |
| acf2 | ysc84 | DIP-11277E |
| sla1 | ysc84 | DIP-2242E |
| sla2 | ysc84 | DIP-3968E |
| ypl246c | ysc84 | DIP-11284E |
There are 90 PPIs among 34 proteins for the species Saccharomyces cerevisiae. Each row contains one PPI: the UniProtKB ID of the first protein, the UniProtKB ID of the second protein, and the DIP ID of the interaction between that pair of proteins. The complete set of PPIs for this species is available in the Scere20081014.txt file, provided by the DIP online database [2].
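As a minimal sketch of how such a PPI list could be turned into a network for later processing, the rows can be loaded into an undirected adjacency map. The function name `build_network` and the tuple layout `(protein_a, protein_b, ppi_id)` are assumptions made for illustration; the format of the Scere20081014.txt file itself is not described in the source and is not parsed here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_network(rows: Iterable[Tuple[str, str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from (protein_a, protein_b, ppi_id) rows.

    Self-interactions such as act1/act1 are kept as self-loops.
    """
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for protein_a, protein_b, _ppi_id in rows:
        adjacency[protein_a].add(protein_b)
        adjacency[protein_b].add(protein_a)
    return dict(adjacency)


# Three rows copied from the input dataset table.
sample_rows = [
    ("act1", "abp1", "DIP-10439E"),
    ("app1", "abp1", "DIP-9959E"),
    ("act1", "act1", "DIP-1145E"),
]
network = build_network(sample_rows)
protein_count = len(network)
```

An adjacency map keeps the protein count and each protein's neighbours directly accessible, which is the information that preprocessing and clustering steps typically need.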
Figure 3. Decision making modules for the Protein Complex Extraction scenario. The tree structure among modules is projected into a treemap representation. Each parent module is responsible for the activation of its children modules. In the treemap, this relation is depicted through a set of nested boxes.
Figure 4. Workflow of the preprocessing phase. This figure depicts a state of the system during the preprocessing phase, in which two decision making modules have been used so far. The child module, "Complex Preprocessing", reports at "Abstraction Layer 1" the execution of two strategies ("Add FN PPIs" and "Delete FP PPIs") and, at the lower abstraction layer, the executed algorithms (yellow boxes).
Figure 5. Workflow of the whole experiment. The system shows all strategies (blue boxes) and algorithms (yellow boxes) that have been used during this scenario. They are arranged in three workflows, one for each abstraction layer. The workflow at "Abstraction Layer 0" reports the complex extraction process at object level.
Figure 6. Clustering visualization with the Cytoscape tool. Cytoscape shows the clustered network arranged in a hierarchical layout. Each complex is depicted in a different colour.
System output.
| Complex | Proteins |
|---|---|
| 1 | app1, swe1, hsl7 |
| 2 | act1, srv2, bnr1, bni1, cof1, trm5, aip1 |
| 3 | sla2, abp1, yor284w, rvs167, ysc84, sla1, ynl086w, ypl246c, rvs161, acf2, ybr108w, yjr083c, ygr268c, ypr171w, yhr133c |
| 4 | cap2, gic2 |
| 5 | crn1, svl3 |
The implemented workflow is composed of two algorithms in cascade: Detect Defective Cliques (network preprocessing) and MCL (network clustering). Run with standard parameters, the system returns five protein complexes.
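The clustering step can be illustrated with a small sketch, assuming that MCL here refers to the Markov Cluster algorithm, which alternates expansion (matrix squaring) and inflation (element-wise powering followed by column normalisation) on a column-stochastic matrix. This is a pure-Python toy, not the tool the system actually runs; the function names, the inflation value of 2.0 and the stopping tolerance are illustrative assumptions.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Matrix = List[List[float]]


def normalize_columns(m: Matrix) -> Matrix:
    """Rescale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain square matrix product, used for the expansion step."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(edges: Iterable[Tuple[str, str]], inflation: float = 2.0,
        max_iter: int = 50, tol: float = 1e-9) -> Set[FrozenSet[str]]:
    """Toy Markov clustering of an undirected, unweighted network."""
    edges = list(edges)
    nodes = sorted({p for edge in edges for p in edge})
    index = {p: i for i, p in enumerate(nodes)}
    n = len(nodes)
    m = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        m[index[a]][index[b]] = m[index[b]][index[a]] = 1.0
    for i in range(n):
        m[i][i] = 1.0  # self-loops, so that no column sums to zero
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)                                   # expansion
        inflated = normalize_columns(
            [[x ** inflation for x in row] for row in expanded])  # inflation
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries flow defines a cluster from its non-zero columns.
    clusters = set()
    for i in range(n):
        members = frozenset(nodes[j] for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return clusters


# Two triangles taken from the input dataset, considered in isolation.
toy_edges = [("hsl7", "app1"), ("swe1", "hsl7"), ("app1", "swe1"),
             ("cla4", "cdc42"), ("gic2", "cdc42"), ("gic2", "cla4")]
toy_clusters = mcl(toy_edges)
```

On this isolated toy network the two triangles come out as two separate clusters. The result for the full 90-PPI network depends on the preprocessing step and on the parameters, so this sketch should not be read as a reproduction of the system output table.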
Comparison between the proposed approach and other approaches.
| Methods | Protein complexes | Protein fraction | p-Value |
|---|---|---|---|
| Proposed System | app1, swe1, hsl7 | 2/3 | 2.17e-03 |
| PINCoC | swe1, hsl7 | 2/2 | 6.90e-04 |
| UVCluster | app1, swe1, hsl7 | 2/3 | 2.17e-03 |
| Proposed System | act1, srv2, bnr1, bni1, cof1, trm5, aip1 | 4/7 | 4.92e-10 |
| PINCoC | bnr1, bni1, pfy1, act1, srv2, aip1, trm5 | 5/7 | 1.52e-07 |
| UVCluster | act1, srv2, aip1, trm5, cof1 | 4/5 | 7.30e-04 |
| Proposed System | sla2, abp1, yor284w, rvs167, ysc84, sla1, ynl086w, ypl246c, rvs161, acf2, ybr108w, yjr083c, ygr268c, ypr171w, yhr133c | 7/15 | 4.65e-08 |
| PINCoC | sla2, abp1, yor284w, rvs167, ysc84, app1, rvs161, ynl086w, yjr083c, acf2 | 6/10 | 6.72e-08 |
| UVCluster | sla2, abp1, yor284w, rvs167, ysc84, sla1, ygr268c | 4/7 | 5.93e-05 |
| Proposed System | crn1, svl3 | 0/2 | > 0.01 |
| PINCoC | crn1, svl3, las17, yhr133c, cof1 | 3/5 | 9.07e-06 |
| UVCluster | crn1, svl3 | 0/2 | > 0.01 |
| Proposed System | ---- | --- | --- |
| PINCoC | cdc42, cla4, gic2 | 3/3 | 1.76e-06 |
| UVCluster | cdc42, cla4, gic2 | 3/3 | 1.76e-06 |
The table compares the proposed approach with two other approaches, PINCoC and UVCluster, that have previously been tested on this database. The proposed system outperforms the other tools on the complexes Actin Filament Depolymerization and Actin Cytoskeleton Organization.
Comparison of different clustering strategies for the protein complex extraction problem.
| Techniques | Protein Complexes | Protein Fraction | P-Value |
|---|---|---|---|
| G2/M T | | | |
| MCL | app1, swe1, hsl7 | 2/3 | 2.17e-03 |
| RNSC | app1, swe1, hsl7 | 2/3 | 2.17e-03 |
| MCODE | -- | 2/3 | 2.17e-03 |
| A | | | |
| MCL | act1, srv2, bnr1, bni1, cof1, trm5, aip1 | 4/7 | 4.92e-10 |
| RNSC | act1, srv2, aip1, cof1 | 4/4 | 2.94e-05 |
| MCODE | -- | -- | 5.25e-03 |
| A | | | |
| MCL | sla2, abp1, yor284w, rvs167, ysc84, sla1, ynl086w, ypl246c, rvs161, acf2, ybr108w, yjr083c, ygr268c, ypr171w, yhr133c | 7/15 | 4.65e-08 |
| RNSC | sla2, yor284w, rvs167, ysc84, sla1, ygr268c, abp1 | 4/7 | 5.93e-05 |
| MCODE | abp1, app1, rvs167, act1, yor284w, ysc84 | 4/6 | 1.93e-05 |
| A | | | |
| MCL | crn1, svl3 | 0/2 | > 0.01 |
| RNSC | crn1, svl3 | 0/2 | > 0.01 |
| MCODE | -- | 0/2 | > 0.01 |
| R | | | |
| MCL | -- | -- | -- |
| RNSC | cla4, bni1, cdc42, gic2 | 3/4 | 1.04e-05 |
| MCODE | cla4, cdc42, gic2 | 3/3 | 1.76e-06 |
The table compares three clustering techniques contained in the knowledge base of the system: MCL, RNSC and MCODE. The suggested tool allows the system to reach the smallest p-values for all the complexes except the Rho Protein Signal Transduction cluster.
Comparison of three preprocessing techniques when the MCODE tool is selected.
| Preprocessing Methods | PPI-Network Modification | Protein Complexes | Protein Fraction | P-Value |
|---|---|---|---|---|
| Detect Defective Cliques | Add 1 PPI | abp1, app1, rvs167, act1, ysc84 | 4/6 | 1.93e-05 |
| No Preprocessing | --- | abp1, app1, rvs167, act1, yor284w | 3/5 | 8.10e-04 |
| Betweenness Centrality | Remove 3 PPI (2 core PPI) | abp1, rvs167, ysc84, yor284w | 2/4 | 7.1e-05 |
| Detect Defective Cliques | Add 1 PPI | sla2, las17 | 2/2 | 5.30e-04 |
| No Preprocessing | --- | sla2, las17 | 2/2 | 5.30e-04 |
| Betweenness Centrality | Remove 3 PPI (2 core PPI) | --- | --- | --- |
| Detect Defective Cliques | Add 1 PPI | cla4, gic2, cdc42 | 3/3 | 1.76e-06 |
| No Preprocessing | --- | cla4, gic2, cdc42 | 3/3 | 1.76e-06 |
| Betweenness Centrality | Remove 3 PPI (2 core PPI) | cla4, gic2, cdc42 | 3/3 | 1.76e-06 |
The table compares some of the preprocessing tools contained in the knowledge base of the system. The suggested tool allows the system to reach the smallest p-values for all the complexes.