Vassilis Stavrakas1, Ioannis N Melas2, Theodore Sakellaropoulos1, Leonidas G Alexopoulos1.
Abstract
Modeling of signal transduction pathways is instrumental for understanding how cells function. Considerable effort has gone into modeling signaling pathways so that the signaling events inside the cell's biochemical microenvironment are represented accurately and in a way that is meaningful to biologists. In this article, we propose a method to interrogate such pathways in order to produce cell-specific signaling models. We integrate available prior knowledge of protein connectivity, in the form of a Prior Knowledge Network (PKN), with phosphoproteomic data to construct predictive models of the protein connectivity of the interrogated cell type. Several computational methodologies for logic modeling of pathways, based on optimization formulations or machine learning algorithms, have been published over the past few years. Here, we introduce a light and fast approach that uses a breadth-first traversal of the graph to identify the shortest pathways and score proteins in the PKN, fitting the dependencies extracted from the experimental design. The pathways are then combined through a heuristic formulation to produce a final topology that handles inconsistencies between the PKN and the experimental scenarios. Our results show that the algorithm is efficient and accurate for the construction of medium and large scale signaling networks. We demonstrate the applicability of the proposed approach by interrogating a manually curated interaction graph model of EGF/TNFA stimulation against fictitious experimental data. To guard against erroneous predictions, we performed a cross-validation analysis. Finally, we validate that the introduced approach generates predictive topologies comparable to those of the ILP formulation. Overall, we present an efficient graph-theoretic approach to interrogate protein-protein interaction networks and to provide meaningful biological insights.
Year: 2015 PMID: 26020784 PMCID: PMC4447287 DOI: 10.1371/journal.pone.0128411
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Data matrix of 2 considered experimental scenarios.
| | RAF-1 | ERK | AP1 | GSK-3 | P38 | NFKB | IKK | MAP3K1 | MAP3K7 | PI3K |
|---|---|---|---|---|---|---|---|---|---|---|
| EGF | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| TNFA | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
Herein, we present fictitious signaling events for 10 signals downstream of EGF and TNFA stimulation. Each row corresponds to one experimental scenario and each column contains the measured state change of one readout species. If a node is regulated in the respective scenario, then x = 1; otherwise x = 0. For instance, the TNFA stimulus activates the AP1 signal ($x_{23} = 1$) and produces no response in the measured IKK species ($x_{27} = 0$).
Setting the dependencies.
| k | Dependencies |
|---|---|
| 1 | EGF → RAF-1 |
| 2 | EGF → ERK |
| 3 | EGF → AP1 |
| . | . |
| . | . |
| . | . |
| 14 | TNFA → NFKB |
We present the 14 dependencies extracted from the experimental matrix in Table 1, one for each matrix entry equal to 1. The symbol → denotes the desired intermediate pathway between the two molecules.
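The extraction of dependencies from the data matrix can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `extract_dependencies` and the dictionary layout are assumptions, while the matrix values are copied from Table 1.

```python
# Experimental matrix from Table 1: one row per stimulus, one column per readout.
READOUTS = ["RAF-1", "ERK", "AP1", "GSK-3", "P38",
            "NFKB", "IKK", "MAP3K1", "MAP3K7", "PI3K"]
DATA = {
    "EGF":  [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "TNFA": [0, 0, 1, 1, 1, 1, 0, 0, 0, 0],
}


def extract_dependencies(data, readouts):
    """Return a (stimulus, readout) pair for every matrix entry equal to 1.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return [
        (stimulus, readout)
        for stimulus, row in data.items()
        for readout, x in zip(readouts, row)
        if x == 1
    ]


deps = extract_dependencies(DATA, READOUTS)
for k, (stimulus, readout) in enumerate(deps, start=1):
    print(k, f"{stimulus} -> {readout}")
```

Each row contributes one dependency per activated readout, so the EGF row yields 10 pairs and the TNFA row yields 4.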
Fig 1. A simple example network used for illustration purposes: workflow.
(a) The full network adopted from [36], after applying the Direct Paths step. The Direct Paths are depicted as blue edges, while dashed lines mark edges and nodes not yet included in our solution. (b) The compressed model, as obtained after applying the Alternative Paths step and dealing with the conflicts detected in the network. In this compressed version of the network, a connection appears between TNFR and PI3K. The purpose of this new edge is not to link TNFA to P38 phosphorylation, but to satisfy the TNFA → GSK-3 dependency. The fact that TNFA links to P38 phosphorylation through this connection (i.e. TNFR → PI3K) is coincidental in this case and depends on the paths derived via the “Direct Paths” procedure (the blue edges have already been included in the final topology from the previous step). To satisfy the TNFA → P38 dependency, the algorithm chooses the shortest path TNFA → TNFR → TRAF2 → MAP3K7 → MKK4 → P38, which introduces two conflicts (i.e. the MAP3K7 and IKK nodes have been measured as inactive under TNFA stimulation). However, this error is offset by the satisfaction of two dependencies (TNFA → P38 and TNFA → NFKB). Consequently, the scoring method assesses this case as a draw (2 Satisfied Dependencies - 2 Conflicts Detected = 0). In this work we suggest that draw cases should be included in the compressed topology, as they add connectivity-topology information. The algorithmic steps and the experimental design are colour annotated. Blue marks the Direct Paths produced in the previous step, while red marks the Alternative ones. Nodes with crimson contours represent the detected inconsistencies (conflicts) between network topology and experimental measurements. Finally, dashed lines mark edges and components excluded from the final solution.
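The shortest-path search and the draw-case scoring described in the caption can be sketched as below. This is a hedged illustration under stated assumptions: only the chain TNFA → TNFR → TRAF2 → MAP3K7 → MKK4 → P38 and the TNFA measurements come from the source; the `HYP_*` detour nodes, the function names, and the rule that negative scores are excluded are hypothetical.

```python
from collections import deque

# Toy graph. The chain through TRAF2/MAP3K7/MKK4 is taken from the caption;
# the longer detour through HYP_A..HYP_D is hypothetical, added only so that
# breadth-first search has two candidate routes to choose from.
EDGES = {
    "TNFA": ["TNFR"],
    "TNFR": ["HYP_A", "TRAF2"],
    "TRAF2": ["MAP3K7"],
    "MAP3K7": ["MKK4"],
    "MKK4": ["P38"],
    "HYP_A": ["HYP_B"],
    "HYP_B": ["HYP_C"],
    "HYP_C": ["HYP_D"],
    "HYP_D": ["P38"],
}

# Measured states under TNFA stimulation (TNFA row of Table 1).
MEASURED_TNFA = {"RAF-1": 0, "ERK": 0, "AP1": 1, "GSK-3": 1, "P38": 1,
                 "NFKB": 1, "IKK": 0, "MAP3K1": 0, "MAP3K7": 0, "PI3K": 0}


def shortest_path(edges, source, target):
    """Breadth-first search; returns one shortest path or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in edges.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def conflicts_on_path(path, measured):
    """Nodes on the path that were measured as inactive (state 0)."""
    return [n for n in path if measured.get(n) == 0]


def score(satisfied, conflicts):
    """Satisfied dependencies minus detected conflicts."""
    return satisfied - conflicts


def keep(satisfied, conflicts):
    """Assumed rule: keep positive scores and draws (score == 0)."""
    return score(satisfied, conflicts) >= 0


path = shortest_path(EDGES, "TNFA", "P38")
conflicts = conflicts_on_path(path, MEASURED_TNFA)
print(path, conflicts, score(2, 2), keep(2, 2))
```

On this toy graph the search returns the five-edge chain from the caption rather than the six-edge detour, and MAP3K7 is the only measured-inactive node on that path. The call `score(2, 2)` reproduces the draw case from the caption, where the second conflict (IKK) lies outside this particular path.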
Reachable signals matrix according to the compressed model’s topology.
| | RAF-1 | ERK | AP1 | GSK-3 | P38 | NFKB | IKK | MAP3K1 | MAP3K7 | PI3K |
|---|---|---|---|---|---|---|---|---|---|---|
| EGF | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 |
| TNFA | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Herein we present the reachable signals matrix for our toy model, according to the compressed model’s topology. The matrix has the same size as the experimental matrix presented in Table 1. Each row corresponds to a stimulus used to perturb the cells and each column to a measured signal. If the measured node in column c is reachable from the perturbed node in row r, based on the compressed model’s topology, the corresponding matrix element is set to 1; otherwise it is set to 0.
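A reachable signals matrix of this kind can be computed with one graph traversal per stimulus. The sketch below is illustrative only: the function names are assumptions, and the small graph with nodes `S1`, `S2`, `A`, `R1`, `R2` is hypothetical, not the compressed model of Fig 1.

```python
from collections import deque


def reachable_from(edges, source):
    """Set of nodes reachable from `source` (breadth-first traversal)."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reachability_matrix(edges, stimuli, readouts):
    """Rows are stimuli, columns are readouts; 1 if reachable, else 0."""
    matrix = []
    for stimulus in stimuli:
        seen = reachable_from(edges, stimulus)
        matrix.append([int(readout in seen) for readout in readouts])
    return matrix


# Hypothetical topology: S1 reaches R1 through A, S2 reaches R2 directly.
toy_edges = {"S1": ["A"], "A": ["R1"], "S2": ["R2"]}
toy_matrix = reachability_matrix(toy_edges, ["S1", "S2"], ["R1", "R2"])
print(toy_matrix)
```

Running one traversal per row keeps the cost proportional to the number of stimuli times the size of the graph, independent of how many readouts are measured.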
Medium scale network experimental data.
| | hspb1 | akt1 | p70s6k | shp2 | jnk2 | ikba | gsk3b | p38mapk | nfkb | mp2k6 | tor | mek1 | erk1 | rsk1 | creb1 | rs6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| il6 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| tnfa | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| il1b | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 |
| tgfa | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| ins | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
The experimental scenarios, presented here in discretized format, consist of 5 stimuli and 16 measured key phosphoproteins and are described in detail in [4]. The proposed formulation requires a qualitative view of signal transduction, supporting only two discrete states that indicate the change in the activation state of signaling nodes ("1" for activation and "0" for unchanged state).
Fig 2. Medium scale network: compressed model.
The model structure can be compressed substantially, from 90 nodes and 139 edges to 41 nodes and 44 edges. The compressed model reflects the essential dependencies in the original network structure that can be addressed by the given set of measured nodes. Our solution resulted in a fitting error of 29, down from 59 in the original model. Several edges are absent because they conflict with the data. One example is the absence of RSK1 → RS6, which isolates RS6 activity from the IL1B stimulus. In a similar manner, several edges are preserved, such as MEK1 → ERK1 and MEK1 → RSK1, to permit the activity of ERK1 and RSK1 under TGFA treatment. Additionally, MAP3K → IKK enables the activation of the NFKB signal under both IL6 and TGFA stimulation and the activation of the measured IKBA node under the IL6 stimulus. Edges shown in red are those removed from the compressed model after a parameter change in our ranking method. The resulting model structure consists of 38 nodes and 41 edges. This new compressed model still reflects the experimental dependencies in the original network structure and yields a final fitting error of 19, compared with 59 in the original model and 29 in the previous solution.
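The fitting error is not defined in this excerpt. One plausible reading, assumed here and not confirmed by the source, is the number of entries in which the measured matrix and the reachable signals matrix disagree. The sketch below applies that assumed definition to the two toy-model matrices given earlier (the data matrix of Table 1 and the reachable signals matrix); the function name `mismatch_count` is hypothetical, and the result says nothing about the values 59, 29 and 19 reported for the medium scale network.

```python
# Toy-model matrices copied from the tables above (rows: EGF, TNFA).
MEASURED = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 1, 0, 0, 0, 0],
]
REACHABLE = [
    [1, 1, 1, 1, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
]


def mismatch_count(measured, reachable):
    """Count entries where the two binary matrices differ.

    Assumed stand-in for the fitting error; the source may define it
    differently (for example with weights or per-scenario terms).
    """
    if len(measured) != len(reachable):
        raise ValueError("matrices must have the same number of rows")
    total = 0
    for row_m, row_r in zip(measured, reachable):
        if len(row_m) != len(row_r):
            raise ValueError("rows must have the same length")
        total += sum(1 for m, r in zip(row_m, row_r) if m != r)
    return total


toy_error = mismatch_count(MEASURED, REACHABLE)
print(toy_error)
```

Under this assumed definition the EGF row contributes 3 mismatches (NFKB, IKK, MAP3K7) and the TNFA row contributes 4 (IKK, MAP3K1, MAP3K7, PI3K).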