Pablo Meyer, Thomas Cokelaer, Deepak Chandran, Kyung Hyuk Kim, Po-Ru Loh, George Tucker, Mark Lipson, Bonnie Berger, Clemens Kreutz, Andreas Raue, Bernhard Steiert, Jens Timmer, Erhan Bilal, Herbert M Sauro, Gustavo Stolovitzky, Julio Saez-Rodriguez.
Abstract
BACKGROUND: Accurate estimation of parameters of biochemical models is required to characterize the dynamics of molecular processes. This problem is intimately linked to identifying the most informative experiments for accomplishing such tasks. While significant progress has been made, effective experimental strategies for parameter identification and for distinguishing among alternative network topologies remain unclear. We approached these questions in an unbiased manner using a unique community-based approach in the context of the DREAM initiative (Dialogue for Reverse Engineering Assessment of Methods). We created an in silico test framework under which participants could probe a network with hidden parameters by requesting a range of experimental assays; results of these experiments were simulated according to a model of network dynamics only partially revealed to participants.
Year: 2014 PMID: 24507381 PMCID: PMC3927870 DOI: 10.1186/1752-0509-8-13
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 1. Model and gene regulatory network of the parameter estimation challenge. A. Example of regulation of the transcription of coding sequence g4 by proteins p1 and p4, respectively activator and repressor, through the activator (as4, green box) and repressor (rs2, red box) sites. The rate of production of g4 is given by the transcription dependent on the promoter pro4. The rate of production of p4 is given by the translation dependent on the ribosomal binding site rbs4. B. Gene network from model 1 of the Parameter Prediction challenge, consisting of 9 genes whose 45 parameters and the prediction of response to perturbations were requested from challenge participants.
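The regulation described in Figure 1A can be illustrated with a minimal sketch. The rate law itself is not given in this record, so the Hill-type form below, the multiplicative combination of activator and repressor, and the dictionary keys (`pro_strength`, `rbs_strength`, `act_kd`, `act_h`, `rep_kd`, `rep_h`) are all assumptions chosen to mirror the parameter types listed in the tables (promoter strength, rbs strength, Kd, Hill coefficient), not the challenge's actual equations.

```python
def hill_activation(p, kd, h):
    """Fraction of maximal activation at regulator concentration p (assumed Hill form)."""
    x = (p / kd) ** h
    return x / (1.0 + x)


def hill_repression(p, kd, h):
    """Fraction of activity remaining at repressor concentration p (assumed Hill form)."""
    x = (p / kd) ** h
    return 1.0 / (1.0 + x)


def production_rates(mrna, activator, repressor, params):
    """Return (transcription rate, translation rate) for one gene.

    Assumption: activator and repressor effects multiply; the real model
    may combine them differently.
    """
    transcription = (
        params["pro_strength"]
        * hill_activation(activator, params["act_kd"], params["act_h"])
        * hill_repression(repressor, params["rep_kd"], params["rep_h"])
    )
    translation = params["rbs_strength"] * mrna
    return transcription, translation


# Hypothetical parameter values, for illustration only.
example_params = {
    "pro_strength": 4.0, "rbs_strength": 3.0,
    "act_kd": 2.0, "act_h": 3.0,
    "rep_kd": 2.0, "rep_h": 3.0,
}
example_rates = production_rates(2.0, 2.0, 2.0, example_params)
```

With both regulators at their Kd, each Hill term equals one half, so the sketch makes it easy to see how Kd sets the midpoint and the Hill coefficient sets the steepness of the response.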
Figure 2. Scores and correlation between parameter and protein prediction distances for model 1. A. Graph representing the dynamics of the mRNAs from the 9 genes for the model 1 network. Dots are the data with noise, lines represent the data without noise and shades the associated noise model. B. Overall scores from the participants calculated from the p-values as indicated by the formula. P-values were obtained from the two different metrics used for challenge scoring described in Additional file 3: Figure S1. C. The participant distances defined for scoring the submitted predictions for the parameters and the protein perturbation predictions are plotted respectively on the y-axis (Dparam) and x-axis (Dprot). Each team is represented by its rank number in the final scoring except for the best performer Orangeballs. The R² coefficient for a linear fit in log-scale is 0.23; the red line is a visual reference for a perfect fit. D. For each of the 45 parameters in the model, the vector of parameter values submitted by the 12 participants is correlated (R²) to the unique vector of Dprot values, the protein perturbation prediction distance values. The graph shows the parameters ordered by increasing correlation value, from left to right: pro5_strength, v10_Kd, pro3_strength, v9_Kd, v4_h, v8_Kd, v8_h, v1_Kd, v11_h, v1_h, pro7_strength, v4_Kd, v12_Kd, pro8_strength, rbs9_strength, v10_h, pro2_strength, v9_h, pro1_strength, v12_h, v5_h, pro4_strength, v3_h, v7_h, rbs7_strength, v3_Kd, rbs2_strength, pro9_strength, v6_h, rbs1_strength, v7_Kd, pro6_strength, v6_Kd, v11_Kd, v2_Kd, v5_Kd, v13_h, p_degradation_rate, v2_h, rbs3_strength, rbs6_strength, rbs5_strength, rbs8_strength, rbs4_strength, v13_Kd.
Model parameters summary
| Parameter type | Model 1 | Model 2 |
| Promoter strength | 9 | |
| rbs strength | 9 | |
| Protein synthesis | | 16 |
| Basals | | 2 |
| Degradation rate | 1 | 11 |
| Kd | 13 | 16 |
| Hill coefficient | 13 | 16 |
| Total | 45 | 61 |
Parameters involved in the parameter estimation challenge and the network topology challenge. The nature of each parameter is indicated in the first column, and the number of parameters in Model 1 for the parameter estimation challenge and in Model 2 for the network topology challenge are listed in the second and third columns, respectively.
Scores and features of parameter inference challenge
| Team | Parameter distance | p-value | Protein distance | p-value | Score | Strategy feature | Strategy feature | Strategy feature | Strategy feature |
| Orangeballs | 0.0229 | 3.25E-03 | 0.002438361 | 1.21E-25 | 27.4 | no | yes | Game Tree | Sequential local search |
| 2 | 0.8404 | 1.00E+00 | 0.016023721 | 3.39E-18 | 17.5 | no | no | Manual based on parameter uncertainty | Global method |
| 3 | 0.1592 | 6.00E-01 | 0.035404398 | 4.45E-15 | 14.6 | yes | no | Manual | LH |
| 4 | 0.0899 | 1.88E-01 | 0.047495432 | 6.28E-14 | 13.9 | no | yes | Manual | LM + Particle Swarm |
| 5 | 0.1683 | 6.45E-01 | 0.09791128 | 4.01E-11 | 10.6 | yes | no | Train + Sim | UKF |
| 6 | 0.0453 | 1.37E-02 | 0.198785197 | 1.93E-08 | 9.6 | no | no | A-Criterion | Local (LM) |
| 7 | 0.1702 | 6.45E-01 | 0.362463945 | 2.90E-06 | 5.7 | no | yes | Sensitivity analysis | Hybrid (Local + Global) |
| 8 | 0.8128 | 1.00E+00 | 0.356429217 | 2.53E-06 | 5.6 | yes | no | Estimation of improved uncertainty | Global (MH) |
| 9 | 0.3766 | 9.99E-01 | 0.817972877 | 1.34E-03 | 2.9 | yes | yes | MI | ABC-SMC |
| 10 | 0.0699 | 9.83E-02 | 19.32326868 | 1.00E+00 | 1.0 | no | yes | Minimize variance based on FI | Multistart local search |
| 11 | 0.1883 | 7.29E-01 | 3.222767988 | 6.90E-01 | 0.3 | no | no | Train + Sim | LH + DE |
| 12 | 5.0278 | 1.00E+00 | 14.77443631 | 1.00E+00 | 0.0 | no | no | Manual | Local method |
The table for Model 1 of the parameter inference challenge lists anonymized teams (except for the best performer) ordered by Score rank. Next to each team are listed its parameter distance and associated p-value, its protein distance and associated p-value, and the score. The last four columns indicate the features of the fitting strategies used by the participants. Abbreviations used for the features: ABC-SMC, Approximate Bayesian Computation with Sequential Monte Carlo; DE, Differential Evolution; FI, Fisher Information; LH, Latin Hypercube; LM, Levenberg-Marquardt; MH, Metropolis-Hastings; MI, Maximize Mutual Information between parameters and output of experiments; Train + Sim, iterative steps of training on data and simulation to find most informative experiments; Rank, rank experiments in the top 10% of the A-Criterion (trace of the covariance matrix) according to price; UKF, Unscented Kalman Filtering.
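The scoring formula is not reproduced in this record, but the tabulated values are consistent with the score being the negative base-10 logarithm of the product of the two p-values. The sketch below checks that reading against four rows of the table; the function name `combined_score` is illustrative, and the formula should be treated as inferred from the numbers rather than quoted from the Methods.

```python
import math


def combined_score(p_param, p_prot):
    """Assumed score: -log10 of the product of the two p-values."""
    return -(math.log10(p_param) + math.log10(p_prot))


# (parameter p-value, protein p-value, score listed in the table)
rows = {
    "Orangeballs": (3.25e-03, 1.21e-25, 27.4),
    "2": (1.00e+00, 3.39e-18, 17.5),
    "6": (1.37e-02, 1.93e-08, 9.6),
    "11": (7.29e-01, 6.90e-01, 0.3),
}

for team, (p_param, p_prot, listed) in rows.items():
    computed = round(combined_score(p_param, p_prot), 1)
    print(f"team {team}: computed {computed}, listed {listed}")
```

Under this reading, a team whose parameter p-value is 1 (teams 2, 8 and 12) earns its whole score from the protein prediction, which is why team 2 ranks second despite a large parameter distance.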
Scores and features of network topology challenge
| Team | Network score s | p-value | Score | Link addition strategy |
| crux | 12 | 1.49E-02 | 1.83 | Manual |
| 2 | 9 | 5.60E-02 | 1.25 | Manual |
| 3 | 8 | 1.07E-01 | 0.97 | Manual first + algorithm |
| 4 | 8 | 1.07E-01 | 0.97 | Manual ('logic reasoning') |
| 5 | 8 | 1.07E-01 | 0.97 | Manual |
| 6 | 7 | 2.10E-01 | 0.68 | Algorithm (Grenits) |
| 7 | 6 | 3.83E-01 | 0.42 | Manual |
| 8 | 5 | 6.01E-01 | 0.22 | Manual |
| 9 | 4 | 8.01E-01 | 0.10 | Did not participate |
| 10 | 4 | 8.01E-01 | 0.10 | Did not participate |
| 11 | 3 | 9.86E-01 | 0.01 | Manual |
| 12 | 2 | 1.00E+00 | 0 | Algorithm GP-DREAM |
Table for Model 2 of the Network topology Challenge contains anonymized teams (except for best performer) ordered by Score rank. Next to each team is listed their network score s, associated p-value and the final score Score. The last column indicates the features of the link addition strategies used by the participants.
Figure 3. Scores of aggregated participant results. A. Protein concentrations of participants’ predictions (in blue) and the solution (green) are plotted against time for proteins p3, p5 and p8 under the perturbed conditions considered for scoring. B. Participant submissions are aggregated by averaging each protein concentration for individual time points, starting from the 2 best performing teams until all 12 teams are included. Each aggregated result is plotted in blue and the solution is plotted in green. C. Log-scale distance to the solution of parameter predictions is plotted for participant teams ordered by rank (blue line) and geometric means of parameter predictions from teams ordered by number of aggregated teams following parameter distance rank (green line) or inverse rank order (red line). D. Log-scale distance to the solution of proteins p3, p5 and p8 under perturbed conditions is plotted for participant teams ordered by rank (blue line) and aggregated teams. Aggregations were computed for the predictions of the teams, ordered by number of aggregated teams ranging from 1 to 12, following prediction distance rank (green line) or inverse order (red line).
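The two aggregation schemes named in Figure 3 (point-by-point averaging of time courses, and geometric means of parameter predictions) can be sketched as follows. The function names and the toy numbers are hypothetical; only the averaging rules come from the caption.

```python
from statistics import fmean, geometric_mean


def aggregate_time_courses(courses):
    """Average several teams' predicted time courses at each time point."""
    return [fmean(values) for values in zip(*courses)]


def aggregate_parameters(estimates):
    """Geometric mean of each parameter across teams (parameters must be positive)."""
    return [geometric_mean(values) for values in zip(*estimates)]


def cumulative_aggregates(courses_by_rank):
    """Aggregate the top 2 teams, then the top 3, and so on up to all teams."""
    return [
        aggregate_time_courses(courses_by_rank[:k])
        for k in range(2, len(courses_by_rank) + 1)
    ]


# Hypothetical predictions for one protein at three time points, best team first.
toy_courses = [
    [1.0, 2.0, 3.0],
    [3.0, 4.0, 5.0],
    [2.0, 6.0, 1.0],
]
toy_aggregates = cumulative_aggregates(toy_courses)

# Hypothetical estimates of two parameters from two teams.
toy_parameters = aggregate_parameters([[1.0, 0.1], [100.0, 10.0]])
```

The geometric mean suits parameters that span orders of magnitude, since it averages them on a log scale, which matches the log-scale distances plotted in panel C.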
Figure 4. Dynamics and scores of the network topology challenge. A. Time courses of the 11 proteins in the model 2 network. Dots are the data with noise, lines represent the data without noise and shades the associated noise model. B. Ordered scores from the participants as well as the score of the consensus solution, defined as the 3 most submitted links. Scores were calculated from the p-values as indicated in Methods, Additional file 2: Figure S2 and Additional file 1. C. The 3 links r9, r10, r12 composing the solution to the Network Inference challenge are shown in their numeric (top left) and diagram (bottom left) notations. The list of submitted participant links is shown (right) in its numeric notation, together with the number of times each link was submitted. The links colored in blue indicate the consensus network composed of the 3 most submitted links, whose score is indicated in (B). D. Diagrams of the consensus network of links (blue) and the solution (black). A dashed arrow indicates an indirect regulation.
Figure 5. Analysis of experimental credit usage in challenges. A. Histogram indicating the number of times credits were spent on an experiment for the parameter estimation challenge. The nature of the experiments is indicated on the horizontal axis. B. Histogram indicating the number of times credits were spent on an experiment for the network topology challenge. C. Diagram indicating the sequence of experiments performed in the parameter estimation challenge. Each box represents a different experiment and the arrows indicate the sequence followed. Dark arrows represent the most used paths with numbers indicating usage, and grey arrows indicate a single usage. The path of the winning team is shown with red arrows and the order of the experiments is indicated via roman numerals. D. Diagram indicating the sequence of experiments performed in the network topology challenge. Each box represents a different experiment and the arrows indicate the sequence followed as in (C).
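The quantities shown in Figure 5 (how often each experiment was purchased, and how often one experiment followed another) can be tallied from per-team purchase sequences. The sketch below assumes each team's history is an ordered list of experiment names; the names used here are hypothetical placeholders, not the challenge's actual experiment catalogue.

```python
from collections import Counter


def usage_counts(paths):
    """Number of times each experiment was purchased, over all teams."""
    return Counter(experiment for path in paths for experiment in path)


def transition_counts(paths):
    """Number of times experiment a was immediately followed by experiment b."""
    counts = Counter()
    for path in paths:
        counts.update(zip(path, path[1:]))
    return counts


# Hypothetical purchase sequences for three teams.
toy_paths = [
    ["time_course", "knockout", "time_course"],
    ["time_course", "knockout", "knockdown"],
    ["knockdown", "time_course", "knockout"],
]
toy_usage = usage_counts(toy_paths)
toy_transitions = transition_counts(toy_paths)
```

Transitions with a count of 1 correspond to the grey single-usage arrows in panels C and D, and higher counts to the dark arrows labeled with their usage.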