Mahdi Imani1, Roozbeh Dehghannasiri2, Ulisses M Braga-Neto1,3, Edward R Dougherty1,3. 1. Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX, USA. 2. School of Medicine, Stanford University, Stanford, CA, USA. 3. Center for Bioinformatics and Genomic Systems Engineering, Texas A&M Engineering Experiment Station (TEES), College Station, TX, USA.
Abstract
Scientists are attempting to use models of ever-increasing complexity, especially in medicine, where gene-based diseases such as cancer require better modeling of cell regulation. Complex models suffer from uncertainty and experiments are needed to reduce this uncertainty. Because experiments can be costly and time-consuming, it is desirable to determine experiments providing the most useful information. If a sequence of experiments is to be performed, experimental design is needed to determine the order. A classical approach is to maximally reduce the overall uncertainty in the model, meaning maximal entropy reduction. A recently proposed method takes into account both model uncertainty and the translational objective, for instance, optimal structural intervention in gene regulatory networks, where the aim is to alter the regulatory logic to maximally reduce the long-run likelihood of being in a cancerous state. The mean objective cost of uncertainty (MOCU) quantifies uncertainty based on the degree to which model uncertainty affects the objective. Experimental design involves choosing the experiment that yields the greatest reduction in MOCU. This article introduces finite-horizon dynamic programming for MOCU-based sequential experimental design and compares it with the greedy approach, which selects one experiment at a time without consideration of the full horizon of experiments. A salient aspect of the article is that it demonstrates the advantage of MOCU-based design over the widely used entropy-based design for both greedy and dynamic programming strategies and investigates the effect of model conditions on the comparative performances.
Keywords:
dynamic programming; entropy; experimental design; gene regulatory network; greedy search; mean objective cost of uncertainty; structural intervention
A basic problem in genomic signal processing is to derive intervention strategies for
gene regulatory networks (GRNs) to avoid undesirable states, in particular,
cancerous phenotypes. The problem goes back to the early days of genomics when 2
paradigms were introduced to force dynamical GRNs away from carcinogenic states,
dynamical intervention[1-3] and structural
intervention.[4] Substantial work has been done since then (see the work by Dougherty et al[5] for reviews). The goal of dynamical intervention is to find a proper finite
or infinite-horizon control strategy for altering the regulatory output of one or
more genes at each time point. The goal of structural intervention, which is the
focus of this article, is to find a one-time change in regulatory function for
beneficially changing the steady-state distribution (SSD) of a GRN. A solution for
optimal structural intervention via representation of logical alterations of the
regulatory functions is discussed in Xiao et al[6] in the context of probabilistic Boolean networks (PBNs).[7] A solution that applies Markov chain perturbation theory to Markovian
regulatory networks to find a structural intervention that optimally reduces the
steady-state mass of undesirable states is presented in the work by Qian et al.[8]

The basic theory of structural intervention provides optimal intervention under the
assumption that the regulatory model is known; however, in practice, this is
generally not the case. For instance, in a Boolean network (BN), or more generally a
PBN, it is commonly the case that certain regulatory relations are unknown or at
least not known with certainty. In this case, one needs to reformulate the
optimization problem to take into account the uncertainty. While we are focusing
here on GRNs, this is a problem recognized as far back as the 1960s in control
theory.[9-11] More recently, it has been
treated in signal processing, first in a minimax framework[12-14] and then in a Bayesian
framework,[15,16] and it has also been studied in pattern recognition.[17,18] In the case of
GRNs, the problem has been addressed in the work by Yoon et al[19] using the mean objective cost of uncertainty (MOCU), which quantifies the
uncertainty based on its effect on the objective, in this instance, the degree to
which the uncertainty reduces the phenotypical effect of the intervention.

In all cases, one would like to reduce the uncertainty in the model system to achieve
better optimal performance. Owing to cost and time, it is prudent to prioritize
potential experiments based on the information they provide and then conduct the
most informative. This process is called experimental design.
Various methods employ entropy,[20-22] the MOCU,[23-25] and the knowledge gradient.[26] This article provides a comparison of entropy-based and MOCU-based
methods.

Because uncertainty can be quantified via entropy, a historical approach to
experimental design has been to choose an experiment that maximally reduces
entropy.[20,21] Assuming that the true model lies in an uncertainty class of
models governed by a probability distribution, from a class of potential
experiments, the aim is to choose the one that provides model information resulting
in the greatest reduction of entropy relative to the distribution. In the case of a
GRN, uncertainty might relate to lack of knowledge concerning regulatory relations
and the uncertainty class would consist of a collection of GRNs with differing
regulatory relations among some of the genes. Potential experiments would
characterize unknown regulatory relations, thereby reducing the uncertainty class
and lowering entropy.

Entropy does not take into account the objective for building a model. An experiment
might reduce entropy but have little or no effect on knowledge necessary for
accomplishing the desired objective. In the case of GRNs, structural intervention
involves altering gene regulation so as to reduce the steady-state probability of
undesirable states, such as cell proliferative (carcinogenic) states. Given a model,
one finds an optimal structural intervention.[8] When there is model uncertainty, we desire to reduce uncertainty relevant to
determining an optimal structural intervention. The MOCU,[19] which provides a quantification of uncertainty based on the degree to which
model uncertainty affects the translational objective, is used for experimental
design: choose the experiment that yields the greatest reduction in MOCU.[23]

Typically, one might perform a sequence of experiments to progressively reduce the
uncertainty. What should be the order of the experiments to get the best reduction
in uncertainty? Using MOCU, this problem has been addressed in the context of GRNs
in a greedy sequential fashion[19]: at each step, the optimal experiment is chosen from among the ones not yet
performed. In this article, we demonstrate the strength of MOCU-based experimental
design by considering optimal sequential experimental design in the context of
structural intervention in BNs with perturbation (BNps). In this framework, there are
unknown regulatory relations and the aim is to sequentially choose
optimal experiments to reduce uncertainty. Notice that “sequential” refers to the
step-wise experiments that are taken over the uncertain regulations, which then lead
to a better but still one-time structural intervention. We compare design via MOCU
and entropy for both greedy and dynamic programming–based sequential design, which
has not been previously used with MOCU.
GRNs and Interventions
BNs: a brief overview
Several BN models have been developed in recent years for studying the dynamics
of GRNs,[1,27,28] for
instance, the cell cycle in the Drosophila fruit fly,[29] in the Saccharomyces cerevisiae yeast,[30] and the mammalian cell cycle.[31]

A binary BN on $n$ genes is represented by a set of gene expressions, $\{x_1, \ldots, x_n\}$,
and a set of Boolean functions, $\{f_1, \ldots, f_n\}$, that gives the functional relationships between the genes over
time. The state of each gene is represented by 0 (OFF) or 1 (ON), where
$x_i = 1$ and $x_i = 0$ correspond to the activation and inactivation of gene
$i$, respectively. The states of the genes at time step
$k$ are denoted by a vector $X_k = (X_k(1), \ldots, X_k(n))$. The value of the $i$th gene at time step $k$ is affected by the values of the predictor genes at time step $k-1$ via $X_k(i) = f_i(X_{k-1})$, for $i = 1, \ldots, n$. In a BNp, the state value of each gene at each time point is
assumed to be flipped with a small probability $p$. This produces a dynamical model $X_k = f(X_{k-1}) \oplus n_k$, where $f = (f_1, \ldots, f_n)$, ⊕ is component-wise modulo-2 addition, and $n_k = (n_k(1), \ldots, n_k(n))$ with $n_k(i) \sim \mathrm{Bernoulli}(p)$, for $i = 1, \ldots, n$. Letting $\{x^1, \ldots, x^{2^n}\}$ be the set of all possible Boolean states, the transition
probability matrix (TPM) $P$ of the Markov chain defined by the state model is given by

$$P_{ij} = \Pr(X_k = x^j \mid X_{k-1} = x^i) = p^{\|x^j \oplus f(x^i)\|_1} (1-p)^{\,n - \|x^j \oplus f(x^i)\|_1} \qquad (1)$$

for $i, j = 1, \ldots, 2^n$, where $P_{ij}$ refers to the element in the $i$th row and $j$th column of the TPM $P$, and $\|\cdot\|_1$ is the $L_1$ norm. For a nonzero perturbation probability ($p > 0$), the corresponding Markov chain of a BNp possesses an SSD
$\pi$ describing the long-run behavior of the system. The SSD can
be computed based on the TPM of the Markov chain as $\pi^T = \pi^T P$, where $\pi^T$ denotes the transpose of $\pi$ and the element $\pi_i$ denotes the steady-state probability of being at state
$x^i$.
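The construction of the TPM and the SSD can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: it uses a row-stochastic convention (row = current state, column = next state), independent bit flips with probability p applied to the output of the Boolean function, and a made-up 3-gene network. The names `bnp_tpm` and `steady_state`, the example function `f`, and the value p = 0.05 are hypothetical and do not come from the article.

```python
import itertools
import numpy as np

def bnp_tpm(f, n, p):
    """Row-stochastic TPM of a BNp: P[i, j] = Pr(next = x^j | current = x^i)."""
    states = np.array(list(itertools.product([0, 1], repeat=n)))
    P = np.zeros((2 ** n, 2 ** n))
    for i, x in enumerate(states):
        fx = np.array(f(x))
        # Hamming distance between every candidate next state and f(x)
        d = np.abs(states - fx).sum(axis=1)
        P[i] = p ** d * (1 - p) ** (n - d)
    return states, P

def steady_state(P):
    """Solve pi^T = pi^T P together with sum(pi) = 1 by least squares."""
    m = P.shape[0]
    A = np.vstack([P.T - np.eye(m), np.ones((1, m))])
    rhs = np.zeros(m + 1)
    rhs[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return pi

# Hypothetical 3-gene network used only for illustration
def f(x):
    return (x[1] & x[2], x[0] | x[2], 1 - x[0])

states, P = bnp_tpm(f, n=3, p=0.05)
pi = steady_state(P)
print(np.round(pi, 4))
```

Because every entry of the TPM is strictly positive when p > 0, the chain is ergodic and the linear system has a unique solution.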
Structural intervention of GRNs
We now briefly review the solution that applies Markov chain perturbation theory
to the TPM to find a structural intervention that optimally reduces the
steady-state mass of undesirable states.[8] Given a known BNp, under the rank 1 function perturbation, the TPM
will be altered to ,[8] where and are arbitrary vectors and ( is the all unity column vector). A special case of a rank 1
perturbation, called a single-gene perturbation process, is considered in this
article. According to this process, the output state of a single-input state
changes and the output states of other states stay unchanged. Let
be a class of potential interventions to the network. Let
be the Boolean function after intervention. A single-gene
perturbation for the input state changes the output value of the Boolean function:
, and , for and i ≠ j. We shall refer to
this as a intervention. The TPM after perturbation, is the same as , except for and . The SSD of the system after perturbation can be computed as follows[8]where is the steady-state probability of the state of the perturbed system following a intervention, is the fundamental matrix of a BNp, being the identity matrix, and are elements of .If is the set of undesirable Boolean states, then is the steady-state probability mass of undesirable states
after applying a intervention. The optimal single-gene perturbation structural
intervention minimizes
:
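The perturbation-theory computation can be checked on a toy chain. The sketch below is an illustration only: it applies the standard rank-one update of a stationary distribution through the fundamental matrix Z = (I - P + 1 pi^T)^{-1} to the case where a single row of the TPM is replaced, and then searches candidate interventions by brute force. The random 8-state chain, the choice of undesirable states, and the candidate set (redirecting state j to behave like state i) are hypothetical stand-ins, not the article's networks or its exact intervention class.

```python
import numpy as np

def stationary(P):
    """Stationary distribution of a row-stochastic matrix P (pi^T = pi^T P)."""
    m = P.shape[0]
    A = np.vstack([P.T - np.eye(m), np.ones((1, m))])
    rhs = np.zeros(m + 1)
    rhs[-1] = 1.0
    return np.linalg.lstsq(A, rhs, rcond=None)[0]

def fundamental_matrix(P, pi):
    """Z = (I - P + 1 pi^T)^{-1}, with 1 the all-ones column vector."""
    m = P.shape[0]
    return np.linalg.inv(np.eye(m) - P + np.outer(np.ones(m), pi))

def perturbed_ssd(P, pi, Z, j, new_row):
    """SSD after replacing row j of P by new_row (rank-one change e_j b^T)."""
    b = new_row - P[j]      # sums to zero because both rows sum to 1
    beta = b @ Z            # row vector b^T Z
    return pi + (pi[j] / (1.0 - beta[j])) * beta

rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(8), size=8)   # hypothetical ergodic 8-state chain
pi = stationary(P)
Z = fundamental_matrix(P, pi)
undesirable = [4, 5, 6, 7]              # hypothetical set of undesirable states

def cost(j, i):
    """Undesirable steady-state mass if row j is replaced by row i."""
    return perturbed_ssd(P, pi, Z, j, P[i])[undesirable].sum()

candidates = [(j, i) for j in range(8) for i in range(8) if i != j]
best = min(candidates, key=lambda ji: cost(*ji))
print(best, cost(*best), pi[undesirable].sum())
```

The point of the update formula is that the fundamental matrix is computed once for the original network; each candidate intervention then costs only a vector-matrix product instead of a fresh eigenproblem.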
Experimental Design
The complex regulatory machinery of the cell and the lack of sufficient data for
accurate inference create significant uncertainty in GRN models. Consider a GRN
possessing uncertain parameters . In our application, corresponds to a regulatory relation of an uncertain type that can
take on 2 different values: “” for activating regulation and for suppressive regulation. These unknown parameters result in
different BN models for the system that differs in one or more of
these uncertain regulations. Let be the uncertainty class of these network models, where
, for . The prior distribution over BN models can be encoded into a
single column vector:where is a vector containing the true values of the parameters.For a given initial distribution and , the prior probability that the regulation is activating iswhere if and otherwise. The initial belief state is
.As in the work by Dehghannasiri et al,[23] we assume that there exist experiments , where determines the regulation . More general experimental formulations are possible; for
instance, there is a probability that can return the wrong value.[25]Let be the belief state before conducting the experiment. Given that experiment at time step is performed, if the outcome of the experiment shows that
, then the element of the belief vector at time step will get the value and the other elements will get their previous values:
, for . However, if , then the element of the belief vector will be and the rest will be unaltered from time to .Thus, each of the elements of the belief vector can take 3 possible values during
the experimental design process and the belief vector is of the form , where . We view transition in the belief space as a Markov decision
process with states. The controlled transition matrix in the
belief space under experiment is a matrix of size . The element associated with the probability of transition from
state to state under experiment can be written as follows:
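The belief dynamics described above can be sketched in a few lines. The following Python fragment is a minimal illustration, assuming error-free experiments, independent regulations, and a belief vector stored as a tuple whose ith entry is the probability that the ith regulation is activating. The function names and the example belief are hypothetical.

```python
import itertools
import math

def outcomes(belief, i):
    """Possible next beliefs, with probabilities, after the experiment on regulation i."""
    p = belief[i]
    activating = belief[:i] + (1.0,) + belief[i + 1:]
    suppressive = belief[:i] + (0.0,) + belief[i + 1:]
    return [(p, activating), (1.0 - p, suppressive)]

def model_probabilities(belief):
    """Probability of each of the 2^M network models under independence."""
    probs = {}
    for model in itertools.product((1, 0), repeat=len(belief)):
        probs[model] = math.prod(
            b if t == 1 else 1.0 - b for b, t in zip(belief, model)
        )
    return probs

belief = (0.3, 0.6)                      # hypothetical initial belief, M = 2
print(outcomes(belief, 0))
print(model_probabilities(belief))

# Each entry is either its prior value, 0, or 1, giving 3^M belief states
belief_states = list(itertools.product(*[(b, 0.0, 1.0) for b in belief]))
print(len(belief_states))
```

Each row of the controlled transition matrix for a given experiment therefore has at most two nonzero entries, one per experimental outcome.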
Greedy-MOCU
Optimal experimental design using the MOCU, first proposed in the work by
Dehghannasiri et al,[23] is briefly described in this section. Let be the cost of applying the intervention to the network . Using equation (3), for any
, the optimal single-gene perturbation structural intervention
for a BNp defined by a given uncertainty vector is , where for any .The MOCU relative to an uncertainty class represented by the belief vector
and a class of interventions is defined as follows:where is an intrinsically Bayesian robust (IBR)
intervention,and is the vector of posterior probabilities of network models for
a belief vector , which can be computed based on the independency of the
regulations as follows:for . The IBR intervention depends on the belief state , whereas the optimal intervention is designed for a specific network model . The MOCU is the expected cost increase that results from
applying a robust intervention over all networks in instead of the optimal intervention for the unknown true
network.

The goal of sequential greedy MOCU-based experimental design,[23] referred to herein as Greedy-MOCU, is to select an experiment at each
time point that results in the maximal reduction in MOCU in the next time step.
If is the current belief state, then the Greedy-MOCU decision is
given bywhere the second equality follows by expressing the expectation in terms of , for , and then dropping the terms unrelated to minimization. After
making the decision and observing outcomes, one needs to update the belief state
and repeat another experimental design process if necessary.
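A toy example may help to make the greedy rule concrete. The sketch below is not the article's implementation: it replaces the steady-state computation by a hypothetical cost table `cost[model, intervention]`, assumes independent regulations and error-free experiments, and orders the models as produced by `itertools.product((1, 0), ...)`. In this made-up table only the first regulation affects which intervention is best.

```python
import itertools
import numpy as np

def model_probs(belief):
    """Probabilities of the 2^M models, ordered as itertools.product((1, 0), ...)."""
    models = itertools.product((1, 0), repeat=len(belief))
    return np.array([
        np.prod([b if t else 1.0 - b for b, t in zip(belief, m)])
        for m in models
    ])

def mocu(belief, cost):
    """Expected excess cost of the robust intervention over model-specific optima."""
    w = model_probs(belief)
    robust_cost = (w @ cost).min()        # best single intervention on average
    optimal_cost = w @ cost.min(axis=1)   # average of per-model optimal costs
    return robust_cost - optimal_cost

def greedy_mocu(belief, cost):
    """Pick the unresolved regulation whose experiment minimizes expected next-step MOCU."""
    scores = {}
    for i, p in enumerate(belief):
        if p in (0.0, 1.0):
            continue                      # already determined
        up = belief[:i] + (1.0,) + belief[i + 1:]
        down = belief[:i] + (0.0,) + belief[i + 1:]
        scores[i] = p * mocu(up, cost) + (1.0 - p) * mocu(down, cost)
    return min(scores, key=scores.get), scores

# Hypothetical costs: rows are models (1,1), (1,0), (0,1), (0,0); columns are interventions
cost = np.array([[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]])
choice, scores = greedy_mocu((0.5, 0.5), cost)
print(choice, scores)
```

With a uniform belief, both regulations carry the same entropy, yet resolving the first one removes all objective-relevant uncertainty while resolving the second removes none, which is the distinction MOCU is designed to capture.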
Dynamic programming MOCU
Greedy-MOCU uses the expected value of MOCU in the next time step for decision
making at the current time. If the number of experiments, , is known a priori, then all future experiments (the remaining
horizon) can be taken into account during the decision-making process. In this
section, we introduce optimal finite-horizon experimental design based on
dynamic programming (DP),[32] which we call DP-MOCU.

Let be a policy at time step that maps a belief vector into an experiment in . We define a bounded immediate cost function
at time step corresponding to transition from the belief vector
into the belief vector under policy µ as follows:for , where . The terminal cost function is defined as
, for any .Letting be the space of all possible policies, using the definitions
of the immediate and terminal cost functions, an optimal policy, , is given by solving the minimization problem:where the expectation is taken over stochasticities in belief transition.Dynamic programming provides a solution for the minimization in equation
(9). The method starts by setting the terminal cost function as
for . Then, in a recursively backward fashion, the optimal cost
function can be computed as follows:with an optimal policy, µkMOCU (b), given byfor and , where is defined in equation (4). Unlike
Greedy-MOCU, the DP-MOCU policy decides which uncertain regulation should be
determined at each step to maximally reduce the uncertainty relative to the
objective after conducting all experiments.
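The backward recursion can be sketched on a small example. The code below is an illustration under assumptions, not the article's implementation: it uses a hypothetical random cost table in place of steady-state computations, assumes error-free experiments and independent regulations, and takes the quantity to be minimized as the expected MOCU remaining after the last experiment. If the immediate cost is the change in MOCU between consecutive belief states and the terminal cost is zero, the summed costs telescope to this terminal MOCU minus a constant, so the two formulations select the same experiments.

```python
import itertools
from functools import lru_cache
import numpy as np

M = 3                                   # hypothetical number of uncertain regulations
rng = np.random.default_rng(1)
COST = rng.random((2 ** M, 4))          # hypothetical cost[model, intervention]

def model_probs(belief):
    models = itertools.product((1, 0), repeat=len(belief))
    return np.array([
        np.prod([b if t else 1.0 - b for b, t in zip(belief, m)])
        for m in models
    ])

def mocu(belief):
    w = model_probs(belief)
    return (w @ COST).min() - w @ COST.min(axis=1)

def outcomes(belief, i):
    p = belief[i]
    return [(p, belief[:i] + (1.0,) + belief[i + 1:]),
            (1.0 - p, belief[:i] + (0.0,) + belief[i + 1:])]

@lru_cache(maxsize=None)
def dp_mocu(belief, steps_left):
    """Return (minimal expected final MOCU, best experiment) for the remaining horizon."""
    pending = [i for i, p in enumerate(belief) if 0.0 < p < 1.0]
    if steps_left == 0 or not pending:
        return mocu(belief), None
    return min(
        (sum(q * dp_mocu(nb, steps_left - 1)[0] for q, nb in outcomes(belief, i)), i)
        for i in pending
    )

b0 = (0.5, 0.7, 0.2)                    # hypothetical initial belief
for horizon in (1, 2, 3):
    print(horizon, dp_mocu(b0, horizon))
```

Memoizing on the pair (belief, steps left) is feasible here because the belief space is finite; with a horizon of one step the recursion reduces to the greedy rule.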
Greedy entropy
The idea of entropy-based experimental design[20,21] is to reduce the amount of
the entropy, which quantifies the uncertainty of the system. While MOCU-based
techniques take action to reduce the uncertainty with respect to an objective,
entropy-based techniques do not take into account the objective during decision
making. Performance comparisons are made in the “Results” section.

The entropy for belief vector is , where is the posterior probability of network models under the
belief state defined in equation (7). The maximum value
of the entropy is , which corresponds to a uniform prior over the network models,
and the minimum value is 0, which corresponds to certainty.

The Greedy-Entropy approach sequentially chooses an experiment to minimize the
expected entropy at the next time step:where the second equality is obtained by removing constant terms.
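A short sketch shows why the entropy criterion ignores the objective. It assumes independent regulations, error-free experiments, and base-2 logarithms (the base is an assumption made here for illustration). Under these assumptions the entropy of the model distribution is the sum of the binary entropies of the belief entries, and resolving one regulation removes exactly its own binary entropy. The function names are hypothetical.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a single regulation with activation probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def entropy(belief):
    """Entropy of the model distribution under independent regulations."""
    return sum(binary_entropy(p) for p in belief)

def greedy_entropy(belief):
    """Unresolved regulation whose experiment minimizes the expected next-step entropy."""
    pending = [i for i, p in enumerate(belief) if 0.0 < p < 1.0]
    # Resolving regulation i leaves the entropy of all other regulations unchanged
    return min(pending, key=lambda i: entropy(belief) - binary_entropy(belief[i]))

print(entropy((0.5, 0.5)))               # uniform belief over 4 models
print(greedy_entropy((0.9, 0.5, 0.2)))   # the most uncertain regulation is chosen
```

When all belief entries are equal, every experiment yields the same expected entropy, so the criterion cannot distinguish among them regardless of their effect on the intervention.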
Dynamic programming entropy
The Greedy-Entropy approach takes into account only the entropy in the next step
for selecting the experiment to be performed at the current step. If the number
of experiments is known a priori, then the DP technique is
used for finding an optimal entropy-based solution. In the work by Huan and Marzouk,[22] an approximate DP solution based on the entropy scheme for cases with a
continuous belief space is provided. Here, we employ the optimal DP solution
because the belief space is finite. Again letting be a policy at time step , we define a bounded immediate cost function at time step
corresponding to transition from the belief vector
to the belief vector under policy µk byfor . Define the terminal cost function by , for any . Using the immediate and terminal cost functions
and instead of and , for , in the DP process, the optimal finite-horizon policy,
, for , based on the entropy, is obtained. It is called the
DP-Entropy policy.
Results
Simulation setup
According to the majority vote rule for generating BN models of GRNs, the
Boolean predictor is given byfor , where can take 3 values: if there is an activating regulation () from gene to gene , if there is suppressive regulation () from gene to gene , and if gene is not an input to gene .[33]We employ the symmetric Dirichlet distribution for generating the initial
distribution over various network models:where is the gamma function and is the parameter of the symmetric Dirichlet distribution. The
expected value of the initial distribution for any value of φ is a vector of size $2^M$ with all elements $1/2^M$, where $M$ is the number of uncertain regulations. The parameter φ specifies the variability of the initial distributions; the
smaller φ is, the more the initial distributions deviate from the
uniform distribution.
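The two ingredients of the simulation setup can be sketched as follows. This is an illustration under assumptions: the majority vote rule is taken in its common form, in which a gene turns ON when the signed sum of its active inputs is positive, OFF when it is negative, and keeps its value on a tie; the regulation matrix `R`, with entries +1 (activating), -1 (suppressive), and 0 (no input), is a made-up 3-gene example; and the Dirichlet parameter values in the loop are placeholders rather than the values used in the article.

```python
import numpy as np

def majority_vote_step(R, x):
    """One noise-free update; R[i, j] is the effect of gene j on gene i."""
    s = R @ x
    return np.where(s > 0, 1, np.where(s < 0, 0, x))

# Hypothetical regulation matrix for a 3-gene network
R = np.array([[0, -1, 1],
              [1, 0, 0],
              [-1, 1, 0]])
x = np.array([1, 0, 1])
next_x = majority_vote_step(R, x)
print(next_x)

# Initial distributions over the 2^M models from a symmetric Dirichlet
rng = np.random.default_rng(0)
M = 3
spread = {}
for phi in (0.1, 1.0, 10.0):              # placeholder parameter values
    priors = rng.dirichlet(phi * np.ones(2 ** M), size=500)
    # Mean L1 distance from the uniform distribution
    spread[phi] = np.abs(priors - 1.0 / 2 ** M).sum(axis=1).mean()
print(spread)
```

Smaller parameter values concentrate the sampled priors on a few models, so their average distance from the uniform distribution is larger.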
Performance evaluation based on synthetic BNps
To evaluate performance, simulations based on synthetic BNps have been performed.
A total of 100 random BNps of size with a single set of unknown regulations for each network have been considered. The
perturbation probability is set to . Different values of have been tried and similar results, as presented in the
sequel, have been observed. The states with upregulated first gene are assumed
to be undesirable (). Three different values are considered for the Dirichlet
parameter: . From each , 500 initial distributions are generated.

Five experimental design strategies are considered: (1) Greedy-MOCU, (2) DP-MOCU,
(3) Greedy-Entropy, (4) DP-Entropy, and (5) Random. A successful experimental
design strategy has the ability to effectively reduce the cost of intervention.
Thus, the robust intervention based on the resulting belief state of each
strategy is applied to the true (unknown) network and the cost of intervention
(total steady-state mass in undesirable states) is used as a metric. For a given
belief state computed before taking the experiment, the cost is . represents the amount of remaining uncertainty in the system
for a given belief state . Thus, we define the intervention gain of conducting the
chosen experiment over a random experiment by , and the entropy gain as , where is the belief state after performing the experiment determined via experimental design (Greedy-MOCU,
DP-MOCU, Greedy-Entropy, or DP-Entropy), and is the belief vector before performing the experiment during the random experimental design process.

Figure 1 shows the
average gain of intervention with respect to the horizon length strategies for different numbers of unknown regulations
and Dirichlet parameters . The figure shows curves for and . The curves end at gain value when , so that all regulations have been identified. In practice,
the number of unknown regulations is usually more than the number of experiments
which can be performed. In these cases, large intervention gains have been
attained by the MOCU-based strategies in comparison with entropy-based
techniques, thus demonstrating the effectiveness of MOCU-based strategies in
reducing network uncertainty relevant to the objective.
Figure 1.
The average intervention gain with respect to the total number of
experiments, , for randomly generated synthetic 6-gene Boolean
networks with 2, 3, 5, 7 uncertain regulations . (A) φ = 10, (B)
φ = 10, and (C) φ = 10.
The maximum amount of gain in MOCU-based strategies is achieved for
(maximum uncertainty). In contrast, the intervention gains in
entropy-based strategies are very close to when the initial distribution is closer to uniform
and increase as this distribution deviates from uniform
. Indeed, as φ gets larger (initial distributions get closer to uniform), the
entropy-based scheme does not discriminate between potential experiments and performs
like random selection. Thus, entropy-based strategies perform slightly
better than the random strategy for nonuniform prior distributions, but
their performance remains far worse than that of the MOCU-based techniques. In addition, the
peak in the intervention gain shifts slightly to the left as φ
decreases. This is because, in the presence of a
nonuniform initial distribution, the MOCU-based strategies are capable of
selecting the most effective experiments in the early steps to reduce the
intervention cost.

In Figure 1, DP-MOCU
outperforms Greedy-MOCU in all cases with respect to the cost of intervention
because future experiments are taken into account for decision making in
DP-MOCU, as opposed to Greedy-MOCU, which only considers one-step look-ahead.
When the total number of experiments is 1, Greedy-MOCU and DP-MOCU are equivalent and the same gain can
be seen for both strategies. The highest gain difference is achieved when the
horizon length is less than M, the number of uncertain parameters.

To better appreciate the performance of entropy-based techniques, the gain in
entropy is reported in Figure
2. Once again, the figure shows curves for and . Comparison between Figures 1 and 2 shows that the maximum entropy
reduction by the entropy-based strategies does not necessarily result in the
highest reduction in the cost of intervention, which is the main objective of
performing experimental design. Interestingly, although DP-Entropy is more
successful in reducing the entropy value in comparison with Greedy-Entropy (as
expected), it does not outperform Greedy-Entropy relative to average gain in
intervention.
Figure 2.
The average entropy gain with respect to the total number of experiments,
, for randomly generated synthetic 6-gene Boolean
networks with 2, 3, 5, 7 uncertain regulations . (A) φ = 10, (B)
φ = 10, and (C) φ = 10.
Next, we consider the effect of the initial distribution on the performance of
various experimental design strategies. The horizon length and the number of
unknown regulations are set to be and , respectively. The initial distribution is a vector of size
. The entropy of this initial distribution specifies the amount
of initial uncertainty in the system. The closer this value is to its maximum
value , the closer the initial distribution is to the uniform
distribution. In Figure 3, we observe that as the entropy of the initial
distribution increases, the performance of both Greedy-MOCU and DP-MOCU
increases as well. This growth is higher for DP-MOCU compared with Greedy-MOCU,
which shows the superiority of DP-MOCU in reducing the intervention cost in the
presence of high uncertainty in the system. However, note the reduction trend in
the amount of intervention gain for the entropy-based techniques as the entropy
of the initial distribution increases. This is due to the fact that the
entropy-based strategies are unable to discriminate between potential
experiments in the presence of the uniform initial distribution and perform like
random selection (Figure
3).
Figure 3.
The average intervention gain with respect to the entropy of the initial
distribution, , for randomly generated synthetic 6-gene Boolean
networks with unknown regulations and total number of experiments
equal to .
The average cost of robust intervention with respect to the number of conducted
experiments for different experimental design strategies is shown in Figure 4. DP-MOCU has the
lowest average cost of robust intervention at the end of the horizon (after
taking all experiments); however, Greedy-MOCU has the lowest cost before
reaching the end of the horizon. This observation can be understood by looking
at the finite-horizon DP policy. DP-MOCU finds a sequence of experiments from
time to to minimize the expected sum of the differences of MOCUs
throughout this interval. The expected value of MOCU after conducting the last
experiment plays the key role in the decision making by the DP policy. Thus, the
capability of DP-MOCU in planning for reducing MOCU at the end of the horizon,
as opposed to Greedy-MOCU which takes only the next step into account for
decision making, results in the lowest average cost of robust intervention by
DP-MOCU at the end of horizon. Both DP-MOCU and Greedy-MOCU are equivalent for
horizon length and behave differently for other cases. When the number of
experiments is not known a priori, Greedy-MOCU may be preferred to DP-MOCU
because, as presented in Figure
4, the intervention gain in DP-MOCU might be lower than Greedy-MOCU
before conducting the total number of experiments.
Figure 4.
The average cost of robust intervention with respect to the number of
conducted experiments obtained by various experimental design strategies
for randomly generated synthetic BNps with unknown regulations and . (A) φ = 10, (B)
φ = 10, and (C) φ = 10.
Performance evaluation based on the mammalian cell cycle network
The mammalian cell cycle involves a sequence of events resulting in the
duplication and division of the cell. It occurs in response to growth factors
and under normal conditions; it is a tightly controlled process. A regulatory
model for the mammalian cell cycle, proposed in Fauré et al,[31] is shown in Figure 5. This model contains 10
genes: CycD, Rb, p27, E2F, CycE, CycA, Cdc20, Cdh1, UbcH10, and CycB. The blunt
and normal arrows represent suppressive and activating regulations, respectively. Mammalian cell division is
coordinated with the overall growth of the organism via extracellular signals
that control the activation of CycD in the cell. Cell division happens due to
the positive stimuli activating cyclin D (CycD). When CycD is upregulated, it
inactivates the tumor suppressor Rb protein via phosphorylation. Rb can also be
expressed if gene p27 and either CycE or CycA is active. The activation of Rb in
the absence of stimuli causes cell proliferative (cancerous) phenotypes. States
with downregulated CycD, Rb, and p27 are undesirable, representing cancerous phenotypes. The goal
is to reduce the steady-state probability mass of the set of undesirable states,
, via structural intervention.
Figure 5.
A gene regulatory network model of the mammalian cell cycle. Normal
arrows represent activating regulations and blunt arrows represent
suppressive regulations.
We consider various cases with 2 to 6 unknown regulations. We randomly select different sets of regulations from the network, for which we assume their
regulatory information is not known and apply various experimental design
strategies to predict the experiment to be performed. A total of 500 initial
distributions have been generated from the Dirichlet distribution with parameter
.

The average intervention gains for various experimental design strategies are
presented in Figure 6,
which shows curves for 2 to 6 unknown regulations. DP-MOCU and Greedy-MOCU have the highest average intervention
gain in comparison with the entropy-based strategies. DP-MOCU is clearly
superior to Greedy-MOCU for cases with larger numbers of unknown regulations and
when the number of experiments is smaller than the number of unknown regulations
. Both Greedy-Entropy and DP-Entropy perform poorly in all
cases.
Figure 6.
The average intervention gains of various experimental strategies versus
the random strategy for the mammalian cell cycle network for 2 to 6
unknown regulations and .
Computational complexity analysis
Consider a network with genes, in which the states to are undesirable. Structural intervention requires
searches over state pairs. This gives complexity for the optimal intervention process for a single network.
Given uncertain parameters, which poses different network models, the complexity of Greedy-MOCU is of
order . However, DP-MOCU has an extra step for the DP process. The
complexity of the DP process is of order , where is the horizon length. Thus, the complexity of DP-MOCU is
. In contrast to the MOCU-based strategies, the complexities of
the entropy-based techniques are independent of the intervention process.
Greedy-Entropy and DP-Entropy have complexities and , respectively.

Table 1 shows
approximate processing times for networks of different size with various numbers
of regulations. Simulations have been run on a machine with 16 GB of RAM and
Intel Core i7 CPU, 3.6 GHz. The running time of the MOCU-based strategies grows
exponentially as the number of genes increases. It also increases with increases
in the number of unknown regulations. In nearly all cases, the running time of
DP-MOCU is slightly higher than that of Greedy-MOCU owing to the extra DP
recursion in DP-MOCU.
Table 1.
Comparing the approximate processing times (in seconds) of various
experimental design methods for networks of size n with M uncertain regulations, and .

Method            M      n=10       n=11       n=12
Greedy-MOCU       4       250       2651     32 933
Greedy-MOCU       5       493       5210     65 208
Greedy-MOCU       6       967     10 264    127 397
DP-MOCU           4       272       2696     32 989
DP-MOCU           5       490       5294     65 323
DP-MOCU           6      1002     10 314    127 413
DP-Entropy        4         5         11         21
DP-Entropy        5        15         29         63
DP-Entropy        6        44         86        173
Greedy-Entropy    4         6         13         25
Greedy-Entropy    5        18         36         69
Greedy-Entropy    6        50         99        184

Abbreviations: DP, dynamic programming; MOCU, mean objective cost of
uncertainty.

Clearly, computational complexity is an issue. The issue has been addressed in
the context of structural intervention in the work by Dehghannasiri et al,[23] where computation reduction for MOCU-based design is achieved via network
reduction schemes. These result in suboptimal experimental design, but they are
still superior to random design.
Conclusions
By taking into account the operational objective, MOCU-based experimental design
significantly outperforms entropy-based design. Our aim in this article has been
2-fold: to demonstrate this advantage and to propose and examine the effect of using
finite-horizon DP for sequential design. The simulations show that if one has a
fixed number of experiments in mind, then DP provides improved results because it
takes into account experiments over the full horizon, but for the same reason, it
can be disadvantageous if one is interested in stopping experimentation once MOCU
reduction falls below a given threshold, meaning that further experimentation is not
worth the cost. Although our focus in this article has been intervention in GRNs, it
should be recognized that Greedy-MOCU has been used for optimal experimental design
in other environments such as the development of optimally performing materials[34] and optimal signal filtering.[35] Dynamic programming can be applied for these problems in the same way it has
been done here.