Network approach for decision making under risk: How do we choose among probabilistic options with the same expected value?

Wei Pan, Yi-Shin Chen.

Abstract

Conventional decision theory suggests that under risk, people choose option(s) by maximizing the expected utility. However, theories deal ambiguously with different options that have the same expected utility. A network approach is proposed by introducing 'goal' and 'time' factors to reduce the ambiguity in strategies for calculating the time-dependent probability of reaching a goal. As such, a mathematical foundation that explains the irrational behavior of choosing an option with a lower expected utility is revealed, which could imply that humans possess rationality in foresight.

Year:  2018        PMID: 29702665      PMCID: PMC5922534          DOI: 10.1371/journal.pone.0196060

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Decision making under risk, in which all possible outcomes and the associated probabilities of the options are known, has attracted the interest of researchers in a number of disciplines, including economics, psychology, neuroscience, and business, and has also inspired discourse on human rationality [1-10]. The expected utility theory established by von Neumann and Morgenstern [11] states that, in a multiple-choice problem, a rational decision maximizes the expected utility (EU), the product of the utility (u) and the associated probability (p). Many cases have shown that this criterion provides a good reference in decision making and can explain the behaviors of decision makers [3, 12–14]. However, the utility theory does not explain several behaviors in decision making. Several studies have shown that a certain proportion of people do not always take the option(s) with the maximum EU [15-21]. For instance, in experiments conducted by Kahneman and Tversky, given the choice between options with (u, p) of A(4000, 0.2) and B(3000, 0.25), 65% of the subjects took the former. However, between C(4000, 0.8) and D(3000, 1), 80% of the subjects took the latter. Here, the option (u, p) means that the subjects who take this option have a p chance of obtaining u and a (1 − p) chance of obtaining nothing. A has the higher EU in the first pair (800 versus 750) and C has the higher EU in the second pair (3200 versus 3000), so a consistent EU maximizer would take A and C. In other words, at least 45% of the overlapped subjects were inconsistent in responding to these two questions [22-27]. Kahneman and Tversky explained the psychological aspect of choosing D between C and D; in gain framing, as proposed in their prospect theory, these people were referred to as risk averse [17, 25, 28–30]. The framing effect introduced in prospect theory pointed out that people might rely on different reference points in frames of gains and losses [31-33]. Moreover, people are often risk averse in decisions involving gains and risk seeking in decisions involving losses.
These inconsistent behaviors owing to the shift of reference points in different frames are considered irrational. Consistency is a principal aspect of rational behavior [34-39]. In evolution in the natural world, an act that is broadly disseminated through genes or memes should benefit, or at least not be harmful to, the actors or their offspring. In other words, an apparently detrimental action that is often observed should be beneficial in an obscure way [40]. The behavior of human beings, who are members of the ecological sphere, should also be confined by such a natural rule. Therefore, 'inconsistent' behaviors require further study to find the hidden benefits and mechanisms. Further, the applicability of the expected utility theory in decision making relies on the viability of maximizing the EU, which depends on the law of large numbers. That is, it requires a large number of people making a decision, or one person making a decision many times, to obtain outcomes close to the expected value. However, for an individual who has aspirations other than the average, the results from many people would not be a good reference for making such a decision. The real outcome may deviate considerably from that predicted by the expected value [41-45]. Thus, new criteria other than maximizing the EU are needed. Timing is significant because satisfying one's needs in time is an important issue in decision making [44-52]. With respect to time, people may have different needs in the short term and in the long term. An individual may have different financial needs at different stages of life. Thus, one may need to set specific goals in financial plans according to one's age [53]. Additionally, achieving a goal as soon as possible usually, but not always, fits the decision maker's needs.
For example, in an election campaign, the best strategy for the candidate is to reach the highest rate of support exactly on voting day in order to transform the support rate into votes; peaking too early or too late does not benefit the candidate as effectively. Thus, the reference points may change in different periods [54, 55]. In this study, we provide an alternative perspective by considering goals and times in making decisions. It is assumed that the behaviors in decision making are meant to maximize the probability of achieving a goal during or by a designated time. As human beings have a planning nature, taking the goal and time, a concrete aim for the future, into account makes an individual's activity meaningful [56].

This paper is organized as follows. First, a model that considers both goals and times is proposed; in this model, decisions are considered to be a process moving from the initial stage through several intermediate stages to ultimately reach the goal. These stages and the connecting paths are respectively illustrated as nodes and links in a network [57-59]. The decisions are a series of choices of a link that connects the current node to the next node until arriving at the goal node. This process is equivalent to a walk on the constructed network, which is then further transformed into a matrix [60]. We then introduce strategies that set different weights on the available options. The outcomes of this series of decisions can be calculated by iteration from the continual multiplication of the matrices. Several strategies are proposed to demonstrate how the model works. Finally, we discuss the applications of the model in measuring the goal in mind and the mathematical foundation of economic behaviors.

Methods

In our model, the decision process is divided into three parts: (1) Game set, (2) Strategy, and (3) Iteration. The first part, Game set, depends on the situation in which the decision maker is involved, including the number of options and the associated (u, p). Without loss of generality, we set the utility as the monetary value. According to Game set, one can construct a network, which resembles the diagram in a Markov decision process [61-63]. The second part, Strategy, is new in that a decision maker is free to distribute the weights on the options so as to bracket choices narrowly or broadly [64]. One can construct the nodes and links of the decision network according to these two parts. We can further transform the network into a matrix and proceed to the third part, Iteration, to find the time-dependent probability of achieving the goal, P(t). The decision maker can adjust Strategy to maximize P(t) during the designated t. In this model, the cost is the time spent on the game.

Game set: Construction of the network

Two modes for constructing the network are proposed, Approaching mode and Growing mode, as shown in Figs 1 and 2, respectively.
Fig 1

Construction of the network in a decision game of G of S0 in Approaching mode.

(I) Payoff table: the decision maker who chooses option j has a chance of pj to gain uj or a chance of (1 − pj) to gain nothing. (II) Game set: the game set according to the payoff table in (I) is further illustrated. At Sk, the decision maker has k options, e.g., option 1 leads to S(k−1) with a probability of p1 or stays at Sk with a probability of (1 − p1), option 2 leads to S(k−2) with a probability of p2 or stays at Sk with a probability of (1 − p2), …, and option k leads to S0 with a probability of pk or stays at Sk with a probability of (1 − pk). (III) Weights: the decision maker at Sk can distribute the weights kw1, kw2, …, and kwk (the prefix k marks the current stage) on the options (u1, p1), (u2, p2), …, and (uk, pk), respectively. Thus, the probabilities of moving from Sk to S(k−1), S(k−2), …, and S0 are kw1 p1, kw2 p2, …, and kwk pk, respectively. The decision maker also has a probability of (kw1(1 − p1) + kw2(1 − p2) + … + kwk(1 − pk)) of staying at Sk.

Fig 2

Construction of the network in a decision game of G of S0 in Growing mode.

(I) Payoff table: the decision maker who chooses option j has a chance of pj to gain uj or a chance of (1 − pj) to gain nothing. (II) Game set: the game set according to the payoff table in (I) is further illustrated. At S0, the decision maker has k options, e.g., option 1 leads to S1 with a probability of p1 or stays at S0 with a probability of (1 − p1), option 2 leads to S2 with a probability of p2 or stays at S0 with a probability of (1 − p2), …, and option k leads to Sk with a probability of pk or stays at S0 with a probability of (1 − pk). (III) Weights: the decision maker at S0 can distribute the weights 0w1, 0w2, …, and 0wk (the prefix 0 marks the current stage) on the options (u1, p1), (u2, p2), …, and (uk, pk), respectively. Thus, the probabilities of moving from S0 to S1, S2, …, and Sk are 0w1 p1, 0w2 p2, …, and 0wk pk, respectively. The decision maker also has a probability of (0w1(1 − p1) + 0w2(1 − p2) + … + 0wk(1 − pk)) of staying at S0.

The payoff table of the decision game for the Approaching mode is listed in Fig 1(I). The decision maker who chooses option j has a pj chance of obtaining uj, which moves her j stages, and a (1 − pj) chance of staying at the original stage. In the Approaching mode, as shown in Fig 1(II), the stage occupied by the decision maker is denoted as Sk, indicating that k stages are required to reach the goal, S0. Thus, a series of decisions is considered a walk from Sk to S0 in the network. The available options correspond to the links connecting to the following nodes according to the payoff table. For a decision maker at Sk, option i corresponds to an outward link pointing to S(k−i) and a returning link, with probabilities of pi and (1 − pi), respectively. In our research, the value and the utility of ui are set equal to a move of i stages.

In the Growing mode, as shown in Fig 2(II), the initial stage is denoted as S0. The options taken by the decision maker correspond to the links connecting to the following nodes according to the payoff table shown in Fig 2(I). At S0, option i corresponds to an outward link pointing to Si and a returning link, with probabilities of pi and (1 − pi), respectively. Again, the value and the utility of ui are set equal to a move of i stages.

Strategy matrix, W

The decision maker can distribute weights on the possible links (options). In the Approaching mode, as shown in Fig 1(III), the decision maker at Sk can place a weight kwi (< 1) on option i, where the prefix k marks the current stage. Thus, she has a kwi pi chance of moving to S(k−i) and a kwi(1 − pi) chance of staying at Sk. Generally, at Sk, the decision maker can set the weights kw1, kw2, …, kw(k−1), and kwk on the options 1, 2, …, k − 1, and k, respectively. Thus, the chances of moving from Sk to S(k−1), S(k−2), …, S1, and S0 are kw1 p1, kw2 p2, …, kw(k−1) p(k−1), and kwk pk, respectively. The chance of staying at Sk is (kw1(1 − p1) + kw2(1 − p2) + … + kw(k−1)(1 − p(k−1)) + kwk(1 − pk)), i.e., the sum of kwi(1 − pi) over all options i.

In the Growing mode, as shown in Fig 2(III), the decision maker at S0 can place a weight 0wi (< 1) on option i. Thus, she has a 0wi pi chance of moving from S0 to Si and a 0wi(1 − pi) chance of staying at S0. Generally, at S0, the decision maker can set the weights 0w1, 0w2, …, 0w(k−1), and 0wk on the options 1, 2, …, k − 1, and k, respectively, such that the chances of moving from S0 to S1, S2, …, S(k−1), and Sk are 0w1 p1, 0w2 p2, …, 0w(k−1) p(k−1), and 0wk pk, respectively. The chance of staying at S0 is (0w1(1 − p1) + 0w2(1 − p2) + … + 0w(k−1)(1 − p(k−1)) + 0wk(1 − pk)), i.e., the sum of 0wi(1 − pi) over all options i.

Since the outward links are conservative, the total of the weights at each stage is set to 1, i.e., the sum of jwi over all options i is 1 for any stage j in both modes. A strategy corresponds to how the decision maker distributes the weights on the available links at each stage. For simplicity, we use the Approaching mode in the following sections. The weight that a decision maker at Sk places on the link (um, pm) is denoted as kwm; this link moves her m stages to S(k−m) with a probability of pm or leaves her at Sk with a probability of (1 − pm). A strategy matrix W is set when all of its elements bij, which are equivalent to these weights, are set.

Matrices

The network of a decision game can be transformed into a game matrix Q [60]. Q is a (k + 1) × (k + 1) matrix whose elements qij are the probabilities connecting Si to Sj. Additionally, the distribution of the weights on each option can be represented in a strategy matrix W. In our game setting, Q is determined by the decision game and W depends on the decision maker's strategy; they are mutually independent. Q and W can be further composed into a matrix A. The relation among these matrices is shown in Eq 1, in which "∘" denotes the Hadamard product, aij = qij ⋅ bij for i > j. The diagonal elements of A are collected in D, which contains only diagonal elements, i.e., dii = aii and dij = 0 for i ≠ j. Walking on the network can be further transformed into an iteration of the transition matrix A. The element aij indicates the transition probability from Si to Sj [60]. Thus, A is a (k + 1) × (k + 1) matrix with elements indexed from a00 to akk. The elements aij are iw(i−j) p(i−j) for i > j. The diagonal elements aii, corresponding to the self-links, are the sums of iwm(1 − pm) over the options m available at Si. The left part of A and the associated elements aij are shown in Eqs 2 and 3, respectively. The sum of each row of A is 1, as shown in Eq 4, indicating that A is not only a lower triangular matrix but also a stochastic matrix.
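The construction of A described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `build_transition_matrix`, the list layouts of `p` and `W`, and the choice to make the goal S0 absorbing (a00 = 1) are assumptions made here so that every row sums to 1.

```python
def build_transition_matrix(p, W):
    """Assemble the (k+1) x (k+1) transition matrix A in Approaching mode.

    p[m]    -- success probability of option m, which advances m stages (p[0] unused)
    W[i][m] -- weight placed on option m at stage S_i (W[i][0] unused)
    A[i][j] -- probability of moving from S_i to S_j in one decision
    """
    k = len(p) - 1
    A = [[0.0] * (k + 1) for _ in range(k + 1)]
    A[0][0] = 1.0  # assumption: the goal S_0 is absorbing
    for i in range(1, k + 1):
        for m in range(1, i + 1):
            A[i][i - m] += W[i][m] * p[m]       # success: S_i -> S_(i-m)
            A[i][i] += W[i][m] * (1.0 - p[m])   # failure: self-link at S_i
    return A


# Hypothetical example: k = 3, options (m, 1/m), equal weights at every stage.
k = 3
p = [0.0] + [1.0 / m for m in range(1, k + 1)]
W = [[0.0] * (k + 1) for _ in range(k + 1)]
for i in range(1, k + 1):
    for m in range(1, i + 1):
        W[i][m] = 1.0 / i

A = build_transition_matrix(p, W)
for row in A:
    print([round(x, 4) for x in row])
```

The printed matrix is lower triangular, and each row sums to 1, which matches the two properties stated for A in the text.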

Iteration: State vector and iteration relation

We then introduce a state matrix R(t) = [r0, r1, r2, …, rk] with a dimension of 1 × (k + 1). The element ri(t) indicates the probability that the decision maker is at Si at time t. A decision maker who begins at Sk therefore corresponds to R(0) = [0, 0, 0, …, 1]. The first element, r0(t), is called P(t), the probability that a decision maker arrives at S0 within t. The next state, r0(t + 1), is generated from the products of ri(t) and ai0, as shown in Eq 5. Thus, the state matrix R(t) fits the iteration relation shown in Eq 6. The decision maker eventually arrives at S0, which corresponds to R(∞) = [1, 0, 0, …, 0].

We then examine whether the best strategy to satisfy the decision maker can be solved for. In a decision game beginning at Sk, the number of unknown variables in W is k(k + 1)/2. There are (k + 1) constraints, including the sum of each row of W and the desired P(t). There are still k(k + 1)/2 − (k + 1) undetermined variables. The problem can therefore be solved analytically when k ⩽ 2, where no undetermined variables remain. For k > 2, one can construct a trial W and then obtain the r0(t) resulting from the iteration relation in Eq 6. The decision maker may try several Ws in order to determine strategies that satisfy the needs.
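The iteration can be sketched as a repeated row-vector times matrix product. The following is an illustrative sketch only: the function name `iterate_goal_probability` and the small 3 × 3 matrix `A_example` (a game with k = 2, options (1, 1) and (2, 1/2), and equal weights) are assumptions chosen for the demonstration, and the goal state is assumed to be absorbing.

```python
def iterate_goal_probability(A, t_max):
    """Apply R(t + 1) = R(t) A starting from S_k and track r_0(t).

    Returns (cdf, pmf): cdf[t] is the probability of having reached S_0
    by time t, and pmf[t] is the probability of reaching it exactly at t.
    """
    k = len(A) - 1
    R = [0.0] * k + [1.0]  # R(0) = [0, ..., 0, 1]: the walk starts at S_k
    cdf = [R[0]]
    for _ in range(t_max):
        # Row vector times matrix: new r_j = sum_i r_i * a_ij
        R = [sum(R[i] * A[i][j] for i in range(k + 1)) for j in range(k + 1)]
        cdf.append(R[0])
    pmf = [cdf[0]] + [cdf[t] - cdf[t - 1] for t in range(1, t_max + 1)]
    return cdf, pmf


# Hypothetical game with k = 2 and equal weights on options (1, 1) and (2, 1/2).
A_example = [
    [1.00, 0.00, 0.00],  # S_0: goal, assumed absorbing
    [1.00, 0.00, 0.00],  # S_1: only option (1, 1) is available
    [0.25, 0.50, 0.25],  # S_2: half the weight on each option
]

cdf, pmf = iterate_goal_probability(A_example, t_max=10)
for t in range(4):
    print(t, round(cdf[t], 4), round(pmf[t], 4))
```

Trying several strategy matrices then amounts to rebuilding A for each trial W and comparing the resulting `cdf` curves at the designated t.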

Strategy comparison

At Sk, the decision maker has k options, (u1, p1), (u2, p2), (u3, p3), …, (u(k−1), p(k−1)), and (uk, pk), linking to S(k−1), S(k−2), S(k−3), …, S1, and S0, respectively. To resolve the ambiguity raised in the Abstract, the expected utility of each option is set to 1, i.e., option m is set as (m, 1/m), linking from Sk to S(k−m). This setting also allows different strategies to be compared without any option being privileged. In this paper, we propose the following example strategies: fix, min k, max k, risky, random, and safe. The fix strategy indicates that a decision maker always chooses a certain option until the end of the game, as shown in Fig 3. Assuming that the decision maker takes the option (m, p), only g/m stages are needed to arrive at S0.
Fig 3

The network of the fix strategy choosing (m, p).

For simplicity, G is set as 3m to demonstrate the network. The decision maker only passes the stages with indexes of (g − nm), where m = g/u and n = 0, 1, 2, or 3, i.e., 3 stages from Sg to S0. That is, there are only four stages with this strategy: S3m, S2m, Sm, and S0 (or S3, S2, S1, and S0, respectively).

A decision maker taking the min k strategy always chooses the safest option, i.e., (1, 1). In other words, the decision maker moves one stage each time. In a game with a goal (G) of g, it takes g stages to reach G. This provides a baseline for comparison with other strategies.

With the max k strategy, the decision maker always chooses the riskiest option, i.e., the one with the highest utility, (g, 1/g), as shown in Fig 4. The decision maker either stays at the very beginning, Sg, or moves to S0, with a probability of (1 − 1/g) or 1/g, respectively. Both min k and max k are special cases of the fix strategy.

With the risky strategy, the decision maker places more weight on the high-utility options. In other words, at any stage Si during the game, she chooses the option (k, 1/k) in a probabilistic manner, in proportion to the utility, k. With the safe strategy, the decision maker places more weight on the options with high probability. That is, she chooses the option (k, 1/k) at Si in a probabilistic manner, in proportion to the probability, 1/k. With the random strategy, the decision maker chooses the option in a random manner in all stages. That is, she weights all options equally. The Ws for the risky, random, and safe strategies are listed in Table 1. Note that the real time can be estimated by multiplying t by an approximate period τ needed to make a decision, i.e., tτ.
Fig 4

The network of the max k strategy in a decision game of G of g.

The decision maker always chooses the option with the highest utility, i.e., (g, 1/g), a special case of the fix strategy. Thus, she either stays at Sg or moves to S0. The intermediate stages, S(g−1) to S1, are not reachable.

Table 1

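Strategies in a decision game at Si.

Pr: (k, 1/k) is the probability to choose (k, 1/k); iwk is the weight placed on option k at stage Si.

| Strategy | Pr: (k, 1/k) | iwk | aij, i > j |
| --- | --- | --- | --- |
| risky | ∝ k | 2k/(i(i + 1)) | 2/(i(i + 1)) |
| random | 1/i | 1/i | 1/(i(i − j)) |
| safe | ∝ 1/k | 1/(k Σ(m=1..i) 1/m) | 1/((i − j)^2 Σ(m=1..i) 1/m) |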

The results are calculated by the iteration relation, as shown in Eq 6, to see how r0(t) evolves with the associated W. It can also be performed by computer simulation. Because of the law of large numbers, both methods should produce consistent outcomes. Therefore, we only show the results from the calculations.
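The weights of the risky, random, and safe strategies can be generated by normalizing a raw preference over the options (m, 1/m) available at stage Si. The sketch below is illustrative: the function names `strategy_weights` and `transition_probabilities` are hypothetical, and the closed forms it is compared against are the ones listed for Table 1 as read from the source.

```python
def strategy_weights(strategy, i):
    """Return the weights [w_1, ..., w_i] over options (m, 1/m) at stage S_i."""
    options = range(1, i + 1)
    if strategy == "risky":      # weight in proportion to the utility m
        raw = [float(m) for m in options]
    elif strategy == "random":   # all options weighted equally
        raw = [1.0 for _ in options]
    elif strategy == "safe":     # weight in proportion to the probability 1/m
        raw = [1.0 / m for m in options]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    total = sum(raw)
    return [x / total for x in raw]  # weights at a stage sum to 1


def transition_probabilities(strategy, i):
    """Return a_ij for i - j = 1, ..., i: weight on option m times p_m = 1/m."""
    w = strategy_weights(strategy, i)
    return [w[m - 1] / m for m in range(1, i + 1)]


i = 5
harmonic = sum(1.0 / m for m in range(1, i + 1))  # sum of 1/m for m = 1..i
for name in ("risky", "random", "safe"):
    print(name, [round(a, 4) for a in transition_probabilities(name, i)])
```

For the risky strategy every off-diagonal entry in row i comes out equal to 2/(i(i + 1)), because the weight proportional to m cancels the success probability 1/m.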

Results

The results are presented both in the time-dependent cumulative distribution function (CDF, P(t)) of the probability of achieving the goals and the associated time-dependent probability mass function (PMF, P(t)).

Fixed option strategy

We first show the outcomes of the fix strategy. In a decision game with G of g, the number of nodes is reduced from (g + 1) to (g/m + 1) when the decision maker takes (m, p) only, as shown in Fig 3. The dimension of the associated matrix A is also reduced from (g + 1) × (g + 1) to (g/m + 1) × (g/m + 1). Thus, there are only two outward links for each stage: one linking to the next stage and the other linking back to the original stage. The associated A becomes a band matrix in which the non-zero elements are located only in the main diagonal and the first diagonal below it, as shown in Eq 7. The networks of the fix strategies with u of 4 and 6 in a decision game of G of 24 are shown in Fig 5A and 5B, respectively. For the decision maker that chooses C(4, 0.5), the network becomes a six-stage link from S24 to S0 through S(4n), where n is an integer. For the decision maker that chooses B(6, 0.3), the network becomes a four-stage link from S24 to S0 through S(6n), where n is an integer. From the network perspective, the decision involves comparing a six-stage route that has a higher probability of moving with a four-stage route that has a lower probability of moving.
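Eq 7 is not reproduced in this text. A form consistent with the description of the band matrix, assuming that the reduced stages are ordered from S0 upward, that the chosen option is (m, p), that n = g/m successes are required, and that the goal is absorbing, would be the following; the binomial expression for P(t) follows from needing at least n successes in t independent trials.

```latex
A_{\mathrm{fix}} =
\begin{pmatrix}
1      & 0      & 0      & \cdots & 0      \\
p      & 1-p    & 0      & \cdots & 0      \\
0      & p      & 1-p    & \cdots & 0      \\
\vdots &        & \ddots & \ddots & \vdots \\
0      & \cdots & 0      & p      & 1-p
\end{pmatrix}_{(n+1)\times(n+1)},
\qquad
P(t) = \sum_{s=n}^{t} \binom{t}{s}\, p^{s} (1-p)^{t-s},
\quad n = \frac{g}{m}.
```

For t < n the sum is empty, so P(t) = 0: the goal cannot be reached in fewer than n decisions.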
Fig 5

The network of a decision game of G of 24 with options of A(6, 1/3) (or B(6, 0.3)) and C(4, 0.5).

The network can be decoupled when a decision maker takes the fix strategy. For a decision maker that chooses A (or B), the network becomes a four-stage network linking from S24 to S0 through S18, S12, and S6. For a decision maker that chooses C, the network becomes a six-stage network linking from S24 to S0 through S20, S16, S12, S8, and S4.

The game shown in Fig 5 is taken as an example for comparisons of (1) options with the same expected utility, A(6, 1/3) and C(4, 0.5), and (2) options with different expected utilities, B(6, 0.3) and C(4, 0.5). In Fig 6B, the CDF curves of options A and C interlace at t of 12. This indicates that, on average, both options take 12 stages to achieve the goal because the expected utility for both is 2. Before this crossing point, the CDF of A is larger than that of C, which indicates that A is preferable to C in this region. The situation is reversed after the crossing point, where the CDF of C is larger than that of A, indicating that option C is preferable to A. The result demonstrates that options with the same expected utility lead to different probabilities of achieving the goal. The riskier option A leads to a higher probability of achieving the goal in the earlier period, whereas the safer option C leads to a higher probability of achieving the goal in the later period. This is also reflected in the PMF curves, whose peak positions are in the order of A < C in Fig 6A.
Fig 6

Comparison of the fix strategies with different expected utilities in a decision game of G of 24.

The options are A(6, 1/3), B(6, 0.3), and C(4, 0.5), denoted by the black square, red circle, and blue diamond, respectively. The PMF and CDF curves are shown in (A) and (B), respectively. Between A and B, where ΔuΔp > 0, it is found that choosing A is dominant because the CDF of A exceeds that of B in the whole t range shown in (B). Between A and C, which have the same expected utility, the two curves interlace at t near 12. Before the crossing, A is preferred, whereas after it, C is preferred. Between B and C, which have different expected utilities, the two curves interlace at t near 10. Before the crossing, B is preferred, whereas after it, C is preferred. Therefore, between the options in which ΔuΔp < 0, the preferred option becomes t dependent.

By introducing the goal, the ambiguity of choosing among options with the same expected utility is eliminated. In other words, a decision maker who wants to achieve an urgent goal can take the option with a larger u, i.e., a risky option; a decision maker who wants to achieve a goal in the future can take the option with a larger p, i.e., a safe option.

We compare the results of choosing the options with different expected utilities, B(6, 0.3) and C(4, 0.5), in Fig 6. The two CDF curves of B and C interlace at t near 10 in (B). Before this crossing point, the CDF of B is larger than that of C, indicating that B is preferred in the earlier region. After it, the CDF of C is larger than that of B, indicating that C is preferred in the later region. Again, taking option C, which has a higher expected utility, is preferable for a decision maker willing to take a longer time to achieve her goal. However, for a decision maker wanting to achieve her goal in a shorter period, taking B is preferable, even though the corresponding expected utility is lower. This might provide a mathematical explanation for the behavior of choosing the option with a low expected utility, which can be interpreted as the decision maker wishing to satisfy her desire quickly.

Comparing the CDF curves of A and B in (B), the CDF of A is larger than that of B over the whole t range investigated. The ambiguity in decision making arises when ΔuΔp < 0, i.e., (ui − uj)(pi − pj) < 0 for two options i and j. This also results in an interlace in the CDF curves. In other words, there exists a dominant option when pi ⩾ pj or ui ⩾ uj and ΔuΔp > 0. The criterion for the preferable option using the fix strategy can be formulated as Eq 8.
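The crossing behavior of the fix strategies can be checked numerically. The sketch below iterates the reduced chain for each option in the game with G of 24. It is an illustration under stated assumptions: option A is taken to be (6, 1/3), the values implied by a utility of 6 and an expected utility of 2, and the helper names `fix_strategy_cdf` and `first_overtake` are hypothetical.

```python
def fix_strategy_cdf(u, p, g, t_max):
    """CDF P(t), t = 0..t_max, of reaching goal g when always taking option (u, p)."""
    n = g // u              # number of successful moves needed
    R = [0.0] * n + [1.0]   # R[s]: probability that s successes are still needed
    cdf = [R[0]]
    for _ in range(t_max):
        nxt = [0.0] * (n + 1)
        nxt[0] = R[0]                      # the goal is treated as absorbing
        for s in range(1, n + 1):
            nxt[s] += R[s] * (1.0 - p)     # failure: stay at the same stage
            nxt[s - 1] += R[s] * p         # success: advance one reduced stage
        R = nxt
        cdf.append(R[0])
    return cdf


def first_overtake(cdf_x, cdf_y):
    """Smallest t at which cdf_y exceeds cdf_x, or None if it never does."""
    for t, (x, y) in enumerate(zip(cdf_x, cdf_y)):
        if y > x:
            return t
    return None


g, t_max = 24, 40
cdf_A = fix_strategy_cdf(6, 1.0 / 3.0, g, t_max)  # assumption: A = (6, 1/3), EU = 2
cdf_B = fix_strategy_cdf(6, 0.3, g, t_max)        # EU = 1.8
cdf_C = fix_strategy_cdf(4, 0.5, g, t_max)        # EU = 2

print("C overtakes A at t =", first_overtake(cdf_A, cdf_C))
print("C overtakes B at t =", first_overtake(cdf_B, cdf_C))
print("A is never below B:", all(a >= b for a, b in zip(cdf_A, cdf_B)))
```

Under these assumptions the safer option C overtakes A and B only after an initial period in which the riskier options lead, while A is never below B, which is the pattern described for Fig 6B.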

Baseline for comparison: The min k strategy

A decision maker who takes the min k strategy always chooses the option (1, 1), regardless of the stage she is at. It takes g stages to move from Sg to S0 since this option consistently advances the decision maker one stage each time. The PMF as a function of t for this strategy is a delta function with a non-zero value at t = g, and the associated CDF is a step function with a value of 1 for t ⩾ g, as shown by the black lines in Fig 7A, 7B and 7C. This indicates that the decision maker will not reach the goal until t = g. This can serve as a baseline for comparison with the other strategies.
Fig 7

The CDF (Pcom, upper panel) and PMF (P, lower panel) curves for the strategies of max k (grey hexagon), risky (black square), random (red circle), safe (blue diamond), and min k (black line) to reach the goals of (A) 10, (B) 50, and (C) 100.

The CDF curves interlace near t of 12, 60, and 120 for (A), (B), and (C), respectively. Before the crossing, the CDF values are in the order of max k, risky, random, safe, and min k, indicating that the max k strategy leads to the highest probability of achieving a goal as soon as possible. However, after the crossing, the CDF values are in reverse order, indicating that the min k strategy leads to the highest probability of achieving the goal at a later time. The averages of all these strategies in (A), (B), and (C) are 10, 50, and 100, respectively.


The riskiest decision maker: The max k strategy

The max k strategy can be considered a special case of the fix strategy in which the riskiest option is always taken. The network for the max k strategy in a game with G of g is illustrated in Fig 4. The associated matrix is shown in Eq 9. The state R is a 1 × 2 row vector with an initial state R(0) of [0, 1], as shown in Eq 10. P(t) can then be reduced to a closed-form function of t, as shown in Eq 11. The outcome of this strategy is shown in the following section.
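Eqs 9 to 11 are not reproduced in this text. A reconstruction consistent with the description, assuming the two states are ordered as [S0, Sg] and that the goal is absorbing, is sketched below; the symbol p(t) for the PMF is introduced here only for illustration.

```latex
A_{\max k} =
\begin{pmatrix}
1           & 0               \\
\frac{1}{g} & 1 - \frac{1}{g}
\end{pmatrix},
\qquad
R(0) = \begin{pmatrix} 0 & 1 \end{pmatrix},
\qquad
R(t) = R(0)\, A_{\max k}^{\,t}
     = \begin{pmatrix} 1 - \left(1 - \tfrac{1}{g}\right)^{t} & \left(1 - \tfrac{1}{g}\right)^{t} \end{pmatrix},

P(t) = r_0(t) = 1 - \left(1 - \frac{1}{g}\right)^{t},
\qquad
p(t) = P(t) - P(t-1) = \frac{1}{g}\left(1 - \frac{1}{g}\right)^{t-1}.
```

The time to reach the goal is therefore geometrically distributed with mean g, so the PMF peaks at t = 1 and decays monotonically.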

Comparison of the strategies

The CDF and the PMF as functions of t for the exemplified strategies in decision games with G of 10, 50, and 100 are shown in Fig 7A, 7B and 7C, respectively. The outcomes are scalable: the positions of the peaks of the curves for the strategies investigated are proportional to G. For example, the peaks of the curves for the safe strategy are at t = 10, 50, and 100 for G of 10, 50, and 100, respectively. We therefore describe only the results in Fig 7A in detail, as the results for G of 50 and 100 exhibit the same qualitative properties.

The CDF curves are shown in the upper panels. The curves for max k, risky, random, and safe cross one another at t near 12, 60, and 120 for G of 10, 50, and 100, respectively, which again shows scalability. For t below this crossing point, the CDF values are in the order max k > risky > random > safe > min k. This result indicates that the riskier strategies lead to a higher probability of achieving the goal by an early t. For t above the crossing point, however, the order is reversed, which indicates that the safer strategies have a higher probability of achieving the goal in the long run.

The PMF curves for the exemplified strategies are very different, although the expected utilities of the options are all the same. The peak position indicates the period during which the majority of decision makers who take that strategy achieve their goal. Along the t axis, the peak positions are in the order max k < risky < random < safe ≃ min k. This provides suggestions for decision makers who wish to achieve a goal during a designated period. With a G of 10, for example, the safe strategy is recommended for decision makers who wish to achieve their goal during 8 < t < 12, whereas the risky strategy is recommended for those who wish to achieve their goal during 4 < t < 10.

In our game set, the expected utilities for all options are the same, as are the resulting average times to reach the goal. This is also plausible according to the law of large numbers. Thus, in the CDF curves shown in Fig 7, a strategy that gives decision makers a higher probability of achieving the goal in an early period also gives them a lower probability of achieving it in a late period. Therefore, a decision maker can set strategies, i.e., construct the W, and apply our method to find satisfactory strategies according to the resulting CDF and PMF as functions of t.
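The early-versus-late trade-off described above can be illustrated with a minimal Python sketch. It is not the network method of this paper: it assumes, purely for illustration, that a decision maker repeats one option (u, p) in every round, so that the number of successful rounds is binomial. The two options, a safe (1, 1.0) and a risky (5, 0.2), and the function names `cdf_reach` and `pmf_reach` are hypothetical; both options have the same expected utility of 1 per round.

```python
from math import ceil, comb

def cdf_reach(goal, u, p, t):
    """Probability that the goal is reached within t rounds of option (u, p)."""
    k = ceil(goal / u)          # number of successful rounds needed
    if t < k:
        return 0.0              # too few rounds to collect the goal
    # Probability of fewer than k successes in t rounds (binomial tail).
    miss = sum(comb(t, j) * p**j * (1 - p)**(t - j) for j in range(k))
    return 1.0 - miss

def pmf_reach(goal, u, p, t):
    """Probability that the goal is reached exactly at round t."""
    return cdf_reach(goal, u, p, t) - cdf_reach(goal, u, p, t - 1)

G = 10
SAFE = (1, 1.0)    # hypothetical safe option: always gain 1
RISKY = (5, 0.2)   # hypothetical risky option: gain 5 with probability 0.2

# CDF of both strategies at a few values of t.
for t in (5, 10, 20, 30):
    print(t, round(cdf_reach(G, *RISKY, t), 4), cdf_reach(G, *SAFE, t))

# Mean arrival time of the risky strategy (the tail beyond t = 500 is negligible).
mean_risky = sum(t * pmf_reach(G, *RISKY, t) for t in range(1, 500))
print(round(mean_risky, 3))
```

In this toy setting the risky option is ahead before the safe option's sure arrival at t = 10 and behind from then on, while both strategies have the same mean arrival time of 10 rounds.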

Discussions

Expected utility revisited

No matter which strategy a decision maker takes, the average t to reach the goal is the same, which is also guaranteed by the law of large numbers. However, the resulting distribution of P(t), the probability of reaching the goal as a function of time, depends strongly on the strategy. By taking the goal and time into account, our results show that the expected utility is not the sole criterion, even though it is often used in evaluating strategies in decision making under risk. For a decision maker who wants to achieve a goal as soon as possible, riskier options are preferable to safer options, even though the former may have a lower expected utility. Therefore, the criteria for a rational decision need to be reconsidered. Rationality often implies that a decision maker should be consistent. In other words, a decision maker who takes the option with the maximum expected utility in one question should also take the option with the maximum expected utility in another question. According to our results, an observed inconsistency might imply that people are maximizing not the expected utility but P(t). That is, in some circumstances a decision maker may wish to accomplish a goal earlier, and in other circumstances she may wish to accomplish it later. Furthermore, goals might vary under different circumstances. Inconsistency in answering decision questions might therefore not indicate irrationality, but rather reveal a different goal-time consideration. Therefore, the meaning of rational choice needs to be redefined.

Measurement of goal in mind

In our model, a decision maker should consider both the goal and the time when choosing a strategy, as well as the distribution of the weights on the options. As the results in Fig 6 show, a decision maker who chooses a riskier option with a low expected utility might reveal urgency in achieving her goal. This might provide an opportunity to measure the decision maker's wishes and shed light on the automatic and unconscious calculation in mind. We suggest recording an individual's series of decisions in order to estimate their goal and their urgency in achieving it. The goal in mind might be estimated once the time question is answered, e.g., from the individual's plan. The urgency might be estimated once the goal in mind is stated directly by the individual or inferred from the individual's needs. The aforementioned experiments conducted by Kahneman and Tversky [22, 31, 65, 66] are taken as an example to illustrate how to estimate the goal or urgency. Fig 8 shows the CDF curves for a decision between A(4000, 0.2) and B(3000, 0.25) and between C(4000, 0.8) and D(3000, 1) with G of 24000 (upper panel) and 12000 (lower panel). As noted above, 65% of the subjects chose A over B, and 80% chose D over C. Our model provides a possible explanation for the majority of subjects who took A and D. We look for a range of t that satisfies both P_A > P_B and P_D > P_C, where the subscript denotes the option taken. If they wished to achieve the goal within 5 < t < 20 (10 < t < 42), the goal in mind could be estimated as 12000 (24000). Conversely, if they stated that their goal was 12000 (24000), the t in which they wished to reach the goal could be estimated as 5 < t < 20 (10 < t < 42).
Fig 8

The CDF curves for decisions between A(4000, 0.2) and B(3000, 0.25), and between C(4000, 0.8) and D(3000, 1) with G of 24000 (upper panel) and 12000 (lower panel).

A possible explanation resulting from our model: the majority of subjects, who took A and D, might wish to achieve a goal of 12000 within 5 < t < 20 or a goal of 24000 within 10 < t < 42. If these subjects wish to achieve the goal by t = 30, the goal in mind could be estimated to be about 24000.
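The search for a range of t in which A is preferred to B and D is preferred to C can be sketched in Python as follows. This is a simplified stand-in for the network computation, assuming that the same option is repeated every round so that the number of successes is binomial; the names `miss_prob` and `consistent_rounds` and the horizon `t_max` are hypothetical. Because of these simplifications, the thresholds it finds need not coincide with the ranges read from Fig 8.

```python
from math import ceil, comb

def miss_prob(goal, u, p, t):
    """Probability that the goal is NOT yet reached after t rounds of (u, p)."""
    k = ceil(goal / u)          # number of successful rounds needed
    if t < k:
        return 1.0
    return sum(comb(t, j) * p**j * (1 - p)**(t - j) for j in range(k))

# Options from the Kahneman and Tversky questions quoted in the text.
OPTIONS = {"A": (4000, 0.2), "B": (3000, 0.25), "C": (4000, 0.8), "D": (3000, 1.0)}

def consistent_rounds(goal, t_max=60):
    """Rounds t at which A beats B and D beats C in reaching `goal` by t."""
    rounds = []
    for t in range(1, t_max + 1):
        miss = {name: miss_prob(goal, u, p, t) for name, (u, p) in OPTIONS.items()}
        # Comparing miss probabilities avoids rounding 1 - tiny to exactly 1.0.
        if miss["A"] < miss["B"] and miss["D"] < miss["C"]:
            rounds.append(t)
    return rounds

for goal in (12000, 24000):
    rounds = consistent_rounds(goal)
    print(goal, rounds[0], rounds[-1])   # first and last consistent round
```

Given a stated deadline, the goal whose consistent range contains that deadline is the candidate goal in mind; given a stated goal, the range itself is the candidate time window.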


Prospect theory revisited

Using our model, we reinterpret two aspects of prospect theory [22, 24–26, 31]: (1) the framing effect and (2) non-linear preference. The framing effect, as mentioned previously, describes how people tend towards risk aversion in the frame of gains and towards risk seeking in the frame of losses. Consider a decision game with the payoff table shown in Table 2. A decision maker who is s stages away from S_0, i.e., at stage S_s, has two options: a riskier option, R(u_r, p_r), and a safer option, S(u_s, p_s), in which u_r p_r = u_s p_s and p_r < p_s. In the frame of gains, shown in Fig 9A, choosing either R or S may lead her to S_0. However, choosing S leads her to S_0 with a higher chance than choosing R. Moreover, choosing R results in a probability of (1 − p_r) of staying at S_s, which is higher than the probability of (1 − p_s) when choosing S. For a decision maker at a stage near the goal, it is therefore preferable to choose the safer option. She would exhibit risk aversion because there is no need to choose a riskier option with a small probability and a large reward that is in excess of the goal. In the extreme case, it is preferable to choose an option with a sure gain when that gain is just enough to lead her to the goal with certainty.
Table 2

Reward and probability.

Option    u      p
R         u_r    p_r
          0      (1 − p_r)
S         u_s    p_s
          0      (1 − p_s)
Fig 9

Examples of the mathematical interpretation of the economic behaviors in (A) frame of gains and (B) frame of losses.

There are two options, a riskier R(u_r, p_r) and a safer S(u_s, p_s), in which u_r p_r = u_s p_s and p_r < p_s. (A) In the frame of gains, both u_r and u_s are positive. When the decision maker is at S_s, which is s stages away from the goal, S_0, both options may lead her to S_0. Nevertheless, choosing S gives her a higher probability of arriving at S_0 and a lower probability of staying at S_s than choosing R. Thus, S is preferable. This demonstrates risk-averse behavior when the positive goal is near in the frame of gains. (B) In the frame of losses, both u_r and u_s are negative. The decision maker is at S_s, which is s stages away from S_0, the stage she tries to avoid, and both options may lead her to S_0. However, choosing R allows her to stay at S_s with a higher probability than choosing S ((1 − p_r) > (1 − p_s)). Although choosing R may lead her to a stage worse than S_0, this does not really matter, since both that stage and S_0 indicate her arrival at the negative goal. Additionally, choosing R may lead her farther away from S_0. This demonstrates risk-seeking behavior when the goal is near in the frame of losses.

We further consider the situation in the frame of losses by assuming that both u_r and u_s are negative. In Fig 9B, S_0 is negative and indicates a lower bound of losses, e.g., bankruptcy, which a decision maker tries to avoid. At S_s, choosing R gives a chance of (1 − p_r) of staying at S_s, which is higher than the chance of (1 − p_s) when choosing S. Therefore, the decision maker would prefer the riskier option. That is, a decision maker may behave as a risk seeker in the frame of losses, especially when she is near the negative goal. Moreover, a sure loss (p = 1) is unwelcome, since a rational decision maker tries to avoid arriving at S_0. Together with the argument for gains, this demonstrates risk aversion in the frame of gains and risk seeking in the frame of losses. Such an asymmetry between the behaviors in the frame of gains and the frame of losses is embedded in our model. In other words, the framing effect in prospect theory can be mathematically interpreted.

The nonlinear preference, i.e., the concavity and convexity of the value-utility curve in prospect theory, is attributed to psychological activities [25]. In our model, the expected utility is proportional to the monetary value. In contrast, the criterion used instead of the expected utility, P(t), is intrinsically nonlinear, which may be related to the nonlinear preference in prospect theory.
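The asymmetry between the two frames can be checked numerically with a small Python sketch. It assumes a decision maker one stage away from S_0, so that a single successful outcome of either option is enough to arrive there; the probabilities 0.4 and 0.8 and the function names are hypothetical illustration values that only satisfy p_r < p_s.

```python
def reach_within(p, rounds):
    """Gains: probability of arriving at the goal within `rounds` rounds."""
    return 1.0 - (1.0 - p) ** rounds

def avoid_within(p, rounds):
    """Losses: probability that the loss has not occurred after `rounds` rounds."""
    return (1.0 - p) ** rounds

P_R, P_S = 0.4, 0.8   # hypothetical riskier and safer probabilities, p_r < p_s

for n in (1, 2, 3):
    # Frame of gains: the safer option reaches the positive goal more often.
    gain_r, gain_s = reach_within(P_R, n), reach_within(P_S, n)
    # Frame of losses: the riskier option avoids the negative goal more often.
    loss_r, loss_s = avoid_within(P_R, n), avoid_within(P_S, n)
    print(n, round(gain_r, 3), round(gain_s, 3), round(loss_r, 3), round(loss_s, 3))
```

The same pair of options is thus ranked in opposite ways depending only on whether S_0 is a goal to reach or a bound to avoid.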

Mathematical interpretation of economical behaviors

The house money effect proposed by Thaler and Johnson [28] describes the tendency of people to spend money from a prior gain, e.g., earned from gambling, in a risk-seeking mode. Our model interprets this as the decision maker having a larger goal when dealing with this money. In other words, she is willing to take a risk because the goal is out of consideration. In the presence of prior losses, the decision maker will take options whose rewards offer an opportunity to break even, which is called the break-even effect [28]. This implies that with a prior loss, the goal might be shifted to breaking even or at least getting something [65, 67, 68]. As a result, the tendency toward risk-seeking behavior would be enhanced, especially in the frame of losses. In addition, people may divide their money into parts, which are used in different ways according to mental accounts [28]. Our model suggests that people can finely adjust the weights in a strategy for each account according to the goal and the time to reach that goal. Computational shortcuts have been proposed as part of the heuristic procedures in decision making [8, 25, 69–72]. Our model suggests that people might unconsciously carry out such a mental computation of the probability of reaching the goal.
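How a larger goal can shift the preferred option toward risk can be sketched in Python. The example assumes, as a simplification of the network model, that one option is repeated every round and that the decision maker picks the option with the higher probability of reaching the goal by a fixed deadline; the options, the deadline, and the function names are hypothetical.

```python
from math import ceil, comb

def cdf_reach(goal, u, p, t):
    """Probability that the goal is reached within t rounds of option (u, p)."""
    k = ceil(goal / u)          # number of successful rounds needed
    if t < k:
        return 0.0
    miss = sum(comb(t, j) * p**j * (1 - p)**(t - j) for j in range(k))
    return 1.0 - miss

def preferred(goal, deadline, options):
    """Name of the option with the highest probability of reaching goal by deadline."""
    return max(options, key=lambda name: cdf_reach(goal, *options[name], deadline))

# Two hypothetical options with the same expected utility of 1 per round.
OPTIONS = {"safe": (1, 1.0), "risky": (5, 0.2)}

print(preferred(5, 10, OPTIONS))    # modest goal within 10 rounds
print(preferred(20, 10, OPTIONS))   # larger goal within the same 10 rounds
```

With the modest goal the safe option arrives with certainty, whereas the larger goal cannot be reached by the safe option before the deadline, so only the risky option retains a nonzero chance.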

Perspectives

An individual's behavior or activity may play a crucial role in collective cooperation. It has been shown that social diversity in individual behavior can substantially improve the level of cooperation [73-76]. At present, our model can be considered a single-player game, which provides a reference for the behaviors of individuals in decision making. Game theory could thus be extended by considering the players' goals and urgencies based on our model. Additionally, a policy maker may design a game situation that enhances the level of cooperation in order to obtain the public good with less contradiction. In decision making under uncertainty, the (u, p) for each option is unknown. Learning from experience is thus important in constructing the (u, p) in practice and in explaining human behaviors [55, 77]. Our model provides a mathematical process to estimate the (u, p) for all possible options. As shown in Eq 6, A can be resolved from the resulting P(t) by setting a trial W after repeated decision making. This provides an opportunity to find the decision network, the possible combinations of the (u, p)s, and the associated Qs through these trials. Therefore, learning from experience becomes mathematically achievable. The proposed model combines the goal and the time to provide a mathematical procedure for decision making, and it also provides a mathematical foundation for explaining several psychological effects in economic behaviors. Such a mathematical procedure may additionally provide a foundation for decision making by artificial intelligence.

Conclusion

The proposed network approach provides a powerful tool for analysing the time-dependent outcomes of decisions. Behaviors that contradict those predicted by expected utility theory can be mathematically explained. Our results imply that planning could be embedded in decision behaviors. For a rational decision under risk, one needs to consider the achievement of the desired goal within a specific period in the future rather than maximizing the expected utility in the present. Moreover, the model not only provides a mathematical foundation for resolving ambiguity in decision making, but also provides mathematical reasoning for behaviors that choose options with lower expected utilities.
