Expected utility versus expected regret theory versions of decision curve analysis do generate different results when treatment effects are taken into account.

Iztok Hozo, Athanasios Tsalatsanis, Benjamin Djulbegovic.

Abstract

RATIONALE, AIMS, AND OBJECTIVES: Decision curve analysis (DCA) is a widely used method for evaluating diagnostic tests and predictive models. It was developed based on expected utility theory (EUT) and has been reformulated using expected regret theory (ERG). Under certain circumstances, these 2 formulations yield different results. Here we describe these situations and explain the variation.
METHODS: We compare the derivations of the EUT- and ERG-based formulations of DCA for a typical medical decision problem: "treat none," "treat all," or "use model" to guide treatment. We illustrate the differences between the 2 formulations when applied to the following clinical question: at which probability of death should we refer a terminally ill patient to hospice?
RESULTS: Both DCA formulations yielded identical but mirrored results when treatment effects are ignored; they generated significantly different results otherwise. Treatment effect has a significant effect on the results derived by EUT DCA and less so on ERG DCA. The elicitation of specific values for disutilities affected the results even more significantly in the context of EUT DCA, whereas no such elicitation was required within the ERG framework.
CONCLUSION: EUT and ERG DCA generate different results when treatment effects are taken into account. The magnitude of the difference depends on the effect of treatment and the disutilities associated with disease and treatment effects. This is important to realize because current practice guidelines are uniformly based on EUT; the same recommendations can differ significantly if they are derived within the ERG framework.

Keywords: evaluation; medical research; practical reasoning

Year: 2016 PMID: 27981695 PMCID: PMC5900988 DOI: 10.1111/jep.12676

Source DB: PubMed Journal: J Eval Clin Pract ISSN: 1356-1294 Impact factor: 2.431

INTRODUCTION

Arguably, the threshold model represents one of the most important advances in clinical decision making.1, 2 According to the threshold model, when a physician is uncertain whether to treat, order a test, apply a predictive model, or simply observe the patient, there exists some probability of disease or disease outcome (the threshold) at which the physician is indifferent between administering and not administering treatment, or between acting and not acting on the test or predictive model.1, 2 The threshold model reflects one of the fundamental principles of rational decision making: it is rational for a doctor to act (ie, order a diagnostic test, prescribe treatment) and for the patient to accept the proposed health intervention when one believes that the benefits (gains) (B) of such action will outweigh its harms (losses) (H), ie, when the probability exceeds the threshold (T).3, 4 Originally, the threshold model was derived within the precepts of expected utility theory (EUT).1, 2 During the last 40 years, the threshold model has been reformulated in a number of ways, both within the framework of EUT and non-EUT theories (for a review, see Djulbegovic et al4).

One such extension of the threshold model is decision curve analysis (DCA). DCA is a widely used technique for evaluating the value of diagnostic tests or predictive models over a range of all possible thresholds.5, 6, 7, 8, 9, 10 Assessing the threshold probability at which a decision maker is indifferent between failing to administer a beneficial health intervention and committing to a potentially harmful one allows patient preferences related to the given management choices to be captured.1, 2, 4 DCA incorporates the predictive model's accuracy, the consequences of a decision action, and a patient's preferences to assess the best course of action, such as deciding according to the predictive model, treating all patients, or treating none.5, 6, 7 One of the advantages of DCA is that we do not actually need to elicit the threshold from each patient; instead, we model decisions about treatment over a range of thresholds without knowing details about the specific utilities that determine the threshold.5, 6

DCA was originally formulated using EUT5, 6 and reformulated within the expected regret theory (ERG) framework.7, 11 In the EUT DCA, the best course of action is the one associated with the highest expected value, whereas in the ERG DCA, the best course of action is the decision that will lead to the least amount of regret.7 We previously showed that ERG DCA and EUT DCA lead to the same decisions if treatment effects are not taken into consideration.7 Note, however, that the original DCA did not explicitly model the effect of treatment on patients' outcomes. In this paper, we demonstrate that when treatment effects are included in the modeling, different results are generated by EUT and ERG DCA, which has important implications for medical decision making.

METHODS

Model structure

Figure 1 displays a typical decision tree describing treatment options based on the results of a prediction model. Here, $q_i$ represents the model-generated probability of the event of interest D (D+ event present, D− event absent), such as disease presence or occurrence of an outcome, in a patient $i$; $p_i$ is the actual probability of the event D+ for the patient $i$. The individual's risk is $p_i$, $i = 1, \dots, N$, where $N$ is the number of patients. RRR is the relative risk reduction expected from treatment Rx, $U_j$ is the utility associated with each outcome $j$, and $T$ is a threshold probability at which a decision maker is indifferent between the “do not treat” (NoRx) and “treat” (Rx) strategies.

Figure 1

Decision tree depicting use of a predictive model to guide treatment choices according to expected utility-based theory (“EUT utilities”) and ERG (displayed as “Regret Utilities,” ie, differences in values or utility of the outcomes of the action taken and the utility of the outcomes of another action, which, in retrospect, we should have taken12, 13, 14). As explained in the text, $q_i$ represents the model-generated probability of disease outcome for a patient $i$, whereas $p_i$ is the actual probability of the event D (disease outcome) for the same patient. T, threshold probability for treatment; D+, disease is present; D−, disease is absent; RRR, relative risk reduction of treatment. Regret is computed as the difference in utilities of the action taken and the action that, in retrospect, should have been taken.

We opt to model treatment effect as relative risk reduction (RRR), which is a convenient way to express a risk ratio (RR) as a proportion of risk ($p_i$) reduction according to15, 16

$$\mathrm{RRR} = 1 - \mathrm{RR},$$

where RR is defined as the risk of the event in the treatment group over the risk of the event in the control group.16 The main advantage of using RRR as a measure of treatment effect over absolute treatment differences is that the former remains constant over the range of predicted risks ($p_i$).15, 16, 17 RRR is also easy to interpret: RRR = 1 means that the occurrence of the outcome of interest is completely preventable [as $p_i(1 - \mathrm{RRR}) = 0$], whereas RRR = 0 means that treatment is useless as it does not affect the underlying risk [$p_i(1 - \mathrm{RRR}) = p_i$].

Note that we use the term “predictive model” in a generic sense, meaning a model that predicts or foretells something that is yet unknown (such as outcome occurrence in individual patients). Typically, such models convert available information (predictors) into a statement about the probability of a diagnosis or prognosis.18, 19 The model shown in Figure 1 (and later in Figure 2) applies to both prognostic and diagnostic prediction as long as such a prediction is used to guide selection of treatment.
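As a small numerical illustration of the RRR definition above, the following Python sketch converts a risk ratio into an RRR and applies it to two baseline risks. The function names and the example risks are hypothetical and are not taken from the paper; the only relationships assumed are RRR = 1 − RR and a treated risk of p(1 − RRR).

```python
def relative_risk_reduction(risk_treated: float, risk_control: float) -> float:
    """RRR = 1 - RR, with RR = risk_treated / risk_control."""
    if risk_control <= 0:
        raise ValueError("risk_control must be positive")
    return 1.0 - risk_treated / risk_control


def risk_on_treatment(p: float, rrr: float) -> float:
    """Event risk for a treated patient whose untreated risk is p."""
    return p * (1.0 - rrr)


# Hypothetical trial risks, chosen only for illustration.
rrr = relative_risk_reduction(risk_treated=0.15, risk_control=0.20)

# The same RRR applied to a low and a high baseline risk: the relative
# effect is identical, while the absolute reduction differs.
for baseline in (0.20, 0.60):
    treated = risk_on_treatment(baseline, rrr)
    print(f"baseline={baseline:.2f} treated={treated:.3f} "
          f"absolute reduction={baseline - treated:.3f}")
```

The loop is meant to make the stated advantage of RRR concrete: a single relative effect can be carried across patients with different predicted risks, whereas the absolute difference has to be recomputed for each baseline.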

Figure 2

Decision tree depicting a typical 3-choice dilemma. a, Expected utility-based tree. b, Expected regret-based tree. Three alternatives are shown: treat all patients, treat none, and use a predictive model/test to decide whether to treat or not. $q_i$ represents the model-generated probability of disease outcome for a patient $i$, whereas $p_i$ is the actual probability of the event D+ (disease outcome) for the same patient. T, threshold probability for treatment; D−, disease is absent; RRR, relative risk reduction of treatment. Regret is computed as the difference in utilities of the action taken and the action that, in retrospect, should have been taken.

Derivation of the EUT DCA

As an illustrative example, Figure 2a shows the decision tree of a 3-choice dilemma associated with hospice referral. In this case, a patient may decide to receive treatment targeting the underlying disease (“Treat All”), accept referral to hospice (“Treat None”), or act according to the threshold model based on the patient's estimated probability of death (“Model”). In Figure 2a, $q_i$ and $p_i$ represent the model-estimated probability of death and the actual probability of death (D+) for patient $i$, respectively. RRR represents the relative risk reduction associated with treatment Rx, $U_j$ is the utility of outcome $j$, and $T$ is the threshold probability at which a decision maker is indifferent between the benefits and harms of treatment. Here $U_1$ and $U_2$ denote the utilities of treated patients with and without the event, and $U_3$ and $U_4$ the utilities of untreated patients with and without the event. By solving the tree, we derive the expected value of the model as

$$E[U_{\mathrm{Model}}] = \frac{1}{N}\sum_{i=1}^{N}\Big\{ I(q_i \ge T)\big[p_i(1-\mathrm{RRR})\,U_1 + \big(1 - p_i(1-\mathrm{RRR})\big)U_2\big] + I(q_i < T)\big[p_i U_3 + (1-p_i)\,U_4\big]\Big\}, \tag{1}$$

where $I(\cdot)$ equals 1 when its condition holds and 0 otherwise. As explained earlier, according to the threshold model, we treat if $q_i \ge T$. Therefore, the expected values of the “Treat none” and “Treat all” strategies can be computed by setting T = 1 and T = 0 in Equation (1), respectively. Thus,

$$E[U_{\mathrm{None}}] = \frac{1}{N}\sum_{i=1}^{N}\big[p_i U_3 + (1-p_i)\,U_4\big], \qquad E[U_{\mathrm{All}}] = \frac{1}{N}\sum_{i=1}^{N}\big[p_i(1-\mathrm{RRR})\,U_1 + \big(1 - p_i(1-\mathrm{RRR})\big)U_2\big].$$

The optimal strategy is the one that yields the higher net benefit (NB). Using the derived expected utilities, we calculate the NB of a strategy (eg, “Treat all” or “Model”) by subtracting the expected utility of the “Treat none” strategy from the expected utility of that strategy.5, 6 Thus,

$$NB_{\mathrm{Model}} = E[U_{\mathrm{Model}}] - E[U_{\mathrm{None}}] = \frac{1}{N}\sum_{i=1}^{N} I(q_i \ge T)\Big\{p_i\big[(1-\mathrm{RRR})(U_1 - U_2) - U_3 + U_4\big] - (U_4 - U_2)\Big\}. \tag{2}$$

To simplify Equation (2), we define the true positive rate for a given threshold as

$$TP(T) = \frac{1}{N}\sum_{i=1}^{N} p_i\, I(q_i \ge T)$$

and the false positive rate for the given threshold as

$$FP(T) = \frac{1}{N}\sum_{i=1}^{N} (1 - p_i)\, I(q_i \ge T).$$

Thus,

$$NB_{\mathrm{Model}} = TP\,\big[(U_1 - U_3) + \mathrm{RRR}\,(U_2 - U_1)\big] - FP\,(U_4 - U_2).$$

By replacing T = 0 in Equation (2), we derive the NB of “Treat all” ($TP = \bar{p}$ and $FP = 1 - \bar{p}$, where $\bar{p} = \frac{1}{N}\sum_i p_i$):

$$NB_{\mathrm{All}} = \bar{p}\,\big[(U_1 - U_3) + \mathrm{RRR}\,(U_2 - U_1)\big] - (1 - \bar{p})\,(U_4 - U_2).$$

To further simplify our notation, and following Pauker and Kassirer1, 2 and Vickers and Elkin,5 we define the difference between utilities (preferences) related to the consequence of administering treatment when it would have been of benefit as $B = U_1 - U_3$; similarly, the preferences related to the consequences of being unnecessarily treated are denoted as harms, $H = U_4 - U_2$. Finally, even if appropriately given, there is no guarantee that only D+ patients will receive treatment (see Figure 2); some patients with D− may also receive treatment. We define this difference (Δ) between the utilities of administering treatment as $\Delta = U_2 - U_1$. Note that all threshold models to date have assumed that the differences in benefit and harm between these utilities are positive ($B > 0$, $H > 0$), which is a clinically sensible assumption. Although they are possible in principle, we do not consider negative utilities or negative $B$, $H$, or $\Delta$ in our threshold model. With these substitutions, we can rewrite the formulas above as

$$NB_{\mathrm{Model}} = TP\,(B + \mathrm{RRR}\cdot\Delta) - FP\cdot H, \qquad NB_{\mathrm{All}} = \bar{p}\,(B + \mathrm{RRR}\cdot\Delta) - (1 - \bar{p})\,H.$$

The threshold probability, or the probability at which one is indifferent between deciding to treat versus not to treat, is computed as

$$T_{EUT} = \frac{U_4 - U_2}{(U_4 - U_2) + (U_1 - U_3) + \mathrm{RRR}\,(U_2 - U_1)}.$$

Using the definitions for $B$, $H$, and $\Delta$ above, we have

$$T_{EUT} = \frac{H}{B + H + \mathrm{RRR}\cdot\Delta} = \frac{1}{1 + \dfrac{B}{H} + \mathrm{RRR}\cdot\dfrac{\Delta}{H}}. \tag{4}$$

Note that if RRR = 0, $T_{EUT}$ reduces to the “classic” EUT Pauker and Kassirer threshold1, 2:

$$T_c = \frac{1}{1 + B/H}.$$

$B$, $H$, and $\Delta$ can be further characterized in terms of disutilities, or using other popular evidence-based statistical measures.20, 21 When this is done, the threshold model can be further formulated in a number of other ways (for a review, see Djulbegovic et al4).

Although the original and widely used DCA did not take treatment effect into consideration, Vickers et al10 did attempt to integrate treatment into EUT DCA by expressing the threshold as the absolute risk reduction between 2 treatments (ARD), the difference between p and p0, where “p represents the probability of event for patients receiving treatment and p0 represents the probability of event in untreated patients.” In our view, this creates several problems. First, ARD does vary with baseline risk, and it is preferable to model treatment effects using relative effects such as RRR that remain constant over the range of predicted risks ($p_i$).15, 16, 17 Second, it may be better to express the threshold with respect to individual risk probabilities ($p_i$). In principle, that is possible by expressing the ARD in terms of $p_i$ and RRR, which would, in turn, allow reformulation of the threshold. Most importantly, however, the method described by Vickers et al10 assumes that $U_1 - U_3 < 0$. Although technically this is correct, clinically such a situation constitutes an extremely rare case, particularly in the area of cancer treatment discussed by the authors. Regardless of whether one wants to apply this EUT DCA model, its use per se is immaterial to the main objective of our paper, which is to contrast findings using EUT DCA with those using ERG DCA.

In further exposition, we do not consider all possible ways in which the threshold equation can be expressed but focus on the derivation of DCA, which is mainly done by scaling the original threshold formulas. As explained, our main intent here is to demonstrate differences between EUT- and ERG-derived DCA. Using the relationship expressed in Equation (4), we can derive scaled NBs. Scaled by $B + \mathrm{RRR}\cdot\Delta$:

$$\frac{NB_{\mathrm{Model}}}{B + \mathrm{RRR}\cdot\Delta} = TP - FP\,\frac{T_{EUT}}{1 - T_{EUT}}, \qquad \frac{NB_{\mathrm{All}}}{B + \mathrm{RRR}\cdot\Delta} = \bar{p} - (1 - \bar{p})\,\frac{T_{EUT}}{1 - T_{EUT}}. \tag{5}$$

Note that if RRR = 0, these equations reduce to the Vickers and Elkin DCA equation (which uses “classic” T).5 Scaled by $H$:

$$\frac{NB_{\mathrm{Model}}}{H} = TP\,\frac{1 - T_{EUT}}{T_{EUT}} - FP.$$

Scaled by $B + H + \mathrm{RRR}\cdot\Delta$:

$$\frac{NB_{\mathrm{Model}}}{B + H + \mathrm{RRR}\cdot\Delta} = TP\,(1 - T_{EUT}) - FP\cdot T_{EUT}.$$

DCA curves are generated by plotting the NB of all 3 strategies (“Treat all,” “Treat none,” and “Model”) over all thresholds of interest.
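The EUT derivation can be checked numerically with a short Python sketch. It rests on assumptions that are ours rather than confirmed details of the paper: outcomes are labeled U1 (treated, event), U2 (treated, no event), U3 (untreated, event), and U4 (untreated, no event), with B = U1 − U3, H = U4 − U2, and Δ = U2 − U1, and the closed form is taken to be NB = TP·(B + RRR·Δ) − FP·H. The toy data, the utilities, and all function names are hypothetical.

```python
from typing import Sequence, Tuple


def tp_fp(q: Sequence[float], p: Sequence[float], t: float) -> Tuple[float, float]:
    """Fractions of all patients treated (q_i >= t) with and without the event."""
    n = len(q)
    tp = sum(pi for qi, pi in zip(q, p) if qi >= t) / n
    fp = sum(1.0 - pi for qi, pi in zip(q, p) if qi >= t) / n
    return tp, fp


def expected_utility(q, p, t, rrr, u1, u2, u3, u4) -> float:
    """Average utility when patients with q_i >= t are treated (tree solved directly)."""
    total = 0.0
    for qi, pi in zip(q, p):
        if qi >= t:
            risk = pi * (1.0 - rrr)           # treatment lowers the event risk
            total += risk * u1 + (1.0 - risk) * u2
        else:
            total += pi * u3 + (1.0 - pi) * u4
    return total / len(q)


def net_benefit_direct(q, p, t, rrr, u1, u2, u3, u4) -> float:
    """NB = E[U(strategy)] - E[U(treat none)]; an infinite threshold treats nobody."""
    none = expected_utility(q, p, float("inf"), rrr, u1, u2, u3, u4)
    return expected_utility(q, p, t, rrr, u1, u2, u3, u4) - none


def net_benefit_eut(q, p, t, rrr, b, h, delta) -> float:
    """Assumed closed form: NB = TP*(B + RRR*Delta) - FP*H."""
    tp, fp = tp_fp(q, p, t)
    return tp * (b + rrr * delta) - fp * h


# Hypothetical toy data: model probabilities q and observed outcomes p.
q = [0.1, 0.4, 0.6, 0.9]
p = [0.0, 0.0, 1.0, 1.0]
u1, u2, u3, u4 = 0.5, 0.9, 0.2, 1.0          # illustrative utilities only
b, h, delta = u1 - u3, u4 - u2, u2 - u1
rrr = 0.3

for t in (0.0, 0.5):                          # t = 0.0 corresponds to "Treat all"
    direct = net_benefit_direct(q, p, t, rrr, u1, u2, u3, u4)
    closed = net_benefit_eut(q, p, t, rrr, b, h, delta)
    print(f"T={t:.2f} direct={direct:.4f} closed form={closed:.4f}")
```

Solving the tree patient by patient and evaluating the closed form should agree at every threshold; if a different outcome labeling is used, only the definitions of `b`, `h`, and `delta` need to change.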

Derivation of the ERG DCA

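Figure 2b depicts the same decision problem from the regret point of view. Utilities of each outcome are now represented in terms of regret, ie, the difference between the utility of the action taken and the utility of the action that, in retrospect, should have been taken.12, 13, 14 Treating a patient in whom the event does not occur thus carries regret $H = U_4 - U_2$, not treating a patient in whom the event occurs carries regret $B = U_1 - U_3$, and the remaining outcomes carry no regret. Solving the tree in Figure 2b, we derive the expected regret associated with the prediction model as

$$ER_{\mathrm{Model}} = \frac{1}{N}\sum_{i=1}^{N}\Big\{ I(q_i \ge T)\big[1 - p_i(1-\mathrm{RRR})\big](U_4 - U_2) + I(q_i < T)\,p_i\,(U_1 - U_3)\Big\}.$$

Just as in the EUT case, when the threshold is equal to zero or one, we have

$$ER_{\mathrm{All}} = \frac{1}{N}\sum_{i=1}^{N}\big[1 - p_i(1-\mathrm{RRR})\big](U_4 - U_2), \qquad ER_{\mathrm{None}} = \frac{1}{N}\sum_{i=1}^{N} p_i\,(U_1 - U_3).$$

We derive the net expected regret difference (NERD) of each choice (eg, “Treat All,” “Model”) by subtracting the expected regret of “Treat None” from the expected regret of that choice.7, 11, 22

$$NERD_{\mathrm{Model}} = ER_{\mathrm{Model}} - ER_{\mathrm{None}} = \frac{1}{N}\sum_{i=1}^{N} I(q_i \ge T)\Big\{\big[1 - p_i(1-\mathrm{RRR})\big](U_4 - U_2) - p_i\,(U_1 - U_3)\Big\}.$$

Using the previously defined $TP$ and $FP$, as well as $B$ and $H$, we can rewrite this as

$$NERD_{\mathrm{Model}} = FP\cdot H - TP\,(B - \mathrm{RRR}\cdot H). \tag{7}$$

With T = 0 ($TP = \bar{p}$ and $FP = 1 - \bar{p}$) in Equation (7), we derive the NERD of the “Treat all” strategy as

$$NERD_{\mathrm{All}} = (1 - \bar{p})\,H - \bar{p}\,(B - \mathrm{RRR}\cdot H).$$

The threshold probability is computed as

$$T_{ERG} = \frac{U_4 - U_2}{(U_1 - U_3) + (1 - \mathrm{RRR})(U_4 - U_2)}$$

or

$$T_{ERG} = \frac{H}{B + (1 - \mathrm{RRR})\,H} = \frac{1}{1 - \mathrm{RRR} + \dfrac{B}{H}}.$$

Note that if RRR = 0, $T_{ERG}$ reduces to the “classic” EUT Pauker and Kassirer (P&K) threshold $T_c$1, 2 (see above). Scaled by $B - \mathrm{RRR}\cdot H$:

$$\frac{NERD_{\mathrm{Model}}}{B - \mathrm{RRR}\cdot H} = FP\,\frac{T_{ERG}}{1 - T_{ERG}} - TP.$$

Note that if RRR = 0, the equations above and below reduce, up to the mirrored sign, to the Vickers and Elkin DCA EUT equation (see also above).5 Scaled by $H$:

$$\frac{NERD_{\mathrm{Model}}}{H} = FP - TP\,\frac{1 - T_{ERG}}{T_{ERG}}.$$

Scaled by $B + (1 - \mathrm{RRR})\,H$:

$$\frac{NERD_{\mathrm{Model}}}{B + (1 - \mathrm{RRR})\,H} = FP\cdot T_{ERG} - TP\,(1 - T_{ERG}).$$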
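The regret-based derivation can be checked in the same way. The Python sketch below assumes, as our reading of the regret tree rather than a confirmed detail of the paper, that a treated patient without the event carries regret H, an untreated patient with the event carries regret B, and all other outcomes carry no regret, so that NERD = FP·H − TP·(B − RRR·H). All names and numbers are hypothetical.

```python
from typing import Sequence, Tuple


def tp_fp(q: Sequence[float], p: Sequence[float], t: float) -> Tuple[float, float]:
    """Fractions of all patients treated (q_i >= t) with and without the event."""
    n = len(q)
    tp = sum(pi for qi, pi in zip(q, p) if qi >= t) / n
    fp = sum(1.0 - pi for qi, pi in zip(q, p) if qi >= t) / n
    return tp, fp


def expected_regret(q, p, t, rrr, b, h) -> float:
    """Average regret when patients with q_i >= t are treated (tree solved directly)."""
    total = 0.0
    for qi, pi in zip(q, p):
        if qi >= t:
            # Treated: regret H if the event does not occur after treatment.
            total += (1.0 - pi * (1.0 - rrr)) * h
        else:
            # Untreated: regret B if the event occurs.
            total += pi * b
    return total / len(q)


def nerd_direct(q, p, t, rrr, b, h) -> float:
    """NERD = ER(strategy) - ER(treat none); an infinite threshold treats nobody."""
    return expected_regret(q, p, t, rrr, b, h) - expected_regret(q, p, float("inf"), rrr, b, h)


def nerd_closed(q, p, t, rrr, b, h) -> float:
    """Assumed closed form: NERD = FP*H - TP*(B - RRR*H)."""
    tp, fp = tp_fp(q, p, t)
    return fp * h - tp * (b - rrr * h)


# Hypothetical toy data and preference values.
q = [0.1, 0.4, 0.6, 0.9]
p = [0.0, 0.0, 1.0, 1.0]
b, h = 0.3, 0.1

for rrr in (0.0, 0.3):
    for t in (0.0, 0.5):
        print(f"RRR={rrr:.1f} T={t:.2f} "
              f"direct={nerd_direct(q, p, t, rrr, b, h):.4f} "
              f"closed form={nerd_closed(q, p, t, rrr, b, h):.4f}")
```

Lower NERD is better under this sign convention. With RRR = 0 the closed form equals minus the EUT net benefit TP·B − FP·H, which is the mirrored behavior described in the abstract.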

RESULTS

EUT and ERG DCA generate different results

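The major difference between EUT and ERG DCA arises because of the definition of the threshold probability when using the classical EUT and utilities expressed via ERG:

$$T_{EUT} = \frac{1}{1 + \dfrac{B}{H} + \mathrm{RRR}\cdot\dfrac{\Delta}{H}}, \qquad T_{ERG} = \frac{1}{1 - \mathrm{RRR} + \dfrac{B}{H}}.$$

Both of these thresholds can be connected to the “classical” P&K EUT threshold $T_c = 1/(1 + B/H)$ via

$$T_{EUT} = \frac{T_c}{1 + \mathrm{RRR}\cdot\dfrac{\Delta}{H}\cdot T_c}, \qquad T_{ERG} = \frac{T_c}{1 - \mathrm{RRR}\cdot T_c}.$$

Again, note that if RRR = 0, all these thresholds reduce to the “classic” P&K EUT threshold.1, 2 Extending the threshold model into DCA, we obtain the following. For EUT, scaled by $B + \mathrm{RRR}\cdot\Delta$:

$$\frac{NB_{\mathrm{Model}}}{B + \mathrm{RRR}\cdot\Delta} = TP - FP\,\frac{T_{EUT}}{1 - T_{EUT}}.$$

Therefore,

$$\frac{NB_{\mathrm{Model}}}{B + \mathrm{RRR}\cdot\Delta} = TP - FP\,\frac{T_c}{1 - T_c + \mathrm{RRR}\cdot\dfrac{\Delta}{H}\cdot T_c}.$$

For ERG, scaled by $B - \mathrm{RRR}\cdot H$:

$$\frac{NERD_{\mathrm{Model}}}{B - \mathrm{RRR}\cdot H} = FP\,\frac{T_{ERG}}{1 - T_{ERG}} - TP.$$

Therefore,

$$\frac{NERD_{\mathrm{Model}}}{B - \mathrm{RRR}\cdot H} = FP\,\frac{T_c}{1 - (1 + \mathrm{RRR})\,T_c} - TP.$$

Equations (5) and (7) show that EUT-based DCA and ERG-based DCA differ in their threshold definitions ($T_{EUT}$ vs $T_{ERG}$) and in the requirement for specifying Δ in the EUT model. As a result, noticeable differences in the evaluation of predictive models will be generated (Figure 3). In this illustrative example, according to the EUT DCA, using the model to guide management is almost always the best strategy regardless of RRR. However, according to ERG DCA, “Treat None” becomes the best strategy with increasing thresholds. Only when RRR = 0 does EUT DCA generate the same results as ERG DCA.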
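To make the divergence between the two thresholds concrete, the following Python sketch tabulates both for a few values of RRR. The formulas are assumptions consistent with the text rather than confirmed transcriptions of the paper's equations: T_EUT = 1/(1 + B/H + RRR·Δ/H), T_ERG = 1/(1 − RRR + B/H), and the classic threshold T_c = 1/(1 + B/H). Δ/H = 0.05 follows the value quoted in the Figure 3 caption, whereas B/H = 4 and the function names are hypothetical.

```python
def classic_threshold(b_over_h: float) -> float:
    """Classic Pauker and Kassirer threshold, T_c = 1 / (1 + B/H)."""
    return 1.0 / (1.0 + b_over_h)


def t_eut(b_over_h: float, rrr: float, delta_over_h: float) -> float:
    """Assumed EUT threshold with treatment effect."""
    return 1.0 / (1.0 + b_over_h + rrr * delta_over_h)


def t_erg(b_over_h: float, rrr: float) -> float:
    """Assumed ERG threshold with treatment effect (no Delta needed)."""
    return 1.0 / (1.0 - rrr + b_over_h)


def t_eut_from_classic(t_c: float, rrr: float, delta_over_h: float) -> float:
    """The same EUT threshold written in terms of T_c."""
    return t_c / (1.0 + rrr * delta_over_h * t_c)


def t_erg_from_classic(t_c: float, rrr: float) -> float:
    """The same ERG threshold written in terms of T_c (requires RRR * T_c < 1)."""
    return t_c / (1.0 - rrr * t_c)


B_OVER_H = 4.0        # hypothetical benefit/harm ratio, so T_c = 0.2
DELTA_OVER_H = 0.05   # value quoted in the Figure 3 caption

t_c = classic_threshold(B_OVER_H)
for rrr in (0.0, 0.25, 0.5, 1.0):
    print(f"RRR={rrr:.2f}  T_c={t_c:.4f}  "
          f"T_EUT={t_eut(B_OVER_H, rrr, DELTA_OVER_H):.4f}  "
          f"T_ERG={t_erg(B_OVER_H, rrr):.4f}")
```

Under these assumptions the EUT threshold moves below T_c and the ERG threshold moves above it as RRR grows, and the two coincide only at RRR = 0.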

Figure 3

Decision curve analysis (DCA) for a model of referring a patient with terminal illness to hospice, as a function of the threshold probability T. Three strategies are considered: “Treat All,” “Treat None: Refer to Hospice” (=0 on x axis), and “Model: Use Model to Guide Management.” Two strategies are equivalent at the thresholds where their curves cross. The higher the value, the better the strategy. NB, net benefit according to expected utility theory (EUT); NERD, net expected regret difference. To enable comparison of EUT DCA and ERG DCA, NERD values are presented inverted (ie, –NERD). The results clearly show that EUT DCA and ERG DCA generate different results. (Note that Δ/H is arbitrarily fixed at 0.05; somewhat different results are obtained when Δ/H varies.) We used the original patient-level data from the SUPPORT study23 to create a simpler version of the model concerned with the decision whether to refer a patient to hospice/palliative care in the end-of-life setting. The curves are generated by calculating NB and NERD over all thresholds (from 0 to 1, in increments of 0.01).

DISCUSSION

In this paper, we demonstrate that when treatment effects are included in modeling, different results are generated by EUT and ERG DCA. Under these circumstances, EUT DCA cannot be used to model decisions over all preferences without further knowledge of the specific utilities related to the difference between U1 and U2 (Δ). This, however, would defeat DCA's original purpose of analyzing decision strategies without requiring the elicitation of patients' preferences. If the DCA method is to be used, ERG DCA seems preferable, and it has the following additional appeals: (1) it is a mathematically more parsimonious derivation of DCA within a coherent regret theory7; (2) as a cognitive emotion, regret is widely recognized as one of the key decision-making mechanisms, enabling a decision maker to experience the consequences of decisions at both the emotional (type 1) and cognitive (type 2) levels3, 24, 25; and (3) it is easily and reliably elicited using the dual visual analog scale (“regret”-meter)7 or similar scales.7, 26

We should also note that the model we used here is for illustration purposes only: our aim is not to advocate the use of this particular model but to illustrate the differences that arise when the model is used within 2 different theoretical frameworks. Although we advocate using ERG DCA, we are aware of the long tradition of applying EUT in the decision sciences and of the unsettled debate about the superiority of 1 theory over the other. Our main point is that the EUT and ERG versions of DCA do generate different results. Although we cannot possibly settle here the question of the superiority of EUT versus ERG (or other non-EUT theories), the larger point we are making is that the decision about the threshold at which to act relates closely to the question of rational choice.3, 4 The “great rationality debate” has been prominent in nonmedical fields,27, 28 but has only been sporadic in clinical medicine.
By highlighting the differences in the results between ERG DCA and EUT DCA (the latter being a widely used method), we hope to stimulate a “rationality debate” in clinical medicine. The practical importance of advancing this debate can be appreciated if we note, for example, that some practice guidelines, such as those for colorectal screening, are based on EUT-based modeling29; conceivably, different recommendations could have been made if a non-EUT framework had been used. Hence, we think that awareness of our findings is important to modelers, practicing physicians, and policy makers.

Research Support

This study was supported by the DOD (grant no. W81 XWH 09-2-0175, PI: Djulbegovic).

24 in total

1. Therapeutic decision making: a cost-benefit analysis.

Authors: S G Pauker; J P Kassirer
Journal: N Engl J Med Date: 1975-07-31 Impact factor: 91.245

2. Prognosis and prognostic research: application and impact of prognostic models in clinical practice.

Authors: Karel G M Moons; Douglas G Altman; Yvonne Vergouwe; Patrick Royston
Journal: BMJ Date: 2009-06-04

3. Decision curve analysis.

Authors: Mark Fitzgerald; Benjamin R Saville; Roger J Lewis
Journal: JAMA Date: 2015-01-27 Impact factor: 56.272

4. Linking evidence-based medicine therapeutic summary measures to clinical decision analysis.

Authors: B Djulbegovic; I Hozo; G H Lyman
Journal: MedGenMed Date: 2000-01-13

5. Estimation of Benefits, Burden, and Harms of Colorectal Cancer Screening Strategies: Modeling Study for the US Preventive Services Task Force.

Authors: Amy B Knudsen; Ann G Zauber; Carolyn M Rutter; Steffie K Naber; V Paul Doria-Rose; Chester Pabiniak; Colden Johanson; Sara E Fischer; Iris Lansdorp-Vogelaar; Karen M Kuntz
Journal: JAMA Date: 2016-06-21 Impact factor: 56.272

6. Decision curve analysis: a novel method for evaluating prediction models.

Authors: Andrew J Vickers; Elena B Elkin
Journal: Med Decis Making Date: 2006 Nov-Dec Impact factor: 2.583

7. Extensions to regret-based decision curve analysis: an application to hospice referral for terminal patients.

Authors: Athanasios Tsalatsanis; Laura E Barnes; Iztok Hozo; Benjamin Djulbegovic
Journal: BMC Med Inform Decis Mak Date: 2011-12-23 Impact factor: 2.796

8. The Cochrane Collaboration's tool for assessing risk of bias in randomised trials.

Authors: Julian P T Higgins; Douglas G Altman; Peter C Gøtzsche; Peter Jüni; David Moher; Andrew D Oxman; Jelena Savovic; Kenneth F Schulz; Laura Weeks; Jonathan A C Sterne
Journal: BMJ Date: 2011-10-18

9. Thinking Styles and Regret in Physicians.

Authors: Mia Djulbegovic; Jason Beckstead; Shira Elqayam; Tea Reljic; Ambuj Kumar; Charles Paidas; Benjamin Djulbegovic
Journal: PLoS One Date: 2015-08-04 Impact factor: 3.240

10. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests.

Authors: Andrew J Vickers; Ben Van Calster; Ewout W Steyerberg
Journal: BMJ Date: 2016-01-25

4 in total

1. Expected utility versus expected regret theory versions of decision curve analysis do generate different results when treatment effects are taken into account.

Authors: Iztok Hozo; Athanasios Tsalatsanis; Benjamin Djulbegovic
Journal: J Eval Clin Pract Date: 2016-12-15 Impact factor: 2.431