
Decision curve analysis revisited: overall net benefit, relationships to ROC curve analysis, and application to case-control studies.

Valentin Rousson, Thomas Zumbrunn.

Abstract

BACKGROUND: Decision curve analysis has been introduced as a method to evaluate prediction models in terms of their clinical consequences if used for a binary classification of subjects into a group who should and into a group who should not be treated. The key concept for this type of evaluation is the "net benefit", a concept borrowed from utility theory.
METHODS: We recall the foundations of decision curve analysis and discuss some new aspects. First, we stress the formal distinction between the net benefit for the treated and for the untreated and define the concept of the "overall net benefit". Next, we revisit the important distinction between the concept of accuracy, as typically assessed using the Youden index and a receiver operating characteristic (ROC) analysis, and the concept of utility of a prediction model, as assessed using decision curve analysis. Finally, we provide an explicit implementation of decision curve analysis to be applied in the context of case-control studies.
RESULTS: We show that the overall net benefit, which combines the net benefit for the treated and the untreated, is a natural alternative to the benefit achieved by a model, being invariant with respect to the coding of the outcome, and conveying a more comprehensive picture of the situation. Further, within the framework of decision curve analysis, we illustrate the important difference between the accuracy and the utility of a model, demonstrating how poor an accurate model may be in terms of its net benefit. Finally, we show that the application of decision curve analysis to case-control studies, where an accurate estimate of the true prevalence of a disease cannot be obtained from the data, is achieved with a few modifications to the original calculation procedure.
CONCLUSIONS: We present several interrelated extensions to decision curve analysis that will both facilitate its interpretation and broaden its potential area of application.


Year:  2011        PMID: 21696604      PMCID: PMC3148204          DOI: 10.1186/1472-6947-11-45

Source DB:  PubMed          Journal:  BMC Med Inform Decis Mak        ISSN: 1472-6947            Impact factor:   2.796


Background

The decision to administer or not to administer a treatment against some disease is frequently based on an estimated probability p_i that an individual i has the disease (or – in a prognostic setting – will develop the disease), typically obtained using a prediction model in the broadest sense. A treatment is then administered if p_i is high enough, whereas no treatment is administered if p_i is too low. To judge whether p_i is high enough, one should weigh the profit P obtained by treating an individual with the disease against the loss L caused by treating an individual without the disease. The rationale is to treat individual i if and only if p_i·P > (1 - p_i)·L, i.e. if the expected profit is higher than the expected loss. Equivalently, the treatment is administered if and only if p_i > p_t, where p_t is some threshold probability defined by p_t/(1 - p_t) = L/P, that is, p_t = L/(L + P). The threshold probability p_t, and hence the decision to opt or not to opt for the treatment, is thus a one-to-one function of the ratio L/P, which expresses how a clinician or a patient weighs the relative harms of false positive and false negative results. This quantity is typically subjective and will vary from clinician to clinician and from patient to patient. In what follows, we just assume that 0 < p_t < 1. Decision curve analysis displays the net benefit achieved by a prediction model as a function of the threshold probability p_t [1]. It facilitates the comparison among alternative prediction models used to calculate p_i. As a consequence, it may facilitate the decision of which of several prediction models to select, typically as a result of a clinician or a patient favouring the model with the highest net benefit at their personally determined threshold probability. Two extreme examples of prediction models are a model for which p_i = 1 for all individuals (with the consequence of all of them being treated, whatever the threshold probability), and a model for which p_i = 0 for all individuals (with the consequence of none of them being treated, whatever the threshold probability).
On the other hand, a model achieving a perfect prediction would provide p_i = 1 for individuals with the disease and p_i = 0 for individuals without the disease. Interestingly, a binary diagnostic or prognostic test with only two possible results (positive or negative), where an individual receives the treatment if and only if the test is positive, can be seen as a special case of a prediction model by letting p_i = 1 if the test is positive for individual i (such that p_i > p_t, whatever the threshold p_t), and letting p_i = 0 if the test is negative (such that p_i ≤ p_t, whatever the threshold p_t).

Consider now a random sample of the population (e.g. the sample used to calculate the p_i). Let a be the proportion of individuals with p_i > p_t and with the disease, let b be the proportion of individuals with p_i > p_t and without the disease, let c be the proportion of individuals with p_i ≤ p_t and with the disease, and let d be the proportion of individuals with p_i ≤ p_t and without the disease (we assume that an equality p_i = p_t is not possible, such that a + b + c + d = 1). The prevalence of the disease is denoted by π = a + c. Note that we use the letters a, b, c and d differently (following e.g. [2]) than Vickers et al. [1] (see Figure 1 therein), who used the same letters to denote utilities of the four possible results.
Figure 1

Illustration of the net benefit for the treated, for the untreated and overall. Decision curves based on the net benefit for the treated (a), the net benefit for the untreated (b) and the overall net benefit (c) for a simple example model and a complex example model, accompanied by the reference strategies of treating none or treating all and a hypothetical perfect prediction model ("omniscient"). Note that the addition of the decision curves for the treated (a) and the untreated (b) results in the decision curves in (c).

Vickers et al. [1] proposed to use the so-called net benefit for the evaluation of prediction models, and to calculate it, they proceed as follows. Focusing on the individuals who receive the treatment, i.e. those for whom p_i > p_t, the expected profit will be aP and the expected loss will be bL, the resulting benefit being U_T = aP - bL. This definition of the benefit corresponds to the "average profit per prediction" or "utility of the method" introduced by Peirce [3]. In many situations, determining actual values of P and L is not an obvious task. In decision curve analysis, it suffices to have an idea of their ratio L/P (or equivalently, of p_t). In order to make the benefit depend only on the ratio L/P, i.e. not on the actual values of P and L, Vickers et al. [1] divided the benefit by P, yielding the so-called net benefit, u_T = a - b·L/P, or, expressed as a function of p_t, u_T = a - b·p_t/(1 - p_t). In the extreme case where none of the individuals is treated, one has a = b = 0 and u_T = 0, which is constant whatever the threshold probability p_t. In the other extreme case where all individuals are treated, one has a = π, b = 1 - π and u_T = π - (1 - π)·p_t/(1 - p_t). This is a decreasing function of the threshold probability, ranging from the prevalence down to negative infinity. For a model achieving a perfect prediction, one has b = c = 0 and u_T = π, which is again constant whatever the threshold probability.
For many prediction models, the empirical finding is that the net benefit for the treated will be a roughly decreasing function of the threshold probability, approximately ranging from the prevalence down to zero. Interpreting the value of the net benefit is not quite straightforward. It is a difference between two complementary proportions summing up to π: those profiting from obtaining the treatment (since they really have the disease) minus those losing by obtaining the treatment (since they do not have the disease and will only suffer from the treatment), the latter weighted by the ratio L/P. The net benefit is thus equal to zero if the loss compensates the profit and can even be negative if the loss surpasses the profit. The maximum possible value of the net benefit is π, the prevalence of the disease, which is achieved only by a perfect prediction model.
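The extreme cases described above can be checked with a short numerical sketch. The code below is our own illustrative Python (not the authors' R package); the function and variable names are ours.

```python
import numpy as np

def net_benefit_treated(p, y, p_t):
    """u_T = a - b * p_t / (1 - p_t), where a and b are the proportions of
    treated-and-diseased and treated-and-healthy individuals at threshold p_t."""
    treated = p > p_t
    a = np.mean(treated & (y == 1))
    b = np.mean(treated & (y == 0))
    return a - b * p_t / (1 - p_t)

y = np.array([1, 1, 0, 0, 0])   # toy outcome, prevalence pi = 0.4
p_t = 0.25                       # personal threshold, i.e. L/P = 1/3

# Treat none (p_i = 0 for all): u_T = 0 at any threshold.
u_none = net_benefit_treated(np.zeros(5), y, p_t)
# Treat all (p_i = 1 for all): u_T = pi - (1 - pi) * p_t / (1 - p_t) = 0.2.
u_all = net_benefit_treated(np.ones(5), y, p_t)
# Perfect prediction (p_i equal to the outcome): u_T = pi = 0.4.
u_perfect = net_benefit_treated(y.astype(float), y, p_t)
print(u_none, u_all, u_perfect)
```

As stated in the text, the perfect model attains the maximum value π, and no strategy exceeds it.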

Application to an example

To illustrate the application of decision curve analysis, we make use of a data set which is available on the accompanying website of a textbook by Dupont [4]: http://biostat.mc.vanderbilt.edu/dupontwd/wddtext/data/2.20.Framingham.csv. It is a subset of the 40-year follow-up data set of the Framingham Heart Study (made available by Levy [5]). We define death due to coronary heart disease (CHD) within 30 years as our binary response, bluntly ignoring censoring in order to keep the example simple and to obtain a relatively high prevalence, both of which make the didactic presentation of the issues we would like to bring forward easier. In what follows, only individuals for whom all variables were measured are analysed (n = 4658). Since we have a total of 1403 deaths within 30 years, our observed prevalence is π = 0.30. We fitted two logistic regression models for the response variable: a simple model including only the explanatory variable serum cholesterol, and a complex model including the explanatory variables sex, age, BMI, serum cholesterol, diastolic blood pressure, and systolic blood pressure. For each of the two models, for the strategies to treat none or to treat all, as well as for a hypothetical perfect prediction model, the net benefit was calculated and plotted against threshold probabilities p_t ranging from 0 to 1, constituting five decision curves (Figure 1a). The complex model has a higher net benefit than the simple model over the whole range of p_t, the difference being sizeable and visible on the graph for values of p_t between c. 0.2 and 0.5 (Figure 1a). The two decision curves for the extreme strategies to treat none or all individuals serve as reference lines to judge whether a prediction model has any additional benefit. For the simple model, this is roughly the case for p_t ranging between c. 0.15 and 0.35, and for the complex model, for p_t ranging between c. 0.15 and 0.5 (Figure 1a).
The decision curve for a hypothetical perfect prediction model, which is equivalent to an omniscient strategy, serves as a reference line indicating the maximum net benefit that can be achieved (Figure 1a).
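The construction behind a plot like Figure 1a can be mimicked on synthetic data. The sketch below is ours: we generate artificial risk scores rather than refitting the Framingham models, so the resulting numbers are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
y = (rng.random(n) < 0.3).astype(int)          # outcome, prevalence about 0.3
# artificial risk scores: diseased individuals tend to score higher
p = np.clip(0.1 + 0.3 * y + 0.4 * rng.random(n), 0.01, 0.99)

def u_T(pred, p_t):
    """Net benefit for the treated at threshold p_t."""
    treated = pred > p_t
    a = np.mean(treated & (y == 1))
    b = np.mean(treated & (y == 0))
    return a - b * p_t / (1 - p_t)

grid = np.linspace(0.05, 0.95, 19)             # thresholds p_t
model = np.array([u_T(p, t) for t in grid])
treat_all = np.array([u_T(np.ones(n), t) for t in grid])
treat_none = np.zeros_like(grid)               # u_T = 0 when treating none
# Thresholds at which the model beats both reference strategies:
useful = grid[(model > treat_all) & (model > treat_none)]
print(useful.round(2))
```

Plotting `model`, `treat_all` and `treat_none` against `grid` would reproduce the shape of a decision curve plot; `useful` marks the range of thresholds where the model adds benefit.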

The benefit for the untreated

However, there is no apparent reason to focus solely on the individuals who receive the treatment when calculating the net benefit. One could also be interested in the profit and loss for the individuals not receiving the treatment, i.e. those for whom p_i ≤ p_t. For the latter, the expected profit will be dL and the expected loss will be cP, the resulting benefit for the untreated being U_U = dL - cP. The two benefits differ only by a constant, U_U = U_T + (1 - π)L - πP, as can be deduced from Baker et al. [6]. Thus, a model maximising U_T at some threshold probability p_t will also maximise U_U at this threshold. In the same spirit as Vickers et al. [1], who defined a net benefit for the treated, one could also define a net benefit for the untreated, dividing U_U by L and yielding u_U = d - c·P/L, or, expressed as a function of p_t, u_U = d - c·(1 - p_t)/p_t. This is the formula that Vickers et al. [1] would have obtained if they had reversed the coding of the outcome (i.e. if they had considered the individuals with the disease as being those without the disease, and the treated as being the untreated). Importantly, while a model maximising u_T will also maximise u_U, the difference between two models will (in general) be different with respect to u_T or u_U, as we shall see in our example. In the extreme case where all individuals are treated, one has c = d = 0 and u_U = 0, whatever the threshold probability. In the other extreme case where none of the individuals is treated, one has c = π, d = 1 - π and u_U = (1 - π) - π(1 - p_t)/p_t. This is an increasing function of the threshold probability, ranging from negative infinity up to one minus the prevalence. For a model achieving a perfect prediction, one has b = c = 0 and u_U = 1 - π, whatever the threshold probability. For many prediction models, the empirical finding is that the net benefit for the untreated will be a roughly increasing function of the threshold probability, approximately ranging from 0 up to one minus the prevalence.
The net benefit for the untreated is a difference between two complementary proportions summing up to 1 - π: those profiting from not obtaining the treatment (since they really do not have the disease) minus those losing by not obtaining the treatment (since they have the disease and would need the treatment), the latter weighted by P/L. Here too, the net benefit is equal to zero if the loss compensates the profit, and it can even be negative if the loss surpasses the profit. The maximum possible value of the net benefit for the untreated is 1 - π, one minus the prevalence of the disease. It is again only achieved by a perfect prediction model. When the net benefit for the untreated is applied to the example introduced above, the zero reference line is now represented by the strategy of treating all individuals, while the strategy of treating none of the individuals is now represented by a monotonically increasing line (Figure 1b). The decision curve for a hypothetical perfect prediction model is in this example higher for the untreated than for the treated (because the prevalence is smaller than 0.5; Figure 1b). The net benefit for both the simple and the complex model surpasses the two extreme strategies in the same range of p_t as in Figure 1a. Similarly, the complex model is superior to the simple model in the same range of p_t as in Figure 1a. However, the difference between the two models with respect to u_U is visible for values of p_t between c. 0.1 and 0.5 (whereas it was visible only between c. 0.2 and 0.5 with respect to u_T). Thus, while the two models achieve roughly the same benefit with respect to u_T for a clinician using a threshold of, say, p_t = 0.15, the complex model will appear to be superior to the simple model at that threshold with respect to u_U. In this regard, the use of the net benefit for the treated or of the net benefit for the untreated is not equivalent when comparing different models.
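The extreme cases for the untreated mirror those for the treated, which can again be checked numerically (our own illustrative sketch; names are ours).

```python
import numpy as np

def net_benefit_untreated(p, y, p_t):
    """u_U = d - c * (1 - p_t) / p_t, where d and c are the proportions of
    untreated-and-healthy and untreated-but-diseased individuals."""
    untreated = p <= p_t
    d = np.mean(untreated & (y == 0))
    c = np.mean(untreated & (y == 1))
    return d - c * (1 - p_t) / p_t

y = np.array([1, 1, 0, 0, 0])   # toy outcome, prevalence pi = 0.4
p_t = 0.25

# Treat all: nobody is untreated, so u_U = 0 at any threshold.
u_all = net_benefit_untreated(np.ones(5), y, p_t)
# Treat none: u_U = (1 - pi) - pi * (1 - p_t) / p_t = 0.6 - 0.4 * 3 = -0.6.
u_none = net_benefit_untreated(np.zeros(5), y, p_t)
# Perfect prediction: u_U = 1 - pi = 0.6 at any threshold.
u_perfect = net_benefit_untreated(y.astype(float), y, p_t)
print(u_all, u_none, u_perfect)
```

Note the symmetry with the net benefit for the treated: the roles of the treat-all and treat-none reference strategies are exchanged, as described in the text.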

The overall benefit

If one wishes to define a measure of the net benefit for all individuals (the treated and the untreated), it seems natural to add up the net benefits calculated for the treated and for the untreated, yielding an overall net benefit defined as u_O = (a + d) - b·L/P - c·P/L, or, expressed as a function of p_t, u_O = (a + d) - b·p_t/(1 - p_t) - c·(1 - p_t)/p_t. A model maximising u_O at some threshold probability p_t will also maximise both u_T and u_U at this threshold probability. Moreover, the difference between two models with respect to u_O is equal to the sum of the differences between these models with respect to u_T and to u_U, such that a visible difference between two models with respect to either of these quantities will also be visible with respect to u_O. The overall net benefit can be interpreted as a difference of complementary proportions summing up to 1: those profiting from being correctly classified (a + d) minus those losing by being incorrectly classified (b + c), the latter appropriately weighted to account for the fact that the values of P and L are different. As for the net benefits for the treated and for the untreated, the overall net benefit is equal to zero if the loss compensates the profit, and it can even be negative if the loss surpasses the profit. The maximum possible value of the overall net benefit is 1, which is achieved only by a perfect prediction model. This maximum value of 1 is certainly a sensible value since it implies that 100% of the individuals would profit from a diagnostic or prognostic test, which is precisely the case for a perfect prediction model. Applying the concept of an overall net benefit to our example is equivalent to adding up the curves in Figure 1a and b, resulting in the curves in Figure 1c.
The curves for the two extreme strategies of treating none and treating all are equal to the curves for the strategy of treating none in the case of the untreated and for the strategy of treating all in the case of the treated, respectively, since the complementary strategies are represented by zero lines (Figure 1c). The curves for the hypothetical perfect prediction model add up to 1 (Figure 1c). The curves for both the simple and the complex model now have the form of valley cross sections (Figure 1c). Again, the complex model surpasses the simple model over the whole range of p_t, the difference between the two models with respect to u_O being clearly visible between c. 0.1 and 0.5.
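The additivity u_O = u_T + u_U and the maximum value of 1 for a perfect model can be verified numerically. This is a self-contained sketch in Python (our own code, not the authors' R package).

```python
import numpy as np

def benefits(p, y, p_t):
    """Return (u_T, u_U, u_O) at threshold p_t, with u_O computed from its
    own formula so the identity u_O = u_T + u_U can be checked numerically."""
    treated = p > p_t
    a = np.mean(treated & (y == 1))
    b = np.mean(treated & (y == 0))
    c = np.mean(~treated & (y == 1))
    d = np.mean(~treated & (y == 0))
    u_T = a - b * p_t / (1 - p_t)
    u_U = d - c * (1 - p_t) / p_t
    u_O = (a + d) - b * p_t / (1 - p_t) - c * (1 - p_t) / p_t
    return u_T, u_U, u_O

rng = np.random.default_rng(1)
y = (rng.random(500) < 0.3).astype(int)               # prevalence about 0.3
p = np.clip(0.2 + 0.3 * y + 0.3 * rng.random(500), 0.01, 0.99)

u_T, u_U, u_O = benefits(p, y, 0.35)
print(abs(u_O - (u_T + u_U)) < 1e-9)                  # additivity holds
# A perfect prediction model attains the maximum u_O = 1 at any threshold:
u_perfect = benefits(y.astype(float), y, 0.35)[2]
print(abs(u_perfect - 1.0) < 1e-9)
```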

Relationships to ROC analysis

Vickers et al. [1] stress that with the introduction of decision curve analysis, they hope to overcome the schism between those solely interested in the accuracy and those solely interested in the utility of a prediction model. Whereas the latter select their threshold probability independently from the model as p_t = L/(L + P), which is a personal threshold as explained above, the former select their threshold probability (or cutoff probability, denoted here by p_c) to maximise some well-defined criterion. Thus, the resulting optimal p_c generally differs from model to model. There are many criteria which can be optimised, but as Baker et al. [6] point out, one of the most commonly applied ones is the Youden index, equal to sensitivity + specificity - 1. Baker et al. [6] show that the Youden index is equal to the so-called "science of the method", as again defined by Peirce [3], and as such does not bear the name of its actual author. Baker et al. [6] illustrate how these two kinds of threshold (the optimal p_c or the personal p_t) are related to a receiver operating characteristic (ROC) curve. Recall that a ROC curve shows all possible pairs (1 - specificity; sensitivity) achievable by a prediction model when varying the threshold probability. The ROC curve of the complex model from our example is drawn in Figure 2. The threshold p_c maximising the Youden index corresponds to the operating point on the ROC curve that lies highest above the diagonal from the lower left to the upper right corner (Figure 2a). To determine the operating point on the ROC curve corresponding to the personal threshold p_t = L/(L + P), which maximises the personal benefit u_T = a - b·p_t/(1 - p_t), first note that u_T = π·sensitivity - (1 - π)·(1 - specificity)·p_t/(1 - p_t).
Figure 2

Relationships of decision curve analysis to ROC curve analysis. Receiver operating characteristic (ROC) curve for the complex example model. (a) The optimal operating point based on maximising the Youden index is the point on the curve with the largest vertical distance to the diagonal from the lower left corner to the upper right corner. The optimal operating point is indicated with a circle, the diagonal with a hatched line. (b) The optimal operating point based on maximising the net benefit is the point on the curve for which the slope is equal to [(1 - π)/π][p_t/(1 - p_t)]. Two examples for p_t = 0.2 and p_t = 0.4 are indicated with circles, the corresponding slopes with hatched lines.

This follows from the fact that (estimates of) the sensitivity and the specificity are given by a/(a + c) and by d/(d + b), while (estimates of) π and 1 - π are given by a + c and by b + d. One may thus calculate the derivative of u_T with respect to 1 - specificity, given by π·d(sensitivity)/d(1 - specificity) - (1 - π)·p_t/(1 - p_t), and set it equal to zero, as explained in Baker et al. [6] (see also [7]). Thus p_c corresponds to the operating point on the ROC curve for which the slope is equal to [(1 - π)/π][p_t/(1 - p_t)]. Steyerberg et al. [8] noticed that choosing p_t = π implies p_c = p_t (as long as the optimal p_c maximising the Youden index corresponds to the operating point on the ROC curve for which the slope is equal to 1). For our example, the solution for two selected values, p_t = 0.2 and p_t = 0.4, is given in Figure 2b. Whereas an ROC curve allows one to read off the sensitivity and the specificity achieved at each personal p_t and to compare these values with what would be achieved using the optimal p_c, it might also be of interest to show the net benefit achieved by the optimal p_c as a function of p_t. This is illustrated in Figure 3 for the complex model of our example. One can see that the net benefit quickly drops below that of the extreme strategies of treating none or all as soon as p_t departs from p_c (which in our example was found to be 0.306).
This underlines that the concepts of accuracy and utility of a prediction model may differ drastically in practice.
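The rewriting of the net benefit in terms of sensitivity and specificity, u_T = π·Se - (1 - π)·(1 - Sp)·p_t/(1 - p_t), can be checked numerically on synthetic data (our own sketch; all names are ours).

```python
import numpy as np

# Generate an artificial binary outcome and risk scores.
rng = np.random.default_rng(2)
n = 1000
y = (rng.random(n) < 0.3).astype(int)
p = np.clip(0.2 + 0.3 * y + 0.4 * rng.random(n), 0.01, 0.99)

p_t = 0.4
treated = p > p_t
a = np.mean(treated & (y == 1)); b = np.mean(treated & (y == 0))
c = np.mean(~treated & (y == 1)); d = np.mean(~treated & (y == 0))

pi = a + c                      # empirical prevalence = a + c
se = a / (a + c)                # sensitivity = a / (a + c)
sp = d / (d + b)                # specificity = d / (d + b)

u_direct = a - b * p_t / (1 - p_t)
u_roc = pi * se - (1 - pi) * (1 - sp) * p_t / (1 - p_t)
print(abs(u_direct - u_roc) < 1e-12)   # True: the two expressions agree
```

This is the identity that links a point on the decision curve to a point on the ROC curve at the same threshold.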
Figure 3

Illustration of the net benefit achieved using an accurate binary diagnostic test. Dichotomising the complex example model with the optimal ROC cutoff probability based on the maximum Youden index (or Peirce's "science of the method") results in an arc-shaped decision curve, with respect to the net benefit for the treated (a), the net benefit for the untreated (b), or the overall net benefit (c).


Decision curve analysis in case-control studies

So far, we have implicitly assumed that our random sample was representative of the population of interest. If we now consider data from a case-control study, the empirical prevalence π will no longer be a consistent estimate of the true prevalence, which we shall denote by π0, since the cases are typically overrepresented. In this context, it is well known that one can still use logistic regression to consistently estimate odds ratios. However, a p_i calculated from a logistic regression model in a case-control study will no longer be a consistent estimate of the probability that individual i has the disease. If one knew π0, this probability could be estimated by an adjusted probability p*_i, defined such that p*_i/(1 - p*_i) = [π0/(1 - π0)]·[(1 - π)/π]·[p_i/(1 - p_i)], that is, p*_i = π0(1 - π)p_i / [π0(1 - π)p_i + (1 - π0)π(1 - p_i)]. To perform a decision curve analysis in a case-control study, one should hence know (or estimate) the true prevalence π0 using another source than the data themselves, typically the literature. In this context, one may calculate p*_i and opt for the treatment if and only if p*_i > p_t. Let then a' be the proportion of individuals with p*_i > p_t and with the disease, let b' be the proportion of individuals with p*_i > p_t and without the disease, let c' be the proportion of individuals with p*_i ≤ p_t and with the disease, and let d' be the proportion of individuals with p*_i ≤ p_t and without the disease (we further assume that an equality is not possible, such that a' + b' + c' + d' = 1). In line with the fact that the empirical prevalence π = a' + c' is not a consistent estimate of the true prevalence π0, the quantities a', b', c' and d' are not consistent estimates of the true proportions of true positive, false positive, false negative and true negative results, respectively. Note however that a'/(a' + c') and d'/(b' + d') are still consistent estimates of the true sensitivity and the true specificity (recall also that an ROC analysis can still be carried out in a case-control study).
To estimate the net benefit for the treated in a case-control study, one may thus use the mathematical expression given in the previous section, that is u_T = π0·[a'/(a' + c')] - (1 - π0)·[b'/(b' + d')]·p_t/(1 - p_t). Similarly, the net benefit for the untreated can be estimated as u_U = (1 - π0)·[d'/(b' + d')] - π0·[c'/(a' + c')]·(1 - p_t)/p_t, and the overall net benefit as u_O = u_T + u_U. If the true prevalence is unknown, the net benefit is a function of both the personal threshold p_t and the true prevalence π0. One can then perform several decision curve analyses over a plausible range of values of π0. Alternatively, for a given threshold p_t, one could show the net benefit achievable by the prediction models under consideration as a function of the unknown π0. Note that Pencina et al. [9] used a similar strategy to calculate a "net reclassification improvement" in the context of a case-control study. For illustrative purposes, let us pretend that the data of our example above originate from a case-control study and that the empirical prevalence π thus does not reflect the true prevalence π0. We have drawn three decision curves for which the true prevalence π0 is taken to be smaller than, higher than, or equal to the empirical prevalence π (Figure 4). One can see that all decision curves are both shifted and lowered or raised depending on the value of π0 (Figure 4a, b and c). As a consequence, the range of p_t where the prediction models have any additional benefit compared to the extreme strategies to treat none or all will depend on the postulated value of π0. More generally, a comparison among alternative prediction models (as well as the decision of which of them to select) will not only be made as a function of the threshold probability p_t but also of the assumed true prevalence π0 of the population of interest.
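The prevalence correction amounts to rescaling the model's odds, which can be written as a small helper (our own illustrative code; the function name is ours).

```python
import numpy as np

def adjust_probability(p, pi_sample, pi_true):
    """Rescale model odds by [pi_true/(1 - pi_true)] * [(1 - pi_sample)/pi_sample]
    to correct case-control probabilities for an externally known prevalence."""
    odds = p / (1 - p)
    k = (pi_true / (1 - pi_true)) * ((1 - pi_sample) / pi_sample)
    return k * odds / (1 + k * odds)

p = np.array([0.1, 0.3, 0.7])
# If the sample prevalence already equals the true prevalence,
# the adjustment is the identity:
print(adjust_probability(p, 0.3, 0.3))     # [0.1 0.3 0.7]
# A case-control sample with half cases but a true prevalence of 0.1:
# all probabilities shrink.
print(adjust_probability(p, 0.5, 0.1))
```

The adjusted probabilities would then be compared with p_t as usual, and the net benefit computed from the adjusted classification together with the assumed π0.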
Figure 4

Application of decision curve analysis to a case-control study. Decision curves for a simple model, a complex model, the reference strategies to treat none or all, and a hypothetical perfect prediction model based on the overall net benefit for an assumed true prevalence of π0 = 0.15 (a), π0 = 0.45 (b) and π0 = 0.30 (c), the latter corresponding to the empirical prevalence π. While (c) is a reasonable assumption if the data are representative of the population of interest, this may no longer be the case if the data originate from a case-control study.


Discussion

We have recalled the foundations of decision curve analysis and discussed some new aspects. In the original proposal, prediction models are compared with respect to the net benefit for the treated, which weighs the profits and losses for those receiving the treatment. Besides the net benefit for the treated, one could also be interested in the net benefit for the untreated, which weighs the profits and losses for those not receiving the treatment, as developed above. In fact, the net benefit for the untreated would correspond to the net benefit for the treated if one reversed the coding of the outcome. However, whereas a model maximising the net benefit for the treated will also maximise the net benefit for the untreated, the difference between two models will (in general) not be the same with respect to the net benefit for the treated and for the untreated. This actually implies that the original definition of the net benefit by Vickers et al. [1] is sensitive to the coding of the outcome, which may not be an optimal feature. To illustrate further how our clinical judgement could be altered when using one measure rather than the other, we compare a simple binary diagnostic test achieving 90% sensitivity and specificity with a complex binary diagnostic test achieving 99% sensitivity and specificity in a context where the prevalence is 0.5 (Figure 5, a fictitious example). One can see that the difference between the two tests is rather small with respect to the net benefit for the treated at a low threshold value p_t (Figure 5a), whereas it is much larger with respect to the net benefit for the untreated at the same threshold (Figure 5b).
Thus, a clinician with a low personal threshold may conclude that the complex binary diagnostic test is not much more useful than the simple one from a clinical point of view if he/she looks at the net benefit for the treated, whereas his/her conclusion may drastically change if he/she looks at the net benefit for the untreated. Note that these differences are not a consequence of small sample problems and that the sample size appears in none of the formulae provided in this paper.
Figure 5

Comparison of simple and complex binary diagnostic tests with respect to the net benefit. At a low threshold probability p_t, the difference between the two tests appears to be much higher with respect to the net benefit for the untreated (b) than with respect to the net benefit for the treated (a). The converse is true at a high threshold probability. A complete picture of the situation is conveyed using the overall net benefit (c). In this example, the simple test has 90% sensitivity and specificity and the complex test has 99% sensitivity and specificity in a context where the prevalence is 0.5.

In order to define a measure of the net benefit which is invariant with respect to the coding of the outcome, one may combine the net benefits for the treated and the untreated in a symmetric way. In this paper, we have simply considered their sum, referred to as the "overall net benefit" (as done in Figure 5c of our fictitious example). Another interesting property of the overall net benefit is that its maximum possible value is always 1 (even though it is difficult to reach). In contrast, the maximum possible value for the net benefit proposed by Vickers et al. [1] is equal to the prevalence of the disease, which will differ from study to study. We also note that the overall net benefit is not more complicated to calculate than the net benefit for the treated. A user-friendly R package [10] to calculate and draw decision curves based on the overall net benefit is available from the second author upon request. The crucial aspect of decision curve analysis is the choice of the threshold p_t, which may be viewed as a potential limitation of the method since it can be difficult to quantify the consequences of a misclassification (the harms of being a false positive or a false negative). Fortunately, the choice of the threshold depends only on the ratio L/P, not on the absolute values of P and L, which should already greatly simplify matters in practice.
This was one motivation of Vickers et al. [1] to consider a "net benefit" (in our notation, u_T) rather than an "absolute benefit" (in our notation, U_T). Interpreting the value of a net benefit is however not quite straightforward, and interpreting a difference between the net benefits achieved by two different models may be even more problematic. Recently, Tsalatsanis et al. [11] addressed this issue via a regret theory approach. On the other hand, the value of an absolute benefit would indeed be informative from a cost-effectiveness perspective (regarding the gain achieved when using the prediction model to decide whether an individual should be treated or not, compared to the strategy of treating nobody, as provided by U_T, compared to the strategy of treating everybody, as provided by U_U, or compared to the strategy of doing the contrary of what is predicted by the model, as provided by the sum of U_T and U_U). We note that it might be difficult to define a measure of the utility of a model which is invariant with respect to the coding of the outcome, which depends only on the ratio L/P, and which is easily interpretable. The overall net benefit considered in the present paper satisfies the first two properties, while it may still be difficult to interpret. To assist interpretation, we described the overall net benefit as a difference of two complementary proportions. In the special case where the harms of a false positive equal the harms of a false negative (the threshold being p_t = 0.5), the overall net benefit equals (a + d) - (b + c), which is simply the percentage of those correctly classified minus the percentage of those incorrectly classified. An overall net benefit of 0.2, for example, means that 60% of the subjects are correctly classified while 40% are not, the net benefit hence being 60% - 40% = 20%.
In the more general case where different personal values are assigned to profits and losses, this difference of complementary proportions is weighted accordingly, such that a net benefit of 0.2 retains the same "clinical value" whether in the context described above with p_t = 0.5 or in another context. In summary, we propose to use the overall net benefit instead of the net benefit for the treated in a decision curve analysis when the issue is to assess and compare several concurrent prediction models with respect to their utility. We also took the opportunity to recall the important difference which may arise between the notions of accuracy and utility of a prediction model, as illustrated by Figure 3. Finally, we have illustrated how to use decision curve analysis in the context of case-control studies by allowing the prevalence, in addition to the threshold probability, to vary over a sensible range. We hope that our remarks will facilitate the interpretation of decision curve analysis and motivate a more widespread adoption of the method.

Abbreviations

p_i: probability that an individual i has (or will develop) a disease of interest; p_t: threshold probability above which one opts for treatment; L: loss caused by treating an individual without the disease of interest; P: profit obtained by treating an individual with the disease of interest; a: proportion of individuals with p_i > p_t with the disease of interest; b: proportion of individuals with p_i > p_t without the disease of interest; c: proportion of individuals with p_i ≤ p_t with the disease of interest; d: proportion of individuals with p_i ≤ p_t without the disease of interest

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

This paper is the result of numerous discussions between VR and TZ. Both authors participated actively in the writing and editing of the manuscript. Both authors read and approved the final draft.

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1472-6947/11/45/prepub
References

1.  Basic principles of ROC analysis.

Authors:  C E Metz
Journal:  Semin Nucl Med       Date:  1978-10       Impact factor: 4.446

2.  The numerical measure of the success of predictions.

Authors:  C S Peirce
Journal:  Science       Date:  1884-11-14       Impact factor: 47.728

3.  Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers.

Authors:  Michael J Pencina; Ralph B D'Agostino; Ewout W Steyerberg
Journal:  Stat Med       Date:  2010-11-05       Impact factor: 2.373

4.  Decision curve analysis: a novel method for evaluating prediction models.

Authors:  Andrew J Vickers; Elena B Elkin
Journal:  Med Decis Making       Date:  2006 Nov-Dec       Impact factor: 2.583

5.  Assessing the performance of prediction models: a framework for traditional and novel measures.

Authors:  Ewout W Steyerberg; Andrew J Vickers; Nancy R Cook; Thomas Gerds; Mithat Gonen; Nancy Obuchowski; Michael J Pencina; Michael W Kattan
Journal:  Epidemiology       Date:  2010-01       Impact factor: 4.822

6.  A regret theory approach to decision curve analysis: a novel method for eliciting decision makers' preferences and decision-making.

Authors:  Athanasios Tsalatsanis; Iztok Hozo; Andrew Vickers; Benjamin Djulbegovic
Journal:  BMC Med Inform Decis Mak       Date:  2010-09-16       Impact factor: 2.796
