
The value of heterogeneity for cost-effectiveness subgroup analysis: conceptual framework and application.

Manuel A Espinoza1,2, Andrea Manca3, Karl Claxton3,4, Mark J Sculpher3.   

Abstract

This article develops a general framework to guide the use of subgroup cost-effectiveness analysis for decision making in a collectively funded health system. In doing so, it addresses 2 key policy questions, namely, the identification and selection of subgroups, while distinguishing 2 sources of potential value associated with heterogeneity. These are 1) the value of revealing the factors associated with heterogeneity in costs and outcomes using existing evidence (static value) and 2) the value of acquiring further subgroup-related evidence to resolve the uncertainty given the current understanding of heterogeneity (dynamic value). Consideration of these 2 sources of value can guide subgroup-specific treatment decisions and inform whether further research should be conducted to resolve uncertainty to explain variability in costs and outcomes. We apply the proposed methods to a cost-effectiveness analysis for the management of patients with acute coronary syndrome. This study presents the expected net benefits under current and perfect information when subgroups are defined based on the use and combination of 6 binary covariates. The results of the case study confirm the theoretical expectations. As more subgroups are considered, the marginal net benefit gains obtained under the current information show diminishing marginal returns, and the expected value of perfect information shows a decreasing trend. We present a suggested algorithm that synthesizes the results to guide policy.
© The Author(s) 2014.

Keywords:  cost-effectiveness analysis; heterogeneity; subgroup analysis; value of information

Year:  2014        PMID: 24944196      PMCID: PMC4232328          DOI: 10.1177/0272989X14538705

Source DB:  PubMed          Journal:  Med Decis Making        ISSN: 0272-989X            Impact factor:   2.583


Decisions based on average measures of cost-effectiveness may lead to incorrect treatment recommendations for specific subsets of the population.[1] This is because a treatment that is cost-effective for one type of patient may not be so for others. This type of heterogeneity can be ascribed to both individual and contextual-level factors,[2] which, if taken into account by the decision maker, would support a more efficient allocation of resources. Although the concept of subgroup analysis has a long history,[3,4] its adoption has been met with caution in some areas of health care decision making, possibly due to concerns related to low statistical power and multiple statistical testing.[5-7] Many authors have indicated their preference for using an average measure of treatment effect[4,8,9] and for findings from subgroup analyses being considered exploratory in nature.[8,10] More recently, there have been arguments against the use of strict inferential rules in assessing the effects of interventions for health care decision making.[11-14] This emphasizes the importance of considering evidence on subgroups of patients when making probabilistic statements about the (cost-) effectiveness of a given treatment strategy.[2,15] Several authors have made important contributions in this area in recent years. Phelps[16] introduced the idea that heterogeneity in cost-effectiveness can be explained by different factors (baseline risk, treatment efficacy, costs, and patient preferences), Coyle and others[15] proposed methods to quantify the potential health gains facilitated by making different decisions for different subgroups (stratified analysis), and Basu and Meltzer[17] extended this concept to decisions at the individual level (expected value of individualized care [EVIC]). 
All of these contributions have focused on the value of understanding the reasons for variability (i.e., translating variability into heterogeneity explainable by observable characteristics), but none has fully addressed how variability and uncertainty interact.[18] Nor have the implications for the relative priority of different types of research or the most appropriate level of discrimination (stratification) in differential access to care been fully explored. Several national health care agencies[19,20] responsible for issuing recommendations about the adoption of new medical technologies support the use of subgroup analyses when making decisions about new health technologies. For example, the methods guidance for technology appraisal issued by the National Institute for Health and Care Excellence (NICE) for England and Wales[19] states that “for many technologies, the capacity to benefit from treatment will differ for patients with differing characteristics. This should be explored as part of the reference-case analysis by the provision of estimates of clinical and cost-effectiveness separately for each relevant subgroup of patients.” Similarly, the technical guidance of the Canadian Agency for Drugs and Technologies in Health recommends “stratified analysis of smaller, more homogeneous subgroups, where appropriate, if there is variability (heterogeneity) in the target population.”[20] However, no specific guidance is offered for how to explore and reflect heterogeneity when conducting subgroup cost-effectiveness analyses to inform decisions. Given the current policy debate and government agenda relating to the personalization of health and social care services in the United Kingdom[21] and elsewhere,[22-24] a framework to support and guide decision making in different groups of patients is rapidly becoming a key policy need. 
This article presents a conceptual framework to explore heterogeneity between patients, consistent with the objective of maximizing population health subject to the resources available to a health care system. The framework builds on earlier work in this area,[15,17] adding the following elements: First, it introduces the efficiency frontier for subgroup analysis, an analytical tool that can be used to guide the choice of the optimal subgroup definition. Second, it characterizes 2 dimensions of the value of understanding heterogeneity: 1) the expected health gained because of stratified decisions and 2) the additional value of further data collection aimed at resolving subgroup-related uncertainty. The proposed framework is tested using a policy-relevant analysis as a simple extension of current cost-effectiveness methods. The article ends with a final section discussing the strengths and weaknesses of this work.

Subgroup Cost-Effectiveness Analysis Under Current Information

Net Benefits for Subgroup Cost-Effectiveness Analysis

Classical decision rules in cost-effectiveness analysis (CEA)[25] state that, under current information, the optimal strategy among j mutually exclusive alternatives, given θ, an uncertain vector of parameters, can be expressed as follows:

j* = argmax_j E_θ[NB(j, θ)].    (1)

That is, the optimal strategy is the one with the greatest expected net benefit (NB). If the total (present and future) patient population expected to benefit from the intervention is defined as

P = Σ_{t=1}^{T} I_t / (1 + r)^t,    (2)

where I_t represents the disease incidence in each period t, T indicates the period over which the technology is assumed to be relevant to clinical practice, and r is an appropriate discount rate, then the population expected NB can be estimated as

population ENB = P · max_j E_θ[NB(j, θ)].    (3)

Coyle and others[15] showed that by considering heterogeneity in treatment effect between patient subgroups within this framework (i.e., due to the presence of observed treatment effect modifiers), different recommendations could be made for different subgroups. This results in a greater expected NB compared with decisions based on the average across the patient population as a whole. Later, Basu and Meltzer[17] proposed EVIC, a metric that represents the additional value, in terms of NB, of making decisions at the level of the individual (patient) compared with that of the average population. EVIC can also be estimated for individual parameter(s), indicating the value of categorizing the population based on a particular (set of) parameter(s) to make individualized decisions about health care interventions. Using a slightly different notation from that used by Coyle and others,[15] the total incremental net benefit (TINB), with INB(j, θ) = NB(j, θ) − NB(j*, θ) and j ≠ j*, across S subgroups can be written here as

TINB = Σ_{s=1}^{S} w_s E_θ[INB_s(j, θ)],    (4)

where w_s ∈ (0, 1) is a weight indicating the proportion of the total population represented by subgroup s and Σ_{s=1}^{S} w_s = 1. Hence, the TINB is the weighted average of the incremental net benefit (INB) in each of the subgroups.
Since some of the subgroup-specific INBs may be negative, the TINB when the intervention is restricted to those subgroups with positive INB is

TINB⁺ = Σ_{s=1}^{S} w_s max[0, E_θ INB_s(j, θ)],    (5)

and the INB gained from reflecting heterogeneity in decisions—what Coyle and others[15] termed stratification—can be written as the negative sum of the population-weighted INB in those subgroups where INB is negative:

ΔINB = −Σ_{s: E_θ INB_s < 0} w_s E_θ[INB_s(j, θ)].    (6)

The estimation of the total net benefits (TNB) based on absolute NB values can be expressed more generally as

TNB = Σ_{s=1}^{S} w_s max_j E_θ[NB_s(j, θ)],    (7)

which is the weighted sum of the maximum NBs for each subgroup, with the Δ being

Δ = Σ_{s=1}^{S} w_s max_j E_θ[NB_s(j, θ)] − max_j E_θ[NB(j, θ)],    (8)

which corresponds to the difference between the weighted sum across subgroups (equation 7) and the maximum NB based on the population average.
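To make equations 4 to 6 concrete, the following sketch computes the TINB, the restricted TINB, and the stratification gain for a hypothetical 3-subgroup example. The weights and expected INBs are made up for illustration and are not values from the paper:

```python
def total_inb(weights, inb):
    """Eq. 4: population-weighted average of the subgroup expected INBs."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(w * b for w, b in zip(weights, inb))

def total_inb_restricted(weights, inb):
    """Eq. 5: TINB when treatment is restricted to subgroups with INB > 0."""
    return sum(w * b for w, b in zip(weights, inb) if b > 0)

def stratification_gain(weights, inb):
    """Eq. 6: negative of the weighted sum of the negative subgroup INBs."""
    return -sum(w * b for w, b in zip(weights, inb) if b < 0)

# Hypothetical example: 3 subgroups covering 50%, 30%, and 20% of patients.
w = [0.5, 0.3, 0.2]
inb = [0.10, -0.05, 0.20]  # expected per-patient INB in QALYs (illustrative)
print(total_inb(w, inb))             # ≈ 0.075
print(total_inb_restricted(w, inb))  # ≈ 0.09
print(stratification_gain(w, inb))   # ≈ 0.015
```

Note that the stratification gain equals the difference between the restricted and unrestricted TINB, as equations 5 and 6 imply.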

Definition of Subgroups and Sources of Heterogeneity

Cost-effectiveness analysis needs to assess heterogeneity in a wider set of parameters than those typically considered in clinical studies.[2,26] The analysis of clinical trials generally focuses on inferences about treatment effects for the patient population defined by the study's inclusion criteria. Interest in heterogeneity is generally confined to treatment effect moderators.[10] In addition, there may be clinical interest in heterogeneity in the underlying (or baseline) risk of adverse clinical events associated with a disease.[27] This can lead to subgroup differences in the absolute benefit conferred by a treatment offering a common proportionate risk reduction across the entire patient population. There may also be situations in which baseline risk is correlated with the relative treatment effect.[28] These sources of heterogeneity relating to the intervention and the disease are also important in CEA. In addition, resource use (and hence costs) may systematically vary between individuals based on their characteristics and the geographical location of their treatment.[29] Finally, heterogeneity in individual preferences (and hence benefit) is increasingly recognized as another key source of heterogeneity in cost-effectiveness.[3,26,30] There are, however, some potential constraints on subgroup analysis for CEA. One is the need to consider the costs of implementing subgroup-specific guidance in the health system. These could include the costs of acquiring relevant characteristics of individual patients and the costs of monitoring whether clinicians’ practice is consistent with the guidance for patients with those characteristics. There are also potential ethical and equity constraints when conducting subgroup CEA for decision making. 
For example, NICE considers it unethical to use age as a source of heterogeneity in its decisions unless age directly affects the efficacy of an intervention.[19] These constraints should be made explicit when subgroups are being defined. Thus, the first issue decision makers must address in this context is the choice between subgroup specifications (f), that is, the choice of covariates used to define individuals' membership of a particular subgroup. These variables should be biologically plausible and operationalizable in practice.[2] For example, patients at risk of cardiovascular events can be grouped on the basis of whether or not they have diabetes (f = 1), hypertension (f = 2), or a combination of both (f = 3). We propose a criterion based on efficiency (measured in terms of expected NB) to guide selection among alternative subgroup specifications. Consider an evaluation of the cost-effectiveness of 2 alternative treatments for non-ST elevation acute coronary syndrome with F possible subgroup specifications that could be defined by a given parameter. Assume now that only 2 subgroups are considered (S = 2) and that these can be obtained by subdividing the population using 3 alternative specifications: the first based on the presence or absence of diabetes (f = 1); the second based on high versus low baseline TIMI risk score,[31] a well-known score representing baseline risk (f = 2); and the third based on the presence or absence of a highly sensitive and specific biomarker (troponin; f = 3). The same variables can be used to define alternative specifications for 3 and 4 subgroups (e.g., the combination of 2 binary specifications, such as diabetes and troponin, defines a specification that produces 4 subgroups). For each subgroup, total expected NBs can be estimated based on existing evidence (current information).
The goal is to identify relevant subgroups and associated specifications that produce the highest expected NB, resolving the following maximization problem:

f* = argmax_{f ∈ F} Σ_{s=1}^{S_f} w_s max_j E_θ[NB_s(j, θ)].    (9)

Equation 9 suggests that the specification with the greatest total expected NBs should be preferred, given current information. Plotting the expected NB of each specification against its number of subgroups leads to an efficiency frontier similar to that depicted in Figure 1, which represents the range of possible expected NBs achievable using alternative specifications for each given number of subgroups.
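The maximization in equation 9 amounts to enumerating candidate specifications and, for each number of subgroups, keeping the one with the greatest total expected NB (equation 7). A minimal sketch, using hypothetical per-patient expected NBs (not trial estimates) for the diabetes and troponin specifications mentioned above:

```python
# {num_subgroups: {spec_name: [(weight, {strategy: expected NB}), ...]}}
# Expected NBs are in QALYs and purely illustrative.
specs = {
    1: {"whole population": [(1.0, {"conservative": 1.00, "invasive": 1.02})]},
    2: {
        "diabetes": [(0.3, {"conservative": 0.90, "invasive": 1.05}),
                     (0.7, {"conservative": 1.04, "invasive": 1.01})],
        "troponin": [(0.5, {"conservative": 0.95, "invasive": 1.06}),
                     (0.5, {"conservative": 1.05, "invasive": 0.98})],
    },
}

def total_nb(spec):
    """Eq. 7: weighted sum over subgroups of the best strategy's expected NB."""
    return sum(w * max(nb.values()) for w, nb in spec)

# Efficiency frontier: the best specification for each number of subgroups.
frontier = {S: max(fs, key=lambda f: total_nb(fs[f])) for S, fs in specs.items()}
print(frontier)  # {1: 'whole population', 2: 'troponin'}
```

Here the troponin specification dominates the diabetes specification at S = 2 because, in these made-up numbers, it separates patients into groups with more sharply different optimal strategies.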
Figure 1

Efficiency frontier for subgroup analysis. Note: The dotted line joins the potential best specifications for each number of subgroups (S), that is, the specification that provides the greatest net health benefit (NHB) for each level of disaggregation. Points represent the NHB expected from applying different decisions for different subgroups, where the same number of subgroups can be defined by different alternative specifications. The segment (A) shows that there is no value of heterogeneity between the points. Segment (B) illustrates the value of a specification used to define subgroups and (C) the value of considering additional subgroups.

The dotted line represents the frontier showing the set of the most efficient specifications for each number of subgroups. Figure 1 introduces 2 further important elements: first, the notion that there may be instances in which consideration of subgroups does not add any further societal benefit (in terms of expected NB) compared with what is achieved by providing a given treatment to the whole patient population. This would happen if the same treatment decision is appropriate for all subgroups (A). Second, in other cases, further exploration of subgroups offers an additional societal benefit, even given current information. This could be exclusively explained by the effect of a different specification for the same number of subgroups (B) or because additional numbers of subgroups have been taken into account (C).

Decision Uncertainty and the Value of Additional Research

The framework has so far not considered the role of uncertainty in CEA, which can be classified as structural and parameter uncertainty.[18,32] Although parameter uncertainty is the primary focus here, the same method of analysis can be applied if structural uncertainties are quantified or parameterized.[33] Parameter uncertainty reflects imperfect (imprecise) knowledge about the true mean value of a (set of) parameter(s) in the model. The existence of parameter uncertainty implies the possibility of making a wrong decision about which intervention is expected to be cost-effective (on average) for a target population or subpopulation of patients. Therefore, additional evidence is valuable because it can inform future decisions that will benefit future patients. Value of information (VoI) methods can be used to quantify the expected health that might be gained if the uncertainty surrounding decisions about the coverage or reimbursement of new health technologies were resolved.[14,34-36] This quantification requires an estimate of the value of making decisions once this uncertainty has been resolved (i.e., with perfect information or a sample size such that the probability of making the wrong decision is expected to be zero). In this circumstance, the decision maker would be able to select the intervention that maximizes NBs at the true value of the vector of parameters θ, that is,

max_j NB(j, θ).    (10)

Since the true value of θ is unknown, only the expected value of this quantity can be estimated by averaging the maximum NBs over the joint distribution of θ,

E_θ[max_j NB(j, θ)].    (11)

The expected value of perfect information (EVPI),

EVPI = E_θ[max_j NB(j, θ)] − max_j E_θ[NB(j, θ)],    (12)

is the difference between the NBs derived from a decision made with perfect information (equation 11) and the NBs derived from decisions under current information (equation 1). This represents the expected gain (in NBs), for a single patient, from collecting further information and resolving existing uncertainty.
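The per-patient EVPI in equation 12 is routinely estimated by Monte Carlo simulation: draw θ from its joint distribution, compute each strategy's NB per draw, then take the difference between the expected maximum and the maximum expectation. A minimal sketch, using a hypothetical normal distribution for the incremental NB rather than the case study model:

```python
import random
random.seed(1)

def evpi(nb_draws):
    """EVPI = E_theta[max_j NB(j, theta)] - max_j E_theta[NB(j, theta)].

    nb_draws: list of (nb_strategy0, nb_strategy1) pairs, one per
    simulated draw of the uncertain parameter vector theta.
    """
    n = len(nb_draws)
    e_max = sum(max(a, b) for a, b in nb_draws) / n
    max_e = max(sum(a for a, _ in nb_draws), sum(b for _, b in nb_draws)) / n
    return e_max - max_e

# Hypothetical: strategy 1's incremental NB ~ Normal(0.02, 0.05) QALYs,
# so strategy 1 is optimal on average but the decision is uncertain.
draws = [(0.0, random.gauss(0.02, 0.05)) for _ in range(200_000)]
print(evpi(draws))  # positive: resolving the uncertainty has expected value
```

If one strategy dominates in every draw, `evpi` returns 0: further research cannot change the decision, so perfect information adds nothing.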
Notice that the EVPI for the patient population can be derived by multiplying equation 12 by equation 2. Claxton[14] showed that EVPI represents an upper bound on the value of a new research proposal aimed at resolving the current levels of uncertainty. As long as obtaining new information is less costly than the population EVPI, there may be a positive potential payoff from further research.[37] Thus, a necessary condition is met when this payoff is positive; otherwise, investing in further research does not represent a good use of available resources. If mutually exclusive subgroups are considered, different decisions can be made for different subgroups. Thus, under current information, the decision maker will need to choose for each subgroup s the strategy with the maximum NB. Equation 1 can therefore be reexpressed as

max_j E_θ[NB_s(j, θ)],    (13)

with the expected value of the decision for subgroup s under perfect information being

E_θ[max_j NB_s(j, θ)],    (14)

and the EVPI for subgroup s given by

EVPI_s = E_θ[max_j NB_s(j, θ)] − max_j E_θ[NB_s(j, θ)].    (15)

The EVPI represents an upper bound for further research on the target population considering that different decisions can be made for different subgroups. This expression considers the overall uncertainty in the population, which includes the uncertainty given by both exchangeable and nonexchangeable parameters. (Parameters are exchangeable if the estimate used to inform the cost-effectiveness in one particular subgroup can be used to inform the cost-effectiveness in another, mutually exclusive subgroup.) Generalizing equation 15, the total EVPI when considering S subgroups is the average of the subgroup-specific EVPIs weighted by the proportion of each subgroup in the population:

EVPI_S = Σ_{s=1}^{S} w_s EVPI_s.    (16)

The population EVPI can be estimated by multiplying equation 16 by the future population of patients expected to benefit from the new information, which for subgroup s is given by

P_s = Σ_{t=1}^{T_s} I_{s,t} / (1 + r)^t,    (17)

where T_s is the period over which the information that could be collected is useful in subgroup s,[38] and I_{s,t} is the incidence in period t.
It follows that the population EVPI with S subgroups is

population EVPI_S = Σ_{s=1}^{S} EVPI_s · P_s,    (18)

where P_s is the discounted future patient population in subgroup s (equation 17). This quantity represents the maximum amount of resources that the health system should be willing to pay for further research given a particular cost-effectiveness threshold. This framework establishes a direct link between current decision uncertainty and the value of future research. Rather than considering uncertainty as a constraint on decision making, VoI highlights that its resolution is of value as a potential source of health gain.[14,34,39] This article extends this concept to encompass the value of resolving those aspects of variability (as opposed to uncertainty) that can be understood as heterogeneity and coins the term value of heterogeneity (VoH) to indicate the additional health gains obtained by understanding heterogeneity for decision making. In the proposed framework, VoH has 2 components. The first results from additional exploration of the existing evidence to identify, characterize, and quantify heterogeneity. This is termed static VoH, since it is not associated with new data collection. The second component reflects the value derived from collecting new evidence to reduce the sampling uncertainty associated with subgroup-specific parameter estimates (or parameters conditional on the value of specific covariates), which is defined as dynamic VoH. Both of these concepts can be illustrated graphically. Figure 2a illustrates the concept of EVPI. Here, for the whole population, the empty diamond marker represents the expected NB, expressed in health terms (net health benefit [NHB]) under current information, while the solid diamond marker indicates the maximum expected NHB achievable under perfect information, for one particular threshold value. The difference between these 2 quantities corresponds to the population-average EVPI.
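Equations 15 to 18 can be sketched directly: compute each subgroup's EVPI from its own NB draws, then scale by that subgroup's discounted future patient numbers. All inputs below are illustrative, not estimates from the RITA-3 analysis:

```python
def subgroup_evpi(nb_draws):
    """Eq. 15: E_theta[max_j NB_s] - max_j E_theta[NB_s] for one subgroup.
    nb_draws: list of per-draw tuples of NBs, one entry per strategy."""
    n = len(nb_draws)
    e_max = sum(max(d) for d in nb_draws) / n
    max_e = max(sum(col) / n for col in zip(*nb_draws))
    return e_max - max_e

def discounted_population(incidence, horizon, r):
    """Eq. 17: sum_t I_t / (1 + r)^t over the period the evidence stays useful.
    Assumes constant incidence per period, for simplicity."""
    return sum(incidence / (1 + r) ** t for t in range(1, horizon + 1))

def population_evpi(subgroups, r=0.035):
    """Eq. 18: sum over subgroups of (subgroup EVPI) x (discounted patients)."""
    return sum(subgroup_evpi(draws) * discounted_population(inc, horizon, r)
               for draws, inc, horizon in subgroups)

# Two hypothetical subgroups: (NB draws, annual incidence, useful horizon).
subs = [
    ([(0.0, 1.0), (0.0, -1.0)], 100, 10),  # decision uncertain: EVPI = 0.5
    ([(0.0, 1.0), (0.0, 2.0)], 50, 10),    # same winner in every draw: EVPI = 0
]
print(population_evpi(subs))  # only the uncertain subgroup contributes
```

Because the second subgroup's optimal strategy is the same in every draw, its EVPI is zero and research targeted at it would have no value, exactly the mechanism the text describes.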
Figure 2

Value of information, static and dynamic value of heterogeneity. (a) Value of information concept. Empty diamond represents the maximum expected net benefits (max EθNB), and the filled diamond shows the expected maximum net benefits (Eθmax NB). The difference between them is the expected value of perfect information (EVPI). (b) The distance A represents the EVPI for the average population. B is the EVPI for the population considering different decisions for 2 subgroups. C represents the static value of heterogeneity. D is the dynamic value of heterogeneity. In this case, where (A = B), acknowledging heterogeneity does not lead to a change in the expected value of further research. (c) The distances A and B represent the EVPI for the average and 2-subgroup case, respectively. In this case, B is greater than A. Distance C is equal to zero; hence, there is no static value of heterogeneity. D is greater than zero, representing a positive dynamic value of heterogeneity that is given by heterogeneity. (d) The distances A and B represent the EVPI for the average and 2-subgroup case, respectively. In this case, A is greater than B. The static value of heterogeneity (C) and the dynamic value of heterogeneity (D) are positive; however, D is smaller than C.

Figure 2b shows the case with 2 subgroups when there is value in identifying and reflecting heterogeneity using existing evidence, over and above the value associated with undertaking further research. Here, the total expected NHB with current information (represented by the empty markers) is greater when subgroups are considered. The difference between the total NHB of a decision for the entire population and the total NHB when considering subgroups (represented by the vertical distance C) is the static value of heterogeneity. A formal expression for this concept is equation 8.
Notice that the scenario depicted in Figure 2b indicates that, even if additional data could be collected through new research to resolve any decision uncertainty for the subgroups, the expected NHB to be gained would be similar to what could be derived from the population-average case. This is indicated by the fact that the vertical distance B is equal to the vertical distance A. In this scenario, a policy maker might be interested in making different decisions for different subgroups, according to the evidence available, and investing in further research would still be worthwhile if this were aimed at resolving uncertainties not associated with heterogeneity. Figure 2c shows a further scenario. Here, the same decision would be made for 2 subgroups under current information, which would yield the same total expected NHB as for the whole population. However, the estimate under perfect information obtained when considering subgroups (represented by the solid square marker) is greater than the NHB under perfect information derived from considering the population as a whole (represented by the solid diamond marker). The difference between these 2 quantities (indicated by the vertical distance D) is the dynamic value of heterogeneity. This value corresponds to the additional (population) health that is expected to be achieved when a sufficiently large sample is collected to resolve current uncertainty for a given stratification. This captures 2 sources of value: 1) the value of resolving uncertainty in the estimates of conditional parameters (parameter estimates conditional on the subgroup category determined by a particular specification) and 2) the value of estimating conditional parameters if uncertainty in their estimation could be resolved. The first of these values refers solely to uncertainty, that is, the difference between the expected maximum NHB and the maximum expected NHB (i.e., EVPI). 
The second is the value of heterogeneity with perfect information about mean parameter values. Finally, Figure 2d illustrates a situation in which there are both static and dynamic values of heterogeneity (C > 0 and D > 0). However, a particular feature of this example is that the EVPI when considering subgroups (distance B) is less than the average EVPI (distance A). This would occur when the specification used to define subgroups is informative about heterogeneity. In this situation, the effect is not only observed under perfect information but also under current information (positive static value). Hence, the difference between current and perfect information is lower than the average.
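The static and dynamic components distinguished above can be computed from the same simulated NB draws: static VoH compares subgroup decisions with the population decision under current information, and dynamic VoH makes the same comparison under perfect information. The toy numbers below are constructed to reproduce the Figure 2c scenario (zero static, positive dynamic value); they are hypothetical, not drawn from the paper:

```python
def max_e(nb_draws):
    """Maximum expected NB: value of a decision under current information."""
    n = len(nb_draws)
    return max(sum(col) / n for col in zip(*nb_draws))

def e_max(nb_draws):
    """Expected maximum NB: value of a decision under perfect information."""
    return sum(max(d) for d in nb_draws) / len(nb_draws)

def static_voh(pop_draws, subgroups):
    """Distance C: gain under CURRENT information from subgroup decisions.
    subgroups: list of (nb_draws, population weight) pairs."""
    return sum(w * max_e(d) for d, w in subgroups) - max_e(pop_draws)

def dynamic_voh(pop_draws, subgroups):
    """Distance D: extra gain under PERFECT information from subgroups."""
    return sum(w * e_max(d) for d, w in subgroups) - e_max(pop_draws)

# Two equal subgroups whose incremental effects cancel on average, so the
# whole-population decision is a toss-up but subgroup evidence is valuable.
sub_a = [(0.0, 0.6), (0.0, -0.6)]
sub_b = [(0.0, -0.6), (0.0, 0.6)]
pop = [(0.0, 0.0), (0.0, 0.0)]
print(static_voh(pop, [(sub_a, 0.5), (sub_b, 0.5)]))   # 0.0 (Figure 2c)
print(dynamic_voh(pop, [(sub_a, 0.5), (sub_b, 0.5)]))  # 0.3
```

Under current information the same decision is made everywhere, so the static value is zero; yet resolving subgroup uncertainty would let different subgroups receive different strategies, giving a positive dynamic value.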

Presentation of Subgroup Analysis and Choice of the Optimal Number of Subgroups: The Role of the Cost Function

By considering the static and dynamic dimensions of heterogeneity simultaneously, the trend in expected NHBs, as a function of the number of subgroups, can be shown graphically for both current and perfect information. Several alternative specifications are available for each number of subgroups. These range from no subgroups (indicated by the vertical bar on the left in Figure 3) to decisions at the individual level, where the treatment is chosen according to the comparison between the observed outcome and its counterfactual (indicated by the vertical bar on the right in Figure 3). It is, therefore, possible to plot the most efficient specifications, that is, those with the highest NHB for each number of subgroups. In addition, those specifications that have lower NHB under current information but are expected to produce higher NHB with perfect information might also be included in the graph.
Figure 3

Representation of the gain in expected net health benefit with current and perfect information when heterogeneity is considered. Note: The continuous line shows the theoretical efficiency frontier for subgroup analysis. The dashed line illustrates the potential health gains if decisions with perfect information were made for each level of disaggregation. The 2 curves do not converge completely at the individual level because the counterfactual will never be measured without uncertainty. The transaction cost function shows 2 kink points, delimiting a segment where the additional costs are higher than the additional gains. This represents the case in which the optimal number of subgroups is the lower of those 2 points.

In principle, if there is no residual unexplained heterogeneity (i.e., there is complete knowledge of the individual characteristics that determine variability), the decision maker has the best possible information to allocate resources efficiently (the right end of Figure 3). At the other extreme, the maximum value of exploring heterogeneity is observed when no subgroups have been taken into account. In terms of uncertainty, it is not necessarily the case that the maximum decision uncertainty is at the average population level. Indeed, increasing the number of subgroups could increase or decrease uncertainty, depending on how informative the specification used is and the reduction of the sample size due to the number of subgroups considered. It would generally be expected, however, that if informative specifications are used to explain heterogeneity, then decision uncertainty should decrease as more heterogeneity is revealed. This uncertainty will, however, never be completely resolved because the true value of the individual treatment effect can never be measured, as the counterfactual can never be observed.
The optimal number of subgroups would tend toward n if there were no costs associated with implementing recommendations for finer levels of stratification. However, there are a number of costs that are likely to be incurred as part of this process. Individualized care (n) will not be optimal if the marginal costs of increased stratification exceed the marginal benefits before n is reached. These costs include the transaction cost of implementing increasingly complex recommendations, monitoring and enforcement costs, and the opportunity costs of failing to restrict access to care that is not cost-effective. For example, the additional effort of implementing a finer level of stratification might include additional health professionals’ time, the cost of additional diagnostic tests required to categorize the patient, and the cost of dissemination and implementation strategies. They also include the effort required to monitor and then enforce compliance with restricted access and the additional opportunity costs of failing to enforce differential access when such further stratification makes effective monitoring more difficult. Since patients and their clinicians will not fully account for health care system costs that fall on others when making treatment choices, there is always a risk that providing guidance that adds more granularity to subgroup-specific decisions will not be adhered to and patients will receive interventions that are not cost-effective.[40] Assuming that all transaction costs fall on the health budget, this cost function can be expressed in terms of forgone NHB using the cost-effectiveness threshold and compared with the NHB gained due to further understanding of heterogeneity (Figure 3). In general, the implementation, monitoring, and effective enforcement of a guideline will tend to become more costly as finer stratification is made. 
Therefore, the optimal level of stratification depends on the marginal costs and benefits associated with finer stratification. This might be expressed as the ratio between the incremental net benefits and the additional transaction costs of 2 adjacent levels of disaggregation (e.g., 1 and 2 subgroups). If the ratio is lower than 1, the next relevant comparison is between levels 1 and 3. If the ratio is greater than 1, then 2 subgroups are better than 1, and the next comparison should be 2 against 3. The key qualitative conclusion, however, is that individualized care is not necessarily optimal. The most appropriate level of stratification will depend on context, such as the nature of the health care system (e.g., the ease of monitoring and the enforcement possibilities available to the decision maker), the nature of the characteristics that can be used to stratify (which must be easily observable and not easily manipulated), and the type of incentives faced by patients and their clinicians (e.g., third-party payment combined with fee for service tends to increase the incentives for moral hazard and the opportunity costs associated with it).[40]
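The adjacent-level comparison described above can be sketched as a greedy walk along the frontier. The function and its inputs are hypothetical, and transaction costs are assumed to have already been converted into forgone NHB via the threshold, so "ratio > 1" is equivalent to "incremental gain > incremental cost":

```python
def optimal_level(nhb, tcost):
    """Walk the efficiency frontier for subgroup analysis: adopt a finer
    level of stratification only if its incremental NHB exceeds the
    incremental transaction cost (both expressed in net-QALYs).

    nhb[k], tcost[k]: expected NHB and transaction cost at level k,
    ordered from coarsest (k = 0) to finest stratification.
    """
    best = 0
    for k in range(1, len(nhb)):
        gain = nhb[k] - nhb[best]       # incremental net benefit
        cost = tcost[k] - tcost[best]   # incremental transaction cost
        if gain > cost:                 # i.e., benefit/cost ratio > 1
            best = k
    return best

# Hypothetical frontier: a large gain moving from 1 to 2 subgroups, then
# gains that no longer cover the extra transaction costs.
print(optimal_level([10.0, 14.0, 15.0, 15.2], [0.0, 1.0, 3.0, 6.0]))  # -> 1
```

Note that the search does not stop at the first rejected level: a later, finer level is still compared against the current best, matching the "1 versus 3" step in the text.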

Case Study: Subgroup Analysis of the Cost-Effectiveness of an Invasive Treatment for Acute Coronary Syndrome

Background and Methods

The applicability of the framework and methods described so far is demonstrated using a case study. The efficiency frontier for subgroups, the EVIC, and the static and dynamic value of heterogeneity were estimated for a set of relevant specifications. The example is a CEA that used data from the multicenter trial RITA-3, which compared an intensive versus a conservative strategy for the management of patients with non–ST-elevation acute coronary syndrome.[41] Briefly, the study used estimates derived from the individual patient data (n = 1810) in the trial to populate a decision analytic model. Parameters used in the model (e.g., transition probabilities, costs, and quality-of-life weights) were estimated from a set of regression equations, specified conditional on a set of individual-level covariates. The study showed that invasive treatment had an incremental cost-effectiveness ratio (ICER) of £21 943/quality-adjusted life-year (QALY). Details of the model and the equations have been reported elsewhere.[41,42] The model was used to estimate the individual NHB of the invasive and conservative interventions using individual participant data from the randomized controlled trial. Between-individual variation could be characterized because each model parameter was estimated conditional on each individual's covariate profile. The mean costs and QALYs were averaged across individual estimates to calculate the ICERs. For subgroup analysis, the mean values were obtained as the average across those patients who belong to a particular category (e.g., diabetics).
This provided the information to estimate the EVIC, which can be calculated as the difference between the average of the maximum individual NHB and the maximum of the average NHB.[17,43] Parameter uncertainty was propagated through the model using probabilistic sensitivity analysis, which entailed running 1000 random draws from the (set of) parameter(s) characterizing each patient in the data set. This corresponds to the uncertainty surrounding the effect of the covariates on the estimation of the parameters of interest (e.g., transition probabilities, costs, or quality-of-life weights). This generated a total of 1 810 000 realizations per matrix; that is, 4 matrices of 1000 by 1810 (2 for the expected costs and 2 for the expected QALYs of the invasive and conservative strategies, respectively). The model was implemented in Microsoft Excel 2007, and macros were written in Visual Basic (Microsoft Corporation, Redmond, WA). These matrices, each in a different Excel sheet, provided the data needed to implement the analytical framework. The uncertainty relating to the overall mean results was estimated by averaging each of the 1000 iterations across the 1810 individuals, producing a single vector of 1000 iterations. Subgroup CEA considered all covariates used in the regression equations in the original analysis.[44] The potential for each covariate to inform different decisions based on cost-effectiveness was also assessed. A logit model was developed to examine the effect of each covariate on the probability that the new strategy is cost-effective for a particular individual at a cost-effectiveness threshold of £20 000 per QALY. All covariates were significantly associated with the probability of the invasive strategy's being cost-effective (Table 1).
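A stylized, self-contained version of this computation can be sketched with simulated stand-in matrices (all sizes, distributions, and numbers here are hypothetical, not the RITA-3 data). It reproduces the two definitions used in the text: EVPI compares the maximum of the average NHB with the average of the per-iteration maxima, and EVIC makes the same comparison at the individual-patient level:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sims, n_pat = 500, 200   # PSA iterations x patients (the study used 1000 x 1810)
lam = 20_000               # cost-effectiveness threshold (GBP per QALY)

# Stand-in PSA output: QALYs and costs per iteration and patient for the
# conservative and invasive strategies (all distributions hypothetical).
q_cons = rng.normal(8.0, 0.5, (n_sims, n_pat))
q_inv = q_cons + rng.normal(0.05, 0.20, (n_sims, n_pat))
c_cons = rng.normal(5_000, 500, (n_sims, n_pat))
c_inv = c_cons + rng.normal(1_500, 400, (n_sims, n_pat))

# Net health benefit per strategy: QALYs minus costs valued at the threshold.
nhb = np.stack([q_cons - c_cons / lam, q_inv - c_inv / lam])  # (2, sims, patients)

# Current information: adopt the single strategy with the highest mean NHB.
nhb_current = nhb.mean(axis=(1, 2)).max()

# Perfect information: pick the best strategy within each PSA iteration,
# then average; the difference is the EVPI.
nhb_perfect = nhb.mean(axis=2).max(axis=0).mean()
evpi = nhb_perfect - nhb_current

# EVIC: average of the per-patient maxima minus the maximum of the average.
evic = nhb.mean(axis=1).max(axis=0).mean() - nhb_current
```

Because a mean of maxima is never smaller than a maximum of means, both quantities are nonnegative by construction, mirroring the framework's interpretation of them as values of additional information.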
Table 1

Average Marginal Effects of 9 Covariates on the Probability That the Invasive Strategy Is the Most Cost-Effective at a Threshold of £20 000 per QALY

Covariate                        dF/dx     Standard Error    z (P > z)
Diabetes                         0.448     0.021             20.67 (P < 0.001)
Previous myocardial infarction   0.270     0.016             16.29 (P < 0.001)
Smoker                           0.359     0.015             23.89 (P < 0.001)
ST depression                    0.177     0.014             12.34 (P < 0.001)
Left bundle branch block         0.622     0.025             24.47 (P < 0.001)
Severe angina                    –0.082    0.013             –6.31 (P < 0.001)
Sex                              0.192     0.012             14.90 (P < 0.001)
Age                              0.009     0.0006            15.15 (P < 0.001)
Pulse                            0.04      0.002             19.29 (P < 0.001)

Note: Results are based on a multivariable logit model.

Six covariates were selected based on clinical plausibility, feasibility of implementation, ethical constraints, and the probability of being informative of cost-effectiveness according to the analysis presented in Table 1. Sex and age were excluded because decisions that differentiate reimbursement based on those characteristics are likely to be subject to ethical criticism. Pulse, a continuous variable, was excluded as a covariate because there is no consensus about how to categorize it, and it would be very difficult to implement alternative decisions based on an arbitrary definition. A further subgroup specification was defined based on a baseline risk score, as used in the original analysis.[44] The score was estimated from the trial data to predict the primary outcome. This baseline risk score was also used by the original cost-effectiveness study to explore heterogeneity in subgroups. Parameter uncertainty for subgroups was analyzed using the same approach described for individuals, but separately for each specific subgroup. These estimates provided the basis to calculate the static and dynamic values of heterogeneity. All of the results shown in this case study were calculated for a threshold (λ) of £20 000/QALY gained and expressed for an estimated population of 556 723 patients. This is based on an annual incidence of 59 756 patients, a time horizon of 10 y, and a discount rate of 3.5% per year.[45]
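As a sanity check, the stated incidence, horizon, and discount rate reproduce the population figure if one counts an undiscounted incident cohort in year 0 plus 10 further discounted annual cohorts; treating the first cohort as undiscounted is our assumption, since the paper does not spell out the timing convention:

```python
incidence = 59_756   # new patients per year
rate = 0.035         # annual discount rate

# Year-0 cohort undiscounted, plus cohorts discounted over a 10-y horizon.
population = sum(incidence / (1 + rate) ** t for t in range(11))
print(round(population))  # -> 556723
```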

Results

Table 2 reports the results of this analysis. As in the original study, the invasive strategy was found not to be cost-effective, on average, at λ = £20 000/QALY (ICER = £21 960/QALY). The expected NHB yielded by the most cost-effective strategy (the conservative strategy) is 4 397 388 net-QALYs. If further research is undertaken to resolve the current uncertainty, the expected NHB is 4 408 143 net-QALYs. The EVPI is, therefore, 10 755 net-QALYs (4 408 143 minus 4 397 388). The total population EVIC was estimated at 14 349 net-QALYs. This corresponds to the difference between the expected maximum individual NHB (4 411 737 net-QALYs) and the maximum expected NHB (4 397 388 net-QALYs). The individualized analysis also indicated that the new strategy should be implemented in 591 patients (32.65%) in the sample if the cost-effectiveness decision rule were applied to each patient.
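The headline quantities in this paragraph are simple differences of the reported NHB totals:

```python
nhb_current = 4_397_388   # best strategy under current information (net-QALYs)
nhb_perfect = 4_408_143   # expected NHB under perfect information
nhb_individ = 4_411_737   # expected maximum individual NHB

evpi = nhb_perfect - nhb_current   # value of resolving parameter uncertainty
evic = nhb_individ - nhb_current   # value of individualized care

print(evpi, evic)  # -> 10755 14349
```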
Table 2

NHBs under Current and Perfect Information, and EVPI for the Specifications on the Efficiency Frontier for Subgroup Analysis

Specification                                    NHB (Current Information)   NHB (Perfect Information)   EVPI
Average                                          4 397 388                   4 408 143                   10 755
DM (2 subgroups)                                 4 403 199                   4 412 177                    8 978
DM and LBBB (4 subgroups)                        4 404 566                   4 412 841                    8 275
DM and PMI and smoking (8 subgroups)             4 405 519                   4 414 587                    9 068
DM and PMI and smoking and LBBB (16 subgroups)   4 406 788                   4 415 294                    8 505
All covariates (49 subgroups)                    4 408 359                   4 416 806                    8 447

Note: The estimates consider a cost-effectiveness threshold of £20 000/QALY gained. NHB is expressed as QALYs net of costs. DM = diabetes mellitus; EVPI = expected value of perfect information; LBBB = left bundle branch block; NHB = net health benefit; PMI = previous myocardial infarction; QALY = quality-adjusted life-years; all covariates (including DM, LBBB, PMI, smoking, depression of the segment ST, and severe angina).

To identify those patients, subgroup analysis was conducted for 6 binary specifications: diabetes mellitus (DM), previous myocardial infarction, left bundle branch block, smoking, ST-segment depression on the electrocardiogram, and severe angina. These were combined to explore specifications of 2, 4, 8, 16, and 64 subgroups. The expected NHB under current and perfect information and the EVPI of the specifications with the highest NHB at each level of disaggregation (the specifications on the efficiency frontier) are reported in Table 2. Using all 6 covariates to characterize the subgroup specification yielded 49 potential subgroups, since the remaining 15 subgroups were not represented in the sample. This analysis produced the highest expected NHB (4 408 359 net-QALYs), which corresponds to a static value of heterogeneity of 10 971 net-QALYs, accounting for 76.5% of the total EVIC (10 971/14 349). By way of comparison, subgroup analysis based on patients' baseline risk score was conducted in a similar way to the original cost-effectiveness study.[44] Five subgroups were defined (4 quartiles, with the upper quartile divided into 2 eighths). The expected NHB was 4 407 074 net-QALYs, which is less than what is obtained by using a guideline that combines all covariates.
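The static value and its share of the EVIC quoted above follow directly from the reported figures:

```python
evic = 14_349                          # total population EVIC (net-QALYs)
static_value = 4_408_359 - 4_397_388   # all-covariate NHB minus average NHB
share = static_value / evic

print(static_value, round(100 * share, 1))  # -> 10971 76.5
```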
Figure 4 shows the efficiency frontier (maximum expected NHB for those specifications with the highest NHB) and the expected maximum NHB (expected net health with perfect information) for the same specifications on the efficiency frontier. These results are consistent with the hypothesis that the efficiency frontier for subgroup analysis shows diminishing marginal returns in terms of NHB; that is, as additional levels of stratification are assessed, the marginal gains between adjacent levels (e.g., 3 versus 2 or 5 versus 4 subgroups) are lower. The figure also shows that the EVPI tends to decrease with higher level of disaggregation. An additional element presented in Figure 4 is the only subgroup specification (DM and smoking) in which less NHB is obtained with current information but more health can be expected with perfect information. In this case, the decision about adoption or rejection is not affected (because it corresponds to the specification on the efficiency frontier); however, if further research is planned, this additional specification might be taken into account since greater health is expected when the uncertainty, conditional on that specification, is resolved.
Figure 4

Expected net health benefits under current and perfect information for different levels of disaggregation. Note: Expected net health benefits (NHBs) are expressed in quality-adjusted life-years (net of costs). Empty diamonds represent the expected NHB achieved with current information, only for the specification that presented the maximum net health for each level of disaggregation. The dotted line across those points illustrates the efficiency frontier for subgroup analysis estimated from the data for 2, 4, 8, 16, and 49 subgroups. Filled diamonds represent the expected NHB that might be achieved for its corresponding specification on the efficiency frontier if the parameter uncertainty was resolved (perfect information). The empty and filled squares correspond to the expected NHB under current and perfect information for the specification defined by diabetes and smoking (dm&smoking), which was the only case in which, despite not being on the efficiency frontier, greater net health can be expected with perfect information. The gray circles (projected on the secondary axis) are the expected value of perfect information, which is the difference between current and perfect information.

The transaction costs of implementing guidelines for the different levels of disaggregation reported here were not estimated directly. However, because all of these covariates are part of routine clinical assessment, their implementation as part of a guideline is not expected to be associated with high transaction costs. This is not the case for the baseline risk score examined here, which has not been clinically validated and, therefore, would be difficult to implement in practice. A guideline based on the analysis that produced the highest NHB is presented in Figure 5. This combines all 6 covariates considered in this analysis. It is based on 49 subgroups provided by the study, which were grouped when they led to the same decision.
The diagrams illustrate that a potentially complex scenario can be simplified into a manageable clinical guideline.
Figure 5

Guideline for the specification that combines 6 covariates based on the results of the case study.


Discussion

The article contributes to frameworks and analyses to inform decisions regarding where health care resources should be invested: providing early access to new technologies, ensuring the findings of existing (or commissioned) research are (or will be) implemented, conducting research to provide additional evidence about particular sources of uncertainty in some (or all) subgroups, or conducting research that can lead to a better understanding of variability in effects. This type of research may be very different from the type of evaluative or comparative effectiveness research that commonly reduces uncertainty only about estimates of treatment effect. For example, it might include diagnostic procedures and technologies, pharmacogenetics, analysis of observational data, and treatment selection as well as novel trial designs that can reveal something of the joint distribution of effects.[40] Importantly, the framework also informs policy makers about the assessments that need to be made when considering finer stratification of access to treatment or promoting individualized care and patient choice. The key implication is that individualized care is not necessarily optimal, and the most appropriate level of stratification will depend on context. Ethical and equity constraints have also been mentioned in this article, in the context of their relevance when defining subgroups. Although recommendations based on cost-effectiveness subgroup analysis are supported by strong ethical principles (e.g., the fair use of limited resources across different beneficiaries of the same health care system), it should be acknowledged that ethical and equity criteria reflected in social values other than efficiency should also be considered. It seems reasonable that these criteria should be defined in advance, in methods guidelines or in the scoping phase of technology assessment. 
The article's more specific contributions are, first, to identify the best potential specifications for resource allocation decisions by choosing those that maximize NHB at different levels of disaggregation. The set of specifications across levels of disaggregation represents the efficiency frontier for subgroup analysis, which provides a tool for decisions about adoption of technologies in different subgroups. Second, the value of heterogeneity has been conceptualized as a bidimensional concept: the value of making different recommendations across subgroups (static value) and the value of potential future research conducted to resolve parameter uncertainty at different levels of heterogeneity (dynamic value). Third, the article applies these concepts to a case study as an extension of the classical methods used for cost-effectiveness analysis, which makes the framework feasible for wide implementation. Static value has been presented previously in the literature[15,17] and represents the health gained by understanding heterogeneity (i.e., observable characteristics that explain differences between subgroups).[18] Basu and Meltzer[17] have previously presented this static value in a framework in which decisions can be made either with or without cost internalization. Although our work has focused on the estimation of static value with cost internalization (i.e., decisions take into account not only benefits but also opportunity costs), the static value is expected to be lower without cost internalization, as presented by Basu and Meltzer.[17] In addition, EVIC for specific parameter(s) (EVICθi) has been proposed as an informative metric with which to implement a subgroup-based policy.[43] The advantage of the EVICθi approach is that it can provide an estimate of the static value for a set of several parameters.
The methods presented in this article add information to EVICθi because they provide detailed information about the decisions that should be made in specific subgroups. Furthermore, this method provides a feasible approach to estimating the static value and the parameter uncertainty simultaneously, which is another important complement to the EVIC framework. The dynamic value, on the other hand, is the additional value of resolving second-order uncertainty in the future when 2 adjacent levels of disaggregation are compared. This value might be associated with an increase or a decrease in EVPI. Because the value of estimating conditional parameters is also observed under current information, it is to be anticipated that the difference between the expected maximum NHB and the maximum expected NHB (i.e., EVPI) will be smaller than at the average (or the previous level of disaggregation). This was the case for the results shown in the case study. In contrast, if a particular specification contains limited information, or the amount of data (sample size) available to examine its effect is too limited, EVPI can increase. Of course, in the limit, as more sources of variability are observed, the value of additional evidence will fall. Indeed, if all sources of variability could be observed, there would be no uncertainty and no value of information. Two issues that have not been addressed here are the value of the relevant metrics for specific parameters and the value of sampling information. Both correspond to different dimensions of the research space, and future extensions of this work might focus on these elements. The expected value of perfect information for parameters (EVPPI) provides information about the value of resolving the uncertainty of the effect of a given parameter θi on the net health outcome, shedding light on which parameters should be the focus of future research.
A related issue to clarify is that there is also uncertainty surrounding the categorization of patients into alternative subgroups (e.g., moderators of treatment effects). For example, we may be certain that a given patient has a genetic polymorphism (complete information), but there is uncertainty about the effect of this characteristic on the patient's expected (net) health outcome. More realistically, there may be uncertainty about both the effect of the polymorphism on the outcome and whether the patient has the polymorphism (since genetic tests are not 100% sensitive and specific). It is, therefore, important to emphasize that EVICθi assumes that the information at the individual level is accurate (i.e., a sensitivity and specificity of 1 at the individual level).[43] In principle, the uncertainty around the value that a covariate takes can be represented as another EVPPI for parameters relating to the diagnostic test. The EVPPI calculation is computationally demanding for the population analysis and may be even more difficult in the case of subgroups. An important concept mentioned in this article is that subgroup-specific parameters can be exchangeable or nonexchangeable. One parameter (θi) is exchangeable if the information used to estimate θi|s for one subgroup can be used to estimate θi|(1−s) for another subgroup. The population EVPI, estimated as the weighted average of the EVPI for each subgroup, captures the uncertainty given by both exchangeable and nonexchangeable parameters. It is, therefore, the informative metric with which to address the question of whether further research should be conducted in order to update a guideline for the entire population, considering that different decisions can be made in different subgroups with future information. However, we might be interested in conducting research in only 1 subgroup. In this case, we should choose the one that offers the highest population EVPI.
If the goal were to update the recommendation in only 1 subgroup, the EVPI is a good estimate of the dynamic value. If, however, the aim is to synthesize the new information and update the guideline for the whole population, the EVPI estimate as presented in this article underestimates the real EVPI for that subgroup, because it does not take into account the value of (previous) information provided by another subgroup.
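A minimal sketch of the weighted-average construction mentioned above, with hypothetical subgroup sizes and per-person EVPIs:

```python
def population_evpi(per_person_evpi, subgroup_sizes):
    """Total population EVPI: the sum over subgroups of per-person EVPI
    times subgroup size (equivalently, the size-weighted average per-person
    EVPI scaled by the total population)."""
    return sum(e * s for e, s in zip(per_person_evpi, subgroup_sizes))

# Two hypothetical subgroups: 750 patients with a per-person EVPI of
# 0.02 net-QALYs and 250 patients with a per-person EVPI of 0.05.
print(round(population_evpi([0.02, 0.05], [750, 250]), 2))  # -> 27.5
```

Under this construction, a subgroup contributes to the population total in proportion to its size, which is why research prioritized for a single subgroup should target the one with the highest population (not per-person) EVPI.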