
Is Meta-Analysis for Utility Values Appropriate Given the Potential Impact Different Elicitation Methods Have on Values?

Abstract

A growing number of published articles report estimates from meta-analysis or meta-regression on health state utility values (HSUVs), with a view to providing input into decision-analytic models. Pooling HSUVs is problematic because different valuation methods and different preference-based measures (PBMs) can generate different values for exactly the same clinical health state. Existing meta-analyses of HSUVs are characterised by high levels of heterogeneity, and meta-regressions have identified significant (and substantial) impacts arising from the elicitation method used. The use of meta-regression with few utility values and inclusion criteria that extend beyond the required utility value has not helped. There is the potential to explore greater use of mapping between different PBMs and valuation methods prior to data synthesis, which could support greater use of pooling values. Researchers wishing to populate decision-analytic models have a responsibility to incorporate all high-quality evidence available. In relation to HSUVs, greater understanding of the differences between methods and greater consistency of methodology are required before this can be achieved.

Year: 2015 PMID: 26133293 PMCID: PMC4607715 DOI: 10.1007/s40273-015-0310-y

Source DB: PubMed Journal: Pharmacoeconomics ISSN: 1170-7690 Impact factor: 4.981

Key Points for Decision Makers

Introduction

The evaluation of healthcare technologies is increasingly reliant upon decision-analytic models. Where quality-adjusted life-years (QALYs) are used as the overall outcome measure for a decision model, each health state included in the model requires a health-related quality-of-life score or health state utility value (HSUV). Good practice in parameter estimation relies on the principles of evidence-based medicine and hence aims to include all (unbiased) evidence and to employ formal evidence synthesis techniques, with systematic review and meta-analysis [1] being the highest level of evidence. That said, the diversity of methods for generating QALYs [2] and the variability across the values generated by these different methods lead to a quandary over whether meta-analysis of utility values is appropriate.

We interpret utility here to mean a measure of the social judgement of the value of a particular health state. Health economists use a number of different methods to elicit that value, resulting in the same health state being attributed different (sometimes very different) utility scores. This variability arises from four factors: (1) who is asked (and when) to value health states (patients, ex-patients, or members of the public); (2) the technique used to extract preferences and estimate values [the most common being time trade-off (TTO), standard gamble (SG), visual analogue scale (VAS) and discrete choice experiment]; (3) different variants of each general method (such as the exact question wording, the mode of administration or the use of props); and (4) different preference-based measures (PBMs) or instruments with different descriptive systems, including different items and response options, valued using different methods.

Meta-analysis provides a means to pool data collected across a number of studies and produce a weighted average of the measure of interest, thereby generating a more precise estimate. Most HSUV studies report more than one mean utility value (e.g. patients may complete more than one PBM); consequently, any meta-analysis of HSUVs needs to adjust for the fact that these values will be correlated. Given the potential sources of variability of HSUVs, it is unsurprising that conventional tests find that pooled HSUVs reveal considerable heterogeneity (e.g. [3, 4]).
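As a rough illustration of the weighted average and the heterogeneity tests mentioned above, the following minimal sketch pools four mean utilities by inverse-variance weighting and reports Cochran's Q and the I² statistic. It assumes a simple fixed-effect model with independent estimates (so it ignores the within-study correlation noted above), and the means and standard errors are hypothetical values chosen for illustration, not data from any cited study.

```python
import math


def pool_fixed_effect(means, std_errors):
    """Inverse-variance weighted mean with Cochran's Q and I^2 (in percent)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * m for w, m in zip(weights, means)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations from the pooled mean
    q = sum(w * (m - pooled) ** 2 for w, m in zip(weights, means))
    df = len(means) - 1
    # I^2: share of total variation attributed to between-study heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, pooled_se, q, i2


# Hypothetical mean utilities for one health state, e.g. from different
# elicitation methods (illustrative numbers only)
means = [0.69, 0.56, 0.62, 0.75]
std_errors = [0.03, 0.02, 0.04, 0.03]

pooled, pooled_se, q, i2 = pool_fixed_effect(means, std_errors)
print(f"pooled mean = {pooled:.3f} (SE {pooled_se:.3f})")
print(f"Q = {q:.1f} on {len(means) - 1} df, I^2 = {i2:.0f}%")
```

The pooled standard error is smaller than that of any single study, which is the gain in precision; the large I² is the kind of heterogeneity that makes the pooled value hard to interpret.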

Existing Use of Meta-Analysis and Meta-Regression for Utility Values

Meta-regressions [5] allow researchers to explore heterogeneity and the impact of different elicitation methods. Existing meta-regressions (see Table 1) on HSUVs have found substantial differences in values between elicitation methods.

Table 1

Some example coefficients on utility instruments and elicitation methods in meta-regressions

References	Health states	Coefficient on utility instrument/elicitation method (all with p < 0.05)	Reference case
Sturza [6]	Lung cancer	Assessment of quality of life (AQoL) [7]: −0.263	SG
McLernon et al. [3]	Chronic liver disease states	TTO: 0.116; transformed VAS: 0.152	EQ-5D
Si et al. [4]	Hip fracture	SG: 0.36	EQ-5D
	Vertebral fracture	Health Utilities Index (HUI) [8]: 0.22	EQ-5D
Lung et al. [9]	Diabetes	TTO or SG: 0.068	EQ-5D
Wyld et al. [10]	Chronic kidney disease	Mapped EQ-5D: −0.14	TTO
Bremner et al. [11]	Prostate cancer	Quality of Well-being (QWB) [12]: −0.09	TTO
Djalalov et al. [13]	Colorectal cancer	SG: −0.13	TTO

SG standard gamble, TTO time trade-off, VAS visual analogue scale

These differences are worryingly large. Indeed, Sturza [6], reporting on her meta-regression for lung cancer, argued that since methodological factors affect utility values, lung cancer researchers “should avoid direct comparisons on lung cancer utility values elicited with dissimilar methods” (p. 691). Some HSUV synthesis has avoided some of these problems by only using meta-analysis on the EQ-5D (Peasgood et al. [14] for osteoporosis states; Doth et al. [15] for pain states) as this is the measure explicitly preferred by the National Institute for Health and Care Excellence (NICE) [16]. Others have conducted a separate meta-analysis for each overall method or instrument (Liem et al. [17] for renal replacement therapy states; Post et al. [18] for stroke; Mohiuddin and Payne [19] for depression).

Whilst a weighted average of EQ-5D values may be adequate for NICE Health Technology Appraisal submissions, for non-NICE submissions, we are left with a decision as to which value to use to populate a decision model. This choice is likely to impact substantially upon the mean values used (e.g. Mohiuddin and Payne [19] reported a pooled SG value for mild depression of 0.69 compared with only 0.56 for the pooled EQ-5D estimate) and on the final incremental cost-effectiveness ratios [20]. Furthermore, a meta-analysis on one particular instrument or method results in considerable loss of evidence and information, which goes against the researcher’s responsibility to incorporate all high-quality evidence available.
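To give a sense of how the choice between the two pooled values for mild depression could feed through to a cost-effectiveness ratio, here is a small worked sketch. Only the values 0.69 and 0.56 come from the text; the remission utility, time horizon and incremental cost are hypothetical assumptions, and holding the remission utility fixed across methods is a deliberate simplification.

```python
def qalys(utility, years):
    """QALYs accrued at a constant utility over a period of time."""
    return utility * years


def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio: cost per QALY gained."""
    return incremental_cost / incremental_qalys


# Pooled values for mild depression reported in the text (Mohiuddin and Payne [19])
mild_depression = {"SG": 0.69, "EQ-5D": 0.56}

# Hypothetical assumptions, for illustration only
REMISSION_UTILITY = 0.85   # assumed utility after successful treatment
YEARS = 1.0                # assumed time horizon
INCREMENTAL_COST = 2000.0  # assumed extra cost of treatment

results = {}
for method, utility in mild_depression.items():
    gain = qalys(REMISSION_UTILITY, YEARS) - qalys(utility, YEARS)
    results[method] = icer(INCREMENTAL_COST, gain)
    print(f"{method}: QALY gain {gain:.2f}, ICER {results[method]:,.0f} per QALY")
```

Under these assumptions the lower EQ-5D baseline implies a larger QALY gain and therefore a markedly lower ratio, so the same intervention could fall on either side of a decision threshold depending on which pooled value is used.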

Recommendations

How do we use the very best evidence when parameters vary considerably across methodologies? The problem may not be as bad as it first seems: the elicitation method differences identified in meta-regressions may be inflated. Firstly, some meta-regressions for HSUVs have been conducted on fairly small numbers of utility values. Secondly, meta-regressions have included values that do not appear to be measuring the same thing, i.e. the utility score on a scale of 0 (dead) to 1 (full health) representing how the relevant society views the value of a particular clinical health state.

Meta-regressions with only a few studies and considerable study heterogeneity run the risk of showing false positives [21]; hence, a dummy variable for the elicitation method may appear to be statistically significant when there is no true effect. Whilst there are no hard and fast rules for the appropriate sample size in meta-regression, a ratio of at least ten studies to each covariate is often recommended [5]. For meta-regressions of effectiveness, a minimum of four studies in a categorical subgroup variable has been recommended [22], while more are required to conduct significance testing. Meta-regressions of HSUVs have been conducted with small numbers of utility values (e.g. McLernon et al. [3] conducted a meta-regression with nine covariates and 40 utility values), and some have very few utility values in each category (e.g. Wyld et al. [10] included a covariate for Short Form 6 dimension with only one utility value identified that used this instrument).

The pooling of utility values should only be attempted where the data value the same clinical health state for the appropriate population. The breadth of the health state for which utility values are sought should be dictated by the economic model, and utility values should confidently reflect the exact health state required.
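The false-positive risk described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reproduction of any cited analysis: it assumes a fixed-effect meta-regression with a single 0/1 elicitation-method dummy, equal standard errors across studies, and no true method effect, and all numbers (ten studies, a mean utility of 0.7, the standard deviations) are hypothetical.

```python
import math
import random


def dummy_z(values, std_errors, groups):
    """z-statistic for a 0/1 dummy in a fixed-effect meta-regression.

    With one binary covariate, the weighted least squares coefficient equals
    the difference between the two inverse-variance pooled group means.
    """
    pooled = {}
    for g in (0, 1):
        w = [1.0 / s ** 2 for s, gi in zip(std_errors, groups) if gi == g]
        y = [v for v, gi in zip(values, groups) if gi == g]
        mean = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
        pooled[g] = (mean, 1.0 / sum(w))  # (pooled mean, its variance)
    diff = pooled[1][0] - pooled[0][0]
    return diff / math.sqrt(pooled[0][1] + pooled[1][1])


def false_positive_rate(n_studies, tau, se, n_sims=2000, seed=1):
    """Share of simulations in which the dummy looks significant at 5%."""
    rng = random.Random(seed)
    groups = [i % 2 for i in range(n_studies)]  # alternate the two methods
    hits = 0
    for _ in range(n_sims):
        # Every study shares the same true mean (0.7): no real method effect.
        # tau is the between-study SD that the fixed-effect model ignores.
        values = [rng.gauss(0.7, math.sqrt(tau ** 2 + se ** 2)) for _ in groups]
        if abs(dummy_z(values, [se] * n_studies, groups)) > 1.96:
            hits += 1
    return hits / n_sims


rate_heterogeneous = false_positive_rate(n_studies=10, tau=0.10, se=0.02)
rate_homogeneous = false_positive_rate(n_studies=10, tau=0.0, se=0.02)
print(f"false-positive rate with heterogeneity:    {rate_heterogeneous:.2f}")
print(f"false-positive rate without heterogeneity: {rate_homogeneous:.2f}")
```

When unexplained between-study variation is large relative to the reported standard errors, the dummy is flagged as significant far more often than the nominal 5%, even though the method has no effect in the simulated data.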
Vignettes, which verbally describe a particular (hypothetical) clinical health state to allow individuals who are not in that particular health state to estimate a utility score, may have a useful role in populating economic models in the absence of any other utility values. However, they introduce another layer of uncertainty and may offer no additional benefit when values on the actual desired health state are available. In the meta-regression by Sturza [6], values derived from asking members of the public to link lung cancer vignettes to an EQ-5D state are included alongside direct patient EQ-5D responses without recognition of the superiority of the latter evidence. Making a judgement on whether a study is identifying a utility for the appropriate health state requires detailed information on the exact study population (including study selection, drop-out, missing values and clinical diagnosis), and this is unfortunately not always available [19]. When in doubt, preference should be for including only studies where it is reasonable to assume that the utility refers to the desired population.

The pooling of utility values should also only include utilities anchored on the dead to full-health scale. This would exclude values where the top anchor is symptom free (which would exclude some values used in Bremner et al. [11]) or ‘normal’ rather than full health (which would exclude some values used in Peasgood et al. [23], Tengs and Lin [24, 25] and Sturza [6]). Where there is uncertainty on whether the values really are utility scores, such as when the assessment method is not stated, these should not be included (which would exclude some values used in Tengs and Lin [25]).

It is possible that some PBMs may not adequately identify important aspects of a particular clinical health state. Where there is strong psychometric evidence that a particular instrument lacks validity for the health condition of interest (e.g. see Longworth et al. [26] for a review), a synthesis that excludes those values will be useful for sensitivity analysis.

Where an economic model is to be used to support decision making in a particular country, the desired utility values are those that give the social value of the health state as judged by the relevant population from that country. Utility scores using tariffs from other countries reflect different sets of preferences, and unless it is believed that preferences should be universal, or the value sets are very similar, the rationale for pooling utilities that use different country-specific tariffs is not clear. Considerable inter-country differences in the social tariff of the EQ-5D have been identified, with differences varying across the EQ-5D distribution [27]. Including a country-specific tariff dummy, and hence shifting the intercept, will not capture this variability across the distribution or differences in the weight given to different items in the instrument. To include utility data from other countries would require patient-level data to enable the appropriate social tariff to be applied or a mapping from one country tariff to another using more sophisticated methods (e.g. [28]).

Even where we have included only utility values on the same clinical health state, the identified utility values are still likely to show variability across instruments and elicitation methods. For PBMs, it is likely that the different descriptive systems drive the variation as much as differences in valuation method [29]. Including the instrument as an intercept term in a meta-regression is a limited approach as it does not pick up the relative weights attributed to the different domains within an instrument (including zero if the item is not included at all). An alternative approach would be to use mapping between instruments, at the aggregate or, if possible, the individual patient level.
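The limitation of an intercept shift, whether for a country-specific tariff or for an instrument, can be seen in a toy example. The two tariffs below are entirely hypothetical (a made-up two-item instrument with three severity levels, not any real EQ-5D value set); the sketch only shows that when item weights differ, the gap between two tariffs varies by health state and cannot be captured by a single constant.

```python
# Hypothetical utility decrements per severity level (0 = no problems) for a
# toy two-item instrument; these are NOT real country value sets.
TARIFF_A = {"mobility": [0.0, 0.10, 0.30], "pain": [0.0, 0.05, 0.40]}
TARIFF_B = {"mobility": [0.0, 0.05, 0.50], "pain": [0.0, 0.15, 0.25]}


def utility(state, tariff):
    """Utility of a health state: full health (1.0) minus item decrements."""
    return 1.0 - sum(tariff[item][level] for item, level in state.items())


# Enumerate every health state the toy descriptive system can describe
states = [{"mobility": m, "pain": p} for m in range(3) for p in range(3)]

# Gap between the two tariffs for the same described state
gaps = [utility(s, TARIFF_B) - utility(s, TARIFF_A) for s in states]

for state, gap in zip(states, gaps):
    print(f"mobility={state['mobility']} pain={state['pain']} gap={gap:+.2f}")

mean_gap = sum(gaps) / len(gaps)
print(f"a single dummy could at best recover a constant near {mean_gap:+.3f}")
print(f"but the gap ranges from {min(gaps):+.2f} to {max(gaps):+.2f}")
```

Because the gap changes sign across states in this example, a dummy variable would adjust some values in the wrong direction, which is why applying the appropriate tariff to patient-level data, or a more sophisticated mapping, is suggested instead.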
Mapped values may still differ from direct values in terms of both mean and variance (e.g. Wyld et al. [10] found EQ-5D values mapped from Short Form 12 and Short Form 36 to differ from direct EQ-5D values), and mapping may not be feasible where descriptive content does not substantially overlap. Where mapping is possible, however, the pooling of mapped utility values could offer a means of generating an estimate that incorporates more of the relevant evidence and has a smaller variance. That said, consideration should be given to the quality of the mapping function, particularly at the ends of the distribution [30], and the appropriateness of the population on which the mapping function was based.

In addition to generating a pooled mean value, consideration also needs to be given to an assessment of uncertainty of the parameter. Ara and Wailoo [31] note that this should incorporate the uncertainty from any mapping functions used, the uncertainty from tariff scores and uncertainty from the output of the descriptive system.

More generally, pooling HSUVs would be aided if there were greater consistency of valuation methods between instruments. Where instruments adopt different descriptive systems, effort could still be made to generate a social tariff that adopts a standardised methodology. This would facilitate greater understanding of the source of differences between instruments.

The advantages of adopting a systematic review of utility values to populate economic models are clear: the adoption of a clear methodology to follow in terms of searching (see [32]) and transparent reporting of findings. This includes details of study characteristics that would allow modellers to select the most appropriate value [33] for both the main model and any sensitivity analysis. The advantage of including a meta-analysis or meta-regression is the use of all available good-quality evidence in generating the value to be used.
Yet even with stricter inclusion criteria (excluding values that are not the appropriate utilities), we are still likely to be left with a considerable degree of heterogeneity across utility values. Higgins [34] has presented the case that in relation to study effect sizes “any amount of heterogeneity is acceptable, providing both that the predefined eligibility criteria for the meta-analysis are sound and that the data are correct” (p. 1158). Where we are aiming to measure the same thing (the social value of a particular health state), we ought to be able to combine values. More work is required on understanding sources of variation in utility values, particularly variation driven by differences in the descriptive system.

For England and Wales, the current NICE methods guide states that when it is necessary to take HSUVs from the literature “the methods of identification of the data should be systematic and transparent. The justification for choosing a particular data set should be clearly explained. When more than one plausible set of EQ-5D data is available, sensitivity analyses should be carried out to show the impact of the alternative utility values” [16]. This does not then imply a requirement for meta-analysis on EQ-5D values at present. However, given the growing number of publications that incorporate meta-analysis or meta-regression of HSUVs, this guidance may change in the future.

Searching and synthesis of health state utility values (HSUVs) to populate decision models should incorporate all good-quality evidence, but the variability of utility scores by elicitation methods generates a problem for pooling values through meta-analysis.

Stricter inclusion criteria for meta-regression or meta-analysis of HSUVs may help.

There is potential for greater use of mapping algorithms between HSUVs prior to meta-analysis, although careful consideration should be given to the appropriateness of the mapping function and the additional level of uncertainty associated with mapped values.

30 in total

1. Controlling the risk of spurious findings from meta-regression.

Authors: Julian P T Higgins; Simon G Thompson
Journal: Stat Med Date: 2004-06-15 Impact factor: 2.373

Review 2. Health-state utility values in breast cancer.

Authors: Tessa Peasgood; Sue E Ward; John Brazier
Journal: Expert Rev Pharmacoecon Outcomes Res Date: 2010-10 Impact factor: 2.217

3. A comparison of United States and United Kingdom EQ-5D health states valuations using a nonparametric Bayesian method.

Authors: Samer A Kharroubi; Anthony O'Hagan; John E Brazier
Journal: Stat Med Date: 2010-07-10 Impact factor: 2.373

Review 4. A systematic review and meta-analysis of utility-based quality of life for osteoporosis-related conditions.

Authors: L Si; T M Winzenberg; B de Graaff; A J Palmer
Journal: Osteoporos Int Date: 2014-02-22 Impact factor: 4.507

Review 5. A review and meta-analysis of utility values for lung cancer.

Authors: Julie Sturza
Journal: Med Decis Making Date: 2010-05-06 Impact factor: 2.583

Review 6. A Review and Meta-analysis of Colorectal Cancer Utilities.

Authors: Sandjar Djalalov; Linda Rabeneck; George Tomlinson; Karen E Bremner; Robert Hilsden; Jeffrey S Hoch
Journal: Med Decis Making Date: 2014-06-05 Impact factor: 2.583

Review 7. Health-state utilities in liver disease: a systematic review.

Authors: David J McLernon; John Dillon; Peter T Donnan
Journal: Med Decis Making Date: 2008-04-18 Impact factor: 2.583

8. Health state utility values for diabetic retinopathy: protocol for a systematic review and meta-analysis.

Authors: Christopher J Sampson; Jonathan C Tosh; Christopher P Cheyne; Deborah Broadbent; Marilyn James
Journal: Syst Rev Date: 2015-02-21

9. Why do multi-attribute utility instruments produce different utilities: the relative importance of the descriptive systems, scale and 'micro-utility' effects.

Authors: Jeff Richardson; Angelo Iezzi; Munir A Khan
Journal: Qual Life Res Date: 2015-01-31 Impact factor: 4.147

Review 10. A systematic review and meta-analysis of utility-based quality of life in chronic kidney disease treatments.

Authors: Melanie Wyld; Rachael Lisa Morton; Andrew Hayen; Kirsten Howard; Angela Claire Webster
Journal: PLoS Med Date: 2012-09-11 Impact factor: 11.069
