Australian Utility Weights for the EORTC QLU-C10D, a Multi-Attribute Utility Instrument Derived from the Cancer-Specific Quality of Life Questionnaire, EORTC QLQ-C30.

Madeleine T King^1,2, Rosalie Viney³, A Simon Pickard⁴, Donna Rowen⁵, Neil K Aaronson⁶, John E Brazier⁵, David Cella⁷, Daniel S J Costa^8,9, Peter M Fayers^10,11, Georg Kemmler¹², Helen McTaggart-Cowen¹³, Rebecca Mercieca-Bebber^8,9, Stuart Peacock¹³, Deborah J Street³, Tracey A Young⁵, Richard Norman¹⁴.

Abstract

BACKGROUND: The EORTC QLU-C10D is a new multi-attribute utility instrument derived from the widely used cancer-specific quality-of-life (QOL) questionnaire, EORTC QLQ-C30. The QLU-C10D contains ten dimensions (Physical, Role, Social and Emotional Functioning; Pain, Fatigue, Sleep, Appetite, Nausea, Bowel Problems), each with four levels. To be used in cost-utility analysis, country-specific valuation sets are required.
OBJECTIVE: The aim of this study was to provide Australian utility weights for the QLU-C10D.
METHODS: An Australian online panel was quota-sampled to ensure population representativeness by sex and age (≥ 18 years). Participants completed a discrete choice experiment (DCE) consisting of 16 choice-pairs. Each pair comprised two QLU-C10D health states plus life expectancy. Data were analysed using conditional logistic regression, parameterised to fit the quality-adjusted life-year framework. Utility weights were calculated as the ratio of each QOL dimension-level coefficient to the coefficient on life expectancy.
RESULTS: A total of 1979 panel members opted in, 1904 (96%) completed at least one choice-pair, and 1846 (93%) completed all 16 choice-pairs. Dimension weights were generally monotonic: poorer levels within each dimension were generally associated with greater utility decrements. The dimensions that impacted most on choice were, in order, Physical Functioning, Pain, Role Functioning and Emotional Functioning. Oncology-relevant dimensions with moderate impact were Nausea and Bowel Problems. Fatigue, Trouble Sleeping and Appetite had relatively small impact. The value of the worst health state was -0.096, somewhat worse than death.
CONCLUSIONS: This study provides the first country-specific value set for the QLU-C10D, which can facilitate cost-utility analyses when applied to data collected with the EORTC QLQ-C30, prospectively and retrospectively.

Year: 2018 PMID: 29270835 PMCID: PMC5805814 DOI: 10.1007/s40273-017-0582-5

Source DB: PubMed Journal: Pharmacoeconomics ISSN: 1170-7690 Impact factor: 4.981

Key Points for Decision Makers

Introduction

Economic evaluation is central to the evaluation of new therapies and technologies in many countries. Cost-utility analysis (CUA) is a form of economic evaluation that quantifies health outcomes on a standardised metric, typically the quality-adjusted life-year (QALY). The quality adjustment is provided by a ‘value set’; that is, a set of utility weights for a range of possible health states within any given health state classification system. Value sets can be derived for classification systems originally developed as multi-attribute utility instruments (MAUI), or by adaptation of existing health-related quality of life (HRQoL) profile measures [1, 2]. Such a measure, the EORTC QLQ-C30, is a widely used core questionnaire in the modular HRQoL suite of the European Organisation for Research and Treatment of Cancer (EORTC) [3]. The Multi-Attribute Utility in Cancer (MAUCa) Consortium aims to facilitate the use of HRQoL data in CUA in cancer settings by providing a series of country-specific value sets for the QLQ-C30. To this end, we have developed a health state classification system containing 13 of the QLQ-C30’s 30 items, combined into ten dimensions (Table 1) [4] and a valuation method based on a discrete choice experiment (DCE) [5]. These are key components of the QLU-C10D, a new cancer-specific MAUI. The aim of this current paper is to apply the valuation method in an Australian general population sample to produce the first country-specific utility weights for the QLU-C10D.

Table 1

The QLU-C10D health state classification system, how it maps to the 13 component items from the QLQ-C30, and the duration attribute included in the discrete choice experiment (DCE) valuation survey

Dimension	Level	Stem	Descriptor	QLQ-C30 item scores
Physical functioning^a,b	1	You have…	No trouble taking a long walk outside of the house	Item 2 (long walk) = 1
	2		No trouble taking a short walk outside of the house, but at least a little trouble taking a long walk	Item 3 (short walk) = 1 AND Item 2 ≥ 2
	3		A little trouble taking a short walk outside of the house, and at least a little trouble taking a long walk	Item 3 = 2 AND Item 2 ≥ 2
	4		Quite a bit or very much trouble taking a short walk outside the house	Item 3 ≥ 3 AND Item 2 ≥ 2
Role functioning	1	You are limited in pursuing your work or other daily activities…	Not at all	Item 6 = 1
	2		A little	Item 6 = 2
	3		Quite a bit	Item 6 = 3
	4		Very much	Item 6 = 4
Social functioning^a,c	1	Your physical condition or medical treatment interferes with your social or family life…	Not at all	Items 26 AND 27 = 1
	2		A little	Items 26 OR 27 = 2^c
	3		Quite a bit	Items 26 OR 27 = 3^c
	4		Very much	Items 26 OR 27 = 4^c
Emotional functioning	1	You feel depressed…	Not at all	Item 24 = 1
	2		A little	Item 24 = 2
	3		Quite a bit	Item 24 = 3
	4		Very much	Item 24 = 4
Pain	1	You have pain…	Not at all	Item 9 = 1
	2		A little	Item 9 = 2
	3		Quite a bit	Item 9 = 3
	4		Very much	Item 9 = 4
Fatigue	1	You feel tired…	Not at all	Item 18 = 1
	2		A little	Item 18 = 2
	3		Quite a bit	Item 18 = 3
	4		Very much	Item 18 = 4
Sleep	1	You have trouble sleeping…	Not at all	Item 11 = 1
	2		A little	Item 11 = 2
	3		Quite a bit	Item 11 = 3
	4		Very much	Item 11 = 4
Appetite	1	You lack appetite…	Not at all	Item 13 = 1
	2		A little	Item 13 = 2
	3		Quite a bit	Item 13 = 3
	4		Very much	Item 13 = 4
Nausea	1	You feel nauseated…	Not at all	Item 14 = 1
	2		A little	Item 14 = 2
	3		Quite a bit	Item 14 = 3
	4		Very much	Item 14 = 4
Bowel problems^a,c	1	You…	Do not have constipation or diarrhoea at all	Items 16 AND 17 = 1
	2		Have a little constipation or diarrhoea	Items 16 OR 17 = 2^c
	3		Have constipation or diarrhoea quite a bit	Items 16 OR 17 = 3^c
	4		Have constipation or diarrhoea very much	Items 16 OR 17 = 4^c
Duration	1	You will live in this health state for…	1 year, and then die	Not applicable
	2		2 years, and then die	Not applicable
	3		5 years, and then die	Not applicable
	4		10 years, and then die	Not applicable

^a Three dimensions of the QLU-C10D each involve two QLQ-C30 items

^b The Physical Functioning dimension includes ‘long walk’ and ‘short walk’ from the QLQ-C30; for the DCE, the levels are determined together, but were presented in the DCE survey separately, as shown in Fig. 1

^c For social functioning and bowel problems, the QLU-C10D level is determined by the maximum value of the two component items

Fig. 1

An example choice set from the discrete choice experiment valuation task

Methods

The QLU-C10D

Table 1 shows the QLU-C10D health state classification system, and explains how the ten dimensions, each with four levels, map to 13 of the 30 items in the QLQ-C30. The derivation of this health state classification system is described elsewhere [4].
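The mapping in Table 1 can be expressed as a short function. The following is a minimal sketch, not the authors' code: the function names, the dictionary keyed by QLQ-C30 item number, and the two-letter dimension labels are illustrative assumptions. It also assumes that a response with no trouble on the long walk (Item 2 = 1) is level 1 regardless of the short-walk item, which is how Table 1 reads.

```python
# Order of dimensions assumed for the 10-digit health state string.
ORDER = ["PF", "RF", "SF", "EF", "PA", "FA", "TS", "AP", "NA", "BO"]


def qlu_c10d_levels(items):
    """Map QLQ-C30 item scores (1-4) to QLU-C10D levels following Table 1.

    `items` is a dict keyed by QLQ-C30 item number; only the 13 items
    used by the QLU-C10D are read.
    """
    long_walk, short_walk = items[2], items[3]
    if long_walk == 1:
        physical = 1          # Item 2 = 1
    elif short_walk == 1:
        physical = 2          # Item 3 = 1 AND Item 2 >= 2
    elif short_walk == 2:
        physical = 3          # Item 3 = 2 AND Item 2 >= 2
    else:
        physical = 4          # Item 3 >= 3 AND Item 2 >= 2
    return {
        "PF": physical,
        "RF": items[6],
        "SF": max(items[26], items[27]),   # worse of the two items
        "EF": items[24],
        "PA": items[9],
        "FA": items[18],
        "TS": items[11],
        "AP": items[13],
        "NA": items[14],
        "BO": max(items[16], items[17]),   # worse of the two items
    }


def state_string(levels):
    """Concatenate the ten levels into a health state label."""
    return "".join(str(levels[d]) for d in ORDER)
```

For instance, a respondent scoring 1 on every item except Item 6 = 3, Item 24 = 2 and Item 14 = 2 would be classified into state 1312111121.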

The Valuation Task: Discrete Choice Experiment (DCE) Presentation

The valuation task was based on methods developed for the Australian valuations for the EQ-5D(3L) and SF-6D instruments [6, 7]. The task involved choosing between two QLU-C10D health states, each with a specified duration (life years), described as ‘Situation A’ and ‘Situation B’ (Fig. 1). Because the QLU-C10D includes more dimensions than the EQ-5D(3L) or SF-6D, we first established the feasibility of the task, and pilot tested the DCE task wording, layout and presentation formats [5]. Choice sets were presented in a format preferred by participants in the QLU-C10D valuation methods experiment [5]; that is, dimensions that differed between situations A and B were highlighted in yellow. For the Physical Functioning dimension, the descriptors for levels 2 and 3 are quite complex (Table 1). To facilitate respondent understanding, we presented the two component items, ‘long walk’ and ‘short walk’, as two separate attributes in the survey (Fig. 1). Note that the Physical Functioning dimension was treated as one four-level dimension in the DCE design (online resource 1, see electronic supplementary material [ESM]) and data analysis.

Health States Valued: DCE Design

The QLU-C10D health state classification system has over a million possible health states (4^10 = 1,048,576). We employed a designed experiment to select 960 choice sets that would maximise statistical efficiency in estimating the utility model parameters. Health states were operationalised as 12 attributes in the DCE: one for duration, two to represent physical functioning (long and short walk), and one for each of the remaining nine QLU-C10D dimensions. Because 12 attributes is a relatively large number for respondents to consider simultaneously, we simplified the cognitive task by constraining the number of HRQoL dimensions that differed between health states in any given choice set to four, as done in the QLU-C10D valuation methods experiment [5], using the same experimental design. Briefly, we began with a balanced incomplete block design (BIBD) to define which four of the ten QLU-C10D dimensions differed within choice sets [8]. This BIBD was then duplicated. To determine the levels of these differing dimensions, a generator-based approach was employed, designed to allow estimation of main effects and all two-factor interactions involving duration [9]. The levels of the six dimensions that were constant between options were then developed using an orthogonal main effects plan. This follows the approach outlined by Demirkale et al. [10]. The final design comprised 1920 health states in 960 choice sets (online resource 1, see ESM). There were two levels of randomisation in the DCE component of the survey: (i) each respondent was randomly allocated 16 of the 960 choice sets without replacement; (ii) which option was seen as Situation A or B was randomised within each choice set to mitigate any ordering bias. The dimensions were always presented in the same order, as previous work showed that dimension order does not systematically bias utility weights for the QLU-C10D [11].
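The size of the state space and the two levels of randomisation described above can be sketched as follows. This is an illustration under stated assumptions only: the survey was administered by SurveyEngine and its actual allocation mechanism is not described here, so the function names and the use of Python's `random` module are hypothetical. The BIBD and generator-based design itself is not reproduced.

```python
import random

N_DIMENSIONS, N_LEVELS = 10, 4
N_CHOICE_SETS, SETS_PER_RESPONDENT = 960, 16


def n_health_states():
    """Number of distinct QLU-C10D health states (four levels, ten dimensions)."""
    return N_LEVELS ** N_DIMENSIONS


def allocate_choice_sets(rng):
    """Draw one respondent's DCE tasks.

    Returns a list of (choice_set_id, swap_ab) tuples:
    (i) 16 of the 960 choice sets are sampled without replacement;
    (ii) swap_ab says whether the two options trade places as
         Situation A / Situation B, decided independently per set.
    """
    chosen = rng.sample(range(N_CHOICE_SETS), SETS_PER_RESPONDENT)
    return [(cs, rng.random() < 0.5) for cs in chosen]
```

Passing a seeded `random.Random` instance makes a respondent's allocation reproducible, which is convenient when checking that no choice set is repeated within a respondent.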

Survey Content

All survey content was developed by the MAUCa Consortium. In addition to the DCE, the survey contained other components (Fig. 2). The self-reported health questions included the general health question of the SF-36 [12] and the Kessler-10 (mental health) questionnaire [13]. Sociodemographic questions were worded such that they could be mapped directly to normative data to enable assessment of our sample’s representativeness of the Australian general population (Table 2).

Fig. 2

Respondent flow and sample size for each component of the survey. DCE discrete choice experiment

Table 2

Self-reported health and sociodemographic characteristics of the sample compared with those of the Australian general population

Question	Level	Number	Proportion (or mean, x̄)	Population value	Statistic^a	p value
Sex	Male	913	0.49	0.49	χ² = 0.007	0.93
Sex	Female	943	0.51	0.51
Age (years)	18–29	409	0.22	0.22	χ² = 0.92	0.97
	30–39	334	0.18	0.18
	40–49	325	0.18	0.18
	50–59	301	0.16	0.17
	60–69	243	0.13	0.13
	70 or older	243	0.13	0.13
General Health Question (GHQ)	Excellent	206	0.10	0.10	χ² = 31.4	< 0.0001
	Very good	635	0.32	0.35
	Good	703	0.36	0.37
	Fair	343	0.17	0.15
	Poor	92	0.05	0.03
Mental health	Kessler-10	1822	x̄ = 17.81	μ = 14.50	t = 18.0	< 0.0001
Country of Birth	Australia	1359	0.74	0.79	χ² = 33.6	< 0.0001
	Other English-speaking	271	0.15	0.10
	Other	201	0.11	0.11
Highest level of education	Year 11 or below	299	0.16	0.28	χ² = 382.3	< 0.0001
	Year 12	340	0.19	0.17
	Trade certificate	280	0.15	0.24
	Diploma	309	0.17	0.09
	Bachelor’s degree	420	0.23	0.14
	Higher	183	0.10	0.09
ATSI status	Yes	153	0.08	0.05	χ² = 43.3	< 0.0001
ATSI status	No	1679	0.92	0.95
Marital status	Married (registered)	797	0.44	0.49	χ² = 49.9	< 0.0001
	Separated	55	0.03	0.03
	Divorced	153	0.08	0.09
	Widowed	66	0.04	0.05
	Other	761	0.42	0.34

Australian sex and age distribution (Australian Bureau of Statistics, March 2013) from http://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/3101.0Mar%202013?OpenDocument. The GHQ distribution, ATSI status, highest level of education, and country of birth are derived from the Household, Income and Labour Dynamics in Australia Survey (HILDA, Wave 10), limited to those aged 18 years and over. Kessler-10 Australian norms were derived from the 2007 Australian National Health Survey [18]

ATSI Aboriginal and Torres Strait Islander

^a For categorical variables, the chi-squared goodness-of-fit test was used to compare observed category frequencies with those expected based on population proportions; for the continuous K10 score, a one-sample t-test compared the observed K10 mean to the population value reported by Slade et al. 2011 [18]

Survey Implementation and Sample Recruitment

The content was implemented as an online survey by SurveyEngine [14], a company that specialises in choice experiments. SurveyEngine and its panel providers comply with the International Code on Market, Opinion and Social Research and Data Analytics [15]. SurveyEngine managed recruitment (via an Australian online panel provided by Toluna), administration of the survey and data collection. The target population was the Australian adult general population (aged ≥ 18 years). Participants were panel members aged 18 years or older who opted in to the survey. There were no exclusion criteria. Quota sampling by age and sex was used to achieve population representativeness on those variables.

Statistical Analysis

Sample Representativeness

Chi-square goodness-of-fit tests were used to assess our sample’s representativeness of the Australian population for age and sex (population data available from the Australian Bureau of Statistics as at March 2013 [16]) and for self-reported general health, Aboriginal and Torres Strait Islander (ATSI) status, highest level of education, and country of birth (population data available from the Household, Income and Labour Dynamics in Australia Surveys [HILDA], Wave 10 [17]). Self-reported mental health was compared with Kessler-10 Australian norms from the 2007 Australian National Health Survey [18] using a one-sample t-test.

Utility Estimation

The DCE data were analysed in the statistical software package STATA-13 [19] using a functional form used previously to estimate utilities from DCE data consistent with standard QALY model restrictions [5–7, 20, 21]. The QALY model requires that all health states have zero utility at death (i.e. ‘the zero condition’) [22, 23]. A functional form that satisfied this requirement included the QLU-C10D dimension levels interacted with the duration variable (‘TIME’) (Eqs. 1, 2). Thus, as TIME tended to zero, the systematic component of the utility function tended to zero. Another requirement of the QALY model is constant proportional time trade-off; therefore, the relationship between utility and TIME (life years) was constrained to be linear. A useful feature of this functional form was that the impact of moving away from level 1 (no problems) in each dimension was characterised through the two-factor interaction term with duration (note that the experimental design allowed for all these interactions). This enabled a utility algorithm in which the effect of each level of each dimension could be included as a decrement away from full health (which had a value of 1).

We analysed the data in two ways, reflecting different approaches to modelling heterogeneity (Eqs. 1, 2). The primary analysis was underpinned by Eq. 1, in which the utility of option j in choice set s for survey respondent i was assumed to be

$$U_{isj} = \alpha\,\mathrm{TIME}_{isj} + \beta X'_{isj}\,\mathrm{TIME}_{isj} + \varepsilon_{isj} \quad (1)$$

where α was the utility associated with a life year, X_isj was a vector of dummy variables representing the levels of the QLU-C10D health state presented in option j, and β was the corresponding vector of utility weights associated with each level in each dimension within X_isj, for each life year. The error term ε_isj was assumed to have a Gumbel distribution. In the primary analysis, DCE responses were estimated as a conditional logit model. To adjust the standard errors to allow for intra-individual correlation (as each respondent was asked to consider 16 DCE choice sets), we used a clustered sandwich estimator implemented by STATA’s vce(cluster) option. To estimate utility decrements for each movement away from level 1 (no problems) in each of the ten QLU-C10D dimensions, we divided each of the β terms by α. To estimate confidence intervals around these ratios, we used STATA’s wtp command [23], using the delta method. Model 1 included every move away from the best level (level 1, no problems) in each dimension within X_isj. Thus, X_isj contained 30 terms (i.e. 10 dimensions × [4−1] levels within each). If non-monotonicity was observed among levels within a dimension in Model 1 estimates, the non-monotonic levels were combined in Model 2. This restriction has been standardly imposed in previous studies [24–29].

In a secondary approach, we employed a mixed logit [30]. In Model 3, it was assumed that coefficients were drawn from a distribution, allowing for preference heterogeneity among individuals:

$$U_{isj} = (\alpha + \gamma_i)\,\mathrm{TIME}_{isj} + (\beta + \eta_i) X'_{isj}\,\mathrm{TIME}_{isj} + \varepsilon_{isj} \quad (2)$$

Thus, α and the vector of βs now represent population mean preferences, while γ_i and η_i are individual deviations around those mean preferences. These deviations were assumed to be distributed multivariate normal (0, Σ). We used the mixlogit STATA command [31] to estimate α, the vector of βs and the standard deviations of γ and the vector of ηs, with one adjustment. The standard command limits the number of parameters drawn from a distribution to 20. To allow all 31 coefficients (including duration) to be drawn from a distribution, we used pseudo-random draws (personal communication, Arne Risa Hole, Department of Economics, University of Sheffield, 15 June 2015). To compare models in terms of model fit, Akaike information criterion (AIC) and Bayesian information criterion (BIC) estimates are presented.
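To make the functional form concrete, the sketch below evaluates the systematic part of the utility function, the conditional logit probability of choosing one option from a pair, and the ratio used to obtain a utility decrement. It is illustrative Python, not the STATA code used in the study; the function names are assumptions, and the two-element coefficient vector in the usage note is a cut-down stand-in for the 30-term vector of level dummies.

```python
import math


def systematic_utility(alpha, beta, x, time):
    """Non-random part of the utility function:
    alpha * TIME + (beta . x) * TIME.

    beta and x are equal-length sequences; x holds 0/1 level dummies.
    The value is zero whenever time is zero (the 'zero condition').
    """
    return alpha * time + sum(b * xi for b, xi in zip(beta, x)) * time


def prob_choose_a(v_a, v_b):
    """Conditional logit probability of choosing option A from the pair
    (A, B), given Gumbel-distributed errors."""
    return 1.0 / (1.0 + math.exp(v_b - v_a))


def utility_decrement(beta_dl, alpha):
    """Utility decrement for one dimension level: beta divided by alpha."""
    return beta_dl / alpha
```

As a usage example with the rounded Model 2 coefficients from Table 3 (duration 0.552, Physical Functioning level 4 at −0.138), `utility_decrement(-0.138, 0.552)` gives −0.250, matching Table 4. Other ratios computed from the rounded coefficients may differ from Table 4 in the third decimal place, presumably because the published decrements were derived from unrounded estimates.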

Results

Sample Characteristics and Representativeness

Figure 2 shows the recruitment flow, response rates and sample sizes for each component of the survey, and Table 2 shows the sample characteristics relative to population norms. While the sample differed statistically from the general population in all measured characteristics except age and sex, differences were generally small (≤ 3% in any one category). Notable exceptions were education (university education over-represented by 18%) and mental health (the Kessler-10 sample mean was in the ‘medium risk’ range for anxiety or depressive disorder, while the general population mean was in the ‘low or no risk’ range).

Utility Estimates

Additional years of life were preferred, and all movements away from ‘no problems’ in each dimension were valued negatively (Model 1, Table 3), except level 2 of Social Functioning (value zero). Moving to worse levels in each dimension was associated with an absolutely larger coefficient, with only two exceptions. The worst two levels of the Sleep and Appetite dimensions were not monotonically ordered, but both violations of monotonicity were small (0.003 and 0.007, respectively).
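The monotonicity check described above can be reproduced from the Model 1 coefficients in Table 3. The sketch below is illustrative only: the helper name is an assumption, level 1 is taken as the reference with coefficient 0, and only three dimensions are included to keep it short.

```python
# Model 1 coefficients from Table 3, levels 1-4 (level 1 is the reference).
MODEL1 = {
    "Sleep":    [0.0, -0.019, -0.024, -0.021],
    "Appetite": [0.0, -0.017, -0.032, -0.025],
    "Pain":     [0.0, -0.029, -0.071, -0.086],
}


def monotonicity_violations(coefs):
    """Return (level, size) for every level whose coefficient is less
    negative than that of the level immediately better than it."""
    violations = []
    for i in range(1, len(coefs)):
        if coefs[i] > coefs[i - 1]:
            violations.append((i + 1, round(coefs[i] - coefs[i - 1], 3)))
    return violations
```

Applied to the Sleep and Appetite rows, this flags level 4 in each case, with sizes of 0.003 and 0.007, while Pain returns no violations.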

Table 3

Conditional logit: Model 1 (unconstrained) and Model 2 (monotonicity imposed)

		Model 1	Model 2
Mean		Coefficient^a (robust SE)	Coefficient^a (robust SE)
Duration	Linear	0.555 (0.027)***	0.552 (0.027)***
Physical functioning × Duration^a	2	– 0.045 (0.009)***	– 0.044 (0.009)***
	3	– 0.084 (0.01)***	– 0.083 (0.01)***
	4	– 0.138 (0.01)***	– 0.138 (0.01)***
Role functioning × Duration^a	2	– 0.014 (0.007)*	– 0.013 (0.007)*
	3	– 0.051 (0.008)***	– 0.05 (0.007)***
	4	– 0.078 (0.007)***	– 0.077 (0.007)***
Social functioning × Duration^a	2	0 (0.007)	0 (0.007)
	3	– 0.036 (0.007)***	– 0.036 (0.007)***
	4	– 0.051 (0.007)***	– 0.05 (0.007)***
Emotional functioning × Duration^a	2	– 0.011 (0.007)	– 0.011 (0.007)
	3	– 0.037 (0.008)***	– 0.036 (0.008)***
	4	– 0.074 (0.007)***	– 0.073 (0.007)***
Pain × Duration^a	2	– 0.029 (0.007)***	– 0.029 (0.007)***
	3	– 0.071 (0.008)***	– 0.071 (0.008)***
	4	– 0.086 (0.007)***	– 0.086 (0.007)***
Fatigue × Duration^a	2	– 0.013 (0.006)**	– 0.013 (0.006)**
	3	– 0.017 (0.007)**	– 0.016 (0.007)**
	4	– 0.021 (0.006)***	– 0.02 (0.006)***
Sleep × Duration^a	2	– 0.019 (0.006)***	– 0.018 (0.006)***
	3	– 0.024 (0.007)***	– 0.022 (0.006)***
	4	– 0.021 (0.006)***	– 0.022 (0.006)***
Appetite × Duration^a	2	– 0.017 (0.006)**	– 0.015 (0.006)**
	3	– 0.032 (0.007)***	– 0.028 (0.006)***
	4	– 0.025 (0.006)***	– 0.028 (0.006)***
Nausea × Duration^a	2	– 0.026 (0.007)***	– 0.026 (0.007)***
	3	– 0.038 (0.007)***	– 0.038 (0.007)***
	4	– 0.059 (0.007)***	– 0.059 (0.007)***
Bowel problems × Duration^a	2	– 0.025 (0.006)***	– 0.026 (0.006)***
	3	– 0.043 (0.007)***	– 0.043 (0.007)***
	4	– 0.052 (0.006)***	– 0.052 (0.006)***
Log-likelihood		– 16,930	– 16,930
Parameters		31	29
AIC		33,921	33,919
BIC		34,200	34,180

^a The coefficient for each level of each QOL domain was estimated as the interaction of that level with duration. Levels combined to ensure monotonicity within each dimension are noted in italics

Levels of statistical significance: ***1%; **5%; *10%

AIC Akaike information criterion, BIC Bayesian information criterion

Model 2 constrained levels 3 and 4 of the Sleep dimension, and likewise levels 3 and 4 of the Appetite dimension, to share a single coefficient, with very little loss of model fit (Table 3). The utility decrements for each level of each dimension from Model 2 are reported in Table 4 with corresponding 95% confidence intervals, and graphed in Fig. 3. The largest utility decrements were associated with physical, role, social and emotional functioning and pain. Sizeable decrements were associated with nausea, bowel problems and appetite, while smaller decrements were associated with problems with sleep and fatigue.

Table 4

Utility decrements used in the QLU-C10D utility algorithm

Dimension	Level	Utility decrement, w_dl (95% CI)
Physical functioning	1	0
	2	– 0.081 (– 0.051 to – 0.110)
	3	– 0.151 (– 0.120 to – 0.182)
	4	– 0.250 (– 0.220 to – 0.280)
Role functioning	1	0
	2	– 0.024 (0.001 to – 0.049)
	3	– 0.090 (– 0.066 to – 0.114)
	4	– 0.139 (– 0.117 to – 0.161)
Social functioning	1	0
	2	0.000 (0.024 to – 0.025)
	3	– 0.064 (– 0.040 to – 0.089)
	4	– 0.091 (– 0.070 to – 0.112)
Emotional functioning	1	0
	2	– 0.020 (0.003 to – 0.043)
	3	– 0.066 (– 0.041 to – 0.091)
	4	– 0.133 (– 0.112 to – 0.155)
Pain	1	0
	2	– 0.053 (– 0.029 to – 0.078)
	3	– 0.129 (– 0.105 to – 0.153)
	4	– 0.155 (– 0.133 to – 0.177)
Fatigue	1	0
	2	– 0.023 (– 0.001 to – 0.045)
	3	– 0.029 (– 0.006 to – 0.053)
	4	– 0.037 (– 0.016 to – 0.058)
Sleep	1	0
	2	– 0.033 (– 0.012 to – 0.054)
	3	– 0.039 (– 0.020 to – 0.059)
	4	– 0.039 (– 0.020 to – 0.059)
Appetite	1	0
	2	– 0.028 (– 0.006 to – 0.049)
	3	– 0.050 (– 0.030 to – 0.070)
	4	– 0.050 (– 0.030 to – 0.070)
Nausea	1	0
	2	– 0.047 (– 0.025 to – 0.070)
	3	– 0.068 (– 0.044 to – 0.092)
	4	– 0.107 (– 0.086 to – 0.127)
Bowel problems	1	0
	2	– 0.047 (– 0.025 to – 0.068)
	3	– 0.078 (– 0.054 to – 0.102)
	4	– 0.094 (– 0.073 to – 0.115)

From Model 2, conditional logit, monotonicity imposed

Fig. 3

Australian Utility Algorithm (derived from Model 2 conditional logit, monotonicity imposed). PF physical functioning, RF role functioning, SF social functioning, EF emotional functioning, PA Pain, FA fatigue, TS sleep, AP appetite, NA nausea, BO bowel problems

In the mixed logit results (Model 3, online resource 2, see ESM), the means of the coefficient distributions were generally monotonic, with four exceptions. Three dimensions had small positive estimates for level 2, but none were statistically different from zero: Social Functioning (p = 0.56), Emotional Functioning (p = 0.93) and Fatigue (p = 0.81). For Appetite, levels 3 and 4 were non-monotonic (as in Model 1). Figure 4 compares the utility decrements from Models 1 and 3, showing a strong relationship between corresponding estimates from the conditional logit and mixed logit models. The coefficients from Model 1 were larger in absolute value than those from Model 3, meaning the spread of the resultant utility algorithm was slightly larger.

Fig. 4

Scatter plot of utility decrements generated by conditional logit and mixed logit. Dotted line represents line of best fit, solid line represents line of equality

QLU-C10D Utility Calculation

The utility decrements from Model 2 (Table 4) provide the weights, w_dl, for calculating QLU-C10D scores from QLQ-C30 responses (Eq. 3). Note that first, QLQ-C30 items must be converted to QLU-C10D levels, as shown in Table 1. A utility score of 1 is assigned to patients whose QLQ-C30 scores indicate they are at level 1 of all ten dimensions of the QLU-C10D. For all other health states, the utility score is 1 minus the magnitude of the utility decrement (w_dl) for each level down from no problems in each of the ten QLU-C10D dimensions. Thus, the QLU-C10D utility score for patient p, determined by their QLU-C10D level l for each dimension d, is

$$U_p = 1 + \sum_{d=1}^{10} w_{dl} \quad (3)$$

where each w_dl takes the (negative) value listed in Table 4. For example, a health state with quite a bit of limitation in Role Functioning, a little problem with Emotional Functioning, and a little Nausea, but no problems in any other dimensions, would be valued at 1 − the decrements for Role Functioning level 3, Emotional Functioning level 2 and Nausea level 2 = 1 − (0.09 + 0.02 + 0.047) = 0.843. By convention, this health state would be described as 1312111121. The best possible health state (1111111111) has a value of 1, and the worst possible state (4444444444) has a value of −0.096. Appendix 3 in online resource 3 (see ESM) provides detailed instructions on calculating utility weights for all the QLU-C10D health states, and provides STATA and SPSS syntax code to implement this.

When asked about the difficulty of this survey compared with other surveys they had done, 28% of respondents reported the DCE questions to be ‘about the same’ level of difficulty and 39% felt it was ‘harder’. Most (76%) felt the presentation of the health states was clear or very clear. While 39% felt it was difficult or very difficult to choose between pairs of health states, 33% felt it was easy or very easy. Detailed participant feedback on the DCE task will be published separately.
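The scoring rule for QLU-C10D utilities described above can be sketched in a few lines of Python. This is not the STATA or SPSS syntax supplied in the ESM; the function name and the dictionary layout are illustrative assumptions, and the decrements are the rounded point estimates from Table 4, ordered PF, RF, SF, EF, PA, FA, TS, AP, NA, BO.

```python
DIMENSIONS = ["PF", "RF", "SF", "EF", "PA", "FA", "TS", "AP", "NA", "BO"]

# Utility decrements w_dl for levels 1-4 of each dimension (Table 4, Model 2).
DECREMENTS = {
    "PF": [0.0, -0.081, -0.151, -0.250],
    "RF": [0.0, -0.024, -0.090, -0.139],
    "SF": [0.0,  0.000, -0.064, -0.091],
    "EF": [0.0, -0.020, -0.066, -0.133],
    "PA": [0.0, -0.053, -0.129, -0.155],
    "FA": [0.0, -0.023, -0.029, -0.037],
    "TS": [0.0, -0.033, -0.039, -0.039],
    "AP": [0.0, -0.028, -0.050, -0.050],
    "NA": [0.0, -0.047, -0.068, -0.107],
    "BO": [0.0, -0.047, -0.078, -0.094],
}


def qlu_c10d_utility(state):
    """Utility of a health state given as a 10-character string of levels,
    e.g. '1312111121': one plus the sum of the (negative) decrements."""
    if len(state) != len(DIMENSIONS) or any(c not in "1234" for c in state):
        raise ValueError("state must be ten digits, each between 1 and 4")
    return 1.0 + sum(
        DECREMENTS[dim][int(level) - 1]
        for dim, level in zip(DIMENSIONS, state)
    )
```

Calling `qlu_c10d_utility("1312111121")` reproduces the worked example (about 0.843). With these rounded decrements the worst state evaluates to about −0.095; the reported value of −0.096 presumably reflects unrounded coefficients.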

Discussion

This paper reports the first value set for the QLU-C10D, a MAUI derived from the EORTC QLQ-C30. This approach has two important advantages. First, it allows direct quantification of utility for use in economic evaluation from responses to the QLQ-C30, a widely used cancer-specific HRQoL questionnaire. Second, it captures dimensions related to cancer symptoms that are not included in generic instruments (particularly appetite, nausea and bowel problems). The main drivers of utility were the generic dimensions, with the largest utility decrements for physical, role, social and emotional functioning and pain, mirroring generic MAUIs. However, sizeable decrements were associated with cancer-sensitive dimensions, particularly nausea, bowel problems and appetite. Decrements for problems with sleep and fatigue were smaller, perhaps because minor problems with sleep and fatigue are relatively common and therefore considered less important by survey respondents. It is possible that the relative size of the utility weights would differ for cancer patients with experience of extreme levels of fatigue and sleep disturbance. The MAUCa Consortium is exploring this question in a related study currently underway, sampling Austrian patients and general population. Even though the utility decrements for the cancer-sensitive dimensions were smaller than those for the more generic dimensions, their inclusion provides a more relevant measure of utility for cancer interventions. Physical functioning had larger utility decrements than other dimensions, for each level. In the QLU-C10D, physical functioning is represented by walking, making it somewhat comparable to the EQ-5D Mobility and HUI3 Ambulation dimensions. In the HUI3 multi-attribute utility function, Ambulation does not have the largest utility decrements for any level [32]. Results from EQ-5D valuation studies are mixed.
For example, in Australian valuations of the EQ-5D(3L) using DCE, Mobility had the largest utility decrement of all dimensions at level 3 but not level 2 [33], while in Australian EQ-5D(3L) valuations using time trade-off (TTO), Anxiety/Depression and Pain/Discomfort had the largest utility decrements at level 3, and at level 2 Mobility had the second lowest utility decrement [34]. Why might ability to walk have the largest impact on utility in the current study? First, physical functioning appeared as the first dimension in the choice set, as in the QLQ-C30 parent questionnaire. However, we have previously investigated and dismissed order effects after randomised testing in DCEs for both QLU-C10D [11] and EQ-5D-5L [35]. Second, due to the complexity of the level descriptors, physical functioning appeared in the DCE as two attributes, even though it was a single four-level dimension in the DCE experimental design and analysis. A useful comparator here is the EORTC-8D, where Physical Functioning was presented as a single five-level dimension (representing the same QLQ-C30 long/short walk items as in the QLU-C10D) [27]. It had the third largest utility decrement for the worst level (exceeded by Social and Emotional Functioning) and the second largest for level 2 (exceeded by Pain). Because country confounds this comparison, the two-attribute presentation cannot be ruled out as a driver of the effect; UK valuations of the QLU-C10D, now underway, should help resolve this. Finally, the QLU-C10D Physical Functioning dimension covers a large range of mobility with four levels reflecting the combined range of the two QLQ-C30 walking items (see Table 1), while other dimensions are based on the range in only one item. It may therefore be appropriate that the utility decrements are correspondingly large. In similar studies, initial models have contained some inconsistent orderings of utility decrements within dimensions, particularly for dimensions with small utility impacts [24–29].
Consistent with previous studies, we imposed constraints to remove non-monotonicities. This did not reduce model fit markedly, and avoids perverse results in QALY calculations.

Anchors at one (full health) and zero (death) are imposed by the QALY model, but there is no natural anchor for the pits state (worst possible health state). The QLU-C10D pits state value is −0.096. This is considerably lower than the 0.29 pits state value for the QLU-C10D’s precursor, EORTC-8D [27], which has eight of the ten QLU-C10D dimensions (four with exactly the same items and levels as the QLU-C10D, four that differ slightly), but lacks Sleep and Appetite. In our study, the worst levels of Sleep and Appetite had a combined utility decrement of 0.09, so the difference in content explains some of the difference in pits state values. Since the EORTC-8D was valued with TTO in the United Kingdom (UK) general population, valuation method and country likely explain much of the remainder, as both instruments share a simple additive utility function. The values of the pits state in the original UK and Australian EQ-5D(3L) TTO studies were −0.594 [36] and −0.217 [34], respectively, and in the Australian EQ-5D(3L) DCE study it was −0.516 [33]. Variations in the value of health states, including the pits state, are driven by several factors [37], including country-specific cultural differences in attitudes to trading between mortality and morbidity, different health state classification system content, valuation method and utility functional form. Arguably, a lower pits state value means a greater range in a value set, which may lead to greater differences between interventions in CUA.

A related issue is sensitivity to mild impairments. Values for health states with level 2 across all dimensions were 0.464 for the QLU-C10D (Australian DCE) and 0.715 for the EORTC-8D (UK TTO).
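As a minimal sketch of the simple additive utility function referred to above, the following Python computes a health state's utility as one minus the sum of the decrements attached to each dimension's level. The decrement values, the three-dimension subset and the function names are hypothetical placeholders chosen for illustration; they are not the published Australian QLU-C10D weights.

```python
from typing import Dict, List

# HYPOTHETICAL decrements for illustration only (not the published value set).
# Index 0 is level 1 (no problems); index 3 is level 4 (worst level).
ILLUSTRATIVE_DECREMENTS: Dict[str, List[float]] = {
    "physical_functioning": [0.0, 0.08, 0.15, 0.22],
    "pain": [0.0, 0.04, 0.10, 0.17],
    "nausea": [0.0, 0.03, 0.06, 0.09],
}


def additive_utility(state: Dict[str, int],
                     decrements: Dict[str, List[float]]) -> float:
    """Return 1 (full health) minus the summed decrement for each dimension's level."""
    total = 0.0
    for dimension, levels in decrements.items():
        level = state[dimension]
        if not 1 <= level <= len(levels):
            raise ValueError(f"invalid level {level} for {dimension}")
        total += levels[level - 1]
    return 1.0 - total


def pits_state_utility(decrements: Dict[str, List[float]]) -> float:
    """Utility of the state with every dimension at its worst level."""
    worst = {dimension: len(levels) for dimension, levels in decrements.items()}
    return additive_utility(worst, decrements)


if __name__ == "__main__":
    mild = {dimension: 2 for dimension in ILLUSTRATIVE_DECREMENTS}
    print("all dimensions at level 2:", additive_utility(mild, ILLUSTRATIVE_DECREMENTS))
    print("pits state:", pits_state_utility(ILLUSTRATIVE_DECREMENTS))
```

Under this additive form, the pits state value falls below zero whenever the worst-level decrements sum to more than one, which is how a value set can place the worst state below death.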
Assessing the sensitivity of the QLU-C10D to differences in mild and extreme QOL impacts, and comparing this with other candidate MAUIs, are important issues for future research.

The DCE method has emerged as an alternative to TTO and standard gamble (SG) methods for valuation of health outcomes in the past decade, and has now been used in a number of studies [6, 20, 21, 33, 38, 39]. The discrete choice method is attractive for several reasons: it is embedded in a strong theoretical measurement framework; it utilises well established, statistically robust experimental design and modelling methods; it is based on a relatively simple judgmental task; and it is feasible with online recruitment and data collection.

The use of DCEs to value health states is maturing, but still presents some challenges. While the judgmental task is simple relative to TTO and SG, thus allowing survey respondents to consider a larger number of attributes, the 12 attributes in the current study are a relatively large number. This study confirms the finding of the QLU-C10D valuation methods experiment that this is feasible for respondents [5]. We reduced cognitive challenge firstly by allowing only four dimensions to differ in each choice set and secondly by presenting choice sets in the format preferred by participants in the methods experiment, using yellow highlighting to identify differences between situations A and B [5]. Allowing only some dimensions to differ across choice sets has the additional advantage of requiring respondents who employ heuristics such as considering a single attribute to trade off between other attributes. We designed an experiment with 960 choice sets that would maximise statistical efficiency in estimating utility parameters. This meant the survey included some health states that might seem rather unlikely to respondents, such as severe vomiting yet no problems with social function.
However, we note that in the patient-reported QLQ-C30 data used to derive the QLU-C10D health state classification system, at least one patient reported each pairwise combination of levels [4].

We used two modelling approaches, conditional logit and mixed logit, which yielded similar mean utility decrements. We have chosen conditional logit (Model 2) as the basis for calculating utilities for CUA for the following reasons: (i) for economic evaluation, we are generally most interested in the mean response, so preference heterogeneity is a secondary concern; (ii) to our knowledge, there remains uncertainty about the appropriate distributional assumptions for the mixed logit.

This study has several strengths and some limitations. It provides a preference-based measure for calculating utilities for the QLQ-C30, which is theoretically and empirically stronger than using mappings of the EORTC QLQ-C30 to other preference-based utility measures [40]. The development of the health state classification system was psychometrically thorough [4]. The valuation survey sample was large, with quota sampling achieving population representativeness for age and sex. The extent to which non-representativeness on the other measured sociodemographic variables is a limitation is as yet unknown, and will be explored in future research by pooling valuation data across the MAUCa Consortium. We established the feasibility of our DCE method [5], and have noted its strengths and limitations above. We used modelling approaches appropriate to our data structure and analysis purpose. Our choice of a monotonic main-effects model for calculating utility is readily accessible for a range of end users, clinically interpretable and consistent with the EORTC quality-of-life conceptual model.

The appropriateness of disease-specific utility weights for CUA is debated by health economists [41].
Conventionally, generic MAUIs such as the EQ-5D are used, primarily to enable comparability across health conditions and interventions. However, the capacity of generic instruments to capture clinically relevant differences in cancer is also debated [42-44]. Arguably, the QLU-C10D should provide a more cancer-sensitive measure of utility than provided by generic MAUIs, although this is yet to be tested empirically. Further, data on generic utility measures may not always be available. The QLU-C10D enables utility values to be retrospectively generated from the wealth of existing QLQ-C30 data, thus facilitating economic evaluation from existing studies. It is anticipated that the QLU-C10D will have good psychometric properties, and future research will examine this, as well as assessing its performance relative to generic MAUIs.

The QLU-C10D has been developed by the MAUCa Consortium in collaboration with the EORTC QOL Group. A key strength of the Consortium’s approach is the use of identical valuation methods across countries, creating a unique opportunity to explore predictors of health outcome values, including country, age, sex, education and health status of valuation survey respondents; this will be done in future analyses. The QLU-C10D is endorsed by the EORTC QOL Group, and supersedes the EORTC-8D. Notably, the development of the health state classification system of the EORTC-8D was based on data from 655 multiple myeloma patients, while that of the QLU-C10D was informed by a much larger (n = 2616) and more diverse sample: 13 countries, 15 primary cancer types, localised/regional (n = 1037) and recurrent/metastatic stages (n = 1579) [4]. The EORTC QOL Group now has stewardship of the QLU-C10D, being responsible for all aspects of its management, developing and maintaining information regarding administration, scoring and interpretation, and housing relevant materials on the EORTC QOL website.
This will make the QLU-C10D widely available for use prospectively and retrospectively, and thereby facilitate the incorporation of quality of life into healthcare decision-making for cancer care.
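The derivation of utility weights from the conditional logit model can be illustrated with a short Python sketch. It assumes the QALY-framework parameterisation described in the abstract, in which each dimension-level coefficient is divided by the coefficient on life expectancy. The coefficient values, function names and the simple ordering check are illustrative assumptions; they are not the study's estimates, nor the constraint procedure actually used to remove non-monotonicities.

```python
from typing import Dict, List, Tuple


def utility_decrements(time_coef: float,
                       level_coefs: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Divide each dimension-level coefficient by the life-expectancy coefficient.

    level_coefs[dimension] holds coefficients for levels 2, 3 and 4
    (level 1 is the reference level with an implied coefficient of zero).
    """
    if time_coef <= 0:
        raise ValueError("life-expectancy coefficient must be positive")
    return {dimension: [beta / time_coef for beta in betas]
            for dimension, betas in level_coefs.items()}


def non_monotonic_levels(decrements: Dict[str, List[float]]) -> List[Tuple[str, int]]:
    """List (dimension, level) pairs where a worse level has a smaller decrement.

    Decrements are negative numbers, so a monotonic dimension has values that
    never increase from level 2 to level 4.
    """
    violations = []
    for dimension, values in decrements.items():
        previous = 0.0  # reference level 1
        for offset, value in enumerate(values):
            if value > previous:
                violations.append((dimension, offset + 2))
            previous = value
    return violations


if __name__ == "__main__":
    # HYPOTHETICAL coefficients, chosen only to show one ordering violation.
    coefs = {"pain": [-0.02, -0.05, -0.09], "sleep": [-0.01, -0.005, -0.03]}
    weights = utility_decrements(0.5, coefs)
    print(weights)
    print(non_monotonic_levels(weights))
```

A check of this kind only flags inconsistent orderings; how they are then constrained (for example, by combining adjacent levels and re-estimating) is a modelling decision that the sketch does not attempt to reproduce.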

Conclusions

CUA represents a major part of the reimbursement process in many countries. In Australia, the government guidelines for preparing submissions to the Pharmaceutical Benefits Advisory Committee (PBAC) favour direct estimation of utilities over mapping, do not mandate a particular MAUI but prefer Australian-based preference weights and encourage the use of patient-reported outcomes/MAUIs that capture all important disease- or condition-specific factors [45; pages 37, 77]. Based on the experience of RV and RN serving on PBAC and its subcommittees, submissions for cancer interventions frequently present QLQ-C30 data. Therefore, the value set presented here will aid Australian resource allocation decisions. Further, the methods presented in this paper provide a template for further international valuations of the QLU-C10D. A number of these are underway, using exactly the same DCE design, presentation format and analysis, including Austria, Canada, France, Germany, Poland, the UK and the US, enabling assessment of international comparability of preferences for cancer-specific health states.

Data Availability Statement

The dataset generated during the current study will not be publicly available until all planned analyses are complete (see Sect. 4). For updates, please contact the EORTC Quality of Life Group Health Technology Committee.

This study provides the first value set (i.e. set of utility weights) for the EORTC QLU-C10D, a new preference-based multi-attribute utility instrument derived from the widely used cancer-specific quality-of-life questionnaire, EORTC QLQ-C30.

Cost-utility analysis (CUA) represents a major part of the reimbursement process in many countries. The availability of the EORTC QLU-C10D will facilitate CUA for cancer interventions, as it can be applied to data collected with the EORTC QLQ-C30, prospectively and retrospectively.

Sizeable utility decrements associated with cancer-sensitive dimensions, notably nausea, bowel problems and appetite, may make the QLU-C10D more sensitive than generic measures in CUA. Future research is required to assess this in datasets containing both the QLQ-C30 and a generic utility instrument.
