Establishing anchor-based minimally important differences for the EORTC QLQ-C30 in glioma patients.

Linda Dirven^1,2, Jammbe Z Musoro^3, Corneel Coens^3, Jaap C Reijneveld^4, Martin J B Taphoorn^1,2, Florien W Boele^5,6, Mogens Groenvold^5,7, Martin J van den Bent^8, Roger Stupp^9, Galina Velikova^5, Kim Cocks^10, Mirjam A G Sprangers^11, Madeleine T King^12, Hans-Henning Flechtner^13, Andrew Bottomley^3.

Abstract

BACKGROUND: Minimally important differences (MIDs) allow interpretation of the clinical relevance of health-related quality of life (HRQOL) results. This study aimed to estimate MIDs for all European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Questionnaire Core 30 (QLQ-C30) scales for interpreting group-level results in brain tumor patients.
METHODS: Clinical and HRQOL data from three glioma trials were used. Clinical anchors were selected for each EORTC QLQ-C30 scale, based on correlation (>0.30) and clinical plausibility of association. Changes in both HRQOL and the anchors were calculated, and for each scale and time period, patients were categorized into one of the three clinical change groups: deteriorated by one anchor category, no change, or improved by one anchor category. Mean change method and linear regression were applied to estimate MIDs for interpreting within-group change and between-group differences in change over time, respectively. Distribution-based methods were applied to generate supportive evidence.
RESULTS: A total of 1687 patients were enrolled in the three trials. The retained anchors were performance status and eight Common Terminology Criteria for Adverse Events (CTCAE) scales. MIDs for interpreting within-group change ranged from 4 to 12 points for improvement and -4 to -14 points for deterioration. MIDs for between-group difference in change ranged from 4 to 9 for improvement and -4 to -16 for deterioration. Most anchor-based MIDs were closest to the 0.3 SD distribution-based estimates (range: 3-10).
CONCLUSIONS: MIDs for the EORTC QLQ-C30 scales generally ranged between 4 and 11 points for both within-group mean change and between-group mean difference in change. These results can be used to interpret QLQ-C30 results from glioma trials.

Keywords: EORTC QLQ-C30; brain tumor; clinical relevance; health-related quality of life (HRQOL); minimally important difference (MID)

Year: 2021 PMID: 33598685 PMCID: PMC8328025 DOI: 10.1093/neuonc/noab037

Source DB: PubMed Journal: Neuro Oncol ISSN: 1522-8517 Impact factor: 12.300

Minimally important differences for EORTC QLQ-C30 scales ranged between 4 and 11 points in glioma. Estimates for each HRQOL scale can be used to better interpret group-level results in glioma trials.

Minimally important differences (MIDs) allow interpretation of the clinical relevance of health-related quality of life (HRQOL) results. Typically, a difference of >10 points was considered clinically relevant for all scales. More recent work, however, suggests that these MIDs are too simplistic as they do not distinguish between domain scales, direction of change (ie, improvement or deterioration), and disease sites. In this study, we estimated MIDs for all EORTC QLQ-C30 scales for interpreting group-level changes over time within a group of glioma patients, and for interpreting the mean difference in change between two groups, using both anchor-based and distribution-based approaches. We found that MIDs for the EORTC QLQ-C30 scales generally ranged between 4 and 11 points for both within-group mean change and between-group mean difference in change, representing small changes. The current estimates can be used to better interpret the results of clinical trials in glioma patients using the EORTC QLQ-C30 questionnaire.

Despite multimodal anti-cancer treatment, patients with primary and metastatic brain cancer have a poor prognosis. The most common malignant primary brain tumors are gliomas, with median survival ranging from 15 months to >15 years, depending on the histological tumor type, malignancy grade, and tumor genetics.[1] For these patients, the quality of survival is of utmost importance. Health-related quality of life (HRQOL) is typically included as a secondary outcome in brain tumor clinical trials,[2-8] as a key component of determining the net clinical benefit of a treatment strategy. This means that the benefits of possible prolonged survival are weighed against the possible negative effects of treatment on the patients’ level of functioning and well-being. The most common instruments used in brain tumor clinical trials to evaluate HRQOL are the European Organisation for Research and Treatment of Cancer (EORTC) Quality of Life Questionnaire Core 30 (QLQ-C30),[9] Functional Assessment of Cancer Therapy-General (FACT-G),[10] and the MD Anderson Symptom Inventory,[11] typically combined with their brain-specific modules.[12-14]

The impact of an experimental treatment on HRQOL is usually evaluated by comparing mean HRQOL scores over time in the experimental vs the control arm. Some studies rely on statistical significance alone to assess differences between treatment arms.[3,15] However, to demonstrate a true impact, differences should be both statistically significant and clinically relevant. For some questionnaires, minimally important differences (MIDs) have been determined, making it possible to judge whether differences and changes in HRQOL are clinically meaningful. Indeed, an MID can be considered as the smallest change/difference in an HRQOL score that is perceived as relevant or important by the patient and would indicate a change in the patient’s disease management.

Historically, for the EORTC QLQ-C30 questionnaire, a crude MID between 5 and 10 points was suggested to represent a small difference, between 10 and 20 points a moderate difference, and >20 a large difference, for both mean changes over time[16] and for mean differences between groups.[17] Therefore, in earlier studies, a difference of >10 points was considered clinically relevant for all scales. More recent work, however, suggests that these MIDs are too simplistic[18,19]; they do not distinguish between domain scales, direction of change (ie, improvement or deterioration), and disease sites. Therefore, MIDs for each EORTC QLQ-C30 scale specifically for glioma patients are needed, as they will be helpful in evaluating whether observed differences in HRQOL scores over time between treatment arms in a clinical trial are also clinically meaningful.

There are two approaches to estimating MIDs. The anchor-based approach links HRQOL measures to external criteria, either a factor that has clinical relevance (eg, performance status [PS]) or patient- or physician-reported ratings of health (eg, adverse events). The distribution-based approach, on the other hand, relies on the statistical features of HRQOL data (eg, standard deviation [SD]), and is typically used as supportive evidence for anchor-based estimates.[20] Maringwa et al. previously published MIDs for brain tumor patients based on two anchors, PS and Mini-Mental State Examination (MMSE) scores.[21] However, these anchors were suitable for only seven selected HRQOL scales. MID estimates for the selected scales, derived for both improvement and deterioration, were found to range between 5 and 14 points,[21] providing further evidence that the 10-point threshold is overly simplistic.

The aim of this study was to estimate the MIDs for all EORTC QLQ-C30 scales for interpreting group-level changes over time within a group of glioma patients, and for interpreting the mean difference in change between two groups, using both anchor-based and distribution-based approaches.

Materials and Methods

Data Description

Data were derived from three published EORTC phase III trials in glioma patients. Trial 1 (EORTC 26053) evaluated the impact of concurrent and/or adjuvant temozolomide chemotherapy to radiotherapy on overall survival in newly diagnosed non-1p/19q deleted anaplastic gliomas (n = 745).[22] Trial 2 (EORTC 26951) evaluated whether adjuvant procarbazine, lomustine, and vincristine chemotherapy improved overall survival compared to radiotherapy alone in newly diagnosed patients with anaplastic oligodendrogliomas or anaplastic oligoastrocytomas (n = 368).[23] Trial 3 (EORTC 26981) evaluated whether radiotherapy with concomitant and adjuvant temozolomide improved overall survival compared to radiotherapy alone in newly diagnosed glioblastoma patients (n = 573).[24] HRQOL was longitudinally assessed in all trials using the EORTC QLQ-C30 questionnaire. All trials were approved by the ethical committee of each participating center and all patients gave their informed consent to participate in the respective study.

EORTC QLQ-C30

The EORTC QLQ-C30 questionnaire[9] contains 30 items, 24 of which are combined into nine multi-item scales, ie, five functional scales (physical functioning [PF], role functioning [RF], cognitive functioning [CF], emotional functioning [EF], and social functioning [SF]) and three symptom scales (fatigue [FA], pain [PA], and nausea/vomiting [NV]), and one global health status (QL) scale. The remaining six single-item scales assess symptoms: dyspnea (DY), appetite loss (AP), insomnia (SL), constipation (CO), diarrhea (DI), and financial difficulties (FI). Trials 1 and 3 used version 3 of the EORTC QLQ-C30, whereas trial 2 used version 2. The two versions differ only in the response categories of questions 1-5 (ie, the PF scale), which were yes/no in version 2, whereas version 3 uses a four-point Likert scale ranging from “not at all” to “very much.” The scoring of the EORTC QLQ-C30 scales was done according to the EORTC QLQ-C30 scoring manual,[25] with the means of the raw scores for each scale transformed to fall between 0 and 100. To facilitate interpretation, all scales were scored such that 0 represents the worst possible score and 100 the best possible score. The FI scale was omitted from the analysis because appropriate anchors were not obtainable, leaving 14 scales for analyses.
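The transformation of raw item scores to the 0-100 range can be sketched as follows. This is a minimal illustration only, assuming the usual linear transformation of the mean item score by the item range (3 for four-point items, 6 for the seven-point global health items, 1 for the yes/no items of version 2); the function name `scale_score` and its parameters are hypothetical, and the handling of missing items is omitted. The study's own analyses were performed in SAS.

```python
from statistics import mean


def scale_score(items, item_range, higher_raw_is_worse):
    """Linearly transform item responses (coded 1..k) to a 0-100 scale score.

    items               -- responses to the items of one scale, each coded from 1
    item_range          -- maximum minus minimum possible response (assumed known)
    higher_raw_is_worse -- True for functional and symptom items (1 = "not at all"),
                           False for the global health items (7 = "excellent")

    The result is oriented as in this study: 0 = worst, 100 = best.
    """
    raw = mean(items)                  # raw score = mean of the item responses
    fraction = (raw - 1) / item_range  # position within the possible range, 0..1
    if higher_raw_is_worse:
        return (1 - fraction) * 100    # flip so that 100 is the best score
    return fraction * 100


# Hypothetical responses for one patient
physical = scale_score([1, 2, 1, 1, 2], item_range=3, higher_raw_is_worse=True)
global_health = scale_score([5, 6], item_range=6, higher_raw_is_worse=False)
```

Flipping the symptom scales in the same way as the functional scales is what makes a positive change score mean improvement on every scale.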

Clinical Anchors

Anchors for each EORTC QLQ-C30 scale were selected from clinical variables available in the datasets from the three source trials, including Common Terminology Criteria for Adverse Events (CTCAE), WHO PS, and MMSE, based on clinical plausibility and cross-sectional correlation.[20] First, we assessed correlation. Depending on the distribution of the HRQOL scale/anchor pair, either a polyserial or polychoric correlation was estimated. Anchors with an absolute correlation of ≥0.30[26] were prioritized and, where achievable, anchors with much stronger correlations were targeted.[27] The retained anchors were then assessed for clinical plausibility by a panel of five experts in both brain cancer and HRQOL.[28]

Definition of Clinical Change Groups

For each scale and time period, patients were categorized into one of the three clinical change groups: (i) deterioration (worsened by one anchor category), (ii) stable (no change in anchor categories), and (iii) improvement (improved by one anchor category). Patients who changed by ≥2 categories of an anchor were excluded because they were considered to be above the “minimal” expected change.
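The grouping rule above can be expressed as a small function. This is an illustrative sketch, assuming anchors are integer grades on which a higher grade is worse (as for WHO PS and CTCAE, both graded 0 to 4); the function name `change_group` is hypothetical.

```python
from typing import Optional


def change_group(anchor_before: int, anchor_after: int) -> Optional[str]:
    """Assign a pair of anchor grades to a clinical change group.

    Assumes a higher grade is worse. Returns None when the anchor changed
    by two or more categories, because such pairs were excluded as exceeding
    a "minimal" change.
    """
    delta = anchor_after - anchor_before
    if delta == 0:
        return "stable"
    if delta == 1:
        return "deterioration"
    if delta == -1:
        return "improvement"
    return None


# Hypothetical WHO PS grades at two time points
groups = [change_group(0, 1), change_group(1, 1), change_group(2, 1), change_group(0, 2)]
```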

Data Analysis

Anchor-based method

The data were pooled across all three trials. Next, change scores for each EORTC QLQ-C30 scale and anchor pair were computed across all available pairwise time points during the entire follow-up period, and then combined into one dataset to provide sufficient data for examining clinically relevant changes. As an example, for a patient measured at time points ta, tb, and tc, change scores were computed between ta and tb, ta and tc, and tb and tc. This meant that one patient could contribute multiple change scores and, depending on those change scores, could contribute to multiple clinical change groups.

The mean change method was used to estimate MIDs for interpreting change over time within a group of patients. MIDs for improvement and deterioration corresponded to the mean HRQOL change scores within the improvement and deterioration clinical change groups, respectively. Moreover, for a given HRQOL scale, an effect size (ES) was computed within each clinical change group by dividing the mean of the HRQOL change scores (derived from all the pairwise time point differences) by the SD of the HRQOL change scores over all time points. Cohen[29] recommended that an ES of 0.2 is small, 0.5 is moderate, and ≥0.8 is large. Based on this guideline, only mean changes with an ES ≥0.2 and <0.8 were considered appropriate for inclusion as MIDs. The rationale was that an observed ES <0.2 reflects changes that are clinically unimportant, while ESs ≥0.8 were considered to reflect changes more than minimally important.

A linear regression approach was used to estimate MIDs for interpreting differences in change score over time between two distinct groups of patients. For a given EORTC QLQ-C30 scale, the outcome variable was the scale change score, and the covariate was a binary anchor variable with categories “stable” = 0 and “improvement” = 1 when modeling improvement (deteriorated observations were excluded), and a similar procedure was adopted for deterioration. Since some patients contributed change scores to multiple clinical change groups, and more than one change score to a particular clinical change group, we corrected for the association between multiple change scores contributed by these patients by specifying a suitable covariance structure using generalized estimating equations (GEE).[30] The MIDs for improvement and deterioration correspond to the slope parameters for the “improved” and “deteriorated” covariates, respectively, and are useful for interpreting differences in change scores between groups of patients.

Multiple MID values for a given scale were summarized to single values via correlation-based weighted average. To check whether using data from different trials had an impact on the estimated MIDs, a trial effect (ie, interaction between the trial and the anchor indicator) was included as a covariate in the regression models. To account for multiple testing across the EORTC QLQ-C30 scales, we chose a priori to apply a more stringent significance threshold, ie, only P values below .001 were considered statistically significant.
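The pairwise change scores, the mean change method with its effect size filter, and the weighted summary can be sketched as below. This is a simplified illustration, not the SAS code used in the study. It rests on several assumptions: the SD in the effect size is taken over the change scores of the same clinical change group, the weights in the summary are absolute correlations, and the GEE regression for between-group MIDs is not shown. All function names and the example data are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev


def pairwise_changes(visits):
    """visits: list of (hrqol_score, anchor_grade) tuples in time order.

    Returns one (hrqol_change, anchor_change) tuple for every pair of
    time points, e.g. ta-tb, ta-tc and tb-tc for three visits.
    """
    return [(later[0] - earlier[0], later[1] - earlier[1])
            for earlier, later in combinations(visits, 2)]


def mean_change_mid(hrqol_changes):
    """Mean change, effect size, and whether 0.2 <= |ES| < 0.8 for one group."""
    m = mean(hrqol_changes)
    es = m / stdev(hrqol_changes)  # assumption: SD within this change group
    return m, es, 0.2 <= abs(es) < 0.8


def weighted_mid(mids, correlations):
    """Summarize MIDs from several anchors, weighting by |correlation| (assumed)."""
    weights = [abs(r) for r in correlations]
    return sum(w * m for w, m in zip(weights, mids)) / sum(weights)


# Hypothetical patient with three assessments: (scale score, anchor grade)
changes = pairwise_changes([(80, 0), (70, 1), (75, 1)])

# Hypothetical HRQOL change scores of all observations in a deterioration group
mid, es, retained = mean_change_mid([-30, 10, -10, 0, -20, 20, -5, -5])
```

With a single binary covariate, the slope of an ordinary linear regression equals the difference between the two group means, which is why the regression slope can be read as a between-group difference in change; the GEE step affects the standard errors by accounting for repeated contributions from the same patient.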

Distribution-based method

Two distribution-based methods were used, with calculations for each QLQ-C30 scale based on cross-sectional analysis; ie, the time point before or on the first day of treatment administration (t1). Three proportions of an SD (0.2 SD, 0.3 SD, 0.5 SD) were calculated, representing the range of proportions currently considered relevant to MID estimation,[18,31,32] although there is no consensus on which best approximates the MID. The standard error of measurement (SEM) was calculated as SD multiplied by square root of (1–r), using SD at t1. The test-retest reliability estimates (r) for the QLQ-C30 scales were obtained from Hjermstad et al.[33] All statistical analyses were performed using SAS version 9.4.[34] In addition, two practical examples are provided to explain how MID results can be used to interpret HRQOL data from clinical trials and how MIDs can be used for a sample size calculation for a clinical trial in which HRQOL is the primary endpoint.
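The two distribution-based calculations described above amount to a few lines of arithmetic. The sketch below is illustrative: the function name `distribution_based`, the example scores, and the reliability value are hypothetical, and the published reliabilities from Hjermstad et al. are not reproduced here.

```python
from math import sqrt
from statistics import stdev


def distribution_based(scores_t1, reliability):
    """Distribution-based estimates from cross-sectional scores at t1.

    scores_t1   -- scale scores (0-100) of all patients at baseline
    reliability -- test-retest reliability r of the scale (taken from the literature)
    """
    sd = stdev(scores_t1)
    return {
        "0.2 SD": 0.2 * sd,
        "0.3 SD": 0.3 * sd,
        "0.5 SD": 0.5 * sd,
        "SEM": sd * sqrt(1 - reliability),  # standard error of measurement
    }


# Hypothetical baseline scores and reliability
estimates = distribution_based([50, 70, 90], reliability=0.75)
```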

Results

A total of 1687 patients were enrolled across the three trials. Baseline characteristics are summarized in Table 1. Supplementary Figure S1 presents an overview of patient inclusion for the anchor-based method.

Table 1

Baseline Patient Sociodemographic and Clinical Characteristics for Each Trial Separately and for the Total Population

	Trial			Total (N = 1697)
	26053 (N = 756)	26951 (N = 368)	26981 (N = 573)
	N (%)	N (%)	N (%)	N (%)
Gender
Male	448 (59.3)	212 (57.6)	360 (62.8)	1020 (60.1)
Female	308 (40.7)	156 (42.4)	213 (37.2)	677 (39.9)
WHO performance status
0	440 (58.2)	138 (37.5)	223 (38.9)	801 (47.2)
1	284 (37.6)	172 (46.7)	277 (48.3)	733 (43.2)
2	32 (4.2)	58 (15.8)	73 (12.7)	163 (9.6)
Region
Northern Europe	0 (0.0)	15 (4.1)	1 (0.2)	16 (0.9)
Southern Europe	64 (8.4)	43 (11.7)	57 (9.9)	164 (9.7)
Western Europe	482 (63.8)	300 (38.3)	324 (56.5)	1106 (65.2)
Eastern Europe	0 (0.0)	10 (2.7)	6 (1.0)	16 (0.9)
Non-European	210 (27.8)	0 (0.0)	185 (32.3)	395 (23.3)
Age
Mean (SD)	43.01 (13.17)	47.62 (11.11)	53.81 (10.39)	47.65 (12.76)
Interquartile range	32.0-52.0	40.0-55.0	48.0-62.0	37.0-58.0

Northern Europe: Sweden, Finland; Southern Europe: Italy, Spain; Western Europe: the Netherlands, France, United Kingdom, Germany, Belgium, Switzerland, Austria; Eastern Europe: Hungary, Poland, and Slovenia; Non-European: Canada, United States, Australia, Israel, and Turkey.

Anchor-Based Method

The final list of retained anchors, summarized in Table 2, comprised PS, scored between 0 (no symptoms of cancer) and 4 (bedbound), and eight CTCAEs (pain, fatigue, nausea and vomiting, gastrointestinal, constipation, anorexia, dyspnea, and neurology), which are graded between 0 (no toxicity) and 4 (life-threatening). At least one clinical anchor could be retained for 12/14 EORTC QLQ-C30 scales assessed, with insomnia and diarrhea therefore being excluded from further analysis.

Table 2

Cross-Sectional Correlations of the EORTC QLQ-C30 Scales With Anchors and Correlations Between Their Change Scores

		Cross-Sectional		Change Scores
Scale	Anchor	N (N_o)	Correlation	N (N_o)	Correlation
Physical functioning	Performance status	1317 (9687)	−0.52	921 (66765)	−0.31
Role functioning	Performance status	1325 (9655)	−0.51	921 (66551)	−0.24
Social functioning	Performance status	1330 (9660)	−0.41	921 (66209)	−0.20
Emotional functioning	Performance status	1326 (9669)	−0.30	921 (66288)	−0.20
Cognitive functioning	Performance status	1331 (9682)	−0.39	921 (66315)	−0.20
Cognitive functioning	CTCAE neurological	653 (3803)	−0.30	548 (18407)	−0.20
Global health status	Performance status	1330 (9653)	−0.43	921 (66045)	−0.20
Global health status	CTCAE fatigue	1179 (5577)	−0.33	975 (21129)	−0.20
Pain	CTCAE pain	654 (4015)	−0.44	548 (18570)	−0.30
Fatigue	Performance status	1334 (9701)	−0.44	975 (66662)	−0.30
Fatigue	CTCAE fatigue	1178 (5616)	−0.43	975 (21393)	−0.30
Nausea and vomiting	CTCAE nausea vomiting	1182 (5630)	−0.55	975 (21431)	−0.38
Nausea and vomiting	CTCAE gastrointestinal	1182 (5630)	−0.42	677 (21431)	−0.30
Appetite loss	CTCAE nausea vomiting	1182 (5616)	−0.33	975 (21359)	−0.20
Appetite loss	CTCAE gastrointestinal	1182 (5616)	−0.36	677 (21359)	−0.22
Appetite loss	CTCAE anorexia	1181 (5614)	−0.52	975 (21355)	−0.33
Dyspnea	CTCAE dyspnea	654 (3812)	−0.44	548 (18511)	−0.21
Constipation	CTCAE gastrointestinal	1179 (5597)	−0.30	677 (21232)	−0.20
Constipation	CTCAE constipation	526 (1795)	−0.53	427 (2800)	−0.40

Abbreviations: CTCAE, Common Terminology Criteria for Adverse Events; EORTC QLQ-C30, European Organisation for Research and Treatment of Cancer Quality of Life Questionnaire Core 30.

N = number of patients; N_o = number of HRQOL observations.

The cross-sectional correlations between the different HRQOL scales and anchors ranged from 0.30 to 0.55 in absolute value, and the correlations between their change scores ranged from 0.20 to 0.40 (Table 2). Supplementary Table S1 provides the distribution of patients and the number of change observations across change categories of the various anchors. According to the anchor change scores, most patients were stable on all anchors. The number of patients who either improved or deteriorated was similar for most anchors except for PS, in which more patients deteriorated than improved.

Anchor-based MIDs with a clinically important ES could be determined for deterioration in all 12 EORTC QLQ-C30 scales assessed, and for improvement in 11 scales. The anchor-based MIDs for within-group change and between-group differences in change over time are presented in Table 3. Supplementary Table S2 reports full details for both the mean change method (within-group change) and the linear regression method (between-group difference in change). Note that only MIDs estimated from anchor clinical change groups with a clinically important ES ≥0.2 and <0.8 contributed to MID estimates in Table 3.

Table 3

Summary of Anchor-Based MIDs (Weighted Average) for Within-Group and Between-Group Difference in Change Over Time

	Within-Group Change^a		Between-Group Difference in Change^a
Scale	Improvement	Deterioration	Improvement	Deterioration
Physical functioning	5	−9	5	−7
Role functioning	9	−9	8	−9
Social functioning	6	−6	5	−6
Emotional functioning	6	−5	4	−4
Cognitive functioning	No MID	−9 to −5 (−7)	No MID	−6.1 to −5.0 (−5.5)
Global health status	4 to 6 (5)^b	−6	3.6 to 5.3 (4.4)	−6
Fatigue	8.2 to 9.2 (8.7)^b	−8 to −6 (−7)	7.5 to 7.7 (7.6)	−8 to −7 (−7.4)
Pain	6	−8	7	−6
Nausea and vomiting	7	−7	6.42	−7
Dyspnea	9	−8	6.64	−8
Appetite loss	10.9 to 11.7 (11.3)^b	−5.3 to −4.3 (−4.8)	9.21 to 9.24 (9.22)	−7.6 to −7.4 (−7.5)
Constipation	5	−14 to −9 (−10)	5	−16 to −7 (−10)

Abbreviation: MIDs, minimally important differences.

^a The within-group MIDs are derived from the mean change method and the between-group MIDs from the linear regression.

^b The numbers in brackets represent the correlation-weighted averages and apply only to scales with more than one anchor retained.

Note: The symptom scores were reversed to follow the functioning scales’ interpretation, ie, 0 represents the worst possible score and 100, the best possible score; “no MID” is used where no MID estimate is available either due to the absence of a suitable anchor or effect size <0.2 or ≥0.8.

Figure 1 presents MIDs from the mean change method (see also Table 3), plotted alongside their 95% confidence intervals, showing how MIDs vary by EORTC QLQ-C30 scale, direction of change (improvement vs deterioration), and anchor. All MID estimates were in the expected direction, ie, positive mean change scores within the improvement clinical change group and negative mean change scores within the deterioration clinical change group. MIDs for interpreting within-group change (derived from the mean change method) ranged from 4 to 12 points for improvement and −14 to −4 points for deterioration. MIDs for between-group difference in change (derived from the linear regression) ranged from 4 to 9 for improvement and −16 to −4 for deterioration (Table 3).

Fig. 1

Mean change and 95% confidence interval for improvement and deterioration in EORTC QLQ-C30 scales, across multiple anchors. Estimates are available only for scales with at least 1 suitable anchor and with effect size ≥0.2 and <0.8. These mean change scores are useful for interpreting within-group change over time. Abbreviations: AP, appetite loss; CO, constipation; CF, cognitive functioning; CTCAE, Common Terminology Criteria for Adverse Events; DY, dyspnea; EF, emotional functioning; FA, fatigue; NV, nausea/vomiting; PA, pain; PF, physical functioning; QL, global health status; RF, role functioning; SF, social functioning. Deteriorate = worsened by at least 1 anchor category; no change = no change in anchor categories; improve = improved by at least 1 category.

Taking into account the weighted summaries, MIDs for the EORTC QLQ-C30 scales generally ranged from 4 to 11 points in absolute values for both within-group and between-group changes. Adding the trial effect to the regression models showed no statistically significant differences for improving or deteriorating scores (data not shown), supporting the combination of the three trials.

Distribution-Based Method

Distribution-based estimates for all 14 EORTC QLQ-C30 scales considered in our analyses are presented in Supplementary Table S3. Most anchor-based MIDs were closest to the distribution-based estimates when using 0.3 SD as cutoff, ranging between 3.40 and 9.86.

Practical Examples

Table 4 presents two practical examples of how MIDs can be used to interpret HRQOL results in a glioma clinical trial (example 1), or to calculate the sample size for a clinical trial in glioma patients in which HRQOL is the primary endpoint (example 2).

Table 4

Practical Examples on How Minimally Important Differences (MIDs) Can Be Used to (1) Interpret Health-Related Quality of Life (HRQOL) Results in a Glioma Clinical Trial or (2) to Calculate the Sample Size for a Clinical Trial in Glioma Patients in Which HRQOL Is the Primary Endpoint

Practical example 1

In this example, we will illustrate how the MIDs can be used to interpret HRQOL results in a clinical trial. In a clinical trial comparing treatment A with treatment B, a statistically significant mean difference in change of 3 points between the two treatment arms was found for the physical functioning scale (both arms deteriorated in their level of physical functioning, treatment A with 2 points and treatment B with 5 points). The estimated MID for deterioration of physical functioning was found to be 7 points (Table 3). This means that although the observed difference of 3 points between treatment arms was statistically significant, this difference cannot be considered clinically meaningful and would not require a change in patient management. In this case, we should conclude that the treatments were similar in their impact on the patients’ HRQOL. Such a scenario is quite plausible in a trial with a large sample size, where HRQOL comparisons will be statistically overpowered.

Practical example 2

Another application of the estimated MIDs is their use in the sample size calculation for a clinical trial in glioma patients in which HRQOL is the primary endpoint. For example, it is expected that the deterioration in physical functioning with treatment A will be less pronounced than the deterioration in physical functioning with treatment B. The difference in deterioration between the treatment arms is expected to be clinically meaningful. In that case, we could use the MID for deterioration, which is 7 points, as the mean difference that is needed to detect a clinically meaningful difference between the treatment arms, and power/calculate the sample size accordingly.
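A rough version of the sample size calculation in example 2 is sketched below. It assumes a simple two-arm comparison of mean change scores with equal allocation and a normal approximation. The SD of 20 points, the two-sided alpha of 0.05, and the power of 80% are hypothetical inputs chosen for illustration, not values from this study; the sketch ignores attrition and repeated measurements, and the function name `n_per_arm` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(mid, sd, alpha=0.05, power=0.80):
    """Patients per arm needed to detect a mean difference equal to the MID.

    Uses n = 2 * (z_{1-alpha/2} + z_{power})^2 * sd^2 / mid^2,
    the normal-approximation formula for two independent groups.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / mid ** 2)


# MID for deterioration in physical functioning (Table 3): 7 points.
# The SD of the change scores (20 points) is an assumed value.
required = n_per_arm(mid=7, sd=20)
```

Because the required sample size scales with the inverse square of the MID, powering a trial on a scale-specific MID of 7 points rather than the generic 10-point threshold roughly doubles the number of patients needed.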

Discussion

The MIDs for most scales of the EORTC QLQ-C30 estimated in this study can be used to interpret group-level changes over time in glioma patients. MIDs generally ranged from 4 to 11 points, both for change within-group and between-group differences in change. Moreover, the magnitude of the MIDs per scale was fairly similar for both deterioration and improvement. The results of this study underline the assumption that using one MID as clinically relevant for all EORTC QLQ-C30 scales in all cancers is too simplistic. Although the MIDs for the direction of change were fairly similar for each scale, there was a wide range in MIDs between the different scales in this brain tumor population. For only a minority of scales, the MID was >10 points,[16] meaning that applying this cutoff may possibly have resulted in an underestimation of clinically relevant differences between and within groups in previously analyzed brain cancer trials. Our results also highlight that MIDs differ across disease sites. For example, QLQ-C30 MIDs for (advanced) cancer patients were not equal to those of glioma patients.[26,35,36] It should be emphasized that the presented MIDs are developed for interpreting group-level results of analyses based on HRQOL scales as continuous variables. They may not be applicable to analyses of HRQOL-derived metrics at the individual patient level (ie, responder analyses), eg, reporting the proportion of patients that changed in HRQOL by a clinically important degree in a specific time period or the time to HRQOL deterioration. A project to establish so-called responder thresholds for changes in HRQOL on the individual patient level is currently ongoing.[28] Previously, Maringwa et al.[21] estimated MIDs for five scales of the EORTC QLQ-C30 and two scales of the brain cancer-specific EORTC QLQ-BN20 using two trials that are also used in this study. They found MIDs ranging between 5 and 14 points. 
For the scales that could be compared between studies, the MIDs found by Maringwa et al. were generally a few points higher than those found in our study. Differences may be explained by the number and type of anchors that were used, as well as methodological differences, ie, number of time points to calculate changes and number of patients within each clinical change group. In contrast to the study of Maringwa et al., the current study included all scales of the EORTC QLQ-C30, except for financial difficulties, insomnia or diarrhea for which no anchor was available or could be established, and MIDs for both change within-group and between-group differences in change were calculated. In the current study, brain tumor-specific functioning and symptoms as measured with the EORTC QLQ-BN20 were not included, while these may be particularly relevant for brain tumor patients. Currently, the EORTC QLQ-BN20 is undergoing a revision, as this questionnaire has been developed in 1996[12] with an international validation in 2010,[13] and treatments have changed since then. These new treatments bring new toxicities, eg, eye problems,[37] which are not sufficiently covered by the current version. For the revised brain cancer module, it would also be important to determine scale-specific MIDs that can be used to interpret differences or changes in HRQOL scores in brain cancer clinical trials. It should be noted though, that the clinical trial data that were used in this study was based on glioma patients only. The MIDs derived in this study may therefore not be applicable to all brain cancer populations, including those with metastatic brain cancer, in which the EORTC QLQ-C30 is often used. Also, the EORTC questionnaires are regularly used in studies including patients with benign brain tumors, such as meningioma.[38-40] Whether these MIDs are also applicable to other types of brain tumors therefore remains to be determined. 
Another limitation in clinical trials that may have impacted the estimations of the MIDs is missing HRQOL data. In all three trials, compliance rates with HRQOL assessment decreased over time, which is likely related to the health status of the patient. Patients with a better health status may have been overrepresented during follow-up, hampering generalizability of our results. Although the correlations between the HRQOL scales and anchors were not strong, ie, between 0.2 and 0.4 for the change scores, estimates from the distribution-based method using the 0.3 SD cutoff were comparable to those derived with the anchor-based method. Currently, there is no consensus on which SD best approximates the MID,[18,31,32] but 0.2 SD has previously been suggested to reflect a clinically important small effect,[29] and 0.3 SD has been suggested to reflect an appropriate threshold for defining MIDs.[41] Also, the estimated MIDs were often within a relatively small range and in the expected direction of change similar to the anchor, and the final MID estimates were based on correlation weighted averages thereby mitigating the influence of weaker anchors, further supporting the use of the chosen anchors. Moreover, the anchors that were retained based on statistical considerations were subsequently reviewed by experts on their clinical plausibility to avoid spurious findings. However, not all (clinically) appropriate anchors may have been available to determine MIDs, since we used existing data of closed clinical trials. 
For example, we were not able to establish an MID for improvements in the cognitive functioning scale, although this is an important outcome for brain tumor patients, as over 80% of patients with primary or metastatic brain cancer have neurocognitive impairments at diagnosis.[42-44] In this study, PS and CTCAE neurological were used as clinical anchors for cognitive functioning, but change scores in cognitive functioning had weak correlations with change in the respective anchors, both −0.20. This may be because both anchors were physician-reported and are therefore subjective by nature. Although objective neurocognitive functioning is often measured in brain tumor trials, the correlation between objectively measured neurocognitive functioning and cognitive complaints as measured with a patient-reported outcome is generally poor.[45-47] This was also true for this study; MMSE was considered as a clinical anchor because it was measured in all trials, but its correlations with cognitive functioning were <0.30 and it was therefore excluded from further analyses. Using a patient-reported outcome to assess cognitive complaints, eg, the MOS (Medical Outcomes Study) Cognitive Functioning Scale,[48] may have been a more appropriate anchor. Otherwise, more objective anchors should be identified and used in clinical trials.
In conclusion, our study provides estimates of MIDs for group-level interpretation of the EORTC QLQ-C30 in glioma patients. In general, the estimates range between 4 and 11 points and correspond to small differences as proposed by Cocks et al. in the guidelines for interpretation of longitudinal HRQOL differences.[18] The current estimates can be used to better interpret results of clinical trials in glioma patients, and subsequently to inform patients, as well as for sample size calculations when planning future glioma clinical trials.
Advances to establish MIDs to evaluate changes in HRQOL on the individual patient level, which is relevant in clinical practice, are currently ongoing.[49]
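As an illustration of how a group-level MID could feed into a sample size calculation, the following Python sketch applies the standard normal-approximation formula for comparing two group means, n per group = 2 (z_(1-alpha/2) + z_(power))^2 SD^2 / MID^2. This is a generic textbook formula and not a procedure described in this study. The function name, the SD of 20 points, and the MID of 5 points are hypothetical values chosen only for the example.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(mid, sd, alpha=0.05, power=0.80):
    """Patients per arm needed to detect a between-group difference of `mid`.

    Assumptions: two equally sized arms, a common SD, a two-sided test,
    and the normal approximation (no adjustment for dropout or for
    repeated measurements).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / abs(mid)) ** 2)


# Hypothetical example: MID of 5 points, SD of 20 points.
print(n_per_group(mid=5, sd=20))  # 252 patients per arm
```

Because the required sample size scales with the inverse square of the MID, halving the MID roughly quadruples the number of patients needed, so the choice of scale-specific MID has a large effect on trial planning.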

44 in total

1. CBTRUS Statistical Report: Primary Brain and Central Nervous System Tumors Diagnosed in the United States in 2008-2012.

Authors: Quinn T Ostrom; Haley Gittleman; Jordonna Fulop; Max Liu; Rachel Blanda; Courtney Kromer; Yingli Wolinsky; Carol Kruchko; Jill S Barnholtz-Sloan
Journal: Neuro Oncol Date: 2015-10-27 Impact factor: 12.300

Review 2. Evidence-based guidelines for determination of sample size and interpretation of the European Organisation for the Research and Treatment of Cancer Quality of Life Questionnaire Core 30.

Authors: Kim Cocks; Madeleine T King; Galina Velikova; Marrissa Martyn St-James; Peter M Fayers; Julia M Brown
Journal: J Clin Oncol Date: 2010-11-22 Impact factor: 44.544

Review 3. Recommended methods for determining responsiveness and minimally important differences for patient-reported outcomes.

Authors: Dennis Revicki; Ron D Hays; David Cella; Jeff Sloan
Journal: J Clin Epidemiol Date: 2007-08-03 Impact factor: 6.437

4. Group Changes in Cognitive Performance After Surgery Mask Changes in Individual Patients with Glioblastoma.

Authors: Inge S van Loenen; Sophie J M Rijnen; Jimme Bruijn; Geert-Jan M Rutten; Karin Gehring; Margriet M Sitskoorn
Journal: World Neurosurg Date: 2018-06-07 Impact factor: 2.104

Review 5. Regression analysis for correlated data.

Authors: K Y Liang; S L Zeger
Journal: Annu Rev Public Health Date: 1993 Impact factor: 21.981

6. Validation of the M.D. Anderson Symptom Inventory Brain Tumor Module (MDASI-BT).

Authors: T S Armstrong; T Mendoza; I Gning; I Gring; C Coco; M Z Cohen; L Eriksen; Ming-Ann Hsu; M R Gilbert; C Cleeland
Journal: J Neurooncol Date: 2006-04-06 Impact factor: 4.130

7. Interpreting the significance of changes in health-related quality-of-life scores.

Authors: D Osoba; G Rodrigues; J Myles; B Zee; J Pater
Journal: J Clin Oncol Date: 1998-01 Impact factor: 44.544

8. The Functional Assessment of Cancer Therapy scale: development and validation of the general measure.

Authors: D F Cella; D S Tulsky; G Gray; B Sarafian; E Linn; A Bonomi; M Silberman; S B Yellen; P Winicour; J Brannon
Journal: J Clin Oncol Date: 1993-03 Impact factor: 44.544

9. Neurocognitive function and progression in patients with brain metastases treated with whole-brain radiation and motexafin gadolinium: results of a randomized phase III trial.

Authors: Christina A Meyers; Jennifer A Smith; Andrea Bezjak; Minesh P Mehta; James Liebmann; Tim Illidge; Ian Kunkler; Jean-Michel Caudrelier; Peter D Eisenberg; Jacobus Meerwaldt; Ross Siemers; Christian Carrie; Laurie E Gaspar; Walter Curran; See-Chun Phan; Richard A Miller; Markus F Renschler
Journal: J Clin Oncol Date: 2004-01-01 Impact factor: 44.544

10. Efficacy of depatuxizumab mafodotin (ABT-414) monotherapy in patients with EGFR-amplified, recurrent glioblastoma: results from a multi-center, international study.

Authors: Martin van den Bent; Hui K Gan; Andrew B Lassman; Priya Kumthekar; Ryan Merrell; Nicholas Butowski; Zarnie Lwin; Tom Mikkelsen; Louis B Nabors; Kyriakos P Papadopoulos; Marta Penas-Prado; John Simes; Helen Wheeler; Tobias Walbert; Andrew M Scott; Erica Gomez; Ho-Jin Lee; Lisa Roberts-Rapp; Hao Xiong; Earle Bain; Peter J Ansell; Kyle D Holen; David Maag; David A Reardon
Journal: Cancer Chemother Pharmacol Date: 2017-10-26 Impact factor: 3.333
