Reliability of Oncology Value Framework Outputs: Concordance Between Independent Research Groups.

Joseph C Del Paggio1, Sierra Cheng1, Christopher M Booth2,3, Matthew C Cheung1, Kelvin K W Chan1,4.   

Abstract

Research groups are increasingly utilizing value frameworks, but little is known of their reliability. To assess framework concordance and interrater reliability between two major value frameworks currently in use, we identified all previously published datasets containing both scores from the American Society of Clinical Oncology Value Framework (ASCO-VF) and grades from the European Society for Medical Oncology-Magnitude of Clinical Benefit Scale (ESMO-MCBS). The intraclass correlation coefficient (ICC) was used to assess interrater reliability. Four eligible studies contained drugs evaluated by both value frameworks, resulting in a dataset of 39 grades/scores for discrete drug indications. ICC was 0.82 (95% confidence interval = 0.70 to 0.90) for ASCO-VF and 0.88 (95% confidence interval = 0.80 to 0.93) for ESMO-MCBS. Absolute concordance was found to be 5% for ASCO-VF and 44% for ESMO-MCBS, increasing to 74% and 80% when deviations within 20 points and 1 grade were considered, respectively. Interrater reliability of ASCO-VF and ESMO-MCBS is, therefore, near perfect, while absolute concordance is poor. This has implications when considering framework outputs in drug funding or treatment decision making.

Year:  2018        PMID: 31360865      PMCID: PMC6650061          DOI: 10.1093/jncics/pky050

Source DB:  PubMed          Journal:  JNCI Cancer Spectr        ISSN: 2515-5091


Clinical value can be challenging to quantify, so value frameworks have been created to objectify therapeutic benefit. Various cohort studies have attempted to evaluate the congruity between two important frameworks, the American Society of Clinical Oncology Value Framework (ASCO-VF) (1) and the European Society for Medical Oncology-Magnitude of Clinical Benefit Scale (ESMO-MCBS) (2), finding discrepancy in their outputs (3) as well as overall fair correlation (4–7). Their reliability, however, is unclear: how consistent are framework outputs when calculated by independent users? The aim of this study was to assess interrater reliability and absolute concordance of ASCO and ESMO frameworks across research groups.

A PubMed search was performed on August 30, 2017, to identify all studies that scored/graded trials based on the ASCO-VF and ESMO-MCBS. Studies were included if both ASCO scores and ESMO grades were systematically applied to a cohort of anticancer therapies, and outputs (ie, scores/grades) were either 1) included in the publication and/or its supplementary appendix, or 2) available to the authorship group because of direct study involvement. All corresponding trials for drugs subjected to framework analysis were tabulated. Individual ASCO-VF scores and ESMO-MCBS grades assigned by the authors to each respective drug were abstracted by two authors (S.C. and J.C.D.P.); any discrepancies were resolved. Inclusion criteria for the final dataset consisted of drugs for specific indications that had both ASCO scores and ESMO grades, rated independently by at least two different study groups.

The intraclass correlation coefficient (ICC) assesses the degree of consistency between ordinal, interval/continuous, and ratio variables and is defined as the ratio of variability between subjects (ie, collated framework outputs) to total variability (8). ICC values range from slight agreement (0.0–0.2) to near-perfect agreement (0.81–1.0) (9).
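As an illustration of the ICC definition above (between-subject variability over total variability), the following is a minimal sketch rather than the authors' analysis, which was run in R. The text does not state which ICC form was used, so a one-way random-effects model that tolerates a varying number of raters per indication is assumed here. The function names `icc_oneway` and `agreement_label` are hypothetical, and the intermediate band labels ("fair", "moderate", "substantial") follow the conventional banding and are an assumption, since the text names only the two extreme bands. The example scores are ASCO-VF values read from a few rows of Table 1 and are not expected to reproduce the published ICC.

```python
from statistics import mean


def icc_oneway(ratings):
    """One-way random-effects ICC (assumed form; not confirmed by the paper).

    `ratings` holds one list per subject (drug indication), containing the
    scores from whichever groups rated it. Group sizes may differ.
    """
    groups = [g for g in ratings if g]
    a = len(groups)                               # number of subjects
    n_total = sum(len(g) for g in groups)         # number of ratings
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (a - 1)
    ms_within = ss_within / (n_total - a)
    # Effective number of raters per subject for unbalanced data.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (a - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


def agreement_label(icc):
    """Map an ICC to agreement bands (intermediate labels are assumed)."""
    if icc < 0:
        return "poor"
    if icc <= 0.20:
        return "slight"
    if icc <= 0.40:
        return "fair"
    if icc <= 0.60:
        return "moderate"
    if icc <= 0.80:
        return "substantial"
    return "near-perfect"


# ASCO-VF scores for five indications, read from Table 1 (illustrative subset).
example = [
    [47.7, 31.7, 30.6, 31],    # afatinib
    [18.5, 18.3, 18.0, 17],    # eribulin mesylate
    [59.2, 59.2, 59.2],        # lenvatinib
    [3.5, 3.4, 4.4, 3],        # regorafenib
    [15.6, 66.5, 49.7],        # vemurafenib
]
example_icc = icc_oneway(example)
print(round(example_icc, 2), agreement_label(example_icc))
```

Applied to the reported estimates, `agreement_label(0.82)` and `agreement_label(0.88)` both fall in the top band, which is how the text characterizes them.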
ICC was assessed for the entire study cohort, as well as for the subset where scores/grades were derived from a single trial. Absolute concordance was assessed by calculating the frequency of drug indications with identical scores/grades across all assigned scores/grades. As a sensitivity analysis, concordance was also derived from the cohort of drug indications scored/graded from single trials. The frequencies of deviation in scores/grades were calculated among raters within ±10 points and ±20 points and within ±1 grade and ±2 grades for ASCO-VF and ESMO-MCBS, respectively. Statistical analyses were conducted with R 3.3.1 (R Project, Vienna, Austria).

The initial search strategy yielded 40 articles, 36 of which did not meet the inclusion criteria (Figure 1). The final cohort consisted of four studies (10%) suitable for data extraction and agreement analysis, two of which were studies previously published by members of this authorship group (5,6). All four studies utilized the revised ASCO-VF (1) and version 1.0 of ESMO-MCBS (2).
Figure 1.

Identification of cohort studies to which American Society of Clinical Oncology (ASCO) scores and European Society for Medical Oncology (ESMO) grades were systematically applied. *ESMO-MCBS framework paper grades were included in the correlative analysis, where applicable.

The final cohort comprised 39 indications from 36 drugs (Table 1). Approximately 25% of these drug/indication combinations (10/39) consisted of scores/grades derived from more than one trial. A sample of grades from the ESMO authorship group was included in the cohort (2); no “gold-standard” scores were applicable from the ASCO authorship group.
Table 1.

Trials of drugs evaluated by the American Society of Clinical Oncology (ASCO) Value Framework and the European Society for Medical Oncology (ESMO)-Magnitude of Clinical Benefit Scale by four independent research publications, as well as grades denoted by the ESMO framework authorship group, where applicable∗

| Drug | Indication | PMID(s) for framework outputs | Becker et al. (4) ASCO | Becker et al. (4) ESMO | Vivot et al. (3) ASCO | Vivot et al. (3) ESMO | Cheng et al. (6) ASCO | Cheng et al. (6) ESMO | Del Paggio et al. (5) ASCO | Del Paggio et al. (5) ESMO | ESMO authorship grade (2) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Abiraterone acetate | Second-line treatment of prostate cancer | 22995653, 21612468 | 23 | 4 | 34.6 | 4 | 34.3 | 4 | X | X | 4 |
| Ado-trastuzumab emtansine | Second-line treatment of HER2-positive breast cancer | 23020162 | 33.7 | 5 | 62.4 | 5 | 36.4 | 5 | 45 | 5 | 5 |
| Afatinib | First-line treatment of non-small cell lung cancer with EGFR mutations | 23816960 | 47.7 | 4 | 31.7 | 4 | 30.6 | 4 | 31 | 4 | 4 |
| Bevacizumab | First-line treatment of colorectal cancer | 15175435 | 23.5 | 1 | 31 | 3 | X | X | X | X | 3 |
| Cabazitaxel | Second-line treatment of prostate cancer | 20888992 | 45.2 | 2 | 40.5 | 2 | 25.2 | 2 | X | X | 2 |
| Cabozantinib | Second-line treatment of medullary thyroid cancer | 24002501 | 55.1 | 3 | 37.6 | 2 | −17.3 | 3 | X | X | X |
| Cobimetinib | First-line treatment of BRAF-mutated melanoma | 25265494 | 28.7 | 3 | 52.2 | 4 | X | X | X | X | 4 |
| Crizotinib | Second-line treatment of ALK-positive lung cancer | 23724913 | 70 | 4 | X | X | 65.2 | 4 | 59 | 4 | 4 |
| Dabrafenib | First-line treatment of BRAF-mutated melanoma | 22735384 | 52 | 3 | 59.6 | 4 | 52.0 | 3 | X | X | 4 |
| Enzalutamide | Second-line treatment of prostate cancer | 22894553 | 59.7 | 4 | 52.6 | 4 | 40.6 | 4 | X | X | X |
| Eribulin mesylate | Third-line treatment of breast cancer | 21376385 | 18.5 | 2 | 18.3 | 2 | 18.0 | 2 | 17 | 2 | 2 |
| Erlotinib | Second-line treatment of non-small cell lung cancer | 16014882, 20493771 | 47.9 | 3 | 42.1 | 4 | X | X | X | X | 1 |
| Everolimus | First-line treatment of renal cell carcinoma | 18653228, 20549832 | 49.6 | 3 | 32.8 | 3 | 33.6 | 3 | X | X | 3 |
| Everolimus | Second-line treatment of postmenopausal hormone-receptor-positive breast cancer | 22149876 | X | X | X | X | 33.2 | 2 | 30 | 2 | X |
| Ipilimumab | Second-line treatment of melanoma | 20525992 | 32 | 2 | 28.9 | 4 | 37.9 | 4 | X | X | 4 |
| Ixabepilone | Second-line treatment of breast cancer | 17968020 | 35.5 | 0 | 18.3 | 2 | 12.9 | 2 | X | X | X |
| Lapatinib | Second-line treatment of breast cancer | 19786658, 20124187, 22689807 | 34.9 | 1 | 54.6 | 2 | 21.7 | 1 | 26 | 4 | 4 |
| Lenvatinib | Second-line treatment of thyroid cancer | 25671254 | 59.2 | 2 | 59.2 | 2 | 59.2 | 2 | X | X | X |
| Necitumumab | First-line treatment of squamous non-small cell lung cancer | 26045340 | 12.0 | 2 | 11.3 | 2 | X | X | 14 | 2 | X |
| Nivolumab | Second-line treatment of melanoma | 25795410 | 30.2 | 2 | 22.0 | 3 | 22.4 | 2 | X | X | X |
| Nivolumab | Second-line treatment of squamous non-small cell lung cancer | 26028407 | X | X | X | X | 73.1 | 5 | 61 | 4 | X |
| Paclitaxel protein bound | First-line treatment of non-small cell lung cancer | 22547591 | 61.2 | 4 | X | X | 22.2 | 4 | 29 | 1 | X |
| Panitumumab | Second-line treatment of EGFR-expressing colorectal cancer | 17470858, 20921462 | 34.3 | 2 | 19.8 | 1 | 19.8 | 2 | X | X | 2 |
| Pazopanib | First-line treatment of renal cell carcinoma | 20100962 | 43.2 | 2 | 46.3 | 3 | 42.4 | 2 | X | X | 3 |
| Pemetrexed for injection | First-line treatment of mesothelioma | 12860938 | 10 | 2 | 17.6 | 3 | X | X | X | X | X |
| Pertuzumab | First-line treatment of HER2-positive breast cancer | 23602601, 25693012, 22149875, 23868905 | 38.9 | 5 | 46.8 | 4 | 38.9 | 4 | 31.7 | 4 | 4 |
| Radium 223 dichloride | Second-line treatment of prostate cancer | 23863050, 24836273 | 60.5 | 3 | 60.9 | 5 | 39.8 | 5 | X | X | 5 |
| Ramucirumab | Second-line treatment of gastroesophageal cancer | 24094768 | 38.8 | 1 | 39.4 | 1 | 30.6 | 1 | X | X | 2 |
| Regorafenib | Third-line treatment of colorectal cancer | 23177514 | 3.5 | 1 | 3.4 | 1 | 4.4 | 1 | 3 | 1 | 1 |
| Sorafenib | First-line treatment of renal cell carcinoma | 17215530 | 51.3 | 3 | 53.6 | 2 | X | X | X | X | 3 |
| Sunitinib | First-line treatment of renal cell carcinoma | 17215529, 19487381 | 57.4 | 4 | X | X | 46 | 4 | X | X | 4 |
| Sunitinib | Second-line treatment of gastrointestinal stromal tumor | 17046465 | 36.1 | 3 | 36.1 | 3 | X | X | X | X | 3 |
| Temsirolimus | First-line treatment of renal cell carcinoma | 17538086 | 25.1 | 3 | 21.6 | 4 | 22.7 | 4 | X | X | 4 |
| Trabectedin | Second-line treatment of liposarcoma or leiomyosarcoma | 26371143 | 0.4 | 3 | 14.6 | 3 | X | X | X | X | X |
| Trametinib | First-line treatment of BRAF-mutated melanoma | 23020132, 22663011 | 62.6 | 4 | 52.7 | 4 | 52.6 | 3 | X | X | 4 |
| Trifluridine and tipiracil | Third-line treatment of colorectal cancer | 25970050 | 28.2 | 3 | 49.8 | 2 | X | X | 29.2 | 2 | 2 |
| Vandetanib | First-line treatment of thyroid cancer | 22025146 | 37 | 3 | 45.7 | 3 | 8.3 | 3 | X | X | X |
| Vemurafenib | First-line treatment of BRAF-mutated melanoma | 21639808, 24508103 | 15.6 | 2 | 66.5 | 4 | 49.7 | 4 | X | X | 4 |
| Ziv-aflibercept | Second-line treatment of colorectal cancer | 22949147 | 15.9 | 1 | 16.6 | 1 | 15.7 | 1 | 15 | 1 | 1 |

*X = no score/grade assigned by authorship group. EGFR = epidermal growth factor receptor.

Interrater reliability was found to be near-perfect for both ASCO scores (0.82, 95% CI = 0.70 to 0.90) and ESMO grades (0.88, 95% CI = 0.80 to 0.93) in all settings (Table 2). When drug scores/grades derived from multiple trials were removed from analysis (n = 29 drug indications), ICCs were similar: 0.85 for ASCO-VF (95% CI = 0.72 to 0.92) and 0.90 for ESMO-MCBS (95% CI = 0.83 to 0.95). Using the ESMO-MCBS definition of “substantial clinical benefit” (ie, grades B, A, 4, or 5) to dichotomize authorship grades, ICC values were similar: 0.84 (95% CI = 0.74 to 0.90) for all drug grades and 0.89 (95% CI = 0.80 to 0.94) for drug grades derived from single randomized, controlled trials.
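The dichotomization by “substantial clinical benefit” can be sketched as below. This is only an illustration of the stated rule (grades B, A, 4, or 5 count as substantial), not the authors' code; the names `SUBSTANTIAL_GRADES` and `dichotomize` are hypothetical, and the example grades are ESMO-MCBS grades read from Table 1, including the ESMO authorship grade where one exists.

```python
# Grades that meet the ESMO-MCBS definition of substantial clinical benefit:
# A or B in curative studies, 4 or 5 in palliative studies.
SUBSTANTIAL_GRADES = {"A", "B", 4, 5}


def dichotomize(grades):
    """Return True for each grade that counts as substantial clinical benefit."""
    return [grade in SUBSTANTIAL_GRADES for grade in grades]


# ESMO-MCBS grades assigned by the different groups (from Table 1).
pertuzumab = [5, 4, 4, 4, 4]   # grades differ, classification does not
dabrafenib = [3, 4, 3, 4]      # grades differ and so does the classification

print(dichotomize(pertuzumab))
print(dichotomize(dabrafenib))
```

The two examples show why a dichotomized analysis can diverge from a grade-level one: a one-grade disagreement matters only when it straddles the 3/4 boundary.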
Table 2.

Interrater reliability of framework outputs between four independent authorship groups∗

| Framework | Setting | ICC (95% CI) |
|---|---|---|
| ASCO | All drugs scored (n = 39) | 0.82 (0.70 to 0.90) |
| ASCO | Only those drugs scored from a single trial (n = 29) | 0.85 (0.72 to 0.92) |
| ESMO | All grades (n = 39) | 0.88 (0.80 to 0.93) |
| ESMO | Only those drugs graded from a single trial (n = 29) | 0.90 (0.83 to 0.95) |

∗ASCO = American Society of Clinical Oncology; ESMO = European Society for Medical Oncology.

Absolute concordance among all authorship groups’ final ASCO scores and ESMO-MCBS grades was 5% and 44%, respectively (Table 3). For ASCO scores, concordance increased to 46% and 74% when respectively assessed within ±10 points and ±20 points; for ESMO grades, concordance increased to 80% and 90% when respectively assessed within ±1 grade and ±2 grades. For the sensitivity analysis, concordance was similar when derived from drug indications scored/graded from single trials (n = 29 drug indications) (Table 3).
Table 3.

Score/grade concordance and deviation frequencies for 39 evaluated drug indications as well as the 29 drug indications scored/graded from single trials∗

| Framework | Concordance between authorship groups | Deviance between authorship groups: within ±10 points or ±1 grade | Deviance between authorship groups: within ±20 points or ±2 grades |
|---|---|---|---|
| ASCO v2 Scores (all indications, n = 39) | 2 (5%) | 18 (46%) | 29 (74%) |
| ESMO-MCBS Grades (all indications, n = 39) | 17 (44%) | 31 (80%) | 35 (90%) |
| ASCO v2 Scores (single trial evaluations, n = 29) | 2 (7%) | 16 (55%) | 22 (76%) |
| ESMO-MCBS Grades (single trial evaluations, n = 29) | 14 (48%) | 25 (86%) | 28 (97%) |

∗ASCO = American Society of Clinical Oncology; ESMO-MCBS = European Society for Medical Oncology-Magnitude of Clinical Benefit Scale.
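The ASCO rows of Table 3 can be sketched as a simple count over the ASCO-VF scores in Table 1. The text does not define how "within ±10 points" is evaluated when more than two groups scored an indication, so this sketch assumes an indication qualifies when the difference between its highest and lowest score is at most the tolerance, and that the tolerance bands are cumulative. The name `agreement_counts` is hypothetical, and the scores below are as read from Table 1, with the flag marking indications scored from a single trial (one PMID).

```python
# (indication, scored from a single trial?, ASCO-VF scores from each group)
ASCO_SCORES = [
    ("Abiraterone acetate", False, [23, 34.6, 34.3]),
    ("Ado-trastuzumab emtansine", True, [33.7, 62.4, 36.4, 45]),
    ("Afatinib", True, [47.7, 31.7, 30.6, 31]),
    ("Bevacizumab", True, [23.5, 31]),
    ("Cabazitaxel", True, [45.2, 40.5, 25.2]),
    ("Cabozantinib", True, [55.1, 37.6, -17.3]),
    ("Cobimetinib", True, [28.7, 52.2]),
    ("Crizotinib", True, [70, 65.2, 59]),
    ("Dabrafenib", True, [52, 59.6, 52.0]),
    ("Enzalutamide", True, [59.7, 52.6, 40.6]),
    ("Eribulin mesylate", True, [18.5, 18.3, 18.0, 17]),
    ("Erlotinib", False, [47.9, 42.1]),
    ("Everolimus (renal)", False, [49.6, 32.8, 33.6]),
    ("Everolimus (breast)", True, [33.2, 30]),
    ("Ipilimumab", True, [32, 28.9, 37.9]),
    ("Ixabepilone", True, [35.5, 18.3, 12.9]),
    ("Lapatinib", False, [34.9, 54.6, 21.7, 26]),
    ("Lenvatinib", True, [59.2, 59.2, 59.2]),
    ("Necitumumab", True, [12.0, 11.3, 14]),
    ("Nivolumab (melanoma)", True, [30.2, 22.0, 22.4]),
    ("Nivolumab (lung)", True, [73.1, 61]),
    ("Paclitaxel protein bound", True, [61.2, 22.2, 29]),
    ("Panitumumab", False, [34.3, 19.8, 19.8]),
    ("Pazopanib", True, [43.2, 46.3, 42.4]),
    ("Pemetrexed", True, [10, 17.6]),
    ("Pertuzumab", False, [38.9, 46.8, 38.9, 31.7]),
    ("Radium 223 dichloride", False, [60.5, 60.9, 39.8]),
    ("Ramucirumab", True, [38.8, 39.4, 30.6]),
    ("Regorafenib", True, [3.5, 3.4, 4.4, 3]),
    ("Sorafenib", True, [51.3, 53.6]),
    ("Sunitinib (renal)", False, [57.4, 46]),
    ("Sunitinib (GIST)", True, [36.1, 36.1]),
    ("Temsirolimus", True, [25.1, 21.6, 22.7]),
    ("Trabectedin", True, [0.4, 14.6]),
    ("Trametinib", False, [62.6, 52.7, 52.6]),
    ("Trifluridine and tipiracil", True, [28.2, 49.8, 29.2]),
    ("Vandetanib", True, [37, 45.7, 8.3]),
    ("Vemurafenib", False, [15.6, 66.5, 49.7]),
    ("Ziv-aflibercept", True, [15.9, 16.6, 15.7, 15]),
]


def agreement_counts(rows, tolerances=(10, 20)):
    """Count indications with identical scores or a spread within each tolerance."""
    counts = {"n": len(rows), "identical": 0}
    counts.update({tol: 0 for tol in tolerances})
    for _label, _single_trial, scores in rows:
        # Round to absorb floating-point noise at the 10- and 20-point boundaries.
        spread = round(max(scores) - min(scores), 6)
        if spread == 0:
            counts["identical"] += 1
        for tol in tolerances:
            if spread <= tol:
                counts[tol] += 1
    return counts


all_indications = agreement_counts(ASCO_SCORES)
single_trial = agreement_counts([row for row in ASCO_SCORES if row[1]])
print(all_indications)
print(single_trial)
```

Under these assumptions the counts come out as 2, 18, and 29 of 39 indications and 2, 16, and 22 of 29 single-trial indications, in line with the ASCO rows of Table 3.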

The increasing interest in quantifying the value of anticancer agents is justified by mounting evidence that these approved drugs have little to no correlation between their cost and efficacy (5,10–12). As such, understanding framework reliability is essential. In this regard, we have found that: 1) interrater reliability by ICC is near perfect between users for both frameworks, while 2) absolute concordance is poor, but greater for ESMO-MCBS than ASCO-VF. Although studies have previously shown near-perfect agreement between users assessing a single cohort of trials (6,13), this is the first study, to our knowledge, that systematically assessed outputs from ASCO-VF and ESMO-MCBS across published works. We show similar findings with respect to ICC. Comparable concordance, however, is only established with deviations of 1 grade for ESMO-MCBS and 20 points for ASCO-VF. Although the clinical meaning of these deviations may be apparent with ESMO’s definition of “substantial benefit” (ie, grades 4 or 5 in palliative studies and B or A in curative studies) (2), a change in 20 points is ambiguous by ASCO’s scale, particularly given its condemnation of direct score comparisons (14). Nonetheless, support for this degree of deviation is found in our previously assessed cohort of 111 drug approval trials with a median ASCO score of 22 (6): a deviation of 20 points about this median (ie, 2 to 42) would exceed the interquartile range identified in this cohort (ie, 8–35). Overall, these degrees of imprecision in scores/grades may have significant clinical implications in framework utilization.
The differences in concordance found in this study are likely a consequence of each framework’s inherent scoring system: continuous ASCO-VF scores vs discrete ESMO-MCBS grades. Despite the concern that scores/grades derived from more than one trial (∼25% of the cohort) may result in reduced reliability, ICCs only marginally improved when these studies were removed (Table 2). Nevertheless, the discrepancy in absolute framework outputs remains problematic even when indications evaluated from more than one trial were removed (Table 3). Causes of discrepancy (Table 1) are likely the result of differing interpretations of toxicity and quality-of-life benefits, while improved concordance for ESMO-MCBS is likely due to its relatively finite outputs. Framework authors have recently made efforts to clarify technicalities (14,15), but future iterations should strongly consider the publication of formal measurement studies (eg, interrater reliability) prior to framework distribution in order to adequately evaluate measurement characteristics and minimize user errors in framework utilization.

This study has limitations, the most significant of which is its sample size; however, value frameworks, and studies utilizing them, are novel. Also, our score/grade deviations (ie, 1–2 and 10–20) are arbitrary, and changing these values would change our findings; nonetheless, these deviations were determined a priori and are felt to represent substantial differences in the respective framework outputs. Finally, given the evolving nature of these frameworks, the scores/grades culled from these published studies may be obsolete with future revisions. For example, since the time of our initial search strategy, the ESMO-MCBS group has published v1.1 of their framework (16); although grades across identical cohorts remain relatively stable (7,16), grading thresholds have been modified to aid in future framework revisions.
In conclusion, value frameworks are remarkably reliable when assessing trials of anticancer therapies. At present, however, the absolute concordance is poor, but greater for ESMO-MCBS than for ASCO-VF. This is an important consideration for users who place weight on absolute framework scores/grades, particularly if the outputs are used for public policy implementation and patient/doctor decision making.

Notes

Affiliations of authors: Department of Medicine, Division of Medical Oncology, University of Toronto, Toronto, Ontario, Canada (JCDP, SC, MCC, KKWC); Departments of Oncology and Public Health Sciences, Queen’s University, Kingston, Ontario, Canada (CMB); Division of Cancer Care and Epidemiology, Queen’s University Cancer Research Institute, Kingston, Ontario, Canada (CMB); Canadian Centre for Applied Research in Cancer Control, Toronto, Ontario, Canada (KKWC).

Disclosures

None.

Funding

The Canadian Centre for Applied Research in Cancer Control received core funding from the Canadian Cancer Society Research Institute (grant 2105-703549).