Steven MacLennan1,2, Paula R Williamson3. 1. Academic Urology Unit, Institute of Applied Health Sciences, University of Aberdeen, Aberdeen, UK. 2. European Association of Urology Guidelines Office Methodology Committee, Arnhem, The Netherlands. 3. MRC North West Hub for Trials Methodology Research, University of Liverpool and Liverpool Health Partners, Liverpool, UK.
Choosing outcomes relevant to patients and healthcare professionals is essential if clinical trial results are to be translated into practice. A frequent frustration encountered in summarising the clinical effectiveness of treatments for urological cancers is heterogeneity in the outcomes reported. This refers to two interrelated problems: inconsistency, where different outcomes are reported across different studies, and variability, where the same outcomes are reported across studies but are defined or measured differently. Outcome inconsistency and variability make comparing, contrasting, synthesising and interpreting the results of different studies on the same topic more complicated than it ought to be. The implications of outcome reporting heterogeneity come into sharp relief in systematic reviews of interventions. For instance, in a cohort of Cochrane systematic reviews, 40% of reviewers noted problems due to outcome inconsistency (1). This problem is of particular importance when meta-analysis (the statistical pooling of aggregated data from two or more studies, providing more power and precision) is either not possible or, worse, done inappropriately regardless. Systematic reviews are a cornerstone of the Evidence Based Medicine (EBM) movement and are an essential component in creating clinical practice guideline (CPG) recommendations, which inform patient, clinician and policy decision-making. To arrive at CPG treatment recommendations, urology guideline-making bodies such as the European Association of Urology (EAU) rely on published or commissioned systematic reviews, ideally of RCTs, but frequently incorporating various study designs.
There are numerous examples throughout urological oncology of outcome reporting heterogeneity hindering guideline panels from making evidence-based recommendations; some examples from prostate and bladder cancer are outlined below.
For clinical trial results to be useful, the views of a variety of stakeholders, particularly patients, must be considered. Localised prostate cancer patients and health care professionals alike prioritise oncological outcomes, such as survival and progression, as well as health related quality of life (HRQoL) outcomes, encompassing bowel, urinary and sexual function, as among the most important outcomes to be measured in prostate cancer research (2,3). Given the lack of evidence of oncological superiority among currently available localised prostate cancer treatments, the EAU prostate cancer guideline panel commissioned a systematic review to ascertain HRQoL and functional outcomes after any intervention for localised prostate cancer (4). Across the 18 included studies, numerous patient-reported outcome measures (PROMs) were found purporting to assess HRQoL and various aspects of the functional outcomes (e.g., EPIC, UCLA-PCI, EORTC QLQ-C30/PR25, PCOS). Not every tool covers every functional domain and the measurement scales are often non-commensurable. Meta-analysis was not possible, and interpretation was difficult. Similarly, in a Health Technology Assessment including 54 studies comparing laparoscopic with robotic prostatectomy, Ramsay et al. (5) (page 84) noted that a “…specific methodological limitation that frustrated pooled analysis was the use of differing definitions and measures of functional outcomes for both urinary and erectile dysfunction. The variety of different ways of measuring dysfunction reduced the ability to compare data or to conduct a comprehensive meta-analysis”.
A further review of PROMs used in prostate cancer research found that 15 disease-specific or generic tools had been used and that none were established enough to be relevant for long term prostate cancer survivorship (6). These three examples all point to the same conclusion: it is difficult to say anything meaningful about the comparative effectiveness of the variety of available treatments for localised prostate cancer on long term functional outcomes.
Two systematic reviews commissioned by the EAU Guidelines Office muscle invasive bladder cancer (MIBC) panel demonstrated outcome inconsistency and variability. Veskimäe et al. (7) reviewed oncological and functional outcomes of pelvic organ sparing radical cystectomy compared with standard radical cystectomy in women undergoing curative surgery and orthotopic neobladder substitution for bladder cancer. In the main results table, the column headers are ‘local recurrence’, ‘time to local recurrence’, ‘metastatic recurrence’, ‘disease specific survival’ and ‘overall survival’. Of the 15 included studies reporting on oncological outcomes, only one reported all these outcomes, and no single oncological outcome was reported in all the studies. Indeed, ‘not reported’ is by far the most frequently occurring entry in the table’s cells. No meta-analysis was possible, and a cumbersome narrative synthesis was required. Hernández et al. (8) systematically reviewed sexual function preserving cystectomy versus standard radical cystectomy in men. Continence was frequently measured across the 12 included studies, but the time point of measurement (6, 1–9, 12, 48 months) and the method of measurement [self-reported, number of pads, pad test (volume), voiding diary, and Bladder Cancer Index questionnaire] were variable.
The data relating to these outcomes are qualitatively different from each other, so statistically combining them is not possible, and even if standardisation were possible, the differing time points would make a pooled average statistic meaningless and misleading.
The previous examples have been situated in the context of difficulties in recommendation-making for guideline panels. However, outcome heterogeneity is also problematic for initiatives seeking to drive improvement by benchmarking value across providers, such as the International Consortium for Health Outcome Measurement (ICHOM) (3,9), and for big data projects such as Prostate PIONEER, where consistency in outcome definitions and measurements is a key consideration in harmonising data from multiple sources such as RCTs, observational studies, institutional registries, and claims databases (10).
Core outcome sets (COSs) are a solution to heterogeneity in outcome reporting. A COS is an agreed standardised collection of outcomes which should be reported as a minimum in all trials for a specific clinical area (11). The Core Outcome Measures for Effectiveness Trials (COMET) initiative is a hub for COS development. It provides guidance, develops methodological standards (12) and reporting guidelines (13,14), and maintains a searchable database of ongoing and completed COS projects. Importantly, COMET note that where a COS exists, researchers should not feel restricted to measuring only the core outcomes; other outcomes may be collected in addition.
The development of a COS proceeds by first identifying what outcomes are important, then how they should be measured. An essential initial step is to define the COS scope, for which the first three elements of the Patient Intervention Comparator Outcomes (PICO) structure are useful. Consideration should also be given to the intended end uses of the COS, for example in RCTs only, routine care only, or both.
Then, the COMET database (http://www.comet-initiative.org/Studies) should be checked to avoid duplication of effort or to potentially offer collaboration where an ongoing COS is already registered.
If a new COS is to be initiated then a study protocol should be developed according to the Core Outcome Set Standard Protocol (COS-STAP) items statement and the study registered with COMET (14). Next, various research methods such as systematic reviews and stakeholder interviews are used to generate a comprehensive list of outcomes potentially important to a variety of stakeholders, such as surgeons, oncologists, nurses and patients. A consensus exercise to prioritise the most important outcomes is then recommended, using methods such as a Delphi survey and often culminating in a face-to-face consensus meeting, where a final list of core outcomes is agreed and ratified (11). Finally, the COS is usually reported in a journal manuscript, and reporting according to the Core Outcome Set Standards for Reporting (COS-STAR) statement (13) is encouraged. Attention should then turn to uptake of the COS among the clinical and research communities. However, implementation is rarely linear, and other efforts, such as encouraging uptake through endorsement from journal editors, trial funders, trial registries and regulatory authorities, are likely to be more successful than publication alone.
Once the core outcomes are known, further review and consensus work is required to assess and recommend the most appropriate definitions and measurement tools. Definitions for clinician reported outcomes may require reviewing existing definitions used in published studies and seeking consensus from stakeholders on which of those definitions is most appropriate, as well as on other considerations such as the timepoint(s) of measurement. For PROMs, additional work on assessing the psychometric properties and feasibility of use is required prior to further consensus work.
Extensive guidance on systematically reviewing and evaluating PROMs is available from the COnsensus-based Standards for the selection of health Measurement Instruments (COSMIN) group (15,16).
Although COS require effort to develop and promote, and to engage the clinical specialty community, experience from other disciplines demonstrates that improvement is possible. For instance, the Outcome Measures in Rheumatology (OMERACT) group involved a variety of stakeholders in identifying and prioritising outcomes in trials of antirheumatic drugs for rheumatoid arthritis (RA). OMERACT published their first RA COS in 1994 (17). Since then there has been a consistent increase in the use of the RA COS: 81% of registered and completed RA trials now report the COS, and selective outcome reporting has decreased (18). Also, the CoRe Outcomes in Women’s and Newborn health (CROWN) initiative have demonstrated a targeted and organised approach to implementing their COS by facilitating a consortium of over 80 gynaecology-obstetrics related journals to publicise relevant COS through editorials, publish the various COS, and endorse the use of COS in studies submitted to their journals (19). Finally, the Cochrane Skin group, who manage Cochrane reviews in the dermatology setting, have embedded a COS initiative within their review group which aims to facilitate implementation (20).
Outcome reporting heterogeneity is problematic for urology cancer research and decision-making at many levels. COS are proposed as a solution. Fortunately, a supportive methodological hub and a wealth of applied research examples are available. Initiatives from other disciplines show that improvement is possible and there is no reason why the urology community cannot organise in a concerted effort to reduce outcome reporting inconsistency and variability, and so reduce waste in research.
Authors: Cecilia A C Prinsen; Phyllis I Spuls; Jan Kottner; Kim S Thomas; Christian Apfelbacher; Joanne R Chalmers; Stefanie Deckert; Masutaka Furue; Louise Gerbens; Jamie Kirkham; Eric L Simpson; Murad Alam; Katrin Balzer; Dimitri Beeckman; Viktoria Eleftheriadou; Khaled Ezzedine; Sophie E R Horbach; John R Ingram; Alison M Layton; Karsten Weller; Thomas Wild; Albert Wolkerstorfer; Hywel C Williams; Jochen Schmitt Journal: J Am Acad Dermatol Date: 2019-03-13 Impact factor: 11.527
Authors: Steven MacLennan; Paula R Williamson; Hanneke Bekema; Marion Campbell; Craig Ramsay; James N'Dow; Sara MacLennan; Luke Vale; Philipp Dahm; Nicolas Mottet; Thomas Lam Journal: BJU Int Date: 2017-05-03 Impact factor: 5.588
Authors: Michael Lardas; Matthew Liew; Roderick C van den Bergh; Maria De Santis; Joaquim Bellmunt; Thomas Van den Broeck; Philip Cornford; Marcus G Cumberbatch; Nicola Fossati; Tobias Gross; Ann M Henry; Michel Bolla; Erik Briers; Steven Joniau; Thomas B Lam; Malcolm D Mason; Nicolas Mottet; Henk G van der Poel; Olivier Rouvière; Ivo G Schoots; Thomas Wiegel; Peter-Paul M Willemse; Cathy Yuhong Yuan; Liam Bourke Journal: Eur Urol Date: 2017-07-27 Impact factor: 20.096
Authors: M Boers; P Tugwell; D T Felson; P L van Riel; J R Kirwan; J P Edmonds; J S Smolen; N Khaltaev; K D Muirden Journal: J Rheumatol Suppl Date: 1994-09
Authors: C Ramsay; R Pickard; C Robertson; A Close; L Vale; N Armstrong; D A Barocas; C G Eden; C Fraser; T Gurung; D Jenkinson; X Jia; T B Lam; G Mowatt; D E Neal; M C Robinson; J Royle; S P Rushton; P Sharma; M D F Shirley; N Soomro Journal: Health Technol Assess Date: 2012 Impact factor: 4.014
Authors: Jamie J Kirkham; Katherine Davis; Douglas G Altman; Jane M Blazeby; Mike Clarke; Sean Tunis; Paula R Williamson Journal: PLoS Med Date: 2017-11-16 Impact factor: 11.069
Authors: Jamie J Kirkham; Sarah Gorst; Douglas G Altman; Jane M Blazeby; Mike Clarke; Declan Devane; Elizabeth Gargon; David Moher; Jochen Schmitt; Peter Tugwell; Sean Tunis; Paula R Williamson Journal: PLoS Med Date: 2016-10-18 Impact factor: 11.069
Authors: Cecilia A C Prinsen; Sunita Vohra; Michael R Rose; Maarten Boers; Peter Tugwell; Mike Clarke; Paula R Williamson; Caroline B Terwee Journal: Trials Date: 2016-09-13 Impact factor: 2.279