Michael W Sjoding (1,2), Colin R Cooke (1,2), Theodore J Iwashyna (1,3,4,5), Timothy P Hofer (1,2,3)
1. Department of Internal Medicine and
2. Institute for Healthcare Policy & Innovation, University of Michigan, Ann Arbor, Michigan.
3. VA Center for Clinical Management Research, Ann Arbor, Michigan.
4. Institute for Social Research, Ann Arbor, Michigan; and
5. Australian and New Zealand Intensive Care Research Centre, Department of Epidemiology and Preventive Medicine, Monash University, Melbourne, Victoria, Australia.
Abstract
RATIONALE: Identifying patients with acute respiratory distress syndrome (ARDS) is a recognized challenge. Experts often have only moderate agreement when applying the clinical definition of ARDS to patients. However, no study has fully examined the implications of low reliability measurement of ARDS on clinical studies.
OBJECTIVES: To investigate how the degree of variability in ARDS measurement commonly reported in clinical studies affects study power, the accuracy of treatment effect estimates, and the measured strength of risk factor associations.
METHODS: We examined the effect of ARDS measurement error in randomized clinical trials (RCTs) of ARDS-specific treatments and cohort studies using simulations. We varied the reliability of ARDS diagnosis, quantified as the interobserver reliability (κ-statistic) between two reviewers. In RCT simulations, patients identified as having ARDS were enrolled, and when measurement error was present, patients without ARDS could be enrolled. In cohort studies, risk factors as potential predictors were analyzed using reviewer-identified ARDS as the outcome variable.
MEASUREMENTS AND MAIN RESULTS: Lower reliability measurement of ARDS during patient enrollment in RCTs seriously degraded study power. Holding effect size constant, the sample size necessary to attain adequate statistical power increased by more than 50% as reliability declined, although the result was sensitive to ARDS prevalence. In a 1,400-patient clinical trial, the sample size necessary to maintain similar statistical power increased to over 1,900 when reliability declined from perfect to substantial (κ = 0.72). Lower reliability measurement diminished the apparent effectiveness of an ARDS-specific treatment from a 15.2% (95% confidence interval, 9.4-20.9%) absolute risk reduction in mortality to 10.9% (95% confidence interval, 4.7-16.2%) when reliability declined to moderate (κ = 0.51). In cohort studies, the effect on risk factor associations was similar.
CONCLUSIONS: ARDS measurement error can seriously degrade statistical power and effect size estimates of clinical studies. The reliability of ARDS measurement warrants careful attention in future ARDS clinical studies.
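The mechanism described in the METHODS and RESULTS can be illustrated with a small calculation. The sketch below is not the authors' simulation; it is a simplified analytic model under assumptions that do not appear in the abstract: two reviewers share the same sensitivity and specificity, their errors are independent given a patient's true status, and the treatment has no effect in enrolled patients who do not truly have ARDS. All function names and the example values for prevalence, sensitivity, and specificity are hypothetical.

```python
"""Illustrative sketch (assumed model, not the study's code): how imperfect
ARDS classification at enrollment could dilute a trial's measured effect."""


def rater_positive_rate(prevalence, sensitivity, specificity):
    """Probability that one reviewer labels a screened patient as ARDS."""
    return prevalence * sensitivity + (1 - prevalence) * (1 - specificity)


def cohen_kappa(prevalence, sensitivity, specificity):
    """Expected Cohen's kappa between two reviewers.

    Assumption: both reviewers have the same sensitivity and specificity and
    err independently of each other given the patient's true ARDS status.
    """
    q = rater_positive_rate(prevalence, sensitivity, specificity)
    # Observed agreement: both say "ARDS" or both say "not ARDS".
    p_obs = (prevalence * (sensitivity**2 + (1 - sensitivity) ** 2)
             + (1 - prevalence) * (specificity**2 + (1 - specificity) ** 2))
    # Agreement expected by chance from the marginal positive rate.
    p_exp = q**2 + (1 - q) ** 2
    if p_exp == 1:
        raise ValueError("kappa is undefined when every rating is identical")
    return (p_obs - p_exp) / (1 - p_exp)


def positive_predictive_value(prevalence, sensitivity, specificity):
    """Fraction of reviewer-identified (enrolled) patients who truly have ARDS."""
    q = rater_positive_rate(prevalence, sensitivity, specificity)
    return prevalence * sensitivity / q


def diluted_arr(true_arr, ppv):
    """Absolute risk reduction measured among enrolled patients.

    Assumption: the treatment benefits only true ARDS patients, so the
    measured effect is the true effect scaled by the PPV of enrollment.
    """
    return ppv * true_arr


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    prevalence, sensitivity, specificity = 0.5, 0.9, 0.9
    kappa = cohen_kappa(prevalence, sensitivity, specificity)
    ppv = positive_predictive_value(prevalence, sensitivity, specificity)
    print(f"kappa = {kappa:.2f}, PPV = {ppv:.2f}")
    print(f"measured ARR = {diluted_arr(0.152, ppv):.3f} (true ARR 0.152)")
```

Under this model, lowering sensitivity or specificity reduces both κ and the share of enrolled patients who truly have ARDS, which shrinks the measured risk reduction. Because prevalence enters both quantities, the same κ can correspond to different amounts of dilution, consistent with the abstract's note that results were sensitive to ARDS prevalence.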
Keywords:
acute lung injury; bias; clinical trial; diagnosis; observational study