Katya Strage (1), Stephen Stacey (1), Cyril Mauffrey (1), Joshua A Parry (2). (1) Department of Orthopaedics, Denver Health Medical Center, Denver Health, 777 Bannock St, MC 0188, Denver, CO 80204, USA. (2) Department of Orthopaedics, Denver Health Medical Center, Denver Health, 777 Bannock St, MC 0188, Denver, CO 80204, USA. Electronic address: Joshua.alan.parry@gmail.com.
Abstract
BACKGROUND: A measure of effect size, such as the observed difference (OD) and its 95% confidence interval (CI), is necessary to determine the clinical relevance (CR) of research findings. The purpose of this paper was to (1) determine the interobserver reliability (IOR) of judging CR when presented with only the OD and CI and (2) determine whether the ratio of OD over CI (OD/CI) had a stronger association with CR than the p-value. METHODS: A survey including the OD and CI results from 21 studies was sent to 36 physicians. Respondents were asked to determine whether each result was clinically relevant or not clinically relevant. RESULTS: Twenty-one (58%) of the 36 physicians responded. The IOR of interpreting CR based on the OD and CI was weak (kappa = 0.13, CI 0.10 to 0.15). The p-value did not differ between CR and non-CR results (median difference -0.001, CI -0.005 to 0.0, p = 0.07). The OD/CI, however, was greater for CR than for non-CR results (median difference 0.5, CI 0.09 to 0.95, p = 0.02). The areas under the receiver operating characteristic curve for the p-value and the OD/CI were 0.70 and 0.80, respectively. The cutoffs that maximized sensitivity (SN) and specificity (SP) for identifying CR were 0.001 for the p-value (SN 88%, SP 59%) and 0.95 for the OD/CI (SN 88%, SP 84%). CONCLUSION: Determining CR from the OD and CI alone had weak interobserver reliability. The OD/CI ratio had a stronger association with CR than the p-value, making it potentially useful in evaluating the CR of research findings.
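The abstract does not spell out how the OD/CI ratio is computed or how a cutoff is applied, so the following is only a minimal sketch under stated assumptions. It assumes the denominator is the width of the 95% CI (upper bound minus lower bound), that the absolute value of the OD is used, and that a result is classed as CR when its ratio is at or above the cutoff. The function names `od_ci_ratio` and `sensitivity_specificity` are hypothetical, and the example inputs are illustrative rather than data from the study.

```python
def od_ci_ratio(od, ci_lower, ci_upper):
    """Ratio of the observed difference to the CI width.

    Assumption: "CI" in OD/CI means the width of the 95% interval,
    and the magnitude (absolute value) of the OD is used.
    """
    width = ci_upper - ci_lower
    if width <= 0:
        raise ValueError("ci_upper must be greater than ci_lower")
    return abs(od) / width


def sensitivity_specificity(ratios, is_relevant, cutoff):
    """Sensitivity and specificity of the rule `ratio >= cutoff`.

    Assumption: results at or above the cutoff are predicted to be
    clinically relevant; `is_relevant` holds the reference labels.
    """
    pairs = list(zip(ratios, is_relevant))
    tp = sum(1 for r, y in pairs if y and r >= cutoff)
    fn = sum(1 for r, y in pairs if y and r < cutoff)
    tn = sum(1 for r, y in pairs if not y and r < cutoff)
    fp = sum(1 for r, y in pairs if not y and r >= cutoff)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Illustrative values only (not taken from the study).
example_ratio = od_ci_ratio(od=2.0, ci_lower=1.0, ci_upper=3.0)
example_sn_sp = sensitivity_specificity(
    ratios=[1.2, 0.5, 1.0, 0.3],
    is_relevant=[True, True, False, False],
    cutoff=0.95,  # the OD/CI cutoff reported in the abstract
)
```

Under the width assumption, a ratio near 1 means the observed difference is about as large as the full span of its confidence interval. If the authors defined the denominator differently, the reported cutoff of 0.95 would not carry over to this sketch.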