Adri T Apeldoorn 1, Hans van Helvoirt 2, Raymond W Ostelo 3, Hanneke Meihuizen 2, Steven J Kamper 4, Maurits W van Tulder 5, Henrica C W de Vet 6.
1. Department of Epidemiology and Biostatistics and the EMGO+ Institute for Health and Care Research, VU University Medical Centre, Amsterdam, The Netherlands & Rehabilitation Department, Medical Centre Alkmaar, The Netherlands.
2. Medical Back Neck Centre, The Hague, The Netherlands.
3. Department of Epidemiology and Biostatistics and the EMGO+ Institute for Health and Care Research, VU University Medical Centre & Department of Health Sciences, Faculty of Earth and Life Sciences, VU University Amsterdam, The Netherlands.
4. Department of Epidemiology and Biostatistics and the EMGO+ Institute for Health and Care Research, VU University Medical Centre, Amsterdam, The Netherlands & The George Institute, University of Sydney, Australia.
5. Department of Health Sciences and the EMGO+ Institute for Health and Care Research, Faculty of Earth and Life Sciences, VU University Amsterdam, The Netherlands.
6. Department of Epidemiology and Biostatistics and the EMGO+ Institute for Health and Care Research, VU University Medical Centre, Amsterdam, The Netherlands.
Abstract
STUDY DESIGN: Observational inter-rater reliability study.
OBJECTIVES: To examine: (1) the inter-rater reliability of a modified version of Delitto et al.'s classification-based algorithm for patients with low back pain; (2) the influence of different levels of familiarity with the system; and (3) the inter-rater reliability of algorithm decisions in patients who clearly fit into a subgroup (clear classifications) and those who do not (unclear classifications).
METHODS: Patients were examined twice on the same day by two of three participating physical therapists with different levels of familiarity with the system. Patients were classified into one of four classification groups. Raters were blind to the others' classification decision. In order to quantify the inter-rater reliability, percentages of agreement and Cohen's Kappa were calculated.
RESULTS: A total of 36 patients were included (clear classification n = 23; unclear classification n = 13). The overall rate of agreement was 53% and the Kappa value was 0·34 [95% confidence interval (CI): 0·11-0·57], which indicated only fair inter-rater reliability. Inter-rater reliability for patients with a clear classification (agreement 52%, Kappa value 0·29) was not higher than for patients with an unclear classification (agreement 54%, Kappa value 0·33). Familiarity with the system (i.e. trained with written instructions and previous research experience with the algorithm) did not improve the inter-rater reliability.
CONCLUSION: Our pilot study challenges the inter-rater reliability of the classification procedure in clinical practice. Therefore, more knowledge is needed about factors that affect the inter-rater reliability, in order to improve the clinical applicability of the classification scheme.
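The two statistics named in the methods, percentage of agreement and Cohen's Kappa, can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the category labels "A" to "D", and the eight paired ratings are hypothetical and chosen only to show the arithmetic. Kappa is computed in its standard unweighted form, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from each rater's marginal frequencies.

```python
from collections import Counter


def percent_agreement(rater_1, rater_2):
    """Proportion of patients on whom both raters chose the same category."""
    if len(rater_1) != len(rater_2) or not rater_1:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_1, rater_2))
    return matches / len(rater_1)


def cohens_kappa(rater_1, rater_2):
    """Unweighted Cohen's Kappa for two raters and nominal categories."""
    n = len(rater_1)
    p_o = percent_agreement(rater_1, rater_2)
    counts_1 = Counter(rater_1)
    counts_2 = Counter(rater_2)
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over every category either rater used.
    categories = set(counts_1) | set(counts_2)
    p_e = sum(counts_1[c] * counts_2[c] for c in categories) / (n * n)
    if p_e == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical paired classifications into four groups (illustration only,
# not data from the study).
rater_1 = ["A", "A", "B", "B", "C", "C", "D", "D"]
rater_2 = ["A", "B", "B", "B", "C", "D", "D", "A"]

print(percent_agreement(rater_1, rater_2))  # 0.625
print(cohens_kappa(rater_1, rater_2))       # 0.5
```

In this made-up example the raters agree on 5 of 8 patients (62.5%), but because 25% agreement would be expected by chance alone, Kappa is 0.5. The same correction explains why the 53% agreement reported above corresponds to a Kappa of only 0·34.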
Keywords:
Agreement; Classification; Low back pain; Physical therapy; Reliability; Subgrouping