Andreas Roposch¹, John H Wedge, Georg Riedl. ¹Department of Orthopaedic Surgery, Great Ormond Street Hospital for Children, Institute of Child Health, University College London, London WC1N 3JH, England, UK. a.roposch@ich.ucl.ac.uk
Abstract
BACKGROUND: Osteonecrosis is perhaps the most important serious complication after treatment of developmental dysplasia of the hip (DDH). The classification by Bucholz and Ogden has been used most frequently for grading osteonecrosis in this context, but its reliability is not established and unreliability could affect the validity of studies reporting the outcome of treatment.
QUESTIONS/PURPOSE: We established the interrater and intrarater reliabilities of this classification and analyzed the frequency and nature of disagreements.
METHODS: Three pediatric hip surgeons, a musculoskeletal pediatric radiologist, and three orthopaedic trainees graded 39 radiographs (hips) according to the Bucholz and Ogden classification, blinded to any clinical data. Ratings were repeated after 2 weeks. Interrater reliability and intrarater reliability were determined using the simple kappa statistic. Grading was compared among raters, the nature and frequency of disagreements established, and subgroup analyses performed.
RESULTS: Interrater reliability was 0.34 (95% CI, 0.28 to 0.40) for all raters and 0.31 (95% CI, 0.20 to 0.43) for the three surgeons. The best interrater reliability was observed between the radiologist and a surgeon, with a kappa of 0.51 (95% CI, 0.30 to 0.72). Intrarater reliability estimates ranged from 0.44 to 0.69. Raters disagreed regarding the grade of osteonecrosis in 26 of 39 hips (67%), with seven of 26 disagreements (27%) involving confusion between Grades I and II.
CONCLUSIONS: The interrater reliability was lower than expected, considering the raters' experience. Distinguishing between Grades I and II was the most frequently observed problem. We believe that the low reliability was a result of an ambiguous classification scheme rather than the variability among the raters. Outcome studies of DDH based on this classification should be interpreted with caution. We recommend the development of a new classification with better prognostic ability.
LEVEL OF EVIDENCE: Level III, diagnostic study. See the Guidelines for Authors for a complete description of levels of evidence.
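For readers unfamiliar with the simple (unweighted) kappa statistic named in the methods, the following is a minimal sketch of how it can be computed for one pair of raters. The function name `cohen_kappa` and the ten example grades are hypothetical and are not data from the study. The abstract does not state how the multi-rater estimate was pooled, so this sketch covers the two-rater case only.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted (simple) kappa for two raters grading the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the agreement expected by chance from each rater's grade frequencies.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must grade the same non-empty set of items")

    n = len(ratings_a)

    # Proportion of items on which the two raters gave the same grade.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of the raters' marginal proportions per grade.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    p_expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)

    if p_expected == 1.0:
        # Both raters used a single identical grade throughout; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")

    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical Bucholz and Ogden grades for ten hips (illustrative only).
rater_1 = ["I", "I", "II", "II", "III", "I", "II", "IV", "III", "II"]
rater_2 = ["I", "II", "II", "I", "III", "I", "II", "IV", "II", "II"]

kappa = cohen_kappa(rater_1, rater_2)
print(f"kappa = {kappa:.2f}")
```

Because kappa subtracts chance agreement, it is lower than the raw proportion of agreement whenever some agreement is expected by chance, which is why the reported kappa values cannot be read directly as percentages of matching grades.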