Michael A Rotondi, Allan Donner. School of Kinesiology and Health Sciences, York University, Room 364, Norman Bethune College, 4700 Keele Street, Toronto, Ontario, Canada M3J 1P3. mrotondi@yorku.ca
Abstract
OBJECTIVE: Studies measuring interobserver agreement (reliability) are common in clinical practice, yet discussion of appropriate sample size estimation techniques is minimal compared with clinical trials. The authors propose a sample size estimation technique to achieve prespecified lower and upper limits of a confidence interval for the κ coefficient in studies of interobserver agreement.
STUDY DESIGN AND SETTING: The proposed technique can be used to design a study measuring interobserver agreement with any number of outcome categories and any number of raters. Potential application areas include pathology, psychiatry, dentistry, and physical therapy.
RESULTS: The technique is illustrated with two examples. The first considers a pilot study in oral radiology whose authors studied the reliability of the mandibular cortical index as measured by three dental professionals. The second examines the level of interobserver agreement among four nurses with respect to the five triage levels of the Canadian Triage and Acuity Scale.
CONCLUSION: This method should be useful in the planning stages of an interobserver agreement study in which the investigator would like to obtain a prespecified level of precision in the estimation of κ. An R package, kappaSize (R Foundation for Statistical Computing, Vienna, Austria), that implements the method is also provided.
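To illustrate the quantity whose precision the proposed method controls, the following Python sketch computes Cohen's κ for two raters with a binary outcome, together with a simple large-sample Wald confidence interval; a sample size procedure of the kind described in the abstract chooses n so that this interval attains prespecified limits. The function name and the simplified standard-error formula are illustrative assumptions, not the authors' exact variance expression or the kappaSize API.

```python
from statistics import NormalDist

def kappa_ci(table, alpha=0.05):
    """Cohen's kappa for two raters and a binary outcome, with an
    approximate large-sample Wald confidence interval.

    `table` is a 2x2 list of counts: table[i][j] is the number of
    subjects rated category i by rater 1 and category j by rater 2.
    """
    n = sum(sum(row) for row in table)
    po = sum(table[i][i] for i in range(2)) / n            # observed agreement
    row = [sum(table[i]) / n for i in range(2)]            # rater 1 marginals
    col = [sum(table[i][j] for i in range(2)) / n for j in range(2)]  # rater 2 marginals
    pe = sum(row[i] * col[i] for i in range(2))            # chance agreement
    kappa = (po - pe) / (1 - pe)
    # Simplified large-sample standard error (treats marginals as fixed);
    # the exact variance of kappa-hat has additional terms.
    se = (po * (1 - po) / (n * (1 - pe) ** 2)) ** 0.5
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return kappa, (kappa - z * se, kappa + z * se)

# Example: 50 subjects, 35 agreements
k, (lo, hi) = kappa_ci([[20, 5], [10, 15]])   # k = 0.40, CI roughly (0.15, 0.65)
```

With more subjects the interval narrows around the same κ, which is exactly the trade-off a precision-based sample size calculation inverts: fix the desired (lower, upper) limits and solve for n.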