Mary Lu Angelilli (1), Ronald Thomas. (1) Wayne State University School of Medicine, Pediatrics, Detroit, Michigan, USA. mangeli@med.wayne.edu
Abstract
BACKGROUND: Many clinical scores that measure the severity of asthma are used without adequate evaluation of inter-rater reliability. When reliability is tested, the Cohen K statistic is most often used, which limits comparison to two raters at a time. OBJECTIVE: To evaluate inter-rater agreement of a clinical asthma score using a multi-rater K statistic. METHODS: Four raters administered a clinical asthma score to 17 children with clinical asthma. Five items were evaluated: O2 requirement, inspiratory breath sounds, accessory muscle use, expiratory wheeze, and cerebral function. For each item, a score of zero indicated a normal state; one, moderate impairment; two, severe impairment. A multi-rater kappa statistic was used as a measure of agreement among all four raters simultaneously. The statistic was calculated by hand and then cross-checked using standard statistical syntax in the Statistical Package for the Social Sciences (SPSS 9.0). RESULTS: Application of the multi-rater K statistic revealed strong agreement among raters on O2 requirement (K = 0.759), moderate agreement for expiratory wheeze and cerebral function (K = 0.698), and poor agreement for accessory muscle use (K = 0.528) and inspiratory breath sounds (K = 0.316). CONCLUSIONS: The level of agreement varied by item, with the least subjective item, O2 requirement, demonstrating the highest inter-rater agreement. A multi-rater kappa statistic can be applied to data obtained from a clinical scoring instrument either manually or by using statistical syntax provided by SPSS.
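The hand calculation mentioned in METHODS can be illustrated with a minimal sketch. It assumes the multi-rater statistic is Fleiss' kappa, which the abstract does not specify, and it is not the authors' SPSS syntax. The function name `fleiss_kappa` and the `example_scores` values are hypothetical and invented for illustration; they are not study data.

```python
from collections import Counter


def fleiss_kappa(ratings, categories=(0, 1, 2)):
    """Fleiss' kappa for one item (assumed form of the multi-rater statistic).

    ratings: one list per child, holding one score per rater.
    categories: the allowed scores (0 = normal, 1 = moderate, 2 = severe).
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if any(len(row) != n_raters for row in ratings):
        raise ValueError("every subject needs the same number of raters")
    if any(score not in categories for row in ratings for score in row):
        raise ValueError("rating outside the allowed categories")

    # counts[i][j] = number of raters who gave subject i category j
    counts = [[Counter(row)[c] for c in categories] for row in ratings]

    # Observed agreement: share of agreeing rater pairs, averaged over subjects
    p_i = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall proportion of each category
    total = n_subjects * n_raters
    p_j = [sum(row[j] for row in counts) / total for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings are identical")
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical scores for a single item: 5 children, 4 raters each (not study data)
example_scores = [
    [0, 0, 0, 0],
    [1, 1, 2, 1],
    [2, 2, 2, 2],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
]
print(round(fleiss_kappa(example_scores), 3))
```

Under this assumption, each of the five score items would be passed to the function separately, giving one kappa value per item as in the RESULTS.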