
Measures of agreement between many raters for ordinal classifications.

Kerrie P Nelson, Don Edwards.

Abstract

Screening and diagnostic procedures often require a physician's subjective interpretation of a patient's test result using an ordered categorical scale to define the patient's disease severity. Because of the wide variability observed among physicians' ratings, many large-scale studies have been conducted to quantify agreement between multiple experts' ordinal classifications in common diagnostic procedures such as mammography. However, very few statistical approaches are available to assess agreement in these large-scale settings. Many existing summary measures of agreement rely on extensions of Cohen's kappa; these are prone to prevalence and marginal distribution issues, become increasingly complex for more than three experts, or are not easily implemented. Here we propose a model-based approach to assess agreement in large-scale studies based upon a framework of ordinal generalized linear mixed models. A summary measure of agreement is proposed for multiple experts assessing the same sample of patients' test results according to an ordered categorical scale. This measure avoids some of the key flaws associated with Cohen's kappa and its extensions. Simulation studies are conducted to demonstrate the validity of the approach in comparison with commonly used agreement measures. The proposed methods are easily implemented using the software package R and are applied to two large-scale cancer agreement studies.
Copyright © 2015 John Wiley & Sons, Ltd.
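The abstract contrasts the proposed model-based measure with extensions of Cohen's kappa such as Fleiss' kappa. As a point of reference, here is a minimal Python sketch of the classical Fleiss' kappa computation on a hypothetical toy count matrix (the function name and data are illustrative, not from the paper):

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa for an (n subjects x k categories) matrix of
    rater counts, assuming every subject is rated by the same
    number of raters m."""
    counts = np.asarray(counts, dtype=float)
    n, k = counts.shape
    m = counts[0].sum()                        # raters per subject
    # per-subject observed agreement among the m raters
    P_i = (np.square(counts).sum(axis=1) - m) / (m * (m - 1))
    P_bar = P_i.mean()                         # mean observed agreement
    p_j = counts.sum(axis=0) / (n * m)         # category marginals
    P_e = np.square(p_j).sum()                 # chance agreement
    return (P_bar - P_e) / (1 - P_e)

# toy example: 3 test results, 2 raters, 2 categories;
# the raters agree on every subject, so kappa is 1
perfect = [[2, 0], [0, 2], [2, 0]]
print(fleiss_kappa(perfect))  # 1.0
```

Kappa-type statistics like this one depend on the category marginals `p_j`, which is the prevalence sensitivity the paper's model-based measure is designed to avoid.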


Keywords:  Cohen's kappa; Fleiss' kappa; generalized linear mixed model; inter-rater agreement; ordinal categorical data
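The keywords point to ordinal generalized linear mixed models. The paper's exact model is not reproduced here, but a generic cumulative-logit latent-variable simulation with subject and rater random effects, of the kind used in such simulation studies, can be sketched as follows (all parameter names and values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ratings(n_subjects=200, n_raters=10, cutpoints=(-1.0, 1.0),
                     sigma_subject=2.0, sigma_rater=0.5):
    """Simulate ordinal ratings from a cumulative-logit mixed model:
    each subject has a latent severity b_i ~ N(0, sigma_subject^2),
    each rater a systematic shift u_r ~ N(0, sigma_rater^2); the
    rating is the category where b_i + u_r + logistic noise falls
    relative to the ordered cutpoints."""
    b = rng.normal(0, sigma_subject, size=(n_subjects, 1))  # subject effects
    u = rng.normal(0, sigma_rater, size=(1, n_raters))      # rater effects
    eps = rng.logistic(0, 1, size=(n_subjects, n_raters))   # residual noise
    latent = b + u + eps
    # ordinal category = number of cutpoints below the latent value
    return np.searchsorted(np.asarray(cutpoints), latent)

ratings = simulate_ratings()
print(ratings.shape)  # (200, 10)
```

In this parameterization, a large subject variance relative to the residual and rater variances produces high agreement, which is the intuition behind variance-component-based agreement summaries.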


Year:  2015        PMID: 26095449      PMCID: PMC4560692          DOI: 10.1002/sim.6546

Source DB:  PubMed          Journal:  Stat Med        ISSN: 0277-6715            Impact factor:   2.373


References: 28 in total

1.  Association of volume and volume-independent factors with accuracy in screening mammogram interpretation.

Authors:  Craig A Beam; Emily F Conant; Edward A Sickles
Journal:  J Natl Cancer Inst       Date:  2003-02-19       Impact factor: 13.506

Review 2.  Modelling patterns of agreement and disagreement.

Authors:  A Agresti
Journal:  Stat Methods Med Res       Date:  1992       Impact factor: 3.021

3.  Different ranking approaches defining association and agreement measures of paired ordinal data.

Authors:  Elisabeth Svensson
Journal:  Stat Med       Date:  2012-06-19       Impact factor: 2.373

4.  A method to analyse observer disagreement in visual grading studies: example of assessed image quality in paediatric cerebral multidetector CT images.

Authors:  K Ledenius; E Svensson; F Stålhammar; L-M Wiklund; A Thilander-Klang
Journal:  Br J Radiol       Date:  2010-03-24       Impact factor: 3.039

5.  2 x 2 kappa coefficients: measures of agreement or association.

Authors:  D A Bloch; H C Kraemer
Journal:  Biometrics       Date:  1989-03       Impact factor: 2.571

Review 6.  The 2005 International Society of Urological Pathology (ISUP) Consensus Conference on Gleason Grading of Prostatic Carcinoma.

Authors:  Jonathan I Epstein; William C Allsbrook; Mahul B Amin; Lars L Egevad
Journal:  Am J Surg Pathol       Date:  2005-09       Impact factor: 6.394

7.  Bayesian random effects for interrater and test-retest reliability with nested clinical observations.

Authors:  Chuhsing K Hsiao; Pei-Chun Chen; Wen-Hsin Kao
Journal:  J Clin Epidemiol       Date:  2011-02-02       Impact factor: 6.437

8.  Separation of systematic and random differences in ordinal rating scales.

Authors:  E Svensson; S Holm
Journal:  Stat Med       Date:  1994 Dec 15-30       Impact factor: 2.373

9.  Variability in interpretive performance at screening mammography and radiologists' characteristics associated with accuracy.

Authors:  Joann G Elmore; Sara L Jackson; Linn Abraham; Diana L Miglioretti; Patricia A Carney; Berta M Geller; Bonnie C Yankaskas; Karla Kerlikowske; Tracy Onega; Robert D Rosenberg; Edward A Sickles; Diana S M Buist
Journal:  Radiology       Date:  2009-10-28       Impact factor: 11.105

10.  Radiologist agreement for mammographic recall by case difficulty and finding type.

Authors:  Tracy Onega; Megan Smith; Diana L Miglioretti; Patricia A Carney; Berta A Geller; Karla Kerlikowske; Diana S M Buist; Robert D Rosenberg; Robert A Smith; Edward A Sickles; Sebastien Haneuse; Melissa L Anderson; Bonnie Yankaskas
Journal:  J Am Coll Radiol       Date:  2012-11       Impact factor: 5.532

Cited by: 21 in total

1.  Measuring intrarater association between correlated ordinal ratings.

Authors:  Kerrie P Nelson; Thomas J Zhou; Don Edwards
Journal:  Biom J       Date:  2020-06-11       Impact factor: 2.207

2.  Assessing the influence of rater and subject characteristics on measures of agreement for ordinal ratings.

Authors:  Kerrie P Nelson; Aya A Mitani; Don Edwards
Journal:  Stat Med       Date:  2017-06-13       Impact factor: 2.373

3.  Image Processing to Improve Detection of Mesial Temporal Sclerosis in Adults.

Authors:  F Dahi; M S Parsons; H L P Orlowski; A Salter; S Dahiya; A Sharma
Journal:  AJNR Am J Neuroradiol       Date:  2019-04-04       Impact factor: 3.825

Review 4.  Evidence-based statistical analysis and methods in biomedical research (SAMBR) checklists according to design features.

Authors:  Alok Kumar Dwivedi; Rakesh Shukla
Journal:  Cancer Rep (Hoboken)       Date:  2019-08-22

5.  Assessment of variability in motor grading and patient-reported outcome reporting: a multi-specialty, multi-national survey.

Authors:  Brandon W Smith; Sarada Sakamuri; Kara E Flavin; Michael Jensen; David A Purger; Lynda J-S Yang; Robert J Spinner; Thomas J Wilson
Journal:  Acta Neurochir (Wien)       Date:  2021-05-15       Impact factor: 2.216

6.  An image processing algorithm to aid diagnosis of mesial temporal sclerosis in children: a case-control study.

Authors:  Benjamin S Strnad; Hilary L P Orlowski; Matthew S Parsons; Amber Salter; Sonika Dahiya; Aseem Sharma
Journal:  Pediatr Radiol       Date:  2019-10-02

7.  A paired kappa to compare binary ratings across two medical tests.

Authors:  Kerrie P Nelson; Don Edwards
Journal:  Stat Med       Date:  2019-05-17       Impact factor: 2.373

8.  A measure of association for ordered categorical data in population-based studies.

Authors:  Kerrie P Nelson; Don Edwards
Journal:  Stat Methods Med Res       Date:  2016-05-16       Impact factor: 3.021

9.  Assessing method agreement for paired repeated binary measurements administered by multiple raters.

Authors:  Wei Wang; Nan Lin; Jordan D Oberhaus; Michael S Avidan
Journal:  Stat Med       Date:  2019-12-01       Impact factor: 2.373

Review 10.  Summary measures of agreement and association between many raters' ordinal classifications.

Authors:  Aya A Mitani; Phoebe E Freer; Kerrie P Nelson
Journal:  Ann Epidemiol       Date:  2017-09-22       Impact factor: 3.797

