Jayashree Kalpathy-Cramer (1), J Peter Campbell (2), Deniz Erdogmus (3), Peng Tian (3), Dharanish Kedarisetti (3), Chace Moleta (2), James D Reynolds (4), Kelly Hutcheson (5), Michael J Shapiro (6), Michael X Repka (7), Philip Ferrone (8), Kimberly Drenser (9), Jason Horowitz (10), Kemal Sonmez (11), Ryan Swan (11), Susan Ostmo (2), Karyn E Jonas (12), R V Paul Chan (12), Michael F Chiang (13).
1. Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, Massachusetts.
2. Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, Oregon.
3. Cognitive Systems Laboratory, Northeastern University, Boston, Massachusetts.
4. Department of Ophthalmology, Ross Eye Institute, State University of New York at Buffalo, Buffalo, New York.
5. Department of Ophthalmology, Sidra Medical & Research Center, Doha, Qatar.
6. Retina Consultants, Chicago, Illinois.
7. Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, Maryland.
8. Long Island Vitreoretinal Consultants, Great Neck, New York.
9. Associated Retinal Consultants, Oakland University, Royal Oak, Michigan.
10. Department of Ophthalmology, Columbia University, New York, New York.
11. Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon.
12. Department of Ophthalmology and Visual Sciences, Illinois Eye and Ear Infirmary, University of Illinois at Chicago, Chicago, Illinois.
13. Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, Oregon; Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon. Electronic address: chiangm@ohsu.edu.
Abstract
PURPOSE: To determine expert agreement on relative retinopathy of prematurity (ROP) disease severity and whether computer-based image analysis can model relative disease severity, and to propose consideration of a more continuous severity score for ROP.
DESIGN: We developed 2 databases of clinical images of varying disease severity (100 images and 34 images) as part of the Imaging and Informatics in ROP (i-ROP) cohort study and recruited expert physician, nonexpert physician, and nonphysician graders to classify and perform pairwise comparisons on both databases.
PARTICIPANTS: Six participating expert ROP clinician-scientists, each with a minimum of 10 years of clinical ROP experience and 5 ROP publications, and 5 image graders (3 physicians and 2 nonphysician graders) who analyzed images that were obtained during routine ROP screening in neonatal intensive care units.
METHODS: Images in both databases were ranked by average disease classification (classification ranking), by pairwise comparison using the Elo rating method (comparison ranking), and by correlation with the i-ROP computer-based image analysis system.
MAIN OUTCOME MEASURES: Interexpert agreement (weighted κ statistic) compared with the correlation coefficient (CC) between experts on pairwise comparisons and correlation between expert rankings and computer-based image analysis modeling.
RESULTS: There was variable interexpert agreement on diagnostic classification of disease (plus, preplus, or normal) among the 6 experts (mean weighted κ, 0.27; range, 0.06-0.63), but good correlation between experts on comparison ranking of disease severity (mean CC, 0.84; range, 0.74-0.93) on the set of 34 images. Comparison ranking provided a severity ranking that was in good agreement with ranking obtained by classification ranking (CC, 0.92). Comparison ranking on the larger dataset by both expert and nonexpert graders demonstrated good correlation (mean CC, 0.97; range, 0.95-0.98). The i-ROP system was able to model this continuous severity with good correlation (CC, 0.86).
CONCLUSIONS: Experts diagnose plus disease on a continuum, with poor absolute agreement on classification but good relative agreement on disease severity. These results suggest that the use of pairwise rankings and a continuous severity score, such as that provided by the i-ROP system, may improve agreement on disease severity in the future.
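As an illustration of how a comparison ranking could be derived from pairwise judgments, the following is a minimal sketch of the standard Elo update, not the study's actual implementation. The abstract does not report the parameters used, so the starting rating (1500), K-factor (32), and logistic scale (400) are conventional chess defaults assumed here, and the function names and the toy comparisons are hypothetical.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Elo-predicted probability that item A is judged more severe than item B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_ratings(n_images, comparisons, k=32.0, start=1500.0):
    """Apply Elo updates over (more_severe, less_severe) image index pairs.

    k and start are assumed defaults, not values taken from the study.
    """
    ratings = [start] * n_images
    for more_severe, less_severe in comparisons:
        # Probability the model assigned to the outcome that actually occurred.
        p = expected_score(ratings[more_severe], ratings[less_severe])
        # The "winner" gains what the "loser" gives up, so the total is conserved.
        delta = k * (1.0 - p)
        ratings[more_severe] += delta
        ratings[less_severe] -= delta
    return ratings


def severity_order(ratings):
    """Image indices sorted from most to least severe by Elo rating."""
    return sorted(range(len(ratings)), key=lambda i: ratings[i], reverse=True)


# Toy example: three images, with image 2 judged more severe than 1 and 0,
# and image 1 judged more severe than image 0.
toy_comparisons = [(2, 1), (2, 0), (1, 0)]
toy_ratings = elo_ratings(3, toy_comparisons)
toy_order = severity_order(toy_ratings)
print(toy_order)
```

In this toy case the resulting order is image 2, then 1, then 0. With real data, each grader's comparisons would yield a rating per image, and rankings from different graders could then be compared with a correlation coefficient, as the abstract describes.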
Authors: Grace M Richter; Steven L Williams; Justin Starren; John T Flynn; Michael F Chiang Journal: Surv Ophthalmol Date: 2009-08-08 Impact factor: 6.048
Authors: J Peter Campbell; Jayashree Kalpathy-Cramer; Deniz Erdogmus; Peng Tian; Dharanish Kedarisetti; Chace Moleta; James D Reynolds; Kelly Hutcheson; Michael J Shapiro; Michael X Repka; Philip Ferrone; Kimberly Drenser; Jason Horowitz; Kemal Sonmez; Ryan Swan; Susan Ostmo; Karyn E Jonas; R V Paul Chan; Michael F Chiang Journal: Ophthalmology Date: 2016-08-31 Impact factor: 12.079
Authors: R V Paul Chan; Steven L Williams; Yoshihiro Yonekawa; David J Weissgold; Thomas C Lee; Michael F Chiang Journal: Retina Date: 2010-06 Impact factor: 4.256
Authors: Steven L Williams; Lu Wang; Steven A Kane; Thomas C Lee; David J Weissgold; Audina M Berrocal; Daniel Rabinowitz; Justin Starren; John T Flynn; Michael F Chiang Journal: Br J Ophthalmol Date: 2009-12-02 Impact factor: 4.638
Authors: Aaron Nagiel; Michael J Espiritu; Ryan K Wong; Thomas C Lee; Andreas K Lauer; Michael F Chiang; R V Paul Chan Journal: Ophthalmology Date: 2012-12 Impact factor: 12.079
Authors: Esra Ataer-Cansizoglu; Veronica Bolon-Canedo; J Peter Campbell; Alican Bozkurt; Deniz Erdogmus; Jayashree Kalpathy-Cramer; Samir Patel; Karyn Jonas; R V Paul Chan; Susan Ostmo; Michael F Chiang Journal: Transl Vis Sci Technol Date: 2015-11-30 Impact factor: 3.283
Authors: Chace Moleta; J Peter Campbell; Jayashree Kalpathy-Cramer; R V Paul Chan; Susan Ostmo; Karyn Jonas; Michael F Chiang Journal: Am J Ophthalmol Date: 2017-01-11 Impact factor: 5.258
Authors: Hilal Biten; Travis K Redd; Chace Moleta; J Peter Campbell; Susan Ostmo; Karyn Jonas; R V Paul Chan; Michael F Chiang Journal: JAMA Ophthalmol Date: 2018-05-01 Impact factor: 7.389
Authors: Aaron S Coyner; Jimmy Chen; J Peter Campbell; Susan Ostmo; Praveer Singh; Jayashree Kalpathy-Cramer; Michael F Chiang Journal: AMIA Annu Symp Proc Date: 2021-01-25
Authors: Layla Ghergherehchi; Sang Jin Kim; J Peter Campbell; Susan Ostmo; R V Paul Chan; Michael F Chiang Journal: Asia Pac J Ophthalmol (Phila) Date: 2018-05-24
Authors: Rene Y Choi; James M Brown; Jayashree Kalpathy-Cramer; R V Paul Chan; Susan Ostmo; Michael F Chiang; J Peter Campbell Journal: Ophthalmol Retina Date: 2020-05-04
Authors: James M Brown; J Peter Campbell; Andrew Beers; Ken Chang; Susan Ostmo; R V Paul Chan; Jennifer Dy; Deniz Erdogmus; Stratis Ioannidis; Jayashree Kalpathy-Cramer; Michael F Chiang Journal: JAMA Ophthalmol Date: 2018-07-01 Impact factor: 7.389
Authors: Travis K Redd; John Peter Campbell; James M Brown; Sang Jin Kim; Susan Ostmo; Robison Vernon Paul Chan; Jennifer Dy; Deniz Erdogmus; Stratis Ioannidis; Jayashree Kalpathy-Cramer; Michael F Chiang Journal: Br J Ophthalmol Date: 2018-11-23 Impact factor: 4.638
Authors: Kellyn N Bellsmith; James Brown; Sang Jin Kim; Isaac H Goldstein; Aaron Coyner; Susan Ostmo; Kishan Gupta; R V Paul Chan; Jayashree Kalpathy-Cramer; Michael F Chiang; J Peter Campbell Journal: Ophthalmology Date: 2020-02-07 Impact factor: 12.079