V Bolón-Canedo (1), E Ataer-Cansizoglu (2), D Erdogmus (2), J Kalpathy-Cramer (3), O Fontenla-Romero (4), A Alonso-Betanzos (4), M F Chiang (5).
1. Department of Computer Science, Universidade da Coruña, A Coruña, Spain. Electronic address: vbolon@udc.es.
2. Cognitive Systems Laboratory, Northeastern University, Boston, MA, USA.
3. Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA.
4. Department of Computer Science, Universidade da Coruña, A Coruña, Spain.
5. Departments of Ophthalmology & Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, OR, USA.
Abstract
BACKGROUND AND OBJECTIVE: Understanding the causes of disagreement among experts in clinical decision making has been a challenge for decades. In particular, considerable variability exists in the diagnosis of retinopathy of prematurity (ROP), a disease that affects low birth weight infants and is a major cause of childhood blindness. One possible cause of this variability, mostly neglected in the literature, is that different experts consider different sets of features to be important. In this paper we propose a methodology that uses machine learning techniques to understand the underlying causes of inter-expert variability.
METHODS: The experiments are carried out on a dataset of 34 retinal images, each with diagnoses provided by 22 independent experts. Feature selection techniques are applied to discover the most important features considered by a given expert. The features selected for each expert are then compared with those selected for the other experts by applying similarity measures. Finally, an automated diagnosis system is built to check whether this approach helps in understanding high inter-rater variability.
RESULTS: The experimental results reveal that some features are selected by most of the feature selection methods regardless of the expert considered. Moreover, for pairs of experts with high percentage agreement, the feature selection algorithms also select similar features. Using only the relevant selected features, the classification performance of the automatic system was improved or maintained.
CONCLUSIONS: The proposed methodology provides a practical framework to identify the features that are important to each expert and to check whether the selected features reflect the pairwise agreements and disagreements. These findings may lead to improved diagnostic accuracy and standardization among clinicians, and pave the way for applying this methodology to other problems that present inter-expert variability.
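The comparison step described in the methods can be illustrated with a minimal sketch. The abstract does not name the similarity measure used, so the Jaccard index here is an assumption, and the expert identifiers, feature names, and diagnosis labels are hypothetical toy values rather than data from the study.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index between two feature subsets (assumed similarity measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty subsets are treated as identical
    return len(a & b) / len(a | b)


def percent_agreement(x, y):
    """Fraction of images on which two experts gave the same diagnosis."""
    if len(x) != len(y):
        raise ValueError("experts must rate the same set of images")
    return sum(1 for p, q in zip(x, y) if p == q) / len(x)


def pairwise_report(selected, diagnoses):
    """For every pair of experts, return (expert_1, expert_2,
    feature-subset similarity, diagnostic agreement)."""
    rows = []
    for e1, e2 in combinations(sorted(selected), 2):
        rows.append((
            e1,
            e2,
            jaccard(selected[e1], selected[e2]),
            percent_agreement(diagnoses[e1], diagnoses[e2]),
        ))
    return rows


# Hypothetical toy data: features chosen by a feature selection method
# for each expert, and each expert's diagnosis (class label) per image.
selected = {
    "expert_A": {"feature_1", "feature_2", "feature_3"},
    "expert_B": {"feature_1", "feature_2", "feature_4"},
    "expert_C": {"feature_4", "feature_5"},
}
diagnoses = {
    "expert_A": [2, 0, 1, 0],
    "expert_B": [2, 0, 1, 2],
    "expert_C": [0, 0, 2, 2],
}

for e1, e2, sim, agree in pairwise_report(selected, diagnoses):
    print(f"{e1} vs {e2}: feature similarity={sim:.2f}, agreement={agree:.2f}")
```

In this toy data the pair with the highest diagnostic agreement also shares the most selected features, which is the kind of relationship the results describe; the values carry no clinical meaning.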