Wendy W Chapman (1), John N Dowling, George Hripcsak.
(1) Department of Biomedical Informatics, University of Pittsburgh, 200 Meyran Avenue, M-183 VALE, Pittsburgh, PA 15260, United States. chapman@cbmi.pitt.edu
Abstract
OBJECTIVE: Determine whether agreement among annotators improves after they are trained to use an annotation schema that specifies what types of clinical conditions to annotate, the linguistic form of the annotations, and which modifiers to include.
METHODS: Three physicians and three lay people individually annotated all clinical conditions in 23 emergency department reports. For annotations made using a Baseline Schema and annotations made after training on a detailed annotation schema, we compared (1) variability in the length and number of annotations and (2) annotator agreement, measured with the F-measure.
RESULTS: Physicians showed higher agreement and lower variability after training on the detailed annotation schema than when applying the Baseline Schema. Lay people agreed with physicians almost as well as other physicians did but showed a slower learning curve.
CONCLUSION: Training annotators on the annotation schema we developed increased agreement among annotators and should be useful in generating reference standard sets for natural language processing studies. The methodology we used to evaluate the schema could be applied to other types of annotation or classification tasks in biomedical informatics.
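As an illustration of how F-measure agreement between annotators can be computed, the following minimal Python sketch treats one annotator's annotations as the reference and the other's as the response. It rests on assumptions that the abstract does not confirm: annotations are represented as hypothetical `(report_id, text)` pairs, two annotations match only when they are identical, and group agreement is taken as the mean over all annotator pairs. The function names `f_measure` and `mean_pairwise_f` are illustrative, not taken from the study.

```python
from itertools import combinations


def f_measure(reference, response):
    """F-measure of `response` against `reference` under exact matching.

    Assumption: two annotations match only if they are identical
    (report_id, text) pairs. The study's matching criterion may differ.
    """
    reference, response = set(reference), set(response)
    if not reference and not response:
        return 1.0  # assumption: two empty sets count as full agreement
    matched = len(reference & response)
    if matched == 0:
        return 0.0
    precision = matched / len(response)   # share of response found in reference
    recall = matched / len(reference)     # share of reference found in response
    return 2 * precision * recall / (precision + recall)


def mean_pairwise_f(annotations_by_annotator):
    """Average F-measure over all unordered pairs of annotators."""
    pairs = list(combinations(annotations_by_annotator.values(), 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    return sum(f_measure(a, b) for a, b in pairs) / len(pairs)
```

Under exact matching the F-measure is symmetric, so it does not matter which annotator in a pair is treated as the reference. A looser criterion such as partial overlap of annotation spans would need its own matching function.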