
Coreference analysis in clinical notes: a multi-pass sieve with alternate anaphora resolution modules.

Siddhartha Reddy Jonnalagadda, Dingcheng Li, Sunghwan Sohn, Stephen Tze-Inn Wu, Kavishwar Wagholikar, Manabu Torii, Hongfang Liu.

Abstract

OBJECTIVE: This paper describes the coreference resolution system submitted by Mayo Clinic for the 2011 i2b2/VA/Cincinnati shared task Track 1C. The goal of the task was to construct a system that links the markables corresponding to the same entity.
MATERIALS AND METHODS: The task organizers provided progress notes and discharge summaries annotated with markables of five types: treatment, problem, test, person, and pronoun. We used a multi-pass sieve algorithm that applies deterministic rules in decreasing order of precision while gathering information about the entities in the documents. Our system, MedCoref, also uses a state-of-the-art machine learning framework as an alternative to the final, rule-based pronoun resolution sieve.
RESULTS: The best system that uses a multi-pass sieve has an overall score (the average of the B³, MUC, BLANC, and CEAF F-scores) of 0.836 on the training set and 0.843 on the test set.
DISCUSSION: A supervised machine learning system, which typically uses a single function to find coreferents, cannot accommodate the irregularities encountered in the data, especially given the insufficient number of examples. On the other hand, a completely deterministic system could lose recall (sensitivity) when its rules are not exhaustive. The sieve-based framework allows one to combine reliable machine learning components with rules designed by experts.
CONCLUSION: An effective coreference resolution system can be designed using relatively simple rules, part-of-speech information, and semantic type properties. The source code of the system described is available at https://sourceforge.net/projects/ohnlp/files/MedCoref.
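
ILLUSTRATION: The multi-pass sieve idea described under MATERIALS AND METHODS can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not MedCoref's implementation: the three rules (`exact_match`, `head_match`, `pronoun_to_person`), the `Mention` type, and the example mentions are all hypothetical and chosen only to show how passes ordered from most to least precise build up coreference chains. It does not model how later sieves reuse entity information gathered by earlier ones.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Mention:
    text: str
    sem_type: str  # "problem", "treatment", "test", "person", or "pronoun"


class Clusters:
    """Union-find over mention positions; each set is one entity."""

    def __init__(self, n: int) -> None:
        self.parent = list(range(n))

    def find(self, i: int) -> int:
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def union(self, i: int, j: int) -> None:
        ri, rj = self.find(i), self.find(j)
        if ri != rj:
            # Keep the earliest mention as the representative.
            self.parent[max(ri, rj)] = min(ri, rj)


Sieve = Callable[[Mention, Mention], bool]


# Hypothetical rules, listed from most to least precise.
def exact_match(m: Mention, a: Mention) -> bool:
    same_type = m.sem_type == a.sem_type and m.sem_type != "pronoun"
    return same_type and m.text.lower() == a.text.lower()


def head_match(m: Mention, a: Mention) -> bool:
    same_type = m.sem_type == a.sem_type and m.sem_type != "pronoun"
    return same_type and m.text.lower().split()[-1] == a.text.lower().split()[-1]


def pronoun_to_person(m: Mention, a: Mention) -> bool:
    is_personal = m.sem_type == "pronoun" and m.text.lower() in {"he", "she", "him", "her"}
    return is_personal and a.sem_type == "person"


def resolve(mentions: List[Mention], sieves: List[Sieve]) -> List[List[str]]:
    """Run each sieve as a full pass; link a mention to its nearest matching antecedent."""
    clusters = Clusters(len(mentions))
    for sieve in sieves:
        for i, mention in enumerate(mentions):
            for j in range(i - 1, -1, -1):  # nearest antecedent first
                if sieve(mention, mentions[j]):
                    clusters.union(i, j)
                    break
    groups: dict = {}
    for i, mention in enumerate(mentions):
        groups.setdefault(clusters.find(i), []).append(mention.text)
    return [chain for chain in groups.values() if len(chain) > 1]


# Made-up example mentions, in document order.
MENTIONS = [
    Mention("Mr. Smith", "person"),
    Mention("chest pain", "problem"),
    Mention("he", "pronoun"),
    Mention("the chest pain", "problem"),
    Mention("Chest pain", "problem"),
    Mention("aspirin", "treatment"),
]
SIEVES = [exact_match, head_match, pronoun_to_person]

chains = resolve(MENTIONS, SIEVES)
```

Because each sieve is just a function over a mention and a candidate antecedent, the final rule-based pronoun pass could in principle be swapped for a learned classifier with the same signature, which is the kind of substitution the abstract describes.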

Year:  2012        PMID: 22707745      PMCID: PMC3422831          DOI: 10.1136/amiajnl-2011-000766

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


