Challenges in clinical natural language processing for automated disorder normalization.

Robert Leaman, Ritu Khare, Zhiyong Lu.

Abstract

BACKGROUND: Identifying key variables such as disorders within the clinical narratives in electronic health records has wide-ranging applications within clinical practice and biomedical research. Previous research has demonstrated lower performance of disorder named entity recognition (NER) and normalization (or grounding) in clinical narratives than in biomedical publications. In this work, we aim to identify the cause of this performance difference and introduce general solutions.
METHODS: We use closure properties to compare the richness of the vocabulary in clinical narrative text to that of biomedical publications. We approach both disorder NER and normalization using machine learning methodologies. Our NER methodology is based on linear-chain conditional random fields with a rich feature approach, and we introduce several improvements to enhance the lexical knowledge of the NER system. Our normalization method, never previously applied to clinical data, uses pairwise learning to rank to automatically learn term variation directly from the training data.
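The pairwise learning-to-rank idea named in METHODS can be illustrated with a toy sketch. This is not the DNorm-C implementation: the abstract does not specify the scoring function, so the code assumes a bilinear score between bag-of-words vectors, a weight matrix initialized to the identity, and a margin of 1. The vocabulary, mentions, concept names, and all function names are hypothetical.

```python
# Toy pairwise learning-to-rank sketch (illustrative only; not the DNorm-C code).
# Assumption: score(m, n) = m^T W n over binary bag-of-words vectors,
# with W initialized to the identity matrix and a ranking margin of 1.

def vectorize(text, vocab):
    """Binary bag-of-words vector for `text` over `vocab`."""
    tokens = set(text.lower().split())
    return [1.0 if word in tokens else 0.0 for word in vocab]

def score(W, m, n):
    """Bilinear score m^T W n."""
    return sum(m[i] * W[i][j] * n[j]
               for i in range(len(m)) for j in range(len(n)))

def train(pairs, names, vocab, rate=0.1, epochs=20):
    """Update W whenever the correct name fails to beat a wrong name by the margin."""
    size = len(vocab)
    W = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    vecs = {name: vectorize(name, vocab) for name in names}
    for _ in range(epochs):
        for mention, correct in pairs:
            m = vectorize(mention, vocab)
            pos = vecs[correct]
            for wrong in names:
                if wrong == correct:
                    continue
                neg = vecs[wrong]
                if score(W, m, pos) - score(W, m, neg) < 1.0:
                    # Gradient step on the pairwise margin loss.
                    for i in range(size):
                        for j in range(size):
                            W[i][j] += rate * m[i] * (pos[j] - neg[j])
    return W

def rank(W, mention, names, vocab):
    """Concept names sorted from best to worst score for `mention`."""
    m = vectorize(mention, vocab)
    return sorted(names,
                  key=lambda name: score(W, m, vectorize(name, vocab)),
                  reverse=True)

# Hypothetical concept names and training mentions with no shared tokens.
names = ["renal failure", "myocardial infarction"]
mentions = ["heart attack", "kidney failure"]
vocab = sorted({w for text in names + mentions for w in text.split()})
training_pairs = [("heart attack", "myocardial infarction"),
                  ("kidney failure", "renal failure")]

W = train(training_pairs, names, vocab)
best = rank(W, "heart attack", names, vocab)[0]
print(best)
```

With the identity matrix alone, "heart attack" shares no tokens with either name and the two candidates tie. The pairwise updates are what let the learned matrix associate the mention's tokens with those of the correct name, which is the sense in which term variation is learned from training data in this sketch.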
RESULTS: We find that while the overall vocabulary size is similar between clinical narratives and biomedical publications, clinical narratives use a richer terminology to describe disorders. We apply our system, DNorm-C, to locate and normalize disorder mentions in the clinical narratives from the recent ShARe/CLEF eHealth Task. For NER (strict span-only), our system achieves precision=0.797, recall=0.713, f-score=0.753. For the normalization task (strict span+concept), it achieves precision=0.712, recall=0.637, f-score=0.672. The improvements described in this article increase the NER f-score by 0.039 and the normalization f-score by 0.036. We also describe a high-recall version of the NER system, which increases the normalization recall to as high as 0.744, albeit with reduced precision.
DISCUSSION: We perform an error analysis, demonstrating that NER errors outnumber normalization errors by more than 4-to-1. Abbreviations and acronyms are frequent causes of error, as are mentions that the annotators were unable to identify within the scope of the controlled vocabulary.
CONCLUSION: Disorder mentions in text from clinical narratives use a rich vocabulary that results in high term variation, which we believe to be one of the primary causes of reduced performance in clinical narrative. We show that pairwise learning to rank offers high performance in this context, and introduce several lexical enhancements, generalizable to other clinical NER tasks, that improve the ability of the NER system to handle this variation. DNorm-C is a high-performing, open source system for disorders in clinical text, and a promising step toward NER and normalization methods that are trainable for a wide variety of domains and entities. (DNorm-C is open source software, and is available with a trained model at the DNorm demonstration website: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/#DNorm.) Published by Elsevier Inc.
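As a quick consistency check on the scores reported in RESULTS, the f-scores can be recomputed from the stated precision and recall. The sketch below assumes "f-score" means the balanced F1 measure (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name is illustrative.

```python
def f_score(precision, recall):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall values as reported in the abstract.
ner_f = f_score(0.797, 0.713)    # NER, strict span-only
norm_f = f_score(0.712, 0.637)   # normalization, strict span+concept

print(f"NER f-score: {ner_f:.3f}")            # prints "NER f-score: 0.753"
print(f"Normalization f-score: {norm_f:.3f}") # prints "Normalization f-score: 0.672"
```

Both recomputed values match the reported figures to three decimal places, which is consistent with the F1 assumption.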

Keywords: Electronic health records; Information extraction; Natural language processing

Year: 2015; PMID: 26187250; PMCID: PMC4713367; DOI: 10.1016/j.jbi.2015.07.010

Source DB: PubMed; Journal: J Biomed Inform; ISSN: 1532-0464; Impact factor: 6.317
