Automating the generation of lexical patterns for processing free text in clinical documents.

Frank Meng, Craig Morioka.

Abstract

OBJECTIVE: Many tasks in natural language processing utilize lexical pattern-matching techniques, including information extraction (IE), negation identification, and syntactic parsing. However, it is generally difficult to derive patterns that achieve acceptable levels of recall while also remaining highly precise.
MATERIALS AND METHODS: We present a multiple sequence alignment (MSA)-based technique that automatically generates patterns, leveraging language usage to determine which context words influence a given target. MSAs capture the commonalities among word sequences and reveal areas of linguistic stability and variation. In this way, MSAs provide a systematic approach to generating generalizable lexical patterns, with the aim of increasing recall while maintaining high precision.
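To make the idea of alignment-derived patterns concrete, the following is a minimal sketch and not the authors' implementation. It assumes whitespace tokenization and replaces a true MSA with repeated pairwise alignment using Python's `difflib.SequenceMatcher`: tokens shared by all sequences are kept as stable anchors, and regions that vary collapse into a wildcard. The names `merge`, `generate_pattern`, `to_regex`, and `WILDCARD`, as well as the example sentences, are hypothetical.

```python
import re
from difflib import SequenceMatcher
from functools import reduce

WILDCARD = "*"  # hypothetical marker for a region that varies across sequences


def merge(pattern, tokens):
    """Align two token lists; keep shared tokens, collapse differences to one wildcard."""
    matcher = SequenceMatcher(a=pattern, b=tokens, autojunk=False)
    merged = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            merged.extend(pattern[i1:i2])      # stable region: keep the tokens
        elif not merged or merged[-1] != WILDCARD:
            merged.append(WILDCARD)            # variable region: generalize
    return merged


def generate_pattern(sentences):
    """Fold pairwise alignments over all sentences (a simplification of MSA)."""
    token_lists = [s.lower().split() for s in sentences]
    return reduce(merge, token_lists)


def to_regex(pattern):
    """Turn a token pattern into a regular expression with one capture group per wildcard."""
    parts = [r"(.+?)" if tok == WILDCARD else re.escape(tok) for tok in pattern]
    return re.compile(r"\s+".join(parts), re.IGNORECASE)


if __name__ == "__main__":
    examples = [
        "ejection fraction is 55 %",
        "ejection fraction was estimated at 60 %",
        "ejection fraction of 45 %",
    ]
    pattern = generate_pattern(examples)
    print(pattern)
    print(to_regex(pattern).search("Ejection fraction was 40 %").group(1))
```

Folding pairwise alignments is order-dependent and cruder than a proper MSA, but it shows how stable tokens become anchors while variable spans become slots that a pattern can generalize over.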
RESULTS: The MSA-generated patterns exhibited consistent F1, F0.5, and F2 scores across four different IE tasks when compared with two baseline techniques. Both baseline techniques performed well on some tasks and less well on others, whereas MSA performed at a consistently high level on all four tasks.
DISCUSSION: The performance of MSA on the four extraction tasks indicates the method's versatility. The results show that the MSA-based patterns can handle the extraction of individual data elements as well as relations between two concepts without the need for large amounts of manual intervention.
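For readers unfamiliar with the three scores named in the results, the sketch below computes the general F-beta measure, of which F1, F0.5, and F2 are the cases beta = 1, 0.5, and 2. The function name `f_beta` and the precision and recall values are hypothetical illustrations, not figures from the study.

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta < 1 weights precision more heavily; beta > 1 weights recall more heavily.
    """
    b2 = beta ** 2
    denominator = b2 * precision + recall
    if denominator == 0:
        return 0.0  # convention when the score is otherwise undefined
    return (1 + b2) * precision * recall / denominator


if __name__ == "__main__":
    p, r = 0.9, 0.6  # hypothetical values, not taken from the paper
    for beta in (1.0, 0.5, 2.0):
        print(f"F{beta:g} = {f_beta(p, r, beta):.3f}")
```

Reporting all three scores shows whether a method's apparent quality depends on how precision and recall are traded off against each other.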
CONCLUSION: We presented an MSA-based framework for generating lexical patterns that showed consistently high levels of both precision and recall over four different extraction tasks when compared to baseline methods.
© The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Keywords:  information extraction; natural language processing; text mining

Year:  2015        PMID: 25977405      PMCID: PMC4986670          DOI: 10.1093/jamia/ocv012

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497

