
Deriving a probabilistic syntacto-semantic grammar for biomedicine based on domain-specific terminologies.

Jung-Wei Fan, Carol Friedman.

Abstract

Biomedical natural language processing (BioNLP) is a useful technique that unlocks valuable information stored in textual data for practice and/or research. Syntactic parsing is a critical component of BioNLP applications that rely on correctly determining the sentence and phrase structure of free text. In addition to dealing with the vast amount of domain-specific terms, a robust biomedical parser needs to model the semantic grammar to obtain viable syntactic structures. With either a rule-based or corpus-based approach, the grammar engineering process requires substantial time and knowledge from experts, and does not always yield a semantically transferable grammar. To reduce the human effort and to promote semantic transferability, we propose an automated method for deriving a probabilistic grammar based on a training corpus consisting of concept strings and semantic classes from the Unified Medical Language System (UMLS), a comprehensive terminology resource widely used by the community. The grammar is designed to specify noun phrases only due to the nominal nature of the majority of biomedical terminological concepts. Evaluated on manually parsed clinical notes, the derived grammar achieved a recall of 0.644, precision of 0.737, and average cross-bracketing of 0.61, which demonstrated better performance than a control grammar with the semantic information removed. Error analysis revealed shortcomings that could be addressed to improve performance. The results indicated the feasibility of an approach which automatically incorporates terminology semantics in the building of an operational grammar. Although the current performance of the unsupervised solution does not adequately replace manual engineering, we believe that once the performance issues are addressed, it could serve as an aid in a semi-supervised solution.
Copyright © 2011 Elsevier Inc. All rights reserved.
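
The abstract reports recall, precision, and average cross-bracketing without describing the scoring procedure. As a rough illustration only, the sketch below computes PARSEVAL-style bracket scores for a single phrase. It assumes labeled spans with exclusive end offsets and exact-match scoring; the function names, the span representation, and the example spans are hypothetical and are not taken from the paper.

```python
from typing import Set, Tuple

# Hypothetical span representation: (label, start, end), end exclusive.
Span = Tuple[str, int, int]


def bracket_scores(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float]:
    """Return (precision, recall) over exactly matching labeled brackets."""
    matched = len(gold & predicted)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    return precision, recall


def crossing_brackets(gold: Set[Span], predicted: Set[Span]) -> int:
    """Count predicted brackets that partially overlap some gold bracket."""

    def crosses(a: Span, b: Span) -> bool:
        (_, s1, e1), (_, s2, e2) = a, b
        # Two spans cross when they overlap but neither contains the other.
        return (s1 < s2 < e1 < e2) or (s2 < s1 < e2 < e1)

    return sum(1 for p in predicted if any(crosses(p, g) for g in gold))


# Made-up five-token phrase with token offsets 0..5.
gold = {("NP", 0, 5), ("NP", 0, 4)}
predicted = {("NP", 0, 5), ("NP", 3, 5)}

precision, recall = bracket_scores(gold, predicted)
print(precision, recall)                    # 0.5 0.5
print(crossing_brackets(gold, predicted))   # 1
```

An average cross-bracketing figure such as the one reported would then presumably be the mean of the per-sentence crossing counts over the test set, though the abstract does not state this.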

Year:  2011        PMID: 21549857      PMCID: PMC3172402          DOI: 10.1016/j.jbi.2011.04.006

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


