
A semantic lexicon for medical language processing.

S B Johnson.

Abstract

OBJECTIVE: Construction of a resource that provides semantic information about words and phrases to facilitate the computer processing of medical narrative.
DESIGN: Lexemes (words and word phrases) in the Specialist Lexicon were matched against strings in the 1997 Metathesaurus of the Unified Medical Language System (UMLS) developed by the National Library of Medicine. This yielded a "semantic lexicon," in which each lexeme is associated with one or more syntactic types, each of which can have one or more semantic types. The semantic lexicon was then used to assign semantic types to lexemes occurring in a corpus of discharge summaries (603,306 sentences). Lexical items with multiple semantic types were examined to determine whether some of the types could be eliminated, on the basis of usage in discharge summaries. A concordance program was used to find contrasting contexts for each lexeme that would reflect different semantic senses. Based on this evidence, semantic preference rules were developed to reduce the number of lexemes with multiple semantic types.
RESULTS: Matching the Specialist Lexicon against the Metathesaurus produced a semantic lexicon with 75,711 lexical forms, 22,805 (30.1 percent) of which had two or more semantic types. Matching the Specialist Lexicon against one year's worth of discharge summaries identified 27,633 distinct lexical forms, 13,322 of which had at least one semantic type. This suggests that the Specialist Lexicon has about 79 percent coverage for syntactic information and 38 percent coverage for semantic information for discharge summaries. Of those lexemes in the corpus that had semantic types, 3,474 (12.6 percent) had two or more types. When semantic preference rules were applied to the semantic lexicon, the number of entries with multiple semantic types was reduced to 423 (1.5 percent). In the discharge summaries, occurrences of lexemes with multiple semantic types were reduced from 9.41 to 1.46 percent.
CONCLUSION: Automatic methods can be used to construct a semantic lexicon from existing UMLS sources. This semantic information can aid natural language processing programs that analyze medical narrative, provided that lexemes with multiple semantic types are kept to a minimum. Semantic preference rules can be used to select semantic types that are appropriate to clinical reports. Further work is needed to increase the coverage of the semantic lexicon and to exploit contextual information when selecting semantic senses.
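The design described above (match lexemes against Metathesaurus strings, attach semantic types, then prune with preference rules) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy lexicon and string entries, the normalization step, the helper names (`build_semantic_lexicon`, `apply_preferences`, `ambiguous`), and the form of a preference rule as a (preferred, dispreferred) pair. The abstract does not specify the matching procedure or the rule format, and the sketch attaches semantic types to the whole lexeme rather than to each syntactic type, which is a simplification of the structure the abstract describes.

```python
from collections import defaultdict

# Toy data invented for illustration; not taken from the Specialist Lexicon
# or the 1997 Metathesaurus.
lexicon = {  # lexeme -> syntactic types
    "discharge": {"noun", "verb"},
    "cold": {"noun", "adj"},
    "aspirin": {"noun"},
    "walk": {"verb"},
}
metathesaurus = [  # (string, semantic type)
    ("Discharge", "Body Substance"),
    ("discharge", "Health Care Activity"),
    ("Cold", "Disease or Syndrome"),
    ("cold", "Natural Phenomenon or Process"),
    ("Aspirin", "Pharmacologic Substance"),
]


def normalize(s):
    """Assumed matching key: lowercase with collapsed whitespace."""
    return " ".join(s.lower().split())


def build_semantic_lexicon(lexicon, metathesaurus):
    """Attach to each lexeme every semantic type of a matching string."""
    by_string = defaultdict(set)
    for string, sem_type in metathesaurus:
        by_string[normalize(string)].add(sem_type)
    return {
        lexeme: {
            "syntactic": set(syn_types),
            "semantic": set(by_string.get(normalize(lexeme), set())),
        }
        for lexeme, syn_types in lexicon.items()
    }


def apply_preferences(sem_types, rules):
    """Drop a dispreferred type whenever its preferred rival is also present."""
    kept = set(sem_types)
    for preferred, dispreferred in rules:
        if preferred in kept and dispreferred in kept:
            kept.discard(dispreferred)
    return kept


def ambiguous(sem_lex):
    """Lexemes that still carry two or more semantic types."""
    return sorted(l for l, e in sem_lex.items() if len(e["semantic"]) > 1)


# Hypothetical rules of the kind a concordance review might suggest.
rules = [
    ("Health Care Activity", "Body Substance"),
    ("Disease or Syndrome", "Natural Phenomenon or Process"),
]

sem_lex = build_semantic_lexicon(lexicon, metathesaurus)
reduced = {
    lexeme: {**entry, "semantic": apply_preferences(entry["semantic"], rules)}
    for lexeme, entry in sem_lex.items()
}

print("ambiguous before rules:", ambiguous(sem_lex))
print("ambiguous after rules: ", ambiguous(reduced))
```

In this toy setting, "walk" has syntactic information but no semantic type, which mirrors the gap between syntactic and semantic coverage reported in the results. Rules of this pairwise kind ignore sentence context, in line with the conclusion's note that exploiting contextual information remains further work.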


Year:  1999        PMID: 10332654      PMCID: PMC61361          DOI: 10.1136/jamia.1999.0060205

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


