Aron Henriksson, Mike Conway, Martin Duneld, Wendy W Chapman.
Abstract
Medical terminologies and ontologies are important tools for natural language processing of health record narratives. To account for the variability of language use, synonyms need to be stored in a semantic resource as textual instantiations of a concept. Developing such resources manually is, however, prohibitively expensive and likely to result in low coverage. To facilitate and expedite the process of lexical resource development, distributional analysis of large corpora provides a powerful data-driven means of (semi-)automatically identifying semantic relations, including synonymy, between terms. In this paper, we demonstrate how distributional analysis of a large corpus of electronic health records, the MIMIC-II database, can be employed to extract synonyms of SNOMED CT preferred terms. A distinctive feature of our method is its ability to identify synonymous relations between terms of varying length.
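The general idea of distributional analysis can be illustrated with a minimal sketch. It is not the method used in the paper, which the abstract does not specify: it assumes a plain co-occurrence count model with cosine similarity, a toy corpus invented for illustration, and multiword terms pre-joined with underscores so that terms of different lengths can be compared as single tokens. The names `context_vectors`, `cosine`, and `nearest_terms` are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=2):
    """Count, for each term, the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, term in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[term][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def nearest_terms(target, vectors, k=3):
    """Rank all other terms by similarity of their contexts to the target's."""
    target_vec = vectors.get(target, Counter())
    scores = [
        (cosine(target_vec, vec), term)
        for term, vec in vectors.items()
        if term != target
    ]
    return [term for _, term in sorted(scores, reverse=True)[:k]]


# Toy corpus (assumption): multiword terms are already joined with underscores.
sentences = [
    "patient admitted with heart_attack and chest pain".split(),
    "patient admitted with myocardial_infarction and chest pain".split(),
    "patient denies fever or cough today".split(),
]
vectors = context_vectors(sentences)
print(nearest_terms("heart_attack", vectors, k=1))  # ['myocardial_infarction']
```

Terms that occur in the same contexts receive similar vectors, so candidate synonyms surface as nearest neighbours. On a real corpus the ranked candidates would still need filtering or manual review, in line with the (semi-)automatic framing above.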
Year: 2013 PMID: 24551362 PMCID: PMC3900203
Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076