Rosa L Figueroa, Qing Zeng-Treitler, Sergey Goryachev, Eduardo P Wiechmann.
Abstract
We developed a method to help tailor a comprehensive vocabulary system (e.g. the UMLS) to a sub-domain (e.g. clinical reports) in support of natural language processing (NLP). The method detects senses that are unused in a sub-domain by comparing the relational neighborhood of a word/term in the vocabulary with the semantic neighborhood of the same word/term in the sub-domain. The semantic neighborhood in the sub-domain is determined using latent semantic analysis (LSA). We trained and tested the unused sense detection on two clinical text corpora: one contains discharge summaries and the other contains outpatient visit notes. We were able to detect unused senses with precision from 79% to 87%, recall from 48% to 74%, and an area under the receiver operating characteristic curve (AUC) of 72% to 87%.
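The neighborhood comparison described in the abstract can be illustrated with a minimal sketch. It assumes that LSA term vectors for the sub-domain corpus have already been computed, and that each sense of a term comes with a set of related terms drawn from the vocabulary. The function names, the toy vectors, the overlap score, and the `k` and `min_overlap` parameters are all hypothetical choices for illustration; the abstract does not specify the authors' actual decision rule.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_neighborhood(term, vectors, k):
    """Return the k terms closest to `term` in the (assumed) LSA space."""
    scores = [
        (cosine(vectors[term], vec), other)
        for other, vec in vectors.items()
        if other != term
    ]
    scores.sort(reverse=True)
    return {other for _, other in scores[:k]}


def unused_senses(term, sense_neighbors, vectors, k=3, min_overlap=1):
    """Flag senses whose relational neighbors barely overlap with the
    term's semantic neighbors in the sub-domain (illustrative rule only)."""
    sem = semantic_neighborhood(term, vectors, k)
    return [
        sense
        for sense, related in sense_neighbors.items()
        if len(sem & related) < min_overlap
    ]


# Toy, hand-made vectors standing in for LSA output on a clinical corpus.
vectors = {
    "cold": [0.9, 0.1],
    "cough": [0.8, 0.2],
    "fever": [0.85, 0.15],
    "rhinitis": [0.7, 0.3],
    "temperature": [0.1, 0.9],
    "ice": [0.05, 0.95],
}

# Hypothetical relational neighborhoods, one per sense of "cold".
sense_neighbors = {
    "common cold (disease)": {"cough", "fever", "rhinitis"},
    "cold temperature": {"ice", "temperature"},
}

flagged = unused_senses("cold", sense_neighbors, vectors)
print(flagged)
```

In this toy setup the term "cold" sits near symptom terms, so only the temperature sense is flagged as unused. A real system would need vectors derived from the corpus and a threshold tuned on annotated data.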
Year: 2009 PMID: 20351847 PMCID: PMC2815465
Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076