Pierre Zweigenbaum, Natalia Grabar.
Abstract
Knowledge of morphologically derived words, as provided for medical English by the UMLS Specialist Lexicon, is useful to detect term variants for automated coding and indexing. For most other languages though, no comparable morphological knowledge base is available. We therefore endeavored to design general methods to help collect such knowledge for a given language. We propose here a method for discovering derived words in text corpora and apply it to a French medical corpus. To evaluate this method, we study its ability to suggest derived adjectives for 2,297 nouns found in the SNOMED nomenclature, which itself specifies adjectival equivalents for some of its terms. Of the proposed adjectives, 74% are judged correct (precision), and they cover 16% of these nouns (recall), a larger amount than what SNOMED already specifies. Furthermore, the corpus suggests additional adjectives which can increase the number of adjectives in SNOMED by 76%. We conclude that such a method can help speed up the construction of a morphological knowledge base which can increase the number of term variants in an existing controlled vocabulary.
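The two evaluation figures above can be illustrated with a minimal Python sketch. It assumes one plausible reading of the abstract: precision is the share of proposed noun-adjective pairs judged correct, and recall is the share of nouns that receive at least one correct adjective. The function name `evaluate`, the data layout, and the toy French word pairs are hypothetical and are not taken from the paper.

```python
def evaluate(proposals, judged_correct, nouns):
    """Compute precision and noun coverage (recall) for proposed adjectives.

    proposals: dict mapping each noun to the set of adjectives proposed for it
    judged_correct: set of (noun, adjective) pairs a reviewer accepted
    nouns: list of all nouns under evaluation
    """
    # Flatten the proposals into (noun, adjective) pairs.
    proposed_pairs = {(n, a) for n, adjs in proposals.items() for a in adjs}
    correct_pairs = proposed_pairs & judged_correct

    # Precision: fraction of proposed pairs that were judged correct.
    precision = len(correct_pairs) / len(proposed_pairs) if proposed_pairs else 0.0

    # Recall (assumed reading): fraction of nouns with at least one correct adjective.
    covered_nouns = {n for n, _ in correct_pairs}
    recall = len(covered_nouns) / len(nouns) if nouns else 0.0
    return precision, recall


# Hypothetical toy data, for illustration only.
nouns = ["abdomen", "artère", "rein", "coeur", "os"]
proposals = {
    "abdomen": {"abdominal"},
    "artère": {"artériel", "artisanal"},  # second proposal is a deliberate error
    "rein": {"rénal"},
}
judged_correct = {
    ("abdomen", "abdominal"),
    ("artère", "artériel"),
    ("rein", "rénal"),
}

precision, recall = evaluate(proposals, judged_correct, nouns)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

On this toy input, three of the four proposed pairs are correct and three of the five nouns are covered. The paper's own evaluation protocol may differ in detail from this reading.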
Year: 2003 PMID: 14728277 PMCID: PMC1480343
Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076