Martin Honeck, Udo Hahn, Rüdiger Klar, Stefan Schulz.
Abstract
In biomedical documents, there is ample evidence for complex morphological structures in specialized terms. While inflection is relatively easy to deal with, productive morphological processes such as derivation and single-word composition constitute a major challenge. Considering the problem from an information retrieval perspective, we split morphologically complex words into biomedically significant, morpheme-like subwords and match the subwords of which the query terms and document terms are composed. In this way, morphologically motivated word form alterations can be eliminated from the retrieval procedure. Based on a series of retrieval experiments, we have gathered evidence that subword-based indexing and retrieval outperforms conventional string matching approaches, at least for the German biomedical sublanguage.
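The subword idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy lexicon `SUBWORDS`, the linking-element set `LINKERS`, and the functions `segment` and `subword_match` are hypothetical names introduced here, and the backtracking longest-match segmentation is only an assumed stand-in for whatever procedure the paper actually uses.

```python
# Toy subword lexicon and linking elements (hypothetical, for illustration only).
SUBWORDS = {"gastr", "o", "enter", "itis", "hepat"}
LINKERS = {"o"}  # assumed to carry no retrieval-relevant meaning


def segment(word, lexicon=SUBWORDS):
    """Split `word` into lexicon subwords, preferring longer prefixes.

    Backtracks when a longer prefix leads to a dead end.
    Returns a list of subwords, or None if no full segmentation exists.
    """
    word = word.lower()
    if not word:
        return []
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in lexicon:
            rest = segment(word[end:], lexicon)
            if rest is not None:
                return [head] + rest
    return None


def significant_subwords(word):
    """Return the set of non-linking subwords of `word` (empty if unsegmentable)."""
    parts = segment(word)
    if parts is None:
        return set()
    return {p for p in parts if p not in LINKERS}


def subword_match(query_term, document_term):
    """True if every significant subword of the query occurs in the document term."""
    query = significant_subwords(query_term)
    return bool(query) and query <= significant_subwords(document_term)


if __name__ == "__main__":
    print(segment("Gastroenteritis"))
    print(subword_match("Enteritis", "Gastroenteritis"))
```

In this sketch, a query for "Enteritis" does not equal the document term "Gastroenteritis" as a whole string, but it does match once both are reduced to their significant subwords.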
Year: 2002 PMID: 15460695
Source DB: PubMed Journal: Stud Health Technol Inform ISSN: 0926-9630