P Ruch, R Baud, A Geissbühler, A M Rassinoux.
Abstract
In this paper we compare two types of corpus, focusing on the lexical ambiguity of each. The first corpus consists mainly of general newspaper articles and literature excerpts, while the second belongs to the medical domain. To conduct the study, we used two different disambiguation tools. First, each tool was validated in its respective application area. We then used these systems to assess and compare both the general ambiguity rate and the particularities of each domain. Quantitative results show that medical documents are lexically less ambiguous than unrestricted documents. Our conclusions emphasize the importance of the application area in the design of NLP tools.
Year: 2001 PMID: 11604745
Source DB: PubMed Journal: Stud Health Technol Inform ISSN: 0926-9630