Bjoern-Toby Berster, J Caleb Goodwin, Trevor Cohen.
Abstract
Coping with the ambiguous meanings of words has long been a hurdle for information retrieval and natural language processing systems. This paper presents a new word sense disambiguation approach using high-dimensional binary vectors, which encode meanings of words based on the different contexts in which they occur. In our approach, a randomly constructed vector is assigned to each ambiguous term, and another to each sense of this term. In the context of a sense-annotated training set, a reversible vector transformation is used to combine these vectors, such that both the term and the sense assigned to a context in which the term occurs are encoded into vectors representing the surrounding terms in this context. When a new context is encountered, the information required to disambiguate this term is extracted from the trained semantic vectors for the terms in this context by reversing the vector transformation to recover the correct sense of the term. On repeated experiments using ten-fold cross-validation and a standard test set, we obtained results comparable to the best obtained in previous studies. These results demonstrate the potential of our methodology, and suggest directions for future research.
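The encode-then-reverse idea described in the abstract can be illustrated with a small sketch. The abstract does not name the reversible transformation or the way votes are combined, so the choices here are assumptions: bitwise XOR stands in for the reversible transformation (it is its own inverse), and a per-bit majority vote stands in for accumulating training contexts. The class name `Disambiguator`, the dimension `DIM`, and the toy training contexts are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict

DIM = 2048  # bits per vector; an assumed value for this sketch
rng = random.Random(0)


def random_vector():
    """Return a random DIM-bit vector stored as a Python int."""
    return rng.getrandbits(DIM)


def hamming(a, b):
    """Count the bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


class Disambiguator:
    """Hypothetical helper; XOR binding and majority voting are assumptions."""

    def __init__(self):
        # Random vectors for ambiguous terms and for their senses.
        self.elemental = defaultdict(random_vector)
        # Per-word vote counters: +1 for a 1 bit, -1 for a 0 bit.
        self.votes = defaultdict(lambda: [0] * DIM)

    def train(self, term, sense, context):
        # Bind term and sense; XOR is reversible because x ^ y ^ x == y.
        bound = self.elemental[term] ^ self.elemental[sense]
        for word in context:
            counts = self.votes[word]
            for i in range(DIM):
                counts[i] += 1 if (bound >> i) & 1 else -1

    def disambiguate(self, term, context, senses):
        # Pool the votes of all known context words, then threshold to bits.
        total = [0] * DIM
        for word in context:
            if word in self.votes:
                for i, c in enumerate(self.votes[word]):
                    total[i] += c
        pooled = sum(1 << i for i, c in enumerate(total) if c > 0)
        # Reverse the binding to recover an approximate sense vector.
        recovered = pooled ^ self.elemental[term]
        return min(senses, key=lambda s: hamming(recovered, self.elemental[s]))


# Toy sense-annotated training data (invented for illustration only).
senses = ["cold/temperature", "cold/illness"]
wsd = Disambiguator()
wsd.train("cold", "cold/temperature", ["weather", "winter", "freezing"])
wsd.train("cold", "cold/temperature", ["winter", "wind", "ice"])
wsd.train("cold", "cold/illness", ["cough", "fever", "virus"])
wsd.train("cold", "cold/illness", ["fever", "nose", "sneeze"])

print(wsd.disambiguate("cold", ["winter", "ice", "today"], senses))
print(wsd.disambiguate("cold", ["fever", "doctor"], senses))
```

In this toy setting the first call should select the temperature sense and the second the illness sense, because the context words were only ever trained with one sense each. Real contexts mix evidence for several senses, in which case the recovered vector is only approximately equal to a sense vector and the nearest candidate by Hamming distance is chosen.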
Year: 2012 PMID: 23304389 PMCID: PMC3540565
Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076