Pavel B Dobrokhotov, Cyril Goutte, Anne-Lise Veuthey, Eric Gaussier.
Abstract
The goal of medical annotation of human proteins in Swiss-Prot is to add features specifically intended for researchers working on genetic diseases and polymorphisms. For this purpose, it is necessary to search through a vast number of publications containing relevant information. Promising results have been obtained by applying natural language processing and machine learning techniques to this problem. Using the Probabilistic Latent Categorizer on representative query sets, 69% recall and 59% precision were achieved for relevant documents. The classifier also rejected irrelevant abstracts with more than 96% precision. Better linguistic pre-processing of source documents can further improve this computational approach.
Year: 2003 PMID: 14664023
Source DB: PubMed Journal: Stud Health Technol Inform ISSN: 0926-9630