Rutabaga by any other name: extracting biological names.

Lynette Hirschman, Alexander A Morgan, Alexander S Yeh.

Abstract

As the pace of biological research accelerates, biologists are becoming increasingly reliant on computers to manage the information explosion. Biologists communicate their research findings by relying on precise biological terms; these terms then provide indices into the literature and across the growing number of biological databases. This article examines emerging techniques to access biological resources through extraction of entity names and relations among them. Information extraction has been an active area of research in natural language processing and there are promising results for information extraction applied to news stories, e.g., balanced precision and recall in the 93-95% range for identifying person, organization and location names. But these results do not seem to transfer directly to biological names, where results remain in the 75-80% range. Multiple factors may be involved, including absence of shared training and test sets for rigorous measures of progress, lack of annotated training data specific to biological tasks, pervasive ambiguity of terms, frequent introduction of new terms, and a mismatch between evaluation tasks as defined for news and real biological problems. We present evidence from a simple lexical matching exercise that illustrates some specific problems encountered when identifying biological names. We conclude by outlining a research agenda to raise performance of named entity tagging to a level where it can be used to perform tasks of biological importance.
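The abstract refers to a simple lexical matching exercise and to precision and recall as the evaluation measures. The following is a minimal sketch of that kind of exercise, not the authors' method: the lexicon, the example sentence, the gold annotations, and the helper names `tag_names` and `precision_recall` are all made up for illustration. It shows how an ordinary-word use of a gene name (here "rutabaga" as a vegetable) lowers precision even when recall is perfect.

```python
import re


def tag_names(text, lexicon):
    """Return the set of (start, end, name) spans where a lexicon entry matches."""
    spans = set()
    for name in lexicon:
        # Whole-word, case-insensitive match. This is deliberately naive:
        # it cannot tell a gene mention from an ordinary English word.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            spans.add((m.start(), m.end(), name.lower()))
    return spans


def precision_recall(predicted, gold):
    """Compute precision and recall over exact span matches."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: a two-entry lexicon and one invented sentence.
lexicon = {"rutabaga", "dunce"}
text = ("Mutations in rutabaga and dunce impair learning, "
        "but a rutabaga is also a root vegetable.")

# Assumed gold standard: only the first "rutabaga" and "dunce" are gene mentions.
first_rutabaga = text.index("rutabaga")
dunce = text.index("dunce")
gold = {
    (first_rutabaga, first_rutabaga + len("rutabaga"), "rutabaga"),
    (dunce, dunce + len("dunce"), "dunce"),
}

predicted = tag_names(text, lexicon)
precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the matcher finds every gold mention but also tags the vegetable sense of "rutabaga", which is one instance of the term ambiguity the abstract lists among the obstacles.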

Entities: Species

Year: 2002 PMID: 12755519 DOI: 10.1016/s1532-0464(03)00014-5

Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317

Cited

27 in total

1. Cross-species gene normalization by species inference.

2. Recognizing Medication related Entities in Hospital Discharge Summaries using Support Vector Machine.

3. Using co-occurrence network structure to extract synonymous gene and protein names from MEDLINE abstracts.

4. Computer and Internet Utilization among the Medical Students in Qassim University, Saudi Arabia.

5. Throw the bath water out, keep the baby: keeping medically-relevant terms for text mining.

6. Getting started in text mining: part two.

7. Seeking a new biology through text mining.

8. Prioritizing PubMed articles for the Comparative Toxicogenomic Database utilizing semantic information.

9. An automated framework for hypotheses generation using literature.

10. Recognition of medication information from discharge summaries using ensembles of classifiers.