Unsupervised biomedical named entity recognition: experiments with clinical and biological texts.

Abstract

Named entity recognition is a crucial component of biomedical natural language processing, enabling information extraction and ultimately reasoning over and knowledge discovery from text. Much progress has been made in the design of rule-based and supervised tools, but they are often genre and task dependent. As such, adapting them to different genres of text or identifying new types of entities requires major effort in re-annotation or rule development. In this paper, we propose an unsupervised approach to extracting named entities from biomedical text. We describe a stepwise solution to tackle the challenges of entity boundary detection and entity type classification without relying on any handcrafted rules, heuristics, or annotated data. A noun phrase chunker followed by a filter based on inverse document frequency extracts candidate entities from free text. Classification of candidate entities into categories of interest is carried out by leveraging principles from distributional semantics. Experiments show that our system, especially the entity classification step, yields competitive results on two popular biomedical datasets of clinical notes and biological literature, and outperforms a baseline dictionary match approach. Detailed error analysis provides a road map for future work.
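The two steps described in the abstract (an inverse-document-frequency filter over noun-phrase candidates, then classification by distributional similarity) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the mean-IDF scoring rule, the threshold, the context window size, the seed terms, and the toy sentences are all assumptions made for illustration, and the noun phrase chunker is replaced by a hand-written candidate list.

```python
import math
from collections import Counter


def idf_scores(documents):
    """Inverse document frequency of each token over tokenized documents."""
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    return {tok: math.log(len(documents) / df) for tok, df in doc_freq.items()}


def filter_candidates(phrases, idf, threshold):
    """Keep phrases whose mean token IDF reaches the threshold.

    Assumption: unseen tokens score 0.0, a simplification for this sketch.
    """
    kept = []
    for phrase in phrases:
        scores = [idf.get(tok, 0.0) for tok in phrase.lower().split()]
        if scores and sum(scores) / len(scores) >= threshold:
            kept.append(phrase)
    return kept


def context_vector(phrase, sentences, window=2):
    """Count the words within `window` tokens of each occurrence of phrase."""
    target = phrase.lower().split()
    n = len(target)
    counts = Counter()
    for tokens in sentences:
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == target:
                counts.update(tokens[max(0, i - window):i])
                counts.update(tokens[i + n:i + n + window])
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def classify(phrase, sentences, seeds, window=2):
    """Return the class whose pooled seed-term contexts best match the phrase."""
    vec = context_vector(phrase, sentences, window)
    best_label, best_sim = None, 0.0
    for label, terms in seeds.items():
        signature = Counter()
        for term in terms:
            signature.update(context_vector(term, sentences, window))
        sim = cosine(vec, signature)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label


# Toy corpus: each sentence doubles as one "document" for the IDF counts.
sentences = [
    ["the", "patient", "was", "given", "aspirin", "daily"],
    ["the", "patient", "was", "given", "ibuprofen", "daily"],
    ["the", "patient", "denies", "chest", "pain", "today"],
    ["the", "patient", "denies", "nausea", "today"],
]
candidates = ["the patient", "aspirin", "chest pain", "nausea", "ibuprofen"]
seeds = {"treatment": ["aspirin"], "problem": ["chest pain"]}  # hypothetical seeds

kept = filter_candidates(candidates, idf_scores(sentences), threshold=1.0)
labels = {p: classify(p, sentences, seeds) for p in ("ibuprofen", "nausea")}
print(kept)
print(labels)
```

In this sketch the ubiquitous phrase "the patient" falls below the IDF threshold, while the remaining candidates are labeled by comparing the words around them with the words around the seed terms. No labeled training sentences are involved, which is the sense in which the approach is unsupervised.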

Keywords: Chunking; Distributional semantics; Named entity recognition; Natural language processing; UMLS

Year: 2013
PMID: 23954592
PMCID: PMC3865922
DOI: 10.1016/j.jbi.2013.08.004

Source DB: PubMed
Journal: J Biomed Inform
ISSN: 1532-0464
Impact factor: 6.317
