Min Song, Hwanjo Yu, Wook-Shin Han.
Abstract
BACKGROUND: Bio-entity extraction is a pivotal component of information extraction from biomedical literature. Dictionary-based bio-entity extraction is the first generation of Named Entity Recognition (NER) techniques.
Year: 2015 PMID: 26043907 PMCID: PMC4460617 DOI: 10.1186/1472-6947-15-S1-S9
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Algorithm of the proposed technique.
| Given a dictionary |
|---|
| 1: Apply the approximate string matching technique |
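
The first step above applies approximate string matching against a dictionary. The sketch below only illustrates the general idea of tolerant dictionary lookup; it is not the authors' matcher. It swaps in the similarity ratio from Python's standard `difflib` module, and the function name `lookup_entity`, the cutoff of 0.85, and the toy dictionary entries are all assumptions made for illustration.

```python
import difflib
from typing import Optional, Sequence


def lookup_entity(mention: str, dictionary: Sequence[str],
                  cutoff: float = 0.85) -> Optional[str]:
    """Return the dictionary entry closest to `mention`, or None if no
    entry reaches the similarity cutoff (assumed value, not from the paper)."""
    # Compare case-insensitively, but return the entry in its original form.
    by_lowercase = {entry.lower(): entry for entry in dictionary}
    matches = difflib.get_close_matches(
        mention.lower(), list(by_lowercase), n=1, cutoff=cutoff
    )
    return by_lowercase[matches[0]] if matches else None


# Hypothetical dictionary entries, for illustration only.
toy_dictionary = ["interleukin-2", "NF-kappa B", "T cell"]

print(lookup_entity("Interleukin 2", toy_dictionary))  # spelling variant
print(lookup_entity("hemoglobin", toy_dictionary))     # not in the dictionary
```

With this toy dictionary, the spelling variant resolves to its dictionary entry, while a term with no close entry yields `None`. A real system would tune the cutoff against annotated data.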
Figure 1. Portion of the MeSH tree hierarchy.
Figure 2. Sample string alignment.
Figure 3. Weighted directed acyclic graph.
Figure 4. Graph constructed from the 6x4 lattice.
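
The captions for Figures 3 and 4 refer to a weighted directed acyclic graph built from a lattice. One common way to read such a graph, sketched here under stated assumptions, is that the edit distance between two strings equals the cheapest path from the top-left to the bottom-right node of a lattice with one row per source position and one column per target position. Under that convention a 5-character string aligned with a 3-character string gives a 6x4 lattice. The unit edge weights and the function name `lattice_edit_distance` are assumptions; the weights used in the paper are not given in this record.

```python
def lattice_edit_distance(source: str, target: str) -> int:
    """Cheapest path through the alignment lattice, with unit edge weights
    (an assumption): down = deletion, right = insertion,
    diagonal = match (cost 0) or substitution (cost 1)."""
    rows, cols = len(source) + 1, len(target) + 1
    inf = float("inf")
    dist = [[inf] * cols for _ in range(rows)]
    dist[0][0] = 0

    # Row-major order is a topological order of the lattice DAG, so every
    # node is final before its outgoing edges are relaxed.
    for i in range(rows):
        for j in range(cols):
            here = dist[i][j]
            if i + 1 < rows:  # deletion edge
                dist[i + 1][j] = min(dist[i + 1][j], here + 1)
            if j + 1 < cols:  # insertion edge
                dist[i][j + 1] = min(dist[i][j + 1], here + 1)
            if i + 1 < rows and j + 1 < cols:  # match / substitution edge
                cost = 0 if source[i] == target[j] else 1
                dist[i + 1][j + 1] = min(dist[i + 1][j + 1], here + cost)
    return dist[-1][-1]


# A 5-character and a 3-character string give a 6x4 lattice of nodes.
print(lattice_edit_distance("abcde", "abd"))
```

Because the lattice is acyclic, a single pass in topological order is enough and no priority queue is needed.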
Experimental results of three different combinations of the proposed technique.

| NER technique | Test set | GENIA Precision | GENIA Recall | GENIA+MeSH Precision | GENIA+MeSH Recall | GENIA+MeSH+UMLS Precision | GENIA+MeSH+UMLS Recall |
|---|---|---|---|---|---|---|---|
| The proposed technique | A | 98.7% | 71.4% | 83.3% | 71.4% | 70.7% | 68.9% |
| The proposed technique | B | 94.8% | 62.4% | 90.3% | 76.9% | 90.1% | 74.1% |
| The proposed technique | C | 93.0% | 57.3% | 87.8% | 72.0% | 88.3% | 68.4% |
| Context Only | A | 69.4% | 68.5% | 39.6% | 80.7% | 33.8% | 74.9% |
| Context Only | B | 75.9% | 66.7% | 55.0% | 84.8% | 50.5% | 82.6% |
| Context Only | C | 76.4% | 62.2% | 56.8% | 81.0% | 51.7% | 78.2% |
| SPED Only | A | 98.7% | 44.1% | 94.4% | 60.6% | 94.8% | 63.4% |
| SPED Only | B | 99.3% | 41.8% | 94.3% | 71.7% | 94.5% | 72.9% |
| SPED Only | C | 97.6% | 37.5% | 93.2% | 64.1% | 93.6% | 66.7% |

Basic statistics of the test data.

| Test set | # of abstracts | # of tokens |
|---|---|---|
| A (1978-1989) | 104 | 22,320 |
| B (1990-1999) | 106 | 25,080 |
| C (2000-2001) | 130 | 33,380 |

Figure 5. Performance comparison on GENIA data (F-measure).
Figure 6. Performance comparison on GENIA+MeSH data (F-measure).
Figure 7. Performance comparison on GENIA+MeSH+UMLS data (F-measure).
Performance comparison between the proposed technique and Zhou and Su's technique (P, R, and F denote precision, recall, and F-measure, respectively).

| Technique | Measure | Test set A | Test set B | Test set C |
|---|---|---|---|---|
| The proposed technique | P | 70.7 | 90.1 | 88.3 |
| The proposed technique | R | 68.9 | 74.1 | 68.4 |
| The proposed technique | F | 69.8 | 81.3 | 77.1 |
| Zho04 | P | 75.3 | 77.1 | 75.6 |
| Zho04 | R | 69.5 | 69.2 | 71.3 |
| Zho04 | F | 72.3 | 72.9 | 73.8 |
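
The record does not state the F-measure formula. The F values reported for the proposed technique are consistent with the balanced harmonic mean of precision and recall, F = 2PR / (P + R), so the short sketch below assumes that definition. The function name `f_measure` is illustrative; the precision and recall pairs are copied from the rows for the proposed technique.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.
    Assumed definition; the record itself does not give the formula."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) of the proposed technique on test sets A, B, and C.
proposed = {"A": (70.7, 68.9), "B": (90.1, 74.1), "C": (88.3, 68.4)}

for test_set, (p, r) in proposed.items():
    print(test_set, round(f_measure(p, r), 1))
# prints: A 69.8, B 81.3, C 77.1 (one pair per line)
```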