Robert Leaman, Rezarta Islamaj Dogan, Zhiyong Lu.
Abstract
MOTIVATION: Despite the central role of diseases in biomedical research, there have been much fewer attempts to automatically determine which diseases are mentioned in a text (the task of disease name normalization, DNorm) compared with other normalization tasks in biomedical text mining research.
Year: 2013 PMID: 23969135 PMCID: PMC3810844 DOI: 10.1093/bioinformatics/btt474
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Size of the NCBI disease corpus
| Setup | Abstracts | Mentions | Concepts |
|---|---|---|---|
| Training subset | 593 | 5145 | 670 |
| Development subset | 100 | 787 | 176 |
| Test subset | 100 | 960 | 203 |
Fig. 1. The DNorm disease normalization pipeline, with examples, as described in Section 2.1
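The pipeline in Figure 1 first recognizes disease mentions (BANNER) and then ranks candidate lexicon concepts against each mention. The sketch below illustrates only the ranking step under the paper's TF-IDF plus bilinear-score formulation (score = mᵀWn); the toy lexicon, tokenizer, and identity initialization of W are illustrative assumptions, not the DNorm implementation or the MEDIC vocabulary.

```python
# Illustrative ranking step for Fig. 1: score each candidate concept name
# against a recognized mention with a bilinear similarity m^T W n over
# TF-IDF-style vectors. The tiny lexicon and tokenizer are toy assumptions,
# not the MEDIC vocabulary or the actual DNorm code.
import numpy as np

LEXICON = {  # concept ID -> synonym strings (toy example)
    "MESH:D003920": ["diabetes mellitus", "diabetes"],
    "MESH:D009369": ["neoplasms", "tumor", "cancer"],
}

def tokens(text):
    return text.lower().split()

VOCAB = sorted({t for names in LEXICON.values() for name in names for t in tokens(name)})
INDEX = {t: i for i, t in enumerate(VOCAB)}

def vectorize(text):
    """Unit-length term-frequency vector; IDF weighting is omitted for brevity."""
    v = np.zeros(len(VOCAB))
    for t in tokens(text):
        if t in INDEX:
            v[INDEX[t]] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

# W is what DNorm learns by pairwise learning to rank; the identity is used
# here only so the sketch runs, reducing the score to cosine similarity.
W = np.eye(len(VOCAB))

def rank_candidates(mention):
    m = vectorize(mention)
    scored = [(max(float(m @ W @ vectorize(name)) for name in names), cid)
              for cid, names in LEXICON.items()]
    return sorted(scored, reverse=True)

print(rank_candidates("type 2 diabetes"))  # best-scoring concept first
```

With W fixed at the identity, the score reduces to cosine similarity over unit TF-IDF vectors, essentially the "BANNER + cosine similarity" baseline in the tables below; DNorm instead learns W by pairwise learning to rank.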
Micro-averaged performance comparing the pLTR method against several baseline approaches, with the highest value in bold
| Setup | Precision | Recall | F-measure |
|---|---|---|---|
| NLM Lexical Normalization | 0.218 | 0.685 | 0.331 |
| MetaMap | 0.502 | 0.665 | 0.572 |
| Inference method | 0.533 | 0.662 | 0.591 |
| BANNER + Lucene | 0.612 | 0.647 | 0.629 |
| BANNER + cosine similarity | 0.649 | 0.674 | 0.661 |
| DNorm (BANNER + pLTR) | | | |
Macro-averaged performance comparing the pLTR method against several baseline approaches, with the highest value in bold
| Setup | Precision | Recall | F-measure |
|---|---|---|---|
| NLM Lexical Normalization | 0.213 | 0.718 | 0.316 |
| MetaMap | 0.510 | 0.702 | 0.559 |
| Inference method | 0.597 | 0.731 | 0.637 |
| BANNER + Lucene | 0.662 | 0.714 | 0.673 |
| BANNER + cosine similarity | 0.692 | 0.732 | 0.711 |
| DNorm (BANNER + pLTR) | | | |
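The two tables above differ only in how the scores are averaged. As a generic sketch (the grouping the paper averages over, e.g. per abstract, is not restated here), micro-averaging pools true positive, false positive and false negative counts before computing precision, recall and F-measure, while macro-averaging computes them per group and averages the results.

```python
# Generic sketch of micro- vs macro-averaging for the two tables above.
# Micro-averaging pools counts over all groups before computing P/R/F;
# macro-averaging computes P/R/F per group and averages. The toy counts
# below are illustrative only.
def prf(tp, fp, fn):
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

def micro_average(counts):
    """counts: list of (tp, fp, fn) tuples, one per group."""
    return prf(sum(c[0] for c in counts),
               sum(c[1] for c in counts),
               sum(c[2] for c in counts))

def macro_average(counts):
    scores = [prf(*c) for c in counts]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))

counts = [(8, 2, 1), (1, 0, 3)]   # toy per-group counts
print(micro_average(counts))      # pooled precision, recall, F
print(macro_average(counts))      # per-group averaged precision, recall, F
```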
Fig. 2. Comparison between BANNER + Lucene, BANNER + cosine similarity and DNorm (BANNER + pLTR) of the micro-averaged recall when considering a concept to be found if it appears in the top n ranked results
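Figure 2's metric counts a gold concept as found when it appears among the top n ranked candidates for its mention. A minimal sketch of that recall-at-n computation, assuming parallel lists of ranked predictions and gold concepts (the paper reports it micro-averaged over the corpus):

```python
# Recall-at-n as used in Fig. 2: a gold concept counts as found if it appears
# among the top n ranked candidates returned for its mention.
def recall_at_n(ranked_predictions, gold_concepts, n):
    """Parallel lists: ranked_predictions[i] is the ranked candidate list
    for mention i, gold_concepts[i] its annotated concept."""
    found = sum(1 for preds, gold in zip(ranked_predictions, gold_concepts)
                if gold in preds[:n])
    return found / len(gold_concepts) if gold_concepts else 0.0

# Toy usage: the gold concept is ranked 1st for one mention, 3rd for the other.
preds = [["MESH:D003920", "MESH:D009369"],
         ["MESH:D001943", "MESH:D009369", "MESH:D003920"]]
gold = ["MESH:D003920", "MESH:D003920"]
print(recall_at_n(preds, gold, 1))  # 0.5
print(recall_at_n(preds, gold, 3))  # 1.0
```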
Effect of varying the learning rate (λ) on the number of training iterations performed, total training time and the resulting micro-averaged F-measure. The highest performance is shown in bold
| λ | Iterations | Time (min) | F-measure |
|---|---|---|---|
| 10⁻² | 4 | 10.7 | 0.743 |
| 10⁻³ | 4 | 13.3 | 0.765 |
| 10⁻⁴ | 4 | 48.8 | |
| 10⁻⁵ | 2 | 124.0 | 0.762 |
| 10⁻⁶ | 8 | 986.6 | 0.775 |
| 10⁻⁷ | 17 | 4656.5 | 0.770 |
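The table above shows how the stochastic gradient descent learning rate λ trades training iterations and wall-clock time against the resulting micro-averaged F-measure. Below is a sketch of a pairwise learning-to-rank update of the general form used for the bilinear score mᵀWn, with λ as the step size; the violation test, candidate sampling and convergence check are simplified assumptions rather than the paper's exact training procedure.

```python
# Sketch of a pairwise learning-to-rank step for the bilinear score m^T W n,
# showing where the learning rate (lambda in the table above) enters. The
# violation test, negative sampling and convergence check are simplified
# assumptions, not the paper's exact procedure.
import numpy as np

def pairwise_update(W, m, n_pos, n_neg, lam):
    """One SGD step: push score(m, n_pos) above score(m, n_neg)."""
    if m @ W @ n_pos <= m @ W @ n_neg:  # ranking constraint violated
        W = W + lam * (np.outer(m, n_pos) - np.outer(m, n_neg))
    return W

def train(pairs, dim, lam, max_epochs=50, tol=1e-4):
    """pairs: iterable of (mention, positive_name, negative_name) vectors."""
    W = np.eye(dim)
    for _ in range(max_epochs):
        before = W.copy()
        for m, n_pos, n_neg in pairs:
            W = pairwise_update(W, m, n_pos, n_neg, lam)
        if np.linalg.norm(W - before) < tol:  # stop when W stabilizes
            break
    return W
```

Smaller values of λ take smaller steps and so generally need more passes before W stabilizes, consistent with the growth in iterations and training time seen in the table.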
Fig. 3. Summary of error analysis. Errors in the NER and ranking components contributed >95% of the total errors