Chih-Hsuan Wei, Yifan Peng, Robert Leaman, Allan Peter Davis, Carolyn J Mattingly, Jiao Li, Thomas C Wiegers, Zhiyong Lu.
Abstract
Manually curating chemicals, diseases and their relationships is important to biomedical research, but it is hampered by its high cost and the rapid growth of the biomedical literature. In recent years, there has been growing interest in developing computational approaches for automatic chemical-disease relation (CDR) extraction. Despite these attempts, the lack of a comprehensive benchmarking dataset has limited the comparison of different techniques needed to assess and advance the state of the art. To this end, we organized a challenge task through BioCreative V to automatically extract CDRs from the literature. We designed two challenge tasks: disease named entity recognition (DNER) and chemical-induced disease (CID) relation extraction. To assist system development and assessment, we created a large annotated text corpus consisting of human annotations of chemicals, diseases and their interactions from 1500 PubMed articles. In total, 34 teams worldwide participated in the CDR task: 16 (DNER) and 18 (CID). The best systems achieved an F-score of 86.46% for the DNER task, a result that approaches the human inter-annotator agreement (88.75%), and an F-score of 57.03% for the CID task, the highest results ever reported for such tasks. When team results were combined via machine learning, the ensemble system improved further over the best team results, achieving F-scores of 88.89% and 62.80% for the DNER and CID tasks, respectively. Another novel aspect of our evaluation was testing each participating system's ability to return real-time results: the average response times of the teams' DNER and CID web service systems were 5.6 and 9.3 s, respectively. Most teams submitted hybrid systems based on machine learning.
Given the level of participation and results, we found our task to be successful in engaging the text-mining research community, producing a large annotated corpus and improving the results of automatic disease recognition and CDR extraction. Database URL: http://www.biocreative.org/tasks/biocreative-v/track-3-cdr/. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.
Year: 2016 PMID: 26994911 PMCID: PMC4799720 DOI: 10.1093/database/baw032
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. The pipeline of the task workflow. The task organization is shown in purple; corpus development is shown in green; and team participation is shown in red.
Statistics of the CDR data sets
| Task dataset | Articles | Chemical mentions | Chemical IDs | Disease mentions | Disease IDs | CID relations |
|---|---|---|---|---|---|---|
| Training | 500 | 5,203 | 1,467 | 4,182 | 1,965 | 1,038 |
| Development | 500 | 5,347 | 1,507 | 4,244 | 1,865 | 1,012 |
| Test | 500 | 5,385 | 1,435 | 4,424 | 1,988 | 1,066 |
Figure 2. DNER results of all teams as well as the baseline (dictionary look-up) and DNorm systems.
Figure 3. CID results of all teams as well as two variants of the co-occurrence baseline method (i.e. abstract- and sentence-level).
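The record does not spell out how the co-occurrence baselines were implemented. As a minimal sketch, assuming the usual reading (every chemical is paired with every disease found in the same abstract, or in the same sentence), the two variants might look like this. The function name, the input layout and the placeholder identifiers `C1`, `C2`, `D1` are illustrative assumptions, not part of the task definition.

```python
from itertools import product

def cooccurrence_pairs(sentences, level="abstract"):
    """Predict chemical-disease pairs by simple co-occurrence.

    `sentences` is a list with one (chemical_ids, disease_ids) tuple per
    sentence of a single abstract (hypothetical input layout).
    """
    if level == "abstract":
        # Pool all entities of the abstract, then pair every chemical
        # with every disease.
        chemicals = set().union(*(c for c, _ in sentences))
        diseases = set().union(*(d for _, d in sentences))
        return set(product(chemicals, diseases))
    if level == "sentence":
        # Pair only entities that appear in the same sentence.
        pairs = set()
        for chemicals, diseases in sentences:
            pairs.update(product(chemicals, diseases))
        return pairs
    raise ValueError(f"unknown level: {level!r}")

# Toy abstract with placeholder identifiers (not real MeSH IDs).
abstract = [({"C1"}, {"D1"}), ({"C2"}, set())]
print(sorted(cooccurrence_pairs(abstract, "abstract")))
print(sorted(cooccurrence_pairs(abstract, "sentence")))
```

Under this reading, the abstract-level variant always returns a superset of the sentence-level pairs, so it trades precision for recall.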
Figure 4. Average response time of each individual team for DNER and CID tasks.
DNER results using a combination method
| DNER | P | R | F |
|---|---|---|---|
| Best team result | 89.63 | 83.50 | 86.46 |
| Inter-annotator agreement | – | – | 88.75 |
| All teams combined (16 teams) | 93.21 | 84.96 | 88.89 |
Performance is shown in Precision (P), Recall (R) and F-score (F).
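The F-scores in the table above are consistent with the balanced F-score, i.e. the harmonic mean of precision and recall. The sketch below assumes that definition (the record itself does not state the formula) and recomputes F from the tabulated P and R values; the function name `f_score` is illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (harmonic mean), in the same units as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall (in %) copied from the DNER table above.
best_team = f_score(89.63, 83.50)
ensemble = f_score(93.21, 84.96)
print(f"best team: {best_team:.2f}, all teams combined: {ensemble:.2f}")
```

Because the tabulated P and R are themselves rounded, a recomputed F can differ from a reported value in the second decimal place.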
CID results using a combination method
| CID | P | R | F |
|---|---|---|---|
| Best team result | 55.67 | 58.44 | 57.03 |
| Crowdsourcing | 56.10 | 76.50 | 64.70 |
| All teams combined (18 teams) | 76.45 | 53.28 | 62.80 |
Performance is shown in Precision (P), Recall (R) and F-score (F).
Crowdsourcing results were obtained on 100 abstracts from the training set (Li T, Bravo A, Furlong LI, Good BM, Su AI. Extracting structured CID relations from free text via crowdsourcing. BioCreative V, 2015).
Overview of how many teams correctly identified DNER concepts and CID relations
| No. of teams | DNER concepts (No.) | DNER concepts (%) | CID relations (No.) | CID relations (%) |
|---|---|---|---|---|
| 0 | 103 | 5.18 | 128 | 12.01 |
| 1 | 44 | 2.21 | 71 | 6.66 |
| 2 | 39 | 1.96 | 74 | 6.94 |
| 3 | 31 | 1.56 | 69 | 6.47 |
| 4 | 35 | 1.76 | 63 | 5.91 |
| 5 | 25 | 1.26 | 69 | 6.47 |
| 6 | 25 | 1.26 | 55 | 5.16 |
| 7 | 35 | 1.76 | 46 | 4.32 |
| 8 | 31 | 1.56 | 45 | 4.22 |
| 9 | 59 | 2.97 | 42 | 3.94 |
| 10 | 87 | 4.38 | 40 | 3.75 |
| 11 | 111 | 5.58 | 34 | 3.19 |
| 12 | 137 | 6.89 | 53 | 4.97 |
| 13 | 135 | 6.79 | 42 | 3.94 |
| 14 | 194 | 9.76 | 44 | 4.13 |
| 15 | 416 | 20.93 | 40 | 3.75 |
| 16 | 481 | 24.20 | 39 | 3.66 |
| 17 | — | — | 50 | 4.69 |
| 18 | — | — | 62 | 5.82 |
| Sum | 1988 | 100.00 | 1066 | 100.00 |
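The percentage columns above follow directly from the counts, and the first row (items found by no team) bounds what any combination of the submitted outputs could recall. The sketch below recomputes both quantities from the tabulated counts; the helper names are illustrative, and the recall ceiling is a derived figure, not one reported in the source.

```python
# Gold-standard test-set items found by exactly k teams (k = list index),
# copied from the table above.
DNER_COUNTS = [103, 44, 39, 31, 35, 25, 25, 35, 31, 59, 87, 111,
               137, 135, 194, 416, 481]
CID_COUNTS = [128, 71, 74, 69, 63, 69, 55, 46, 45, 42, 40, 34,
              53, 42, 44, 40, 39, 50, 62]

def percentages(counts):
    """Share (%) of all items that falls into each row, rounded to 2 places."""
    total = sum(counts)
    return [round(100 * c / total, 2) for c in counts]

def recall_ceiling(counts):
    """Share (%) of items found by at least one team."""
    total = sum(counts)
    return 100 * (total - counts[0]) / total

print(percentages(DNER_COUNTS)[:3])
print(f"DNER ceiling: {recall_ceiling(DNER_COUNTS):.2f}%")
print(f"CID ceiling: {recall_ceiling(CID_COUNTS):.2f}%")
```

With these counts, a system that only selects among the teams' predictions could recall at most about 94.82% of DNER concepts and 87.99% of CID relations, since the remaining items were not returned by any team.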