Thomas C Wiegers, Allan Peter Davis, Carolyn J Mattingly.
Abstract
The Critical Assessment of Information Extraction systems in Biology (BioCreAtIvE) challenge evaluation tasks collectively represent a community-wide effort to evaluate a variety of text-mining and information extraction systems applied to the biological domain. The BioCreative IV Workshop included five independent subject areas, including Track 3, which focused on named-entity recognition (NER) for the Comparative Toxicogenomics Database (CTD; http://ctdbase.org). Previously, CTD had organized document ranking and NER-related tasks for the BioCreative Workshop 2012; a key finding of that effort was that interoperability and integration complexity were major impediments to the direct application of the systems to CTD's text-mining pipeline. This underscored a prevailing problem with software integration efforts. Major interoperability-related issues included lack of process modularity, operating system incompatibility, tool configuration complexity and lack of standardization of high-level inter-process communications. One approach to potentially mitigate interoperability and general integration issues is the use of Web services to abstract implementation details; rather than integrating NER tools directly, HTTP-based calls from CTD's asynchronous, batch-oriented text-mining pipeline could be made to remote NER Web services for recognition of specific biological terms, using BioC (an emerging family of XML formats) for inter-process communications. To test this concept, participating groups developed Representational State Transfer (REST)/BioC-compliant Web services tailored to CTD's NER requirements. Participants were provided with a comprehensive set of training materials. CTD evaluated results obtained from the remote Web service-based URLs against a test data set of 510 manually curated scientific articles. Twelve groups participated in the challenge. Recall, precision, balanced F-scores and response times were calculated.
Top balanced F-scores for gene, chemical and disease NER were 61, 74 and 51%, respectively. Response times ranged from fractions-of-a-second to over a minute per article. We present a description of the challenge and summary of results, demonstrating how curation groups can effectively use interoperable NER technologies to simplify text-mining pipeline implementation. Database URL: http://ctdbase.org/
Year: 2014 PMID: 24919658 PMCID: PMC4207221 DOI: 10.1093/database/bau050
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Web service-based NER logical design. Under a Web service-based conceptual design, (1) a list of potentially relevant PubMed IDs (PMIDs) is secured via a search of PubMed, typically for a target chemical. (2) The list is processed asynchronously by batch-oriented processes. Rather than performing NER using locally installed NER tools, (3) HTTP calls containing text passages are made to remote Web services; the results of NER are used as a key component in document ranking algorithms. (4) PMIDs are then assigned a document relevancy score (DRS) by the document ranking algorithms.
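The workflow in Figure 1 centers on step (3): an HTTP call carrying a text passage to a remote NER Web service. A minimal sketch of how such a call might be assembled with the Python standard library is shown below; the endpoint URL, function name and pipeline wiring are hypothetical illustrations, not taken from the paper.

```python
# Hypothetical sketch of step (3): wrapping a BioC XML payload in an HTTP
# POST destined for a remote NER Web service. The URL is a placeholder.
import urllib.request

NER_SERVICE_URL = "http://example.org/ner/chemical"  # hypothetical endpoint

def build_ner_request(bioc_xml: str) -> urllib.request.Request:
    """Wrap a BioC XML document in an HTTP POST for a remote NER service."""
    return urllib.request.Request(
        NER_SERVICE_URL,
        data=bioc_xml.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )

# In a batch pipeline, each PMID's title/abstract would be serialized to
# BioC XML, the request sent with urllib.request.urlopen(req), and the
# returned BioC annotations fed into the document-ranking step (4).
req = build_ner_request("<collection>...</collection>")
```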
Training and test data sets
| | Training data set | Test data set |
|---|---|---|
| Total scientific articles | 1112 | 510 |
| Distinct genes/proteins | 3511 | 1122 |
| Chemicals/drugs | 3144 | 1192 |
| Diseases | 1965 | 943 |
| Action terms | 2521 | 966 |
| Interactions | 9877 | 3953 |
An overview is provided of the training and test data sets.
Figure 2. BioC-based high-level inter-process communications. A sample request in BioC format is sent by Web service from the text-mining (TM) pipeline to the NER tool (green arrow). The PubMed ID, title, abstract and designated key file describing the semantics of the data are included within the XML request (left, green box). A chemical-specific response is returned from the NER tool to the TM pipeline (blue arrow). The NER Web service reads the BioC XML and attempts to identify chemicals in the title and abstract. Here, two chemical entities (fenfluramine and dexfenfluramine) are identified as BioC annotation objects for the NER chemical category in the response (right, blue boxes).
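A request like the one in Figure 2 can be approximated with a few lines of standard-library code. This is a sketch under assumptions: the tag names follow the general BioC shape (collection/document/passage with infons and offsets), but the exact infon keys and key-file name CTD's pipeline expects are guesses for illustration.

```python
# Build a minimal BioC-style request document (PMID, title, abstract).
import xml.etree.ElementTree as ET

def make_bioc_request(pmid: str, title: str, abstract: str) -> str:
    collection = ET.Element("collection")
    ET.SubElement(collection, "source").text = "PubMed"
    ET.SubElement(collection, "key").text = "bioc.key"  # assumed key-file name
    doc = ET.SubElement(collection, "document")
    ET.SubElement(doc, "id").text = pmid
    offset = 0
    for ptype, text in (("title", title), ("abstract", abstract)):
        passage = ET.SubElement(doc, "passage")
        infon = ET.SubElement(passage, "infon", key="type")
        infon.text = ptype
        ET.SubElement(passage, "offset").text = str(offset)
        ET.SubElement(passage, "text").text = text
        offset += len(text) + 1
    return ET.tostring(collection, encoding="unicode")

xml_request = make_bioc_request("24919658", "Example title", "Example abstract.")
```

The NER service's response would carry the same document back, with annotation elements added under each passage for every recognized entity.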
Figure 3. BioCreative IV Track 3 NER Testing Facility. Participants were provided with the BioCreative IV Track 3 NER Testing Facility developed by CTD. This testing facility provided a front-end to a CTD Web service that on execution called the participant's Web service using BioC XML associated with a specified PubMed ID for inter-process communications (top left screenshot). CTD's Web service would in turn receive text-mined annotations from the participant's Web service (using BioC XML). CTD's Web service then processed the annotations and computed the results against the curated data set, providing the user with recall, precision, response time and a detailed list of curated terms, text-mined terms and text-mined term hits (bottom right screenshot).
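The metrics the testing facility reports relate in the standard way: with "hits" defined as the overlap between curated and text-mined terms, recall = hits/curated, precision = hits/mined, and balanced F-score = 2PR/(P+R). A small self-contained sketch (the term-matching details and the example term sets are invented for illustration):

```python
# Per-article recall, precision and balanced F-score from a curated term
# set and a text-mined term set, assuming exact set-membership matching.
def ner_metrics(curated: set, mined: set):
    hits = curated & mined  # text-mined term hits
    recall = len(hits) / len(curated) if curated else 0.0
    precision = len(hits) / len(mined) if mined else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return recall, precision, f_score

r, p, f = ner_metrics({"fenfluramine", "dexfenfluramine", "serotonin"},
                      {"fenfluramine", "dexfenfluramine", "dopamine"})
# here hits = 2, so recall = 2/3, precision = 2/3, F = 2/3
```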
Participating teams
| Institution/department | NER tool summary | Primary contact <email address> | Web service URL(s) |
|---|---|---|---|
| National ICT Australia/Victoria Research Laboratory (Melbourne, Australia) | Gene, chemical, disease and action term: dictionary-based lookup using ConceptMapper | Andrew MacKinlay <Andrew.MacKinlay@nicta.com.au> | To be determined |
| Wuhan University (Wuhan, Hubei, China) | Gene, chemical and disease: a proprietary dictionary-based NER tagger; action term: LIBSVM | Cheng Sun <whucsnlp@gmail.com> | |
| University of Applied Sciences of Western Switzerland, Geneva/BiTeM Group, Information Science Department (Geneva, Switzerland); University and University Hospitals of Geneva/Division of Medical Information Sciences (Geneva, Switzerland); SIB Swiss Institute of Bioinformatics/SIBtex (Geneva, Switzerland) | Gene: NormaGene; diseases and chemicals: proprietary taggers; action terms: GOCat | Dina Vishnyakova <dina.vishnyakova@unige.ch> | |
| University of Zurich/Institute of Computational Linguistics (Zurich, Switzerland) | Gene, chemical and disease: OntoGene; action terms: a document classifier-based entity tagger | Fabio Rinaldi <fabio.rinaldi@uzh.ch> | |
| Academia Sinica/Institute of Information Science (Taipei, Taiwan); Yuan Ze University/Department of Computer Science & Engineering (Taoyuan, Taiwan); Taipei Medical University/Graduate Institute of BioMedical Informatics (Taipei, Taiwan); National Tsing-Hua University/Department of Computer Science (HsinChu, Taiwan); National Central University/Department of Computer Science and Information Engineering (Zhongli City, Taiwan) | Gene: BioC-GN, a proprietary machine learning- and dictionary-based NER module | Hong-Jie Dai <hjdai@tmu.edu.tw> | |
| National Cheng Kung University/Department of Computer Science and Information Engineering (Tainan, Taiwan) | Gene, chemical, disease and action term: adapted version of CoINNER, a proprietary dictionary/conditional random fields-based NER tagger | Hung-Yu Kao <hykao@mail.ncku.edu.tw> | |
| National Cheng Kung University/Department of Computer Science and Information Engineering (Tainan, Taiwan) | Gene, chemical, disease and action term: adapted version of GCDA, a proprietary dictionary- and search engine-based NER tool that integrates OSCAR4 | Jiun-Huang Ju <jujh@iir.csie.ncku.edu.tw>; Jung-Hsien Chiang <jchiang@mail.ncku.edu.tw> | |
| Mayo Clinic/Department of Health Sciences Research (Rochester, MN, USA) | Gene, chemical and disease: a proprietary dictionary-based and conditional random fields-based NER tagger, integrating BioTagger-GM | Ravikumar Komandur Elayavilli <KomandurElayavilli.Ravikumar@mayo.edu> | |
| OntoChem GmbH (Halle/Saale, Germany) | Gene, chemical, disease and action term: adapted version of OCMiner | Matthias Irmer <matthias.irmer@ontochem.com> | |
| University of Manchester/National Centre for Text Mining (Manchester, UK) | Gene, chemical and disease: a conditional random fields model built with NERSuite; action term: Support Vector Machine-based model with dictionary- and co-occurrence-based features | Rafal Rak <rafal.rak@manchester.ac.uk> | |
| RelAgent Technologies Pvt. Ltd. (Adyar, Chennai, India) | Gene, chemical, disease and action term: adapted version of Cocoa, a proprietary dictionary/rule-based entity tagger | S V Ramanan <ramanan@relagent.com> | |
| Anna University/AU-KBC Research Centre (Chrompet, Chennai, India) | Gene, chemical, disease and action term: a proprietary CRF++-based NER tagger | Sindhuja Gopalan <sindhujagopalan@au-kbc.org>; Sobha Lalitha Devi <sobha@au-kbc.org> | |
Twelve teams participated in BioCreative IV, Track 3, submitting a combined total of 44 Web services for testing. Although all of the Web services were fully operational, five of the services could not process the entire Track 3 test data set. The participating institutions are listed above, along with brief descriptions of each tool's basic design, points of contact and the associated Web service URLs. Note that in most cases the four NER categories (i.e. gene, chem, disease and action term) are integral to the URL naming nomenclature. Also, be aware that the majority of the URLs provided are machine-to-machine URLs and have no associated graphical user interfaces.
Figure 4. Gene/protein named-entity recognition. Gene recall (blue), precision (red) and balanced F-score (green) results are shown for each participating group (anonymously identified by group number on x-axis). Average scores for each metric (dotted lines) are also provided.
Figure 5. Balanced F-scores by group. Balanced F-score results for each NER category, as well as a combined average, are provided for each participating group (anonymously identified by group number on x-axis). Average scores for each metric (dotted lines) are also provided.
Figure 6. Response times. Response time results for each NER category, as well as a combined average, are provided for each participating group (anonymously identified by group number on x-axis). Note: the response time in seconds (y-axis) uses a logarithmic scale.
Figure 7. Chemical/drug named-entity recognition. Chemical recall (blue), precision (red) and balanced F-score (green) results are shown for each participating group (anonymously identified by group number on x-axis). Average scores for each metric (dotted lines) are also provided.
Figure 8. Disease named-entity recognition. Disease recall (blue), precision (red) and balanced F-score (green) results are shown for each participating group (anonymously identified by group number on x-axis). Average scores for each metric (dotted lines) are also provided.
Figure 9. Action term named-entity recognition. Chemical/gene action term recall (blue), precision (red) and balanced F-score (green) results are shown for each participating group (anonymously identified by group number on x-axis). Average scores for each metric (dotted lines) are also provided.
Figure 10. Recall and precision. Combined average recall (x-axis) and precision (y-axis) results are shown for each participating group (color-coded by group number) within major NER category. For some groups there appeared to be a clear trade-off between recall and precision (e.g. 203), whereas for other groups trade-offs were less apparent (e.g. 184 and 199).
Figure 11. Balanced F-score and response time. Combined average balanced F-score (x-axis) and response time (y-axis) results are shown for each participating group (color-coded by group number) within major NER category. There was no clear relationship between response time and F-score. Note: the response time in seconds (y-axis) uses a logarithmic scale.