Florian Leitner, Martin Krallinger, Carlos Rodriguez-Penagos, Jörg Hakenberg, Conrad Plake, Cheng-Ju Kuo, Chun-Nan Hsu, Richard Tzong-Han Tsai, Hsi-Chuan Hung, William W Lau, Calvin A Johnson, Rune Saetre, Kazuhiro Yoshida, Yan Hua Chen, Sun Kim, Soo-Yong Shin, Byoung-Tak Zhang, William A Baumgartner, Lawrence Hunter, Barry Haddow, Michael Matthews, Xinglong Wang, Patrick Ruch, Frédéric Ehrler, Arzucan Ozgür, Güneş Erkan, Dragomir R Radev, Michael Krauthammer, ThaiBinh Luong, Robert Hoffmann, Chris Sander, Alfonso Valencia.
Abstract
We introduce the first meta-service for information extraction in molecular biology, the BioCreative MetaServer (BCMS; http://bcms.bioinfo.cnio.es/). This prototype platform is a joint effort of 13 research groups and provides automatically generated annotations for PubMed/Medline abstracts. Annotation types cover gene names, gene IDs, species, and protein-protein interactions. The meta-server distributes the annotations in both human- and machine-readable formats (HTML/XML). The service is intended for biomedical researchers, database annotators, and biomedical language processing. The platform allows direct comparison, unified access, and result aggregation of the annotations.
Year: 2008 PMID: 18834497 PMCID: PMC2559990 DOI: 10.1186/gb-2008-9-s2-s6
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Annotation servers
| Team/Group | GM | GN | TX | PPI | Conf | State | Web Page |
| Hakenberg | + | + | + | + | True | Dynamic | |
| Kuo | + | False | Dynamic | ||||
| Tsai | + | + | True | Dynamic | |||
| Lau | + | True | Dynamic | ||||
| Sætre | + | + | | + | True | Static | |
| Kim | + | True | Dynamic | ||||
| Baumgartner | + | + | | + | False | Dynamic | |
| Haddow | + | + | + | + | True | Static | |
| Ruch | + | + | True | Dynamic | |||
| Özgür | + | True | Static | ||||
| Luong | + | True | Dynamic | ||||
| Hoffmann | a | + | + | + | True | Dynamic | |
| Totals | 6 | 8 | 3 | 9 | 10 | 12 teams |
List of annotation servers used by the meta-server, plus a Boolean flag indicating whether the classifiers report a confidence score (Conf) and the system state (State): dynamic = online system, already capable of delivering content for any PubMed abstract; static = offline system, server in development. The Web Page column provides a link to an online site for a team's annotation system. a) iHOP (Hoffmann) delivers GMs, but because of data compatibility issues this feature will be added in later versions of the meta-server. GM, gene/protein mention detection; GN, gene/protein normalization; PPI, protein-protein interaction classification; TX, taxon classification.
Figure 1. BioCreative MetaServer (BCMS) annotation view screenshot. This screenshot of the annotation view of the meta-server shows the main annotations for the given Medline abstract (PMID 16458891). Central view: gene mentions (GMs) are marked in the text on a color gradient, shown below the text, that ranges from gray (a single annotation server [AS] detects the mention) to yellow (all five ASs that analyzed the text detect the highlighted snippet as a GM). The list of servers providing annotations for this abstract appears at the bottom (only four of the thirteen are visible). Left column: all raw annotation results can be viewed here. GM results are expanded and sorted first by the number of servers predicting the mention and then by its median confidence. On the bottom left, a quick bar summarizes protein-protein interaction (PPI) results. The bar is split in two: the left and right bar lengths indicate the number of servers classifying the abstract as negative or positive, respectively, with regard to mentioning PPIs. The color of each bar indicates the mean confidence of all classifications of that type: the negative (left) bar ranges from blue (low) to yellow (high confidence), and the positive (right) bar from yellow (low) to blue (high confidence). The bar also provides some interactivity: shortened names (indicated by an ellipsis at the end) are shown in full on mouse-over, mousing over a gene mention highlights its position in the text, and individual gene normalization (GN) results can be clicked to see the exact database identifier, name, organism, and a link to the DB record. Right column: clicking an italic mention in the central view shows all possible mappings of GMs to GNs: the GM in bold, followed by the list of GNs (together with the species) and their official names (here for the text span "interferon-inducible p200 family").
This simple mapping is based on case-insensitive substring matching between GMs and the GN names and synonyms extracted from the DB records.
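The GM-to-GN mapping described in the caption can be illustrated with a minimal Python sketch. It is not the BCMS implementation: the `GeneRecord` class, the `map_mention` function, and the toy records are hypothetical, and the caption does not say in which direction the substring test runs, so the sketch assumes a match whenever either lowercased string contains the other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneRecord:
    """Hypothetical stand-in for one gene normalization (GN) database record."""
    identifier: str
    species: str
    official_name: str
    synonyms: tuple = ()


def map_mention(mention, records):
    """Return the records whose official name or synonyms match the mention.

    Assumption: a match means that, after lowercasing, either string
    contains the other. The caption does not state the direction.
    """
    needle = mention.strip().lower()
    if not needle:
        return []
    hits = []
    for record in records:
        names = (record.official_name, *record.synonyms)
        for name in names:
            candidate = name.strip().lower()
            if candidate and (needle in candidate or candidate in needle):
                hits.append(record)
                break  # one matching name is enough for this record
    return hits


# Invented toy records; identifiers and names are placeholders, not real DB entries.
RECORDS = [
    GeneRecord("X:1", "Homo sapiens", "ABC1", ("alpha beta complex 1",)),
    GeneRecord("X:2", "Mus musculus", "Abc1", ("ABC-1 protein",)),
    GeneRecord("X:3", "Homo sapiens", "XYZ9"),
]

for record in map_mention("abc1 family", RECORDS):
    print(record.identifier, record.species, record.official_name)
```

Because the test is a plain substring check, a family-level mention can map to several records across species, which is consistent with the caption showing a list of candidate GNs rather than a single identifier.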
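The ranking of GM results and the PPI quick bar described in the caption can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the meta-server's code: the function names, tuple layouts, and sample values are hypothetical; both sort keys are assumed to be descending; and servers that report no confidence (Conf = False in the table) are assumed to count as votes while being left out of the median and mean.

```python
from collections import defaultdict
from statistics import mean, median


def rank_gene_mentions(annotations):
    """Rank mentions by number of servers, then by median confidence.

    annotations: iterable of (server, mention, confidence_or_None) tuples.
    Returns a list of (mention, n_servers, median_confidence) tuples.
    """
    by_mention = defaultdict(dict)
    for server, mention, confidence in annotations:
        by_mention[mention][server] = confidence  # one vote per server
    ranked = []
    for mention, scores in by_mention.items():
        known = [c for c in scores.values() if c is not None]
        # Assumption: a mention with no reported confidence gets 0.0.
        ranked.append((mention, len(scores), median(known) if known else 0.0))
    # Assumption: both keys sort in descending order; the name breaks ties.
    ranked.sort(key=lambda row: (-row[1], -row[2], row[0]))
    return ranked


def summarize_ppi(votes):
    """Summarize PPI classifications for the two halves of the quick bar.

    votes: iterable of (server, is_positive, confidence_or_None) tuples.
    """
    groups = {True: [], False: []}
    for _server, is_positive, confidence in votes:
        groups[bool(is_positive)].append(confidence)

    def side(confidences):
        known = [c for c in confidences if c is not None]
        return {
            "servers": len(confidences),  # would set the bar length
            "mean_confidence": mean(known) if known else None,  # would set the color
        }

    return {"negative": side(groups[False]), "positive": side(groups[True])}


# Invented sample data; server and gene names are placeholders.
GM_ANNOTATIONS = [
    ("server1", "geneA", 0.9), ("server2", "geneA", 0.7), ("server3", "geneA", 0.8),
    ("server1", "geneB", 0.6), ("server2", "geneB", None),
    ("server3", "geneC", 0.95),
]
PPI_VOTES = [("server1", True, 0.8), ("server2", True, 0.6), ("server3", False, 0.5)]

for row in rank_gene_mentions(GM_ANNOTATIONS):
    print(row)
print(summarize_ppi(PPI_VOTES))
```

A median rather than a mean for GM confidence follows the caption's wording; it makes the ranking less sensitive to a single server that reports an unusually high or low score.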