Donald C Comeau1, Riza Theresa Batista-Navarro2, Hong-Jie Dai2, Rezarta Islamaj Doğan2, Antonio Jimeno Yepes2, Ritu Khare2, Zhiyong Lu2, Hernani Marques2, Carolyn J Mattingly2, Mariana Neves3, Yifan Peng2, Rafal Rak2, Fabio Rinaldi2, Richard Tzong-Han Tsai2, Karin Verspoor3, Thomas C Wiegers2, Cathy H Wu3, W John Wilbur2.
Abstract
BioC is a new, simple XML format for sharing biomedical text and annotations, together with libraries to read and write that format. This promotes the development of interoperable tools for natural language processing (NLP) of biomedical text. The interoperability track at the BioCreative IV workshop featured contributions using or highlighting the BioC format. These contributions included additional implementations of BioC, many new corpora in the format, biomedical NLP tools that consume and produce the format, and online services using the format. The ease of use, broad support and rapidly growing number of tools demonstrate the need for and value of the BioC format. Database URL: http://bioc.sourceforge.net/. Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.
Year: 2014 PMID: 24980129 PMCID: PMC4074764 DOI: 10.1093/database/bau053
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. BioC process sequence. The BioC workflow allows data in the BioC format, from a file or any other stream, to be read into the BioC data classes via the Input Connector, or written into a new stream, via the Output Connector. The Data Processing module stands for any kind of NLP or text mining process that uses these data. Several processing modules may be chained together between input and output.
Figure 2. Simple example of a BioC file.
Figure 3. Key file describing BioC file in Figure 2.
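The workflow described in the Figure 1 caption (input connector, chained processing modules, output connector) can be illustrated with a minimal Python sketch. It does not use the BioC libraries themselves; it relies only on the standard library's `xml.etree.ElementTree`. The sample document, the element names it assumes (`collection`, `document`, `passage`, `offset`, `text`, `infon`), and the function names `read_collection`, `write_collection`, `uppercase_text`, `count_tokens` and `run_pipeline` are all hypothetical choices made for illustration, since the content of Figures 2 and 3 is not reproduced here.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified BioC-style document (assumed structure).
SAMPLE = """<collection>
  <source>example</source>
  <date>20140101</date>
  <key>example.key</key>
  <document>
    <id>1</id>
    <passage>
      <offset>0</offset>
      <text>BioC is simple.</text>
    </passage>
  </document>
</collection>"""


def read_collection(stream):
    """Input connector: parse XML from a file or any other stream."""
    return ET.parse(stream).getroot()


def write_collection(collection, stream):
    """Output connector: serialize the data to a text stream."""
    ET.ElementTree(collection).write(stream, encoding="unicode")


def uppercase_text(collection):
    """Toy processing module: upper-case every passage text."""
    for text in collection.iter("text"):
        text.text = (text.text or "").upper()
    return collection


def count_tokens(collection):
    """Toy processing module: record a whitespace token count per passage."""
    for passage in collection.iter("passage"):
        tokens = (passage.findtext("text") or "").split()
        infon = ET.Element("infon", {"key": "token_count"})
        infon.text = str(len(tokens))
        passage.insert(0, infon)  # place the infon before the other children
    return collection


def run_pipeline(in_stream, out_stream, modules):
    """Chain any number of processing modules between input and output."""
    collection = read_collection(in_stream)
    for module in modules:
        collection = module(collection)
    write_collection(collection, out_stream)


out = io.StringIO()
run_pipeline(io.StringIO(SAMPLE), out, [uppercase_text, count_tokens])
print(out.getvalue())
```

Because each module takes and returns the same in-memory structure, modules can be reordered or swapped without touching the connectors, which is the interoperability property the caption describes.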