Georgios A Pavlopoulos, Evangelos Pafilis, M Kuhn, Sean D Hooper, Reinhard Schneider.
Abstract
OnTheFly is a web-based application that applies biological named entity recognition to enrich Microsoft Office, PDF and plain text documents. The input files are converted into HTML and then sent to the Reflect tagging server, which highlights biological entity names such as genes, proteins and chemicals, and attaches JavaScript code to them that invokes a summary pop-up window. The window provides an overview of relevant information about the entity, such as a protein description, the domain composition, a link to the 3D structure and links to other relevant online resources. OnTheFly can also extract the bioentities mentioned in a set of files and produce a graphical representation of the networks of known and predicted associations among these entities by retrieving the information from the STITCH database.
AVAILABILITY: http://onthefly.embl.de, http://onthefly.embl.de/FAQ.html.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Year: 2009 PMID: 19223449 PMCID: PMC2660876 DOI: 10.1093/bioinformatics/btp081
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. An annotated table (A) from a PDF full-text article (Pitre et al., 2006), the generated pop-up window with information about the protein YGL227W (B) and an automatically generated protein–protein interaction network (C) of associated entities for the proteins shown in (A). For demonstration purposes, we isolated the table from the PDF file and processed it separately. (D) The architecture and functionality. Files are uploaded to the OnTheFly server and converted into HTML. The Reflect server annotates the HTML file and sends the annotated HTML back to the OnTheFly server. A user can drag and drop files into the OnTheFly applet. The ‘GO’ button sends the selected documents to the conversion server, which converts the corresponding file formats into HTML pages that are then sent to the tagging server. A URL pointing to the generated HTML document is returned. The organism selection drop-down list lets users define the species protein dictionary to be used by default. The ‘Network’ and ‘Summary’ options extract the STITCH-derived networks of associations among the recognized entities in the document(s) and produce a summary page listing the recognized entities.
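The tagging step described above (entity names in the converted HTML are highlighted and given JavaScript that opens a pop-up) can be illustrated with a minimal Python sketch. This is not the Reflect implementation, which is not detailed here; it assumes a simple exact-match dictionary lookup. The dictionary contents, the identifiers, the function name `tag_entities` and the JavaScript handler `showPopup` are all hypothetical names chosen for illustration.

```python
import html
import re

# Hypothetical toy dictionary: entity name -> (entity type, identifier).
# The identifiers are placeholders, not real database accessions.
TOY_DICTIONARY = {
    "YGL227W": ("protein", "demo:YGL227W"),
    "aspirin": ("chemical", "demo:aspirin"),
}

# Splits HTML into alternating text and tag pieces; the capturing group
# keeps the tags in the result so the document can be reassembled.
TAG_SPLIT = re.compile(r"(<[^>]+>)")


def build_pattern(dictionary):
    """Compile one regex matching any dictionary name as a whole word."""
    # Longest names first, so a longer name wins over a shorter prefix.
    names = sorted(dictionary, key=len, reverse=True)
    return re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")


def tag_entities(html_text, dictionary):
    """Wrap known entity names in clickable spans.

    Returns the annotated HTML and the list of identifiers found,
    in order of first appearance.
    """
    pattern = build_pattern(dictionary)
    found = []

    def wrap(match):
        name = match.group(1)
        entity_type, entity_id = dictionary[name]
        if entity_id not in found:
            found.append(entity_id)
        # showPopup is an assumed client-side function, not a real API.
        return '<span class="entity {t}" onclick="showPopup(\'{i}\')">{n}</span>'.format(
            t=html.escape(entity_type, quote=True),
            i=html.escape(entity_id, quote=True),
            n=name,
        )

    parts = TAG_SPLIT.split(html_text)
    # Even indices are text between tags; odd indices are the tags
    # themselves, which are left untouched so attributes are never rewritten.
    tagged = [
        part if index % 2 else pattern.sub(wrap, part)
        for index, part in enumerate(parts)
    ]
    return "".join(tagged), found


if __name__ == "__main__":
    page = "<p>YGL227W binds <b>aspirin</b>.</p>"
    annotated, entities = tag_entities(page, TOY_DICTIONARY)
    print(annotated)
    print(entities)
```

The returned identifier list corresponds loosely to the input that a 'Network' or 'Summary' step would need: the set of recognized entities whose associations are then looked up. Matching here is case-sensitive and exact, a simplification compared with a real tagger.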