BioRAT: extracting biological information from full-length papers.

David P A Corney, Bernard F Buxton, William B Langdon, David T Jones.

Abstract

MOTIVATION: Converting the vast quantity of free-format text found in journals into a concise, structured format makes the researcher's quest for information easier. Recently, several information extraction systems have been developed that attempt to simplify the retrieval and analysis of biological and medical data. Most of this work has used the abstract alone, owing to the convenience of access and the quality of data. Abstracts are generally available through central collections with easy direct access (e.g. PubMed). The full-text papers contain more information, but are distributed across many locations (e.g. publishers' web sites, journal web sites and local repositories), making access more difficult. In this paper, we present BioRAT, a new information extraction (IE) tool, specifically designed to perform biomedical IE, and which is able to locate and analyse both abstracts and full-length papers. BioRAT is a Biological Research Assistant for Text mining, and incorporates a document search ability with domain-specific IE.
RESULTS: We show first, that BioRAT performs as well as existing systems, when applied to abstracts; and second, that significantly more information is available to BioRAT through the full-length papers than via the abstracts alone. Typically, less than half of the available information is extracted from the abstract, with the majority coming from the body of each paper. Overall, BioRAT recalled 20.31% of the target facts from the abstracts with 55.07% precision, and achieved 43.6% recall with 51.25% precision on full-length papers.
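The precision and recall figures above can be combined into a single score to compare the two settings. The sketch below computes the F1 score (the harmonic mean of precision and recall) from the reported percentages. Note that F1 is not reported in the abstract; it is derived here purely as an illustration, and the names `f1_score`, `reported` and `f1_by_source` are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, converted from percentages to fractions.
reported = {
    "abstracts": {"precision": 0.5507, "recall": 0.2031},
    "full-length papers": {"precision": 0.5125, "recall": 0.4360},
}

# Derived values (an assumption of this sketch, not a result from the paper).
f1_by_source = {
    name: f1_score(m["precision"], m["recall"]) for name, m in reported.items()
}

for name, f1 in f1_by_source.items():
    print(f"{name}: F1 = {f1:.3f}")
```

Under these assumptions the derived F1 is about 0.297 for abstracts and about 0.471 for full-length papers: recall roughly doubles (20.31% to 43.6%) while precision drops only slightly (55.07% to 51.25%).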

Year:  2004        PMID: 15231534     DOI: 10.1093/bioinformatics/bth386

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937

