Comparative experiments on learning information extractors for proteins and their interactions.

Razvan Bunescu, Ruifang Ge, Rohit J Kate, Edward M Marcotte, Raymond J Mooney, Arun K Ramani, Yuk Wah Wong.

Abstract

OBJECTIVE: Automatically extracting information from biomedical text holds the promise of easily consolidating large amounts of biological knowledge in computer-accessible form. This strategy is particularly attractive for extracting data relevant to genes of the human genome from the 11 million abstracts in Medline. However, extraction efforts have been frustrated by the lack of conventions for describing human genes and proteins. We have developed and evaluated a variety of learned information extraction systems for identifying human protein names in Medline abstracts and subsequently extracting information on interactions between the proteins.
METHODS AND MATERIAL: We used a variety of machine learning methods to automatically develop information extraction systems for extracting information on gene/protein name, function and interactions from Medline abstracts. We present cross-validated results on identifying human proteins and their interactions by training and testing on a set of approximately 1000 manually-annotated Medline abstracts that discuss human genes/proteins.
RESULTS: We demonstrate that machine learning approaches using support vector machines and maximum entropy are able to identify human proteins with higher accuracy than several previous approaches. We also demonstrate that various rule induction methods are able to identify protein interactions with higher precision than manually-developed rules.
CONCLUSION: Our results show that it is promising to use machine learning to automatically build systems for extracting information from biomedical text. The results also give a broad picture of the relative strengths of a wide variety of methods when tested on a reasonably large human-annotated corpus.
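
The cross-validated evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the number of folds or the matching criterion, so the fold count, the exact-match span scoring, and the names `k_fold_indices` and `precision_recall_f1` below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Set, Tuple

# Hypothetical representation: a protein mention is a (start, end) character span.
Span = Tuple[int, int]


def k_fold_indices(n_items: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index pairs over range(n_items).

    Items are assigned to folds in round-robin order, so every item
    appears in exactly one test fold.
    """
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f_idx, fold in enumerate(folds) if f_idx != i for j in fold]
        splits.append((train, test))
    return splits


def precision_recall_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Score predicted spans against gold spans using exact matching (an assumption)."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy usage: 10 abstracts split into 5 folds, plus one scored prediction set.
splits = k_fold_indices(10, 5)
gold_spans = {(0, 4), (10, 15), (20, 25)}
predicted_spans = {(0, 4), (10, 15), (30, 35), (40, 44)}
precision, recall, f1 = precision_recall_f1(gold_spans, predicted_spans)
```

In a real run, an extractor would be trained on each `train` split and scored on the matching `test` split, with the per-fold scores averaged. Precision is the quantity the abstract highlights for interaction extraction, which is why it is reported separately from recall here.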

Year:  2005        PMID: 15811782     DOI: 10.1016/j.artmed.2004.07.016

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


  83 in total

1.  Bridging semantics and syntax with graph algorithms-state-of-the-art of extracting biomedical relations.

Authors:  Yuan Luo; Özlem Uzuner; Peter Szolovits
Journal:  Brief Bioinform       Date:  2016-02-05       Impact factor: 11.622

2.  High-recall protein entity recognition using a dictionary.

Authors:  Zhenzhen Kou; William W Cohen; Robert F Murphy
Journal:  Bioinformatics       Date:  2005-06       Impact factor: 6.937

3.  BioSimplify: an open source sentence simplification engine to improve recall in automatic biomedical information extraction.

Authors:  Siddhartha Jonnalagadda; Graciela Gonzalez
Journal:  AMIA Annu Symp Proc       Date:  2010-11-13

4.  Generalizing biomedical relation classification with neural adversarial domain adaptation.

Authors:  Anthony Rios; Ramakanth Kavuluru; Zhiyong Lu
Journal:  Bioinformatics       Date:  2018-09-01       Impact factor: 6.937

5.  A comprehensive benchmark of kernel methods to extract protein-protein interactions from literature.

Authors:  Domonkos Tikk; Philippe Thomas; Peter Palaga; Jörg Hakenberg; Ulf Leser
Journal:  PLoS Comput Biol       Date:  2010-07-01       Impact factor: 4.475

6.  Construction of an annotated corpus to support biomedical information extraction.

Authors:  Paul Thompson; Syed A Iqbal; John McNaught; Sophia Ananiadou
Journal:  BMC Bioinformatics       Date:  2009-10-23       Impact factor: 3.169

7.  HypertenGene: extracting key hypertension genes from biomedical literature with position and automatically-generated template features.

Authors:  Richard Tzong-Han Tsai; Po-Ting Lai; Hong-Jie Dai; Chi-Hsin Huang; Yue-Yang Bow; Yen-Ching Chang; Wen-Harn Pan; Wen-Lian Hsu
Journal:  BMC Bioinformatics       Date:  2009-12-03       Impact factor: 3.169

8.  A realistic assessment of methods for extracting gene/protein interactions from free text.

Authors:  Renata Kabiljo; Andrew B Clegg; Adrian J Shepherd
Journal:  BMC Bioinformatics       Date:  2009-07-28       Impact factor: 3.169

9.  Linguistic feature analysis for protein interaction extraction.

Authors:  Timur Fayruzov; Martine De Cock; Chris Cornelis; Veronique Hoste
Journal:  BMC Bioinformatics       Date:  2009-11-12       Impact factor: 3.169

10.  Evaluation of linguistic features useful in extraction of interactions from PubMed; application to annotating known, high-throughput and predicted interactions in I2D.

Authors:  Yun Niu; David Otasek; Igor Jurisica
Journal:  Bioinformatics       Date:  2009-10-22       Impact factor: 6.937
