Extraction of semantic biomedical relations from text using conditional random fields.

Markus Bundschus, Mathaeus Dejori, Martin Stetter, Volker Tresp, Hans-Peter Kriegel.

Abstract

BACKGROUND: The increasing amount of published literature in biomedicine represents an immense source of knowledge, which can be accessed efficiently only by a new generation of automated information extraction tools. Named entity recognition of well-defined objects, such as genes or proteins, has achieved a sufficient level of maturity that it can form the basis for the next step: the extraction of relations that exist between the recognized entities. Whereas most early work focused on the mere detection of relations, the classification of the type of relation is also of great importance, and this is the focus of this work. In this paper we describe an approach that extracts both the existence of a relation and its type. Our work is based on Conditional Random Fields, which have been applied with much success to the task of named entity recognition.
RESULTS: We benchmark our approach on two different tasks. The first task is the identification of semantic relations between diseases and treatments. The available data set consists of manually annotated PubMed abstracts. The second task is the identification of relations between genes and diseases from a set of concise phrases, so-called GeneRIF (Gene Reference Into Function) phrases. In our experimental setting, we do not assume that the entities are given, as is often the case in previous relation extraction work. Rather, the extraction of the entities is solved as a subproblem. Compared with other state-of-the-art approaches, we achieve very competitive results on both data sets. To demonstrate the scalability of our solution, we apply our approach to the complete human GeneRIF database. The resulting gene-disease network contains 34758 semantic associations between 4939 genes and 1745 diseases. The gene-disease network is publicly available as a machine-readable RDF graph.
CONCLUSION: We extend the framework of Conditional Random Fields towards the annotation of semantic relations from text and apply it to the biomedical domain. Our approach is based on a rich set of textual features and achieves a performance that is competitive with leading approaches. The model is quite general and can be extended to handle arbitrary biological entities and relation types. The resulting gene-disease network shows that the GeneRIF database provides a rich knowledge source for text mining. Current work is focused on improving the accuracy of detection of entities as well as entity boundaries, which will also greatly improve the relation extraction performance.
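
To illustrate how relation typing can be cast as sequence labeling with a linear-chain model, the following is a minimal sketch of Viterbi decoding over token-level labels. The label set, the hand-set emission and transition scores, and the example phrase are all hypothetical and chosen only for illustration; they are not the labels, features, or learned parameters used by the authors, and no training step is shown.

```python
from typing import Dict, List, Tuple

# Hypothetical label set (an assumption, not the paper's encoding):
# "O" for ordinary tokens, "GENE" for gene mentions, and a disease label
# that also carries an illustrative relation type.
LABELS: List[str] = ["O", "GENE", "DISEASE_ALTERED_EXPRESSION"]

# Hand-set transition scores standing in for learned weights.
# Any label pair not listed scores 0.0.
TRANSITION: Dict[Tuple[str, str], float] = {("O", "O"): 0.1}


def emission_score(token: str, label: str) -> float:
    """Toy emission score; a trained model would use rich textual features."""
    gene_like = token.isupper() and len(token) > 2
    disease_like = token.lower() in {"cancer", "carcinoma"}
    if label == "GENE":
        return 2.0 if gene_like else -1.0
    if label == "DISEASE_ALTERED_EXPRESSION":
        return 2.0 if disease_like else -1.0
    return 0.5  # label "O"


def viterbi(tokens: List[str]) -> List[str]:
    """Return the highest-scoring label sequence for the given tokens."""
    if not tokens:
        return []
    # best[i][y] is the best score of any label path ending in y at position i.
    best = [{y: emission_score(tokens[0], y) for y in LABELS}]
    back: List[Dict[str, str]] = []
    for token in tokens[1:]:
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in LABELS:
            prev = max(
                LABELS,
                key=lambda p: best[-1][p] + TRANSITION.get((p, y), 0.0),
            )
            scores[y] = (
                best[-1][prev]
                + TRANSITION.get((prev, y), 0.0)
                + emission_score(token, y)
            )
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Made-up phrase with a placeholder gene name, for illustration only.
print(viterbi("GENEX is overexpressed in some carcinoma".split()))
```

In this sketch the relation type is read directly off the label assigned to the disease token, so entity detection and relation typing are decoded in a single pass rather than assuming the entities are given.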

Year: 2008    PMID: 18433469    PMCID: PMC2386138    DOI: 10.1186/1471-2105-9-207

Source DB: PubMed    Journal: BMC Bioinformatics    ISSN: 1471-2105    Impact factor: 3.169


