Evaluation of two dependency parsers on biomedical corpus targeted at protein-protein interactions.

Sampo Pyysalo, Filip Ginter, Tapio Pahikkala, Jorma Boberg, Jouni Järvinen, Tapio Salakoski.

Abstract

We present an evaluation of Link Grammar and Connexor Machinese Syntax, two major broad-coverage dependency parsers, on a custom hand-annotated corpus consisting of sentences regarding protein-protein interactions. In the evaluation, we apply the notion of an interaction subgraph, which is the subgraph of a dependency graph expressing a protein-protein interaction. We measure the performance of the parsers for recovery of individual dependencies, fully correct parses, and interaction subgraphs. For Link Grammar, an open system that can be inspected in detail, we further perform a comprehensive failure analysis, report specific causes of error, and suggest potential modifications to the grammar. We find that both parsers perform worse on biomedical English than previously reported on general English. While Connexor Machinese Syntax significantly outperforms Link Grammar, the failure analysis suggests specific ways in which the latter could be modified for better performance in the domain.
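The three evaluation levels named in the abstract (individual dependencies, fully correct parses, and interaction subgraphs) can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes that dependencies are compared as unlabeled, undirected pairs of token indices, that a parse counts as fully correct when every gold dependency is recovered, and that the gold annotation supplies the edges of the interaction subgraph directly. The names `normalize` and `evaluate` are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple, Union

Edge = Tuple[int, int]


def normalize(edges: Iterable[Edge]) -> Set[Edge]:
    # Assumption: dependencies are unlabeled and undirected, so (1, 0) == (0, 1).
    return {(min(a, b), max(a, b)) for a, b in edges}


def evaluate(
    gold: Iterable[Edge],
    parsed: Iterable[Edge],
    interaction: Iterable[Edge],
) -> Dict[str, Union[float, bool]]:
    """Score one sentence at the three levels described in the abstract."""
    gold_set = normalize(gold)
    parsed_set = normalize(parsed)
    subgraph = normalize(interaction)

    recovered = gold_set & parsed_set
    return {
        # Share of gold dependencies that the parser produced.
        "dependency_recall": len(recovered) / len(gold_set) if gold_set else 1.0,
        # Assumption: "fully correct" means every gold dependency was recovered.
        "fully_correct": gold_set <= parsed_set,
        # The interaction subgraph is recovered only if all of its edges are.
        "subgraph_recovered": subgraph <= parsed_set,
    }


# Toy sentence (invented for illustration): "ProteinA binds ProteinB in vitro"
# Token indices:                                0       1      2      3   4
gold = [(1, 0), (1, 2), (1, 3), (3, 4)]
interaction = [(1, 0), (1, 2)]            # edges linking the two proteins via "binds"
parsed = [(1, 0), (1, 2), (2, 3), (3, 4)]  # parser misattaches "in"

scores = evaluate(gold, parsed, interaction)
print(scores)
```

In this toy case the parser misses one of four gold dependencies, so the parse is not fully correct, yet the interaction subgraph is still recovered because the error lies outside it. This is why the subgraph measure can diverge from the other two.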

Year:  2005        PMID: 16099201     DOI: 10.1016/j.ijmedinf.2005.06.009

Source DB:  PubMed          Journal:  Int J Med Inform        ISSN: 1386-5056            Impact factor:   4.046


  5 in total

1.  A de-identifier for medical discharge summaries.

Authors:  Ozlem Uzuner; Tawanda C Sibanda; Yuan Luo; Peter Szolovits
Journal:  Artif Intell Med       Date:  2007-11-28       Impact factor: 5.326

2.  Benchmarking natural-language parsers for biological applications using dependency graphs.

Authors:  Andrew B Clegg; Adrian J Shepherd
Journal:  BMC Bioinformatics       Date:  2007-01-25       Impact factor: 3.169

3.  Lexical adaptation of link grammar to the biomedical sublanguage: a comparative evaluation of three approaches.

Authors:  Sampo Pyysalo; Tapio Salakoski; Sophie Aubin; Adeline Nazarenko
Journal:  BMC Bioinformatics       Date:  2006-11-24       Impact factor: 3.169

4.  BioInfer: a corpus for information extraction in the biomedical domain.

Authors:  Sampo Pyysalo; Filip Ginter; Juho Heimonen; Jari Björne; Jorma Boberg; Jouni Järvinen; Tapio Salakoski
Journal:  BMC Bioinformatics       Date:  2007-02-09       Impact factor: 3.169

5.  A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools.

Authors:  Karin Verspoor; Kevin Bretonnel Cohen; Arrick Lanfranchi; Colin Warner; Helen L Johnson; Christophe Roeder; Jinho D Choi; Christopher Funk; Yuriy Malenkiy; Miriam Eckert; Nianwen Xue; William A Baumgartner; Michael Bada; Martha Palmer; Lawrence E Hunter
Journal:  BMC Bioinformatics       Date:  2012-08-17       Impact factor: 3.169
