Abstract
There is a surge of research interest in protein-protein interaction (PPI) extraction from biomedical literature. While most of the state-of-the-art PPI extraction systems focus on dependency-based structured information, the rich structured information inherent in constituent parse trees has not been extensively explored for PPI extraction. In this paper, we propose a novel approach to tree kernel-based PPI extraction, where the tree representation generated from a constituent syntactic parser is further refined using the shortest dependency path between two proteins derived from a dependency parser. Specifically, all the constituent tree nodes associated with the nodes on the shortest dependency path are kept intact, while other nodes are removed safely to make the constituent tree concise and precise for PPI extraction. Compared with previously used constituent tree setups, our dependency-motivated constituent tree setup achieves the best results across five commonly used PPI corpora. Moreover, our tree kernel-based method outperforms other single kernel-based ones and performs comparably with some multiple kernel ones on the most commonly tested AIMed corpus.
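The pruning idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tree encoding (internal nodes as `(label, [children])`, leaves as `(pos, word, index)`), the function names `shortest_dependency_path` and `prune`, the toy sentence, and its hand-written parses are all assumptions made for illustration. In particular, "associated with the nodes on the shortest dependency path" is interpreted here as "the leaf for each path token plus all of its ancestors", which may be simpler than the rules used in the paper.

```python
from collections import deque

def shortest_dependency_path(edges, start, goal):
    """Breadth-first search over an undirected view of the dependency edges.

    edges: iterable of (head_index, dependent_index) pairs (hypothetical encoding).
    Returns the list of token indices from start to goal, or [] if unreachable.
    """
    graph = {}
    for head, dep in edges:
        graph.setdefault(head, set()).add(dep)
        graph.setdefault(dep, set()).add(head)
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in sorted(graph.get(node, ())):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return []

def prune(tree, keep):
    """Keep leaves whose token index is in `keep`, plus every ancestor of such a leaf.

    Leaves are (pos, word, index); internal nodes are (label, [children]).
    Returns the pruned tree, or None if nothing under this node survives.
    """
    if len(tree) == 3:  # leaf node
        return tree if tree[2] in keep else None
    label, children = tree
    kept = [c for c in (prune(child, keep) for child in children) if c is not None]
    return (label, kept) if kept else None

def words(tree):
    """Collect the surface words of a (possibly pruned) tree, left to right."""
    if len(tree) == 3:
        return [tree[1]]
    return [w for child in tree[1] for w in words(child)]

# Toy sentence (hand-parsed, for illustration only):
# 0 PROT1  1 strongly  2 binds  3 to  4 PROT2  5 in  6 yeast
dep_edges = [(2, 0), (2, 1), (2, 4), (4, 3), (2, 6), (6, 5)]
constituent_tree = (
    "S", [
        ("NP", [("NN", "PROT1", 0)]),
        ("VP", [
            ("ADVP", [("RB", "strongly", 1)]),
            ("VBZ", "binds", 2),
            ("PP", [("IN", "to", 3), ("NP", [("NN", "PROT2", 4)])]),
            ("PP", [("IN", "in", 5), ("NP", [("NN", "yeast", 6)])]),
        ]),
    ],
)

path = shortest_dependency_path(dep_edges, 0, 4)
pruned = prune(constituent_tree, set(path))
print(path)
print(words(pruned))
```

In this toy case the adverb, the preposition, and the trailing prepositional phrase fall off the path between the two protein mentions and are dropped, leaving a smaller tree that a tree kernel would then compare against other instances.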
Year: 2012 PMID: 22388011 DOI: 10.1016/j.jbi.2012.02.004
Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317