Yu Hao¹, Xiaoyan Zhu, Minlie Huang, Ming Li.
¹ State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China.
Abstract
MOTIVATION: An enormous number of protein-protein interaction relationships are buried in millions of research articles published over the years, and the number is growing. Rediscovering them automatically is a challenging bioinformatics task. Solutions to this problem also reach far beyond bioinformatics.

RESULTS: We study a new approach that involves automatically discovering English expression patterns, optimizing them and using them to extract protein-protein interactions. In a sister paper, we described how to generate English expression patterns related to protein-protein interactions, and this approach alone has already achieved precision and recall rates significantly higher than those of other automatic systems. This paper continues to present our theory, focusing on how to improve the patterns. A minimum description length (MDL)-based pattern-optimization algorithm is designed to reduce and merge patterns. This has significantly increased generalization power, and hence the recall and precision rates, as confirmed by our experiments.

AVAILABILITY: http://spies.cs.tsinghua.edu.cn.