T Ono (1), H Hishigaki, A Tanigami, T Takagi
(1) Otsuka GEN Research Institute, Otsuka Pharmaceutical Co. Ltd, 463-10 Kagasuno, Kawauchi-cho, Tokushima, 771-0192, Japan. ono@otsuka.gr.jp
Abstract
MOTIVATION: To understand biological processes, we must clarify how proteins interact with each other. However, since information about protein-protein interactions still exists primarily in the scientific literature, it is not accessible in a computer-readable format. Efficient processing of large numbers of interactions therefore requires an intelligent information extraction method. Our aim is to develop an efficient method for extracting information on protein-protein interactions from the scientific literature.
RESULTS: We present a method for extracting information on protein-protein interactions from the scientific literature. This method, which employs only a protein name dictionary, surface clues on word patterns and simple part-of-speech rules, achieved high recall and precision rates for yeast (recall = 86.8% and precision = 94.3%) and Escherichia coli (recall = 82.5% and precision = 93.5%). The extraction results suggest that our method should be applicable to any species for which a protein name dictionary has been constructed.
AVAILABILITY: The program is available on request from the authors.
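As a rough illustration of the kind of approach the abstract describes (a protein name dictionary combined with surface word patterns), the following minimal Python sketch may help. It is not the authors' program: the dictionary entries, the interaction phrases, and the function names `build_pattern`, `extract_interactions` and `precision_recall` are all hypothetical, and the part-of-speech rules mentioned in the abstract are omitted entirely. It also shows how recall and precision figures like those reported are conventionally computed against a hand-annotated gold set.

```python
import re

# Hypothetical toy dictionary; the paper's actual dictionary is not reproduced here.
PROTEIN_NAMES = {"Cdc28", "Cln2", "Sic1", "RecA", "LexA"}

# Hypothetical interaction phrases standing in for the "surface clues on word patterns".
INTERACTION_WORDS = r"(?:interacts? with|binds? to|associates? with)"


def build_pattern(names):
    """Compile a regex matching '<protein> <interaction phrase> <protein>'."""
    # Longer names first so that a short name never shadows a longer one.
    alt = "|".join(sorted(map(re.escape, names), key=len, reverse=True))
    return re.compile(rf"\b({alt})\b\s+{INTERACTION_WORDS}\s+\b({alt})\b")


def extract_interactions(sentence, names=PROTEIN_NAMES):
    """Return the set of unordered protein pairs found in one sentence."""
    pattern = build_pattern(names)
    return {tuple(sorted(m.groups())) for m in pattern.finditer(sentence)}


def precision_recall(predicted, gold):
    """Precision = TP / predicted pairs; recall = TP / gold-standard pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    found = extract_interactions("Cdc28 interacts with Cln2 in vivo.")
    found |= extract_interactions("RecA binds to LexA.")
    gold = {("Cdc28", "Cln2"), ("Cdc28", "Sic1")}  # made-up gold annotations
    print(found, precision_recall(found, gold))
```

A real system would need to handle name synonyms, negation ("does not interact with") and more varied sentence structure, which is presumably where the part-of-speech rules come in.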