Kalpana Raja, Jeyakumar Natarajan, Finn Kuusisto, John Steill, Ian Ross, James Thomson, Ron Stewart.
Abstract
Proteins perform their functions by interacting with other proteins. Knowledge of protein-protein interactions (PPIs) is critical for understanding the functions of individual proteins, the mechanisms of biological processes, and disease mechanisms. High-throughput experiments have reported a huge number of PPIs in PubMed articles, and extracting them at this scale is feasible only through automated approaches. The standard text-mining protocol includes four major tasks: recognizing protein mentions, normalizing protein names and aliases to unique identifiers such as gene symbols, extracting PPIs, and visualizing the PPI network using Cytoscape or other visualization tools. Each task is challenging and has been revised over several years to improve performance. We present a protocol based on our hybrid approaches and show that each task can be offered as an independent web-based tool: NAGGNER for protein name recognition, ProNormz for protein name normalization, PPInterFinder for PPI extraction, and HPIminer for PPI network visualization. The protocol is specific to human but can be generalized to other organisms. We also include KinderMiner, our most recent text-mining tool, which predicts PPIs by retrieving significantly co-occurring protein pairs. The algorithm is simple, easy to implement, and generalizable to other biological challenges.
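The co-occurrence idea can be illustrated with a minimal sketch. The abstract does not say which statistic is used, so this example assumes a one-sided Fisher's exact test on article counts. The function names `fisher_right_tail` and `significant_pairs`, the `alpha` threshold, and all counts are hypothetical and chosen only for illustration; this is not the tool's actual implementation.

```python
from math import comb


def fisher_right_tail(both, only_a, only_b, neither):
    """One-sided Fisher's exact test p-value for a 2x2 article-count table.

    both    : articles mentioning protein A and protein B
    only_a  : articles mentioning A but not B
    only_b  : articles mentioning B but not A
    neither : articles mentioning neither protein
    """
    row_a = both + only_a                      # all articles mentioning A
    col_b = both + only_b                      # all articles mentioning B
    total = both + only_a + only_b + neither   # corpus size
    # Sum hypergeometric terms for overlaps at least as large as observed.
    tail = sum(
        comb(row_a, k) * comb(total - row_a, col_b - k)
        for k in range(both, min(row_a, col_b) + 1)
    )
    return tail / comb(total, col_b)


def significant_pairs(pair_counts, total_articles, alpha=0.05):
    """Return (protein_a, protein_b, p_value) for pairs with p < alpha.

    pair_counts maps (protein_a, protein_b) to a tuple
    (articles_with_a, articles_with_b, articles_with_both).
    The alpha cutoff is an assumed, illustrative threshold.
    """
    hits = []
    for (a, b), (n_a, n_b, n_both) in pair_counts.items():
        p = fisher_right_tail(
            n_both,
            n_a - n_both,
            n_b - n_both,
            total_articles - n_a - n_b + n_both,
        )
        if p < alpha:
            hits.append((a, b, p))
    return sorted(hits, key=lambda hit: hit[2])


# Invented toy counts, not real PubMed statistics.
toy_counts = {
    ("TP53", "MDM2"): (50, 40, 30),   # overlap far above chance
    ("TP53", "ALB"): (50, 100, 5),    # overlap close to chance
}
toy_hits = significant_pairs(toy_counts, total_articles=1000)
```

With these toy counts only the first pair passes the threshold, because its overlap is far larger than independent mentions would produce. A real corpus would also need protein name recognition and normalization before counting, as the four-task protocol above describes.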
Keywords: HPIminer; Information extraction; KinderMiner; NAGGNER; Network visualization; PPInterFinder; ProNormz; Protein–protein interaction
Year: 2020 PMID: 31583627 DOI: 10.1007/978-1-4939-9873-9_2
Source DB: PubMed Journal: Methods Mol Biol ISSN: 1064-3745