Khushnood Abbas (1, 2), Alireza Abbasi (3), Shi Dong (4), Ling Niu (4), Laihang Yu (4), Bolun Chen (5), Shi-Min Cai (6), Qambar Hasan (7).
1. School of Computer Science and Technology, Zhoukou Normal University, Zhoukou, 466001, China. abbas@cigit.ac.cn.
2. School of Engineering and Information Technology, University of New South Wales, Canberra, NSW, 2006, Australia. abbas@cigit.ac.cn.
3. School of Engineering and Information Technology, University of New South Wales, Canberra, NSW, 2006, Australia.
4. School of Computer Science and Technology, Zhoukou Normal University, Zhoukou, 466001, China.
5. College of Computer and Software Engineering, Huaiyin Institute of Technology, Huaian, 223003, China.
6. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 610054, China.
7. Centre for Cellular and Molecular Biology, School of Life and Environmental Science, Deakin University, Burwood, VIC, 3125, Australia.
Abstract
BACKGROUND: Technological and research advances have produced large volumes of biomedical data. When represented as a network (graph), these data become useful for modeling entities and interactions in biological and similar complex systems. In network biology and network medicine, there is particular interest in predicting the outcomes of drug-drug, drug-disease, and protein-protein interactions to speed up drug discovery. Existing data and modern computational methods make it possible to identify potentially beneficial and harmful interactions, and therefore to narrow the scope of drug trials ahead of actual clinical trials. Such automated, data-driven investigation relies on machine learning techniques. However, traditional machine learning approaches require extensive preprocessing of the data, which makes them impractical for large datasets. This study presents a wide range of machine learning methods for predicting outcomes from biomedical interactions and compares the performance of traditional methods with that of more recent network-based approaches.
RESULTS: We applied 32 different network-based machine learning models to five commonly available biomedical datasets and evaluated their performance using three evaluation metrics: AUROC, AUPR, and F1-score. To do so, we cast link prediction as a binary classification problem, treating existing links as positive examples and randomly sampling negative examples from the set of non-existent links. In our experiments, Prone, ACT, and [Formula: see text] were the top three performers on all five datasets.
CONCLUSIONS: This work presents a comparative evaluation of well-known network-based machine learning algorithms for predicting network links, with applications in the prediction of drug-target and drug-drug interactions. Our work can help guide researchers in selecting appropriate machine learning methods for pharmaceutical tasks.
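The framing described in the RESULTS can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy graph, the held-out split, the common-neighbors scorer (swapped in for the 32 models), and all function names are assumptions made for illustration. It covers only the sampling of negative pairs from non-existent links and the computation of AUROC and F1 on held-out positives versus sampled negatives. AUPR is omitted for brevity.

```python
import random
from itertools import combinations


def sample_negative_edges(nodes, edges, k, seed=0):
    """Randomly draw k node pairs that are NOT linked in the full graph."""
    existing = {frozenset(e) for e in edges}
    candidates = [p for p in combinations(sorted(nodes), 2)
                  if frozenset(p) not in existing]
    return random.Random(seed).sample(candidates, k)


def build_adjacency(nodes, edges):
    """Undirected adjacency sets built from an edge list."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def common_neighbors_score(adj, u, v):
    """Stand-in link scorer (assumption): number of shared neighbors."""
    return len(adj[u] & adj[v])


def auroc(pos_scores, neg_scores):
    """Chance that a random positive outscores a random negative (ties = 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def f1_score(labels, preds):
    """F1 for binary labels and predictions given as 0/1 lists."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


# Hypothetical toy interaction network (not one of the paper's datasets).
nodes = list(range(6))
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (3, 5)]

# Hold out some existing links as positive test examples; score on the rest.
test_pos = [(1, 2), (3, 5)]
train_edges = [e for e in edges if e not in test_pos]
test_neg = sample_negative_edges(nodes, edges, k=len(test_pos))

adj = build_adjacency(nodes, train_edges)
pos_scores = [common_neighbors_score(adj, u, v) for u, v in test_pos]
neg_scores = [common_neighbors_score(adj, u, v) for u, v in test_neg]

labels = [1] * len(pos_scores) + [0] * len(neg_scores)
# Threshold of one shared neighbor is an arbitrary choice for this sketch.
preds = [int(s >= 1) for s in pos_scores + neg_scores]

print("AUROC:", auroc(pos_scores, neg_scores))
print("F1:", f1_score(labels, preds))
```

Sampling as many negatives as there are positives keeps the two classes balanced, which is one common choice; other sampling ratios would change AUPR and F1 in particular.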
Keywords:
Data-driven drug discovery; Drug-target prediction; Network link prediction; Poly-pharmacy; Poly-pharmacy side effects prediction