Yahui Long (1,2), Min Wu (3), Yong Liu (4), Chee Keong Kwoh (2), Jiawei Luo (1), Xiaoli Li (3).
1. College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410000, China.
2. School of Computer Science and Engineering, Nanyang Technological University, Singapore, 639798, Singapore.
3. Machine Intellection Department, Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), 138632, Singapore.
4. Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly (LILY), Nanyang Technological University, Singapore, 639798, Singapore.
Abstract
MOTIVATION: Human microbes are closely involved in a wide variety of complex human diseases and have become new drug targets. In silico methods for identifying potential microbe-drug associations effectively complement conventional experimental methods: they can help screen candidate compounds for drug development and facilitate knowledge discovery about microbe-drug interaction mechanisms. Moreover, the growing availability of biomedical data on microbes and drugs provides a great opportunity for machine learning approaches to predict microbe-drug associations, which motivates us to integrate these data sources to improve prediction accuracy. A remaining challenge is predicting interactions for new drugs or new microbes, which have no known microbe-drug associations. RESULTS: In this work, we leverage various sources of biomedical information to construct multiple networks (graphs) for microbes and drugs. We then develop EGATMDA, a novel ensemble framework of graph attention networks with a hierarchical attention mechanism, to predict microbe-drug associations from the constructed graphs. Specifically, for each input graph, we design a graph convolutional network with node-level attention to learn embeddings for nodes (i.e. microbes and drugs). To aggregate the node embeddings from multiple input graphs effectively, we apply graph-level attention to learn the importance of each input graph. Experimental results under different cross-validation settings (e.g. the setting for predicting associations for new drugs) showed that our proposed method outperformed seven state-of-the-art methods. Case studies on predicted microbe-drug associations further demonstrated the effectiveness of EGATMDA.
AVAILABILITY: Source code and supplementary materials are available at https://github.com/longyahui/EGATMDA/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
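The hierarchical attention described under RESULTS can be illustrated with a small, dependency-free sketch. This is not the EGATMDA implementation: the scoring functions, the omission of learned weight matrices, and every name below (`node_level_attention`, `graph_level_attention`, `a_src`, `a_dst`, `q`, the toy graphs) are assumptions chosen for illustration. Node-level attention follows the common graph-attention form, and graph-level attention scores each input graph by a mean over its node embeddings; the abstract does not specify either formula.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def node_level_attention(features, neighbors, a_src, a_dst):
    """Aggregate each node's neighbors, weighted by attention coefficients.

    neighbors[i] lists the neighbors of node i and is assumed to include i
    itself (a self-loop), so the list is never empty.
    """
    out = []
    for i, nbrs in enumerate(neighbors):
        # Assumed score: LeakyReLU(a_src . h_i + a_dst . h_j) per neighbor j.
        scores = [leaky_relu(dot(a_src, features[i]) + dot(a_dst, features[j]))
                  for j in nbrs]
        alphas = softmax(scores)
        dim = len(features[i])
        out.append([sum(al * features[j][d] for al, j in zip(alphas, nbrs))
                    for d in range(dim)])
    return out

def graph_level_attention(per_graph_embeddings, q):
    """Fuse node embeddings from several graphs using one weight per graph."""
    # Assumed score per graph: mean over nodes of q . tanh(h).
    scores = [sum(dot(q, [math.tanh(x) for x in h]) for h in emb) / len(emb)
              for emb in per_graph_embeddings]
    betas = softmax(scores)
    n = len(per_graph_embeddings[0])
    dim = len(per_graph_embeddings[0][0])
    fused = [[sum(b * emb[i][d] for b, emb in zip(betas, per_graph_embeddings))
              for d in range(dim)]
             for i in range(n)]
    return fused, betas

# Toy example: three nodes (e.g. microbes and drugs) seen through two graphs.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
graph_a = [[0, 1], [1, 0, 2], [2, 1]]   # adjacency lists with self-loops
graph_b = [[0, 2], [1], [2, 0]]
a_src, a_dst = [0.5, -0.5], [0.3, 0.8]  # hypothetical attention parameters
q = [1.0, 1.0]                          # hypothetical graph-level query vector

emb_a = node_level_attention(features, graph_a, a_src, a_dst)
emb_b = node_level_attention(features, graph_b, a_src, a_dst)
fused, betas = graph_level_attention([emb_a, emb_b], q)
```

Here `betas` holds one weight per input graph and sums to 1, so `fused` is a convex combination of the per-graph embeddings. In a trained model the parameters would be learned rather than fixed by hand, and the fused embeddings would feed an association predictor, which this sketch leaves out.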