Ryosuke Kojima, Shoichi Ishida, Masateru Ohta, Hiroaki Iwata, Teruki Honma, Yasushi Okuno.
Abstract
Deep learning is emerging as an important technology for a range of cheminformatics tasks. In particular, graph convolutional neural networks (GCNs) have been reported to perform well in many types of molecular prediction tasks. Although GCNs show considerable potential in various applications, using them to obtain reasonable and reliable predictions requires a thorough understanding of both GCNs and programming. To make the power of GCNs available to users ranging from chemists to cheminformaticians, an open-source GCN tool, kGCN, is introduced. To support users with different levels of programming skill, kGCN includes three interfaces: a graphical user interface (GUI) built on KNIME for users with limited programming skills, such as chemists, and command-line and Python library interfaces for users with advanced programming skills, such as cheminformaticians. To support the three steps required for building a prediction model, i.e., pre-processing, model tuning, and interpretation of results, kGCN provides typical pre-processing functions, Bayesian optimization for automatic model tuning, and visualization of atomic contributions to the prediction for interpreting results. kGCN supports three types of approaches: single-task, multi-task, and multi-modal prediction. As a representative case study, kGCN is used to predict compound-protein interactions for four matrix metalloproteases, MMP-3, -9, -12, and -13, in inhibition assays. The visualization of atomic contributions is useful for validating prediction models and for designing molecules based on them, realizing "explainable AI" for understanding the factors that affect AI predictions. kGCN is available at https://github.com/clinfo.
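To make the idea of a graph convolution over a compound concrete, the following is a minimal sketch in plain Python. It is not kGCN's implementation or API: the function names, the row-normalized adjacency, the one-hot atom features, and the 2x2 weight matrix are all assumptions chosen for illustration. Each atom's feature vector is averaged with those of its bonded neighbors, transformed by a weight matrix, and passed through a ReLU; the atom vectors are then summed into a single molecule-level vector.

```python
# Minimal sketch of one graph-convolution step on a molecular graph.
# Illustrative only: this is NOT kGCN's implementation or API.

def normalized_adjacency(adj):
    """Add self-loops, then row-normalize so each atom averages over itself and its neighbors."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    return [[value / sum(row) for value in row] for row in with_loops]

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def graph_conv(adj, features, weights):
    """One layer: ReLU(A_norm @ X @ W)."""
    hidden = matmul(matmul(normalized_adjacency(adj), features), weights)
    return [[max(0.0, value) for value in row] for row in hidden]

def readout(atom_features):
    """Sum the atom vectors into one molecule-level vector."""
    return [sum(column) for column in zip(*atom_features)]

# Heavy atoms of ethanol (C-C-O) as a 3-node graph.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
features = [[1.0, 0.0],   # carbon, one-hot as [is_carbon, is_oxygen]
            [1.0, 0.0],   # carbon
            [0.0, 1.0]]   # oxygen
weights = [[1.0, -1.0],   # hypothetical weights; a real model learns these
           [0.5, 2.0]]

atom_hidden = graph_conv(adj, features, weights)
molecule_vector = readout(atom_hidden)
print(atom_hidden)
print(molecule_vector)
```

In a trained model the molecule-level vector would feed a final prediction layer. A multi-task variant would attach one output per task to the same vector, and a multi-modal variant would combine it with features from a second input such as a protein sequence.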
Keywords: Graph convolutional network; Graph neural network; KNIME; Open source software; kGCN
Year: 2020; PMID: 33430993; PMCID: PMC7216578; DOI: 10.1186/s13321-020-00435-6
Source DB: PubMed; Journal: J Cheminform; ISSN: 1758-2946; Impact factor: 5.514
Fig. 1 Architecture of kGCN
Fig. 2 Graph convolutional network for a prediction task with a compound input
Fig. 3 Multi-task graph convolutional network with a compound input
Fig. 4 Multi-modal graph convolutional network with compound and sequence inputs
Fig. 5 Single-task workflow for the hold-out procedure using the KNIME interface (upper). Multi-task workflow for the hold-out procedure (lower)
Fig. 6 Multi-modal workflow for the hold-out procedure
Number of compounds in our dataset
| Assay type | Number of compounds |
|---|---|
| MMP-3 | 2095 |
| MMP-9 | 2829 |
| MMP-12 | 533 |
| MMP-13 | 2607 |
Fig. 7 AUCs obtained from five-fold cross-validation
Fig. 8 (a) Chemical structure. (b) Atomic contributions to the predicted MMP-9 activity. Red represents a positive contribution to the prediction (MMP-9 active in this case). Blue represents a negative contribution (not active)
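The five-fold cross-validation AUCs referred to in Fig. 7 can be illustrated with a short sketch. This is a generic illustration in plain Python, not the evaluation code used with kGCN: `roc_auc`, `k_fold_indices`, `cross_validated_auc`, and the `score_fn` callback (a stand-in for "train on the training folds, then score the held-out fold") are hypothetical names, and the labels are toy data rather than the MMP dataset.

```python
import random

def roc_auc(labels, scores):
    """AUC as the fraction of (active, inactive) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one active and one inactive compound")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def k_fold_indices(n_samples, k=5, seed=None):
    """Split sample indices into k nearly equal folds; shuffle first if a seed is given."""
    indices = list(range(n_samples))
    if seed is not None:
        random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_auc(labels, score_fn, k=5, seed=None):
    """Return one AUC per fold.

    score_fn(train_idx, test_idx) is a hypothetical callback that should fit a
    model on train_idx and return one score per entry of test_idx.
    """
    aucs = []
    for test_idx in k_fold_indices(len(labels), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        scores = score_fn(train_idx, test_idx)
        aucs.append(roc_auc([labels[i] for i in test_idx], scores))
    return aucs

# Toy data: 20 compounds with alternating active (1) / inactive (0) labels.
toy_labels = [1, 0] * 10

def oracle_scores(train_idx, test_idx):
    # Stand-in scorer that simply returns the true label as the score.
    return [float(toy_labels[i]) for i in test_idx]

fold_aucs = cross_validated_auc(toy_labels, oracle_scores, k=5)
print(fold_aucs)
```

With real data, a fold that happens to contain only actives or only inactives has no defined AUC, which is why `roc_auc` raises an error in that case; a stratified split is a common way to avoid it.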