Ming Hao, Stephen H Bryant, Yanli Wang.
Abstract
BACKGROUND: Fast and accurate identification of potential drug candidates against therapeutic targets (i.e., drug-target interactions, DTIs) is a fundamental step in early drug discovery. However, experimental determination of DTIs is time-consuming and costly, especially when testing associations across the entire chemical and genomic spaces. Computationally efficient algorithms with accurate predictions are therefore required. In this work, we design a new chemoinformatics approach derived from neighbor-based collaborative filtering (NBCF) to infer potential drug candidates for targets of interest. A fundamental step in applying NBCF to DTI prediction is to accurately measure the similarity between drugs based solely on their known DTI profiles. However, commonly used similarity measures such as COSINE may be noise-prone because the DTI bipartite network is extremely sparse, which degrades the performance of NBCF. We propose three strategies to remedy this problem: (1) adopting a positive pointwise mutual information (PPMI)-based similarity metric, which is noise-immune to some extent; (2) performing low-rank approximation of the original prediction scores; and (3) incorporating auxiliary (complementary) information to produce the final predictions.
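To make the idea of a PPMI-based drug similarity feeding a neighbor-based predictor more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the formulas, so the co-occurrence definition (number of shared targets between two distinct drugs), the PPMI normalization, and the similarity-weighted scoring rule are all assumptions, as are the function names and the toy matrix `A`.

```python
import math

def cooccurrence(A):
    """Count shared targets for every pair of distinct drugs (rows of A)."""
    n = len(A)
    return [[sum(a * b for a, b in zip(A[i], A[j])) if i != j else 0
             for j in range(n)] for i in range(n)]

def ppmi_similarity(A):
    """Drug-drug PPMI from shared-target counts (assumed formulation)."""
    C = cooccurrence(A)
    row = [sum(r) for r in C]          # marginal co-occurrence count per drug
    total = sum(row)                   # total co-occurrence mass
    n = len(C)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if C[i][j] > 0:
                # PMI = log( p(i, j) / (p(i) * p(j)) ); negative values are clipped to 0
                pmi = math.log(C[i][j] * total / (row[i] * row[j]))
                S[i][j] = max(0.0, pmi)
    return S

def neighbor_scores(A, S):
    """Score each drug-target pair as a similarity-weighted average of neighbor profiles."""
    n_drugs, n_targets = len(A), len(A[0])
    scores = []
    for i in range(n_drugs):
        weight = sum(S[i])
        scores.append([
            sum(S[i][j] * A[j][t] for j in range(n_drugs)) / weight if weight > 0 else 0.0
            for t in range(n_targets)
        ])
    return scores

# Hypothetical toy DTI matrix: 4 drugs (rows) x 4 targets (columns)
A = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
S = ppmi_similarity(A)
scores = neighbor_scores(A, S)
```

In this toy example, drugs 0 and 3 share one target, but that overlap is no more than their overall co-occurrence frequencies would predict, so their PPMI is clipped to zero. This clipping of weak, chance-level overlaps is one plausible reading of why such a metric could be less noise-prone than COSINE on a sparse network.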
Year: 2018 PMID: 30311095 PMCID: PMC6755712 DOI: 10.1186/s13321-018-0303-x
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
Benchmark datasets and corresponding properties
| Property | DATASET-H | DATASET-K | DATASET-Y |
|---|---|---|---|
| Number of targets | 733 | 809 | 664 |
| Number of drugs | 829 | 786 | 445 |
| Number of interactions | 3688 | 3681 | 2926 |
| Average interaction number of each drug with targets | ~ 4 | ~ 5 | ~ 7 |
| Average interaction number of each target with drugs | ~ 5 | ~ 5 | ~ 4 |
| Minimum interaction number of each drug with targets | 1 | 1 | 1 |
| Maximum interaction number of each drug with targets | 48 | 48 | 96 |
| Minimum interaction number of each target with drugs | 1 | 1 | 1 |
| Maximum interaction number of each target with drugs | 75 | 55 | 61 |
| Sparsity | 0.006 | 0.006 | 0.010 |
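The derived rows of this table can be reproduced from the three counts. The sketch below assumes that "sparsity" here means the fraction of known interactions among all possible drug-target pairs, which is consistent with the tabulated values; the dictionary layout and function names are illustrative only.

```python
# (targets, drugs, interactions) taken from the table above
DATASETS = {
    "DATASET-H": (733, 829, 3688),
    "DATASET-K": (809, 786, 3681),
    "DATASET-Y": (664, 445, 2926),
}

def sparsity(n_targets, n_drugs, n_interactions):
    """Fraction of drug-target pairs with a known interaction (assumed definition)."""
    return n_interactions / (n_targets * n_drugs)

def mean_interactions_per_drug(n_drugs, n_interactions):
    """Average number of targets each drug interacts with."""
    return n_interactions / n_drugs

def mean_interactions_per_target(n_targets, n_interactions):
    """Average number of drugs each target interacts with."""
    return n_interactions / n_targets

summary = {
    name: {
        "sparsity": round(sparsity(t, d, k), 3),
        "per_drug": round(mean_interactions_per_drug(d, k)),
        "per_target": round(mean_interactions_per_target(t, k)),
    }
    for name, (t, d, k) in DATASETS.items()
}
```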
Fig. 1 Workflow of the proposed NBCF algorithm with strategies designed for improving DTI predictions
Results of MPR for the proposed algorithms based on 5 trials of tenfold cross-validation in the benchmark datasets
| Similarity method | DATASET-H | DATASET-K | DATASET-Y |
|---|---|---|---|
| Strategy 1 | | | |
| PPMI | 0.054 ± 0.010 | 0.049 ± 0.010 | 0.020 ± 0.006 |
| COSINE | 0.081 ± 0.019 | 0.068 ± 0.019 | 0.037 ± 0.013 |
| TANIMOTO | 0.092 ± 0.026 | 0.070 ± 0.017 | 0.035 ± 0.012 |
| Strategy 2 | | | |
| PPMI | 0.061 ± 0.012 | 0.055 ± 0.014 | 0.023 ± 0.008 |
| COSINE | 0.066 ± 0.013 | 0.049 ± 0.010 | 0.029 ± 0.007 |
| TANIMOTO | 0.066 ± 0.013 | 0.052 ± 0.011 | 0.028 ± 0.007 |
| Strategy 3 | | | |
| PPMI | 0.109 ± 0.020 | 0.077 ± 0.014 | 0.023 ± 0.007 |
| COSINE | 0.086 ± 0.013 | 0.051 ± 0.009 | 0.027 ± 0.006 |
| TANIMOTO | 0.083 ± 0.014 | 0.055 ± 0.010 | 0.027 ± 0.004 |
| DT-hybrid | | | |
| – | 0.083 ± 0.023 | 0.063 ± 0.016 | 0.037 ± 0.013 |
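The text shown here does not define MPR. Assuming it stands for mean percentile ranking as commonly used in recommender-system evaluation (lower is better, 0 meaning the held-out item is ranked first), a minimal sketch of the metric could look like the following. The function names, the tie handling, and the toy scores are assumptions, not the authors' evaluation code.

```python
def percentile_rank(scores, held_out):
    """Fraction of the other candidates scored strictly higher than the held-out one."""
    reference = scores[held_out]
    higher = sum(1 for k, s in enumerate(scores) if k != held_out and s > reference)
    return higher / (len(scores) - 1)

def mean_percentile_ranking(cases):
    """Average percentile rank over (candidate_scores, held_out_index) test cases."""
    ranks = [percentile_rank(scores, held_out) for scores, held_out in cases]
    return sum(ranks) / len(ranks)

# Hypothetical predictions: each case lists candidate scores and the index
# of the interaction that was hidden during training.
cases = [
    ([0.9, 0.1, 0.5], 0),       # held-out candidate is ranked first
    ([0.2, 0.8, 0.4, 0.1], 2),  # one of three other candidates scores higher
]
mpr = mean_percentile_ranking(cases)
```

Under this reading, the values in the table would mean that held-out interactions fall, on average, within roughly the top 2 to 11% of ranked candidates, depending on the method and dataset.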
Fig. 2 Boxplots of MPR for the proposed NBCF algorithm for three benchmark datasets. a–c MPR based on Strategy 1; d–f MPR based on Strategy 2; and g–i MPR based on Strategy 3