Maha A Thafar1,2, Rawan S Olayan1,3, Haitham Ashoor1,3, Somayah Albaradei1,4, Vladimir B Bajic1, Xin Gao1, Takashi Gojobori1,5, Magbubah Essack6.
Abstract
In silico prediction of drug-target interactions is a critical phase in the sustainable drug development process, especially when the research focus is to capitalize on the repositioning of existing drugs. Developing such computational methods is difficult but much needed, as current methods that predict potential drug-target interactions suffer from high false-positive rates. Here we introduce DTiGEMS+, a computational method that predicts Drug-Target interactions using Graph Embedding, graph Mining, and Similarity-based techniques. DTiGEMS+ combines similarity-based and feature-based approaches, and models the identification of novel drug-target interactions as a link prediction problem in a heterogeneous network. DTiGEMS+ constructs the heterogeneous network by augmenting the known drug-target interaction graph with two complementary graphs: a drug-drug similarity graph and a target-target similarity graph. To produce the final drug-target prediction, DTiGEMS+ combines graph embedding, graph mining, and machine learning. It integrates multiple drug-drug and target-target similarities into the final heterogeneous graph after applying a similarity selection procedure and a similarity fusion algorithm. Using four benchmark datasets, we show that DTiGEMS+ substantially improves prediction performance compared to other state-of-the-art in silico methods developed to predict drug-target interactions, achieving the highest average AUPR across all datasets (0.92), which reduces the error rate by 33.3% relative to the second-best performing model in the state-of-the-art methods comparison.
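The heterogeneous network described in the abstract can be pictured as one adjacency matrix with a drug-drug block, a target-target block, and the known interactions linking the two. The following is a minimal sketch under that assumption only; the function name, the toy matrices, and the choice to zero the diagonal are illustrative and are not taken from the DTiGEMS+ implementation.

```python
def build_heterogeneous_adjacency(drug_sim, target_sim, interactions):
    """Combine three graphs into one weighted adjacency matrix.

    Node order: all drugs first, then all targets.
    drug_sim and target_sim are square similarity matrices (lists of lists);
    interactions[i][j] is 1 if drug i is known to interact with target j.
    """
    n_drugs, n_targets = len(drug_sim), len(target_sim)
    size = n_drugs + n_targets
    adj = [[0.0] * size for _ in range(size)]

    # Drug-drug similarity block (self-loops left at zero: an assumption).
    for i in range(n_drugs):
        for j in range(n_drugs):
            if i != j:
                adj[i][j] = drug_sim[i][j]

    # Target-target similarity block.
    for i in range(n_targets):
        for j in range(n_targets):
            if i != j:
                adj[n_drugs + i][n_drugs + j] = target_sim[i][j]

    # Known drug-target interactions, stored symmetrically.
    for i in range(n_drugs):
        for j in range(n_targets):
            weight = float(interactions[i][j])
            adj[i][n_drugs + j] = weight
            adj[n_drugs + j][i] = weight
    return adj


# Toy data (made-up numbers): 2 drugs, 2 targets, one known interaction.
toy_drug_sim = [[1.0, 0.7], [0.7, 1.0]]
toy_target_sim = [[1.0, 0.4], [0.4, 1.0]]
toy_interactions = [[1, 0], [0, 0]]
toy_adj = build_heterogeneous_adjacency(toy_drug_sim, toy_target_sim, toy_interactions)
```

In this picture, predicting a new interaction amounts to scoring a currently empty drug-target cell, which is the link prediction framing the abstract refers to.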
Keywords: Bioinformatics; Cheminformatics; Drug repositioning; Drug–target interaction; Graph embedding; Heterogeneous network; Machine learning; Similarity integration; Similarity-based
Year: 2020 PMID: 33431036 PMCID: PMC7325230 DOI: 10.1186/s13321-020-00447-2
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
Benchmark Yamanishi_08 datasets statistics
| Statistics | NR | GPCR | IC | Enzyme |
|---|---|---|---|---|
| No. of drugs | 54 | 223 | 210 | 445 |
| No. of targets | 26 | 95 | 204 | 664 |
| Known DTIs | 90 | 635 | 1476 | 2926 |
| Unknown DTIs | 1314 | 20,550 | 41,364 | 292,554 |
| Sparsity ratio | 0.068 | 0.031 | 0.036 | 0.010 |
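The last two rows of the table can be reproduced from the first three. The sketch below assumes that the unknown DTIs are all drug-target pairs minus the known ones, and that the sparsity ratio is known divided by unknown; both definitions are inferred from the numbers in the table rather than stated in the record.

```python
# Dataset sizes copied from the table: (drugs, targets, known DTIs).
DATASETS = {
    "NR": (54, 26, 90),
    "GPCR": (223, 95, 635),
    "IC": (210, 204, 1476),
    "Enzyme": (445, 664, 2926),
}


def dataset_stats(n_drugs, n_targets, n_known):
    """Return (unknown pairs, sparsity ratio) for one dataset.

    Assumption: unknown = all possible pairs - known pairs,
    and sparsity = known / unknown.
    """
    n_unknown = n_drugs * n_targets - n_known
    return n_unknown, n_known / n_unknown


for name, sizes in DATASETS.items():
    unknown, sparsity = dataset_stats(*sizes)
    print(f"{name}: unknown={unknown}, sparsity={sparsity:.3f}")
```

Under these assumptions the computed values agree with every column of the table, which shows how heavily the unknown pairs outnumber the known interactions in all four datasets.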
Fig. 1 DTIs prediction problem depiction
Fig. 2 Integrating multiple similarities using different functions
Fig. 4 DTiGEMS+ prediction framework. DTIs: drug–target interactions; DD: drug–drug; TT: target–target; FV: feature vector; FSS alg.: forward similarity selection algorithm; SNF fuc: similarity network fusion function; COS similarity: cosine similarity; ML: machine learning
The equations used to determine path structure features
| Score description | Equation |
|---|---|
| The meta-path score is the product of all the edge weight scores from the start drug node to the ending target node in each path structure | |
| The sum of all meta-path scores for each path structure (Sum feature) | |
| The max path score is the highest meta-path score under each path structure (Max feature) | |
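Written as formulas, the three score descriptions correspond to the expressions below. The notation is an assumption introduced here for illustration: $p$ is one meta-path, $w_e$ is the weight of edge $e$ on that path, and $R_h(d, t)$ is the set of all meta-paths of structure $h$ that start at drug $d$ and end at target $t$.

```latex
\begin{aligned}
\mathrm{score}(p) &= \prod_{e \in p} w_e \\
\mathrm{Sum}_h(d, t) &= \sum_{p \in R_h(d, t)} \mathrm{score}(p) \\
\mathrm{Max}_h(d, t) &= \max_{p \in R_h(d, t)} \mathrm{score}(p)
\end{aligned}
```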
Fig. 3 An illustration of Sum and Max scores for a D–D–D–T path structure
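As a small worked example of the Sum and Max features for the D–D–D–T structure, the sketch below enumerates every drug–drug–drug–target path between one drug and one target in a toy graph. The function names, the toy weights, and the rule that a path may not revisit a drug are assumptions made for illustration, not details taken from the DTiGEMS+ code.

```python
from math import prod


def similarity(drug_sim, a, b):
    """Look up a symmetric drug-drug similarity stored under either key order."""
    return drug_sim.get((a, b), drug_sim.get((b, a)))


def ddd_t_path_scores(start, target, drugs, drug_sim, interactions):
    """Return the meta-path score of every D-D-D-T path from start to target."""
    scores = []
    for mid in drugs:
        for last in drugs:
            if len({start, mid, last}) < 3:
                continue  # assumption: a path does not revisit a drug
            weights = [
                similarity(drug_sim, start, mid),
                similarity(drug_sim, mid, last),
                interactions.get((last, target)),
            ]
            if None in weights:
                continue  # one of the three edges is absent
            # Meta-path score: product of all edge weights on the path.
            scores.append(prod(weights))
    return scores


# Toy graph (made-up numbers): three drugs, one target.
drugs = ["d1", "d2", "d3"]
drug_sim = {("d1", "d2"): 0.8, ("d2", "d3"): 0.5, ("d1", "d3"): 0.6}
interactions = {("d2", "t1"): 1.0, ("d3", "t1"): 1.0}

scores = ddd_t_path_scores("d1", "t1", drugs, drug_sim, interactions)
sum_feature = sum(scores)  # Sum feature for this path structure
max_feature = max(scores)  # Max feature for this path structure
print(sorted(scores), sum_feature, max_feature)
```

In the toy graph there are two such paths, d1-d2-d3-t1 and d1-d3-d2-t1, so the Sum feature adds their two products and the Max feature keeps the larger one.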
Average scores for the AUPR, AUC, and ranking position for all comparison methods across all benchmark datasets
| Methods | BLM-NII | KronRLS | RLS-WNN | NRLMF | DNILMF | DDR | TriModel | DTiGEMS+ |
|---|---|---|---|---|---|---|---|---|
| Average AUPR | 0.68 | 0.73 | 0.76 | 0.80 | 0.78 | 0.87 | | 0.92 |
| Average AUC | 0.92 | 0.90 | 0.96 | 0.95 | 0.95 | 0.96 | | |
| Average of the ranking position across all datasets | 8 | 7 | 6 | 4 | 5 | 3 | | |
All results are rounded to two decimal places. Italic font with underline indicates the best result in each category, and italic font alone indicates the second-best result
Fig. 5 Comparison results for DTiGEMS+ and other methods in terms of AUPR using the Yamanishi_08 datasets. The best performing method is indicated in blue, the second-best method in purple, and all other methods in green
Relative error rates associated with DTiGEMS+ and the second-best performing model TriModel
| Datasets | ER1 of DTiGEMS+ (%) | ER2 of TriModel (%) | Relative ER reduction (%) |
|---|---|---|---|
| | 12.00 | 16.00 | 25.00 |
| | 14.00 | 20.00 | 30.00 |
| | 4.00 | 7.00 | 42.86 |
| | 3.00 | 5.00 | 40.00 |
| The average of ΔER across all datasets | | | 34.47 |
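The relative reduction column is consistent with the formula (ER2 − ER1) / ER2 × 100, which the sketch below applies to each row. The formula is inferred from the tabulated values rather than quoted from the paper, and the function and variable names are illustrative.

```python
# (ER1 of DTiGEMS+, ER2 of TriModel) in percent, in the table's row order.
ROWS = [(12.0, 16.0), (14.0, 20.0), (4.0, 7.0), (3.0, 5.0)]


def relative_er_reduction(er_new, er_baseline):
    """Percentage by which er_new is lower than er_baseline."""
    return (er_baseline - er_new) / er_baseline * 100.0


reductions = [relative_er_reduction(er1, er2) for er1, er2 in ROWS]
average = sum(reductions) / len(reductions)
print([round(r, 2) for r in reductions], average)
```

Each per-row value matches the table after rounding to two decimals. Averaging the unrounded reductions gives roughly 34.46, so the tabulated 34.47 appears to be the mean of the already-rounded row values.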
Validation of the top-10 ranked newly predicted DTIs for each dataset
| Data sets | # | KEGG: Drug ID | Drug name | KEGG: Target ID | Target name | Validation evidence |
|---|---|---|---|---|---|---|
| NR | 1 | D01132 | Tazarotene | hsa6097 | RORC (RAR Related Orphan Receptor C) | Unknown |
| | 2 | D00182 | Norethindrone | hsa2099 | ESR1 (Estrogen Receptor Alpha) | PMID: 27245768 T3DB: T3D4745 |
| | 3 | D00075 | Testosterone | hsa5241 | PGR (Progesterone Receptor) | PMID: 23229004 PMID: 23933754 C: CHEMBL386630 |
| | 4 | D01132 | Tazarotene | hsa190 | NR0B1 (Nuclear Receptor Subfamily 0 Group B Member 1) | Unknown |
| | 5 | D00094 | Tretinoin | hsa3174 | HNF4G (Hepatocyte Nuclear Factor 4 Gamma) | Unknown |
| | 6 | D00554 | Ethinyl estradiol | hsa2100 | ESR2 (Estrogen Receptor 2) | CTD: D004997 |
| | 7 | D00327 | Fluoxymesterone | hsa5241 | PGR (Progesterone Receptor) | Unknown |
| | 8 | D01294 | Ethynodiol diacetate | hsa2100 | ESR2 (Estrogen Receptor 2) | Unknown |
| | 9 | D00299 | Dihydrotachysterol | hsa190 | NR0B1 (Nuclear Receptor Subfamily 0 Group B Member 1) | Unknown |
| | 10 | D00094 | Tretinoin | hsa6095 | RORA (RAR Related Orphan Receptor A) | C: CHEMBL38 |
| GPCR | 1 | D00283 | Clozapine | hsa1814 | DRD3 (Dopamine Receptor D3) | C: CHEMBL42 M: Clozapine DB: DB00363 |
| | 2 | D02358 | Metoprolol | hsa154 | ADRB2 (Adrenoceptor Beta 2) | DB: DB00264 |
| | 3 | D00437 | Nifedipine | hsa152 | ADRA2C (Adrenergic Receptor alpha-2C) | C: CHEMBL193 |
| | 4 | D00604 | Clonidine hydrochloride | hsa147 | ADRA1B (Adrenergic Receptor alpha-1B) | DB: DB00575 |
| | 5 | D00255 | Carvedilol | hsa152 | ADRA2C (Adrenergic Receptor alpha-2C) | DB: DB01136 |
| | 6 | D00451 | Sumatriptan | hsa3363 | HTR7 (5-Hydroxytryptamine Receptor 7) | Unknown |
| | 7 | D00397 | Tropicamide | hsa1133 | CHRM5 (Cholinergic Receptor Muscarinic 5) | KG: D00397 |
| | 8 | D00270 | Chlorpromazine | hsa152 | ADRA2C (Adrenoceptor Alpha 2C) | KG: D00270 |
| | 9 | D02250 | Octreotide acetate | hsa6751 | SSTR1 (Somatostatin Receptor 1) | CTD: D015282 |
| | 10 | D01103 | Trospium chloride | hsa1129 | CHRM2 (Cholinergic Receptor Muscarinic 2) | KG: D01103 |
| IC | 1 | D00649 | Amiloride hydrochloride | hsa8911 | CACNA1I (Calcium Voltage-Gated Channel Subunit Alpha1 I) | M: Amiloride (direct) |
| | 2 | D03365 | Nicotine | hsa1137 | CHRNA4 (Cholinergic Receptor Nicotinic Alpha 4 Subunit) | PMID: 17590520 KG: D03365 DB: DB00184 |
| | 3 | D00775 | Riluzole | hsa2898 | GRIK2 (Glutamate Ionotropic Receptor Kainate Type Subunit 2) | KG: D00775 |
| | 4 | D00438 | Nimodipine | hsa779 | CACNA1S (Calcium Voltage-Gated Channel Subunit Alpha1S) | KG: D00438 DB: DB00393 |
| | 5 | D00726 | Metoclopramide | hsa1138 | CHRNA5 (Cholinergic Receptor Nicotinic Alpha 5 Subunit) | Unknown |
| | 6 | D00552 | Benzocaine | hsa6331 | SCN5A (Sodium Voltage-Gated Channel Alpha Subunit 5) | KG: D00552 |
| | 7 | D00542 | Halothane | hsa3736 | KCNA1 (Potassium Voltage-Gated Channel Subfamily A Member 1) | Unknown |
| | 8 | D02098 | Proparacaine hydrochloride | hsa8645 | KCNK5 (Potassium Two Pore Domain Channel Subfamily K Member 5) | Unknown |
| | 9 | D01599 | Gliclazide | hsa3758 | KCNJ1 (Potassium Inwardly Rectifying Channel Subfamily J Member 1) | Unknown |
| | 10 | D00538 | Zonisamide | hsa6331 | SCN5A (Sodium Voltage-Gated Channel Alpha Subunit 5) | DB: DB00909 KG: D00538 |
| E | 1 | D00437 | Nifedipine | hsa1559 | CYP2C9 (Cytochrome P450 Family 2 Subfamily C Member 9) | CTD: D009543 PMID: 9929518 |
| | 2 | D00574 | Aminoglutethimide | hsa1589 | CYP21A2 (Cytochrome P450 Family 21 Subfamily A Member 2) | M: Aminoglutethimide PMID: 8201961 |
| | 3 | D00410 | Metyrapone | hsa1583 | CYP11A1 (Cytochrome P450 Family 11 Subfamily A Member 1) | CTD: D008797 |
| | 4 | D00437 | Nifedipine | hsa1585 | CYP11B2 (Cytochrome P450 Family 11 Subfamily B Member 2) | M: Nifedipine CTD: D009543 |
| | 5 | D00410 | Metyrapone | hsa1543 | CYP1A1 (Cytochrome P450 Family 1 Subfamily A Member 1) | PMID: 9512490 |
| | 6 | D03670 | Deferoxamine | hsa51302 | CYP39A1 (Cytochrome P450 Family 39 Subfamily A Member 1) | Unknown |
| | 7 | D00043 | Isoflurophate | hsa1991 | ELANE (Elastase, Neutrophil Expressed) | M: Diisopropylfluorophosphate (indirect) |
| | 8 | D00947 | Linezolid | hsa4129 | MAOB (Monoamine Oxidase B) | CTD: D000069349 |
| | 9 | D03670 | Deferoxamine | hsa4353 | MPO (Myeloperoxidase) | M: Desferrioxamine (indirect) |
| | 10 | D05458 | Phentermine | hsa4128 | MAOA (Monoamine Oxidase A) | KG: D05458 DB: DB00191 |
C: ChEMBL; CTD: comparative toxicogenomics database; DB: DrugBank; M: MATADOR; KG: KEGG; PMID: PubMed; T3DB: toxin and toxin–target database