Shingo Tsuji, Takeshi Hase, Ayako Yachie-Kinoshita, Taiko Nishino, Samik Ghosh, Masataka Kikuchi, Kazuro Shimokawa, Hiroyuki Aburatani, Hiroaki Kitano, Hiroshi Tanaka.
Abstract
BACKGROUND: Identifying novel therapeutic targets is crucial for the successful development of drugs. However, identifying therapeutic targets experimentally is very costly, and only approximately 400 genes are targets of FDA-approved drugs. It is therefore essential to develop powerful computational tools that can identify potential novel therapeutic targets. The human protein-protein interaction network (PIN) could be a useful resource for this purpose.
Keywords: Deep learning; Drug discovery; Machine learning; Network embedding; Protein interaction network; Systems biology
Year: 2021 PMID: 33941241 PMCID: PMC8091739 DOI: 10.1186/s13195-021-00826-3
Source DB: PubMed Journal: Alzheimers Res Ther Impact factor: 6.982
Fig. 1 Computational analysis pipeline for drug target prioritization. (Step 1) Our computational framework uses genome-wide PINs and drug target information obtained from public-domain databases. (Step 2) A deep autoencoder extracts low-dimensional latent features from the high-dimensional PIN. (Step 3) Using the features from step 2 and a list of target genes for a specific disease, we generated 100 datasets and, with state-of-the-art machine learning techniques (SMOTE and XGBoost), built 100 classifier models, one per dataset, to infer potential drug targets. (Step 4) We applied the classifier models to all genes in the PIN that are not known drug targets to prioritize potential drug target genes
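Step 3 names SMOTE, an oversampling technique for balancing a small positive class against a much larger negative class. The block below is a minimal standard-library sketch of the core SMOTE idea only (interpolating between a minority sample and one of its nearest minority neighbours); it is not the authors' implementation. The function name `smote_oversample`, the parameter defaults, and the toy 2-D features are assumptions made for illustration.

```python
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D latent features for three known target genes.
known_targets = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_oversample(known_targets, n_new=4, k=2)
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the known targets, which is the property that distinguishes this approach from simply duplicating samples.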
Fig. 2 Relationship between the low-dimensional latent features produced by the deep autoencoder and representative network metrics in the PIN. The X-axis is the latent space dimension and the Y-axis is Spearman’s correlation coefficient between a given low-dimensional feature and a given network metric (see Supplementary Figure 1 for the original data). Dimensions with a gray background (58, 86, 88, and 89) show almost no correlation with the representative network metrics. Several dimensions without a box (e.g., dimensions 6 and 7) are n.a. because the encoded values are zero for all genes
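The quantity plotted in Fig. 2 can be illustrated with a small standard-library sketch of Spearman's correlation, computed as the Pearson correlation of rank vectors. The helper names and the toy values are assumptions; returning `None` for a constant input is one way to mirror the "n.a." entries the caption describes for dimensions whose encoded values are all zero.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, or None when either input is constant (undefined)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return None
    return cov / (vx * vy) ** 0.5


# Hypothetical values: one latent dimension and node degree for four genes.
latent_dim = [0.10, 0.40, 0.35, 0.80]
degree = [3, 10, 8, 50]
rho = spearman(latent_dim, degree)          # identical rank order
all_zero = spearman([0, 0, 0, 0], degree)   # constant feature, undefined
```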
Top 20 genes with the highest mean probability value for the “positive (drug target)” class
| Gene | Mean probability |
|---|---|
| DLG4 | 0.99859 |
| PLCG1 | 0.99775 |
| EGFR | 0.99758 |
| SYK | 0.99752 |
| PTK2B | 0.99617 |
| RAC1 | 0.99585 |
| CAV1 | 0.99579 |
| DLG1 | 0.99512 |
| PIK3R1 | 0.99500 |
| PRKCA | 0.99292 |
| KIT | 0.99224 |
| JAK1 | 0.99154 |
| PTPN6 | 0.98968 |
| CRKL | 0.98918 |
| SHC1 | 0.98840 |
| NCK1 | 0.98760 |
| ZAP70 | 0.98750 |
| PTPN11 | 0.98630 |
| DLG3 | 0.98551 |
| PTK2 | 0.98537 |
| DLG2 | 0.98471 |
| IL2RB | 0.98328 |
| JAK2 | 0.98299 |
| GRB2 | 0.98278 |
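The "mean probability value" in the table above is presumably the positive-class probability averaged over the 100 classifier models of Fig. 1. A minimal sketch of that averaging and ranking step follows; the function name, the gene labels, and the three-model toy input are all hypothetical.

```python
from statistics import mean


def rank_by_mean_probability(per_model_probs, top_n=None):
    """Rank genes by their mean positive-class probability.

    per_model_probs maps gene -> list of probabilities, one per model.
    """
    averaged = {gene: mean(probs) for gene, probs in per_model_probs.items()}
    ranked = sorted(averaged.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n] if top_n is not None else ranked


# Hypothetical outputs from three models (the pipeline describes 100).
toy_probs = {
    "GENE_A": [0.90, 0.80, 1.00],
    "GENE_B": [0.20, 0.40, 0.30],
    "GENE_C": [0.95, 0.99, 0.97],
}
ranking = rank_by_mean_probability(toy_probs, top_n=2)
```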
Fig. 3 Pathway enrichment analysis using the GO biological database for the 187 putative targets from our computational pipeline for Alzheimer’s disease. The names of the pathways are shown on the vertical axis, and the bars on the horizontal axis represent the −log10(p value) of the corresponding pathway. Dashed lines in orange, magenta, and red indicate p value < 0.05, 0.01, and 0.001, respectively
Fig. 4 Pathway enrichment analysis using the KEGG database for the 187 putative targets. The legend for this figure is the same as that for Fig. 3
Fig. 5 Pathway enrichment analysis using the Reactome pathway database for the 187 putative targets. The legend for this figure is the same as that for Fig. 3
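The −log10(p value) bars and threshold lines described for Figs. 3 to 5 can be illustrated as follows. The captions do not say which enrichment test was used; the sketch assumes a one-sided hypergeometric (over-representation) test, which is a common choice. All counts except the 187 putative targets are hypothetical, as is the function name.

```python
from math import comb, log10


def hypergeom_enrichment_p(population, pathway_size, selected, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    `selected` genes are drawn without replacement from `population` genes,
    of which `pathway_size` belong to the pathway of interest.
    """
    upper = min(pathway_size, selected)
    total = comb(population, selected)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical background, pathway size, and overlap; 187 is from the captions.
p = hypergeom_enrichment_p(population=20000, pathway_size=100,
                           selected=187, overlap=8)
bar_length = -log10(p)

# Positions of the dashed threshold lines on the -log10 axis.
thresholds = {alpha: -log10(alpha) for alpha in (0.05, 0.01, 0.001)}
```

On this scale a longer bar means a smaller p value, so a pathway whose bar extends past the line for 0.001 (at 3 on the axis) is significant at all three thresholds.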
Fig. 6 A method to infer potential repositionable drugs based on the putative targets derived from our computational pipeline. Step 1: We obtained the drug-target-disease network from the DrugBank database. Step 2: We mapped the associations between the putative target genes and their target diseases to infer the potential repositionable drugs for a given disease
Top 20 candidate repositioning drugs for Alzheimer’s disease
| Drug | Overlap between known targets and predicted targets | Number of overlapping targets |
|---|---|---|
| Regorafenib | RET; FLT1; KDR; KIT; PDGFRA; PDGFRB; FGFR1; TEK; NTRK1; EPHA2; ABL1 | 11 |
| Tamoxifen | ESR1; ESR2; PRKCA; PRKCB; PRKCD; PRKCE; PRKCG; PRKCQ; PRKCZ; ESRRG | 10 |
| Ponatinib | ABL1; KIT; RET; TEK; FGFR1; LCK; SRC; LYN; KDR; PDGFRA | 10 |
| Dasatinib | ABL1; SRC; FYN; LCK; KIT; PDGFRB; EPHA2; BTK; FGR; LYN | 10 |
| Imatinib | PDGFRB; ABL1; KIT; RET; NTRK1; CSF1R; PDGFRA | 7 |
| Brigatinib | EGFR; ABL1; IGF1R; INSR; MET; ERBB2 | 6 |
| Sorafenib | PDGFRB; KIT; KDR; FGFR1; RET; FLT1 | 6 |
| Sunitinib | PDGFRB; FLT1; KDR; KIT; CSF1R; PDGFRA | 6 |
| Nintedanib | FLT1; KDR; FGFR1; LCK; LYN; SRC | 6 |
| Pazopanib | FLT1; KDR; PDGFRA; PDGFRB; KIT | 5 |
| Midostaurin | PRKCA; KDR; KIT; PDGFRA; PDGFRB | 5 |
| Resveratrol | ITGA5; ITGB3; SNCA; ESR1; AKT1 | 5 |
| Diethylstilbestrol | ESR1; ESRRG; ESR2; ESRRA | 4 |
| Tofacitinib | TYK2; JAK2; JAK1; JAK3 | 4 |
| Lenvatinib | FLT1; KDR; FGFR1; KIT | 4 |
| Foreskin fibroblast (neonatal) | FLT1; CSF2RA; PDGFRB; TGFB1 | 4 |
| Baricitinib | JAK1; JAK2; PTK2B; JAK3 | 4 |
| Foreskin keratinocyte (neonatal) | EGFR; CSF2RA; PDGFRA; TGFB1 | 4 |
| Bosutinib | ABL1; LYN; SRC | 3 |
| Estradiol valerate | ESR1; ESR2; ESRRG | 3 |
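A table of this shape can be produced by intersecting each drug's known targets with the set of predicted targets and sorting by the size of the intersection. The sketch below is one plausible reading of the method in Fig. 6, not the authors' code; the function name, the drug labels, the placeholder gene names, and the alphabetical tie-break are assumptions.

```python
def rank_repositioning_candidates(drug_targets, predicted_targets, top_n=20):
    """Rank drugs by how many known targets are also predicted targets.

    Returns (drug, overlapping_genes, overlap_count) tuples, dropping drugs
    with no overlap.
    """
    predicted = set(predicted_targets)
    rows = []
    for drug, targets in drug_targets.items():
        overlap = sorted(set(targets) & predicted)
        if overlap:
            rows.append((drug, overlap, len(overlap)))
    # Most overlaps first; ties broken alphabetically (an assumption).
    rows.sort(key=lambda r: (-r[2], r[0]))
    return rows[:top_n]


# Hypothetical drug-target map; real input would come from DrugBank.
drug_targets = {
    "drug_X": ["JAK1", "JAK2", "JAK3", "TYK2"],
    "drug_Y": ["ESR1", "ESR2", "GENE_Q"],
    "drug_Z": ["GENE_R"],
}
predicted = ["JAK1", "JAK2", "JAK3", "TYK2", "ESR1", "ESR2"]
candidates = rank_repositioning_candidates(drug_targets, predicted)
```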