Yuanyuan Zhang1,2, Mengjie Wu1, Shudong Wang2, Wei Chen1.
Abstract
Accurate identification of drug-target interactions (DTIs) is important for understanding the mechanisms of drug treatment and for discovering new drugs. Computational DTI prediction methods that combine multi-source drug and target data can effectively reduce the cost and time of drug development. However, when multi-source data are processed, the contribution of each data source to DTI prediction is often not considered. Fusing the sources efficiently according to their contributions is therefore key to improving prediction accuracy. In this paper, we propose EFMSDTI, a DTI prediction approach based on the effective fusion of multi-source drug and target data that takes the contribution of each source into account. EFMSDTI first builds 15 similarity networks from multi-source information networks, which are classified as topological or semantic graphs of drugs and targets according to their biological characteristics. The networks are then fused by selective and entropy-weighted fusion based on similarity network fusion (SNF), according to their contribution to DTI prediction. A deep neural network model learns low-dimensional vector embeddings of drugs and targets. Finally, LightGBM, an algorithm based on Gradient Boosting Decision Tree (GBDT), completes the DTI prediction. Experimental results show that EFMSDTI outperforms several state-of-the-art algorithms (AUROC and AUPR are both 0.982). In an analysis of the top 1000 predictions, 990 DTIs were confirmed. Code and data are available at https://github.com/meng-jie/EFMSDTI.
Keywords: drug-target prediction; multi-source data; selective and weighted fusion; similarity network fusion; topology and semantic graph
Year: 2022 PMID: 36210804 PMCID: PMC9538487 DOI: 10.3389/fphar.2022.1009996
Source DB: PubMed Journal: Front Pharmacol ISSN: 1663-9812 Impact factor: 5.988
FIGURE 1. EFMSDTI framework for predicting DTIs. EFMSDTI constructs 15 drug-related and target-related networks from heterogeneous multi-source data. Based on the contribution of each network class, the drug and target networks are either fused or spliced after network embedding. Low-dimensional vectors of drugs and targets are extracted through selective and weighted fusion based on SNF, and these features are then input into LightGBM to predict DTIs. The drug-related networks are DDI (drug-drug), DD (drug-disease), DSE (drug-side effect), SDC (chemical similarities), SDATC (ATC similarities), SDP (drug target sequence similarities), SDMF (molecular function similarities), SDCC (cellular component similarities) and SDBP (biological process similarities). The target-related networks are TTI (target-target interaction), TD (target-disease), STP (target sequence similarities), STMF (molecular function similarities), STCC (cellular component similarities) and STBP (biological process similarities).
Classification of target-related data.
| Target category | Networks |
|---|---|
| Topological graphs | Target-target network TTI |
| | Target-disease network TD |
| Semantic graphs | Similarity network of target protein sequence STP |
| | Three GO-based networks: STMF, STCC and STBP |
Classification of drug-related data.
| Drug category | Networks |
|---|---|
| Topological graphs | Drug-drug network DDI |
| | Drug-disease network DD and drug-side effect network DSE |
| Semantic graphs | Drug chemical similarity network SDC |
| | Drug ATC similarity network SDATC |
| | Drug-associated protein sequence similarity network SDP |
| | Three GO-based networks: SDMF, SDCC and SDBP |
The fusion of the six classes of drug networks with the SNF algorithm is described in Algorithm 1.
| Algorithm 1: SNF_drug |
|---|
| Input: DDI.txt, DD.txt, DSE.txt, SDC.txt, SDP.txt, SDATC.txt, SDGO.txt |
| Output: FuDrug.mat |
| Begin |
| 1. Compute the similarity matrix of each heterogeneous association matrix |
| 2. Calculate the edge weights matrix |
| 3. Update each similarity network iteratively t times |
| 4. After t iterations, calculate the population state matrix |
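
The steps of Algorithm 1 can be sketched roughly as follows. This is a minimal pure-Python illustration of the generic SNF update as it is commonly described, not the authors' implementation: the equations for each step are not recoverable from this record, so the kernel definitions, the neighbour count `k`, the choice to exclude a node from its own neighbour set, and the toy matrices `w1` and `w2` are all assumptions. The sketch starts from ready-made similarity matrices (step 1 is taken as given).

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product (adequate for toy sizes)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def full_kernel(w: Matrix) -> Matrix:
    """Step 2a (assumed form): off-diagonal weights scaled to sum to 1/2 per row, diagonal 1/2."""
    n = len(w)
    p = []
    for i in range(n):
        off = sum(w[i][j] for j in range(n) if j != i)
        row = [w[i][j] / (2 * off) if off > 0 else 0.0 for j in range(n)]
        row[i] = 0.5 if off > 0 else 1.0
        p.append(row)
    return p


def local_kernel(w: Matrix, k: int) -> Matrix:
    """Step 2b (assumed form): keep each node's k most similar neighbours, row-normalised."""
    n = len(w)
    s = []
    for i in range(n):
        ranked = sorted((j for j in range(n) if j != i), key=lambda j: w[i][j], reverse=True)
        neighbours = ranked[:k]
        total = sum(w[i][j] for j in neighbours)
        row = [0.0] * n
        for j in neighbours:
            row[j] = w[i][j] / total if total > 0 else 0.0
        s.append(row)
    return s


def snf(networks: List[Matrix], k: int = 2, t: int = 10) -> Matrix:
    """Fuse several n x n similarity networks into a single n x n matrix."""
    m = len(networks)
    if m < 2:
        raise ValueError("SNF needs at least two networks")
    n = len(networks[0])
    p = [full_kernel(w) for w in networks]
    s = [local_kernel(w, k) for w in networks]
    for _ in range(t):  # step 3: t rounds of cross-network diffusion
        updated = []
        for v in range(m):
            # Average the current state of every network except v.
            others = [[sum(p[u][i][j] for u in range(m) if u != v) / (m - 1)
                       for j in range(n)] for i in range(n)]
            updated.append(matmul(matmul(s[v], others), transpose(s[v])))
        p = updated
    # Step 4: the overall state matrix is the mean over all networks.
    return [[sum(p[v][i][j] for v in range(m)) / m for j in range(n)] for i in range(n)]


# Toy example: two 4-node networks that both contain the clusters {0, 1} and {2, 3}.
w1 = [[1.0, 0.9, 0.1, 0.1], [0.9, 1.0, 0.1, 0.1], [0.1, 0.1, 1.0, 0.8], [0.1, 0.1, 0.8, 1.0]]
w2 = [[1.0, 0.8, 0.2, 0.1], [0.8, 1.0, 0.1, 0.2], [0.2, 0.1, 1.0, 0.9], [0.1, 0.2, 0.9, 1.0]]
fused = snf([w1, w2], k=2, t=5)
```

Practical implementations usually add a normalisation and symmetrisation step after each iteration and rely on NumPy for the matrix products; both are omitted here to keep the sketch dependency-free.
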
FIGURE 2. Comparison of DTI prediction accuracy (AUROC) under different combinations of drug and target data, covering the six classes of drug networks and the four classes of target networks. (A) DTI prediction when a single drug network class is combined with all target network classes, and when a single target network class is combined with all drug network classes; (B) DTI prediction when the drug and target networks are fused and spliced, respectively.
Prediction performance of selective fusion. For ease of description, abbreviations in the model are expressed as follows.
| Model | AUROC | AUPR |
|---|---|---|
| DF_TFS | 0.903 | 0.908 |
| DE_D125_DF_TFS | 0.918 | 0.923 |
| DE_D12_T12_DF_TS | 0.924 | 0.933 |
| WEC_DF_TFS | 0.903 | 0.904 |
*D and T denote drug and target; F and S denote fusion and splicing; DE denotes deletion; a D or T followed by numbers indicates which classes of data are deleted. WE and WEC denote entropy weighting based on unclassified networks and on classified networks, respectively.
DE_D12_DF_TFS means that the first class of drug networks is deleted and the remaining drug networks are fully fused, while the first, second and third classes of target networks are fused to learn features and then spliced with the fourth class.
WE_DF_TFS means that all drug and target networks are entropy-weighted, all drug networks are fused, and the first three classes of target networks are fused to learn features and then spliced with the fourth class.
DE_D12_WE_DF_TFP combines the two preceding methods: the first class of drug networks is deleted, all networks are entropy-weighted, the drug networks are fully fused, and the first three classes of target networks are fused to learn features and then spliced with the fourth class.
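
The WE and WEC variants rely on entropy weighting, but this record does not give the formula EFMSDTI uses. As a purely illustrative sketch, the code below applies the classic entropy weight method to a list of similarity networks: each network's entries are treated as a probability distribution, and networks with lower normalised Shannon entropy receive larger weights. The function names, the flattening of each matrix, and the toy inputs are assumptions, and how such weights would enter the SNF update is not specified here.

```python
import math
from typing import List

Matrix = List[List[float]]


def normalized_entropy(w: Matrix) -> float:
    """Shannon entropy of the sum-normalised entries, scaled to [0, 1]."""
    values = [x for row in w for x in row]
    total = sum(values)
    if total <= 0 or len(values) < 2:
        return 1.0  # an empty network is treated as carrying no information
    entropy = -sum((x / total) * math.log(x / total) for x in values if x > 0)
    return entropy / math.log(len(values))


def entropy_weights(networks: List[Matrix]) -> List[float]:
    """Entropy weight method (assumed here): weight_k is proportional to 1 - entropy_k."""
    diversity = [max(0.0, 1.0 - normalized_entropy(w)) for w in networks]
    total = sum(diversity)
    if total == 0:
        # All networks are equally uninformative: fall back to uniform weights.
        return [1.0 / len(networks)] * len(networks)
    return [d / total for d in diversity]


# Toy example: a uniform network versus one with concentrated similarity.
uniform = [[1.0, 1.0], [1.0, 1.0]]
concentrated = [[1.0, 0.0], [0.0, 1.0]]
weights = entropy_weights([uniform, concentrated])
```

In this toy case the uniform network receives essentially zero weight, because its entries carry no contrast between node pairs.
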
Comparison of EFMSDTI with other state-of-the-art methods for DTIs prediction.
| Model | AUROC | AUPR |
|---|---|---|
| EFMSDTI | 0.982 | 0.982 |
| NEDTP | 0.971 | 0.967 |
| deepDTnet | 0.963 | 0.969 |
| NeoDTI | 0.97 | 0.91 |
| DTINet | 0.932 | 0.943 |
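
For readers unfamiliar with the two metrics in these tables, the following sketch shows one way to compute them from binary labels and predicted scores. It is an illustration only: AUROC is computed as the fraction of correctly ordered positive-negative pairs, and AUPR is approximated by average precision, which is one common estimator. The evaluation protocol behind the reported numbers (cross-validation, negative sampling) is not described in this record, and the toy labels and scores are made up.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values at the rank of each positive (assumes distinct scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


# Toy example: 1 marks a known interaction, 0 a non-interaction.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.7, 0.3, 0.6]
toy_auroc = auroc(labels, scores)
toy_aupr = average_precision(labels, scores)
```

The pairwise AUROC loop is quadratic in the number of examples, so a rank-based or library implementation is preferable for realistic dataset sizes.
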
FIGURE 3. The number of DTIs that were verified to exist in the top 1000 prediction results.
The 10 unverified DTIs out of the top 1000 prediction results.
| DTI’s ranking in the top 1000 predictions | Drug | Target |
|---|---|---|
| 922nd | DB08882:Linagliptin | 2934:GSN |
| 953rd | DB00612:Bisoprolol | 3283:HSD3B1 |
| 963rd | DB08896:Regorafenib | 9453:GGPS1 |
| 991st | DB00606:Cyclothiazide | 4698:NDUFA5 |
| 993rd | DB00602:Ivermectin | 834:CASP1 |
| 994th | DB01594:Cinolazepam | 378:ARF4 |
| 995th | DB06204:Tapentadol | 55825:PECR |
| 997th | DB00312:Pentobarbital | 4326:MMP17 |
| 998th | DB01364:Ephedrine | 4507:MTAP |
| 999th | DB00459:Acitretin | 10606:PAICS |
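
The ranks in the table above are enough to recompute how many of the top k predictions were verified for any k up to 1000. The short sketch below does this arithmetic; the function names are hypothetical, and the only data used are the ten unverified ranks listed in the table.

```python
from bisect import bisect_right
from typing import Sequence

# Ranks of the 10 unverified DTIs, taken from the table above.
UNVERIFIED_RANKS = [922, 953, 963, 991, 993, 994, 995, 997, 998, 999]


def verified_in_top(k: int, unverified: Sequence[int] = UNVERIFIED_RANKS) -> int:
    """Number of verified DTIs among the k highest-ranked predictions (1 <= k <= 1000)."""
    if not 1 <= k <= 1000:
        raise ValueError("k must lie between 1 and 1000")
    # bisect_right counts how many unverified ranks are <= k.
    return k - bisect_right(sorted(unverified), k)


def precision_at(k: int) -> float:
    """Fraction of the top k predictions that were verified."""
    return verified_in_top(k) / k


top_1000_verified = verified_in_top(1000)
```

Because the first unverified prediction is ranked 922nd, every prediction in the top 921 was verified.
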