Shaghayegh Sadeghi, Jianguo Lu, Alioune Ngom.
Abstract
Drug repurposing is the process of discovering new indications (i.e., diseases or conditions) for already approved drugs. Many computational methods have been proposed for predicting new associations between drugs and diseases. In this article, we propose DR-HGNN, an integrative heterogeneous graph neural network-based method for multi-label drug repurposing, to discover new indications for existing drugs. First, we used the DTINet dataset to construct a heterogeneous drug-protein-disease (DPD) network, a graph composed of four types of nodes (drugs, proteins, diseases, and drug side effects) and eight types of edges. Second, we labeled each drug-protein edge, dp_{i,j} = (d_i, p_j), of the DPD network with the set of diseases {δ_{i,j,1}, …, δ_{i,j,k}} associated with both d_i and p_j. We then devised multi-label ranking approaches that incorporate a neural network architecture operating on the heterogeneous graph-structured data and leveraging both the interaction patterns and the features of drug and protein nodes. We used HinSAGE, a derivative of the GraphSAGE algorithm, on the heterogeneous DPD network to learn low-dimensional vector representations of the features of drugs and proteins. Finally, we used the drug-protein network to learn the embeddings of the drug-protein edges and then predict the disease labels that act as bridges between drugs and proteins. The proposed method shows better results than existing methods applied to the DTINet dataset, with an AUC of 0.964.
Keywords: computational drug repurposing; data integration; graph embedding; graph neural network; graphsage; link prediction
Year: 2022 PMID: 35873597 PMCID: PMC9298882 DOI: 10.3389/fphar.2022.908549
Source DB: PubMed Journal: Front Pharmacol ISSN: 1663-9812 Impact factor: 5.988
FIGURE 1. Pipeline of DR-HGNN. (A) Creating the heterogeneous drug-disease-protein network: using the DTINet dataset, a meta-graph is created, which can be represented as a heterogeneous graph (on the right). (B) Multi-label problem transformation: a problem transformation technique is used because the heterogeneous graph from step A is multi-labeled. (C) Link embedding using heterogeneous GraphSAGE: given a matrix representation of each protein and drug and the heterogeneous graph from step B, heterogeneous GraphSAGE embeds the links between nodes of this graph.
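The edge-labeling step described in the abstract (each drug-protein edge receives the diseases associated with both of its endpoints) can be sketched as a set intersection. This is a minimal illustration, not the authors' code: the function names, the toy identifiers (`d1`, `p1`, `asthma`, and so on), and the multi-hot encoding are all assumptions made for the example, and the record does not state which problem transformation technique the paper actually uses.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def label_drug_protein_edges(
    edges: List[Edge],
    drug_diseases: Dict[str, Set[str]],
    protein_diseases: Dict[str, Set[str]],
) -> Dict[Edge, Set[str]]:
    """Label each (drug, protein) edge with diseases linked to BOTH endpoints."""
    labels: Dict[Edge, Set[str]] = {}
    for drug, protein in edges:
        shared = drug_diseases.get(drug, set()) & protein_diseases.get(protein, set())
        labels[(drug, protein)] = shared
    return labels


def to_multi_hot(
    labels: Dict[Edge, Set[str]], disease_order: List[str]
) -> Dict[Edge, List[int]]:
    """Encode each label set as a 0/1 vector over a fixed disease ordering."""
    return {
        edge: [1 if disease in diseases else 0 for disease in disease_order]
        for edge, diseases in labels.items()
    }


# Hypothetical toy data, not taken from DTINet.
toy_edges = [("d1", "p1"), ("d1", "p2")]
toy_drug_diseases = {"d1": {"asthma", "copd"}}
toy_protein_diseases = {"p1": {"asthma", "diabetes"}, "p2": {"asthma", "copd"}}

edge_labels = label_drug_protein_edges(toy_edges, toy_drug_diseases, toy_protein_diseases)
disease_order = sorted({d for s in toy_protein_diseases.values() for d in s})
multi_hot = to_multi_hot(edge_labels, disease_order)

print(edge_labels[("d1", "p1")])  # {'asthma'}
print(multi_hot[("d1", "p2")])    # [1, 1, 0]
```

Under this encoding, each edge becomes one training example with a multi-label target, which is the shape a multi-label ranking model would consume.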
Comparison of GNN methods.
| Method | Handle bipartite graph | Handle heterogeneous graph | Handle different node feature sizes |
|---|---|---|---|
| HinSAGE | Yes | Yes | Yes |
| GraphSAGE | Yes | No | Yes |
| GCN | No | No | No |
| GAT | Yes | No | Yes |
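To make the "different node feature sizes" column concrete, the sketch below shows one simplified, HinSAGE-style mean-aggregation step in plain Python. It assumes one weight matrix per neighbor type, so that neighbors with differently sized feature vectors are projected into a common hidden space before being combined. The function names, weights, and feature values are hypothetical, and summing the projections is a simplification chosen for brevity; it is not meant to reproduce any library's exact layer.

```python
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]


def mean_vector(vectors: List[Vector], size: int) -> Vector:
    """Element-wise mean of equally sized vectors (zeros if there are none)."""
    if not vectors:
        return [0.0] * size
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(size)]


def matvec(weights: Matrix, vector: Vector) -> Vector:
    """Multiply a (hidden x input) matrix by an input vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def hetero_mean_layer(
    self_features: Vector,
    neighbors_by_type: Dict[str, List[Vector]],
    self_weights: Matrix,
    neighbor_weights: Dict[str, Matrix],
    feature_sizes: Dict[str, int],
) -> Vector:
    """One aggregation step: project self and per-type neighbor means, sum, ReLU."""
    hidden = matvec(self_weights, self_features)
    for node_type, neighbor_features in neighbors_by_type.items():
        pooled = mean_vector(neighbor_features, feature_sizes[node_type])
        # A type-specific matrix maps this type's feature size to the hidden size.
        projected = matvec(neighbor_weights[node_type], pooled)
        hidden = [h + p for h, p in zip(hidden, projected)]
    return [max(0.0, h) for h in hidden]


# Hypothetical drug node with 2 features; protein neighbors have 3, diseases 1.
drug_embedding = hetero_mean_layer(
    self_features=[1.0, 2.0],
    neighbors_by_type={
        "protein": [[1.0, 0.0, 1.0], [3.0, 0.0, 1.0]],
        "disease": [[4.0]],
    },
    self_weights=[[1.0, 0.0], [0.0, 1.0]],
    neighbor_weights={
        "protein": [[1.0, 0.0, 0.0], [0.0, 0.0, -5.0]],
        "disease": [[0.5], [0.5]],
    },
    feature_sizes={"protein": 3, "disease": 1},
)
print(drug_embedding)  # [5.0, 0.0]
```

A model with a single shared weight matrix cannot do this directly, because the matrix shape fixes one input size for every node; that is the limitation the table attributes to GCN.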
Number of nodes and edges of individual types in the constructed heterogeneous network on DTINet (Luo et al., 2017).
| Node | Number of edges | | | |
|---|---|---|---|---|
| | Drug | Protein | Disease | Side effect |
| Drug | 10,036 | 1,923 | 199,214 | 80,164 |
| Protein | 1,923 | 7,363 | 1,596,745 | – |
| Number of nodes | 708 | 1,512 | 5,603 | 4,192 |
FIGURE 2. AUC ROC and AUC PR values of prediction results obtained by applying DR-HGNN and other reported methods in 5-fold cross-validation.
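For readers unfamiliar with the two metrics named in the caption, the following sketch computes them from scratch on a tiny, made-up set of scores. AUC ROC is computed as the fraction of positive-negative pairs ranked correctly (ties count as half), and AUC PR is approximated by average precision. The function names and the toy labels and scores are assumptions for illustration; the record does not say which implementation the authors used.

```python
from typing import List


def auc_roc(labels: List[int], scores: List[float]) -> float:
    """Probability that a random positive scores higher than a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


def average_precision(labels: List[int], scores: List[float]) -> float:
    """Mean of the precision values measured at the rank of each positive."""
    if sum(labels) == 0:
        raise ValueError("need at least one positive example")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, index in enumerate(order, start=1):
        if labels[index] == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits


# Hypothetical predictions for four candidate labels.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]

toy_auc = auc_roc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
print(toy_auc)           # 0.75
print(round(toy_ap, 4))  # 0.8333
```

The pairwise loop is quadratic in the number of examples, which is fine for a worked example but too slow for a network of this size; a rank-based formulation or a library routine would be the practical choice.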
FIGURE 4. AUC and loss history plot for DR-HGNN on each epoch for the training and validation datasets.
Results of DR-HGNN on the TL-HGBI dataset (Wang et al., 2014).
| Method | AUC | AUPR |
|---|---|---|
| TL-HGBI | 0.95 | 0.0492 |
| NMF-DR | | 0.4200 |
| SCMFDD | 0.97 | 0.1500 |
| NTSIM | 0.96 | 0.2631 |
| DR-HGNN | 0.9895 | |
AUC ROC results for DR-HGNN based on different parameters.
| Adam optimizer | Learning rate | | | | |
|---|---|---|---|---|---|
| Dropout | | | | | |
| 0 | 0.8245 | | 0.9487 | 0.9639 | 0.8734 |
| 0.1 | 0.8678 | 0.9165 | 0.954 | | |
| 0.2 | 0.8605 | 0.9365 | | 0.8903 | 0.8778 |
| 0.3 | 0.9333 | 0.916 | 0.9594 | 0.9167 | 0.8317 |
| 0.4 | 0.898 | 0.9307 | 0.8955 | 0.937 | 0.827 |
| 0.5 | 0.9492 | 0.9519 | 0.8986 | 0.9606 | 0.8319 |
| 0.6 | 0.888 | 0.9469 | 0.892 | 0.9068 | 0.8687 |
| 0.7 | 0.8682 | 0.8923 | 0.8682 | 0.961 | 0.7054 |
| 0.8 | | 0.93 | 0.9106 | 0.9448 | 0.6968 |
| 0.9 | 0.8922 | 0.9457 | 0.9357 | 0.9298 | 0.8472 |
FIGURE 3. Impact of the number of layers and embedding dimension on the model performance.