Ping Xuan, Hui Cui, Tonghui Shen, Nan Sheng, Tiangang Zhang.
Abstract
Identifying new indications for existing drugs can help reduce drug development costs. Predicting associations between drugs and diseases is challenging because their similarities and relations are complicated and non-linear. We propose the HeteroDualNet model to address this issue. First, three types of matrices are extracted to represent intra-drug similarities, intra-disease similarity and drug-disease associations. The intra-drug similarities consider three drug features and a newly introduced drug-related disease correlation. Second, an embedding mechanism is proposed to integrate these matrices into a heterogeneous drug-disease association layer (hetero-layer). Further, a neighbouring heterogeneous layer (hetero-layer-N) is constructed to incorporate the biological premise that similar drugs can often treat related diseases. Finally, a dual convolutional neural network is built with hetero-layer and hetero-layer-N as two branches to learn simultaneously from the characteristics of drug-disease pairs and the relations of their neighbours. On a public dataset of 763 drugs and 681 diseases, HeteroDualNet outperformed the four other methods compared in terms of the areas under the Receiver Operating Characteristic and Precision-Recall curves, and the recall rate at top k. Case studies of five drugs further demonstrated the capacity of HeteroDualNet to find reliable candidate diseases for drugs, as validated by database records or literature. Our findings show that embedding heterogeneous layers of original and neighbouring drug-disease representations in a dual neural network improved association prediction performance.
Keywords: deep learning; drug-disease association prediction; dual convolutional neural network; multiple kinds of similarities; neighbouring heterogeneous layer
Year: 2019 PMID: 31780934 PMCID: PMC6856670 DOI: 10.3389/fphar.2019.01301
Source DB: PubMed Journal: Front Pharmacol ISSN: 1663-9812 Impact factor: 5.810
Figure 1. Overview of the proposed HeteroDualNet model for drug-disease association prediction. Given the input data, (A) similarity and association representations are extracted, including (B) intra-disease similarity, (C) intra-drug similarity, and (D) drug-disease association. Then (E) an embedding mechanism embeds these matrices. The final drug-disease association score is obtained by (H) HeteroDualNet with (F) the heterogeneous and (G) the biological-premise-enhanced neighbouring heterogeneous drug-disease association layers.
Figure 2. Illustration of the proposed embedding mechanism for the heterogeneous drug-disease association matrix. Using drug r2 and disease d1 as an example, (D) the heterogeneous matrix is obtained by integrating (A) four types of intra-drug similarities, (B) drug-disease associations and (C) intra-disease similarities. In (A) and (C), darker colours indicate higher similarities; in (B), a darker colour indicates that the drug-disease association is known.
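The embedding in Figure 2 can be sketched for a single drug-disease pair. This is a minimal illustration, not the authors' implementation: it assumes that the drug's row from each intra-drug similarity matrix is concatenated with the drug's association row, and that the disease's association column is concatenated with its intra-disease similarity row, so that every row has n_drugs + n_diseases entries. The function name `embed_pair` and the toy matrices are hypothetical.

```python
def embed_pair(drug_sims, assoc, dis_sim, r, d):
    """Build a small heterogeneous matrix for drug r and disease d.

    drug_sims: list of n_drugs x n_drugs similarity matrices (one per view)
    assoc:     n_drugs x n_diseases 0/1 association matrix
    dis_sim:   n_diseases x n_diseases similarity matrix
    Assumed layout: each row has n_drugs + n_diseases entries.
    """
    rows = []
    # One row per intra-drug similarity view:
    # [similarities of r to all drugs | associations of r with all diseases].
    for sim in drug_sims:
        rows.append(list(sim[r]) + list(assoc[r]))
    # One row for the disease side:
    # [associations of all drugs with d | similarities of d to all diseases].
    assoc_col = [assoc[i][d] for i in range(len(assoc))]
    rows.append(assoc_col + list(dis_sim[d]))
    return rows


# Toy data: 3 drugs, 2 diseases, 2 similarity views (the paper uses four).
sim_a = [[1.0, 0.2, 0.5], [0.2, 1.0, 0.1], [0.5, 0.1, 1.0]]
sim_b = [[1.0, 0.7, 0.0], [0.7, 1.0, 0.3], [0.0, 0.3, 1.0]]
assoc = [[1, 0], [0, 1], [1, 1]]
dis_sim = [[1.0, 0.4], [0.4, 1.0]]

matrix = embed_pair([sim_a, sim_b], assoc, dis_sim, r=1, d=0)
for row in matrix:
    print(row)
```

Under these assumptions the result is a small two-dimensional matrix per pair, which is the kind of input a convolutional branch can consume.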
Figure 3. Illustration of the embedding procedure for the neighbouring heterogeneous matrix. Using drug r2 and disease d1 as an example, (D) the final matrix is obtained from the most similar neighbours of r2 (e.g. r3, r1, r5, r4), each found with one of (A) the four intra-drug similarities, the most similar neighbour of disease d1 (e.g. d4) found with (B) the intra-disease similarity matrix, and (C) the drug-disease associations. In (A) and (B), darker colours indicate higher similarities; in (C), a darker colour indicates that the drug-disease association is known.
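The neighbour selection in Figure 3 can be sketched as a top-k lookup in each similarity matrix. This is a hedged illustration only: the helper `top_neighbours`, the tie-breaking rule, and the toy matrices are assumptions, and the source does not state how many neighbours are used per view.

```python
def top_neighbours(sim, index, k=1):
    """Return the k indices most similar to `index`, excluding `index` itself."""
    candidates = [j for j in range(len(sim)) if j != index]
    # Sort by descending similarity; ties are broken by the lower index.
    candidates.sort(key=lambda j: (-sim[index][j], j))
    return candidates[:k]


# Toy data: two similarity views over 3 drugs (the paper uses four views).
view_a = [[1.0, 0.2, 0.5], [0.2, 1.0, 0.1], [0.5, 0.1, 1.0]]
view_b = [[1.0, 0.1, 0.4], [0.1, 1.0, 0.6], [0.4, 0.6, 1.0]]

# The most similar neighbour of drug 1 can differ from view to view.
neighbours = [top_neighbours(view, index=1)[0] for view in (view_a, view_b)]
print(neighbours)
```

Each selected neighbour would then contribute its own similarity and association rows to the neighbouring matrix, in the same way the original drug does for the hetero-layer.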
Figure 4. Schematic diagram of HeteroDualNet. (A) One branch over the hetero-layer of drug-disease characteristics and (B) one branch over the neighbouring heterogeneous layer (hetero-layer-N) are connected by (C) an integration module for the final association score prediction. Three 3×5 filters in the first convolution, six 3×5 filters in the second convolution, and a 1×2 sliding window in the first and second pooling layers are used for illustration.
Figure 5. Comparison of the proposed HeteroDualNet model (H_D_Net) against four other methods by (A) Receiver Operating Characteristic (ROC) and (B) Precision-Recall (PR) curves.
Receiver Operating Characteristic area under curve (ROC AUC) and Precision-Recall area under curve (PR AUC) of all the methods in comparison (average performance on 763 drugs).

| Metric | HeteroDualNet | TL_HGBI | MBiRW | LRSSL | SCMFDD |
|---|---|---|---|---|---|
| ROC AUC | 0.908 | 0.723 | 0.855 | 0.845 | 0.611 |
| PR AUC | 0.154 | 0.031 | 0.045 | 0.089 | 0.006 |
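The two metrics in the table can be illustrated for one drug's ranked candidate list. This is a minimal sketch, not the authors' evaluation code: ROC AUC is computed as the fraction of positive-negative pairs ranked correctly (ties count as half), and the PR area is approximated by average precision, which is one common estimator and may differ from the one used in the paper. The function names and toy labels are hypothetical, and the sketch assumes at least one positive and one negative example.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ordering
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy example: 1 = known association, 0 = unobserved pair.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]

auc_value = roc_auc(labels, scores)
ap_value = average_precision(labels, scores)
print(round(auc_value, 3), round(ap_value, 3))
```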
Figure 6. Recall across all tested drugs at different top-k cutoffs.
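Recall at top k measures the share of a drug's known associated diseases that appear among its k highest-scored candidates. The sketch below is an illustration for a single drug under that reading; the function name `recall_at_k` and the toy data are hypothetical, and how the paper aggregates the values over drugs is not specified here.

```python
def recall_at_k(labels, scores, k):
    """Fraction of positives found among the k highest-scored candidates."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    positives = sum(labels)
    found = sum(labels[i] for i in order[:k])
    return found / positives


# Toy example for one drug: 1 = known association, 0 = unobserved pair.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]

recalls = {k: recall_at_k(labels, scores, k) for k in (1, 2, 3)}
print(recalls)
```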
Top 10 related candidate diseases of ciprofloxacin, ceftriaxone, ofloxacin, ampicillin and cefotaxime.
| Drug name | Rank | Disease name | Description | Rank | Disease name | Description |
|---|---|---|---|---|---|---|
| ciprofloxacin | 1 | Pneumonia, Bacterial | CTD | 6 | Gram-Positive Bacterial Infections | CTD |
| | 2 | Salmonella Infections | CTD | 7 | Eye Infections, Bacterial | Literature ( |
| | 3 | Bacterial Infections | CTD | 8 | Soft Tissue Infections | CTD |
| | 4 | Streptococcal Infections | DrugBank | 9 | Enterobacteriaceae Infections | CTD |
| | 5 | Gram-Negative Bacterial Infections | CTD | 10 | Helicobacter Infections | CTD |
| ceftriaxone | 1 | Gram-Negative Bacterial Infections | CTD | 6 | Haemophilus Infections | CTD |
| | 2 | Bacterial Infections | CTD, | 7 | Gram-Positive Bacterial Infections | CTD |
| | 3 | Septicemia | DrugBank | 8 | Skin Diseases, Infectious | DrugBank |
| | 4 | Respiratory Tract Infections | CTD | 9 | Wound Infection | ClinicalTrials |
| | 5 | Pseudomonas Infections | DrugBank | 10 | Eye Infections, Bacterial | DrugBank |
| ofloxacin | 1 | Eye Infections, Bacterial | ClinicalTrials, | 6 | Pseudomonas Infections | CTD |
| | 2 | Gram-Negative Bacterial Infections | DrugBank | 7 | Bacterial Infections | CTD |
| | 3 | Sinusitis | CTD | 8 | Bacteroides Infections | DrugBank |
| | 4 | Streptococcal Infections | CTD | 9 | Gram-Positive Bacterial Infections | CTD |
| | 5 | Pneumonia, Bacterial | CTD | 10 | Enterobacteriaceae Infections | DrugBank |
| ampicillin | 1 | Pseudomonas Infections | unconfirmed | 6 | Proteus Infections | CTD |
| | 2 | Bacterial Infections | CTD | 7 | Septicemia | DrugBank |
| | 3 | Gram-Positive Bacterial Infections | CTD | 8 | Streptococcal Infections | CTD |
| | 4 | Gram-Negative Bacterial Infections | CTD | 9 | Wound Infection | CTD |
| | 5 | Pneumonia, Bacterial | CTD, ClinicalTrials | 10 | Enterobacteriaceae Infections | DrugBank |
| cefotaxime | 1 | Respiratory Tract Infections | CTD, ClinicalTrials | 6 | Enterobacteriaceae Infections | DrugBank |
| | 2 | Pseudomonas Infections | DrugBank | 7 | Gram-Positive Bacterial Infections | CTD, DrugBank |
| | 3 | Gram-Negative Bacterial Infections | CTD, DrugBank | 8 | Wound Infection | DrugBank |
| | 4 | Septicemia | DrugBank | 9 | Skin Diseases, Infectious | ClinicalTrials |
| | 5 | Bacterial Infections | CTD, ClinicalTrials | 10 | Osteomyelitis | CTD, ClinicalTrials |
(1) CTD means that the association is recorded in the Comparative Toxicogenomics Database, which contains manually curated drug-disease associations. (2) DrugBank means that the association is recorded in the DrugBank database, which collects experimental information on drugs. (3) ClinicalTrials means that the association between the drug and the disease is recorded in the online database ClinicalTrials.gov. (4) Literature means that published literature supports the association between the drug and the disease. (5) Unconfirmed means that no evidence was found that the drug is associated with the disease.