Pathima Nusrath Hameed, Karin Verspoor, Snezana Kusljic, Saman Halgamuge.
Abstract
BACKGROUND: Investigating and understanding drug-drug interactions (DDIs) is important for improving the effectiveness of clinical care. DDIs can occur when two or more drugs are administered together. Experimental DDI detection methods are costly and time-consuming. Hence, there is great interest in developing efficient computational methods for inferring potential DDIs. Standard binary classifiers require both positives and negatives for training. In a DDI context, drug pairs that are known to interact can serve as positives for predictive methods, but negatives, that is, drug pairs confirmed to have no interaction, are scarce. To address this lack of negatives, we introduce a Positive-Unlabeled Learning method for inferring potential DDIs.
Keywords: CYP isoforms; Drug-drug interaction; Growing self organizing map (GSOM); PU learning; Pairwise drug similarity
Year: 2017 PMID: 28249566 PMCID: PMC5333429 DOI: 10.1186/s12859-017-1546-7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1. This diagram illustrates the main idea behind Positive-Unlabeled Learning. (a) Available data. (b) Goal.
Fig. 2. This diagram illustrates the proposed methodology and our three main contributions for inferring DDIs, integrating Similarity Feature Representation 1 (SFR1) and Similarity Feature Representation 2 (SFR2).
Fig. 3. Pseudo-code for profiling GSOM nodes as ‘positive’, ‘negative’, or ‘ambiguous’ nodes.
Fig. 4. Example of deriving similarity metrics for drug association. The Jaccard Index is the frequently used approach, while the Individual Similarity function is the proposed function.
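For reference, the Jaccard Index named in Fig. 4 is the size of the intersection of two sets divided by the size of their union. The following is a minimal Python sketch of that standard definition only. The feature sets (CYP isoform labels) are hypothetical illustration data, and the proposed Individual Similarity function is not reproduced because its definition does not appear in this record.

```python
def jaccard_index(a, b):
    """Return |A ∩ B| / |A ∪ B| for two collections treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumed convention: two empty sets are given similarity 0.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical feature sets for two drugs (illustration only).
drug_x = {"CYP3A4", "CYP2D6", "CYP2C9"}
drug_y = {"CYP3A4", "CYP2C9", "CYP1A2"}

# Two shared features out of four distinct features overall.
similarity = jaccard_index(drug_x, drug_y)
```

Because the index is symmetric, `jaccard_index(drug_x, drug_y)` and `jaccard_index(drug_y, drug_x)` give the same value.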
Fig. 5. (a) The average within-cluster distance (AWCD) and (b) the variation in the number of GSOM nodes, both using Similarity Feature Representation 1.
Fig. 6. GSOM maps for DDI data: (a) the GSOM map for Similarity Feature Representation 1 (SFR1) with Spread Factor = 0.1, containing 919 nodes; (b) the GSOM map for Similarity Feature Representation 2 (SFR2) with Spread Factor = 10^−15, containing 922 nodes. The nodes shown in blue are the proposed negative nodes, which contain only unlabeled instances; the nodes shown in grey contain both initial positives and unlabeled instances; and the nodes shown in red contain only initial positives.
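The three node categories in the Fig. 6 legend can be sketched in Python as follows. This follows only the colour legend (only initial positives, only unlabeled instances, or a mix); the authors' pseudo-code in Fig. 3 is not included in this record and may apply additional rules. The function name, the inputs, and the toy data are hypothetical.

```python
from collections import defaultdict


def profile_nodes(node_of_instance, is_initial_positive):
    """Label each map node 'positive', 'negative', or 'ambiguous'.

    node_of_instance: node id assigned to each drug pair (hypothetical input).
    is_initial_positive: True for a known interacting pair, False for an
        unlabeled pair, in the same order as node_of_instance.
    """
    counts = defaultdict(lambda: [0, 0])  # node id -> [positives, unlabeled]
    for node, positive in zip(node_of_instance, is_initial_positive):
        counts[node][0 if positive else 1] += 1

    profiles = {}
    for node, (n_pos, n_unl) in counts.items():
        if n_unl == 0:
            profiles[node] = "positive"   # only initial positives (red)
        elif n_pos == 0:
            profiles[node] = "negative"   # only unlabeled instances (blue)
        else:
            profiles[node] = "ambiguous"  # mixed node (grey)
    return profiles


# Toy assignment of six drug pairs to three nodes (illustration only).
nodes = [0, 0, 1, 1, 2, 2]
labels = [True, True, False, False, True, False]
profiles = profile_nodes(nodes, labels)
```

In this toy input, node 0 holds only known positives, node 1 holds only unlabeled pairs, and node 2 holds one of each.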
Performance assessment of the proposed GSOM-based PUL approach, Baseline and OCSVM using Similarity Feature Representation 1 (SFR1) and Similarity Feature Representation 2 (SFR2)
| Evaluation | Feature representation | Metric | Baseline | OCSVM | GSOM-based PUL |
|---|---|---|---|---|---|
| Cross validation | SFR1 | Precision | 0.628 | 0.584 | 0.951 |
| | | Recall | 0.448 | 0.499 | 0.861 |
| | | F1-score | 0.523 | 0.537 | 0.904 |
| | SFR2 | Precision | 0.823 | 0.622 | 0.974 |
| | | Recall | 0.850 | 0.436 | 0.975 |
| | | F1-score | 0.836 | 0.511 | 0.974 |
Performance assessment of the ensemble model for GSOM-based PUL approach for DDI prediction using Similarity Feature Representation 1 (SFR1) and Similarity Feature Representation 2 (SFR2)
| Evaluation | Metric | SFR1 | SFR2 | Ensemble model |
|---|---|---|---|---|
| Cross validation (80%) | Precision | 0.960 | 0.968 | |
| | Recall | 0.841 | 0.972 | |
| | F-measure | 0.896 | 0.970 | |
| Testing (20%) | Precision | 0.749 | 0.973 | 0.970 |
| | Recall | 0.648 | 0.979 | 0.975 |
| | F-measure | 0.692 | 0.976 | 0.973 |
Cross-Validation performance assessment of the proposed GSOM-based PUL approach, Baseline and OCSVM for DDI prediction using INDI dataset
| Metric | Baseline | OCSVM | GSOM-based PUL |
|---|---|---|---|
| Precision | 0.610 | 0.583 | 0.917 |
| Recall | 0.529 | 0.583 | 0.916 |
| F1-score | 0.567 | 0.583 | 0.916 |
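The F1-score (F-measure) rows in the tables above are conventionally the harmonic mean of precision and recall. The short Python sketch below recomputes it from the INDI precision and recall values as an illustration. It assumes the standard F1 definition; how the authors aggregated scores across cross-validation folds is not stated here, so values recomputed from rounded averages may differ from the reported ones in the third decimal place.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs taken from the INDI table above.
indi = {
    "Baseline": (0.610, 0.529),
    "OCSVM": (0.583, 0.583),
    "GSOM-based PUL": (0.917, 0.916),
}

# Recomputed F1 per method, to compare against the reported F1-score row.
indi_f1 = {name: f1_score(p, r) for name, (p, r) in indi.items()}
```

When precision and recall are equal, as for OCSVM here, the harmonic mean equals that shared value.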