Wen Zhang, Yanlin Chen, Dingfang Li.
Abstract
Interactions between drugs and target proteins provide important information for drug discovery. To date, experiments have identified only a small number of drug-target interactions. Therefore, developing computational methods for drug-target interaction prediction is an urgent task of both theoretical interest and practical significance. In this paper, we propose a label propagation method with linear neighborhood information (LPLNI) for predicting unobserved drug-target interactions. First, we calculate drug-drug linear neighborhood similarity in the feature spaces by considering how to reconstruct each data point from its neighbors. Then, we take the similarities as the manifold of drugs and assume that the manifold is unchanged in the interaction space. Finally, we predict unobserved interactions between known drugs and targets by using the drug-drug linear neighborhood similarity and the known drug-target interactions. The experiments show that LPLNI, using only known drug-target interactions, makes high-accuracy predictions on four benchmark datasets. Furthermore, we consider incorporating chemical structures into LPLNI models. Experimental results demonstrate that the model with integrated information (LPLNI-II) achieves improved performance and outperforms other state-of-the-art methods. Known drug-target interactions are an important information source for computational predictions. The usefulness of the proposed method is demonstrated by cross validation and a case study.
Keywords: drug-target interactions; integrated information; label propagation; linear neighborhood
Year: 2017 PMID: 29186828 PMCID: PMC6149680 DOI: 10.3390/molecules22122056
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. The area under the precision-recall curve (AUPR) values of the similarity-based models with different parameters. LNS-10 denotes LN similarity-based models constructed with 10 neighbors; the other labels follow the same pattern.
Performances of label propagation method with linear neighborhood information (LPLNI) models and LPLNI-II models on the four datasets.
| Features | Methods | NRs | ICs | GPCRs | Es |
|---|---|---|---|---|---|
| Daylight | LPLNI | ||||
| EState | LPLNI | 0.2958 | 0.2437 | 0.3096 | 0.2770 |
| | | 0.6903 | 0.7098 | 0.8480 | 0.8055 |
| Extended | LPLNI | ||||
| GraphOnly | LPLNI | 0.3177 | 0.3226 | 0.3525 | 0.3507 |
| | | 0.7478 | 0.7606 | 0.8483 | 0.7939 |
| Hybridization | LPLNI | ||||
| Klekota-Roth | LPLNI | 0.3030 | 0.3819 | 0.3360 | |
| 0.7355 | 0.8580 | 0.8179 | |||
| MACCS | LPLNI | 0.3764 | 0.3881 | 0.3804 | |
| 0.7712 | 0.8621 | 0.8360 | |||
| Pubchem | LPLNI | 0.4470 | 0.3234 | ||
| 0.7561 | 0.7522 | ||||
| Substructure | LPLNI | 0.3202 | 0.3092 | 0.2942 | 0.2875 |
| | | 0.7539 | 0.7662 | 0.8465 | 0.8068 |
| Interaction profile | LPLNI | 0.9464 | 0.9658 | 0.9461 | 0.9051 |
| | | 0.9532 | 0.9890 | 0.9683 | 0.9465 |
| Day&Ext&Hyb&Int | LPLNI-II | 0.9492 | 0.9684 | 0.9469 | 0.9069 |
| | | 0.9919 | 0.9947 | 0.9769 | 0.9700 |
For each fingerprint, the first row gives the AUPR values and the second row gives the area under the receiver operating characteristic (ROC) curve (AUC) values. Bold type indicates the top four AUC and AUPR values. Day&Ext&Hyb&Int: using Daylight, Extended, Hybridization, and the interaction profile as features.
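As a rough illustration of the two metrics reported in the table, the sketch below computes AUC from pairwise rankings and approximates AUPR by average precision. The choice of average precision as the AUPR estimator is an assumption (the estimator used in the paper is not stated here), and the function names and the five-pair toy data are hypothetical.

```python
import numpy as np

def auc_roc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size)

def average_precision(y_true, y_score):
    """Average precision, used here as an assumed stand-in for AUPR."""
    order = np.argsort(-y_score, kind="stable")  # highest score first
    hits = y_true[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    # Average the precision over the ranks where a true interaction appears.
    return (precision_at_k * hits).sum() / hits.sum()

# Hypothetical toy example: 5 candidate drug-target pairs, 2 true interactions.
y_true = np.array([1, 0, 1, 0, 0])
y_score = np.array([0.9, 0.8, 0.7, 0.3, 0.1])
auc = auc_roc(y_true, y_score)
aupr = average_precision(y_true, y_score)
```

On sparse data such as these benchmarks, AUPR penalizes highly ranked false positives more heavily than AUC does, which is consistent with the fingerprint-based AUPR values in the table being much lower than the corresponding AUC values.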
Performances of LPLNI and RLS-Kron based on the interaction profiles.
| Datasets | Features | Methods | AUC | AUPR |
|---|---|---|---|---|
| Es | Interaction profile | RLS-Kron | | 0.8850 |
| | | LPLNI | 0.9465 | **0.9051** |
| GPCRs | Interaction profile | RLS-Kron | 0.9470 | 0.7130 |
| | | LPLNI | **0.9683** | **0.9461** |
| ICs | Interaction profile | RLS-Kron | 0.9860 | 0.9270 |
| | | LPLNI | **0.9890** | **0.9658** |
| NRs | Interaction profile | RLS-Kron | 0.9060 | 0.6100 |
| | | LPLNI | **0.9532** | **0.9464** |
Bold type indicates the highest AUC and AUPR values. The following tables use the same convention.
Performances of LPLNI-II and other state-of-the-art methods.
| Datasets | Features | Methods | AUC | AUPR |
|---|---|---|---|---|
| Es | chem&gen&int | RLS-Kron | 0.9780 | |
| | chem&gen&int | NetLapRLS | | N.A. |
| | chem&int | LPLNI-II | 0.9700 | 0.9069 |
| GPCRs | chem&gen&int | RLS-Kron | 0.9540 | 0.7130 |
| | chem&gen&int | NetLapRLS | 0.9710 | N.A. |
| | chem&int | LPLNI-II | **0.9769** | **0.9469** |
| ICs | chem&gen&int | RLS-Kron | 0.9840 | 0.9430 |
| | chem&gen&int | NetLapRLS | 0.9860 | N.A. |
| | chem&int | LPLNI-II | **0.9947** | **0.9684** |
| NRs | chem&gen&int | RLS-Kron | 0.9220 | 0.6840 |
| | chem&gen&int | NetLapRLS | 0.8880 | N.A. |
| | chem&int | LPLNI-II | **0.9919** | **0.9492** |
N.A.: not available. chem, gen, and int are abbreviations for chemical structure, genomic sequence, and the interaction profile, respectively.
The top 10 newly predicted interactions on the Es dataset.
| Rank | Pair | Description | Confirmed? |
|---|---|---|---|
| 1 | D00574 | Aminoglutethimide (USP/INN) | |
| | hsa1589 | cytochrome P450, family 21, subfamily A, polypeptide 2 | |
| 2 | D00437 | Nifedipine (JP15/USP/INN) | Yes |
| | hsa1559 | cytochrome P450, family 2, subfamily C, polypeptide 9 | |
| 3 | D00542 | Halothane (JP15/USP/INN) | Yes |
| | hsa1571 | cytochrome P450, family 2, subfamily E, polypeptide 1 | |
| 4 | D00410 | Metyrapone (JP15/USP/INN) | |
| | hsa1583 | cytochrome P450, family 11, subfamily A, polypeptide 1 | |
| 5 | D00139 | Methoxsalen (JP15/USP) | Yes |
| | hsa1543 | cytochrome P450, family 1, subfamily A, polypeptide 1 | |
| 6 | D00437 | Nifedipine (JP15/USP/INN) | |
| | hsa1585 | cytochrome P450, family 11, subfamily B, polypeptide 2 | |
| 7 | D00691 | Diprophylline (JAN/INN) | |
| | hsa8654 | phosphodiesterase 5A, cGMP-specific | |
| 8 | D00691 | Diprophylline (JAN/INN) | |
| | hsa5152 | phosphodiesterase 9A | |
| 9 | D00691 | Diprophylline (JAN/INN) | Yes |
| | hsa5150 | phosphodiesterase 7A | |
| 10 | D00691 | Diprophylline (JAN/INN) | |
| | hsa50940 | Peptidyl-prolyl cis-trans isomerase A | |
Statistics of four drug-target interaction datasets.
| Datasets | Drugs | Targets | Known interactions | Avg. targets per drug | Avg. drugs per target | Sparsity |
|---|---|---|---|---|---|---|
| Es | 445 | 664 | 2926 | 6.5753 | 4.4066 | 0.0099 |
| GPCRs | 223 | 95 | 635 | 2.8475 | 6.6842 | 0.0299 |
| ICs | 210 | 204 | 1476 | 7.0286 | 7.2353 | 0.0345 |
| NRs | 54 | 26 | 90 | 1.6667 | 3.4615 | 0.0641 |
The columns give, in order, the number of drugs, the number of targets, the number of known interactions, the average number of targets for each drug, and the average number of drugs for each target. Sparsity is the number of known interactions divided by the number of all possible drug-target pairs.
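The derived columns of the statistics table follow directly from the three counts. The short sketch below recomputes them; the counts are copied from the table, while the function and variable names are hypothetical.

```python
# Counts copied from the statistics table: (drugs, targets, known interactions).
datasets = {
    "Es": (445, 664, 2926),
    "GPCRs": (223, 95, 635),
    "ICs": (210, 204, 1476),
    "NRs": (54, 26, 90),
}

def summarize(n_drugs, n_targets, n_interactions):
    """Return (avg targets per drug, avg drugs per target, sparsity)."""
    avg_targets_per_drug = n_interactions / n_drugs
    avg_drugs_per_target = n_interactions / n_targets
    # Sparsity: known interactions over all possible drug-target pairs.
    sparsity = n_interactions / (n_drugs * n_targets)
    return avg_targets_per_drug, avg_drugs_per_target, sparsity

stats = {name: summarize(*counts) for name, counts in datasets.items()}
for name, (per_drug, per_target, sparsity) in stats.items():
    print(f"{name}: {per_drug:.4f} {per_target:.4f} {sparsity:.4f}")
```

The recomputed values agree with the tabulated ones to within about 1e-4. Every dataset has fewer than 7% of its possible pairs labeled as interacting, which is why the precision-recall view (AUPR) is reported alongside AUC.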
Figure 2. A drug-target interaction network and interaction profiles of drugs.
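To make the notion of an interaction profile concrete, here is a minimal sketch that assumes the network is stored as a binary drug-by-target matrix, so that each drug's profile is its row. The matrix name `Y` and the toy network of three drugs and four targets are hypothetical.

```python
import numpy as np

# Hypothetical toy network: 3 drugs (rows) x 4 targets (columns).
# Y[i, j] = 1 means drug i is known to interact with target j.
known_pairs = [(0, 0), (0, 2), (1, 2), (2, 1), (2, 3)]
Y = np.zeros((3, 4), dtype=int)
for drug, target in known_pairs:
    Y[drug, target] = 1

# The interaction profile of a drug is its row of the matrix.
drug_profiles = {f"drug_{i}": Y[i].tolist() for i in range(Y.shape[0])}
print(drug_profiles)
```

Under this representation, drugs that share targets have overlapping profiles, which is what allows the profiles to serve as a feature space for drug-drug similarity.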
Descriptions of nine fingerprints.
| Fingerprints | Descriptions |
|---|---|
| Daylight | Daylight fingerprints based on hashing molecular subgraphs |
| EState | This fingerprinter generates 79-bit fingerprints using the E-State fragments |
| Extended | These fingerprints extend the CDK Fingerprinter with additional bits describing ring features |
| Graph Only | Specialized version of the CDK Fingerprinter that does not take bond orders into account |
| Hybridization | This fingerprinter takes into account SP2 hybridization states |
| Klekota-Roth | This fingerprinter encodes the presence of 4860 substructures |
| MACCS | This fingerprinter generates 166-bit MACCS keys |
| Pubchem | These fingerprints are of the structural key type, of length 881 |
| Substructure | The fingerprint currently supports 307 substructures |
Figure 3. Procedure for calculating linear neighborhood similarity.
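The similarity calculation and the propagation step described in the abstract can be sketched roughly as follows. This is a minimal illustration under assumptions that this record does not confirm: each drug is reconstructed from its nearest neighbors with non-negative weights that sum to one (solved here by projected gradient descent), and labels are propagated with the standard closed form F = (1 - alpha)(I - alpha W)^{-1} Y. The function names, the parameters `n_neighbors`, `n_iter`, and `alpha`, and the random toy data are all hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def linear_neighborhood_similarity(X, n_neighbors, n_iter=200):
    """Row i holds the weights that reconstruct drug i from its neighbors.

    Assumed objective: minimize ||x_i - sum_j w_ij x_j||^2 subject to
    w_ij >= 0 and sum_j w_ij = 1 over the n_neighbors nearest drugs.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    W = np.zeros((n, n))
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a drug is never its own neighbor
    for i in range(n):
        nbrs = np.argsort(d2[i])[:n_neighbors]
        Z = X[nbrs] - X[i]   # neighbors centered on x_i
        G = Z @ Z.T          # with sum(w) = 1 the objective equals w^T G w
        step = 1.0 / (2.0 * np.linalg.norm(G, 2) + 1e-12)
        w = np.full(n_neighbors, 1.0 / n_neighbors)
        for _ in range(n_iter):
            w = project_to_simplex(w - step * 2.0 * (G @ w))
        W[i, nbrs] = w
    return W

def label_propagation(W, Y, alpha=0.5):
    """Fixed point of F <- alpha * W @ F + (1 - alpha) * Y (assumed update)."""
    n = W.shape[0]
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * W, Y)

# Hypothetical toy data: 6 drugs, 12-bit fingerprints, 4 targets.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(6, 12))
Y = rng.integers(0, 2, size=(6, 4))
W = linear_neighborhood_similarity(X, n_neighbors=3)
scores = label_propagation(W, Y, alpha=0.5)
```

Because each row of `W` is non-negative and sums to one, the matrix `I - alpha * W` is invertible for any `alpha` below one, and the propagated scores stay between 0 and 1. Unobserved pairs can then be ranked by their entries in `scores`.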