Mahendra Awale, Jean-Louis Reymond.
Abstract
BACKGROUND: Several recently reported web-based tools predict the possible targets of a small molecule by similarity to compounds of known bioactivity using molecular fingerprints (fps). However, predictions in each case rely on similarities computed from only one or two fps. Considering that structural similarity, and therefore the predicted targets, strongly depend on the method used for comparison, it would be highly desirable to predict targets using a broader set of fps simultaneously.
Keywords: Drug–target interactions; Molecular fingerprints; Polypharmacology; Target prediction
Year: 2017 PMID: 28270862 PMCID: PMC5319934 DOI: 10.1186/s13321-017-0199-x
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
Publicly accessible web-based target prediction tools
| Website | Similarity method | Database | Ref. |
|---|---|---|---|
| | 10 different fingerprints | ChEMBL 21 | This work |
| | Multilevel neighbourhoods of atoms (MNA) descriptors | WDI and ACD | |
| | Docking | PDTD | |
| | ECfp4 | ChEMBL 16, WOMBAT, MDDR and StarLite | |
| | Receptor-based pharmacophore models | TargetBank, DrugBank, BindingDB, PDTD | |
| | Docking | PDB, DrugBank | |
| | ECfp6, ECfp4 and Openbabel FP2 | ChEMBL 11 and PubChem bioassay | |
| | Openbabel FP2, MACCS, SHAFT and USR | ChEMBL 14, BindingDB, DrugBank, KEGG and PDB | |
| | Circular fingerprint FCFP | STITCH | |
| | CATS and MOE physicochemical descriptors | COBRA | |
| | Openbabel FP2 and Electroshape descriptors | ChEMBL 16 | |
| | ECfp4 | ChEMBL, SuperTarget and BindingDB | |
| | ECfp4 | BindingDB | |
| | Sfp | ChEMBL 14, BindingDB, DrugBank, PharmGKB, PubChem bioassay, WOMBAT, IUPHAR, CTD and STITCH | |
Molecular fingerprints used for target prediction with PPB
| Name | Description | Ref. |
|---|---|---|
| APfp | 21-D atom-pair fingerprint, perceives molecular shape | |
| Xfp | 55-D atom category extended atom-pair fingerprint, perceives pharmacophores | |
| MQN | 42-D Molecular Quantum Numbers, scalar fingerprint counting atoms, bonds, polarity and ring features, perceives constitution, topology and molecular shape | |
| SMIfp | 34-D scalar fingerprint counting occurrences of characters in SMILES, perceives rings, aromaticity and polarity | |
| Sfp | 1024-D binary Daylight-type substructure fingerprint, perceives detailed substructures | |
| ECfp4 | 1024-D binary circular extended connectivity fingerprint, perceives detailed substructures and pharmacophores | |
| Ffp1 | Fusion fingerprint, Xfp + SMIfp + Sfp | This work |
| Ffp2 | Fusion fingerprint, Xfp + MQN + SMIfp | This work |
| Ffp3 | Fusion fingerprint, Xfp + SMIfp + Sfp + ECfp4 | This work |
| Ffp4 | Fusion fingerprint, Xfp + MQN + SMIfp + Sfp + ECfp4 | This work |
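
To make the scalar fingerprints and their comparison more concrete, the following is a minimal sketch of a SMILES character-count fingerprint compared by city block distance. It is not the published SMIfp: the `ALPHABET` list, the function names and the idea that a fusion fingerprint sums scaled per-fingerprint distances are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical character alphabet (assumption). The published SMIfp counts
# its own set of 34 SMILES characters, which is not reproduced here.
ALPHABET = ["C", "c", "N", "n", "O", "o", "=", "#", "(", "1", "2"]


def char_count_fp(smiles):
    """Count how often each character of ALPHABET occurs in a SMILES string."""
    counts = Counter(smiles)
    return [counts[ch] for ch in ALPHABET]


def city_block(fp_a, fp_b):
    """City block (Manhattan) distance between two equal-length count vectors."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    return sum(abs(a - b) for a, b in zip(fp_a, fp_b))


def fused_distance(distances, scales):
    """Assumed fusion rule: sum of per-fingerprint distances, each multiplied
    by its scaling factor."""
    return sum(d * s for d, s in zip(distances, scales))


phenol = char_count_fp("c1ccccc1O")
aniline = char_count_fp("c1ccccc1N")
print(city_block(phenol, aniline))  # 2: one O count and one N count differ
```

Counting single characters is a deliberate simplification: a two-letter symbol such as `Cl` would be counted as `C` plus an ignored `l`, so a real implementation needs a proper SMILES tokenizer.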
Fig. 1 Overview of the data used for constructing PPB. Distribution of a target type as defined in ChEMBL and b source of targets. c Distribution of targets by number of associated bioactive compounds. d Histogram of city block distances (log scale) calculated for 50 million random pairs of compounds from ChEMBL 21 using six molecular fingerprints. e APfp, MQN, SMIfp, Sfp and ECfp4 fingerprints were scaled with respect to Xfp to adjust to the value of the most frequent distance. Scaling factors are shown in parentheses. f, g Enrichment of 40 sets of DUD actives from the corresponding decoy sets by six different fingerprints (APfp, Xfp, MQN, SMIfp, Sfp and ECfp4) and four similarity fusion methods (Ffp1–4). City block distance was used as the sorting function. Data are represented as the average of f area under the ROC curve and g enrichment factor at 1% of the screened database for 40 targets from DUD. h, i Example of p value calculation. h Observed (red) and fitted (black) random distance distributions for the muscarinic acetylcholine receptor M1 (CHRM1, CHEMBL216) in MQN fingerprint space. City block distances were calculated for 1788 ligands of CHRM1 with respect to random compounds from the ZINC database. A negative binomial distribution was used for curve fitting. i Cumulative density plot indicating the area under the fitted curve in h
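
The two enrichment measures named in panels f and g can be sketched as follows for a single target. This is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, labels are 1 for actives and 0 for decoys, compounds are ranked by ascending distance to the query, and ties in distance are ignored.

```python
def rank_by_distance(distances, labels):
    """Return the labels sorted by ascending distance (most similar first)."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    return [labels[i] for i in order]


def enrichment_factor(ranked_labels, fraction=0.01):
    """Active rate in the top `fraction` of the ranked list divided by the
    active rate in the whole list."""
    n_total = len(ranked_labels)
    n_actives = sum(ranked_labels)
    n_top = max(1, int(round(n_total * fraction)))
    hits_top = sum(ranked_labels[:n_top])
    return (hits_top / n_top) / (n_actives / n_total)


def roc_auc(ranked_labels):
    """Area under the ROC curve, computed as the fraction of (active, decoy)
    pairs in which the active is ranked ahead of the decoy."""
    n_actives = sum(ranked_labels)
    n_decoys = len(ranked_labels) - n_actives
    decoys_seen = 0
    pairs_won = 0
    for label in ranked_labels:
        if label:
            pairs_won += n_decoys - decoys_seen
        else:
            decoys_seen += 1
    return pairs_won / (n_actives * n_decoys)


# Toy ranking: 3 actives and 7 decoys, already sorted by distance.
ranked = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(roc_auc(ranked))                 # 20/21
print(enrichment_factor(ranked, 0.2))  # 10/3
```

With only ten compounds the example uses the top 20% rather than the top 1%; on a real screening set the default `fraction=0.01` matches the cutoff stated in the caption.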
Fig. 2 PPB web-browser. a Result panel displaying the PPB predicted targets for the drug metaraminol. b List of molecules for the target selected in the result panel (row 5 ADRA1A)
Fig. 3 Distribution of the 670 ChEMBL compounds labelled as drugs in clinical trials or as approved drugs and used in the validation study, according to number of associated targets
Fig. 4 Recovery statistics of targets of 670 drugs by the various fingerprints and the combined method used in PPB. The bar plots show the average a fraction of known targets found, b number of targets predicted and c hit rate calculated for 670 drugs (see Additional file 1: Fig. S1). For each drug, the analysis was performed at three different levels considering all targets (grey), targets with p value of ≤0.01 (green) and targets with p value of >0.01 (red). d Success rate for finding at least 1 known target of a drug among the top 5 predicted targets by each method. e Average Tanimoto coefficient of the binary substructure fingerprint between the query and the most similar bioactive ligand associated with correctly predicted known targets in the results list. f–h Percentages of targets of one fingerprint found by another fingerprint and percentages of targets unique to this fingerprint, considering the three different target lists mentioned above
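
The per-drug quantities plotted in panels a to d could be computed along the following lines. This is a sketch under assumptions: the function names, the input layout, the definition of hit rate as correctly predicted targets divided by predicted targets, and the ranking of predictions by ascending p value are not taken from the source, and the target identifiers are made up.

```python
def recovery_stats(predictions, known_targets, p_max=None):
    """Return (fraction of known targets found, number of targets predicted,
    hit rate) for one drug.

    predictions   -- list of (target_id, p_value) pairs
    known_targets -- set of target ids known for the drug
    p_max         -- if given, keep only predictions with p_value <= p_max
    """
    if p_max is None:
        predicted = {t for t, _ in predictions}
    else:
        predicted = {t for t, p in predictions if p <= p_max}
    found = predicted & known_targets
    fraction_found = len(found) / len(known_targets) if known_targets else 0.0
    hit_rate = len(found) / len(predicted) if predicted else 0.0
    return fraction_found, len(predicted), hit_rate


def success_in_top_n(predictions, known_targets, n=5):
    """True if any of the n predictions with the smallest p values is known."""
    top = sorted(predictions, key=lambda tp: tp[1])[:n]
    return any(t in known_targets for t, _ in top)


# Made-up predictions for one hypothetical drug.
preds = [("T1", 0.001), ("T2", 0.004), ("T3", 0.2), ("T4", 0.03)]
known = {"T1", "T5"}
print(recovery_stats(preds, known))              # (0.5, 4, 0.25)
print(recovery_stats(preds, known, p_max=0.01))  # (0.5, 2, 0.5)
print(success_in_top_n(preds, known))            # True
```

Averaging these tuples over all drugs would give one bar per fingerprint and per p value level.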
Fig. 5 Prediction of targets of CIS22a using PPB and comparison with other web-based tools. In the case of ChEMBLPred, target prediction models (10 μM) were downloaded from the ChEMBL website and implemented locally using RDKit and Python. a Structure of CIS22a. b Confirmed side targets of CIS22a. c Targets detected with no significant binding affinity for CIS22a. Targets which were found and not found by the fingerprints used in PPB and by external web-based tools are indicated with green and black dots, respectively. For external web-based tools, at most the top 30 predicted targets were considered. The prediction performance of the ChemProt, HitPick, TarPred, SPiDER, PASS, TarFishDock and Drar web-based tools listed in Table 1 is not shown due to technical failures or lack of applicability in this context. d Structure, ChEMBL ID and Tanimoto coefficient for the bioactive compounds which linked the targets to CIS22a, indicated with the name of the fingerprints in parentheses. Target full names: Adrenergic α1A (ADRA1A) and α2A (ADRA2A) receptor, Adrenergic β1 (ADRB1) and β2 (ADRB2) receptor, Cannabinoid 1 (CB1) and 2 (CB2) receptor, Voltage dependent L- (CACNA1S) and N-type (CACNA1B) Ca2+ channel, Cholinergic muscarinic receptor 1 (CHRM1) and 2 (CHRM2), Dopamine receptor subtypes D1-4 (DRD1-4), Gamma aminobutyric acid receptor (GABA), 5-hydroxytryptamine receptor 3 (5-HT3), 5-hydroxytryptamine receptor 1A (HTR1A), 1B (HTR1B), 2A (HTR2A) and 2B (HTR2B), Voltage gated potassium channel subfamily H member 2 (HERG), N-methyl-d-aspartate receptor (NMDA), µ opioid receptor (OPRM), voltage gated Na+ channel (SCN2A)
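
For readers unfamiliar with the Tanimoto coefficient reported in panel d, a minimal sketch for binary fingerprints is given below. It uses plain Python sets of on-bit indices rather than any cheminformatics library; the function name and the zero returned for two empty fingerprints are assumptions made for this example.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two binary fingerprints given as collections
    of on-bit indices: shared on bits divided by on bits set in either."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # assumed convention when both fingerprints are empty
    return len(a & b) / union


query_bits = {1, 2, 3}
ligand_bits = {2, 3, 4}
print(tanimoto(query_bits, ligand_bits))  # 0.5 (2 shared bits, 4 in total)
```

A value of 1.0 means the two fingerprints set exactly the same bits, and 0.0 means they share none.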