Literature DB >> 31725865

PTS: a pharmaceutical target seeker.

Peng Ding1, Xin Yan1, Zhihong Liu1, Jiewen Du1, Yunfei Du2, Yutong Lu2, Di Wu3, Yuehua Xu1, Huihao Zhou1, Qiong Gu1, Jun Xu1.   

Abstract

Identifying protein targets for a bioactive compound is critical in drug discovery. Molecular similarity is a main approach to fish drug targets, and is based upon an axiom that similar compounds may have the same targets. The molecular structural similarity of a compound and the ligand of a known target can be gauged in topological (2D), steric (3D) or static (pharmacophoric) metric. The topologic metric is fast, but unable to represent steric and static profile of a bioactive compound. Steric and static metrics reflect the shape properties of a compound if its structure were experimentally obtained, and could be unreliable if they were based upon the putative conformation data. In this paper, we report a pharmaceutical target seeker (PTS), which searches protein targets for a bioactive compound based upon the static and steric shape comparison by comparing a compound structure against the experimental ligand structure. Especially, the crystal structures of active compounds were taken into similarity calculation and the predicted targets can be filtered according to multi activity thresholds. PTS has a pharmaceutical target database that contains approximately 250 000 ligands annotated with about 2300 protein targets. A visualization tool is provided for a user to examine the result. Database URL: http://www.rcdd.org.cn/PTS.
© The Author(s) 2017. Published by Oxford University Press.

Entities:  

Year:  2017        PMID: 31725865      PMCID: PMC5750839          DOI: 10.1093/database/bax095

Source DB:  PubMed          Journal:  Database (Oxford)        ISSN: 1758-0463            Impact factor:   3.451


Introduction

For decades, the paradigm of drug discovery and development has been one-drug-for-one-target (1). Recent advances in systems biology (2) and chemical biology demonstrate that existing drugs can interact with multiple targets (3, 4). However, multi-target interactions are either unknown or insufficiently understood in most cases. There are increasing needs to predict drug targets for an agent due to growing number of bioactive compounds identified from phenotypic assays (5–7). The prediction has to be validated by experiments, such as structure biological approaches or proteomics. The in silico approaches can significantly reduce the costs and improve the performance of the experimental approaches for drug target fishing. A drug target prediction method can be categorized into structure-based or ligand-based method. INDOCK (8) and TarFisDock (9) are typical structure-based target fishing tools using molecular docking algorithms, which rely on the target structure availability and the structure diversity of the binding pocket. However, a ligand-based target fishing approach uses the ligand-compound similarity based on topological structures (fingerprints) (10, 11), molecular shapes, pharmacophores (12) or compound activity profiles (13). The ligand-based target fishing approaches are being adopted due to the increasing availability of bioassay data (14–16). SEA (17) and SuperPred (18) are typical ligand-based approaches that use ligand databases and compound topological (2D) similarity measurements. Other methods, such as Chemmapper (19), Superimpose (20) and wwLigCSRre (21) use 3D structure similarity metric to predict protein targets. 2D and 3D similarity measurements are complimentary, and 3D similarity measurements seem capable of picking novel chemotypes (22) if the template structures were experimentally obtained. In this work, we have implemented a pharmaceutical target seeker (PTS), which uses the experimental 3D structures of ligands with known targets to calculate the similarity of the ligand and a compound. For those ligands for which experimental structure data are not available, their energy-minimized conformations are generated for the 3D similarity calculations. The 3D similarity search engine is Weighted Gaussian Algorithm (WEGA) (23), which can take steric and pharmacophoric profile into account. The user can rule out impossible targets by setting activity thresholds in order to expedite the target fishing process. PTS contains approximately 250 000 ligands annotated with 2300 protein targets.

Materials and methods

Data preparation

The data of bioactive compounds and their targets were collected from public databases. Target data were derived from therapeutic target database (TTD version 2015) (24) and reference (25). Through UniProt ID, ligand data and their relations with targets were extracted from UniProt (26), ChEMBL20 (27) and BindingDB (28, 29), PDBbind (version 2014) (30–32) and RCSB PDB databases. The data were pre-processed with the following steps: This resulted in 266 866 ligands associated with 2298 protein targets, 537 095 bioactivity data points, 4391 crystal structures and 16 590 related articles in the PTS built-in database (Table 1). Among the targets, 14% of them have drugs in the market, 41% of them have drug candidates under clinic trails, 40% of them have ligands under the investigations and 5% of them have compounds that were discontinued for pharmaceutical studies.
Table 1.

Statistics data of PTS

DataNumberSourceNumberAvailability
(extracted)(original)
Target2298TTD (2015)2875bidd.nus.edu.sg/group/cjttd/
Compound266 866ChEMBL201 463 270www.ebi.ac.uk/chembl/
PDB4391PDBbind (2014)10 656www.pdbbind-cn.org/
Activity record537 095ChEMBL2013 520 737www.ebi.ac.uk/chembl/
Reference16 590ChEMBL2059 610www.ebi.ac.uk/chembl/
removing obsolete UniProt IDs from TTD target data; removing counter ion moieties from bioactive ligands; removing compounds from ChEMBL20 data if their activity (IC50/Ki/Kd) values are greater than 50 μM; removing small compounds (heavy atoms < 6) and large compounds (MW > 1000 Da). Statistics data of PTS

Similarity algorithm

The target fishing process is based upon an axiom that similar compounds may have the same targets. An in-house algorithm, WEGA, is used to compute the steric and static similarity of a ligand-compound pair. WEGA is based on the first order Gaussian approximation, which simplifies the shape density functions of the molecules by avoiding expensive higher order terms calculation. WEGA offers three similarity calculation methods: 1) shape matching, which is only the molecular volumetric overlay, 2) feature matching, which is the pharmacophore mapping of molecule pair, 3) combination matching, which integrates the advantage of the above two aspects and is the most precise approach. The detailed method of WEAG is described in reference (23).

Webserver implementation

PTS uses a browser and server framework. Client interface was implemented in HTML and Javascript. The back-end is implemented in Golang language and MySQL database system. The molecule editor and chemical structure viewers are supported with Marvin JS and ChemDoodle web component. All tools have been summarized in Table 2.
Table 2.

The tools used for PTS implementation

ToolUseAvailability
Marvin JSChemical structure inputmarvinjs-demo.chemaxon.com/latest/
ChemDoodle WebStructure draw and 3D displayweb.chemdoodle.com/
Open BabelChemical file format conversionopenbabel.org/wiki/Main_Page
Discovery Studio3D conformation generationaccelrys.com/products/collaborative-science/biovia-discovery-studio/
MySQLStorage databasewww.mysql.com/
JQueryForeground and background interactionjquery.com/
GolangWeb server languagegolang.org/
The tools used for PTS implementation

Results

Workflow

PTS provides an intuitive interface to predict small molecule protein targets. A user can input a query molecule by uploading a file (mol/SDF format) or drawing a chemical structure with its built-in chemical structure editor (Figure 1B). PTS will generate the possible 3D conformations for the query and, employ WEGA to compute the 3D similarities of the molecular conformations and the ligand structures in PTS ligand database (Figure 1C). A typical task of PTS takes about 30–60 min, depending on the flexibility of the query compound and calculation method assigned by users. Each user’s query is automatically assigned with a Job-ID that allows the user to receive and inspect the target prediction result (Figure 1D). The result page lists the predicted targets with their common names linked to UniProt (26) database if they are available. Targets are ranked according to their score with respect to the query molecule. Thus, the most possible target is placed at the top of the list. Sometimes, multiple targets may be inferred for the query compound based on a single similar compound. If so, the order of the target presented in the table has no specific meaning regarding to prediction significance. Besides, predicted targets can be tailored by the activity threshold of a template ligand (such as 10, 20 or 50 µM) (Figure 1D).
Figure 1.

PTS working protocol. (A) Main menu, (B) chemical structure editor, (C) ligand structure from PTS builtin ligand database, (D) predicted target list, (E) target profile and (F) query molecules docked in the binding pocket of a predicted target.

PTS working protocol. (A) Main menu, (B) chemical structure editor, (C) ligand structure from PTS builtin ligand database, (D) predicted target list, (E) target profile and (F) query molecules docked in the binding pocket of a predicted target. By clicking on the check button, a user can inspect the predicted template ligands for a query molecule (Figure 1D). In the result page, target data fields include UniProt ID, CHEMBL ID, TTD ID, type, organism, gene and biological functions (Figure 1E). The template ligands are ranked based upon their similarity values to the query molecule. The resulting targets can be tailored by setting the 3D similarity values to the query molecule (default threshold is 0.6, and ligands show low similarity with the query if below this threshold). For the template ligands with experimental structure data, the query molecular structure is superimposed with the ligands in the binding pocket of the predicted target (Figure 1F), and downloadable.

Case study 1: seeking targets for Afatinib

Afatinib is an irreversible kinase inhibitor targeting epidermal growth factor receptor (EGFR) and inhibiting tyrosine kinase auto-phosphorylation (33) to stop tumor cells growth. Afatinib is available for the first-line treatment of patient with metastatic non-small cell lung cancer. The targets predicted by PTS are listed in Table 3.
Table 3.

The predicted targets for Afatinib

RankUniProt IDTarget nameOrganismScore
1P00533EGFRHomo sapiens (Human)0.74
2P25440Bromodomain-containing protein 2Homo sapiens (Human)0.72
3Q15059Bromodomain-containing protein 3Homo sapiens (Human)0.72
4O60885Bromodomain-containing protein 4Homo sapiens (Human)0.72
5P349695-hydroxytryptamine 7 receptorHomo sapiens (Human)0.72
6Q07820Induced myeloid leukemia cell differentiation protein Mcl-1Homo sapiens (Human)0.72
7P09917mRNA of human 5-lipoxygenaseHomo sapiens (Human)0.72
8P17948Vascular endothelial growth factor receptor 1Homo sapiens (Human)0.72
9P0825372 kDa type IV collagenaseHomo sapiens (Human)0.71
10P24557Thromboxane-A synthasenil0.71
The predicted targets for Afatinib Experimental data indicate that Afatinib is an EGFR inhibitor (IC50 = 1 nM) (34). EGFR (UniProt ID: P00533) is ranked at the top of the predicted target list by PTS (Table 3). The predicted Afatinib binding poses are aligned with the native EGFR ligands as shown in Figure 2. PTS also predicted other potential targets, however, there are no evidences showing that Afatinib is strongly binding with them. The data for the alignments of Afatinib and the native ligands of these targets can be found in Supplementary Figure S1.
Figure 2.

Afatinib (yellow) aligned with known EGFR inhibitor CHEMBL484108 (A, red) and CHEMBL482489 (B, red).

Afatinib (yellow) aligned with known EGFR inhibitor CHEMBL484108 (A, red) and CHEMBL482489 (B, red).

Case study 2: seeking targets for Tamoxifen

Estrogen receptors (ERs) are well-known targets for Tamoxifen (35–37). The targets predicted by PTS for tamoxifen are listed in Table 2. ERα (UniProt ID: P03372) and ERβ (UniProt ID: Q92731) are ranked as the top-1 and top-6. The predicted Tamoxifen binding poses aligned with the native ERα and ERβ ligands are depicted in Figure 3.
Figure 3.

Tamoxifen (yellow) aligned with ERα (A, red, PDB Code: 1XQC) and ERβ (B, red, PDB Code: 2QTU) selective ligands in binding pocket.

Tamoxifen (yellow) aligned with ERα (A, red, PDB Code: 1XQC) and ERβ (B, red, PDB Code: 2QTU) selective ligands in binding pocket. In Table 4, PTS places top-10 targets. After examing the other predicted targets besides ERs, we find that CYP2D6 (38), Prostaglandin synthase (40), 3-β-hydroxysteroid-δ(8),δ(7)-isomerase (39) and Collagenase 3 (41, 42) has been experimentally reported as off-targets. In fact, 38 proteins have been experimentally validated to interact with tamoxifen or 4 H-tamoxifen (Supplementary Table S1). In the top-100 PTS predicted targets for Tamoxifen, 15% proteins are experimentally confirmed off-targets, which can be found in Supplementary Table S2.
Table 4.

Experimentally proved Tamoxifen targets and off-targets predicted by PTS

RankUniProt IDTarget nameOrganismScore
1P03372ER alphaHomo sapiens (Human)0.82
2P040353-hydroxy-3-methylglutaryl-coenzyme A reductaseHomo sapiens (Human)0.82
3P08684mRNA of CYP3A4Homo sapiens (Human)0.82
4P23458JAK1Homo sapiens (Human)0.81
5P41145Kappa-type opioid receptorHomo sapiens (Human)0.81
6Q92731ER betaHomo sapiens (Human)0.81
7O14965Aurora kinase AHomo sapiens (Human)0.80
8Q96GD4Serine/threonine protein kinase 12Homo sapiens (Human)0.80
9P29597TYK2Homo sapiens (Human)0.80
10P52333Tyrosine-protein kinase JAK3Homo sapiens (Human)0.80
11P10635Cytochrome P450 2D6Homo sapiens (Human)0.80
Experimentally proved Tamoxifen targets and off-targets predicted by PTS

Case study 3: validating a target for Chlorprothixene

Chlorprothixene is an old antipsychotic drug. It antagonizes dopaminergic D1 (UniProt ID: P21728) and D2 (UniProt ID: P14416) receptors in the brain to exert its antipsychotic effect (43). Chlorprothixene also antagonizes histamine H1 receptor (44). But, there is no direct evidence to show Chlorprothixene interacts with H1. PTS predicts H1 receptor is the target of Chlorprothixene, the results are listed in Supplementary Table S3 (Figure 4). Sridhar R. Vasudevan proved experimentally that Chlorprothixene binds with H1 receptor and selectively inhibit histamine-induced calcium release with IC50 of 1 nM.
Figure 4.

Chlorprothixene (red) aligned with H1 receptor antagonist promazine (A, yellow) and CHEMBL363581 (B, yellow).

Chlorprothixene (red) aligned with H1 receptor antagonist promazine (A, yellow) and CHEMBL363581 (B, yellow).

Case study 4: predicting potential side-effects for Fluoxetine

hERG (the human Ether-à-go-go-Related Gene, UniProt ID: Q12809) is known as a potassium (K+) ion channel mediating the repolarizing IKr current in the cardiac action potential. A drug that potentially interacts with hERG can result in lethal side-effect (45). Fluoxetine is a selective serotonin reuptake inhibitor for treating depressive disorder. The potential targets PTS predicted for Fluoxetine are listed in Table 5.
Table 5.

The predicted target profile for fluoxetine

RankUniProt IDTarget nameOrganismScore
1P35354Prostaglandin G/H synthase 2Homo sapiens (Human)0.86
2P23219Prostaglandin G/H synthase 1Homo sapiens (Human)0.86
3Q01959Sodium-dependent dopamine transporterHomo sapiens (Human)0.85
4P31645Sodium-dependent serotonin transporterHomo sapiens (Human)0.85
5P23975Sodium-dependent noradrenaline transporterHomo sapiens (Human)0.85
6P14416D(2) dopamine receptorHomo sapiens (Human)0.84
7P35462D(3) dopamine receptorHomo sapiens (Human)0.84
8P11229Muscarinic receptorHomo sapiens (Human)0.83
9P35367Histamine H1 receptorHomo sapiens (Human)0.83
10Q12809Potassium channel H-ERGHomo sapiens (Human)0.83
The predicted target profile for fluoxetine Sodium-dependent serotonin transporter (UniProt ID: P31645) is the primary target of Fluoxetine, which is ranked at fourth in the list. By inspect the potential target list, we find the top-10 target is hERG. Further literature studies reveal that Fluoxetine is experimentally proved as hERG inhibitor (IC50 = 3.1 μM) (46).

Discussion

PTS predicts targets for a compound through superimposing the compound structure onto the 3D ligand structures of putative targets. This approach considers pharmacophore shape similarity that 2D approaches cannot do. Other 3D approaches use molecular docking techniques, while PTS does not employ ligand-receptor docking techniques and can still produce results where the receptor structure data are not available. Both PTS and SwissTargetPrediction are web-based target fishing tools. The four testing cases were tested on both tools, which produced similar results. Additionally, small scale of structurally diverse drugs that targeting four classes of biological systems (GPCR, Ion channel, Nuclear receptor and kinase) were extracted to test the success rate and applicability of PTS. Averagely, at least one known targets of each drug is found among the top 20 predicted targets for 70% of the ligands (Supplementary Material). The advantage of PTS is able to superimpose the query molecule into the binding pockets of the putative targets when the experimental structure data are available. However, there are limits for PTS, too. The activity cliff issue, i.e. a subtle change in the chemical structure cause the great loss of bioactivity, may be a common concern for all ligand-based methods, including the 3D approaches. To date, PTS has received >500 queries from 11 countries in the world since 10 August 2016. It can be used for target identification, drug repurposing, toxic risk estimation and molecular interaction simulation pre-processing tool.

Funding

National Science Foundation of China (81473138, 81573310); Guangdong Frontier & Key Technology Innovation Program (2015B010109004); Guangdong Natural Science Foundation (2016A030310228); the Fundamental Research Funds for the Central Universities (2013HGCH0015); Tianhe-2 based biomedical health data application support platform (U1611261); the Fundamental Research Funds for the Central Universities (17LGJC23). Funding for open access charge: National Science Foundation of China (81473138), funded by the National Natural Science Foundation of China.

Supplementary data

Supplementary data are available at Database Online. Conflict of interest. None declared. Click here for additional data file.
  46 in total

Review 1.  Medicinal chemistry of hERG optimizations: Highlights and hang-ups.

Authors:  Craig Jamieson; Elizabeth M Moir; Zoran Rankovic; Grant Wishart
Journal:  J Med Chem       Date:  2006-08-24       Impact factor: 7.446

2.  Enhancing molecular shape comparison by weighted Gaussian functions.

Authors:  Xin Yan; Jiabo Li; Zhihong Liu; Minghao Zheng; Hu Ge; Jun Xu
Journal:  J Chem Inf Model       Date:  2013-07-25       Impact factor: 4.956

Review 3.  Target identification for small bioactive molecules: finding the needle in the haystack.

Authors:  Slava Ziegler; Verena Pries; Christian Hedberg; Herbert Waldmann
Journal:  Angew Chem Int Ed Engl       Date:  2013-02-18       Impact factor: 15.336

4.  Neuroleptic blockade of the effect of various neurotransmitter substances.

Authors:  B Fjalland; V Boeck
Journal:  Acta Pharmacol Toxicol (Copenh)       Date:  1978-03

5.  Predicting new molecular targets for known drugs.

Authors:  Michael J Keiser; Vincent Setola; John J Irwin; Christian Laggner; Atheir I Abbas; Sandra J Hufeisen; Niels H Jensen; Michael B Kuijer; Roberto C Matos; Thuy B Tran; Ryan Whaley; Richard A Glennon; Jérôme Hert; Kelan L H Thomas; Douglas D Edwards; Brian K Shoichet; Bryan L Roth
Journal:  Nature       Date:  2009-11-01       Impact factor: 49.962

6.  Relating protein pharmacology by ligand chemistry.

Authors:  Michael J Keiser; Bryan L Roth; Blaine N Armbruster; Paul Ernsberger; John J Irwin; Brian K Shoichet
Journal:  Nat Biotechnol       Date:  2007-02       Impact factor: 54.908

7.  Activity, assay and target data curation and quality in the ChEMBL database.

Authors:  George Papadatos; Anna Gaulton; Anne Hersey; John P Overington
Journal:  J Comput Aided Mol Des       Date:  2015-07-23       Impact factor: 3.686

8.  Searching and Navigating UniProt Databases.

Authors:  Sangya Pundir; Michele Magrane; Maria J Martin; Claire O'Donovan
Journal:  Curr Protoc Bioinformatics       Date:  2015-06-19

9.  Connecting proteins with drug-like compounds: Open source drug discovery workflows with BindingDB and KNIME.

Authors:  George Nicola; Michael R Berthold; Michael P Hedrick; Michael K Gilson
Journal:  Database (Oxford)       Date:  2015-09-16       Impact factor: 3.451

10.  In Silico target fishing: addressing a "Big Data" problem by ligand-based similarity rankings with data fusion.

Authors:  Xian Liu; Yuan Xu; Shanshan Li; Yulan Wang; Jianlong Peng; Cheng Luo; Xiaomin Luo; Mingyue Zheng; Kaixian Chen; Hualiang Jiang
Journal:  J Cheminform       Date:  2014-06-18       Impact factor: 5.514

View more
  1 in total

1.  Ganghuo Kanggan Decoction in Influenza: Integrating Network Pharmacology and In Vivo Pharmacological Evaluation.

Authors:  Yanni Lai; Qiong Zhang; Haishan Long; Tiantian Han; Geng Li; Shaofeng Zhan; Yiwei Li; Zonghui Li; Yong Jiang; Xiaohong Liu
Journal:  Front Pharmacol       Date:  2020-12-10       Impact factor: 5.810

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.