TarFisDock: a web server for identifying drug targets with docking approach.

Honglin Li, Zhenting Gao, Ling Kang, Hailei Zhang, Kun Yang, Kunqian Yu, Xiaomin Luo, Weiliang Zhu, Kaixian Chen, Jianhua Shen, Xicheng Wang, Hualiang Jiang.

Abstract

TarFisDock is a web-based tool for automating the procedure of searching for small molecule-protein interactions over a large repertoire of protein structures. It offers PDTD (potential drug target database), a target database containing 698 protein structures covering 15 therapeutic areas and a reverse ligand-protein docking program. In contrast to conventional ligand-protein docking, reverse ligand-protein docking aims to seek potential protein targets by screening an appropriate protein database. The input file of this web server is the small molecule to be tested, in standard mol2 format; TarFisDock then searches for possible binding proteins for the given small molecule by use of a docking approach. The ligand-protein interaction energy terms of the program DOCK are adopted for ranking the proteins. To test the reliability of the TarFisDock server, we searched the PDTD for putative binding proteins for vitamin E and 4H-tamoxifen. The top 2 and 10% candidates of vitamin E binding proteins identified by TarFisDock respectively cover 30 and 50% of reported targets verified or implicated by experiments; and 30 and 50% of experimentally confirmed targets for 4H-tamoxifen appear amongst the top 2 and 5% of the TarFisDock predicted candidates, respectively. Therefore, TarFisDock may be a useful tool for target identification, mechanism study of old drugs and probes discovered from natural products. TarFisDock and PDTD are available at http://www.dddc.ac.cn/tarfisdock/.

Year:  2006        PMID: 16844997      PMCID: PMC1538869          DOI: 10.1093/nar/gkl114

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Recent advances in the development of tools for docking small molecules to proteins, i.e. virtual screening, have demonstrated the efficiency of this approach for the discovery of potential lead compounds for drug development in the postgenomic era (1–3). Numerous docking programs (4–10) have been used to seek ligands which recognize the 3D structure of a given target obtained by X-ray crystallography, NMR spectroscopy or even by homology modeling [for a review comparing and evaluating docking tools see ref. (11)]. However, identification and validation of druggable targets from amongst thousands of candidate macromolecules is still a challenging task (12,13). A proteomic approach for identification of binding proteins for a given small molecule involves comparison of the protein expression profiles for a given cell or tissue in the presence or absence of the given molecule. This method has not proved very successful in target discovery because it is laborious and time-consuming (14). Thus an efficient computational method for identifying the targets of a small molecule that has been demonstrated experimentally to have an important biological activity would provide a tool of great potential value. An alternative approach that has shown promise in recent years is to use computational methods to find putative binding proteins for a given compound from either genomic or protein databases, and subsequently use experimental procedures to validate the computational result (15–18). One such computational approach, which is the reverse of docking a set of ligands into a given target, is to dock a compound with a known biological activity into the binding sites of all the 3D structures in a given protein database. Protein ‘hits’ so identified can then serve as potential candidates for experimental validation. Accordingly, this approach is referred to as reverse docking.
Herein, we present a web-based tool Target Fishing Dock (TarFisDock) for seeking potential binding proteins for a given ligand. It makes use of a ligand–protein reverse docking strategy to search out all possible binding proteins for a small molecule from the potential drug target database (PDTD). The small molecule might be a biologically active compound detected in a cell- or animal-based bioassay screen, a natural product or an existing drug whose molecular target(s) is (are) unknown. Thus, TarFisDock may serve as a valuable tool for identifying targets for a novel synthetic compound or for a newly isolated natural product, for a compound with known biological activity, or for an existing drug whose mechanism of action is unknown.

METHODS

Construction of the potential drug target database

TarFisDock requires a sufficient number of known protein structures covering a diverse range of drug targets. The target proteins collected in PDTD were selected from the literature (19–22), and from several online databases, such as DrugBank (23) and TTD (24). Only proteins with known 3D structures were deposited in PDTD, the Protein Data Bank (PDB) (25) being the major source of their coordinates. PDTD currently consists of 698 entries covering 371 drug targets. These drug targets may be categorized into 15 types, according to their therapeutic areas (20,22), as shown in Table 1. Because TarFisDock does not take into account protein flexibility, PDTD includes redundant entries for proteins known to be flexible. Thus, for example, there are seven entries for HIV-1 (Figure 1).
Table 1

Disease categories of drug targets in PDTD

(1) Synaptic And Neuroeffector Junctional Sites And Central Nervous System
(2) Inflammation
(3) Renal And Cardiovascular Functions
(4) Gastrointestinal Functions
(5) Uterine Motility
(6) Bacterial Infections
(7) Fungal Infections
(8) Viral Infections
(9) Parasitic Infectious Diseases
(10) Immunomodulation
(11) Blood And Blood-Forming Organs
(12) Neoplastic Diseases
(13) Hormones And Hormone Antagonists
(14) The Vitamins
(15) Undefined
Figure 1

An example of a PDTD query that returns 22 target records for ‘[HIV] DISEASE’.

Water molecules and complexed ligands were removed from the protein structures, after which hydrogen atoms were added and Kollman charges (26) were assigned, with the protonation state of the individual residues being taken into account during charge assignment. A mol2 file was then constructed for each protein. (The mol2 format (.mol2), developed for SYBYL by Tripos Inc., St Louis, USA, is a complete, portable ASCII representation of a SYBYL molecule that contains all the information needed to reconstruct the molecule.) The active site of each protein was defined as all residues within 6.5 Å of the bound ligand, and a sphere file for the active site was generated using the SPHGEN program (27). The PDB, mol2 and sphere files for each protein were stored in PDTD.
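The active-site definition above can be illustrated with a minimal Python sketch. It assumes that a residue belongs to the site if any of its atoms lies within 6.5 Å of any ligand atom; the text does not say whether the distance is measured per atom or per residue centre, so this is one plausible reading. The function name and the toy coordinates are hypothetical and are not taken from PDTD.

```python
import math


def residues_within_cutoff(protein_atoms, ligand_coords, cutoff=6.5):
    """Return the sorted residue identifiers that have at least one atom
    within `cutoff` angstroms of any ligand atom.

    protein_atoms: iterable of (residue_id, (x, y, z)) tuples
    ligand_coords: iterable of (x, y, z) tuples
    """
    ligand_coords = list(ligand_coords)
    selected = set()
    for residue_id, atom_xyz in protein_atoms:
        if residue_id in selected:
            continue  # residue already selected through another atom
        if any(math.dist(atom_xyz, lig_xyz) <= cutoff for lig_xyz in ligand_coords):
            selected.add(residue_id)
    return sorted(selected)


# Toy coordinates (hypothetical, for illustration only).
protein = [
    ("SER200", (1.0, 0.0, 0.0)),
    ("SER200", (9.0, 0.0, 0.0)),
    ("HIS440", (0.0, 6.0, 0.0)),
    ("TRP84", (0.0, 0.0, 7.0)),
]
ligand = [(0.0, 0.0, 0.0)]
print(residues_within_cutoff(protein, ligand))  # ['HIS440', 'SER200']
```

In this toy case TRP84 is excluded because its only atom is 7.0 Å from the ligand.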

Reverse docking procedure using TarFisDock

TarFisDock consists of two parts, a front-end web interface written in both PHP and HTML, with MySQL as database system, and a back-end tool for reverse docking. TarFisDock was developed on the basis of the widely used docking program, DOCK (version 4.0) (5,27). The reverse docking procedure is as follows: (i) TarFisDock either generates a protein target list according to the user's preference (see INPUT) or selects all the protein entries in the PDTD if the user intends to find a new target or targets for an active compound; (ii) TarFisDock docks a given small molecule into the possible binding sites of proteins in the target list, and the interaction energies between the small molecule and the proteins are calculated and recorded; (iii) TarFisDock analyzes the reverse docking result. In general, TarFisDock may output the top 2, 5 or 10% of the ranking list, from which the user may select protein candidates for further biological study. So far, TarFisDock has taken into account the flexibility of the small molecules, but has not yet taken into account protein flexibility. Putative binding proteins are selected by ranking the values of the interaction energy ($E_{\mathrm{inter}}$), which is composed of van der Waals and electrostatic interaction terms (Equation 1):

$$E_{\mathrm{inter}} = \sum_{i}^{\mathrm{lig}} \sum_{j}^{\mathrm{rec}} \left( \frac{A_{ij}}{r_{ij}^{a}} - \frac{B_{ij}}{r_{ij}^{b}} + 332.0\,\frac{q_{i} q_{j}}{D\, r_{ij}} \right) \qquad (1)$$

where the double sum runs over ligand atoms $i$ and receptor atoms $j$; $r_{ij}$ is the distance between atom $i$ in the ligand and atom $j$ in the putative receptor protein; $A_{ij}$ and $B_{ij}$ are van der Waals repulsion and attraction parameters, respectively; $a$ and $b$ are the van der Waals repulsion and attraction exponents, respectively; $q_{i}$ and $q_{j}$ are point charges on atoms $i$ and $j$; $D$ is the dielectric function; and 332.0 is the factor that converts the electrostatic energy into kcal/mol. The Amber force field (26) was used for the energy calculation.
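As an illustration of how the interaction energy described above could be evaluated, the following Python sketch sums the van der Waals and electrostatic terms over all ligand-receptor atom pairs. It is not the DOCK implementation. The function name, the tuple layout of the atoms, the `vdw_params` lookup table, the 12-6 exponents and the distance-dependent dielectric D = 4r are all assumptions made for this example; the text does not state which exponents or dielectric form were used.

```python
import math


def interaction_energy(ligand_atoms, receptor_atoms, vdw_params,
                       a=12, b=6, dielectric=lambda r: 4.0 * r):
    """Sum van der Waals and Coulomb terms over all ligand-receptor atom pairs.

    ligand_atoms, receptor_atoms: lists of ((x, y, z), charge, atom_type)
    vdw_params: dict mapping (ligand_type, receptor_type) -> (A, B)
    a, b: repulsion and attraction exponents (12-6 assumed here)
    dielectric: function of distance returning D (D = 4r assumed here)
    """
    energy = 0.0
    for xyz_i, q_i, type_i in ligand_atoms:
        for xyz_j, q_j, type_j in receptor_atoms:
            r = math.dist(xyz_i, xyz_j)
            A, B = vdw_params[(type_i, type_j)]
            vdw = A / r**a - B / r**b
            coulomb = 332.0 * q_i * q_j / (dielectric(r) * r)  # kcal/mol
            energy += vdw + coulomb
    return energy


# Toy system with hypothetical parameters: one ligand atom, one receptor atom, 2 A apart.
lig = [((0.0, 0.0, 0.0), 0.5, "C")]
rec = [((2.0, 0.0, 0.0), -0.5, "O")]
params = {("C", "O"): (4096.0, 128.0)}
print(interaction_energy(lig, rec, params))  # -6.1875
```

For the toy pair the van der Waals part is 4096/2^12 − 128/2^6 = −1 and the electrostatic part is 332.0 × (−0.25)/(8 × 2) = −5.1875, which gives the printed total.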

INPUT, OUTPUT AND OPTIONS

The input file consists of only the test small molecule in standard mol2 format. The 2D structure of a small molecule can be either sketched using ISIS/Draw (MDL Information Systems, Inc., San Leandro, CA 94577) or ChemDraw (CambridgeSoft Corporation, 875 Massachusetts Avenue, Cambridge, MA 02139, USA), or taken from chemical databases such as CCD, ACD and SPECS. The user can convert the small molecule from its 2D structure to a 3D structure by using CORINA (28) or other modeling software. The structure can be minimized by means of molecular mechanics, and Gasteiger charges (29) should be assigned to it. Finally, the 3D structure of the small molecule is saved in a mol2 file. Users can register free of charge to use the TarFisDock server, including access to PDTD. The user must provide an email address and a username in order to receive the results. After registration, the user can log in to the server to upload the mol2 file of the test molecule, customize a target list from PDTD, and submit a job (Figure 2). A job identity number, the ‘job_id’, is assigned to each job by the web server, and the number is appended to a job queue in the back-end server. The user may use the job_id to check the status of his/her job.
Figure 2

An example of the input and output of TarFisDock.
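Before uploading, a user may want to check that the mol2 file actually carries 3D coordinates and partial charges. The Python sketch below reads the `@<TRIPOS>ATOM` records of a mol2 file and reports any atoms without a charge column. It is a minimal reader written for this example, not part of TarFisDock; the function name and the two-atom fragment, including its charge values, are hypothetical.

```python
def read_mol2_atoms(text):
    """Parse the @<TRIPOS>ATOM section of a mol2 file into a list of dicts.

    Each atom record is expected as:
    atom_id atom_name x y z atom_type [subst_id [subst_name [charge]]]
    """
    atoms = []
    in_atom_section = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@<TRIPOS>"):
            in_atom_section = (stripped == "@<TRIPOS>ATOM")
            continue
        if not in_atom_section or not stripped:
            continue
        fields = stripped.split()
        atoms.append({
            "name": fields[1],
            "xyz": tuple(float(value) for value in fields[2:5]),
            "type": fields[5],
            # The charge column is optional in the mol2 format.
            "charge": float(fields[8]) if len(fields) > 8 else None,
        })
    return atoms


# Hypothetical two-atom fragment, for illustration only.
EXAMPLE_MOL2 = """@<TRIPOS>MOLECULE
example
 2 1 1
SMALL
GASTEIGER

@<TRIPOS>ATOM
      1 C1          0.0000    0.0000    0.0000 C.3     1  LIG1       -0.0420
      2 O1          1.4300    0.0000    0.0000 O.3     1  LIG1       -0.3900
@<TRIPOS>BOND
     1     1     2    1
"""

atoms = read_mol2_atoms(EXAMPLE_MOL2)
missing = [atom["name"] for atom in atoms if atom["charge"] is None]
print(len(atoms), "atoms;", "charges missing for:", missing)
```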

The output is delivered in ascending order of energy score (interaction energy). The archive file contains a list of the scores, together with binding models (in mol2 format) of the small molecule tested within the binding sites of the candidate targets. The user can also browse the ‘Categories’ dropdown menu of PDTD to obtain detailed information for the potential target proteins identified by TarFisDock: the ‘PDB_ID’ field contains a hyperlink to the PDB website; the ‘TARGET NAME’ field also contains a hyperlink to the DrugBank website (Figures 1 and 2), and any information linking targets to diseases is contained in the ‘RELATED DISEASE’ field taken from TTD.
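The ranking step can be sketched in a few lines of Python: sort the targets by ascending interaction energy and keep the top 2, 5 or 10%. The function name and the target identifiers and scores below are hypothetical, and rounding the cut-off up to a whole number of entries is an assumption; the text does not say how the server rounds.

```python
def top_percent(scores, percent):
    """Return the best-scoring `percent` of targets, most negative energy first.

    scores: dict mapping a target identifier to its interaction energy (kcal/mol)
    percent: integer percentage of the ranked list to keep (e.g. 2, 5 or 10)
    """
    ranked = sorted(scores.items(), key=lambda item: item[1])  # ascending energy
    # Integer ceiling of percent% of the list length, keeping at least one entry.
    n_keep = max(1, (percent * len(ranked) + 99) // 100)
    return ranked[:n_keep]


# Hypothetical scores, for illustration only.
example_scores = {"T1": -20.1, "T2": -32.6, "T3": -27.4, "T4": -30.5, "T5": -25.0}
print(top_percent(example_scores, 40))  # [('T2', -32.6), ('T4', -30.5)]
```

Under this rounding assumption, the top 2% of a 698-entry list would contain 14 entries.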

TEST CASES

To test the reliability of the TarFisDock server, we searched for the candidate binding proteins for vitamin E and for 4H-tamoxifen. The results and their comparison with the published experimental data are described below.

Potential binding proteins for vitamin E

Vitamin E is an antioxidant which is widely used as a dietary supplement (30). It has also been shown to be of therapeutic value in the treatment of a number of diseases, such as cardiovascular disease and some forms of cancer, and to enhance the immune response (31). It is thus likely that vitamin E interacts with multiple target proteins. Indeed, 12 targets for vitamin E have already been reported (16) (Supplementary Table S1). Candidate vitamin E-binding proteins identified using TarFisDock are listed in Supplementary Table S2. The top 2% of candidates identified by TarFisDock, ranked by interaction energy, included 4 of the 12 targets identified experimentally. Three more of these experimentally identified targets were in the top 10% of the proteins ranked by interaction energy. The top 2 and 10% of candidate vitamin E-binding proteins identified by TarFisDock thus cover 30 and 50%, respectively, of reported targets verified or implicated by experiments. Other targets, such as glutathione S-transferase, glutathione synthetase, D-amino acid oxidase, and guanylyl cyclase (which is not available in PDTD), were not identified by TarFisDock (Table 2). The main reason may be that TarFisDock does not take into account protein flexibility. It is of interest that many of the top 10% candidate vitamin E-binding proteins are associated with cancer, cardiovascular diseases, immune function and dementia (Supplementary Table S2).
Table 2

The protein target candidates of vitamin E identified by TarFisDock

Rank  PDB_ID  Energy score  Target name
1     1VXR    −32.61        Acetylcholinesterase
2     1DHT    −32.49        Estrogenic 17β-hydroxysteroid dehydrogenase
3     2NSE    −32.07        Nitric oxide synthase
6     2ACS    −30.49        Aldose reductase
20    1M9M    −29.28        Nitric oxide synthase
28    1GPN    −28.56        Acetylcholinesterase
49    1DBJ    −27.4         Fab' fragment of monoclonal antibody Db3
50    1ADF    −27.37        Alcohol dehydrogenase
53    4FAB    −27.37        4-4-20 (IgG2A) Fab fragment
62    5P2P    −27.14        Phospholipase A2
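The relation between the ranks in Table 2 and the percentage cut-offs can be checked with a short Python sketch. It assumes that all 698 PDTD entries were screened, so that the top 2% and top 10% correspond to ranks up to 13.96 and 69.8; the function name is hypothetical. The sketch counts table rows only. Several rows share a target name, and mapping rows onto the 12 reported targets requires Supplementary Table S1, which is not reproduced here.

```python
PDTD_ENTRIES = 698  # database size stated in the text

# (rank, PDB_ID, target name) rows as listed in Table 2
TABLE_2 = [
    (1, "1VXR", "Acetylcholinesterase"),
    (2, "1DHT", "Estrogenic 17beta-hydroxysteroid dehydrogenase"),
    (3, "2NSE", "Nitric oxide synthase"),
    (6, "2ACS", "Aldose reductase"),
    (20, "1M9M", "Nitric oxide synthase"),
    (28, "1GPN", "Acetylcholinesterase"),
    (49, "1DBJ", "Fab' fragment of monoclonal antibody Db3"),
    (50, "1ADF", "Alcohol dehydrogenase"),
    (53, "4FAB", "4-4-20 (IgG2A) Fab fragment"),
    (62, "5P2P", "Phospholipase A2"),
]


def rows_in_top(rows, percent, n_entries=PDTD_ENTRIES):
    """Return the rows whose rank falls within the top `percent` of n_entries."""
    cutoff = percent * n_entries / 100  # fractional rank cut-off
    return [row for row in rows if row[0] <= cutoff]


for percent in (2, 10):
    hits = rows_in_top(TABLE_2, percent)
    print(f"top {percent}%: {len(hits)} rows ->", [pdb_id for _, pdb_id, _ in hits])
```

Under this assumption, four rows (ranks 1, 2, 3 and 6) fall within the top 2%, and all ten listed rows fall within the top 10%.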

Potential binding proteins for 4H-tamoxifen

4H-tamoxifen is used as an adjuvant therapy in the treatment of breast cancer (32). Like vitamin E, it is a multiple-target drug. So far, 10 proteins have been identified as interaction targets for 4H-tamoxifen or for its parent compound, tamoxifen (16) (Supplementary Table S1). To test the reliability of our TarFisDock server, we used it to search for candidate binding proteins for 4H-tamoxifen in the PDTD. The target candidates thus identified are listed in Supplementary Table S3, and those which correspond to proteins identified experimentally are shown in Table 3. Three amongst the top 2% of the candidates are known targets of 4H-tamoxifen, namely dihydrofolate reductase, immunoglobulin and glutathione transferase. The top 5% of the candidates include two additional targets identified experimentally, i.e. human fibroblast collagenase and 17β-hydroxysteroid dehydrogenase. Thus 30 and 50% of the experimentally confirmed targets for 4H-tamoxifen appear amongst the top 2 and 5% of the TarFisDock predicted candidates, respectively, which again supports the reliability of the server.
Table 3

The protein target candidates of 4H-tamoxifen identified by TarFisDock

Rank  PDB_ID  Energy score  Target name
4     1DHF    −36.8         Dihydrofolate reductase
10    1MCR    −35.4         Immunoglobulin λ light chain dimer
12    1K3Y    −34.66        Glutathione transferase
13    1DBM    −34.51        Fab' fragment of monoclonal antibody Db3
17    4DFR    −34.07        Dihydrofolate reductase
19    2CGR    −33.68        IgG2B (κ) Fab fragment
21    4AYK    −33.53        Human fibroblast collagenase
22    4FAB    −33.49        4-4-20 (IgG2A) Fab fragment
25    1DBJ    −33.26        Fab' fragment of monoclonal antibody Db3
27    1DHT    −33.12        17β-Hydroxysteroid dehydrogenase
31    1AYK    −32.9         Human fibroblast collagenase
46    1RA2    −31.75        Dihydrofolate reductase
48    1DIH    −31.71        Microbial dihydrofolate reductase
50    1MCB    −31.66        Immunoglobulin λ light chain dimer
70    1BHS    −30.81        17β-Hydroxysteroid dehydrogenase

TarFisDock has been in use for about 9 months, and over 1000 small molecules, including synthetic compounds, existing drugs and natural products, have been screened. Five groups outside the authors' labs have become involved in screening. Experimental evidence has been obtained to confirm that binding proteins identified by TarFisDock for several compounds indeed display binding activity. In one case, that of a binding protein for a natural product, not only was binding verified experimentally, but a complex was obtained whose 3D crystal structure was solved by X-ray crystallography (data not shown). The computing time required depends on the flexibility of the given compound. Thus, TarFisDock may finish the PDTD search within 5–20 h using one CPU of the SGI Origin3800 superserver.

SUMMARY

By bringing together the target database PDTD and a reverse docking program, the TarFisDock server provides a convenient tool for identifying potential binding proteins for small molecules such as drugs, lead compounds and natural products. In total, the web server has already been tested on over 1000 small molecules, and the binding proteins predicted for several molecules have been verified by bioassay, including crystal structure determination (data not shown). The web server can also be used to map the genomic regulation network for an existing drug or a drug candidate. In general, one drug molecule may interact with several targets, including targets associated with side effects (toxicity). As illustrated by the identification of potential binding proteins for vitamin E and 4H-tamoxifen, TarFisDock provides multiple candidate protein targets to choose from. These are useful clues for further experimental tests of the efficacy and toxicity of the drug. In addition, the target information produced by TarFisDock is valuable for functional genomic studies within the chemical biology paradigm (33). In short, the TarFisDock web server is a convenient tool for ‘fishing’ the target proteins of small molecules: the user simply inputs the structure of the query compound and customizes a target list from PDTD (a list of all the targets is recommended). However, TarFisDock still has certain limitations. The major one is that PDTD does not yet contain enough protein entries to cover all the protein information of disease-related genomes. The second is that TarFisDock does not consider protein flexibility during the docking simulation. These two limitations will produce false negatives. Another limitation is that the scoring function for reverse docking is not accurate enough, which will produce false positives.
To overcome these shortcomings, we are (i) collecting as many protein structures (both experimental and modeled) as possible to enlarge PDTD, (ii) developing a new docking program that includes protein flexibility, and (iii) establishing a more accurate scoring function. TarFisDock and PDTD are available at http://www.dddc.ac.cn/tarfisdock/.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR online.
  27 in total

1.  The Protein Data Bank.

Authors:  H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  Ligand-protein inverse docking and its potential use in the computer search of protein targets of a small molecule.

Authors:  Y Z Chen; D G Zhi
Journal:  Proteins       Date:  2001-05-01

3.  TTD: Therapeutic Target Database.

Authors:  X Chen; Z L Ji; Y Z Chen
Journal:  Nucleic Acids Res       Date:  2002-01-01       Impact factor: 16.971

4.  Prediction of potential toxicity and side effect protein targets of a small molecule by a ligand-protein inverse docking approach.

Authors:  Y Z Chen; C Y Ung
Journal:  J Mol Graph Model       Date:  2001       Impact factor: 2.518

5.  DOCK 4.0: search strategies for automated molecular docking of flexible molecule databases.

Authors:  T J Ewing; S Makino; A G Skillman; I D Kuntz
Journal:  J Comput Aided Mol Des       Date:  2001-05       Impact factor: 3.686

6.  Import of host delta-aminolevulinate dehydratase into the malarial parasite: identification of a new drug target.

Authors:  Z Q Bonday; S Dhanasekaran; P N Rangarajan; G Padmanaban
Journal:  Nat Med       Date:  2000-08       Impact factor: 53.440

Review 7.  Virtual screening on natural products for discovering active compounds and target information.

Authors:  Jianhua Shen; Xiaoying Xu; Feng Cheng; Hong Liu; Xiaomin Luo; Jingkang Shen; Kaixian Chen; Weimin Zhao; Xu Shen; Hualiang Jiang
Journal:  Curr Med Chem       Date:  2003-11       Impact factor: 4.530

8.  Can an in silico drug-target search method be used to probe potential mechanisms of medicinal plant ingredients?

Authors:  Xin Chen; Choong Yong Ung; Yuzong Chen
Journal:  Nat Prod Rep       Date:  2003-08       Impact factor: 13.423

Review 9.  The druggable genome.

Authors:  Andrew L Hopkins; Colin R Groom
Journal:  Nat Rev Drug Discov       Date:  2002-09       Impact factor: 84.694

Review 10.  Mechanism-based target identification and drug discovery in cancer research.

Authors:  J B Gibbs
Journal:  Science       Date:  2000-03-17       Impact factor: 47.728
