FAF-Drugs: free ADME/tox filtering of compound collections.

Maria A Miteva, Stephanie Violas, Matthieu Montes, David Gomez, Pierre Tuffery, Bruno O Villoutreix.

Abstract

In silico screening based on the structures of the ligands or of the receptors has become an essential tool to facilitate the drug discovery process but compound collections are needed to carry out such in silico experiments. It has been recognized that absorption, distribution, metabolism, excretion and toxicity (ADME/tox) are key properties that need to be considered early on, even during the database preparation stage. FAF-Drugs is an online service based on Frowns (a chemoinformatics toolkit) that allows users to process their own compound collections via simple ADME/Tox filtering rules such as molecular weight, polar surface area, logP or number of rotatable bonds. SMILES (Simplified Molecular Input Line Entry System), CANSMILES (canonical smiles) or SDF (structure data file) files are required as input and molecules that pass or do not pass the filters are sent back in CANSMILES format. This service should thus help scientists engaging in drug discovery campaigns. Other utilities and several compound collections suitable for in silico screening are available at our site. FAF-Drugs can be accessed at http://bioserv.rpbs.jussieu.fr/FAFDrugs.html.

Year:  2006        PMID: 16845110      PMCID: PMC1538885          DOI: 10.1093/nar/gkl065

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Drug discovery is a complex and expensive endeavor that usually requires seven major steps: disease selection, target hypothesis, lead compound identification (screening), lead optimization, pre-clinical trial, clinical trial and pharmacogenomic optimization. Among the various techniques used to facilitate the drug discovery process, virtual or in silico ligand screening (VLS) based on the structure of known ligands or on the structure of the receptor is becoming a method of choice (1–11), as seen in several recent studies [reviewed in (12–16)]. All these investigations require suitable compound collections. It has been suggested that these libraries of purchasable small organic compounds should be filtered [ADME/tox (absorption, distribution, metabolism, excretion and toxicity) filtering] in an attempt to work with databases of molecules with acceptable physical properties and chemical functionalities, at least consistent with known drug profiles (17–27). Common filtering protocols can be variations of Lipinski's rule-of-five (or RO5, potential for oral bioavailability) (25): molecular weight (MW) (poor absorption is observed if MW is more than 500), computed log P (P = octanol/water partition coefficient) (should not be more than 5), H-bond donors (should not be more than 5) and H-bond acceptors (should not be more than 10). Filters can also include a limit on the number of rotatable bonds, on the polar surface area (a value correlated to the number of H-bond donors and acceptors) among others, or can remove compounds containing specific chemical substructures associated with poor chemical stability or toxicity and sometimes attempt to predict drug metabolism (e.g. cytochrome-mediated metabolism, Pgp efflux) (28–32). 
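As a minimal sketch of the rule-of-five thresholds listed above, the following Python function reports which criteria a compound exceeds. It assumes the property values have already been computed by some other tool; the function name and the dictionary keys are hypothetical and are not part of FAF-Drugs.

```python
# Illustrative rule-of-five check; the names and keys are hypothetical.
RO5_LIMITS = {
    "mw": 500.0,   # molecular weight should not be more than 500
    "logp": 5.0,   # computed log P should not be more than 5
    "hbd": 5,      # H-bond donors should not be more than 5
    "hba": 10,     # H-bond acceptors should not be more than 10
}


def ro5_violations(props):
    """Return the sorted names of the properties that exceed their RO5 limit."""
    return sorted(name for name, limit in RO5_LIMITS.items() if props[name] > limit)


# Phenobarbital, using the FAF-Drugs values reported in Table 3.
phenobarbital = {"mw": 232.1, "logp": 1.32, "hbd": 2, "hba": 5}
print(ro5_violations(phenobarbital))  # prints []
```

A compound with no violations satisfies only the necessary conditions discussed in the text; it is not thereby shown to be drug-like.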
The molecules selected after applying Lipinski's RO5 or related filters, whether based on physicochemical properties or on the investigation of chemical functionalities, are erroneously called ‘drug-like’: many organic compounds conform to the rules listed above but are by no means drug-like (33). These rules define only some necessary conditions for a drug candidate (such as likely solubility and bioavailability), not sufficient ones. Different levels of filtering can be applied depending on the aims of the project. For instance, soft filtering protocols are usually appropriate for cancer projects while, for some other studies, only small and rigid compounds/fragments (low MW, few rotatable bonds) are needed (e.g. fragment-based lead discovery projects or fraganomics) (34). Only a few online ADME/tox tools are available, and they can usually evaluate only one compound at a time (Table 1), while commercial packages are in general expensive [see review (35)]. Compound libraries can be found online, but they are usually neither free nor filtered (Table 2) (36). Only recently has a free 3D database of compounds ready for VLS projects been reported, ZINC (37). It is also possible to perform ADME/tox computations via ZINC; in this case they are carried out by the program Filter (OpenEye Scientific Software), which removes undesirable molecules based on physicochemical properties and on about 100 rules that eliminate unstable, reactive or dye chemical groups, and which also desalts the molecules. For the time being, users of ZINC can only apply default thresholds for the various computed properties.
Table 1

Examples of free online ADME/tox tools

One molecule at a time
One molecule at a time
One molecule at a time
(37) All molecules: ADME/tox filtering with Filter (OpenEye)
Computation of logP (one molecule at a time), Syracuse Research Corporation
Computation of logP (one molecule at a time)
(42) Computation of logP (one molecule at a time)
Table 2

Some online compound collections

(44) Free collections
Free collections
(45) Free collections
(46) Free collections
(37) Free and ADME/tox filtered collections
Free and ADME/tox filtered collections
(47) Free collections
Available chemicals directory commercial collections: ADME/tox filtering possible
Commercial collections
Dictionary of small molecules
(48) Measured binding affinities
(49) Proteins with co-crystallized ligands and experimental binding affinities
(50) Proteins with co-crystallized ligands and experimental binding affinities
Because ADME/tox calculations are usually not available online, we have created FAF-Drugs, a tool to perform physicochemical filtering. In addition, to make VLS experiments easier to perform for a broad community of users, we have interfaced several additional utilities (such as binding site prediction, OpenBabel…) and processed five major compound collections.

METHODS AND IMPLEMENTATION

ADME/tox filters

We use Frowns (developed by Brian Kelley), a chemoinformatics toolkit written in Python and C++, to parse SMILES or SDF files (the SDF format is defined by Molecular Design Limited). We have implemented an algorithm in Python that makes use of Frowns features to compute properties known to be important for filtering databases and that uses Xtool (38) to compute log P-values. Because salts and counterions are often present in compound collections, we recommend that users first apply the desalt utility, which removes most salts and counterions, prior to FAF-Drugs calculations. Our program then computes the following molecular properties:

(i) Molecular weight (part of Lipinski's RO5).

(ii) Hydrogen bond donors and acceptors (part of Lipinski's RO5). Acceptors are counted as the sum of N + O and donors as the sum of OH + NH.

(iii) Number of rigid bonds.

(iv) Number of rings.

(v) Size of the rings.

(vi) Number of rotatable bonds. A rotatable bond is defined as any single non-ring bond bound to a non-terminal heavy atom (29). The amide C-N bonds are not considered because of their high rotational energy barrier.

(vii) Number of carbon atoms, number of heteroatoms and their ratio.

(viii) Number of atoms with a net charge.

(ix) Sum of formal charges.

(x) The Topological Polar Surface Area (TPSA). The method described in (30) has been implemented. Briefly, the molecular polar surface area (PSA) (i.e. the surface belonging to polar atoms) is a descriptor that was shown to correlate well with passive molecular transport through membranes. The calculation of PSA, however, is rather time-consuming because it requires generating a reasonable 3D molecular geometry and calculating the surface itself. A new approach for the calculation of the PSA was developed by Ertl et al. (30), based on the summation of tabulated surface contributions of polar fragments. This approach, called topological polar surface area, provides results that are practically identical to the 3D PSA while being 2–3 orders of magnitude faster to compute.

(xi) Computation of XlogP (P = calculated octanol/water partition coefficient) (part of Lipinski's RO5). We use the XScore package to compute XlogP as described in (38). This method gives log P-values by summing the contributions of component atoms while making use of correction factors. About 90 atom types are used to classify carbon, nitrogen, oxygen, sulfur, phosphorus and halogen atoms, and 10 correction factors are used for some special substructures. The contributions of each atom type and correction factor were derived by multivariate regression analysis of about 1850 organic compounds with known experimental log P-values. In FAF-Drugs, the input files have, for the time being, to be in SDF, SMILES or CANSMILES format, while the compounds have to be in Mol2 format for XlogP computations. We use OpenBabel for file format conversion prior to XlogP calculations. A few compounds are found to have ambiguous atom types, and in this case the log P is not computed.

(xii) Atom check. Molecules containing some specific atoms can be filtered out (for instance, molecules containing H, C, N, O, F, S, P, Cl, Br and I atoms are kept when using default parameters).
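The fragment-summation idea behind TPSA can be sketched in a few lines of Python. This is not the FAF-Drugs code. The contribution values are a small subset recalled from the published table of Ertl et al. (30) and should be checked against that table before any real use. The fragment counts are assigned by hand here, whereas a real implementation would find them by substructure matching.

```python
# Fragment-based TPSA sketch. Contributions (square angstroms) are a small
# subset of the Ertl et al. table, quoted as an assumption; verify before use.
TPSA_CONTRIB = {
    "ether_O": 9.23,        # oxygen with two single bonds
    "carbonyl_O": 17.07,    # doubly bonded oxygen
    "hydroxyl_OH": 20.23,   # OH group
    "secondary_NH": 12.03,  # NH with two single bonds to heavy atoms
    "primary_NH2": 26.02,   # NH2 group
    "tertiary_N": 3.24,     # N with three single bonds to heavy atoms
}


def tpsa(fragment_counts):
    """Sum the tabulated contribution of each polar fragment type."""
    return sum(TPSA_CONTRIB[name] * count for name, count in fragment_counts.items())


# Hand-assigned fragment counts (hypothetical input format).
cineole = {"ether_O": 1}
phenobarbital = {"carbonyl_O": 3, "secondary_NH": 2}

print(round(tpsa(cineole), 2))        # prints 9.23
print(round(tpsa(phenobarbital), 2))  # prints 75.27
```

Both results match the FAF-Drugs TPSA values listed for these compounds in Table 3. No 3D geometry is needed, which is why the approach is so much faster than a surface calculation.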

RESULTS AND DISCUSSION

Online ADME/tox tools are usually not freely available; for this reason, we have developed FAF-Drugs, which stands for Free ADME/tox Filtering and ‘Drug-like’ compound collections. Our service can be used to filter collections available online as well as virtual libraries. Different levels of filtering have been reported in the literature, depending on the stage of the project, on the target and on the disease type. For example, simple physicochemical property filtering could be used when searching for new hits on a new target, while more complex ADME/tox models (39) [see for example a list of chemical groups incompatible with final drug development (36,40)] could be applied at a later stage. We chose to implement only simple physicochemical rules because they address the filtering process using widely understood molecular properties.

ADME/tox FILTERS

To start FAF-Drugs filtering, users can either write a molecule in SMILES or 2D/3D SDF format directly in the Web interface window or browse and upload a compound library. Salts and counterions are often present in compound collections and should be removed prior to ADME/tox calculations. If salts and counterions are present, we suggest that users first run our DeSalt utility. At present, the input formats for FAF-Drugs calculations are CANSMILES, SMILES or SDF (please check our Web site for explanations about the required formats), but OpenBabel (also available online at RPBS) can be used for file format conversion prior to the filtering step (Figure 1a). Users can then set the upper and lower limits of each investigated property (adjustable thresholds) so as to tailor the compound selection to a specific project. We also propose default parameters that are commonly used in the field (25,26,29,32,41).
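The adjustable upper and lower limits can be pictured as a range check per property, as in this minimal Python sketch. The function names, the input layout and the threshold values are assumptions made for illustration; they are neither the FAF-Drugs defaults nor its code.

```python
def passes(props, thresholds):
    """True if every thresholded property lies within its [low, high] range."""
    for name, (low, high) in thresholds.items():
        value = props[name]
        if value < low or value > high:
            return False
    return True


def split_collection(compounds, thresholds):
    """Split (compound_id, properties) pairs into accepted and rejected IDs."""
    accepted, rejected = [], []
    for compound_id, props in compounds:
        target = accepted if passes(props, thresholds) else rejected
        target.append(compound_id)
    return accepted, rejected


# Illustrative thresholds only (hypothetical, not the service defaults).
thresholds = {"mw": (100.0, 500.0), "tpsa": (0.0, 140.0), "rot_bond": (0, 5)}

# Property values taken from the FAF-Drugs column of Table 3.
compounds = [
    ("1,8-cineole", {"mw": 154.1, "tpsa": 9.23, "rot_bond": 0}),
    ("triethyl phosphate", {"mw": 182.0, "tpsa": 54.57, "rot_bond": 6}),
]

accepted, rejected = split_collection(compounds, thresholds)
print(accepted)  # prints ['1,8-cineole']
print(rejected)  # prints ['triethyl phosphate']
```

Changing a single range, for example raising the rotatable-bond limit to 10, moves triethyl phosphate into the accepted list, which is the kind of project-specific tuning the text describes.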
Figure 1

(a) Schema of the FAF-Drugs service. Compound collections in SMILES, CANSMILES or SDF format are needed as input. Users can select a threshold for each investigated physicochemical property. XlogP calculations are performed with Xtool (see text). Users obtain two output files, one with molecules that pass the filters and the other with compounds that do not pass the filters. A third file with all the computed properties can also be downloaded. Several other utilities are available at FAF-Drugs: online XlogP calculations (38) computed with Xtool, online OpenBabel for file format conversion and an implementation of the Java Molecular Editor from Dr P. Ertl (Novartis Pharma AG, Basel, Switzerland) to draw molecules and obtain the corresponding SMILES string. In addition, at FAF-Drugs, users can find five ADME/tox filtered compound collections ready for VLS computations. Three levels of filtering were applied (see our web site for further details) in order to better suit the needs of potential users. OpenEye's Omega program was used to generate 3D models, either a single conformation or up to 50 conformations, for each molecule that passed the ADME/tox filters. The compound collections can be downloaded in Mol2 format or in SMILES format. Other utilities consist of a Test Set that contains six protein targets (PDB format) and about 10 corresponding ligands (Mol2 format) to facilitate evaluation of docking/scoring methods, and an interface to PASS (43), a program that predicts binding pockets at the surface of a receptor. Many additional tools pertaining to the field of structural bioinformatics are also available at RPBS, such as protein electrostatic computations, loop search and solvent accessibility prediction (see RPBS services). (b) FAF-Drugs results. Four molecules with different physicochemical properties were selected in order to compare FAF-Drugs calculations with other online tools.

Users obtain two files, in CANSMILES format, containing the molecules that pass and those that do not pass the filters, together with the original compound ID provided by the chemical vendors (if available). All computed properties (e.g. MW, TPSA, XlogP…) are also returned in a third file. In order to test our program, we performed computations on 50 080 molecules extracted from the ChemBridge compound collection (Diversity set) with FAF-Drugs and with Filter (version 1.0.2, OpenEye Scientific Software), using the same parameters and threshold values (MW, TPSA…). Both Filter and FAF-Drugs compute TPSA using the approach of Ertl et al. (30) and log P using the method of Wang et al. (38). A total of 49 334 molecules passed the filters with FAF-Drugs and 49 032 with Filter. These small differences could be due to the fact that some rules are implemented slightly differently, for instance the TPSA or log P calculations or the definition of a flexible bond. Our tests on a Linux machine (Dell Precision 650, 3 GHz, 2 GB SDRAM) show that the standalone version of FAF-Drugs processes these 50 080 molecules in about 20 min, while the equivalent computations on the same computer with Filter (OpenEye) took about 10 min. The FAF-Drugs implementation is Python-based and is not presently optimized for speed. On the server, similar computations took about 30 min, but they can take longer (about 3 h) depending on the server load. We also compared FAF-Drugs with other online tools: Molinspiration, which allows evaluation of a few physicochemical properties (one molecule at a time can be processed; it implements its own tool to calculate log P and follows the Ertl et al. approach to compute the polar surface), and the log P calculators provided by Syracuse Research Corporation (see Table 1) and by Tetko and Tanchuk, ALOGPS 2.1 (42). The method for log P prediction developed at Molinspiration (miLogP) is based on group contributions, which have been obtained by fitting calculated log P to experimental log P. ALOGPS uses a neural network approach to predict log P, while the Syracuse Research Corporation tool (LogKow) estimates log P using an atom/fragment contribution method. Over 100 diverse molecules were tested and in all cases we computed very similar values. To illustrate our calculations, results on four different molecules are reported in Table 3 and Figure 1b. Overall, we note a very good agreement among the different methods.
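The benchmark figures quoted above can be put side by side with a little arithmetic. The short Python sketch below only restates the reported numbers; the run times are the approximate standalone values given in the text, so the throughput figures are rough estimates.

```python
TOTAL = 50080  # molecules from the ChemBridge Diversity set

# Reported results: molecules passing the filters and approximate run time.
passed = {"FAF-Drugs": 49334, "Filter": 49032}
minutes = {"FAF-Drugs": 20, "Filter": 10}


def pass_rate(n_passed, total=TOTAL):
    """Percentage of the collection that passed the filters."""
    return 100.0 * n_passed / total


def throughput(total, run_minutes):
    """Approximate number of molecules processed per second."""
    return total / (run_minutes * 60)


for tool in passed:
    print(f"{tool}: {pass_rate(passed[tool]):.1f}% passed, "
          f"about {throughput(TOTAL, minutes[tool]):.0f} molecules/s")

print("difference:", passed["FAF-Drugs"] - passed["Filter"], "molecules")
```

The two programs differ on 302 molecules, about 0.6% of the collection (98.5% versus 97.9% passing), which is consistent with the small implementation differences mentioned in the text.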
Table 3

Comparison of FAF-Drugs with several online tools

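Some computed data | FAF-Drugs | Molinspiration | Syracuse_logP | ALOGPS | logP experimental | Compound
--- | --- | --- | --- | --- | --- | ---
MW | 154.1 | 154.25 | 154.25 | 154.25 | | 1,8-Cineole CAS: 470-82-6
HD (OH+NH) | 0 | 0 | | | |
HA (O+N) | 1 | 1 | | | |
Rot_bond | 0 | 0 | | | |
TPSA | 9.23 | 9.23 | | | |
log P | 2.59 | 2.71 | 3.13 | 3.37 | 2.50 |
MW | 182 | 182.15 | 182.16 | 182.16 | | Triethyl Phosphate CAS: 78-40-0
HD | 0 | 0 | | | |
HA | 4 | 4 | | | |
Rot_bond | 6 | 6 | | | |
TPSA | 54.57 | 44.77 | | | |
log P | 0.58 | 0.69 | 0.87 | 0.71 | 0.80 |
MW | 232.1 | 232.23 | 232.24 | 232.24 | | Phenobarbital CAS: 50-06-6
HD | 2 | 2 | | | |
HA | 5 | 5 | | | |
Rot_bond | 2 | 2 | | | |
TPSA | 75.27 | 75.26 | | | |
log P | 1.32 | 0.79 | 1.33 | 1.41 | 1.47 |
MW | 181.4 | 181.45 | 181.45 | 181.45 | | 1,2,4-Trichlorobenzene CAS: 120-82-1
HD | 0 | 0 | | | |
HA | 0 | 0 | | | |
Rot_bond | 0 | 0 | | | |
TPSA | 0 | 0 | | | |
log P | 3.89 | 3.89 | 3.93 | 4.08 | 4.02 |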

FAF-Drugs computes several descriptors, such as molecular weight (MW), hydrogen bond donors (HD), hydrogen bond acceptors (HA), number of rotatable bonds (Rot_bond), TPSA and log P (see text). Similar or identical results were obtained via the Molinspiration website and by FAF-Drugs. Experimental log P-values and CAS registry numbers were found in the EDETOX database and at the Syracuse Research Corporation server. The corresponding molecules are shown in Figure 1b.

To further assess FAF-Drugs calculations, we compared over 100 computed log P-values (via our implementation of XlogP) with experimental log P-values (obtained via Syracuse Research Corporation and via the EDETOX database). The computed values are in good agreement with the experimental data, indicating that our implementation of XlogP is appropriate and that this approach gives very good results (Figure 2).
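A comparison of this kind can be summarized with two simple statistics, sketched here in Python. The data are only the four FAF-Drugs and experimental log P-values from Table 3, not the set of over 100 compounds used for Figure 2, and the choice of mean absolute error and Pearson correlation is an assumption, since the text does not name the statistic used.

```python
from math import sqrt


def mean_absolute_error(xs, ys):
    """Average absolute difference between paired values."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Order: 1,8-cineole, triethyl phosphate, phenobarbital, 1,2,4-trichlorobenzene
computed = [2.59, 0.58, 1.32, 3.89]      # FAF-Drugs column of Table 3
experimental = [2.50, 0.80, 1.47, 4.02]  # experimental column of Table 3

print(f"MAE = {mean_absolute_error(computed, experimental):.2f} log units")
print(f"r   = {pearson_r(computed, experimental):.3f}")
```

With only four points these numbers are illustrative rather than a validation; they show how the agreement plotted in Figure 2 could be quantified.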
Figure 2

Experimental versus computed logP. Correlation between experimental and calculated log P-values for over 100 compounds.

Taken together, the above data suggest that our ADME/tox program is robust. Once users obtain the CANSMILES output, they can adjust the filters and run additional computations, or use 1D/2D to 3D conversion programs such as Corina, Omega (OpenEye Scientific Software) or Converter (Accelrys) and start a VLS project. For the time being, to protect our server from intensive use, we suggest that scientists upload files with fewer than 30 000 molecules. In the present version of the service, computations for several tens of thousands of compounds remain time-consuming (e.g. several hours depending on the number of jobs in the queue), but work is in progress to improve this point. For this reason, and in order to save CPU time and disk space, we also provide five filtered compound collections (Figure 1a).
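One way to stay under the suggested upload size is to split a large collection into several files before submission. The Python sketch below assumes a SMILES file with one molecule per line; the helper name and the default chunk size of 29 999 are illustrative choices, not part of the service.

```python
from itertools import islice


def chunked(records, max_size=29999):
    """Yield successive lists of at most max_size records."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, max_size))
        if not chunk:
            return
        yield chunk


# Stand-in for the lines of a large SMILES file (hypothetical data).
smiles_lines = [f"C{'C' * (i % 5)}O mol{i}" for i in range(70000)]

for part_number, chunk in enumerate(chunked(smiles_lines), start=1):
    # In practice each chunk would be written to its own file here.
    print(f"part {part_number}: {len(chunk)} molecules")
```

For 70 000 input lines this produces three parts of 29 999, 29 999 and 10 002 molecules, each of which respects the suggested limit.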

CONCLUSION AND FUTURE DIRECTIONS

A rational approach to increasing the efficiency of finding new drugs and reducing R&D costs is to reduce the attrition rate in the costly downstream stages (e.g. clinical trials). Several important methods toward this goal have been developed, involving early computations of ADME/tox properties. We have developed FAF-Drugs to help modelers and biologists embark on drug discovery projects. Users can filter their own compound libraries and adapt the thresholds to a specific project. Other tools pertaining to the field of drug design/compound collections are also available at our Web site. We are presently working on improving the speed of the calculations on our server.
  44 in total

1.  Improving the odds in discriminating "drug-like" from "non drug-like" compounds.

Authors:  T M Frimurer; R Bywater; L Naerum; L N Lauritsen; S Brunak
Journal:  J Chem Inf Comput Sci       Date:  2000 Nov-Dec

Review 2.  ADMET in silico modelling: towards prediction paradise?

Authors:  Han van de Waterbeemd; Eric Gifford
Journal:  Nat Rev Drug Discov       Date:  2003-03       Impact factor: 84.694

3.  Application of associative neural networks for prediction of lipophilicity in ALOGPS 2.1 program.

Authors:  Igor V Tetko; Vsevolod Yu Tanchuk
Journal:  J Chem Inf Comput Sci       Date:  2002 Sep-Oct

Review 4.  Physicochemical effects in the representation of molecular structures for drug designing.

Authors:  Johann Gasteiger
Journal:  Mini Rev Med Chem       Date:  2003-12       Impact factor: 3.862

Review 5.  Applications of high-throughput ADME in drug discovery.

Authors:  Daniel B Kassel
Journal:  Curr Opin Chem Biol       Date:  2004-06       Impact factor: 8.822

Review 6.  High-throughput docking as a source of novel drug leads.

Authors:  Juan C Alvarez
Journal:  Curr Opin Chem Biol       Date:  2004-08       Impact factor: 8.822

Review 7.  Guided docking approaches to structure-based design and screening.

Authors:  Xavier Fradera; Jordi Mestres
Journal:  Curr Top Med Chem       Date:  2004       Impact factor: 3.295

8.  ZINC--a free database of commercially available compounds for virtual screening.

Authors:  John J Irwin; Brian K Shoichet
Journal:  J Chem Inf Model       Date:  2005 Jan-Feb       Impact factor: 4.956

Review 9.  Linking databases and organisms: GenomeNet resources in Japan.

Authors:  M Kanehisa
Journal:  Trends Biochem Sci       Date:  1997-11       Impact factor: 13.807

Review 10.  Improving the decision-making process in structural modification of drug candidates: reducing toxicity.

Authors:  Alaa-Eldin F Nassar; Amin M Kamel; Caroline Clarimont
Journal:  Drug Discov Today       Date:  2004-12-15       Impact factor: 7.851
