
Generating Multibillion Chemical Space of Readily Accessible Screening Compounds.

Oleksandr O Grygorenko, Dmytro S Radchenko, Igor Dziuba, Alexander Chuprina, Kateryna E Gubina, Yurii S Moroz.

Abstract

An approach to the generation of ultra-large chemical libraries of readily accessible ("REAL") compounds is described. The strategy is based on the use of two- or three-step three-component reaction sequences and available starting materials with pre-validated chemical reactivity. After preliminary parallel experiments, the methods with at least ∼80% synthesis success rate (such as acylation - deprotection - acylation of monoprotected diamines or amide formation - click reaction with functionalized azides) can be selected and used to generate the target chemical space. It is shown that by using only the two aforementioned reaction sequences, a nearly 29-billion compound library is easily obtained. According to the predicted physico-chemical descriptor values, the generated chemical space contains large fractions of both drug-like and "beyond rule-of-five" members, whereas the strictest lead-likeness criteria (the so-called Churcher's rules) are met by a minor fraction, which still exceeds 22 million compounds.
© 2020 The Author(s).


Keywords:  Chemical Compound; Cheminformatics; Computational Chemistry by Subject

Year:  2020        PMID: 33145486      PMCID: PMC7593547          DOI: 10.1016/j.isci.2020.101681

Source DB:  PubMed          Journal:  iScience        ISSN: 2589-0042


Introduction

Modern drug discovery relies heavily on efficient mining of the chemical space, which is a descriptor space of all possible compounds (Dobson, 2004). This task is difficult owing to the enormous size of the accessible chemical space, which is estimated to include at least 10^60 “observable” molecules; such a huge number makes its comprehensive enumeration and synthetic exploration impossible (at least currently). Nevertheless, significant advances in computational techniques have allowed efficient virtual exploration of reasonably large portions of chemical space (Hoffmann and Gastreich, 2019; Walters, 2019). Many recent works have addressed the enumeration of compounds relevant to drug discovery; a prominent example is the work of Reymond and co-workers, who described the generation of all stable molecules with up to a certain number of heavy atoms (GDB) (Reymond, 2015). In combination with virtual screening as a powerful tool for prioritizing compounds before in vitro biological tests, such databases provide a promising way to discover chemotypes for further optimization into drug candidates. The major drawback of many virtually enumerated compound libraries is the unpredictable synthetic feasibility of their particular members, which hampers their experimental validation against the biological targets of interest. A possible approach to address this issue is based on the so-called forward synthetic analysis (Schreiber, 2000) and includes an enumeration of virtual libraries by the combination of synthons representing readily available building blocks. This strategy typically requires a reasonably large pool of reagents with established chemical behavior; it is therefore not surprising that it has mostly been used internally by big pharma companies (e.g., Merck's MASSIV [Walters, 2019], Boehringer Ingelheim's BIClaim [Lessel et al., 2009], Eli Lilly's Proximal Collection [Nicolaou et al., 2016], or Pfizer Global Virtual Library [PGVL] [Hu et al., 2012]).
More than a decade ago, we launched a similar project on the generation of a virtual compound database based on experimentally validated synthetic accessibility (the so-called REAL database, where REAL stands for REadily AccessibLe) (Shivanyuk et al., 2007). The main idea of this project follows the “forward synthetic analysis” concept described above: the available building blocks with validated reactivity are transformed into synthons with denoted reactivity features, which are then subjected to virtual coupling according to well-established reactions and exclusion rules based on the reactivity features (Figure 1). The database has grown over the years and has reached 1.2 billion compounds with a 3- to 4-week synthesis time and ca. 85% synthesis success rate (i.e., the fraction of experiments that produced the target compound among all the experiments performed) (Enamine REAL compounds, 2020). Recently, its utility in combination with virtual screening was confirmed by the discovery of highly potent AmpC β-lactamase (AmpC) inhibitors, D4 dopamine receptor ligands (Lyu et al., 2019), and Kelch-like ECH-associated protein 1 (KEAP1) inhibitors (Gorgulla et al., 2020).
Figure 1

A General Principle of the REAL Database Generation Using One-Step Two-Component Reactions

Further extension of this concept led to the development of the REAL Space, a searchable chemical space that is not typically stored as an enumerated database but generated upon query through chemoinformatics software (Klingler et al., 2019). This feature tree-based (Rarey and Stahl, 2001; Boehm et al., 2008) engine allowed processing of very large datasets, currently reaching 13 billion molecules. In addition, it allowed more complex reaction sequences to be considered as compared with those shown in Figure 1. In this work, we describe our approach to the generation of an ultra-large, multibillion chemical space of readily accessible compounds, which is based on one-pot parallel reactions involving at least three building blocks (Figure 2).
Figure 2

An Approach to the Generation of Ultra-large Chemical Space Described in this Work


Results and Discussion

Validation of Parallel Reactions

To demonstrate the principles of the chemical space generation, we selected five two- or three-step three-component reactions shown in Scheme 1. In most cases, modification of N-Boc-monoprotected diamines (building blocks 1) was envisaged, i.e., acylation – deprotection – acylation (reaction A), acylation – deprotection – arylation (B), acylation – deprotection – alkylation (C), and arylation – deprotection – acylation (D) sequences. In addition, the acylation – copper-catalyzed azide-alkyne click reaction sequence (E) involving either amino azides (2) or azido acids (3) was studied. Starting from the available set of bifunctional building blocks 1–3 and capping reagents 4–10 (typically with validated reactivity in the corresponding one-step transformations), 5 × 400 members of the libraries 11–15 were generated by random selection and virtual coupling of the corresponding synthons and then subjected to parallel synthesis.
Scheme 1

Parallel Reaction Sequences Studied in This Work.

See also Tables S1 and S2, Figures S1 and S2.

The results of these validation experiments are shown in Table 1. Methods A and D worked well and gave the target products with 77% and 81% synthesis success rates and 44% and 38% average yields (over the successful experiments), respectively. The two-step reaction sequence E was even more efficient (80% synthesis success rate, 51% average yield). In contrast, methods B and C (acylation – deprotection – arylation/alkylation) showed lower success rates (60% and 53%, respectively); the corresponding library members 12 and 13 were obtained with 43% and 31% average yield, respectively. Analysis of the crude reaction mixtures showed that competitive arylation/alkylation of N-hydroxybenzotriazole (formed from HATU, a coupling reagent from the acylation step) might be a major problem lowering the efficiency of the latter reaction sequences.
Table 1

Validation Experiments for the Parallel Synthesis of Libraries 11–15

# | Method | Conditions | Library | Success Rate, % [a] | Average Yield, % (All Experiments) | Average Yield, % (Successful Experiments)
1 | A | 1. HATU, i-Pr2NEt, DMSO, rt, 16 h; 2. CF3COOH, i-Pr3SiH, H2O, rt, 6 h; 3. HATU, i-Pr2NEt, DMSO, rt, 16 h | 11 | 77 | 34 | 44
2 | B | 1-2. Same as for A; 3. i-Pr2NEt, NMP [b], 100°C, 16 h | 12 | 60 | 26 | 43
3 | C | 1-2. Same as for A; 3. i-Pr2NEt, DMF, 80°C, 16 h | 13 | 53 | 16 | 31
4 | D | 1. i-Pr2NEt, NMP [b], 100°C, 16 h; 2-3. Same as for A | 14 | 81 | 30 | 38
5 | E | 1. HATU, i-Pr2NEt, DMF, rt, 16 h; 2. i-Pr2NEt, Cu(OAc)2, 80°C, 16 h | 15 | 80 | 41 | 51

See also Tables S1 and S2, Figures S1 and S2.

[a] Fraction of 400 experiments that allowed for the preparation of the target product.

[b] NMP, N-methyl-2-pyrrolidone.
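The two yield columns of Table 1 appear to be linked through the success rate. The sketch below checks this under one assumption that is ours and not stated in the source: every failed experiment is counted as 0% yield, so that the average over all experiments equals the success rate times the average over the successful ones. The function name is illustrative.

```python
# Values transcribed from Table 1:
# method -> (success rate %, avg yield over all experiments %,
#            avg yield over successful experiments %)
TABLE_1 = {
    "A": (77, 34, 44),
    "B": (60, 26, 43),
    "C": (53, 16, 31),
    "D": (81, 30, 38),
    "E": (80, 41, 51),
}


def implied_overall_yield(success_rate_pct, yield_successful_pct):
    """Overall average yield, ASSUMING each failed experiment contributes 0%."""
    return success_rate_pct / 100.0 * yield_successful_pct


# Largest gap (in percentage points) between implied and tabulated values.
max_gap = max(
    abs(implied_overall_yield(rate, ok) - overall)
    for rate, overall, ok in TABLE_1.values()
)

if __name__ == "__main__":
    for method, (rate, overall, ok) in TABLE_1.items():
        print(method, round(implied_overall_yield(rate, ok), 1), overall)
```

Under this assumption, the implied and tabulated values agree to within one percentage point for all five methods, which is what one would expect from rounded inputs.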

Taking into account the results described above, as well as the acceptable synthesis success rate for the REAL Database and the REAL Space (around 80%), only methods A, D, and E can be used to generate the ultra-large chemical space in further steps of this work. Methods B and C require further optimization before they can be incorporated into the toolbox of the studied strategy; they might still be applicable to library synthesis but with lower confidence. In addition, success rates were analyzed for each of the reagents 1–3 to identify those demonstrating poor efficiency. Owing to the limited size of the dataset, only the building blocks for which at least 10 experiments were performed were taken into account. Figure 3 shows examples of reagents with both excellent and low reactivity in the reaction sequences studied. An obvious reason for the poor efficiency observed for compounds 1{256} and 3{5} is steric hindrance. Therefore, building blocks 1{256} and 3{5} were excluded from the further generation of the REAL chemical space.
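The per-reagent analysis described above can be sketched as a simple aggregation. The following is a minimal illustration, not the authors' code: the function name, the record layout, and the toy data are hypothetical, and only the threshold of 10 experiments comes from the text.

```python
from collections import defaultdict

MIN_EXPERIMENTS = 10  # threshold stated in the text


def success_rates(experiments, min_experiments=MIN_EXPERIMENTS):
    """Return {building_block: success fraction} for blocks with enough data.

    `experiments` is an iterable of (building_block_id, succeeded) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # id -> [successes, total]
    for block_id, succeeded in experiments:
        counts[block_id][1] += 1
        if succeeded:
            counts[block_id][0] += 1
    return {
        block_id: ok / total
        for block_id, (ok, total) in counts.items()
        if total >= min_experiments
    }


# Hypothetical toy data, not taken from the study.
toy = (
    [("bb_good", True)] * 11 + [("bb_good", False)] * 1
    + [("bb_poor", True)] * 2 + [("bb_poor", False)] * 10
    + [("bb_rare", False)] * 3  # fewer than 10 experiments: not assessed
)
rates = success_rates(toy)
```

Building blocks with too few experiments are left unassessed rather than rejected, which mirrors the cautious treatment of the limited dataset described in the text.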
Figure 3

Examples of Reagents 1–3 Showing Excellent and Poor Efficiency in the Methods Studied (Relative Configurations are Shown)


Generation of the Chemical Space

First, building blocks 1–5 and 8–10 necessary for reaction sequences A and E were transformed into synthons ready for virtual coupling (see Figure 4 and Scheme 2). Of building blocks 4, 5, and 8–10, only those having validated reactivity in the corresponding one-step parallel syntheses were taken into consideration. In addition, molecular weight cut-offs were applied for 4 and 5. For the bifunctional building blocks 1–3, visual inspection was also applied in addition to the results obtained from the preliminary tests described above. Apart from the SMILES representation (Weininger, 1988), reaction ID, and the role in the reaction sequence, reactivity features for the exclusion rules were recorded for each synthon, denoting steric hindrance around the corresponding functional groups. Again, for building blocks 4 and 5, these reactivity features were taken from the available statistical data for the one-step parallel reactions, whereas for monoprotected diamines 1, they were assigned manually for each of the functional groups after visual inspection. It should be pointed out that the methodology does not involve quantitative reactivity measures; instead, binary (“yes/no”) qualitative reactivity features are introduced for each synthon. For reaction sequence E, the reactivity features related to the steric factor were not taken into account. Although method D is fully suitable for the REAL Space generation, it was not included in the study at this point; the corresponding synthons are currently under development.
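The synthon record described above (SMILES, reaction ID, role, and binary reactivity features) could be represented roughly as follows. This is only a sketch: the in-house format is not public, so the field names, the role labels, the example SMILES strings, and the choice of `[U]` and `[Np]` as dummy atoms are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Synthon:
    smiles: str            # SMILES with dummy atoms marking the variation points
    reaction_id: str       # reaction sequence the synthon belongs to
    role: str              # position of the synthon in the sequence
    reactive: bool = True  # overall yes/no reactivity feature
    # One yes/no steric-hindrance flag per functional group (variation point).
    hindered: Tuple[bool, ...] = (False,)


# Hypothetical capping synthon with a single variation point.
cap = Synthon(smiles="CC(=O)[U]", reaction_id="A", role="first_acylation")

# Hypothetical bifunctional core: two variation points, the second one hindered.
core = Synthon(
    smiles="[U]NCC(C)(C)N[Np]",
    reaction_id="A",
    role="core",
    hindered=(False, True),
)
```

Keeping the features as plain booleans reflects the point made in the text that reactivity is recorded qualitatively rather than as a numeric score.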
Figure 4

Examples of Synthons Generated from Reagents 1, 2, 4, 5, 8, and 10 (in the Corresponding SMILES Representations, Uncommon [“Dummy”] Atoms are Used Instead of the Colored Asterisks [∗] to Denote Different Types of the Variation Points)

Scheme 2

Virtual Coupling of the Synthons Shown in Figure 4 (the Variation Points [∗] Are Connected according to Their Types)

As a result, a total of 15,154 and 46,474 synthons were generated for reaction sequences A and E, respectively (Table 2). Further processing of these synthons followed the workflow shown in Figure 5. The workflow included virtual coupling, application of the exclusion filters (addressing the reactivity features), and duplicate removal (performed with the InChI key representations [Heller et al., 2015] to increase the performance). The synthons with a negative overall reactivity feature were excluded from the process prior to the coupling. As for the steric factor, combinations of synthons in which both partners had negative features were excluded at the corresponding step. Table 3 summarizes the generation of the multibillion parts of the chemical space according to methods A and E, as well as the numbers of readily accessible compounds that could be achieved.
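The three stages of the workflow (virtual coupling, exclusion filters, duplicate removal) can be illustrated with a toy model. This is a sketch under stated assumptions, not the in-house implementation: synthons are reduced to `(name, reactive, hindered)` tuples, every core is assumed to be symmetric so that swapping the two caps gives the same product, a single hindrance flag is applied to both sites of a core, and a sorted tuple of names stands in for the InChI key of the coupled structure.

```python
from itertools import product

# Each synthon: (name, reactive, hindered). All names are hypothetical.
cores = [("core1", True, False), ("core2", True, True), ("core3", False, False)]
caps = [("capA", True, False), ("capB", True, True)]


def usable(synthons):
    """Drop synthons with a negative overall reactivity feature (pre-coupling)."""
    return [s for s in synthons if s[1]]


def passes_steric_rule(core, cap1, cap2):
    """Exclude a combination when the core AND one of its caps are hindered."""
    return not (core[2] and (cap1[2] or cap2[2]))


def product_key(core, cap1, cap2):
    """Stand-in for a canonical identifier such as an InChI key."""
    return (core[0],) + tuple(sorted((cap1[0], cap2[0])))


def generate(cores, caps):
    ok_cores, ok_caps = usable(cores), usable(caps)
    coupled = list(product(ok_cores, ok_caps, ok_caps))        # virtual coupling
    filtered = [c for c in coupled if passes_steric_rule(*c)]  # exclusion filters
    unique = {product_key(*c) for c in filtered}               # duplicate removal
    return len(coupled), len(filtered), len(unique)


counts = generate(cores, caps)  # (after coupling, after filters, after dedup)
```

The toy funnel shrinks at each stage in the same way as the counts reported in Table 3; in a real pipeline the key would be computed from the coupled structure with a cheminformatics toolkit.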
Table 2

Number of Various Synthon Types Generated for Reaction Sequences A and E

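# | Method | Reagents | No. of Synthons: No Reactivity Features | No. of Synthons: With Steric Features | No. of Synthons: Total
1 | A | 1 | 467 | 196 [a] | 663
2 | A | 4 | 6,706 | 1,271 | 7,977
3 | A | 5 | 5,451 | 1,063 | 6,514
4 | E | 2 | 41 | 0 | 41
5 | E | 3 | 52 | 0 | 52
6 | E | 8 | 17,944 | 550 | 18,494
7 | E | 9 | 26,434 | 646 | 27,080
8 | E | 10 | 807 | 0 | 807

[a] With steric hindrance at the free amino group (103), the protected amino group (82), or both (11).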

Figure 5

The Workflow of the Multibillion Chemical Space Generation

Table 3

Results of the Multibillion Chemical Space Generation

# | Method | No. of Synthons | Library Members after Virtual Coupling | Library Members after Exclusion Filters | Library Members after Duplicate Removal
1 | A | 15,154 | 34,450,924,014 | 32,733,348,058 | 27,297,397,644
2 | E | 46,474 | 1,748,296,098 | 1,748,296,098 | 1,563,752,616
3 | Total | 60,431 | 36,199,220,112 | 34,481,644,156 | 28,861,150,260
As is obvious from Table 3, the number of “core” (bifunctional) building blocks and the sufficient and comparable accessibility of both types of “capping” reagents are the key parameters affecting the size of the generated chemical space. Even with reasonably reduced sets of “capping” reagents, multibillion numbers are easily achieved (as in the case of method A). For method E, both the limited availability of the azide-containing bifunctional building blocks 2 and 3 and the very different accessibility of reagents 8/9 and 10 are responsible for the fact that the resulting chemical space reaches only about 1.6 billion structures. Nevertheless, even with all these limitations, we could generate a chemical space containing nearly 29 billion readily accessible compounds using only two reaction sequences. As mentioned in the Introduction, this chemical space can be accessed either directly as a pre-enumerated database or through a feature tree-based search engine that generates the corresponding structures upon a query. The current versions of the REAL Database and REAL Space include 0.27 and 9.9 billion members, respectively, obtained according to methods A or E (since additional cut-offs on the physico-chemical and structural properties, as well as reagent availability, were applied).
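The virtual coupling counts in Table 3 can be reproduced from the synthon totals in Table 2 by simple multiplication. The sketch below does this for both methods. For method E, the pairing of cores with capping sets (2 with 8, 3 with 9, each combined with 10) is our assumption; it is not spelled out in the text, but it is the pairing that reproduces the reported number.

```python
# Total synthon counts transcribed from Table 2 (keyed by reagent number).
N = {"1": 663, "4": 7_977, "5": 6_514,
     "2": 41, "3": 52, "8": 18_494, "9": 27_080, "10": 807}

# Method A: one core (1) combined with one synthon from each capping set (4, 5).
size_a = N["1"] * N["4"] * N["5"]

# Method E: ASSUMED pairing of cores with capping sets (2 with 8, 3 with 9),
# each combination completed by one synthon from set 10.
size_e = N["2"] * N["8"] * N["10"] + N["3"] * N["9"] * N["10"]

total = size_a + size_e

if __name__ == "__main__":
    print(f"{size_a:,}")  # 34,450,924,014
    print(f"{size_e:,}")  # 1,748,296,098
    print(f"{total:,}")   # 36,199,220,112
```

The products show why the size of the core set matters so much: method E has far more capping synthons than method A, but only 41 and 52 cores against 663.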

Predicted Physico-Chemical Descriptors

Over the last decades, it has been stressed that the physico-chemical properties of compounds are important to drug discovery since they have a critical impact on the attrition rate of drug candidates (Grygorenko et al., 2020). It is therefore important to understand the capabilities of the generated chemical space in terms of providing the so-called drug-like or lead-like compounds (Nadin et al., 2012). To address this point, we calculated physico-chemical descriptors of common interest to medicinal chemistry, i.e., molecular weight (MW), the logarithm of the octanol-water partition coefficient (sLogP) (Wildman and Crippen, 1999), hydrogen bond acceptor/donor counts (HAcc/HDon), topological polar surface area (TPSA), rotatable bond count (RotB), and the fraction of sp3-hybridized carbon atoms (Fsp3). As follows from Figure 6 and Tables 4 and 5, the part of the chemical space generated by method A complies well with the classical drug-likeness criteria (i.e., Lipinski and Veber rules), whereas method E tends to provide heavier, more lipophilic compounds with higher hydrogen bond acceptor count, polar surface area, and rotatable bond count, an obvious consequence of the less stringent pre-selection of starting building blocks 8–10. Of course, the percentage of compliant chemical space members goes down rapidly when more stringent lead-likeness criteria are applied. Nevertheless, a considerable number of compounds remain even after application of the most rigorous Churcher's rules (21.2, 0.95, and 22.1 million members for method A, method E, and in total, respectively). Moreover, the significant fraction of readily accessible “beyond-Ro5” members can sometimes even be considered advantageous, taking into account the recently increased interest in such compounds in medicinal chemistry (DeGoey et al., 2018).
Figure 6

Distribution of Physico-Chemical Descriptors Predicted for the Generated Chemical Space and Approved Drugs

See also Table S3.

Table 4

Average Values of Physico-Chemical Descriptors Predicted for the Generated Chemical Space and Approved Drugs

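# | Method | MW | sLogP | HAcc | HDon | TPSA, Å2 | RotB | Fsp3
1 | A | 440 | 2.61 | 5.3 | 1.5 | 93.7 | 6.4 | 0.56
2 | E | 502 | 3.15 | 7.8 | 1.3 | 112.1 | 8.3 | 0.51
3 | Total | 444 | 2.64 | 5.4 | 1.5 | 94.6 | 6.5 | 0.55
4 | DrugBank [a] | 395 | 2.05 | 5.1 | 2.4 | 96.9 | 6.4 | 0.47

[a] Data for 2,470 drugs deposited in DrugBank (as of September 2020).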

Table 5

Fractions of the Generated Chemical Space (%) Compliant with the Drug- and Lead-likeness Rules

# | Method | Rule of 5 [a] | + Veber's Rules [b] | Rule of 4.5 [c] | Rule of 4 [d] | Churcher's Rules [e,f]
1 | A | 89.1 | 82.6 | 56.9 | 16.9 | 0.08 (21,167,934)
2 | E | 48.4 | 40.4 | 24.3 | 7.4 | 0.06 (952,402)
3 | Total | 86.9 | 80.3 | 55.1 | 16.5 | 0.08 (22,120,336)

[a] MW < 500, LogP < 5, HAcc ≤ 10, HDon ≤ 5 (Lipinski et al., 1997).

[b] RotB ≤ 10, TPSA < 140 (Veber et al., 2002).

[c] MW < 450, LogP < 4.5 (Oprea et al., 2001).

[d] MW < 400, LogP < 4 (Hann and Oprea, 2004).

[e] MW 200 … 350, LogP −1 … 3 (Nadin et al., 2012).

[f] Absolute numbers of the library members are given in brackets.
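The thresholds listed in the footnotes to Table 5 translate directly into a set of boolean checks. The sketch below evaluates each rule independently for one compound; whether the stricter rules were applied on top of the earlier ones in the original analysis is not stated, and treating the Churcher ranges as inclusive is likewise our assumption. The function name and the example descriptor values are hypothetical.

```python
def rule_flags(mw, logp, hacc, hdon, rotb, tpsa):
    """Evaluate each rule from the Table 5 footnotes independently."""
    ro5 = mw < 500 and logp < 5 and hacc <= 10 and hdon <= 5
    veber = rotb <= 10 and tpsa < 140
    return {
        "rule_of_5": ro5,
        "rule_of_5_plus_veber": ro5 and veber,
        "rule_of_4.5": mw < 450 and logp < 4.5,
        "rule_of_4": mw < 400 and logp < 4,
        # Range bounds are ASSUMED to be inclusive.
        "churcher": 200 <= mw <= 350 and -1 <= logp <= 3,
    }


# Hypothetical compounds, not taken from the generated space.
lead_like = rule_flags(mw=340.0, logp=2.1, hacc=5, hdon=1, rotb=6, tpsa=90.0)
drug_like = rule_flags(mw=480.0, logp=4.7, hacc=7, hdon=2, rotb=6, tpsa=90.0)
```

The second example passes the rule of five and Veber's rules but fails all three lead-likeness criteria, which illustrates why the compliant fraction in Table 5 drops so quickly from left to right.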

Comparison of the obtained results with the physico-chemical properties of 2,470 approved drugs deposited in the DrugBank database (Wishart et al., 2018) showed that compounds produced by our approach tend to be heavier and slightly more lipophilic and have a somewhat lower hydrogen bond donor count, which is an obvious consequence of the chemical methodology used (Table 4 and Figure 6). They are also more sp3-enriched. All these features are in line with recent trends in drug discovery (good or bad) related to the increased molecular complexity of new drug molecules (Grygorenko et al., 2020). In contrast, the average values of topological polar surface area, rotatable bond count, and hydrogen bond acceptor count for the library members are more or less in line with those of the known drugs. A short study was also performed to assess the relationship between the distribution of the synthons and the space covered by the generated databases. In particular, random selections containing from 5% to 95% (in 5% steps) of the initial structures were made from the sets of synthons used for method A, and the corresponding library members were selected from the final database. As might be expected, the database size followed a cubic function of the synthon subset size (Figure 7).
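The cubic dependence follows from the three-component design: if each of the three synthon pools keeps a fraction f of its members, the number of combinations scales as f × f × f. The sketch below illustrates this with the three method A pool sizes from Table 2, used here only as an example. It counts raw coupling products and ignores the exclusion filters and duplicate removal, so it is a simplification of the random-selection experiment described in the text.

```python
# Total synthon counts of the three method A pools (Table 2), used as an example.
POOLS = (663, 7_977, 6_514)


def coupled_size(pools, fraction):
    """Coupling products when each pool keeps `fraction` of its synthons."""
    size = 1
    for n in pools:
        size *= round(n * fraction)
    return size


full = coupled_size(POOLS, 1.0)

# (percentage, ratio to the full space, ideal cubic value f**3)
table = [
    (pct, coupled_size(POOLS, pct / 100) / full, (pct / 100) ** 3)
    for pct in range(5, 100, 5)
]

if __name__ == "__main__":
    for pct, ratio, cube in table:
        print(f"{pct:3d}%  ratio={ratio:.5f}  f^3={cube:.5f}")
```

Small deviations from the ideal cube come only from rounding the subset sizes to whole synthons.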
Figure 7

Relationship between the Size of the Generated Databases and the Size of the Synthon Subsets

Obtained by random selections from the initial synthon set for Method A; average from three independent selections; see also Table S4.

In addition, other synthon subsets were prepared by applying molecular weight cut-offs of 100–275 (in steps of 25) to the initial synthon set used for method A, and the corresponding databases were generated. Owing to the properties of the initial synthon set, the size of the resulting databases increased dramatically for the cut-off range 100–200 (the so-called rule-of-two for building blocks [Goldberg et al., 2015]) and reached its maximum value at MW = 275 (the general cut-off used in the design of the initial set) (Figure 8A). As expected, the distribution of physico-chemical properties (i.e., MW and sLogP) within the resulting virtual libraries correlated with the increase in the corresponding values for the synthons (Figure 8B).
Figure 8

Properties of the Generated Chemical Space as a Function of the Molecular Weight Cut-offs Applied to the Initial Synthon Sets for Method A

(A) The size of the generated databases. (B) Distribution of physico-chemical descriptors (MW and sLogP) for the generated chemical space.

See also Table S5.

One might argue that the technology described in the current work is mostly based on very simple chemical transformations; therefore, its capability of producing novel, complex, and diverse molecules might be questionable. Nevertheless, a recent analysis by AstraZeneca scientists shows that this is not the case: even using only the amide formation reaction, very good results can be obtained in early drug discovery provided that sufficient access to the corresponding building blocks is possible (Tomberg and Boström, 2020). The physico-chemical features of the chemical space generated in this work are similar to those of DNA-encoded libraries (Kunig et al., 2018). In both cases, this is related to the fact that final library members are constructed from at least three building blocks, which increases the lower MW limit. In our opinion, the huge size of both DNA-encoded libraries and multibillion chemical spaces like the one described herein can be considered as compensation for the increased molecular complexity (provided that efficient in vitro or in silico screening technologies are available to mine these ultra-large libraries). The success stories available in the literature for both technologies (Goodnow et al., 2017; Kunig et al., 2018; Lyu et al., 2019; Gorgulla et al., 2020) can serve as a justification for the above hypothesis.

Conclusions

Combined with modern virtual screening tools, ultra-large libraries of readily accessible (“REAL”) compounds have proven their utility for the identification of highly potent hits against various biological targets. Herein, it is shown that a nearly 29-billion chemical space covering such synthetically feasible representatives can be easily generated using two- or three-step three-component reaction sequences and available starting materials with chemical reactivity validated in one-step parallel reactions. Only the methods with at least ∼80% synthesis success rate (e.g., acylation – deprotection – acylation of monoprotected diamines, as well as amide formation – click reaction with amino azides or azido acids) are acceptable to generate the target chemical space with sufficient synthetic confidence. It is shown that the diversity of the “core” (bifunctional) building blocks, as well as nearly equal (but sufficient) accessibility of the “capping” reagents, are essential to obtain the largest numbers of library members. Analysis of physico-chemical descriptors reveals that the generated chemical space contains large fractions of both drug-like and “beyond rule-of-five” members, whereas the strictest lead-likeness criteria (i.e., Churcher's rules) are met by a minor fraction (which still exceeds 22 million compounds). In our opinion, a combination of ultra-large REAL libraries and modern virtual screening tools is similar to DNA-encoded libraries (which have gained momentum in recent years) in terms of physico-chemical properties and chemical space coverage. The approach proposed in this work is a substantial extension of the previous methodology, which was based mainly on two-component parallel reactions. It is also distinct from recent approaches relying heavily on artificial intelligence (Hoffmann and Gastreich, 2019) since it relies on a very robust and straightforward algorithm (Table 6).
Table 6

Selected Approaches to Generate (Ultra-)large Virtual Chemical Space

Feature | Approach Described in This Work | Previous Feasibility-Based Approaches [a] | Recent AI-Based Approaches [b]
Virtual chemical space | Multibillion (over 3 × 10^10) | Large (~10^9) | Varied but typically less than 10^9
Synthetic methods | Experimentally validated three-component two- or three-step reaction sequences | Experimentally validated two-component one-step reactions (mostly) | Various; typically based on the literature data (not always validated experimentally)
Algorithm | Very straightforward | Very straightforward | Sophisticated
Synthetic feasibility | Average value for each method or synthon, described as average synthesis success rate | Average value for each method or synthon, described as average synthesis success rate | Varied; from unknown to predicted for each particular member
Building block reactivity assessment | Semi-qualitative; by a chemical expert aided by a computer | Semi-qualitative; by a chemical expert aided by a computer | Typically quantitative; by AI

[a] The previous version of our REAL methodology is referred to here; much larger datasets were also generated internally within big pharma companies (Hoffmann and Gastreich, 2019).

[b] The subject was reviewed and critically assessed in a number of recent publications (Schneider, 2018; Schwaller and Laino, 2019; Brown et al., 2020; Lemonick, 2020).


Limitations of the Study

Possible limitations of the study include: (1) difficulties with handling the full generated chemical space owing to current hardware capabilities; this can be overcome by pre-selecting a part of it according to some criteria (like molecular weight) or by using special search engines like those mentioned in the Introduction; (2) a ca. 20% probability that a particular library member will not be produced by the proposed synthetic methodology; a possible solution is to make a larger selection of the library members of interest (e.g., at least 100–200 representatives) to be synthesized with ca. 80% confidence; (3) the impossibility of providing a more or less precise synthetic feasibility for a particular compound; only an average value can be predicted for the method as a whole; (4) the dynamic nature of the generated space due to changes in the availability of the starting materials or in the information on their reactivity; this can be addressed by regular periodic updates, as well as by applying cut-offs for the amounts of the stock reagents.
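Point (2) can be made more concrete with a simple binomial model. The sketch below is an illustration only: it assumes that each synthesis succeeds independently with the same probability of 0.8, which the source does not claim (real success rates vary between methods and building blocks). The function names are ours.

```python
from math import comb


def prob_at_least(k, n, p=0.8):
    """P(at least k successes in n attempts), ASSUMING independent attempts."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def expected_successes(n, p=0.8):
    """Expected number of compounds obtained from n ordered compounds."""
    return n * p


def orders_needed(k, confidence=0.95, p=0.8):
    """Smallest n for which at least k compounds arrive with the given probability."""
    n = k
    while prob_at_least(k, n, p) < confidence:
        n += 1
    return n


if __name__ == "__main__":
    print(expected_successes(200))    # expected compounds from 200 ordered
    print(prob_at_least(150, 200))    # chance of receiving at least 150 of them
    print(orders_needed(100))         # orders needed for 100 compounds at 95%
```

Under this model, a selection of 200 compounds is expected to deliver about 160, which is why ordering a larger set than strictly needed is a practical way to absorb the failure rate.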

Resource Availability

Lead Contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Dr. Yurii S. Moroz, ysmoroz@gmail.com.

Materials Availability

Compound library members generated in this study will be made available on request, but we shall require payment and/or a completed Materials Transfer Agreement if there is potential for commercial application.

Data and Code Availability

The complete lists of reagents used to construct the chemical space supporting the current study have not been deposited in a public repository owing to the company's policy but are available from the corresponding author on request. There are restrictions on the availability of the in-house code and the synthon lists with the reactivity features that have been used to generate the chemical space owing to commercial confidentiality reasons.

Methods

All methods can be found in the accompanying Transparent Methods supplemental file.
  28 in total

Review 1.  Pursuing the leadlikeness concept in pharmaceutical research.

Authors:  Mike M Hann; Tudor I Oprea
Journal:  Curr Opin Chem Biol       Date:  2004-06       Impact factor: 8.822

Review 2.  Lead-oriented synthesis: a new opportunity for synthetic chemistry.

Authors:  Alan Nadin; Channa Hattotuwagama; Ian Churcher
Journal:  Angew Chem Int Ed Engl       Date:  2012-01-03       Impact factor: 15.336

3.  Chemical space and biology.

Authors:  Christopher M Dobson
Journal:  Nature       Date:  2004-12-16       Impact factor: 49.962

4.  Searching Fragment Spaces with feature trees.

Authors:  Uta Lessel; Bernd Wellenzohn; Markus Lilienthal; Holger Claussen
Journal:  J Chem Inf Model       Date:  2009-02       Impact factor: 4.956

Review 5.  The Symbiotic Relationship Between Drug Discovery and Organic Chemistry.

Authors:  Oleksandr O Grygorenko; Dmitriy M Volochnyuk; Sergey V Ryabukhin; Duncan B Judd
Journal:  Chemistry       Date:  2019-10-30       Impact factor: 5.236

6.  Pfizer Global Virtual Library (PGVL): a chemistry design tool powered by experimentally validated parallel synthesis information.

Authors:  Qiyue Hu; Zhengwei Peng; Scott C Sutton; Jim Na; Jaroslav Kostrowicki; Bo Yang; Thomas Thacher; Xianjun Kong; Sarathy Mattaparti; Joe Zhongxiang Zhou; Javier Gonzalez; Michele Ramirez-Weinhouse; Atsuo Kuki
Journal:  ACS Comb Sci       Date:  2012-10-30       Impact factor: 3.784

Review 7.  Designing novel building blocks is an overlooked strategy to improve compound quality.

Authors:  Frederick W Goldberg; Jason G Kettle; Thierry Kogej; Matthew W D Perry; Nick P Tomkinson
Journal:  Drug Discov Today       Date:  2014-10-02       Impact factor: 7.851

8.  Ultra-large library docking for discovering new chemotypes.

Authors:  Jiankun Lyu; Sheng Wang; Trent E Balius; Isha Singh; Anat Levit; Yurii S Moroz; Matthew J O'Meara; Tao Che; Enkhjargal Algaa; Kateryna Tolmachova; Andrey A Tolmachev; Brian K Shoichet; Bryan L Roth; John J Irwin
Journal:  Nature       Date:  2019-02-06       Impact factor: 49.962

9.  SAR by Space: Enriching Hit Sets from the Chemical Space.

Authors:  Franca-Maria Klingler; Marcus Gastreich; Oleksandr O Grygorenko; Olena Savych; Petro Borysko; Anastasia Griniukova; Kateryna E Gubina; Christian Lemmen; Yurii S Moroz
Journal:  Molecules       Date:  2019-08-26       Impact factor: 4.411

10.  An open-source drug discovery platform enables ultra-large virtual screens.

Authors:  Andras Boeszoermenyi; Zi-Fu Wang; Christoph Gorgulla; Patrick D Fischer; Paul W Coote; Krishna M Padmanabha Das; Yehor S Malets; Dmytro S Radchenko; Yurii S Moroz; David A Scott; Konstantin Fackeldey; Moritz Hoffmann; Iryna Iavniuk; Gerhard Wagner; Haribabu Arthanari
Journal:  Nature       Date:  2020-03-09       Impact factor: 49.962
