Sebastjan Kralj, Marko Jukič, Urban Bren.
Abstract
Since December 2019, the new SARS-CoV-2-related COVID-19 disease has caused a global pandemic and shut down public life worldwide. Several proteins have emerged as potential therapeutic targets for drug development, and we set out to review the commercially available and marketed SARS-CoV-2-targeted libraries ready for high-throughput virtual screening (HTVS). We evaluated the SARS-CoV-2-targeted, protease-inhibitor-focused and protein–protein-interaction-inhibitor-focused libraries to gain a better understanding of how these libraries were designed. The most common were ligand- and structure-based approaches, along with various filtering steps using molecular descriptors. Often, these methods were combined to obtain the final library. We recognized the abundance of targeted libraries on offer, complemented by the inclusion of analytical data; however, serious concerns had to be raised. Namely, vendors do not provide information on the library design or references to the primary literature. Few references to active compounds were provided for the ligand-based design; usually only protein classes or a general panel of targets were listed, along with a general reference to the methods, such as molecular docking for the structure-based design. No receptor data, docking protocols or even references to the applied molecular docking software (or other HTVS software), and no pharmacophore or filter design details were given. No detailed functional group or chemical space analyses were reported, and no specific orientation of the libraries toward the design of covalent or noncovalent inhibitors could be observed. All libraries contained pan-assay interference compounds (PAINS), compounds flagged by the rapid elimination of swill (REOS) filters and aggregators, and all focused on the drug-like model, with the majority of compounds possessing a molecular mass around 500 g/mol. These facts do not bode well for the use of the reviewed libraries in drug design and are issues for commercial drug companies to focus on and improve.
Keywords: computer-aided drug design; focused libraries; high-throughput virtual screening; in silico drug design; targeted libraries; virtual screening
Year: 2021 PMID: 35008818 PMCID: PMC8745317 DOI: 10.3390/ijms23010393
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Computer-aided drug design process chart used to obtain lead compounds (based on Sliwoski et al. [5]).
Figure 2. Workflow of an efficient library preparation for medicinal chemistry.
Essential filters for efficient library design in medicinal chemistry.
| Name | Features/Cutoffs | Developer/ |
|---|---|---|
| PAINS | Removal of frequent hitters (promiscuous compounds) in HTS assays | Cancer Therapeutics-CRC P/L [ |
| REOS | Set of rules and functional group filters, dubbed REOS (rapid elimination of swill), for the removal of problematic structures | Vertex [ |
| Aggregators | Tanimoto coefficient similarity search to a database of known aggregators | Irwin et al. [ |
| Eli Lilly rules | Hybrid method of physicochemical and functional group filters for identification of promiscuous compounds | Bruns at Eli Lilly [ |
| Lipinski (Rule of 5) | A set of rules for drug-likeness: | Lipinski at Pfizer [ |
| Rule of 3 | A set of cutoff rules for lead-like discovery: | Congreve et al. [ |
| Rule of 4 | A set of rules derived from protein–protein interaction inhibitors: | Morelli [ |
| Oprea Lead-like and drug-like | A set of rules based on the lead-like vs. drug-like comparison: | Oprea group [ |
| Ghose | A set of rules for drug-likeness with cutoffs: | Ghose et al. [ |
| Veber | Two rules to meet the criteria for drug-likeness: | Veber et al. [ |
| Lee | Physicochemical properties of highly drug-like molecules: | Lee at Hoffmann-La Roche [ |
| van de Waterbeemd | Physicochemical properties for permeability and blood–brain barrier permeability: | van de Waterbeemd [ |
| Mozziconacci | Set of rules to filter for drug-like molecules: | Mozziconacci [ |
| Fichert | Rules for permeability based on a small drug set: | Fichert et al. [ |
| Muegge method | Bioavailability prediction rules dubbed Muegge method: | Muegge [ |
| Egan | Set of rules designed to predict bioavailability: | Egan et al. [ |
| Murcko filter | Set of rules derived from the rule of 5 coupled with 1D and 2D descriptor analysis to determine central nervous system activity. | Ajay et al. [ |
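The aggregator filter listed in the table relies on a Tanimoto similarity search against known aggregators. A minimal sketch of that comparison, assuming fingerprints are represented as sets of on-bit indices; the function names, the toy fingerprints and the 0.7 cutoff are illustrative assumptions, not values taken from the filter reviewed here:

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient |A & B| / |A | B| for two sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def flag_aggregator_like(query: set, known_aggregators: list, cutoff: float = 0.7) -> bool:
    """True if the query reaches the cutoff similarity to any known aggregator."""
    return any(tanimoto(query, ref) >= cutoff for ref in known_aggregators)


# Toy fingerprints (hypothetical bit indices, for illustration only)
known = [{1, 2, 3, 4, 5}, {10, 11, 12}]
print(tanimoto({1, 2, 3, 4}, {1, 2, 3, 4, 5}))  # 4 shared bits out of 5 in total
print(flag_aggregator_like({1, 2, 3, 4}, known))
print(flag_aggregator_like({20, 21}, known))
```

In practice the fingerprints would come from a cheminformatics toolkit, and the cutoff would need to be tuned against a reference set of aggregators.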
Figure 3. Process algorithm for generating the Enamine SARS-CoV-2-targeted molecular library.
Figure 4. Process algorithm for generating the Otava Ltd. SARS-CoV-2-targeted molecular library.
Figure 5. Process algorithm for generating the ChemBridge SARS-CoV-2-targeted molecular library.
Figure 6. Process algorithm for generating the LifeChemicals SARS-CoV-2-targeted molecular library.
Figure 7. Process algorithm for generating the TargetMol SARS-CoV-2-targeted molecular library.
Analysis of the most important descriptors for the SARS-CoV-2-targeted library.
| Database Name | No. of Compounds | MW | TPSA | SlogP | HBA | HBD | No. of Rings | No. of Rotatable Bonds |
|---|---|---|---|---|---|---|---|---|
| ChemBridge | 16,777 | 391.5 ± 62 | 80 ± 26 | 3.2 ± 1.5 | 4.9 ± 1.7 | 1.4 ± 1.0 | 4.3 ± 0.8 | 3.9 ± 1.4 |
| Enamine | 16,800 | 362 ± 61 | 79.3 ± 21 | 2.7 ± 1.2 | 4.5 ± 1.4 | 1.58 ± 0.8 | 3.1 ± 0.8 | 5.1 ± 1.8 |
| LifeChemicals | 7311 | 404 ± 75 | 84.8 ± 23 | 3.1 ± 1.4 | 5.8 ± 1.8 | 1.4 ± 0.9 | 3.6 ± 0.9 | 5.6 ± 1.9 |
| Otava | 9018 | 383 ± 56 | 77.3 ± 20 | 3.7 ± 1.0 | 5.2 ± 1.5 | 1 ± 0.8 | 3.9 ± 0.8 | 4.0 ± 1.5 |
| TargetMol | 2448 | 460 ± 211 | 110 ± 151 | 2.6 ± 4.6 | 6.6 ± 4.0 | 2.2 ± 3.2 | 3.6 ± 1.5 | 7.5 ± 4.8 |
| Joined SARS-CoV-2-Targeted Library | 52,354 | 385 ± 79 | 81 ± 40 | 3.1 ± 1.7 | 5.0 ± 1.9 | 1.4 ± 1.1 | 3.7 ± 1.0 | 4.7 ± 2.1 |
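As a consistency check on the joined row, the per-vendor means and standard deviations can be pooled by compound count. A minimal sketch, assuming the joined library is a plain concatenation of the five vendor sets (the counts sum to 52,354) and treating the reported deviations as population standard deviations; the variable and function names are illustrative:

```python
import math

# (compound count, mean MW, SD of MW) per vendor, copied from the table above
mw_stats = {
    "ChemBridge": (16777, 391.5, 62),
    "Enamine": (16800, 362, 61),
    "LifeChemicals": (7311, 404, 75),
    "Otava": (9018, 383, 56),
    "TargetMol": (2448, 460, 211),
}


def pool(stats):
    """Combine per-group (n, mean, sd) triples into overall n, mean and sd."""
    stats = list(stats)
    total = sum(n for n, _, _ in stats)
    mean = sum(n * m for n, m, _ in stats) / total
    # Total variance = within-group variance + between-group variance
    var = sum(n * (sd ** 2 + (m - mean) ** 2) for n, m, sd in stats) / total
    return total, mean, math.sqrt(var)


n, mean, sd = pool(mw_stats.values())
print(n, round(mean, 1), round(sd, 1))
```

The pooled values land close to the reported 385 ± 79; small differences are expected because the inputs are already rounded.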
Analysis of the attrition rate of various filters for the SARS-CoV-2-targeted library.
| | ChemBridge | Enamine | LifeChemicals | Otava | TargetMol |
|---|---|---|---|---|---|
| Unfiltered | 16,777 | 16,800 | 7311 | 9018 | 2448 |
| Isolated filter: | Number of filtered-out compounds (%) | | | | |
| REOS | 1160 (7%) | 4565 (27%) | 1414 (19%) | 1486 (17%) | 858 (35%) |
| PAINS | 454 (3%) | 267 (2%) | 11 (~0%) | 430 (5%) | 248 (10%) |
| Aggregators | 9053 (54%) | 6702 (40%) | 4002 (55%) | 6784 (75%) | 1445 (60%) |
| Lipinski Ro5 | 258 (2%) | 193 (1%) | 380 (5%) | 5 (~0%) | 522 (21%) |
| All filters | 10,000 (60%) | 9412 (56%) | 4937 (68%) | 7202 (80%) | 1887 (77%) |
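The percentages in this table are the removed compounds divided by the unfiltered library size. Because one compound can fail several filters, the "All filters" count is the size of the union of the isolated sets rather than their sum. A short sketch of both checks for the ChemBridge column, with illustrative names:

```python
unfiltered = 16777
removed = {"REOS": 1160, "PAINS": 454, "Aggregators": 9053, "Lipinski Ro5": 258}
removed_all = 10000


def percent(count, total):
    """Share of the library removed, rounded to a whole percent."""
    return round(100 * count / total)


for name, count in removed.items():
    print(f"{name}: {count} ({percent(count, unfiltered)}%)")

# The union of the filtered sets is bounded below by the largest single set
# and above by the sum of all sets (compounds failing several filters count once).
lower, upper = max(removed.values()), sum(removed.values())
assert lower <= removed_all <= upper
print(f"All filters: {removed_all} ({percent(removed_all, unfiltered)}%)")
```

The same bounds give a quick sanity check for any other column of the attrition tables.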
Figure 8. Compound (relative) retention after post-filtering for REOS, PAINS, Aggregators and the Lipinski rule of 5 (one rule break allowed) on SARS-CoV-2-targeted libraries.
Analysis of the most important descriptors for the protease-inhibitor-targeted library.
| Database Name | No. of Compounds | MW | TPSA | SlogP | HBA | HBD | No. of Rings | No. of Rotatable Bonds |
|---|---|---|---|---|---|---|---|---|
| ApexBio | 824 | 348 ± 181 | 96 ± 59 | 1.9 ± 2.8 | 5.0 ± 3.2 | 2.5 ± 2.0 | 2.5 ± 1.7 | 5.2 ± 4.5 |
| Asinex | 6640 | 383 ± 34 | 79 ± 18 | 2.9 ± 1.0 | 5.3 ± 1.3 | 0.9 ± 0.7 | 3.7 ± 0.6 | 4.7 ± 1.5 |
| Chemdiv | 41,801 | 406 ± 63 | 74 ± 20 | 3.6 ± 1.2 | 5.0 ± 1.6 | 1.1 ± 0.7 | 3.7 ± 0.8 | 5.4 ± 1.8 |
| Enamine | 117 | 336 ± 167 | 90 ± 58 | 2.2 ± 2 | 4.4 ± 2.6 | 2.0 ± 1.8 | 2.5 ± 1.5 | 4.5 ± 4.4 |
| LifeChemicals | 25,535 | 390 ± 70 | 81 ± 22 | 3.0 ± 1.4 | 5.3 ± 1.7 | 1.0 ± 0.7 | 3.3 ± 0.9 | 4.9 ± 1.9 |
| Otava | 8034 | 352 ± 71 | 79 ± 23 | 3.0 ± 1.2 | 4.6 ± 1.6 | 1.4 ± 0.9 | 3.2 ± 1.1 | 4.5 ± 1.8 |
| SelleckChem | 227 | 409 ± 168 | 106 ± 52 | 2.4 ± 2 | 5.45 ± 2.6 | 2.4 ± 1.7 | 2.9 ± 1.7 | 6.2 ± 4.4 |
| TargetMol | 295 | 410 ± 183 | 107 ± 60 | 2.4 ± 2.2 | 5.6 ± 3.1 | 2.5 ± 2.0 | 3.0 ± 1.8 | 5.9 ± 4.5 |
| Joined Protease Inhibitor Databases | 83,473 | 394 ± 70 | 77 ± 23 | 3.3 ± 1.3 | 5.0 ± 1.7 | 1.1 ± 0.8 | 3.5 ± 0.9 | 5.1 ± 1.9 |
Analysis of the attrition rate of various filters for the protease-inhibitor-targeted library.
| | ApexBio | Asinex | Chemdiv | Enamine | LifeChemicals | Otava | SelleckChem | TargetMol |
|---|---|---|---|---|---|---|---|---|
| Unfiltered | 824 | 6640 | 41,801 | 117 | 25,535 | 8034 | 227 | 295 |
| Isolated filter: | Number of filtered-out compounds (%) | | | | | | | |
| REOS | 397 (48%) | 273 (4%) | 2151 (5%) | 54 (46%) | 3583 (14%) | 1025 (13%) | 110 (48%) | 146 (49%) |
| PAINS | 58 (7%) | 129 (2%) | 1060 (3%) | 7 (6%) | 430 (2%) | 307 (4%) | 7 (3%) | 16 (5%) |
| Aggregators | 294 (36%) | 3254 (49%) | 29,313 (70%) | 36 (31%) | 14,059 (55%) | 4216 (52%) | 88 (39%) | 118 (40%) |
| Lipinski Ro5 | 113 (14%) | 0 (0%) | 1045 (2%) | 11 (9%) | 409 (2%) | 2 (~0%) | 37 (16%) | 55 (19%) |
| All filters | 743 (90%) | 5019 (76%) | 37,137 (89%) | 106 (91%) | 23,215 (91%) | 6987 (87%) | 208 (92%) | 272 (92%) |
Figure 9. Compound (relative) retention after post-filtering for REOS, PAINS, Aggregators and the Lipinski rule of 5 (one rule break allowed) on protease-inhibitor-targeted libraries.
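The "one rule break allowed" variant of the Lipinski filter named in the caption can be sketched as follows. The thresholds are the commonly cited rule-of-5 cutoffs (MW ≤ 500, logP ≤ 5, HBD ≤ 5, HBA ≤ 10), not values recovered from this page, and the descriptors are assumed to be precomputed; the class, function names and example compounds are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mw: float    # molecular weight, g/mol
    logp: float  # calculated logP (e.g. SlogP)
    hbd: int     # hydrogen-bond donors
    hba: int     # hydrogen-bond acceptors


def ro5_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-5 criteria are broken."""
    return sum([d.mw > 500, d.logp > 5, d.hbd > 5, d.hba > 10])


def passes_ro5(d: Descriptors, max_violations: int = 1) -> bool:
    """Keep a compound if it breaks at most `max_violations` rules."""
    return ro5_violations(d) <= max_violations


# Hypothetical descriptor values, for illustration only
library = {
    "cpd_a": Descriptors(mw=385.0, logp=3.1, hbd=1, hba=5),
    "cpd_b": Descriptors(mw=520.0, logp=4.2, hbd=2, hba=8),   # one violation
    "cpd_c": Descriptors(mw=640.0, logp=6.3, hbd=6, hba=12),  # four violations
}
kept = [name for name, d in library.items() if passes_ro5(d)]
print(kept)  # ['cpd_a', 'cpd_b']
```

Setting `max_violations=0` gives the strict form of the rule, which would remove more compounds than the counts reported in the attrition tables.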
Analysis of the most important descriptors for the protein–protein-interaction-inhibitor-targeted library.
| Database Name | No. of Compounds | MW | TPSA | SlogP | HBA | HBD | No. of Rings | No. of Rotatable Bonds |
|---|---|---|---|---|---|---|---|---|
| Asinex | 11,439 | 386 ± 53 | 70 ± 19 | 3.1 ± 1.2 | 4.9 ± 1.3 | 0.8 ± 0.6 | 3.7 ± 0.7 | 4.9 ± 1.5 |
| Chemdiv | 212,906 | 408 ± 61 | 72 ± 20 | 3.5 ± 1.3 | 5.0 ± 1.6 | 0.8 ± 0.7 | 3.9 ± 0.8 | 5.1 ± 1.9 |
| Enamine | 40,640 | 357 ± 49 | 70 ± 17 | 2.8 ± 0.9 | 4.7 ± 1.3 | 1 ± 0.7 | 3.5 ± 0.7 | 4.3 ± 1.6 |
| LifeChemicals | 36,426 | 393 ± 81 | 77 ± 22 | 3.3 ± 1.1 | 5.4 ± 1.8 | 1.0 ± 0.7 | 3.6 ± 1.0 | 5.3 ± 2.2 |
| Otava | 3849 | 437 ± 71 | 86 ± 25 | 3.8 ± 1.6 | 5.4 ± 1.6 | 1.4 ± 1.0 | 3.9 ± 1.1 | 6.4 ± 2.4 |
| SelleckChem | 188 | 472 ± 183 | 101 ± 66 | 3.6 ± 2.3 | 6.3 ± 3.7 | 2.1 ± 1.7 | 3.8 ± 1.5 | 6.2 ± 3.9 |
| Joined PPI databases | 305,448 | 400 ± 65 | 73 ± 20 | 3.3 ± 1.2 | 5.0 ± 1.6 | 0.9 ± 0.7 | 3.8 ± 0.8 | 5.0 ± 1.9 |
Analysis of the attrition rate of various filters for the protein–protein interaction inhibitor library.
| | Asinex | Chemdiv | Enamine | LifeChemicals | Otava | SelleckChem |
|---|---|---|---|---|---|---|
| Unfiltered | 11,439 | 212,906 | 40,640 | 36,426 | 3849 | 188 |
| Isolated filter: | Number of filtered-out compounds (%) | | | | | |
| REOS | 148 (1%) | 12,546 (6%) | 970 (2%) | 3191 (9%) | 686 (18%) | 87 (47%) |
| PAINS | 18 (~0%) | 3891 (2%) | 239 (1%) | 547 (2%) | 164 (4%) | 22 (12%) |
| Aggregators | 6579 (58%) | 132,015 (62%) | 17,499 (43%) | 22,721 (62%) | 2682 (70%) | 126 (67%) |
| Lipinski Ro5 | 26 (~0%) | 5936 (3%) | 16 (~0%) | 814 (2%) | 424 (11%) | 47 (25%) |
| All filters | 8137 (71%) | 174,713 (82%) | 28,386 (70%) | 31,886 (88%) | 3581 (93%) | 177 (94%) |
Figure 10. Compound (relative) retention after post-filtering for REOS, PAINS, Aggregators and the Lipinski rule of 5 (one rule break allowed) on protein–protein-interaction-inhibitor-targeted libraries.
Figure 11. (a) Distribution of molecular weight (ExactMW), number of rotatable bonds (NumRotatableBonds) and number of rings (NumRings) for all joined libraries represented in a spatial diagram. (b) Distribution of molecular weight (ExactMW), number of hydrogen bond acceptors (NumHBA) and SlogP for all joined libraries represented in a spatial diagram.