Andrew S Bell, Joseph Bradley, Jeremy R Everett, Jens Loesel, David McLoughlin, James Mills, Marie-Claire Peakman, Robert E Sharp, Christine Williams, Hongyao Zhu.
Abstract
High-throughput screening (HTS) is an effective method for lead and probe discovery that is widely used in industry and academia to identify novel chemical matter and to initiate the drug discovery process. However, HTS can be time-consuming and costly, and the use of subsets as an efficient alternative to screening entire compound collections has been investigated. Subsets may be selected on the basis of chemical diversity, molecular properties, biological activity diversity or biological target focus. Previously, we described a novel form of subset screening: plate-based diversity subset (PBDS) screening, in which the screening subset is constructed by plate selection (rather than individual compound cherry-picking), using algorithms that select for compound quality and chemical diversity on a plate basis. In this paper, we describe a second-generation approach to the construction of an updated subset, PBDS2, which uses both plate and individual compound selection and has improved coverage of the chemical space of the screening file whilst selecting the same number of plates for screening. We describe the validation of PBDS2 and its successful use in hit and lead discovery. PBDS2 screening became the default mode of singleton (one compound per well) HTS for lead discovery in Pfizer.
Keywords: 2nd Generation; Diversity; High-throughput screening (HTS); Lead discovery; Plate-based; Ro40; Rule of 40; Screening file; Subset
Year: 2016 PMID: 27631533 PMCID: PMC5055576 DOI: 10.1007/s11030-016-9692-9
Source DB: PubMed Journal: Mol Divers ISSN: 1381-1991 Impact factor: 2.943
Fig. 1 (a) Histogram of the number of screening plates in the Pfizer Screening File containing 0 to 40, 41 to 80, 81 to 120, etc. unique compounds per 384-well plate. The maximal number of test wells per plate is 360, as 24 wells are reserved for controls. (b) Histogram of the number of plates containing a total of 0 to 100, 101 to 200, etc. Ro5 violations per plate: note that any given compound on a plate could have more than one violation. (c) Histogram of plates containing 0 to 40, 41 to 80, etc. compounds per plate failing the stricter structural filters and thus being undesirable to a medicinal chemist. All data for Pfizer screening file as of 4th Quarter 2008.
Fig. 2 Percentage of screening plates against the binned number (0 to 40, 41 to 80, etc.) of library compounds on each plate in the filtered Pfizer screening file as of November 2008. The polarization of plates into those derived from library chemistry and those not is clear.
Fig. 3 A plot of General Activity vs Rule of 5 scores for multiple, diverse screening sets [44]. The closed black diamonds are for external company or organization libraries. The open yellow square is for known drugs. See Experimental section for definitions and calculations of General Activity and Rule of 5 scores.
Fig. 4 A plot of the number of BCUT cells in the Pfizer screening file covered for the first time (single coverage, upper dark blue line) and second time (double coverage, lower pink line), plotted against the number of 384-well screening plates selected. Note the convergence of the iterative plate selection algorithm after the selection of 1200 plates. (Color figure online)
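The iterative plate selection tracked in Fig. 4 can be illustrated with a minimal greedy sketch. It is not the published algorithm: the scoring rule (count compounds falling in cells that hold fewer than two selected compounds), the input format and the names `select_plates`, `plate_cells` and `max_cover` are all assumptions made for illustration.

```python
from collections import Counter


def select_plates(plate_cells, n_plates, max_cover=2):
    """Greedy sketch of coverage-driven plate selection (assumed scoring rule).

    plate_cells: dict of plate ID -> list of BCUT cell IDs, one per compound
                 (hypothetical input format).
    max_cover:   a cell stops contributing once it holds this many compounds.
    Returns the chosen plate IDs and, after each pick, the number of cells
    covered at least once and at least twice.
    """
    coverage = Counter()              # cell ID -> compounds selected so far
    remaining = dict(plate_cells)
    chosen, history = [], []

    def gain(cells):
        # Compounds on this plate that land in cells still below max_cover.
        seen = Counter()
        score = 0
        for cell in cells:
            if coverage[cell] + seen[cell] < max_cover:
                score += 1
                seen[cell] += 1
        return score

    while remaining and len(chosen) < n_plates:
        best = max(remaining, key=lambda p: gain(remaining[p]))
        if gain(remaining[best]) == 0:
            break                     # converged: no plate adds coverage
        for cell in remaining.pop(best):
            coverage[cell] += 1
        chosen.append(best)
        single = sum(1 for n in coverage.values() if n >= 1)
        double = sum(1 for n in coverage.values() if n >= 2)
        history.append((single, double))
    return chosen, history


# Toy example with four hypothetical plates
toy = {"P1": [1, 2, 3], "P2": [1, 2], "P3": [4], "P4": [3, 3]}
chosen, history = select_plates(toy, n_plates=3)
```

Plotting the two columns of `history` against the number of plates chosen gives curves of the same kind as the single- and double-coverage lines in Fig. 4.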
The composition of the Pfizer screening file in fourth quarter 2008 prior to the construction of PBDS2
| File | Number of plates | Number of compound IDs | Number of SMILES | Number of unique SMILES | Number passing GDRS filters |
|---|---|---|---|---|---|
| Singleton 384 well | 8998 | 3,239,280 | 3,041,481 | 3,040,455 | 2,848,710 |
| PBDS [ | 1200 | 432,000 | 422,704 | 422,685 | 414,460 |
| Total | 10,198 | 3,671,280 | 3,464,185 | 3,463,140 | 3,263,170 |
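The plate and compound ID columns follow directly from the plate format described in Fig. 1 (384 wells, 24 reserved for controls). A short check, using only figures from the table and caption:

```python
WELLS_PER_PLATE = 384
CONTROL_WELLS = 24
TEST_WELLS = WELLS_PER_PLATE - CONTROL_WELLS   # 360 test wells per plate

# Number of plates per file, taken from the table above
plates = {"Singleton 384 well": 8998, "PBDS": 1200}

# One compound ID per test well
compound_ids = {name: n * TEST_WELLS for name, n in plates.items()}
total_plates = sum(plates.values())
total_ids = sum(compound_ids.values())

for name, n_ids in compound_ids.items():
    print(f"{name}: {n_ids:,} compound IDs")
print(f"Total: {total_plates:,} plates, {total_ids:,} compound IDs")
```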
Fig. 5 Calculation of the number of HTS hit series that would be expected to be found by random screening of between 0 and 20 % of the screening file for a series of cluster sizes of between 1 and 50 compounds (curves). Superimposed on these graphs are the calculated performances (shown by vertical lines) of a PBDS2 subset of 400,000 compounds when tested in silico against 68,000 HTS hits with from 77 recent HTSs and covering a range of cluster sizes, based on different molecular similarity criteria. The test PBDS2 subset in this case was constructed 50 % from plate-based selection followed by 50 % from random cherry-picking from those compounds not yet selected by the plate-pick. The vertical lines show the percentage of series that this PBDS2 construct was calculated to retrieve for a series of at least 1 active (light blue) and for a series of at least 5 actives (blue). See Supplementary Fig. 2 and Supplementary Tables 1 and 2 for more information. (Color figure online)
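The random-screening curves in Fig. 5 can be approximated with a simple probability model. The sketch below is one plausible formulation, not necessarily the calculation used in the paper: it assumes each member of a series is sampled independently with probability equal to the fraction of the file screened (a binomial approximation to sampling without replacement). The function name `p_series_found` is hypothetical.

```python
from math import comb


def p_series_found(cluster_size, fraction, min_actives=1):
    """Probability that a random subset covering `fraction` of the file
    contains at least `min_actives` members of a series of `cluster_size`
    compounds (binomial approximation, assumed for illustration)."""
    return sum(
        comb(cluster_size, i)
        * fraction**i
        * (1.0 - fraction) ** (cluster_size - i)
        for i in range(min_actives, cluster_size + 1)
    )


# Expected retrieval for a few cluster sizes when 10 % of the file is screened
for size in (1, 5, 10, 50):
    at_least_1 = p_series_found(size, 0.10, min_actives=1)
    at_least_5 = p_series_found(size, 0.10, min_actives=5)
    print(f"cluster size {size:2d}: >=1 member {at_least_1:.3f}, "
          f">=5 members {at_least_5:.3f}")
```

Under this model a singleton series is found with probability equal to the screened fraction, while larger series are found far more often, which is why retrieval is reported separately by cluster size.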
An overview of the calculated series retrieval and fold efficiency, and the actual library coverage of three PBDS2 designs of 400,000 compounds with 100, 75 and 50 % of the subsets chosen by iterative plate selection (see above) and the balance chosen by random cherry-picking from the remaining compounds not selected in the plate selection process
| Percentage of plate-based/cherry-picked set in test PBDS2s | 100/0 | 75/25 | 50/50 |
|---|---|---|---|
| % Of total series found | 33 | 31 | 27 |
| % Of larger series found ( | 67 | 74 | 75 |
| % Of libraries covered at | 17 | 100 | 100 |
| Fold efficiency over full file singleton for finding larger series | 6.1 | 6.7 | 6.8 |
The series retrieval is calculated as described previously against 68,000 active compounds with , found in 77 recent Pfizer HTSs
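The fold-efficiency row can be roughly reproduced under an assumed definition that this excerpt does not state: the percentage of larger series found divided by the percentage of the file screened, taking the file as the 3,671,280 compound IDs in the composition table. The function `fold_efficiency` is a hypothetical helper.

```python
FILE_SIZE = 3_671_280    # total compound IDs, from the composition table
SUBSET_SIZE = 400_000    # size of each test PBDS2 design


def fold_efficiency(pct_series_found, subset_size=SUBSET_SIZE,
                    file_size=FILE_SIZE):
    """Assumed definition: % of series found / % of file screened."""
    pct_screened = 100.0 * subset_size / file_size
    return pct_series_found / pct_screened


# % of larger series found, from the table above
larger_series_found = {"100/0": 67, "75/25": 74, "50/50": 75}
folds = {design: fold_efficiency(pct)
         for design, pct in larger_series_found.items()}

for design, fold in folds.items():
    print(f"{design}: about {fold:.2f}-fold")
```

Under these assumptions the computed values agree with the tabulated 6.1, 6.7 and 6.8 to within about 0.1, which suggests the published figures were derived in a similar way, perhaps with a slightly different denominator.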
Fig. 6 A flow chart depicting the five main stages of the PBDS2 plate selection process
An analysis of the cell occupancies in PBDS2
| BCUT cell compound population | Number of BCUT cells |
|---|---|
| 1–20 | 31,727 |
| 21–40 | 2584 |
| 41–60 | 948 |
| 61–80 | 517 |
| 81–100 | 307 |
| 101–120 | 195 |
| 121–140 | 124 |
| 141–160 | 80 |
| 161–180 | 59 |
| 181–200 | 33 |
| 201–220 | 32 |
| 221–240 | 22 |
| 241–260 | 19 |
| 261–280 | 8 |
| 281–300 | 6 |
| 301–320 | 6 |
| 321–340 | 2 |
| 341–360 | 1 |
| 361–380 | 0 |
| 381–400 | 1 |
| Total number of occupied cells | 36,671 |
| Average occupied cell population | 11.2 |
| Standard deviation of average occupied cell population | 7.2 |
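A table of this kind can be derived from a per-compound list of BCUT cell assignments. The following is a minimal sketch: the input format and the function name `occupancy_histogram` are assumptions, and the bin width of 20 matches the table. The final lines check that the tabulated bin counts sum to the reported number of occupied cells.

```python
from collections import Counter


def occupancy_histogram(cell_of_compound, bin_width=20):
    """Bin occupied cells by how many compounds they contain.

    cell_of_compound: iterable with one cell ID per compound
                      (hypothetical input format).
    Returns (histogram keyed by (low, high) bin, number of occupied cells,
    mean population of occupied cells).
    """
    populations = Counter(cell_of_compound)    # cell ID -> compound count
    hist = Counter()
    for pop in populations.values():
        low = ((pop - 1) // bin_width) * bin_width + 1
        hist[(low, low + bin_width - 1)] += 1
    mean = sum(populations.values()) / len(populations)
    return dict(hist), len(populations), mean


# Toy example: four occupied cells with 3, 20, 21 and 1 compounds
toy_cells = ["a"] * 3 + ["b"] * 20 + ["c"] * 21 + ["d"]
toy_hist, toy_occupied, toy_mean = occupancy_histogram(toy_cells)

# Consistency check on the table above: bin counts vs. total occupied cells
TABLE_COUNTS = [31727, 2584, 948, 517, 307, 195, 124, 80, 59, 33,
                32, 22, 19, 8, 6, 6, 2, 1, 0, 1]
table_total = sum(TABLE_COUNTS)
```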
Fig. 7 Compound 5 above from Zhang et al. [51] was an early lead developed from a PBDS2 subset hit and has mGluR5 , Geometric mean, measurements in HEK-293FT cells expressing human mGluR5 using [3H]MPEPy ([3H]3-methoxy-5-pyridin-2-ylethynylpyridine); mGluR5 Geometric mean, measurements in HEK-293 cells expressing rat mGluR5 using fluorimetric imaging plate reader (FLIPR)