John C Faver, Kevin Riehle, David R Lancia, Jared B J Milbank, Christopher S Kollmann, Nicholas Simmons, Zhifeng Yu, Martin M Matzuk.
Abstract
DNA-encoded chemical libraries (DELs) provide a high-throughput and cost-effective route for screening billions of unique molecules for binding affinity for diverse protein targets. Identifying candidate compounds from these libraries involves affinity selection, DNA sequencing, and measuring enrichment in a sample pool of DNA barcodes. Successful detection of potent binders is affected by many factors, including selection parameters, chemical yields, library amplification, sequencing depth, sequencing errors, library sizes, and the chosen enrichment metric. To date, there has not been a clear consensus about how enrichment from DEL selections should be measured or reported. We propose a normalized z-score enrichment metric using a binomial distribution model that satisfies important criteria that are relevant for analysis of DEL selection data. The introduced metric is robust with respect to library diversity and sampling and allows for quantitative comparisons of enrichment of n-synthons from parallel DEL selections. These features enable a comparative enrichment analysis strategy that can provide valuable information about hit compounds in early stage drug discovery.
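The abstract does not state the formula for the metric. As a minimal sketch only, the Python below assumes the common construction: a binomial z-score (observed count minus expected count, divided by the binomial standard deviation) that is then divided by the square root of the total number of decoded reads. The function name `normalized_z_score` and its parameters are hypothetical and are not taken from the paper.

```python
import math


def normalized_z_score(count, total_reads, expected_prob):
    """Binomial z-score divided by sqrt(total_reads).

    ASSUMPTION: this form is an illustration, not the paper's stated formula.
    count         -- observed reads for one n-synthon
    total_reads   -- total decoded reads in the sample
    expected_prob -- probability of drawing this n-synthon from an
                     unenriched library
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0.0 < expected_prob < 1.0:
        raise ValueError("expected_prob must lie strictly between 0 and 1")

    expected_count = total_reads * expected_prob
    std_dev = math.sqrt(total_reads * expected_prob * (1.0 - expected_prob))
    z = (count - expected_count) / std_dev  # ordinary binomial z-score
    return z / math.sqrt(total_reads)       # remove the sqrt(n) depth scaling


# Same observed fraction at two sequencing depths (hypothetical numbers).
shallow = normalized_z_score(50, 1_000_000, 1e-6)
deep = normalized_z_score(500, 10_000_000, 1e-6)
```

Under this assumed form the value depends only on the observed fraction of reads and the expected probability, so `shallow` and `deep` agree even though the sequencing depths differ tenfold. That is one way a metric could be robust to sampling in the sense the abstract describes.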
Keywords: DNA-encoded libraries; affinity selection; data analysis; drug discovery
Year: 2019 PMID: 30672692 PMCID: PMC6372980 DOI: 10.1021/acscombsci.8b00116
Source DB: PubMed Journal: ACS Comb Sci ISSN: 2156-8944 Impact factor: 3.784
Figure 1. Comparison of enrichment from two independently prepared naïve samples of the triazine DEL. Observed n-synthons are colored by their value of n (their “dimension”), and enrichment is measured as normalized z-scores with their 95% confidence intervals shown as error bars (some error bars are smaller than the data point radii). The y = x line corresponding to equal enrichment between the two samples is plotted for reference.
Selected sEH Inhibitors Reported by Thalji et al.[30] and Their Analogues in the Triazine DEL
IC50 values are from the earlier report, and the evaluated enrichments for the DEL analogues are provided as normalized z-scores with their 95% confidence intervals.
Figure 2. Comparative enrichment plot for a selection against sEH. Enrichment in the target data set is plotted along the horizontal axis against the measured enrichment for a control sample, an NTC, on the vertical axis. This DEL included compounds previously assayed for sEH inhibition by Thalji et al.[30] as disynthons from cycles 1 and 3. Each point in the plot is a different n-synthon from the DEL, and the points highlighted in green correspond to analogues of the two most potent of the previously reported compounds. Both of these known inhibitor structures were observed in the target data but not the NTC, and the most potent inhibitors from the earlier publication were significantly more enriched in the target data set than the weaker inhibitors. The remaining points correspond to different combinations of amino acids in cycle 1 and amines in cycles 2 and 3 of the triazine DEL.
Figure 3. Enrichment of n-synthons evaluated with the normalized z-score metric from the fully sampled data set compared to randomly subsampled data sets. Panel A plots the full data set against the same data with 90% of samples randomly removed, while panel B plots the full data set against the same data set with 99% of samples randomly removed. This in silico experiment simulates the effects of large differences in sampling between two decoded DEL selection samples.
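The subsampling experiment in the Figure 3 caption can be imitated on toy data. The sketch below is not the authors' procedure. It assumes the normalized z-score is a binomial z-score divided by the square root of the total read count, uses invented read counts (`hit`, `other`) and an invented library of 1000 equally likely members, and removes reads by keeping each one independently with a fixed probability.

```python
import math
import random


def normalized_z(count, total_reads, expected_prob):
    """Binomial z-score divided by sqrt(total_reads) (ASSUMED form)."""
    spread = math.sqrt(expected_prob * (1.0 - expected_prob))
    return (count / total_reads - expected_prob) / spread


def subsample(counts, keep_fraction, rng):
    """Keep each read independently with probability keep_fraction."""
    return {
        name: sum(rng.random() < keep_fraction for _ in range(n))
        for name, n in counts.items()
    }


def enrichment_of_hit(counts, expected_prob):
    """Return (normalized z-score, raw z-score) for the 'hit' entry."""
    total = sum(counts.values())
    z_n = normalized_z(counts["hit"], total, expected_prob)
    return z_n, z_n * math.sqrt(total)


rng = random.Random(0)                    # fixed seed for repeatability
full = {"hit": 5000, "other": 95000}      # hypothetical read counts
EXPECTED_PROB = 1.0 / 1000                # hypothetical library of 1000 members

zn_full, z_full = enrichment_of_hit(full, EXPECTED_PROB)
# Remove 90% of the reads, as in panel A.
zn_10, z_10 = enrichment_of_hit(subsample(full, 0.10, rng), EXPECTED_PROB)
# Remove 99% of the reads, as in panel B.
zn_1, z_1 = enrichment_of_hit(subsample(full, 0.01, rng), EXPECTED_PROB)
```

In this toy setting the raw z-scores (`z_full`, `z_10`, `z_1`) shrink roughly with the square root of the retained depth. The normalized values (`zn_full`, `zn_10`, `zn_1`) stay near the same number, with more scatter as fewer reads remain.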