Andrew B MacConnell, Patrick J McEnaney, Valerie J Cavett, Brian M Paegel.
Abstract
The promise of exploiting combinatorial synthesis for small molecule discovery remains unfulfilled due primarily to the "structure elucidation problem": the back-end mass spectrometric analysis that significantly restricts one-bead-one-compound (OBOC) library complexity. The very molecular features that confer binding potency and specificity, such as stereochemistry, regiochemistry, and scaffold rigidity, are conspicuously absent from most libraries because isomerism introduces mass redundancy and diverse scaffolds yield uninterpretable MS fragmentation. Here we present DNA-encoded solid-phase synthesis (…
Keywords: DNA-encoded libraries; combinatorial synthesis; one-bead-one-compound; split-and-pool
Year: 2015 PMID: 26290177 PMCID: PMC4571006 DOI: 10.1021/acscombsci.5b00106
Source DB: PubMed Journal: ACS Comb Sci ISSN: 2156-8944 Impact factor: 3.784
Figure 1. DNA-encoded solid-phase synthesis. (A) TentaGel Rink-amide resin (160-μm diameter) is first elaborated with a common linker (gray) containing a coumarin chromophore and arginine. Linker resin is further functionalized with an alkyne and Fmoc-protected glycine. Azide-functionalized DNA headpiece (HDNA), consisting of two complementary strands of DNA (cyan) covalently joined via two PEG tethers (magenta), is coupled substoichiometrically (0.004 equiv) to alkyne sites via CuAAC, yielding bifunctional-HDNA resin (Fmoc-protected amine for chemical coupling and 5′-phosphoryl-CC-3′ overhang for enzymatic cohesive end ligation). (B) A forward primer module (green) is first enzymatically ligated to resin. Encoded synthesis proceeds as alternating steps of monomer coupling (scaffold elements shown in purple hues, side chain elements shown in orange hues) and coding module ligation (correspondingly in purple or orange hues). After the last encoding step, a reverse primer module (green) is ligated. The finished resin displays oligomer and a structure-encoding DNA message flanked by primer binding sequences for PCR amplification. (C) The DNA sequence encodes the series of reaction conditions that the bead experienced. Here, the DNA sequence encodes acylation with chloroacetic acid, treatment with methylamine, acylation with (2S,3E)-5-chloro-2,4-dimethyl-3-pentenoic acid, treatment with 3-methoxypropylamine, and acylation with N-Fmoc-L-proline followed by Fmoc removal.
Scheme 1. DNA-Encoded Solid-Phase Synthesis Reaction Sequence
Figure 2. Encoding language design and optimization. (A) Each target heteroduplex coding module (schematic at top) is composed of two hybridized oligonucleotide strands. Each strand is 5′-phosphorylated (yellow “P”) and displays a strand-specific overhang sequence (orange or purple) and a coding region that is complementary to that of the other strand (gray background). (B) Sufficiently self-complementary sequences may form undesired homoduplexes. Enforcing a coding region sequence structure of either 5′-NNRRRRNN-3′ or 5′-NNYYYYNN-3′ decreases the stability of potential homoduplexes relative to the target heteroduplex. (C) Some sequences (e.g., homopolymers) can form stable off-target heteroduplexes with occluded, unreactive overhangs. (D) Self-complementary sequences can form intramolecular secondary structures (hairpins) that prevent target heteroduplex formation.
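The coding-region rule in panel (B) can be illustrated with a short check. This is a minimal sketch, not the authors' design software: it assumes the standard IUPAC degenerate codes (R = A or G, Y = C or T, N = any base), and the function names `reverse_complement` and `core_class` are hypothetical. It tests only the 5′-NNRRRRNN-3′ / 5′-NNYYYYNN-3′ pattern and does no thermodynamic screening for homoduplexes, off-target heteroduplexes, or hairpins.

```python
import re
from typing import Optional

# IUPAC degenerate codes assumed here: R = purine (A/G), Y = pyrimidine (C/T), N = any base.
PURINE_CORE = re.compile(r"^[ACGT]{2}[AG]{4}[ACGT]{2}$")      # 5'-NNRRRRNN-3'
PYRIMIDINE_CORE = re.compile(r"^[ACGT]{2}[CT]{4}[ACGT]{2}$")  # 5'-NNYYYYNN-3'

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, i.e. the opposite strand read 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def core_class(seq: str) -> Optional[str]:
    """Classify an 8-nt coding region as NNRRRRNN, NNYYYYNN, or neither (None).

    Pattern check only: a homopolymer such as AAAAAAAA still passes,
    so this is not a substitute for the other screens in the caption.
    """
    s = seq.upper()
    if PURINE_CORE.match(s):
        return "NNRRRRNN"
    if PYRIMIDINE_CORE.match(s):
        return "NNYYYYNN"
    return None


if __name__ == "__main__":
    top = "TGGAAAGT"                 # coding [+] sequence of entry 1X01
    bottom = reverse_complement(top)
    print(top, core_class(top))
    print(bottom, core_class(bottom))
```

Because purines pair with pyrimidines, the reverse complement of an NNRRRRNN strand always has the NNYYYYNN form, so each strand's four-base core cannot pair with a second copy of itself.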
Encoding Sequence Thermodynamic Parameters and Ligation Yield
Set 1

| ID | coding [+] | TM,het | ΔGhet | ΔGhomo [+] | ΔGhomo [−] | ΔΔGhet/homo [+] | ΔΔGhet/homo [−] | TM,hp | ΔTM,hp | ΔGhet,2° | ΔΔGhet/het,2° | OH1 yield | OH3 yield | OH5 yield |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1X01 | TGGAAAGT | 37.1 | –13.4 | –3.9 | –5.0 | –9.5 | –8.4 | – | – | –3.6 | –9.8 | 70 | 71 | 68 |
| 1X02 | ACGGAGCA | 49.9 | –16.3 | –3.6 | –6.9 | –12.7 | –9.4 | – | – | –3.6 | –12.7 | 70 | 70 | 63 |
| 1X03 | TTGGAGTT | 37.1 | –13.4 | –1.6 | –5.0 | –11.8 | –8.4 | 1.9 | 35.2 | –3.6 | –9.8 | 72 | 73 | 69 |
| 1X04 | AAGGAGGT | 40.7 | –14.2 | –4.9 | –4.7 | –9.3 | –9.5 | – | – | –3.6 | –10.6 | 75 | 74 | 66 |
| 1X05 | AGAAAGCA | 38.5 | –13.8 | –3.5 | –3.1 | –10.2 | –10.6 | 20.6 | 17.9 | –3.6 | –10.1 | 74 | 74 | 67 |
| 1X06 | ACAGAACT | 36.5 | –11.4 | –3.5 | –2.0 | –7.8 | –9.4 | – | – | –3.6 | –7.7 | 72 | 72 | 61 |
| 1X07 | TAAGGAGT | 33.5 | –12.1 | –4.9 | –3.1 | –7.2 | –9.0 | – | – | –3.6 | –8.5 | 72 | 74 | 68 |
| 1X08 | ATGGGAGT | 40.9 | –14.1 | –5.4 | –6.5 | –8.7 | –7.6 | – | – | –3.6 | –10.5 | 74 | 75 | 65 |
| 1X09 | TGAAGGAA | 36.3 | –13.7 | –3.5 | –3.5 | –10.1 | –10.1 | – | – | –3.6 | –10.0 | 71 | 73 | 66 |
| 1X10 | TTGAGGAT | 35.4 | –13.2 | –1.6 | –3.1 | –11.6 | –10.1 | 20.0 | 15.4 | –4.6 | –8.6 | 75 | 75 | 70 |
Strand designations are [+] for top strand and [−] for bottom strand.
TM,het = melting temperature of the target heteroduplex (50 mM Na+, 10 mM Mg2+, 1 mM nucleotide triphosphates, 10 μM each oligonucleotide).
ΔGhet = target heteroduplex free energy of formation.
ΔGhomo = most stable homoduplex (of all overhang-appended parents) free energy of formation.
ΔGhet,2° = most stable off-target heteroduplex (of all overhang-appended parents) free energy of formation.
TM,hp = highest hairpin melting temperature (of all overhang-appended parents); entries marked “–” yielded no predicted hairpin formation; no hairpins predicted for any overhang-appended [+] parent.
ΔΔGhet/homo = ΔGhet – ΔGhomo.
ΔΔGhet/het,2° = ΔGhet – ΔGhet,2°.
ΔTM,hp = TM,het – TM,hp.
OHX yield = experimentally measured ligation yield of overhang-appended parent for each set’s overhangs (OH1, OH3, and OH5 for set 1; OH2, OH4, and OH6 for set 2). OH1 = 5′-ATGG-3′; OH2 = 5′-TCA-3′; OH3 = 5′-GTT-3′; OH4 = 5′-CTA-3′; OH5 = 5′-TTC-3′; OH6 = 5′-CGC-3′.
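The three difference columns follow directly from the footnote definitions, so they can be recomputed from the tabulated primary values. The sketch below is illustrative only: the function name `derived_columns` and its argument names are hypothetical, and pairing the two ΔGhomo values with the two ΔΔGhet/homo values is an assumption about the column layout. A hairpin entry marked “–” is passed as `None`.

```python
from typing import Dict, Optional, Tuple


def derived_columns(
    tm_het: float,
    dg_het: float,
    dg_homo: Tuple[float, float],
    dg_het2: float,
    tm_hp: Optional[float] = None,
) -> Dict[str, object]:
    """Recompute the difference columns from the footnote definitions.

    ddG(het/homo) = dG_het - dG_homo   (one value per homoduplex column)
    ddG(het/het2) = dG_het - dG_het2
    dTM(hp)       = TM_het - TM_hp     (None when no hairpin is predicted)
    Results are rounded to one decimal place, matching the table's precision.
    """
    return {
        "ddg_het_homo": tuple(round(dg_het - g, 1) for g in dg_homo),
        "ddg_het_het2": round(dg_het - dg_het2, 1),
        "dtm_hp": None if tm_hp is None else round(tm_het - tm_hp, 1),
    }


if __name__ == "__main__":
    # Entry 1X03 from set 1: TM,het 37.1, dG_het -13.4, dG_homo -1.6 / -5.0,
    # TM,hp 1.9, dG_het,2 -3.6.
    print(derived_columns(37.1, -13.4, (-1.6, -5.0), -3.6, tm_hp=1.9))
    # Entry 1X01 has no predicted hairpin ("–" in the table).
    print(derived_columns(37.1, -13.4, (-3.9, -5.0), -3.6))
```

For 1X03 and 1X10 the recomputed values match the table exactly. For a few entries (1X05, 1X06, 1X09) recomputing from the one-decimal table values differs from the tabulated differences by 0.1, presumably because the published differences were calculated before rounding.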
Chart 1. DNA-Encoded Oligomer Synthesis and Single-Bead Quantitation
Chart 2. DNA-Encoded Compound Purity and Side Product Identification
Figure 3. DNA-encoded combinatorial library plan and quality control. (A) The library scaffold features a linear arrangement of three positions for diversification (Pos1, Pos2, Pos3). Each position displays either an amino acid or N-substituted glycine. Amino acids featured Cα diversity in side chain, side chain stereochemistry, and N-methylation. The central position, Pos2, uniquely featured 1 of 6 different “linker” amino acids in addition to the L- and D- complement. N-substitution of glycine was executed with 1 of 21 different amines (gray). (B) The mixed-scale combinatorial DESPS was conducted in wells of a filtration microplate that housed a mixture of 160- and 10-μm bifunctional-HDNA library resin. The 160-μm QC beads were harvested by filtration and single beads were placed into separate wells for qPCR analysis. The resultant amplicons were purified and sequenced. The single QC beads were retrieved from qPCR supernatant, transferred to individual trifluoroacetic acid cleavage reactions, and the cleavage products subjected to mass spectrometric analysis. (C) DNA sequence data (shown as numeric identifiers) were used to predict the compound structure on each QC bead. The predicted exact mass of [M + H]+ (green) agreed with the observed predominant ion (black) in the MALDI-TOF mass spectra.