Eli Lyons, Paul Sheridan, Georg Tremmel, Satoru Miyano, Sumio Sugano.
Abstract
High-throughput screens allow for the identification of specific biomolecules with characteristics of interest. In barcoded screens, DNA barcodes are linked to target biomolecules so that the target molecules making up a library can be identified by sequencing the DNA barcodes using Next Generation Sequencing. To be useful in experimental settings, the DNA barcodes in a library must satisfy certain constraints related to GC content, homopolymer length, Hamming distance, and blacklisted subsequences. Here we report a novel framework to quickly generate large-scale libraries of DNA barcodes for use in high-throughput screens. We show that our framework dramatically reduces the computation time required to generate large-scale DNA barcode libraries, compared with a naïve approach to DNA barcode library generation. As a proof of concept, we demonstrate that our framework is able to generate a library consisting of one million DNA barcodes for use in a fragment antibody phage display screening experiment. We also report generating a general-purpose library of one billion DNA barcodes, the largest such library yet reported in the literature. Our results demonstrate the value of our novel large-scale DNA barcode library generation framework for use in high-throughput screening applications.
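The four kinds of constraint named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names and all threshold values (`gc_range`, `max_run`, `min_distance`, and the contents of `blacklist`) are hypothetical and chosen only for illustration.

```python
from itertools import groupby


def gc_content(seq):
    """Fraction of bases in seq that are G or C (seq must be non-empty)."""
    return sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base."""
    return max(len(list(run)) for _, run in groupby(seq))


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def passes_constraints(seq, accepted, gc_range=(0.4, 0.6), max_run=2,
                       min_distance=3, blacklist=()):
    """Check one candidate against illustrative (assumed) thresholds."""
    low, high = gc_range
    if not low <= gc_content(seq) <= high:
        return False
    if longest_homopolymer(seq) > max_run:
        return False
    if any(bad in seq for bad in blacklist):
        return False
    # Compare the candidate with every sequence accepted so far.
    return all(hamming_distance(seq, other) >= min_distance
               for other in accepted)
```

In this sketch the Hamming check compares each candidate against every previously accepted sequence, so its cost grows with the size of the accepted set.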
Year: 2017 PMID: 29066821 PMCID: PMC5654825 DOI: 10.1038/s41598-017-12825-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Barcode structure and library generation overview. (A) An individual barcode consists of a batch code, a linker, and a target code. Batch codes serve as labels for different experiments and are appended with linkers for technical reasons that will be made apparent in the main text. Target codes serve as labels for particular targets in a given batch. (B) A barcode library is constructed by appending linked batch codes with target codes in all possible combinations. (C) Flowchart of the main steps involved in the generation of a barcode library within our framework. See main text for details.
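The combination step in panel (B) can be sketched in Python as a Cartesian product of linked batch codes and target codes. This is an illustrative sketch only: `build_library` is a hypothetical name, and the use of a single linker shared by all batch codes is an assumption not stated in the caption.

```python
from itertools import product


def build_library(batch_codes, linker, target_codes):
    """Join every linked batch code with every target code.

    Assumes one shared linker for all batch codes (an illustrative choice).
    """
    linked = [batch + linker for batch in batch_codes]
    return [lb + target for lb, target in product(linked, target_codes)]


# Toy example: 2 batch codes x 3 target codes gives 6 barcodes.
library = build_library(["AC", "GT"], "T", ["AAA", "CCC", "GGG"])
```

The library size is the product of the two set sizes, so 100 batch codes and 10,000 target codes would give 1,000,000 barcodes.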
Runtimes for the generation of barcodes of length 25, 50, and 100 bp for different values of m using the Naïve framework, and two versions of our framework, Framework A and Framework B, as described in the Methods and Results sections. The library size in each case is N = 1,000,000. Framework A and Framework B were run with n = 100 batch codes and n = 10,000 target codes. Frameworks A and B outperform the Naïve framework in all cases. The runtimes are reported in [h]:mm:ss format. The runtimes for the Naïve framework are estimated as described in the main text.
| Generation Framework | m | Length (bp) | Acceptance rate | Time |
|---|---|---|---|---|
| Naïve | 2 | 25 | 0.00 | 649:00:00 |
| Framework A | 2 | 25 | 0.24 | 4:05 |
| Framework B | 2 | 25 | 0.36 | 2:24 |
| Naïve | 2 | 50 | 0.00 | 2146:00:00 |
| Framework A | 2 | 50 | 0.10 | 11:14 |
| Framework B | 2 | 50 | 0.49 | 1:54 |
| Naïve | 2 | 100 | 0.00 | 22295:00:00 |
| Framework A | 2 | 100 | 0.01 | 3:33:52 |
| Framework B | 2 | 100 | 0.39 | 2:41 |
| Naïve | 3 | 25 | 0.00 | 238:00:00 |
| Framework A | 3 | 25 | 0.49 | 2:10 |
| Framework B | 3 | 25 | 0.33 | 2:23 |
| Naïve | 3 | 50 | 0.00 | 322:00:00 |
| Framework A | 3 | 50 | 0.24 | 5:09 |
| Framework B | 3 | 50 | 0.46 | 2:01 |
| Naïve | 3 | 100 | 0.00 | 649:00:00 |
| Framework A | 3 | 100 | 0.18 | 7:12 |
| Framework B | 3 | 100 | 0.25 | 4:16 |
| Naïve | 4 | 25 | 0.00 | 191:00:00 |
| Framework A | 4 | 25 | 0.38 | 2:49 |
| Framework B | 4 | 25 | 0.45 | 1:56 |
| Naïve | 4 | 50 | 0.00 | 215:00:00 |
| Framework A | 4 | 50 | 0.34 | 3:54 |
| Framework B | 4 | 50 | 0.42 | 2:30 |
| Naïve | 4 | 100 | 0.00 | 259:00:00 |
| Framework A | 4 | 100 | 0.43 | 3:21 |
| Framework B | 4 | 100 | 0.50 | 1:58 |