Christopher N Takahashi, Bichlien H Nguyen, Karin Strauss, Luis Ceze.
Abstract
Synthetic DNA has emerged as a novel substrate to encode computer data with the potential to be orders of magnitude denser than contemporary cutting-edge techniques. However, even with the help of automated synthesis and sequencing devices, many intermediate steps still require expert laboratory technicians to execute. We have developed an automated end-to-end DNA data storage device to explore the challenges of automation within the constraints of this unique application. Our device encodes data into a DNA sequence, which is then written to a DNA oligonucleotide using a custom DNA synthesizer, pooled for liquid storage, and read using a nanopore sequencer and a novel, minimal preparation protocol. We demonstrate an automated 5-byte write, store, and read cycle with a modular design enabling expansion as new technology becomes available.
Year: 2019 PMID: 30899031 PMCID: PMC6428863 DOI: 10.1038/s41598-019-41228-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. An overview of the write-store-read process. Data is encoded, with error correction, into DNA bases, which are synthesized into physical DNA molecules and stored. When a user wishes to read the data, the stored DNA is read by a DNA sequencer into bases, and the decoding software corrects any errors to retrieve the original data. (a) The logical flow from bits to bases to DNA and back. (b) A block diagram representation of the system hardware’s three modules: synthesis, storage, and sequencing. (c) A photograph showing the completed system. Highlighted are the storage vessel and the nanopore loading fixture. The majority of the remaining hardware is responsible for synthesis. (d) Overview of enzymatic preparation for DNA sequencing. An arbitrary 1 kilobase “extension segment” of DNA is PCR-amplified with TAQ polymerase, and a Bsa-I restriction site is added by the primer, leaving an A-tail and a TCGC sticky end after digestion. The extension segment is then T/A ligated to the standard Oxford Nanopore Technology (ONT) LSK-108 kit sequencing adapter, creating the “extended adapter,” which ensures that sufficient bases are read for successful base calling. For sequencing, the payload hairpin and extended adapter are ligated, forming a sequence-ready construct that does not require purification.
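The bits-to-bases-and-back flow in panel (a) can be illustrated with a minimal sketch. It assumes a naive mapping of two bits per base (A=00, C=01, G=10, T=11) and omits error correction entirely, so it is not the encoding used by the device. The function names and the 5-byte example payload are hypothetical.

```python
# Naive 2-bits-per-base codec. This is an illustrative assumption, not the
# paper's encoding, which also adds error correction.
BASES = "ACGT"  # index 0..3 corresponds to bit pairs 00, 01, 10, 11


def bytes_to_bases(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def bases_to_bytes(seq: str) -> bytes:
    """Decode a base string produced by bytes_to_bases back into bytes."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)  # raises on non-ACGT input
        out.append(byte)
    return bytes(out)


payload = b"HELLO"                 # hypothetical 5-byte payload
encoded = bytes_to_bases(payload)  # 20 bases for 5 bytes
decoded = bases_to_bytes(encoded)
```

Under this mapping a 5-byte payload occupies 20 bases. A practical encoding would be longer because it carries error-correction redundancy.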
Figure 2. Synthesis and sequencing process quality. (a) Insertion, deletion, and substitution frequency by locus for a synthesized and PCR-amplified 100-mer. Below: An overview of errors. Above: An expanded view of the central 60 bases. The terminal 20 bases come from primers used in amplification and therefore are not representative of synthesis quality. (b) Combined write-to-read quality of synthesis, ligation, and sequencing. Bases −60 to −4 (below, grey) are adapter bases. Bases −3 to 0 (below, red) are the ligation scar. Bases 0 to 39 (below, blue) are the synthesized payload region with 8 bases of padding on the 3′ end. (c) Distribution of nanopore read lengths with unligated, 1D and 2D read lengths identified.
DNA synthesis reagent parameters.
| Step | Volume (μL) | Time (s) |
|---|---|---|
| Deblock | 600 | 50 |
| Act + {A, C, T, G} (1:1) | 350 | 120 |
| Act + Phos. reagent (1:1)* | 350 | 900 |
| Oxidizer | 750 | 10 |
*Only performed as final coupling step to add 5′ phosphate.
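As a rough worked example, the table values can be summed into a reagent and time budget. The sketch below assumes that one base-addition cycle consists of exactly one deblock, one coupling, and one oxidizer step, that any washes or other steps not listed in the table are ignored, and that the final phosphate coupling contributes only its own listed volume and time. The function name and the cycle count in the usage line are hypothetical.

```python
# (volume in µL, time in s) per step, copied from the table above.
DEBLOCK = (600, 50)
COUPLING = (350, 120)
PHOSPHATE_COUPLING = (350, 900)
OXIDIZER = (750, 10)


def synthesis_budget(n_cycles: int, add_phosphate: bool = True) -> tuple[int, int]:
    """Return (total volume in µL, total time in s) for n_cycles base additions.

    Assumption: a cycle is deblock + coupling + oxidizer, with nothing else.
    """
    if n_cycles < 0:
        raise ValueError("n_cycles must be non-negative")
    per_cycle = (DEBLOCK, COUPLING, OXIDIZER)
    volume = n_cycles * sum(v for v, _ in per_cycle)   # 1700 µL per cycle
    seconds = n_cycles * sum(t for _, t in per_cycle)  # 180 s per cycle
    if add_phosphate:
        # Final 5' phosphate coupling, counted once.
        volume += PHOSPHATE_COUPLING[0]
        seconds += PHOSPHATE_COUPLING[1]
    return volume, seconds


volume_ul, time_s = synthesis_budget(40)  # hypothetical 40-cycle run
```

Under these assumptions a 40-cycle run would use 68,350 μL of reagent and take 8,100 s of listed step time. Real runs would differ by whatever steps the table does not cover.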
Sequencing prep master mix.
| Reagent | Volume (μL) |
|---|---|
| Extended adapter | 15 |
| T4 DNA ligase (NEB: M0202) | 5 |
| DTT-free 10× T4 buffer* | 20 |
| ONT RBF | 93 |
| Nuclease-free water | 64 |
| Total | 197 |
*DTT-free 1× T4 buffer: 50 mM Tris-HCl, 10 mM MgCl2, 1 mM ATP.
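The master mix arithmetic can be checked with a short sketch. It sums the listed volumes and estimates the T4 buffer strength within the master mix itself, before any sample is added. It assumes the 10× stock is the only source of the buffer components, ignoring whatever the ligase storage buffer or RBF contribute. The dictionary keys and function names are hypothetical.

```python
# Volumes in µL, copied from the table above.
MASTER_MIX_UL = {
    "Extended adapter": 15,
    "T4 DNA ligase": 5,
    "DTT-free 10x T4 buffer": 20,
    "ONT RBF": 93,
    "Nuclease-free water": 64,
}

# 1x buffer composition in mM, from the table footnote.
BUFFER_1X_MM = {"Tris-HCl": 50, "MgCl2": 10, "ATP": 1}


def total_volume(mix: dict) -> float:
    """Sum of all component volumes in µL."""
    return sum(mix.values())


def buffer_strength(mix: dict, stock_fold: float = 10.0,
                    key: str = "DTT-free 10x T4 buffer") -> float:
    """Buffer concentration in the mix as a multiple of 1x."""
    return stock_fold * mix[key] / total_volume(mix)


total = total_volume(MASTER_MIX_UL)    # 197, matching the table's total row
fold = buffer_strength(MASTER_MIX_UL)  # 200 / 197, slightly above 1x
final_mm = {name: conc * fold for name, conc in BUFFER_1X_MM.items()}
```

The listed volumes do sum to 197 μL, and the buffer ends up at roughly 1.02× in the master mix, which corresponds to about 50.8 mM Tris-HCl under the stated assumption.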