Bichlien H Nguyen, Christopher N Takahashi, Gagan Gupta, Jake A Smith, Richard Rouse, Paul Berndt, Sergey Yekhanin, David P Ward, Siena D Ang, Patrick Garvan, Hsing-Yeh Parker, Rob Carlson, Douglas Carmean, Luis Ceze, Karin Strauss.
Abstract
Synthetic DNA is an attractive medium for long-term data storage because of its density, ease of copying, sustainability, and longevity. Recent advances have focused on the development of new encoding algorithms, automation, preservation, and sequencing technologies. Despite progress in these areas, the most challenging hurdle in deployment of DNA data storage remains the write throughput, which limits data storage capacity. We have developed the first nanoscale DNA storage writer, which we expect to scale DNA write density to 25 × 10^6 sequences per square centimeter, an improvement of three orders of magnitude over existing DNA synthesis arrays. We show confinement of DNA synthesis to an area under 1 square micrometer, parallelized over millions of nanoelectrode wells, and then successfully write and decode a message in DNA. DNA synthesis on this scale will enable write throughputs to reach megabytes per second and is a key enabler of a practical DNA data storage system.
Year: 2021 PMID: 34818035 PMCID: PMC8612674 DOI: 10.1126/sciadv.abi6714
Source DB: PubMed Journal: Sci Adv ISSN: 2375-2548 Impact factor: 14.136
Fig. 1. DNA data storage requires higher synthesis throughput than is possible with current techniques.
(A to D) Overview of the DNA data storage pipeline. (A) Digital data are encoded from their binary representation into sequences of DNA bases, with an identifier that correlates them with a data object, addressing information that is used to reorder the data when reading, and redundant information that is used for error correction. (B) These sequences are synthesized into DNA oligonucleotides and stored. (C) At retrieval time, the DNA molecules are selected and copied via PCR or other methods and sequenced back into electronic representations of the bases in these sequences. (D) The decoding process takes this noisy and sometimes incomplete set of sequencing reads, corrects for errors and missing sequences, and decodes the information to recover the data. (E) Summary of the commercial synthesis processes and corresponding estimated oligonucleotide densities, as reported in the literature or by the companies themselves (see text S2). Our electrochemical method density is highlighted in dark red.
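Steps (A) to (D) of the pipeline can be illustrated with a toy round trip. The sketch below is not the authors' codec: the 2-bits-per-base mapping, the one-byte identifier and address fields, and the function names are all assumptions chosen for illustration, and the redundant information used for error correction is omitted entirely.

```python
# Illustrative sketch only: a naive 2-bits-per-base mapping, NOT the paper's codec.
# Assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T.
BASES = "ACGT"


def bytes_to_bases(data: bytes) -> str:
    """Map each byte to four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11] for byte in data for shift in (6, 4, 2, 0)
    )


def bases_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_bases; expects a length that is a multiple of four."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


def encode(data: bytes, object_id: int, chunk_size: int = 8) -> list[str]:
    """Split data into chunks and prefix each with a hypothetical header:
    one byte of object identifier and one byte of address (chunk index)."""
    strands = []
    for addr, start in enumerate(range(0, len(data), chunk_size)):
        header = bytes([object_id, addr])
        strands.append(bytes_to_bases(header + data[start:start + chunk_size]))
    return strands


def decode(strands: list[str], object_id: int) -> bytes:
    """Select strands by identifier, reorder by address, join the payloads."""
    records = []
    for strand in strands:
        raw = bases_to_bytes(strand)
        if raw[0] == object_id:
            records.append((raw[1], raw[2:]))
    return b"".join(payload for _, payload in sorted(records))
```

Because every strand carries its own address, `decode` recovers the data even when the strands arrive in arbitrary order, which mirrors the role of the addressing information in panel (A). A practical system would also need the error-correction layer described in panel (D).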
Fig. 2. Overview of the 650-nm electrode array with a 2-μm pitch.
(A) Finite element analysis of anodic acid generation and diffusion at a 650-nm-diameter electrode with a 200-nm well is depicted with a cross-sectional view along the y = x plane and (B) top-down view on the z = 0 plane. The colors blue and yellow represent regions with relatively low and high acid concentrations, respectively. (C) An overview of the nanoscale DNA synthesis array with scanning electron microscopy images of the 650-nm electrode array and enlarged view of one electrode. (D) A fluorescent image in which the well surrounding each activated anode is patterned with AAA-fluorescein. The cartoon diagram depicts which electrodes in the layout were activated. (E) Illustration of the wells patterned with AAA-fluorescein and AAA-AquaPhluor and (F) corresponding image overlay of the two fluorophores at the end of DNA synthesized on the same 650-nm electrode array.
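The 2-μm pitch can be checked against the density quoted in the abstract. The short calculation below assumes a square grid with one synthesis site per electrode and 2 μm center-to-center spacing in both directions; the grid geometry and the function name are assumptions made for this check, not details taken from the paper.

```python
UM_PER_CM = 1e4  # 1 cm = 10,000 micrometers


def sites_per_cm2(pitch_um: float) -> float:
    """Sites per square centimeter for an assumed square grid of the given pitch."""
    sites_per_cm = UM_PER_CM / pitch_um  # sites along one 1-cm edge
    return sites_per_cm ** 2


density = sites_per_cm2(2.0)
print(f"{density:.2e} sites per cm^2")
```

At a 2-μm pitch this gives 5,000 sites per centimeter along each edge, or 25 × 10^6 sites per square centimeter, which matches the write density stated in the abstract.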
Fig. 3. Errors stemming from synthesis followed by sequencing.
(A) Insertions (Ins), deletions (Del), and substitutions (Sub) per position for a synthesized and PCR-amplified 180-base sequence. (B) Electrophoresis image of synthesis products after PCR amplification. (C) Message encoded into 64 bytes split into four unique sequences of 104 bases (top). Insertions, deletions, and substitutions per locus of each of the four sequences in the multiplex synthesis run. In every error analysis graph, the terminal 20 bases at both 3′ and 5′ ends come from the primers used in PCR and are not representative of the synthesized errors.
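Per-position error profiles of this kind are typically obtained by aligning each sequencing read to the intended sequence and tallying the edit operations. The sketch below uses a plain minimum-edit-distance alignment as a stand-in; it is an assumption for illustration, not the analysis pipeline used by the authors, and the function names are hypothetical. When several alignments are equally short, the traceback picks one of them, so the position assigned to an error inside a run of identical bases is a convention rather than a measurement.

```python
from collections import Counter


def align_ops(ref: str, read: str) -> list[tuple[str, int]]:
    """Return (operation, reference position) pairs for one minimum-edit-distance
    alignment of read against ref. Operations are "Ins", "Del", and "Sub"."""
    n, m = len(ref), len(read)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == read[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion from the reference
                dist[i][j - 1] + 1,         # insertion into the read
            )
    # Trace back from the bottom-right corner, preferring the diagonal.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        mismatch = i > 0 and j > 0 and ref[i - 1] != read[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + int(mismatch):
            if mismatch:
                ops.append(("Sub", i - 1))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("Del", i - 1))
            i -= 1
        else:
            ops.append(("Ins", i))  # inserted before reference position i
            j -= 1
    return ops[::-1]


def per_position_errors(ref: str, reads: list[str]) -> dict[str, Counter]:
    """Tally insertions, deletions, and substitutions per reference position."""
    counts = {"Ins": Counter(), "Del": Counter(), "Sub": Counter()}
    for read in reads:
        for op, pos in align_ops(ref, read):
            counts[op][pos] += 1
    return counts
```

Since the terminal 20 bases at each end come from the PCR primers, positions in those regions would be excluded before interpreting the tallies as synthesis errors.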