Shufang Zhang1, Beibei Huang1, Xiangming Song1, Tao Zhang1, Hanjie Wang2, Yuhong Liu3.
Abstract
DNA has been recognized as a promising natural medium for information storage. Because DNA synthesis is expensive, using nucleotides optimally and increasing the storage density is an important challenge. Thus, a novel scheme is proposed for the storage of digital information in synthetic DNA with high storage density and perfect error correction capability. The proposed strategy introduces quaternary Huffman coding to compress the binary stream of an original file before it is converted into a DNA sequence. The proposed quaternary Huffman coding is based on the statistical properties of the source and can achieve a very high compression ratio for files whose source symbols have a non-uniform probability distribution. Consequently, the amount of information that each base can store increases, and the storage density improves. In addition, a quaternary Hamming code with low redundancy is proposed to correct errors that occur during synthesis and sequencing. We have successfully converted a total of 5.2 KB of files into 3934 bits in DNA bases. The results of a biological experiment indicate that the storage density of the proposed scheme is higher than that of state-of-the-art schemes.
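The quaternary Huffman idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the choice of bytes as source symbols, the digit-to-base mapping in `BASES`, and the function names `quaternary_huffman`, `encode`, and `decode` are all assumptions made for illustration. The sketch builds a 4-ary Huffman tree, so frequent symbols receive shorter base strings than rare ones.

```python
import heapq
from collections import Counter
from itertools import count

BASES = "ACGT"  # assumed mapping of the four tree branches to nucleotides


def quaternary_huffman(data: bytes) -> dict[int, str]:
    """Build a 4-ary Huffman code mapping each byte value to a base string."""
    freq = Counter(data)
    if len(freq) == 1:
        # A single distinct symbol still needs a non-empty codeword.
        return {next(iter(freq)): BASES[0]}
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(w, next(tiebreak), {sym: ""}) for sym, w in freq.items()]
    # Pad with zero-weight dummy leaves so every merge takes exactly 4 nodes.
    while (len(heap) - 1) % 3 != 0:
        heap.append((0, next(tiebreak), {}))
    heapq.heapify(heap)
    while len(heap) > 1:
        total, merged = 0, {}
        for base in BASES:
            weight, _, codes = heapq.heappop(heap)
            total += weight
            for sym, code in codes.items():
                merged[sym] = base + code  # prepend this branch's base
        heapq.heappush(heap, (total, next(tiebreak), merged))
    return heap[0][2]


def encode(data: bytes, codes: dict[int, str]) -> str:
    """Concatenate the codeword of every byte into one DNA string."""
    return "".join(codes[b] for b in data)


def decode(dna: str, codes: dict[int, str]) -> bytes:
    """Walk the DNA string and emit a byte whenever a codeword completes."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = bytearray(), ""
    for base in dna:
        current += base
        if current in inverse:  # safe because the code is prefix-free
            out.append(inverse[current])
            current = ""
    return bytes(out)
```

For comparison, a fixed mapping of 2 bits per base needs 4 bases per byte, so any skew in the byte distribution lets the variable-length code use fewer bases. The error-correcting layer (the quaternary Hamming code) is not covered by this sketch.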
Keywords: DNA information storage; Quaternary Huffman code; Storage density; Synthetic DNA
Year: 2019 PMID: 31497460 PMCID: PMC6708515 DOI: 10.1007/s13205-019-1868-4
Source DB: PubMed Journal: 3 Biotech ISSN: 2190-5738 Impact factor: 2.406