A new DNA-based model for finite field arithmetic.

Iván Jirón¹, Susana Soto¹, Sabrina Marín², Mauricio Acosta², Ismael Soto³.

Abstract

A Galois field $GF(p^n)$, with $p \geq 2$ a prime number and $n \geq 1$, is a mathematical structure widely used in Cryptography and Error Correcting Codes Theory. In this paper, we propose a novel DNA-based model for arithmetic over $GF(p^n)$. Our model has three main advantages over previously described models. First, it has a flexible laboratory implementation that allows arithmetic calculations to be carried out in parallel for $p \geq 2$, while the tile assembly and sticker models are limited to $p = 2$. Second, the proposed model is less prone to error, because it is grounded on conventional Polymerase Chain Reaction (PCR) amplification and gel electrophoresis techniques. Hence, the problems associated with models such as tile assembly and stickers, which arise when using more complex molecular techniques such as hybridization and denaturation, are avoided. Third, it is simple to implement and requires 50 ng/μL per DNA double fragment used to develop the calculations, since the only feature of interest is the size of the DNA double strand fragments. Our model has execution times of order $O(1)$ and $O(n)$ for addition and multiplication over $GF(p^n)$, respectively. Furthermore, this paper provides one of the few pieces of experimental evidence of arithmetic calculations for molecular computing and validates the technical applicability of the proposed model to perform arithmetic operations over $GF(p^n)$.

Keywords: Applied mathematics; Bioinformatics; DNA computing; Finite fields; Galois fields; Gel electrophoresis; Molecular computing technologies; Polymerase chain reaction (PCR)

Year: 2019 PMID: 31890936 PMCID: PMC6926258 DOI: 10.1016/j.heliyon.2019.e02901

Source DB: PubMed Journal: Heliyon ISSN: 2405-8440

Introduction

Fast-paced technological development keeps pushing computer science to new boundaries. The field of DNA computing was born to address hard computational problems. The strategy of most algorithms developed within this novel area of study is brute force, relying on the huge capacity for parallel processing of DNA computing. The interest in designing a molecular computer is not limited to difficult search problems. If a molecular computer were able to carry out addition and multiplication, a wider range of problems could be addressed. However, most of the work done in the field of DNA computing is theoretical. Researchers loosely rely on the supposed feasibility of the biomolecular techniques proposed in the works of Adleman (Adleman, 1994, 1996; Roweis et al., 1998), Lipton (Lipton, 1995), Winfree (Winfree et al., 1998; LaBean et al., 1999; Rothemund et al., 2004) and Rothemund (Rothemund, 2006). Most algorithms that have appeared in the literature are based on a small number of DNA computing models and were introduced with no experimental work to back up their actual implementation.

We propose a new DNA-based model specifically designed to do arithmetic over Galois fields, which was successfully implemented in the laboratory. Galois fields, $GF(p^n)$, are mathematical structures widely used in Cryptography and in Error Correcting Codes Theory. In Cryptography, the Diffie-Hellman key exchange scheme is implemented on elliptic and hyperelliptic curves defined over Galois fields (Menezes et al., 1996; Koblitz, 1998; Cohen et al., 2006). In Error Correcting Codes Theory, algebraic geometric codes use algebraic curves, such as Hermitian curves, defined over Galois fields (Sklar, 2001; Guajardo, 2004; Carrasco and Johnston, 2008).

Our model has two main properties. First, the molecular techniques employed, Polymerase Chain Reaction (PCR) and electrophoresis, are standard, widely used, easily implemented and inexpensive, with only a few designed components needed to carry out the experiments. Second, our model allows calculations over $GF(p^n)$, for a prime number $p \geq 2$ and an integer $n \geq 1$. In contrast, all works on DNA molecular computation over finite fields found in the literature are restricted to $p = 2$.

This paper is organized as follows. Section 2 introduces the DNA computing models. Section 3 presents basic mathematical concepts about Galois fields. Section 4 presents the proposed DNA-based model for arithmetic over Galois fields. Section 5 describes the physical molecular implementation of the proposed DNA-based model. Section 6 summarizes the experimental results obtained for a case study. Section 7 presents a simulation of the proposed DNA-based model using Field Programmable Gate Array (FPGA) technology. Section 8 contains the discussion of the experimental results, the analysis of the advantages and the description of a possible DNA-based computer that implements arithmetic over $GF(p^n)$ using the proposed model. Section 9 presents conclusions and future work.

DNA computing models

The tendency in computer technology is to produce devices with greater memory and speed than the previous generation, but much smaller. The idea of building a tiny computer is not new. In the late 1950s, Richard Feynman suggested the possibility of having sub-microscopic computers in his famous talk “There's Plenty of Room at the Bottom”. However, only about two decades ago Leonard Adleman made a breakthrough when he used the tools of molecular biology to address an NP-complete problem (Adleman, 1994). He succeeded in solving an instance of the Hamiltonian path problem by manipulating DNA. This event marked the birth of the field known as DNA computing (Kari, 1997).

The speed of any computing device, bio-molecular or not, depends on how many processes it runs in parallel and how many steps each process can perform per unit of time. Electronic computers can execute millions of instructions per second, a rate a biological system cannot match. However, a DNA computer has a huge advantage in parallel processing and memory (Goldman et al., 2013), and this compensates for the much slower execution time of a single instruction (Lipton, 1995; Guarnieri et al., 1996). The immense capacity for parallelization of DNA computing appeared to be the key to outperforming electronic computers. The advent of the new discipline at first augured the end of silicon-based computers; however, scientists in the field soon acknowledged that there were obstacles in the way of realizing a competitive DNA-based computer (Gibbons et al., 1996; Regalado, 2000).

The models developed for DNA computing can be classified into two types: those which require human intervention during the process of calculation and those that can be programmed to function autonomously. Early research, following the works of Adleman (Adleman, 1994) and Lipton (Lipton, 1995), provided a variety of non-autonomous models, known as filtering models, for solving complex computational problems.
Filtering models use large DNA combinatorial libraries as search spaces for parallel filtering algorithms (Ignatova et al., 2008). Most of these works were theoretical (Adleman, 1996; Gibbons et al., 1996; Reif, 1995; Rozenberg and Spaink, 2003); however, a few specific problems were actually solved in the laboratory: 3-SAT problems with 3 (Liu et al., 2000), 6 (Braich et al., 2000) and 20 (Braich et al., 2002) variables, and a variation of the SAT problem, known as the knight problem (Faulhammer et al., 1999).

To solve a wider range of problems, a computer should be able to carry out addition and multiplication. However, carrying out binary operations poses other challenges. Guarnieri and colleagues (Guarnieri et al., 1996) presented a general algorithm to perform addition of two nonnegative binary numbers. In the same year, Roweis and colleagues introduced the sticker model, a complete and universal system (Roweis et al., 1998), which has been considered for arithmetic over finite fields (Chang et al., 2005; Guo and Zhang, 2009; Li et al., 2013a). The sticker system is also a filtering model, which uses two types of single stranded DNA molecules, named memory strand and sticker strand. A memory strand and a number of sticker strands hybridize to form a partial duplex (memory complex), which represents a bit string of zeros and ones. The main issues with this model are the limited length of a memory complex (it might fragment if it is longer than 15,000 bases) and time-consuming operations that are prone to error (stickers may bind to the wrong sites, or unbind when they are not supposed to).

In 1998, Erik Winfree (Winfree, 1998) provided a remarkable new approach in the emerging field, when he proposed that DNA self-assembly could be used to do computation in an autonomous manner.
Winfree explored algorithmic self-assembly, which is the result of combining Wang's tiling theory (Wang, 1961) with DNA nanotechnology, introduced by Seeman (Seeman, 1982). Winfree showed that DNA computation is Turing-universal and proposed that DNA self-assembly can be used to compute functions or assemble shapes (Winfree, 1996; Winfree et al., 1998; Rothemund et al., 2004). The model introduced by Winfree and colleagues, known as the tile assembly model (TAM), has been considered for implementing arithmetic over a finite field (Barua and Das, 2003; Li et al., 2013b, 2016; Li and Xiao, 2014). TAM is based on the self-assembly of double-crossover DNA molecules (known as tiles) into a rectangular lattice, a pseudo-crystalline growth that occurs in the presence of an infinite supply of a finite number of tile types (Rothemund and Winfree, 2000; Jonoska et al., 2011). Tiles glue together or not depending on the binding domains on their sides. To carry out a computation, one must start with an arrangement of tiles, called the seed configuration, and a set of unattached tiles of different types. The calculation proceeds by annealing, ligation and melting, which occur in a controlled manner. A final configuration containing the result is obtained (Brun, 2007). The disadvantages of this model are the high error rate, the large number of components that a single calculation requires, and the fact that the seed configuration cannot be recycled (Brun, 2008; Brun and Medvidovic, 2007).

Despite the progress achieved in the field of DNA computing, major drawbacks such as time-consuming operations with a high error rate, output that follows statistical laws, and an amount of DNA molecules that grows exponentially with problem size are still unresolved in all the mentioned models (Kari et al., 2012). Recently, Woods and colleagues presented a reprogrammable model of self-assembly (Woods et al., 2019). In addition, Currin and colleagues presented a non-deterministic Turing-universal model intended to overcome the problems that previous models posed (Currin et al., 2017). However, the drawbacks associated with the complexity of the experiments are still an issue.

Basic concepts about Galois fields

In this section, basic concepts about Galois fields are presented (Guajardo, 2004; Hungerford, 2012; Koblitz, 1998; Menezes et al., 1996; Sklar, 2001). A Galois field $GF(p) = \{0, 1, \ldots, p-1\}$ is a finite set with addition, $+$, and multiplication, $*$, modulo $p$, defined in Tables 1(a) and (b), respectively. Here, $p$ is a prime number.

Table 1

Definitions for addition and multiplication in $GF(p)$.

(a)

+	0	1	2	⋯	p−1
0	0	1	2	⋯	p−1
1	1	2	3	⋯	0
2	2	3	4	⋯	1
⋮	⋮	⋮	⋮	⋯	⋮
p−1	p−1	0	1	⋯	p−2
(b)

∗	0	1	2	⋯	p−1
0	0	0	0	⋯	0
1	0	1	2	⋯	p−1
2	0	2	4	⋯	p−2
⋮	⋮	⋮	⋮	⋯	⋮
p−1	0	p−1	p−2	⋯	1

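Next, we briefly explain the method for constructing an extension field $GF(p^n)$, with $p \geq 2$ a prime number and $n \geq 1$, using $GF(p)$ as the underlying field. First, an irreducible polynomial of degree $n$ over $GF(p)$ is selected,

$$f(x) = x^n + f_{n-1}x^{n-1} + \cdots + f_1 x + f_0, \qquad (1)$$

where $f_i \in GF(p)$ for $i = 0, 1, \ldots, n-1$. The polynomial $f(x)$ is taken to be a primitive polynomial, so that the powers of a root generate every nonzero element of the field. Let $\alpha$ be a root of $f(x)$, that is, $f(\alpha) = 0$; then

$$\alpha^n = -f_{n-1}\alpha^{n-1} - \cdots - f_1\alpha - f_0, \qquad (2)$$

where $-f_i$ is the additive inverse of $f_i$ according to Table 1a. Next, $\alpha^{n+1}$ is constructed recursively as

$$\alpha^{n+1} = \alpha \cdot \alpha^n = -f_{n-1}\alpha^n - f_{n-2}\alpha^{n-1} - \cdots - f_0\alpha,$$

and the element $\alpha^n$ is replaced using Eq. (2). Then $\alpha^{n+1}$ is again a linear combination of $1, \alpha, \ldots, \alpha^{n-1}$, whose coefficients are computed in $GF(p)$ with Tables 1(a) and (b). Thus, the nonzero elements of $GF(p^n)$ are generated as linear combinations of $1, \alpha, \ldots, \alpha^{n-1}$ in the following manner,

$$\alpha^i = c_{i,n-1}\alpha^{n-1} + \cdots + c_{i,1}\alpha + c_{i,0},$$

with $c_{i,j} \in GF(p)$, $j = 0, 1, \ldots, n-1$, $i = 0, 1, \ldots, p^n - 2$. We should note that $\alpha^{p^n - 1} = 1$, and that the null element does not have a representation as a power of $\alpha$. Hence, the field $GF(p^n)$ has $p^n$ elements, which are stored in a look-up table according to the powers of each element.

Next, we explain how addition and multiplication of the elements of the field are carried out. Let $A, B \in GF(p^n)$, where

$$A = \sum_{i=0}^{n-1} a_i\alpha^i, \qquad B = \sum_{i=0}^{n-1} b_i\alpha^i.$$

Their addition is calculated as follows,

$$A + B = \sum_{i=0}^{n-1} (a_i + b_i)\alpha^i,$$

where $a_i + b_i$ is calculated in $GF(p)$ using Table 1a, for $i = 0, 1, \ldots, n-1$. There is no carry or borrow, because the coefficient sums $a_i + b_i$ are independent of each other.

On the other hand, the multiplication, $A * B$, is calculated using Algorithm 1 (Guajardo, 2004). In the seventh step of the algorithm, one value must be set explicitly for a particular boundary case. In the following sections, we will refer to steps 2 to 9 as the external cycle, and to the two internal for cycles, that is, the first cycle from steps 3 to 5 and the second cycle from steps 6 to 8, as cycles IF-A and IF-B, respectively. These can be executed in parallel, since they are independent of each other.

Algorithm 1. Multiplication for $A, B \in GF(p^n)$.

For the field $GF(3)$, addition and multiplication are defined in Tables 2(a) and (b), respectively.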

Table 2

Definitions for addition and multiplication in $GF(3)$.

(a)

+	0	1	2
0	0	1	2
1	1	2	0
2	2	0	1
(b)

∗	0	1	2
0	0	0	0
1	0	1	2
2	0	2	1
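
As an illustration, tables of this kind can be regenerated by reducing ordinary integer sums and products modulo $p$. The following is a minimal Python sketch; the function names `gf_tables` and `additive_inverse` are hypothetical helpers introduced here for illustration and are not part of the proposed model.

```python
def gf_tables(p):
    """Return the (addition, multiplication) tables of GF(p) as nested lists.

    Entry [a][b] of each table holds a + b or a * b reduced modulo p.
    p is assumed to be prime; this sketch does not check it.
    """
    elements = range(p)
    add = [[(a + b) % p for b in elements] for a in elements]
    mul = [[(a * b) % p for b in elements] for a in elements]
    return add, mul


def additive_inverse(a, p):
    """Return the element -a of GF(p), i.e. the x with (a + x) % p == 0."""
    return (-a) % p


add3, mul3 = gf_tables(3)
for row in add3:
    print(row)
for row in mul3:
    print(row)
```

With $p = 3$, the two nested lists agree row by row with Tables 2(a) and (b).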

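The extension field $GF(3^5)$ is constructed using the primitive polynomial $f(x) = x^5 + 2x + 1$. If $\alpha$ is a root of $f(x)$, then

$$\alpha^5 = -2\alpha - 1 = \alpha + 2.$$

Now, we construct the non-null elements of $GF(3^5)$ recursively as follows,

$$\alpha^6 = \alpha^2 + 2\alpha, \quad \alpha^7 = \alpha^3 + 2\alpha^2, \quad \alpha^8 = \alpha^4 + 2\alpha^3, \quad \alpha^9 = \alpha^5 + 2\alpha^4 = 2\alpha^4 + \alpha + 2, \quad \ldots$$

Equivalently, we can represent the elements of $GF(3^5)$ as arrays of elements in $GF(3)$. Next, we build a look-up table that contains all the elements of $GF(3^5)$. In particular, this field has $3^5 = 243$ elements, and Table 3 shows some of them.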

Table 3

Look-up table with some non-null elements of $GF(3^5)$.

α^i	α^4	α^3	α^2	α	1
i=0	0	0	0	0	1
i=1	0	0	0	1	0
i=2	0	0	1	0	0
i=3	0	1	0	0	0
i=4	1	0	0	0	0
i=5	0	0	0	1	2
i=6	0	0	1	2	0
i=7	0	1	2	0	0
i=8	1	2	0	0	0
i=9	2	0	0	1	2
⋮	⋮	⋮	⋮	⋮	⋮
i=20	1	2	0	2	1
⋮	⋮	⋮	⋮	⋮	⋮
i=26	1	2	0	2	2
⋮	⋮	⋮	⋮	⋮	⋮
i=35	1	0	1	2	2
⋮	⋮	⋮	⋮	⋮	⋮
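
The look-up table can be reproduced in software by repeatedly multiplying by α and reducing with the rule that can be read off row i=5 of Table 3, namely α^5 = α + 2. The sketch below assumes that rule; `power_table` and `as_row` are hypothetical names introduced here, and coefficient vectors are stored lowest degree first.

```python
def power_table(p, reduction, count):
    """Return coefficient vectors of alpha^0 .. alpha^(count-1) over GF(p).

    `reduction` lists the coefficients of alpha^n expressed in the basis
    1, alpha, ..., alpha^(n-1), lowest degree first (an assumption of this
    sketch). Each returned vector is also lowest degree first.
    """
    n = len(reduction)
    current = [1] + [0] * (n - 1)          # alpha^0 = 1
    table = []
    for _ in range(count):
        table.append(tuple(current))
        overflow = current[-1]             # coefficient that would reach alpha^n
        shifted = [0] + current[:-1]       # multiply by alpha: shift up one degree
        # replace overflow * alpha^n using the reduction rule, modulo p
        current = [(s + overflow * r) % p for s, r in zip(shifted, reduction)]
    return table


def as_row(vector):
    """Reorder a vector to the column order of Table 3 (alpha^4 ... alpha, 1)."""
    return tuple(reversed(vector))


# alpha^5 = alpha + 2 in GF(3^5): coefficients (2, 1, 0, 0, 0), lowest first
table = power_table(3, [2, 1, 0, 0, 0], 36)
for i in (5, 9, 20, 26, 35):
    print(i, as_row(table[i]))
```

The printed rows for i = 5, 9, 20, 26 and 35 coincide with the corresponding rows of Table 3.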

We use Algorithm 1 to calculate a multiplication of two elements of $GF(3^5)$. Initially, the array used by the algorithm is initialized. Then, the input values are set to the coefficient arrays of the two operands, together with the coefficients of the primitive polynomial. In Tables 4, 5, 6, 7, and 8, we detail the iterations of Algorithm 1 for this product.
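
For readers who want to experiment, one common way to multiply in a polynomial basis is shift-and-add (Horner) evaluation with a reduction after each shift. The sketch below is a generic version of that idea, not a transcription of Algorithm 1. It assumes the rule α^5 = α + 2 from row i=5 of Table 3, and the names `gf_add`, `times_alpha`, `gf_mul` and `from_row` are hypothetical.

```python
P = 3
REDUCTION = [2, 1, 0, 0, 0]   # alpha^5 = alpha + 2, lowest degree first (assumed)


def from_row(row):
    """Convert a Table 3 row (alpha^4 ... alpha, 1) to lowest-degree-first order."""
    return list(reversed(row))


def gf_add(a, b, p=P):
    """Coefficient-wise addition in GF(p); no carry between positions."""
    return [(x + y) % p for x, y in zip(a, b)]


def times_alpha(c, p=P, reduction=REDUCTION):
    """Multiply c by alpha, then reduce the alpha^n term with the rule above."""
    overflow = c[-1]
    shifted = [0] + c[:-1]
    return [(s + overflow * r) % p for s, r in zip(shifted, reduction)]


def gf_mul(a, b, p=P, reduction=REDUCTION):
    """Shift-and-add product of a and b, scanning a from its highest coefficient."""
    c = [0] * len(a)
    for a_i in reversed(a):
        c = times_alpha(c, p, reduction)                        # c <- alpha * c
        c = [(c_j + a_i * b_j) % p for c_j, b_j in zip(c, b)]   # c <- c + a_i * b
    return c


alpha9 = from_row((2, 0, 0, 1, 2))
alpha26 = from_row((1, 2, 0, 2, 2))
product = gf_mul(alpha9, alpha26)
print(list(reversed(product)))
```

As a self-check against Table 3, multiplying the rows i=9 and i=26 yields the row i=35, as expected from α^9 · α^26 = α^35. Each pass of the outer loop performs one shift with reduction and one coefficient-wise accumulation, so the loop runs $n$ times for elements of $GF(p^n)$.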

Table 4