
A new DNA-based model for finite field arithmetic.

Iván Jirón1, Susana Soto1, Sabrina Marín2, Mauricio Acosta2, Ismael Soto3.   

Abstract

A Galois field $GF(p^n)$, with $p \geq 2$ a prime number and $n \geq 1$, is a mathematical structure widely used in Cryptography and Error Correcting Codes Theory. In this paper, we propose a novel DNA-based model for arithmetic over $GF(p^n)$. Our model has three main advantages over previously described models. First, it has a flexible implementation in the laboratory that allows arithmetic calculations to be carried out in parallel for $p \geq 2$, while the tile assembly and the sticker models are limited to $p = 2$. Second, the proposed model is less prone to error, because it is grounded on conventional Polymerase Chain Reaction (PCR) amplification and gel electrophoresis techniques. Hence, the problems associated with models such as tile assembly and stickers, which arise when using more complex molecular techniques such as hybridization and denaturation, are avoided. Third, it is simple to implement and requires 50 ng/μL per double-stranded DNA fragment used to carry out the calculations, since the only feature of interest is the size of the double-stranded DNA fragments. Our model has execution times of order $O(1)$ and $O(n)$ for addition and multiplication over $GF(p^n)$, respectively. Furthermore, this paper provides one of the few pieces of experimental evidence of arithmetic calculations in molecular computing and validates the technical applicability of the proposed model to perform arithmetic operations over $GF(p^n)$.
© 2019 Published by Elsevier Ltd.

Keywords:  Applied mathematics; Bioinformatics; DNA computing; Finite fields; Galois fields; Gel electrophoresis; Molecular computing technologies; Polymerase chain reaction (PCR)

Year:  2019        PMID: 31890936      PMCID: PMC6926258          DOI: 10.1016/j.heliyon.2019.e02901

Source DB:  PubMed          Journal:  Heliyon        ISSN: 2405-8440


Introduction

The fast-paced technological development keeps pushing computer science to new boundaries. The field of DNA computing was born to address hard computational problems. The strategy of most algorithms developed within this novel area of study is brute force, relying on the huge capacity of DNA computing for parallel processing. The interest in designing a molecular computer is not limited to difficult search problems. If such a computer were able to carry out addition and multiplication, a wider range of problems could be addressed. However, most of the work done in the field of DNA computing is theoretical. Researchers loosely rely on the supposed feasibility of the biomolecular techniques proposed in the works of Adleman (Adleman, 1994, 1996; Roweis et al., 1998), Lipton (Lipton, 1995), Winfree (Winfree et al., 1998; LaBean et al., 1999; Rothemund et al., 2004) and Rothemund (Rothemund, 2006). Most algorithms that have appeared in the literature are based on a small number of DNA computing models and were introduced with no experimental work to back up their actual implementation. We propose a new DNA-based model specifically designed to do arithmetic over Galois fields, which was successfully implemented in the laboratory. Galois fields, $GF(p^n)$, are mathematical structures widely used in Cryptography and in Error Correcting Codes Theory. In Cryptography, the key exchange scheme of Diffie-Hellman is implemented on elliptic and hyperelliptic curves defined on Galois fields (Menezes et al., 1996; Koblitz, 1998; Cohen et al., 2006). On the other hand, in Error Correcting Codes Theory, algebraic geometric codes use algebraic curves, such as Hermitian curves, defined on Galois fields (Sklar, 2001; Guajardo, 2004; Carrasco and Johnston, 2008). Our model has two main properties. First, the molecular techniques employed, Polymerase Chain Reaction (PCR) and electrophoresis, are standard, widely used, easily implemented and inexpensive, with only a few designed components needed to carry out the experiments. Second, our model allows calculations over $GF(p^n)$, for a prime number $p \geq 2$ and an integer $n \geq 1$. In contrast, all works on DNA molecular computation over finite fields found in the literature are restricted to $GF(2)$ and $GF(2^n)$.
This paper is organized as follows. Section 2 introduces the DNA computing model. Section 3 presents basic mathematical concepts about Galois fields. Section 4 presents the proposed DNA-based model for arithmetic over Galois fields. Section 5 describes the physical molecular implementation of the proposed DNA-based model. Section 6 summarizes the experimental results obtained for $GF(3^5)$ as a case study. Section 7 presents a simulation of the proposed DNA-based model using Field Programmable Gate Array (FPGA) technology. Section 8 contains the discussion of the experimental results, the analysis of the advantages and the description of a possible DNA-based computer that implements arithmetic over $GF(p^n)$ using the proposed model. Section 9 presents conclusions and future work.

DNA computing models

The tendency in computer technology is to produce devices with greater memory and speed than the previous generation but much smaller. The idea of building a tiny computer is not new. In the late 1950s, Richard Feynman suggested the possibility of having sub-microscopic computers in his famous talk “There's Plenty of Room at the Bottom”. However, only about two decades ago Leonard Adleman made a breakthrough when he used the tools of molecular biology to address an NP-complete problem (Adleman, 1994). He succeeded in solving a case of the Hamiltonian path, by manipulating DNA. This event marked the birth of the field known as DNA computing (Kari, 1997). The speed of any computing device, bio-molecular or not, depends on how many parallel processes it has and how many steps, per each process, it can realize per unit of time. Electronic computers can calculate millions of instructions per second, a task a biological system cannot emulate. However, a DNA computer has a huge advantage in parallel processing and memory (Goldman et al., 2013) and this compensates for the much slower execution time for one instruction (Lipton, 1995; Guarnieri et al., 1996). The immense capacity for parallelization of DNA computing appeared to be the key to outperform electronic computers. The advent of the new discipline at first augured the end of silicon-based computers, however, scientists in the field soon acknowledged there were some obstacles in the way of realizing a competitive DNA-based computer (Gibbons et al., 1996; Regalado, 2000). The models developed for DNA computing can be classified in two types: those which require human intervention during the process of calculation and those that can be programmed to function autonomously. Early research, following the works of Adleman (Adleman, 1994) and Lipton (Lipton, 1995), provided a variety of non-autonomous models, known as filtering models, for solving complex computational problems. 
Filtering models use large DNA combinatorial libraries as search spaces for algorithms of parallel filtering (Ignatova et al., 2008). Most of these works were theoretical (Adleman, 1996; Gibbons et al., 1996; Reif, 1995; Rozenberg and Spaink, 2003), however, a few specific problems were actually solved in the laboratory: a 3-SAT problem with 3 (Liu et al., 2000), 6 (Braich et al., 2000) and 20 (Braich et al., 2002) variables, and a variation of the SAT problem, known as the knight problem (Faulhammer et al., 1999). To solve a wider range of problems a computer should be able to carry out addition and multiplication. However, carrying out binary operations poses other challenges. Guarnieri and colleagues (Guarnieri et al., 1996) presented a general algorithm to perform addition of two nonnegative binary numbers. In the same year, Roweis and colleagues introduced the sticker model, a complete and universal system (Roweis et al., 1998), which has been considered to do arithmetic over finite fields (Chang et al., 2005; Guo and Zhang, 2009; Li et al., 2013a). The sticker system is also a filtering model, which uses two types of single stranded DNA molecules, named memory strand and sticker strand. A memory strand and a number of sticker strands, hybridize to form a partial duplex (memory complex), which represents a bit string of zeros and ones. The main issues with this model are the limited length of a memory complex - it might fragment if it is longer than 15,000 bases – and time consuming operations, which are prone to error - stickers may bind to the wrong sites, or unbind when they are not supposed to. In 1998, Erik Winfree (Winfree, 1998) provided a remarkable new approach in the emerging field, when he proposed that DNA self-assembly could be used to do computation in an autonomous manner. 
Winfree explored algorithmic self-assembly, which results from combining Wang's tiling theory (Wang, 1961) with DNA nanotechnology, introduced by Seeman (Seeman, 1982). Winfree showed that DNA computation is Turing-universal and proposed that DNA self-assembly can be used to compute functions or assemble shapes (Winfree, 1996; Winfree et al., 1998; Rothemund et al., 2004). The model introduced by Winfree and colleagues, known as the tile assembly model (TAM), has been considered to implement arithmetic over a finite field (Barua and Das, 2003; Li et al., 2013b, 2016; Li and Xiao, 2014). TAM is based on the self-assembly of double-crossover DNA molecules (known as tiles) into a rectangular lattice, a pseudo-crystalline growth that occurs in the presence of an infinite supply of a finite number of tile types (Rothemund and Winfree, 2000; Jonoska et al., 2011). Tiles glue together or not depending on the binding domains on their sides. To carry out a computation one must start with an arrangement of tiles, called the seed configuration, and a set of unattached tiles of different types. The calculation proceeds by annealing, ligation and melting, which occur in a controlled manner. A final configuration containing the result is obtained (Brun, 2007). The disadvantages of this model are the high error rate, the large number of components that a single calculation requires, and the fact that the seed configuration cannot be recycled (Brun, 2008; Brun and Medvidovic, 2007). Despite the progress achieved in the field of DNA computing, major drawbacks such as time-consuming operations with a high error rate, output that follows statistical laws, and an amount of DNA molecules that grows exponentially with problem size remain unresolved in all the mentioned models (Kari et al., 2012). Recently, Woods and colleagues have presented a reprogrammable model of self-assembly (Woods et al., 2019).
On the other hand, Currin and colleagues presented a non-deterministic Turing universal model that promised to overcome the problems posed by previous models (Currin et al., 2017). However, the drawbacks associated with the complexity of the experiments are still an issue.

Basic concepts about Galois fields

In this section, basic concepts about Galois fields are presented (Guajardo, 2004; Hungerford, 2012; Koblitz, 1998; Menezes et al., 1996; Sklar, 2001). A Galois field $GF(p) = \{0, 1, \ldots, p-1\}$ is a finite set with addition, $+$, and multiplication, $\cdot$, modulo $p$, defined in Tables 1(a) and (b), respectively. Here, $p$ is a prime number.
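Table 1

Definitions for addition and multiplication in $GF(p)$.

(a)

+ | 0 | 1 | 2 | ⋯ | p−1
0 | 0 | 1 | 2 | ⋯ | p−1
1 | 1 | 2 | 3 | ⋯ | 0
2 | 2 | 3 | 4 | ⋯ | 1
⋮ | ⋮ | ⋮ | ⋮ | | ⋮
p−1 | p−1 | 0 | 1 | ⋯ | p−2

(b)

· | 0 | 1 | 2 | ⋯ | p−1
0 | 0 | 0 | 0 | ⋯ | 0
1 | 0 | 1 | 2 | ⋯ | p−1
2 | 0 | 2 | 4 | ⋯ | p−2
⋮ | ⋮ | ⋮ | ⋮ | | ⋮
p−1 | 0 | p−1 | p−2 | ⋯ | 1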
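As a minimal sketch of what Table 1 encodes, the two tables can be generated by reducing ordinary integer sums and products modulo $p$. The function name `gf_tables` is hypothetical and is not part of the original work.

```python
def gf_tables(p):
    """Return the (addition, multiplication) tables of GF(p) as lists of rows.

    Row r, column s holds r + s (or r * s) reduced modulo the prime p.
    """
    add = [[(r + s) % p for s in range(p)] for r in range(p)]
    mul = [[(r * s) % p for s in range(p)] for r in range(p)]
    return add, mul


add3, mul3 = gf_tables(3)
for row in add3:
    print(row)
```

For a general prime, the last row of the addition table starts at p−1 and wraps around to p−2, matching the last row of Table 1(a).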
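Next, we briefly explain the method for constructing an extension field $GF(p^n)$, with $p$ prime and $n \geq 2$, using $GF(p)$ as the underlying field. First, an irreducible polynomial of degree $n$ over $GF(p)$ is selected,

$$Q(x) = x^n + q_{n-1}x^{n-1} + \cdots + q_1 x + q_0, \qquad (1)$$

where $q_j \in GF(p)$ for $j = 0, \ldots, n-1$. The polynomial $Q(x)$ is called a primitive polynomial. Let $\alpha$ be a root of $Q(x)$, that is, $Q(\alpha) = 0$; then

$$\alpha^n = -q_{n-1}\alpha^{n-1} - \cdots - q_1\alpha - q_0, \qquad (2)$$

where $-q_j$ is the additive inverse of $q_j$ according to Table 1a. Next, $\alpha^{n+1}$ is constructed recursively as $\alpha^{n+1} = \alpha \cdot \alpha^n$, and the element $\alpha^n$ that appears is replaced using Eq. (2), so that $\alpha^{n+1}$ is again a linear combination of $1, \alpha, \ldots, \alpha^{n-1}$ with coefficients in $GF(p)$. Thus, the nonzero elements of $GF(p^n)$ are generated as linear combinations of $1, \alpha, \ldots, \alpha^{n-1}$ in the following manner,

$$\alpha^i = a_{n-1}\alpha^{n-1} + \cdots + a_1\alpha + a_0, \qquad (3)$$

with $a_j \in GF(p)$, $j = 0, \ldots, n-1$, $i = 0, \ldots, p^n - 2$. We should note that $\alpha^{p^n-1} = 1$, and the null element does not have a representation as a power of $\alpha$. Hence, the field $GF(p^n)$ has $p^n$ elements, which are stored in a look-up table according to the powers of each element.

Next, we explain how addition and multiplication of the elements of the field $GF(p^n)$ are carried out. Let $a, b \in GF(p^n)$, where $a = a_{n-1}\alpha^{n-1} + \cdots + a_1\alpha + a_0$ and $b = b_{n-1}\alpha^{n-1} + \cdots + b_1\alpha + b_0$. Their addition is calculated as follows,

$$a + b = (a_{n-1} + b_{n-1})\alpha^{n-1} + \cdots + (a_1 + b_1)\alpha + (a_0 + b_0), \qquad (4)$$

where $a_j + b_j$ is calculated in $GF(p)$ using Table 1a, for $j = 0, \ldots, n-1$. There is no carry or borrow, because the coefficients are independent of each other. On the other hand, the multiplication, $c = a \cdot b$, is calculated using Algorithm 1 (Guajardo, 2004). In the seventh step of the algorithm, we must set $a_{j-1} = 0$ when $j = 0$. In the following sections, we will refer to steps 2 to 9 as the external cycle and to the two internal for cycles, that is, the first cycle from steps 3 to 5 and the second cycle from steps 6 to 8, as cycles IF-A and IF-B, respectively. These can be executed in parallel, since they are independent of each other.

Algorithm 1: Multiplication for $GF(p^n)$.

For the field $GF(3)$, the addition and multiplication are defined in Tables 2(a) and (b), respectively.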
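The listing of Algorithm 1 did not survive in this copy of the text. The following is a hedged sketch of coefficient-wise addition and of the multiplication procedure as it can be inferred from the step numbering above and from the worked iterations in Tables 4 to 8. The function names `gf_add` and `gf_mul` and the list layout (highest power first) are assumptions made for illustration, not the authors' notation.

```python
def gf_add(a, b, p):
    """Coefficient-wise addition in GF(p^n); no carry between positions."""
    return [(x + y) % p for x, y in zip(a, b)]


def gf_mul(a, b, q, p):
    """Multiply a and b in GF(p^n) (reconstructed reading of Algorithm 1).

    a, b : coefficient lists, highest power first: [a_{n-1}, ..., a_0]
    q    : [q_{n-1}, ..., q_0] of the monic primitive polynomial Q(x)
    """
    n = len(a)
    a = list(a)
    c = [0] * n                                   # step 1
    for i in range(n):                            # steps 2-9: external cycle
        b_i = b[n - 1 - i]
        # Cycle IF-A (steps 3-5): c_j <- c_j + b_i * a_j
        c = [(cj + b_i * aj) % p for cj, aj in zip(c, a)]
        # Cycle IF-B (steps 6-8): a_j <- a_{j-1} - q_j * a_{n-1}, a_{-1} = 0
        top = a[0]
        a = [(low - qj * top) % p for low, qj in zip(a[1:] + [0], q)]
    return c


print(gf_add([1, 2], [2, 2], 3))  # [0, 1]
```

Cycle IF-B multiplies the running operand by $\alpha$ and reduces it with Eq. (2), which is why every new coefficient uses the old leading coefficient $a_{n-1}$.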
Table 2

Definitions for addition and multiplication in $GF(3)$.

(a)

+ | 0 | 1 | 2
0 | 0 | 1 | 2
1 | 1 | 2 | 0
2 | 2 | 0 | 1

(b)

· | 0 | 1 | 2
0 | 0 | 0 | 0
1 | 0 | 1 | 2
2 | 0 | 2 | 1
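The extension field $GF(3^5)$ is constructed using the primitive polynomial $Q(x) = x^5 + 2x + 1$. If $\alpha$ is a root of $Q(x)$, then $\alpha^5 = -2\alpha - 1 = \alpha + 2$. Now, we construct the non-null elements of $GF(3^5)$ recursively as follows: $\alpha^6 = \alpha \cdot \alpha^5 = \alpha^2 + 2\alpha$, $\alpha^7 = \alpha^3 + 2\alpha^2$, and so on. Equivalently, we can represent the elements of $GF(3^5)$ as arrays of elements in $GF(3)$; for instance, $\alpha^5 = (0, 0, 0, 1, 2)$. Next we build a look-up table that contains all the elements of $GF(3^5)$. In particular, this field has $3^5 = 243$ elements and Table 3 shows some of its elements.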
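Table 3

Look-up table with some non-null elements of $GF(3^5)$.

α^i | α^4 | α^3 | α^2 | α | 1
i=0 | 0 | 0 | 0 | 0 | 1
i=1 | 0 | 0 | 0 | 1 | 0
i=2 | 0 | 0 | 1 | 0 | 0
i=3 | 0 | 1 | 0 | 0 | 0
i=4 | 1 | 0 | 0 | 0 | 0
i=5 | 0 | 0 | 0 | 1 | 2
i=6 | 0 | 0 | 1 | 2 | 0
i=7 | 0 | 1 | 2 | 0 | 0
i=8 | 1 | 2 | 0 | 0 | 0
i=9 | 2 | 0 | 0 | 1 | 2
i=20 | 1 | 2 | 0 | 2 | 1
i=26 | 1 | 2 | 0 | 2 | 2
i=35 | 1 | 0 | 1 | 2 | 2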
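The rows of Table 3 can be regenerated by repeatedly multiplying by $\alpha$ and reducing with $\alpha^5 = \alpha + 2$. The sketch below does this for the primitive polynomial whose coefficients are listed in the worked example ($q_4, \ldots, q_0 = 0, 0, 0, 2, 1$); the names `times_alpha` and `power_table` are hypothetical.

```python
P = 3
Q = [0, 0, 0, 2, 1]  # q4..q0 of Q(x) = x^5 + 2x + 1


def times_alpha(a, q=Q, p=P):
    """Multiply [a4, a3, a2, a1, a0] by alpha and reduce with Q(alpha) = 0."""
    top = a[0]
    shifted = a[1:] + [0]
    return [(s - qj * top) % p for s, qj in zip(shifted, q)]


def power_table(q=Q, p=P):
    """Map each nonzero element (as a coefficient tuple) to its exponent i."""
    elem = [0] * (len(q) - 1) + [1]  # alpha^0 = 1
    table = {}
    i = 0
    while tuple(elem) not in table:
        table[tuple(elem)] = i
        elem = times_alpha(elem, q, p)
        i += 1
    return table


table = power_table()
print(len(table))               # number of nonzero elements
print(table[(1, 2, 0, 2, 1)])   # exponent of the element 12021
```

Storing the exponent under the coefficient tuple gives the direction of look-up used in the text, where a result array is searched in the table to find its power of $\alpha$.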
We use Algorithm 1 to calculate the multiplication $\alpha^6 \cdot \alpha^{20}$ in $GF(3^5)$, where $a = \alpha^6$ and $b = \alpha^{20}$. Initially, the array $c$ is initialized with $c = (0, 0, 0, 0, 0)$. Then, the input values for $a$ and $b$ are $a = (0, 0, 1, 2, 0)$ and $b = (1, 2, 0, 2, 1)$. The coefficients of the primitive polynomial are $q = (0, 0, 0, 2, 1)$. In Tables 4, 5, 6, 7, and 8, we detail the iterations of Algorithm 1 to calculate $\alpha^6 \cdot \alpha^{20}$.
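Table 4

Iteration i=0 for the external cycle.

a4=0, a3=0, a2=1, a1=2, a0=0
b4=1, b3=2, b2=0, b1=2, b0=1
q4=0, q3=0, q2=0, q1=2, q0=1
c4=0, c3=0, c2=0, c1=0, c0=0
i=0
Cycle IF-A
j=4 | c4 ← c4 + b0·a4 = 0 + (1·0) = 0
j=3 | c3 ← c3 + b0·a3 = 0 + (1·0) = 0
j=2 | c2 ← c2 + b0·a2 = 0 + (1·1) = 1
j=1 | c1 ← c1 + b0·a1 = 0 + (1·2) = 2
j=0 | c0 ← c0 + b0·a0 = 0 + (1·0) = 0
Cycle IF-B
j=4 | a4 ← a3 − q4·a4 = 0 − (0·0) = 0
j=3 | a3 ← a2 − q3·a4 = 1 − (0·0) = 1
j=2 | a2 ← a1 − q2·a4 = 2 − (0·0) = 2
j=1 | a1 ← a0 − q1·a4 = 0 − (2·0) = 0
j=0 | a0 ← a_{-1} − q0·a4 = 0 − (1·0) = 0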
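Table 5

Iteration i=1 for the external cycle.

a4=0, a3=1, a2=2, a1=0, a0=0
b4=1, b3=2, b2=0, b1=2, b0=1
q4=0, q3=0, q2=0, q1=2, q0=1
c4=0, c3=0, c2=1, c1=2, c0=0
i=1
Cycle IF-A
j=4 | c4 ← c4 + b1·a4 = 0 + (2·0) = 0
j=3 | c3 ← c3 + b1·a3 = 0 + (2·1) = 2
j=2 | c2 ← c2 + b1·a2 = 1 + (2·2) = 2
j=1 | c1 ← c1 + b1·a1 = 2 + (2·0) = 2
j=0 | c0 ← c0 + b1·a0 = 0 + (2·0) = 0
Cycle IF-B
j=4 | a4 ← a3 − q4·a4 = 1 − (0·0) = 1
j=3 | a3 ← a2 − q3·a4 = 2 − (0·0) = 2
j=2 | a2 ← a1 − q2·a4 = 0 − (0·0) = 0
j=1 | a1 ← a0 − q1·a4 = 0 − (2·0) = 0
j=0 | a0 ← a_{-1} − q0·a4 = 0 − (1·0) = 0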
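Table 6

Iteration i=2 for the external cycle.

a4=1, a3=2, a2=0, a1=0, a0=0
b4=1, b3=2, b2=0, b1=2, b0=1
q4=0, q3=0, q2=0, q1=2, q0=1
c4=0, c3=2, c2=2, c1=2, c0=0
i=2
Cycle IF-A
j=4 | c4 ← c4 + b2·a4 = 0 + (0·1) = 0
j=3 | c3 ← c3 + b2·a3 = 2 + (0·2) = 2
j=2 | c2 ← c2 + b2·a2 = 2 + (0·0) = 2
j=1 | c1 ← c1 + b2·a1 = 2 + (0·0) = 2
j=0 | c0 ← c0 + b2·a0 = 0 + (0·0) = 0
Cycle IF-B
j=4 | a4 ← a3 − q4·a4 = 2 − (0·1) = 2
j=3 | a3 ← a2 − q3·a4 = 0 − (0·1) = 0
j=2 | a2 ← a1 − q2·a4 = 0 − (0·1) = 0
j=1 | a1 ← a0 − q1·a4 = 0 − (2·1) = 1
j=0 | a0 ← a_{-1} − q0·a4 = 0 − (1·1) = 2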
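Table 7

Iteration i=3 for the external cycle.

a4=2, a3=0, a2=0, a1=1, a0=2
b4=1, b3=2, b2=0, b1=2, b0=1
q4=0, q3=0, q2=0, q1=2, q0=1
c4=0, c3=2, c2=2, c1=2, c0=0
i=3
Cycle IF-A
j=4 | c4 ← c4 + b3·a4 = 0 + (2·2) = 1
j=3 | c3 ← c3 + b3·a3 = 2 + (2·0) = 2
j=2 | c2 ← c2 + b3·a2 = 2 + (2·0) = 2
j=1 | c1 ← c1 + b3·a1 = 2 + (2·1) = 1
j=0 | c0 ← c0 + b3·a0 = 0 + (2·2) = 1
Cycle IF-B
j=4 | a4 ← a3 − q4·a4 = 0 − (0·2) = 0
j=3 | a3 ← a2 − q3·a4 = 0 − (0·2) = 0
j=2 | a2 ← a1 − q2·a4 = 1 − (0·2) = 1
j=1 | a1 ← a0 − q1·a4 = 2 − (2·2) = 1
j=0 | a0 ← a_{-1} − q0·a4 = 0 − (1·2) = 1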
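Table 8

Iteration i=4 for the external cycle.

a4=0, a3=0, a2=1, a1=1, a0=1
b4=1, b3=2, b2=0, b1=2, b0=1
q4=0, q3=0, q2=0, q1=2, q0=1
c4=1, c3=2, c2=2, c1=1, c0=1
i=4
Cycle IF-A
j=4 | c4 ← c4 + b4·a4 = 1 + (1·0) = 1
j=3 | c3 ← c3 + b4·a3 = 2 + (1·0) = 2
j=2 | c2 ← c2 + b4·a2 = 2 + (1·1) = 0
j=1 | c1 ← c1 + b4·a1 = 1 + (1·1) = 2
j=0 | c0 ← c0 + b4·a0 = 1 + (1·1) = 2
Cycle IF-B
j=4 | a4 ← a3 − q4·a4 = 0 − (0·0) = 0
j=3 | a3 ← a2 − q3·a4 = 1 − (0·0) = 1
j=2 | a2 ← a1 − q2·a4 = 1 − (0·0) = 1
j=1 | a1 ← a0 − q1·a4 = 1 − (2·0) = 1
j=0 | a0 ← a_{-1} − q0·a4 = 0 − (1·0) = 0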
Finally, $c = (1, 2, 0, 2, 2)$, which corresponds to $\alpha^{26}$ in Table 3; that is, $\alpha^6 \cdot \alpha^{20} = \alpha^{26}$.
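The five iterations tabulated above can be replayed with a short trace. This is a sketch that assumes the update rules visible in the tables (cycle IF-A: $c_j \leftarrow c_j + b_i a_j$; cycle IF-B: $a_j \leftarrow a_{j-1} - q_j a_4$ with $a_{-1} = 0$); the generator name `trace_multiplication` is hypothetical.

```python
def trace_multiplication(a, b, q, p):
    """Yield (i, a_before, c_before, c_after, a_after) per external iteration.

    Coefficient lists are ordered highest power first: [a4, a3, a2, a1, a0].
    """
    n = len(a)
    a, c = list(a), [0] * n
    for i in range(n):
        a_before, c_before = tuple(a), tuple(c)
        b_i = b[n - 1 - i]
        # Cycle IF-A uses the current a and the coefficient b_i
        c = [(cj + b_i * aj) % p for cj, aj in zip(c, a)]
        # Cycle IF-B uses the old leading coefficient for every position
        top = a[0]
        a = [(low - qj * top) % p for low, qj in zip(a[1:] + [0], q)]
        yield i, a_before, c_before, tuple(c), tuple(a)


rows = list(trace_multiplication([0, 0, 1, 2, 0], [1, 2, 0, 2, 1],
                                 [0, 0, 0, 2, 1], 3))
for i, a_before, c_before, c_after, a_after in rows:
    print(f"i={i}  a={a_before}  c={c_before}  ->  c={c_after}  a={a_after}")
```

The values of `a_before` and `c_before` printed for each `i` are the header rows of the corresponding table, and the last `c_after` is the array that is looked up in Table 3.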

Proposed DNA-based model for arithmetic over $GF(p)$ and $GF(p^n)$

We have developed a simple DNA-based model to perform addition and multiplication over the fields $GF(p)$ and $GF(p^n)$, $n \geq 2$. It is based on the differential migration of double-stranded DNA (dsDNA) fragments of different sizes in gel electrophoresis, which is a standard technique for separating dsDNA fragments, previously obtained by PCR, according to their size. Here the size of a dsDNA fragment corresponds to the number of base pairs contained in the fragment. Each element $r \in GF(p)$ is represented by a dsDNA fragment whose size $S_r$ is unique to the element $r$. Therefore, only $p$ dsDNA fragments are necessary to represent all the elements of $GF(p)$. Table 9 shows this representation using dsDNA of different sizes, where the smallest size is $S_0$ and the largest size is $S_{p-1}$.
Table 9

DNA representation for elements $r \in GF(p)$.

r ∈ GF(p) | 0 | 1 | 2 | ⋯ | p−1
Size of DNA fragment [bp] | S_0 | S_1 | S_2 | ⋯ | S_{p-1}
Gel electrophoresis is used to visualize the DNA molecular representation of a nonzero element $\alpha^i \in GF(p^n)$ through the coefficients of its polynomial expression, which is given by Eq. (3). The dsDNA fragments for each coefficient are loaded into different slots of the agarose gel matrix. The slots and their respective columns are numbered $n-1, \ldots, 1, 0$, according to the order of powers from left to right. Then, an electric field is applied to make the molecules migrate through the gel and separate by size. Figure 1 shows the dsDNA fragment representation of one such element: each slot is loaded with chains whose size represents the coefficient of the corresponding power. Thus, our model defines a unique DNA-based representation for each element of $GF(p^n)$.
Figure 1

dsDNA fragments representation of performed by agarose gel electrophoresis.

For $p = 3$ and the extension field $GF(3^5)$, the dsDNA fragments $S_0$, $S_1$ and $S_2$ are required to represent the elements of $GF(3^5)$. Figure 2 shows the DNA-based representation of two elements of this field. Such representations are obtained at the end of the gel electrophoresis process.
Figure 2

DNA-based representation for .
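To make the representation concrete, the sketch below maps an element of $GF(3^5)$ to the fragment sizes loaded into its five slots and reads a gel back into coefficients. The base-pair values are the three product lengths listed in Table 10, but their assignment to $S_0$, $S_1$ and $S_2$ is an assumption made here for illustration (smallest fragment for 0), and the names `SIZES_BP`, `lanes_for` and `read_lanes` are hypothetical.

```python
# Assumed assignment of fragment sizes (bp) to the elements of GF(3).
SIZES_BP = {0: 77, 1: 110, 2: 639}
VALUE_OF = {size: value for value, size in SIZES_BP.items()}


def lanes_for(element):
    """Fragment size to load in each slot, for [a4, a3, a2, a1, a0]."""
    return [SIZES_BP[coefficient] for coefficient in element]


def read_lanes(lanes):
    """Recover the coefficient array from the band size seen in each slot."""
    return [VALUE_OF[size] for size in lanes]


alpha_26 = [1, 2, 0, 2, 2]
print(lanes_for(alpha_26))
print(read_lanes(lanes_for(alpha_26)) == alpha_26)
```

Because each slot carries one coefficient, the number of slots equals $n$ and the number of distinct sizes equals $p$.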

To calculate addition and multiplication in $GF(p^n)$, first it is necessary to establish a key configuration to interpret addition and multiplication in $GF(p)$, according to Tables 1(a) and (b), respectively. In the key configuration, the band pattern in any column (depicting dsDNA fragments on a gel matrix) represents the addition or multiplication of two elements of the field $GF(p)$. This is illustrated in the following example. The DNA-based implementation of addition in $GF(3)$ requires dsDNA fragments of 3 different sizes. For example, to carry out the addition $r + s$ of two elements $r, s \in GF(3)$, the dsDNA fragments $S_r$ and $S_s$ are loaded into a slot in the agarose gel matrix. Then, the electrophoresis is executed, and the resulting band pattern is interpreted according to the key configuration for addition over $GF(3)$, shown in Figure 3: a pattern that matches one column of the key is read as the value assigned to that column, which is the result of $r + s$ or, equivalently, $s + r$. In a similar way, for Figure 4, to calculate $r \cdot s$, the dsDNA fragments $S_r$ and $S_s$ are loaded into the gel matrix and the electrophoresis is run. The resulting configuration, identical to one column of the key configuration for multiplication, is read as the result of $r \cdot s$ or $s \cdot r$.
Figure 3

Key configuration for addition over $GF(3)$, used to interpret band patterns in gel electrophoresis.

Figure 4

Key configuration for multiplication over $GF(3)$, used to interpret band patterns in gel electrophoresis.
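A key configuration can be thought of as a look-up from a band pattern to a field element. The sketch below is one way to model it, under two assumptions that are not stated in the text: a band pattern is the set of fragment sizes visible in a column (two fragments of equal size give a single band), and the sizes are assigned as 77, 110 and 639 bp for the elements 0, 1 and 2. The names `key_configuration` and `read_column` are hypothetical.

```python
from itertools import combinations_with_replacement

SIZES_BP = {0: 77, 1: 110, 2: 639}  # assumed assignment, for illustration


def key_configuration(op, p=3, sizes=SIZES_BP):
    """Map each band pattern to op(r, s) mod p for unordered pairs {r, s}."""
    key = {}
    for r, s in combinations_with_replacement(range(p), 2):
        pattern = frozenset({sizes[r], sizes[s]})
        key[pattern] = op(r, s) % p
    return key


ADD_KEY = key_configuration(lambda r, s: r + s)
MUL_KEY = key_configuration(lambda r, s: r * s)


def read_column(bands, key):
    """Interpret the bands observed in one column with the given key."""
    return key[frozenset(bands)]


print(read_column([110, 639], ADD_KEY))  # bands for 1 and 2 -> 1 + 2 in GF(3)
print(read_column([639], MUL_KEY))       # single band for 2 and 2 -> 2 * 2
```

Since both operations are commutative, the unordered pattern is enough to determine the result, which is why the same column serves for $r + s$ and $s + r$.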

Using these key configurations, we can carry out addition and multiplication of any two elements of $GF(p^n)$ by gel electrophoresis. Addition is calculated by adding the coefficients of corresponding powers of each operand, as explained in formula (4) of Section 3. Each one of these additions is independent of the others, since there is no carry or borrow. This is best explained by an example of addition over $GF(3^5)$. The two operands are added coefficient by coefficient, and Table 3, the previously constructed look-up table of $GF(3^5)$, is then used to find the representation of the sum as a power of a root $\alpha$ of the primitive polynomial $Q(x)$. For the implementation of addition by gel electrophoresis, dsDNA fragments representing corresponding coefficients of the two operands are loaded into five slots in the agarose matrix. The slots (columns) are numbered from left to right. The dsDNA fragments representing the coefficients of $\alpha^4$ in both operands are loaded in slot 4, and those representing the coefficients of $\alpha^3$ are loaded into slot 3. This procedure is repeated for the rest of the coefficients. Next, the electrophoresis is run. Figure 5 shows the resulting band pattern. The configurations in columns 4 and 3 match columns of the key configuration in Figure 3; thus the band patterns in these columns are read as 1 and 2, respectively. The columns are independent of each other, because there is no carry or borrow; therefore they are interpreted separately. The complete band pattern is interpreted as an array of five coefficients, which is looked up in Table 3 to identify the resulting element.
Figure 5

Gel electrophoresis implementation of in .

Below, we explain the DNA-based implementation for the multiplication of two elements of $GF(3^5)$. We use Algorithm 1 to calculate the multiplication $\alpha^6 \cdot \alpha^{20}$ in $GF(3^5)$, where $a = \alpha^6$ and $b = \alpha^{20}$. Initially, the agarose gel matrix is empty. Then, the array $c$ is initialized according to step 1 of Algorithm 1, with $c = (0, 0, 0, 0, 0)$. Next, dsDNA fragments representing corresponding coefficients of both elements are loaded into five slots of the agarose gel matrix. Then, the electrophoresis is run according to Algorithm 1, where the input values and their representation as dsDNA fragments for $a$ and $b$ are $a = (0, 0, 1, 2, 0) \leftrightarrow (S_0, S_0, S_1, S_2, S_0)$ and $b = (1, 2, 0, 2, 1) \leftrightarrow (S_1, S_2, S_0, S_2, S_1)$, respectively. The coefficients of the primitive polynomial and their representation as dsDNA fragments are $q = (0, 0, 0, 2, 1) \leftrightarrow (S_0, S_0, S_0, S_2, S_1)$. Tables 4 and 8 show the iterations of the algorithm for cycles IF-A and IF-B for $i = 0$ and $i = 4$, respectively. Then, the array $c = (1, 2, 0, 2, 2)$ is searched in the rows of Table 3 for $GF(3^5)$, and it is determined that $\alpha^6 \cdot \alpha^{20} = \alpha^{26}$. For Table 4, the theoretical scheme of gel electrophoresis for cycles IF-A (steps 3 to 5) and IF-B (steps 6 to 8) in Algorithm 1 is shown in Figure 6. We use three different sizes for the dsDNA fragments, in order to implement the addition and the multiplication in steps 4 and 7. Furthermore, as explained in Section 3, the cycles IF-A and IF-B are independent of each other. This allows them to be executed in parallel using gel electrophoresis. Figure 8 in Section 6 shows this parallelism empirically in the lab.
Figure 6

Interpretation of gel electrophoresis: cycles IF-A and IF-B for and .

Figure 8

Practical implementation for Figure 7 by DNA gel electrophoresis.

For internal cycles IF-A and IF-B, the per-column configuration in the lower half of the gel matrix ($b_i \cdot a_j$ or $q_j \cdot a_4$) is interpreted according to the key configuration for multiplication shown in Figure 4. The obtained result of the multiplication is combined with the operand ($c_j$ or $a_{j-1}$) on the upper half of the gel matrix, which is in the same column. The addition is calculated according to Table 2a. The first iteration is shown in Figure 6. At the last iteration, $i = 4$, the final configuration is interpreted as the array $c = (1, 2, 0, 2, 2)$ using the look-up table described in Table 3 for $GF(3^5)$, concluding that $\alpha^{26}$ is the result of $\alpha^6 \cdot \alpha^{20}$.
Therefore, our DNA-based model has a flexible implementation in the laboratory, because we only need to change (i) the number $p$, which defines the number of dsDNA fragments with different sizes that are previously obtained by PCR, and (ii) the number $n$, which defines the number of slots in the gel matrix used to execute an electrophoresis. This allows us to calculate additions and multiplications in different fields $GF(p)$ and $GF(p^n)$. For example, changing from one field to another only requires changing the values of $p$ and $n$ accordingly. For each field we must, of course, build the respective tables and DNA-based representations for the addition and multiplication.
Finally, we analyze the efficiency of our model to calculate the addition and the multiplication in a field $GF(p^n)$. For this, we assume that every electrophoresis is executed in constant time, for both the addition and the multiplication. Thus, the addition has an execution time of order $O(1)$, since the addition is calculated in only one electrophoresis, as shown in Examples 3 and 4, Figures 3 and 5. The multiplication has an execution time of order $O(n)$, since the internal cycles (Cycle IF-A and Cycle IF-B) of Algorithm 1 are calculated in parallel, and in only one electrophoresis for each iteration of the external cycle (steps 2 to 9), as shown in Example 5 and Figure 6.
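The resource and time claims above can be summarized in a small cost model. This is only a sketch under the stated assumption that every electrophoresis takes the same constant time; the function name `model_cost` and the dictionary keys are hypothetical.

```python
def model_cost(p, n):
    """Laboratory resources and electrophoresis runs for arithmetic in GF(p^n).

    Assumes one run per addition and one run per external iteration of the
    multiplication algorithm, with cycles IF-A and IF-B read from the same gel.
    """
    return {
        "fragment_sizes": p,        # distinct dsDNA sizes prepared by PCR
        "slots": n,                 # slots needed to hold one element
        "runs_addition": 1,         # O(1)
        "runs_multiplication": n,   # O(n)
    }


print(model_cost(3, 5))
```

Changing the field only changes the two arguments, which mirrors the flexibility argument made in the text.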

Physical implementation of the proposed DNA-based model

Before performing the agarose gel electrophoresis experiments, a dsDNA template is required from which to generate the different dsDNA fragments of known size by the PCR technique. For that purpose, the bacterial strain Sulfobacillus sp. CBAR-13, whose genomic DNA sequence was already known, was grown by microbial culture in the laboratory. Detailed information about the culture methodology is described below. Then, the genomic DNA was purified.

Cells growth and DNA template preparation

Bacterial strain CBAR-13 of Sulfobacillus sp. was grown in a shaking incubator at 59 °C in Single Strength medium [0.2 g/liter (NH4)2SO4, 0.4 g/liter MgSO4·7H2O, and 0.1 g/liter K2HPO4 (initial pH 1.7)] with 50 mM ferrous sulfate (membrane filtered) and 0.02% yeast extract. At the mid-exponential-growth phase, the bacterial cells were harvested, and the total genomic DNA was extracted with High Pure PCR Template Preparation Kit (Roche Product No. 11796828001) following the protocol prescribed by the manufacturer and then used for Polymerase chain reaction (PCR). The DNA fragments were obtained by PCR using a set of DNA primers designed specifically and the genomic DNA of CBAR-13 as a PCR template. The details are explained below.

Primers design

Primers were designed using Primer-BLAST software (Ye et al., 2012). The genomic sequence of S. sp. CBAR-13 (accession numbers: NZ_LGRO01000001 and NZ_LGRO01000002) was used as the template for primer design. Table 10 shows all primers designed, synthesized and used in this study.
Table 10

The three pairs of PCR primers used in this study and the expected size of each PCR product. Fw and Rv are forward and reverse primers, respectively.

Primer | Sequence 5'→3' | Product length (bp)
1-Fw | GACAGACCTGCTCGCTTCTT | 639
1-Rv | TGGTAAACGCGGGCAACTTA |
4-Fw | TACTCCATCCGCCAGTCAGA | 110
4-Rv | GTTGACGTGCTGTGACAACC |
5-Fw | GTTGTCACAGCACGTCAACC | 77
5-Rv | AAGTACAAGAGCGCCAACGA |

PCR protocol for generation of dsDNA fragments

PCR amplification was carried out in a 50 μl reaction volume containing 50–100 ng of template DNA, primers (1 μM each Fw and Rv), dNTPs (10 μM each), MgCl2 (2 mM), 5X Green GoTaq® Flexi Buffer (1X final concentration) and 1.25 U GoTaq® DNA Polymerase (Promega catalog M7801). The conditions for the PCR reactions were: 98 °C for 3 min, followed by 30 cycles of denaturation at 95 °C for 30 s, annealing at 60 °C for 45 s, extension at 72 °C for 30 s, and a final extension at 72 °C for 5 min.

Agarose gel electrophoresis

The products of the PCR reactions were separated as follows: 4 μl of each reaction was resolved by agarose gel electrophoresis at 90 V for 1.5 h on 1% or 3% agarose in Tris-acetate-EDTA buffer (40 mM Tris, 20 mM acetic acid, and 1 mM EDTA) with 3 μl GelRed® 10000X (Biotium catalog 41002). The agarose gels were visualized with a transilluminator and then documented and confirmed. The same electrophoretic procedure was applied to perform the arithmetic calculations with the obtained DNA fragments.

Experimental results

The dsDNA fragments of specific sizes were generated for the experimental development of the proposed model. Figure 7 shows the size and quality of the generated fragments $S_0$, $S_1$ and $S_2$, which represent the elements of the field $GF(3)$, verified by electrophoresis of the PCR products.
Figure 7

Practical implementation for Table 8 by DNA gel electrophoresis. Mk = Molecular weight marker.

To test the validity of the proposed model, we performed the calculation described in Section 4 using the fragments generated previously. Figure 8 shows the actual implementation by gel electrophoresis of the multiplication described in Figure 6 of Section 4, which considers the iteration $i = 0$, where $j = 4, \ldots, 0$, for internal cycles IF-A and IF-B of Algorithm 1. Once the results of $c_j$ and $a_j$ for $i = 0$ (cycles IF-A and IF-B) are interpreted and obtained, $c$ and $a$ are replaced for the calculation of $i = 1$, and so on. The calculation of the multiplication of $\alpha^6$ and $\alpha^{20}$ in $GF(3^5)$ ends when all iterations have been completed.

FPGA simulation of the proposed model

In this section, we present the simulation of our proposed model using Field Programmable Gate Array (FPGA) technology. The simulation consists of the design and testing of arithmetic circuits optimized for $GF(p^n)$, achieving shorter times than sequential computers would take. For this we use the Xilinx ZYNQ7000 FPGA, which is incorporated into a TE0729-02 SoM from Trenz Electronic GmbH. For the case study of the field $GF(3^5)$, the operations (base 3) were designed as a virtual layer on the components of the architecture of the ZYNQ7000 FPGA (base 2). Thus, virtual minimum logical units of 3 states are considered to establish a correspondence between the virtual layer and the physical layer. Figure 9 shows the logical mapping using an FPGA for the addition and multiplication of $GF(3)$, according to Tables 2a and b with $p = 3$. These operations are used to develop addition and multiplication of elements of $GF(3^5)$, as explained in Sections 3 and 4.
Figure 9

The logical mapping for the addition and multiplication of $GF(3)$ using an FPGA.

On the other hand, Figure 10 shows the simulation using an FPGA for $\alpha^6 \cdot \alpha^{20}$, which was described in Example 1. The multiplication is done in 5 iterations. These iterations appear in red in the row corresponding to the result (R). It should be noted that in each operation performed with the coefficients of $a$ and $b$, the logical mapping described in Figure 9 is used. Although each of these operations is performed in one clock cycle, there is an additional computational cost of converting non-binary coefficients in $GF(3)$ to their respective binary representation in order to operate with the FPGA.
Figure 10

Simulation using a FPGA for .
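The conversion cost mentioned above comes from holding base-3 digits on base-2 hardware. The sketch below illustrates one possible encoding, which is an assumption and not the mapping of Figure 9: each $GF(3)$ coefficient is stored as a 2-bit code (00, 01, 10, with 11 unused) and the field operations become small look-up tables over those codes. All names (`build_lut`, `pack`, `unpack`) are hypothetical.

```python
def build_lut(op, p=3):
    """Truth table over valid 2-bit codes; the code 0b11 is left unused."""
    return {(r, s): op(r, s) % p for r in range(p) for s in range(p)}


ADD_LUT = build_lut(lambda r, s: r + s)
MUL_LUT = build_lut(lambda r, s: r * s)


def pack(element):
    """Pack [a4, ..., a0] into one integer word, 2 bits per coefficient."""
    word = 0
    for coefficient in element:
        word = (word << 2) | coefficient
    return word


def unpack(word, n):
    """Inverse of pack: recover n coefficients, highest power first."""
    return [(word >> (2 * k)) & 0b11 for k in reversed(range(n))]


word = pack([1, 2, 0, 2, 2])
print(bin(word), unpack(word, 5))
```

With this layout an element of $GF(3^5)$ occupies 10 bits, and packing and unpacking are the extra steps that a purely binary field would not need.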


Discussion

This paper is the first that introduces a new DNA model in the area of molecular computing, designed to perform arithmetic over Galois fields, based on the differential migration of dsDNA fragments of different sizes. The proposed model presents several advantages over other models that have been widely studied and previously published, such as the Tile Assembling model (TAM) (Winfree et al., 1998) and the Sticker model (Roweis et al., 1998). All the arithmetic calculations covered by the TAM have been performed only in a theoretical way (e. g. Brun, 2007; Li et al., 2013a, 2013b; Li and Xiao, 2014; Li and Xiao, 2016; Li et al., 2016; Li, 2018; Li and Zhang, 2018). Li et al. (Li et al., 2013b) designed a tile assembly system that, in theory, could compute a square over , based on the condition that all DNA operations are perfect. But it is widely known that this is not the case but quite the opposite. In the same way, Jonoska et al. (Jonoska et al., 2011) assumed, in their flexible-TAM study, that the assembly process happens in ideal conditions. In the more recent studies (Li et al., 2016; Li, 2018) the authors only reference the article (Rothemund, 2012) to justify the technical feasibility of their methodology for DNA computation of modular-multiplication and modular-square over . However (Rothemund, 2012), only uses DNA self-assembly for fabrication of nanostructures (DNA origami) at laboratory scale. There is still not empirical evidence of the application of this molecular technique in the calculation of the mentioned problem. This may be due to the recognized complexity associated to the implementation of the DNA self-assembly technique in the laboratory (Rothemund, 2006; Jonoska et al., 2011). In (Woods et al., 2019) the authors present a reprogrammable DNA self-assembly system based on tiles, which can copy, sort, recognize palindromes, find multiples of 3, and other functions that are detailed in that article. 
However, this reprogrammable DNA self-assembly system is limited to the binary case, since it uses iterated Boolean circuits. This hinders its application to calculations over a $GF(p^n)$ with $p > 2$.

With respect to the molecular process underlying the TAM, Rothemund (2006) and Jonoska et al. (2011) described typical experimental deviations that can occur during practical laboratory work. First, the known difficulty of determining the stoichiometry of complex test tubes containing different types of molecules could cause annealing or thermodynamic problems and, consequently, hybridization mismatches and low reaction performance. Second, the low proportion of well-formed structures resulting from self-assembly (only 53%), reported by Rothemund (2006), suggests a significant error rate in the tile assembly process and, consequently, in the molecular calculations. Third, large dislocations at unbridged seams, where the two halves of an assembled structure separate completely, are also common. It is highly probable that all these technical issues hinder accurate and successful computation by the TAM. In contrast, our model is based on conventional PCR and agarose gel electrophoresis, both highly stable, reproducible and relatively low-cost molecular techniques.

Another consideration for the technical application of the TAM to Galois field calculations is that the technical work performed for one calculation (for example, a multiplication of two field elements) cannot be recycled for a different calculation, which would have to be performed entirely from the start (Rothemund et al., 2004; Rothemund, 2012). Furthermore, it has been reported that the time required for a calculation by the TAM increases proportionally with $n$ in a $GF(2^n)$, because the complexity of the DNA assembly increases with $n$ (Brun, 2008).
If the design of the sequences needed for structure formation takes one week, the synthesis of the sequences takes another week, and the mixing and annealing reactions take 2 h (Rothemund, 2006), then the TAM is an expensive technique for algorithmic calculation in both time and money. Conversely, most of the technical work of our model can be reused for different arithmetic calculations, which makes it very attractive in terms of cost and time. For a new arithmetic calculation, only the electrophoresis must be repeated, while the DNA fragments can be reused or, at most, re-amplified by PCR (90 min) with the previously designed primers.

The application of the TAM to arithmetic problems has been studied for more than ten years. However, there are still no records of its successful laboratory implementation for arithmetic over Galois fields. This supports our hypothesis that the TAM works well at a theoretical level but that there is high uncertainty about the feasibility of its practical implementation and application in the short or medium term.

On the other hand, the only methodological complexity of our DNA-based model is that the number of different dsDNA fragment sizes for the input depends on the parameter $p$, and the number of slots in the gel matrix depends on the parameter $n$, for any $GF(p^n)$. For example, for addition and multiplication over $GF(3^5)$ we need only dsDNA fragments of three different sizes and a gel matrix with capacity for five slots. For a much larger field, such as $GF(2^{163})$, we require only dsDNA fragments of two different sizes and a gel matrix with capacity for 163 slots. This also makes our model simpler than the sticker model, which uses the hybridization of complementary DNA fragments to represent bit strings and do binary arithmetic, in theory (Zimmermann, 2002).
However, implementing such calculations requires careful adjustment of the hybridization conditions to ensure reproducibility and the correct assembly of all fragments. Consequently, any modification of the sticker sequences, or any increase in the number of stickers required, calls for the hybridization conditions to be reset. Li et al. (2013a) present a sticker-based algorithm for parallel reduction in a field $GF(2^n)$ and use the field $GF(2^{163})$ as a theoretical example. To represent all the elements of $GF(2^{163})$, it is necessary to design and handle 163 different stickers, one to represent a bit 1 in each position of the memory strand. Moreover, it would not be possible to manage the melting and annealing temperatures effectively enough to bind and unbind up to 163 different stickers selectively. This makes the biological operations merge, separate, set and clear difficult to implement in the laboratory for addition and multiplication over $GF(2^{163})$.

Parallel computing has been widely studied and implemented using different approaches in recent years, and the ExaFLOP threshold ($10^{18}$ floating-point operations per second) is expected to be reached by the year 2020. Thus, parallel computers could replace current computers, which mostly have a sequential architecture (Li, 2018; Wright, 2019). However, silicon and molecular computers continue to use binary logic, which forces non-binary operations to be translated into binary atomic operations using Boolean algebra (Zhang et al., 2019; Eshra et al., 2019). In particular, calculating addition and multiplication in a non-binary field $GF(p^n)$, $p > 2$, carries an additional computational cost associated with implementing these operations through atomic operations in Boolean algebra.
In this context, our model can also operate in parallel, and its flexible laboratory implementation avoids the translation to Boolean operations when implementing addition and multiplication over a non-binary field, which gives it a significant advantage over current silicon and molecular systems. Even in the simulation with the FPGA ZYNQ7000 for our case study, in which we programmed the logical mapping using circuits to simulate addition and multiplication over any Galois field, it was necessary to translate the non-binary operations performed by our model into binary ones. Therefore, our model allows arithmetic operations over binary and non-binary fields without conversion cost; it is easy to implement, economical, less prone to error, and more tolerant to changes without altering the result.

We also analyzed the efficiency of our model for addition and multiplication in a field $GF(p^n)$ with $p \geq 2$. For this, we assume that each electrophoresis runs in constant time for both addition and multiplication. Under this assumption, addition has an execution time of order $O(1)$, while multiplication has an execution time of order $O(n)$. We use multiplication as the worst case to compare the efficiency of our model with that of TAM-based models, since multiplication is more expensive than addition in computational terms (memory, processor and time). The authors of (Li, 2018; Li and Xiao, 2016; Li et al., 2016) state that the execution time for multiplication over a field $GF(2^n)$ is $O(n)$ using the TAM system, which equals the execution time of our model. However, the TAM requires 7746 different tiles to do the calculations and, as explained above, this makes its laboratory implementation highly complex. Moreover, as noted above, the TAM can only be used in the binary case, that is, for $p = 2$.
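The constant cost of addition can be illustrated in software. The sketch below is not the laboratory procedure. It assumes the standard polynomial-basis representation, in which an element of $GF(p^n)$ is a list of $n$ coefficients in $GF(p)$, and the function name `gf_add` is hypothetical. Each coefficient position is processed independently of the others, which is consistent with all positions being handled in a single parallel step.

```python
def gf_add(a, b, p):
    """Add two elements of GF(p^n) given as coefficient lists [x_0, ..., x_{n-1}].

    Assumption: polynomial-basis representation with coefficients in GF(p).
    Every position is independent of the others, so there is no carry.
    """
    if len(a) != len(b):
        raise ValueError("both elements must have n coefficients")
    return [(x + y) % p for x, y in zip(a, b)]


# Example in GF(3^3): (1 + 2x) + (2 + 2x + x^2) = x + x^2
print(gf_add([1, 2, 0], [2, 2, 1], 3))  # [0, 1, 1]
```

Because no position depends on another, the work per position does not grow with $n$ when all positions are processed at once. Multiplication, by contrast, needs a loop over the $n$ coefficients of one operand.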
Finally, with plausible engineering work, our model can be fully automated, as discussed in more detail below. Figure 11 shows a system diagram of a possible DNA-based computer for our model that performs addition and multiplication over $GF(p^n)$, for $p \geq 2$ and $n \geq 1$, autonomously.
Figure 11

System diagram for a DNA-based molecular computer for the proposed model.

The system consists of Robots, Interpreters, Electrophoresis boxes and a Display. A Robot loads the previously generated dsDNA fragments into the respective slots of the gel matrix for electrophoresis, which avoids human error in the pipetting and loading process. An Interpreter contains the look-up table of all the elements of the field $GF(p^n)$ and the key configurations for addition and multiplication. This device interprets the images obtained by electrophoresis and instructs the next Robot to load the dsDNA fragments for the next iteration or Electrophoresis box. The Electrophoresis boxes represent the DNA electrophoresis processes for each addition or multiplication, as explained and shown in Sections 4, 5 and 6. At the end of the system, a Display receives information from the last Interpreter and shows the final result of the completed arithmetic calculation.

Many of the possible improvements mentioned for the proposed model are already in use. The simplicity of the technique makes it feasible to adapt it to devices currently available on the market, such as automated and miniaturized systems. Microfabricated capillary array electrophoresis is a microfluidic system that separates molecules, in this case DNA fragments of different sizes, and can be used for DNA sequencing (Paegel et al., 2002). DNA sequencing technologies are relevant to our model because they bring together most of the advances towards higher analysis throughput and fragment resolution and lower consumption of reactants and samples. For comparison, a slab gel requires 0.5–1.0 μL of DNA sample and 6–8 h of analysis time, whereas DNA separation in a microfluidic device could require 0.0001–0.0005 μL of DNA sample and 0.1–0.5 h of analysis time (Sinville and Soper, 2007).
Sample handling and liquid distribution is a field in which multiple robotic solutions have been developed, e.g. the QIAsymphony SP/AS instruments from Qiagen. These devices can handle the mixing and loading of samples into the analytical instrument, so the entire wet procedure is free of human intervention. This offers uniform sample loading, saves time and avoids the error-prone process of handling and loading large numbers of samples by hand. Another area where improvements can be incorporated is the reading and interpretation of results through image analysis with simple software, for example (Intarapanich et al., 2015; Abeykoon et al., 2015). Such software can be of great help, as it can quickly and accurately read the image results and translate them into a user-friendly format.
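As a rough illustration of the interpretation step, the following sketch decodes one field element from band sizes that image-analysis software might report. Everything specific here is an assumption made for illustration and not taken from the paper: the function name `decode_lanes`, the reference fragment sizes (100, 200 and 300 bp), the tolerance, and the encoding itself, namely that each slot carries one coefficient and each of the $p$ fragment sizes stands for one value of $GF(p)$.

```python
def decode_lanes(band_sizes_bp, size_to_value, tolerance_bp=20):
    """Map the measured band size in each slot to a coefficient in GF(p).

    band_sizes_bp: measured fragment size per slot, in base pairs
    size_to_value: hypothetical table {reference size in bp: value in GF(p)}
    tolerance_bp:  largest accepted deviation from a reference size
    """
    element = []
    for lane, measured in enumerate(band_sizes_bp):
        # Pick the reference size closest to the measured band.
        nearest = min(size_to_value, key=lambda ref: abs(ref - measured))
        if abs(nearest - measured) > tolerance_bp:
            raise ValueError(f"slot {lane}: band at {measured} bp matches no reference size")
        element.append(size_to_value[nearest])
    return element


# Hypothetical table for p = 3 and a gel with n = 5 slots.
table = {100: 0, 200: 1, 300: 2}
print(decode_lanes([102, 295, 198, 100, 310], table))  # [0, 2, 1, 0, 2]
```

Rejecting bands that fall outside the tolerance, instead of silently rounding them, keeps a misloaded or degraded sample from being read as a valid coefficient.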

Conclusions and future work

This work is the first to introduce a novel DNA model for arithmetic operations over a field $GF(p^n)$, with $p \geq 2$ prime and $n \geq 1$, based on the differential migration of dsDNA fragments of different sizes. It has three major advantages over the TAM and sticker models. First, because of its flexible laboratory implementation, it allows arithmetic operations over binary and non-binary Galois fields without translation to Boolean operations, while finite field arithmetic using the TAM or the sticker model is limited to $p = 2$. Second, our model is less prone to error than other systems: it is based on conventional PCR amplification and electrophoresis, which are highly stable, reproducible and low-cost molecular techniques. Hence, the problems associated with other models, which arise when using more complex molecular techniques such as hybridization and denaturation, are avoided. Third, it is simple to implement and, when fully developed, will use 50 ng/μL per DNA fragment used in the calculations. There is no need to design complex DNA structures, since the only feature of interest is the size of the dsDNA fragments used. Also, to do arithmetic over $GF(p^n)$, only fragments of $p$ different sizes and a gel matrix with capacity for $n$ slots are necessary. This contrasts with the TAM and sticker models, where the design of the DNA strands is of major importance and the concentration of reactants increases greatly with the size of the problem. Furthermore, the flexible laboratory implementation of our model allows arithmetic calculations to be performed in parallel over $GF(p^n)$, without the cost of translating non-binary operations into binary atomic operations using Boolean algebra. It is therefore easy to implement, economical, less prone to error, and more tolerant to changes without altering the result.
In terms of efficiency, our model has execution times of order $O(1)$ and $O(n)$ for addition and multiplication, respectively, over a field $GF(p^n)$ with $p \geq 2$, assuming that each electrophoresis runs in constant time for both operations. This paper provides one of the few pieces of experimental evidence of arithmetic calculations in molecular computing and validates the technical applicability of the proposed model to arithmetic operations over $GF(p^n)$ with $p \geq 2$. Finally, our future work will focus on speeding up the interpretation of the DNA patterns produced in each electrophoresis, and thereby achieving a cheaper and faster laboratory implementation of addition and multiplication over a field $GF(p^n)$ with $p \geq 2$.

Declarations

Author contribution statement

Ivan Jiron, Susana Soto, Sabrina Marin, Mauricio Acosta, Ismael Soto: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.

Funding statement

This work was supported by VRIDT/UCN N° 171/17, VRIDT/UCN N° 084/2018 and Project FONDEF IT17M10012.

Competing interest statement

The authors declare no conflict of interest.

Additional information

Supplementary content related to this article has been published online at https://doi.org/10.1016/j.heliyon.2019.e02901.
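Input:
$\alpha^k = \sum_{i=0}^{n-1} a_i \alpha^i$, $\alpha^l = \sum_{i=0}^{n-1} b_i \alpha^i$, $Q(\alpha) = \alpha^n + \sum_{i=0}^{n-1} q_i \alpha^i$
where $a_i, b_i, q_i \in GF(p)$.
Output:
$C = \alpha^k \alpha^l = \sum_{i=0}^{n-1} c_i \alpha^i$
where $c_i \in GF(p)$.
1. $C \leftarrow 0$
2. for $i = 0$ to $n-1$ do
3.   for $j = n-1$ to $0$ do
4.     $c_j \leftarrow c_j + b_i a_j$
5.   end_for
6.   for $j = n-1$ to $0$ do
7.     $a_j \leftarrow a_{j-1} - q_j a_{n-1}$
8.   end_for
9. end_for
10. Return $C$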
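
The pseudocode above can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation. The function name `gf_multiply` is hypothetical, elements are assumed to be coefficient lists in the polynomial basis, and two interpretive choices are made: $a_{-1}$ is taken as 0 in step 7, and the leading coefficient $a_{n-1}$ is saved before the shift so that every $a_j$ is updated from the same old value.

```python
def gf_multiply(a, b, q, p):
    """Multiply two elements of GF(p^n) given in the polynomial basis.

    a, b: coefficient lists [a_0, ..., a_{n-1}] with entries in GF(p)
    q:    low-order coefficients [q_0, ..., q_{n-1}] of the monic polynomial
          Q(x) = x^n + q_{n-1} x^{n-1} + ... + q_0, assumed irreducible
    p:    prime characteristic
    """
    n = len(q)
    if len(a) != n or len(b) != n:
        raise ValueError("a and b must each have n coefficients")
    a = [x % p for x in a]   # working copy; holds a(x) * x^i reduced mod Q
    c = [0] * n              # step 1: C <- 0
    for i in range(n):                          # step 2
        for j in range(n - 1, -1, -1):          # steps 3-5: C <- C + b_i * a
            c[j] = (c[j] + b[i] * a[j]) % p
        top = a[n - 1]       # assumption: old a_{n-1} is used for every j
        for j in range(n - 1, -1, -1):          # steps 6-8: a <- a * x mod Q
            below = a[j - 1] if j > 0 else 0    # a_{-1} is taken as 0
            a[j] = (below - q[j] * top) % p
    return c                                    # step 10


# Example in GF(3^2) with Q(x) = x^2 + 1, so x * x = -1 = 2 (mod 3).
print(gf_multiply([0, 1], [0, 1], [1, 0], 3))  # [2, 0]
```

The outer loop runs $n$ times, which matches the $O(n)$ order stated for multiplication when each iteration is counted as one constant-time step.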
  20 in total

1.  Molecular computation: RNA solutions to chess problems.

Authors:  D Faulhammer; A R Cukras; R J Lipton; L F Landweber
Journal:  Proc Natl Acad Sci U S A       Date:  2000-02-15       Impact factor: 11.205

2.  A sticker-based model for DNA computation.

Authors:  S Roweis; E Winfree; R Burgoyne; N V Chelyapov; M F Goodman; P W Rothemund; L M Adleman
Journal:  J Comput Biol       Date:  1998       Impact factor: 1.479

3.  High throughput DNA sequencing with a microfabricated 96-lane capillary array electrophoresis bioprocessor.

Authors:  Brian M Paegel; Charles A Emrich; Gary J Wedemayer; James R Scherer; Richard A Mathies
Journal:  Proc Natl Acad Sci U S A       Date:  2002-01-15       Impact factor: 11.205

4.  Solution of a 20-variable 3-SAT problem on a DNA computer.

Authors:  Ravinderjit S Braich; Nickolas Chelyapov; Cliff Johnson; Paul W K Rothemund; Leonard Adleman
Journal:  Science       Date:  2002-03-14       Impact factor: 47.728

5.  Fast parallel molecular algorithms for DNA-based computation: factoring integers.

Authors:  Weng-Long Chang; Minyi Guo; Michael Shan-Hui Ho
Journal:  IEEE Trans Nanobioscience       Date:  2005-06       Impact factor: 2.935

6.  Folding DNA to create nanoscale shapes and patterns.

Authors:  Paul W K Rothemund
Journal:  Nature       Date:  2006-03-16       Impact factor: 49.962

7.  High resolution DNA separations using microchip electrophoresis.

Authors:  Rondedrick Sinville; Steven A Soper
Journal:  J Sep Sci       Date:  2007-07       Impact factor: 3.645

8.  Making DNA add.

Authors:  F Guarnieri; M Fliss; C Bancroft
Journal:  Science       Date:  1996-07-12       Impact factor: 47.728

9.  Molecular computation of solutions to combinatorial problems.

Authors:  L M Adleman
Journal:  Science       Date:  1994-11-11       Impact factor: 47.728

10.  Automatic DNA Diagnosis for 1D Gel Electrophoresis Images using Bio-image Processing Technique.

Authors:  Apichart Intarapanich; Saowaluck Kaewkamnerd; Philip J Shaw; Kittipat Ukosakit; Somvong Tragoonrung; Sissades Tongsima
Journal:  BMC Genomics       Date:  2015-12-09       Impact factor: 3.969
