
Parallelizing Assignment Problem with DNA Strands.

Babak Khorsand¹, Abdorreza Savadi¹, Mahmoud Naghibzadeh¹.

Abstract

BACKGROUND: Many combinatorial optimization problems, which are solvable only in exponential time, are known to be Non-Deterministic Polynomial hard (NP-hard). With the advent of parallel machines, new opportunities have emerged to develop effective solutions for NP-hard problems. However, solving these problems in polynomial time requires massively parallel machines and has not been practical so far.
OBJECTIVES: DNA (deoxyribonucleic acid) computing provides a promising method to solve NP-hard problems in polynomial time. One of the famous NP-hard problems is the assignment problem, which seeks the best assignment of n jobs to n persons so as to maximize the profit or minimize the cost.
MATERIAL AND METHODS: Applying the biomolecular operations of the Adleman-Lipton model, a novel parallel DNA algorithm has been proposed for solving the assignment problem.
RESULTS: The proposed algorithm can solve the problem in O(n^2) time complexity (see Section 4.2), using just O(n^2) initial DNA strands in comparison with the n^n initial sequences used by the other methods.
CONCLUSIONS: In this article, using DNA computing, we proposed a parallel DNA algorithm to solve the assignment problem in linear time.


Keywords: Adelman Lipton model; Assignment; DNA algorithm; DNA computing; Molecular computation

Year: 2020 PMID: 32884959 PMCID: PMC7461708 DOI: 10.30498/IJB.2020.195413.2547

Source DB: PubMed Journal: Iran J Biotechnol ISSN: 1728-3043 Impact factor: 1.671

1. Background

Deoxyribonucleic acid (DNA) is a molecule that carries most of the genetic instructions used in the development, functioning, and reproduction of all known living organisms and many viruses. Just one gram of DNA can hold about 680 petabytes (6.8 × 10^17 bytes), equal to one billion ( 1 ) Compact Discs (CDs), which would take 163,000 centuries to listen to. In terms of parallelism, about 3 × 10^14 molecules can be processed at a time ( 2 ), and this enormous parallelism has persuaded researchers to build models that use DNA strands to solve NP-hard problems ( 3 ). Given the similarities between biological and mathematical operations, and the fact that DNA is stable and predictable in its reactions, this macromolecule can be used to encode information for mathematical systems. In 1994, Adleman ( 4 ) designed the first model of molecular computation by solving a seven-node instance of the Hamiltonian path problem, demonstrating the enormous parallel power of DNA computation. In 1995, Lipton ( 5 ) solved the satisfiability problem using Adleman's method and designed a new method, called the Adleman-Lipton model, which consists of three steps: a) Producing specific sequences and putting them in a test tube. b) Performing a number of operations on the DNA sequences inside the tube and removing unwanted sequences, which certainly cannot be the final answer to the problem. c) Reading and reporting the DNA sequences, if any exist, in the final tube as the final answer. Ouyang ( 6 ) solved the maximal clique problem in 1997 by designing the restriction enzyme model. Roweis designed the sticker model ( 7 ) in 1998. The self-assembly model was designed by Winfree in 1998 ( 8 ). The surface-based model ( 9 ) was designed by Smith in 1998. Finally, Sakamoto proposed the hairpin model in 2000 ( 10 ).
Other applications of DNA computing include Zhang's image encryption algorithm ( 11 ), Patel's multiple-image encryption ( 12 ), and the novel data encryption scheme proposed by Patro ( 13 ). Researchers have made many efforts to solve NP-hard problems with these models, such as knapsack ( 14 ), n-queens ( 15 ), bin packing ( 16 ), maximal clique problem ( 17 ), graph vertex coloring problem ( 18 ), maximal matching problem ( 19 ), maximal connected subgraph problem ( 20 ), maximum complete bipartite subgraph ( 21 ), maximum independent set problem ( 22 ), maximum matching problem ( 23 ), maximum k-colourable subgraph ( 24 ), Hamiltonian path ( 25 , 26 ), and minimum k-suppliers ( 27 ).

2. Objectives

Assignment is another famous NP-hard problem ( 28 ), different versions of which have been solved by researchers such as Wang ( 29 ), who solved the unbalanced version, and Yang ( 30 ), who solved the quadratic version. Based on the Adleman-Lipton model, we propose a parallel DNA algorithm for the assignment problem with O(n^2) time complexity and O(n) initial DNA strands, in comparison with the O(n^n) initial DNA strands of similar solutions such as the one proposed by Wang et al. ( 31 ).

3. Materials and Methods

In the assignment problem, we have n jobs that can be performed by n people. Each person performs each job for a specific wage. The objective is to assign each job to exactly one person in such a way that the total wages are minimized. As an example, consider four teachers (T1-T4) from four different cities (C_T1-C_T4) and four universities (U1-U4) in four other cities (C_U1-C_U4). Table 1 shows the ticket price for each flight from the teacher cities C_T to the university cities C_U. The goal is to send each teacher to one of the universities in such a way that the total price of the purchased tickets is minimized.

Table 1

Ticket prices for the example described in the text.

City	C_T1	C_T2	C_T3	C_T4
C_U1	40	140	90	180
C_U2	120	20	70	110
C_U3	80	25	25	115
C_U4	80	180	130	140

We can represent Table 1 as a cost matrix whose rows are the university cities and whose columns are the teacher cities:

C = [  40  140   90  180
      120   20   70  110
       80   25   25  115
       80  180  130  140 ]

One possible assignment is to send T3 to U1, T4 to U2, T2 to U3, and T1 to U4. The total cost of this assignment is $90 + $110 + $25 + $80 = $305. The best possible assignment, however, costs $40 + $20 + $25 + $140 = $225, which means sending T1 to U1, T2 to U2, T3 to U3, and T4 to U4.
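The two totals quoted for the example can be checked with a small exhaustive search on an ordinary computer. The sketch below is not the DNA algorithm itself; it simply tries every permutation of teachers over universities for the Table 1 prices. The names COST, total_cost, and best_assignment are illustrative and do not come from the article.

```python
from itertools import permutations

# Ticket prices from Table 1: rows are university cities (U1..U4),
# columns are teacher cities (T1..T4).
COST = [
    [40, 140, 90, 180],
    [120, 20, 70, 110],
    [80, 25, 25, 115],
    [80, 180, 130, 140],
]

def total_cost(cost, teacher_for_university):
    """Sum the prices when university u receives teacher teacher_for_university[u]."""
    return sum(cost[u][t] for u, t in enumerate(teacher_for_university))

def best_assignment(cost):
    """Try all n! one-to-one assignments and return (minimum cost, assignment)."""
    n = len(cost)
    return min((total_cost(cost, p), p) for p in permutations(range(n)))

# The first assignment in the text: T3->U1, T4->U2, T2->U3, T1->U4 (0-based indices).
print(total_cost(COST, (2, 3, 1, 0)))
# The optimum found by exhaustive search.
print(best_assignment(COST))
```

Exhaustive search grows as n!, so this check is only practical for small examples such as the four-teacher case.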

3.1. Adleman-Lipton Model:

The operations of the Adleman-Lipton model are as follows:
Merge (T1, T2): The contents of tubes T1 and T2 are merged, and the result is kept in T1.
Copy (T1, T2): The content of T1 is copied into the empty tube T2.
Detect (T): Determines whether any sequence exists in tube T.
Extract (T1, X, T2): The subset of T1 that contains the sequence X is transferred into T2.
Select (T1, X, T2): The subset of T1 containing sequences longer than X is transferred into T2.
Selecting (T1, L, T2): Sequences of size L are selected from tube T1 and put into tube T2.
SelectMax (T1, T2): The longest sequence in tube T1 is chosen and moved into tube T2.
SelectMin (T1, T2): The shortest sequence in tube T1 is chosen and moved into tube T2.
Annealing (T): ssDNA is converted to dsDNA by cooling the tube.
Denaturing (T): dsDNA is converted to ssDNA by heating the tube.
Discard (T): The content of tube T is eliminated.
PCR (T, T1): Many copies of the DNA strands in T are made and placed in tube T1.
AppendTail (T, Z): Sequence Z is appended to the end of all sequences in tube T.
Read (T): A sequence of tube T is read.
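To make the operations more concrete, a few of them can be mimicked in software. The following is only a minimal sketch under the assumption that a tube is a Python list of strings and that every operation takes one step; it says nothing about the laboratory procedure, and all function names are illustrative.

```python
def merge(t1, t2):
    """Merge: combine both tubes; the result is kept in t1."""
    t1.extend(t2)

def extract(t1, x, t2):
    """Extract: move every strand of t1 that contains x into t2."""
    t2.extend(s for s in t1 if x in s)
    t1[:] = [s for s in t1 if x not in s]

def append_tail(t, z):
    """AppendTail: append sequence z to the end of every strand in t."""
    t[:] = [s + z for s in t]

def select_min(t1, t2):
    """SelectMin: move one shortest strand of t1 into t2 (no-op if t1 is empty)."""
    if t1:
        shortest = min(t1, key=len)
        t1.remove(shortest)
        t2.append(shortest)

def detect(t):
    """Detect: report whether the tube holds any strand."""
    return bool(t)

def discard(t):
    """Discard: empty the tube."""
    t.clear()

# Hypothetical strands, chosen only to exercise the functions.
tube = ["ATTC", "ATGGA", "CC"]
hits = []
extract(tube, "AT", hits)   # hits now holds the two strands containing "AT"
append_tail(hits, "G")      # every strand in hits gains a trailing G
best = []
select_min(hits, best)      # the shortest strand moves to best
print(tube, hits, best)
```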

3.2. DNA Algorithm for Assignment Problem

First, all n^n combinations of people and jobs are generated as different DNA strands. Then, the strands containing all people are kept and the others are discarded. After that, among the remaining strands, those that contain all jobs are kept and the others are discarded. The shortest sequences are regarded as the final solutions. The proposed algorithm consists of six steps:
Step 1: Build 2n unique 10-mer sequences (sequences of length 10), one for each of the n people and n jobs.
Step 2: Build all n^2 possible combinations of jobs with people and then append Cij-mer sequences of G nucleotides (Cij is the cost or profit of performing job Jj by person Pi).
Step 3: Construct all n^n possible combinations of choosing n jobs by n people.
Step 4: Eliminate all sequences that do not contain all people.
Step 5: Among the remaining sequences of the previous step, keep only those that contain all jobs.
Step 6: Take the shortest sequence as the optimal solution.

3.3. Strand Design:

First, a distinct 10-mer sequence is produced for each person (Pi) and each job (Ji) of the assignment problem. Using the three nucleotides A, T, and C, we can produce 3^10 = 59,049 different 10-mer sequences. Among them, we select the 2n sequences that differ most from each other, put them in 2n tubes, and perform the Polymerase Chain Reaction (PCR) on them. PCR, developed in 1983 by Kary Mullis ( 32 ), is a molecular biology technique that generates many copies of a single DNA sequence. We then generate n^2 sequences by pairing each of the n person sequences with each of the n job sequences. Finally, a run of G nucleotides whose length corresponds to the cost of assigning the person to the job is appended to each of these sequences. To keep the sequences short, a simple scaling rule can be applied to the costs, such as subtracting the minimum cost from all costs. In our example, the minimum ticket price was $20 and the prices differed from each other by at least $5, so we subtracted 20 from all ticket prices and then added one G for each $5. As shown in Table 2, we used four different 10-mers for the four teachers and four other 10-mers for the four universities.

Table 2

Sequences chosen to represent the elements of the assignment problem, which has been described in the Table 1.

	DNA Sequence		DNA Sequence
T₁	5' TCTATAACTA 3'	U₁	5' AACTACATTT 3'
T₂	5' TAACACTATT 3'	U₂	5' CATCATTTAC 3'
T₃	5' ACTAATCTCT 3'	U₃	5' ATTACTCCTA 3'
T₄	5' CACACATCTA 3'	U₄	5' TACCTCAACT 3'

Then, as shown in Table 3, we combine every teacher sequence with every university sequence and append one G for each $5 of cost to each combined sequence, after removing 4 Gs from each of them to account for subtracting the minimum cost of $20.

Table 3

Sequences chosen to represent the elements of assignment problem.

	Length	DNA Sequence
T₁U₁	40	5' TCTATAACTAAACTACATTTGGGGGGGGGGGGGGGGGGGG 3'
T₁U₂	32	5' TCTATAACTACATCATTTACGGGGGGGGGGGG 3'
T₁U₃	24	5' TCTATAACTAATTACTCCTAGGGG 3'
T₁U₄	32	5' TCTATAACTATACCTCAACTGGGGGGGGGGGG 3'
T₂U₁	20	5' TAACACTATTAACTACATTT 3'
T₂U₂	52	5' TAACACTATTCATCATTTACGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG 3'
T₂U₃	44	5' TAACACTATTATTACTCCTAGGGGGGGGGGGGGGGGGGGGGGGG 3'
T₂U₄	21	5' TAACACTATTTACCTCAACTG 3'
T₃U₁	30	5' ACTAATCTCTAACTACATTTGGGGGGGGGG 3'
T₃U₂	42	5' ACTAATCTCTCATCATTTACGGGGGGGGGGGGGGGGGGGGGG 3'
T₃U₃	34	5' ACTAATCTCTATTACTCCTAGGGGGGGGGGGGGG 3'
T₃U₄	21	5' ACTAATCTCTTACCTCAACTG 3'
T₄U₁	38	5' CACACATCTAAACTACATTTGGGGGGGGGGGGGGGGGG 3'
T₄U₂	44	5' CACACATCTACATCATTTACGGGGGGGGGGGGGGGGGGGGGGGG 3'
T₄U₃	52	5' CACACATCTAATTACTCCTAGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG 3'
T₄U₄	39	5' CACACATCTATACCTCAACTGGGGGGGGGGGGGGGGGGG 3'
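The strand design rule can be sketched in a few lines. The code below assumes the scaling used in the example (subtract the $20 minimum, then one G per $5); the function names and the $45 cost in the usage line are hypothetical and serve only as illustration. Under this rule a $20 ticket receives no G tail (a 20-mer) and a $180 ticket receives 32 Gs (a 52-mer).

```python
ALPHABET = "ATC"                      # G is reserved for the cost tail
NUM_10MERS = len(ALPHABET) ** 10      # number of distinct 10-mers over A, T, C

def g_tail_length(cost, min_cost=20, step=5):
    """Number of G nucleotides that encode a cost under the scaling rule."""
    offset = cost - min_cost
    if offset < 0 or offset % step != 0:
        raise ValueError("cost must be min_cost plus a multiple of step")
    return offset // step

def build_strand(person_seq, job_seq, cost, min_cost=20, step=5):
    """Concatenate the person 10-mer, the job 10-mer, and the G tail."""
    return person_seq + job_seq + "G" * g_tail_length(cost, min_cost, step)

print(NUM_10MERS)
# Two 10-mers from Table 2 combined with a hypothetical cost of $45.
strand = build_strand("TCTATAACTA", "AACTACATTT", 45)
print(len(strand), strand)
```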


3.4. Sample Space:

Put each of the strands related to the first person in a separate tube.
For (i in 1 to n)
  ° Merge (T, Ti)
  ° Discard (Ti)
For (i in 2 to n)
  ° For (j in 1 to n)
    ▪ PCR (T, Tj)
    ▪ AppendTail (Tj, Pi Jj)
    ▪ Merge (Temp, Tj)
    ▪ Discard (Tj)
  ° Discard (T)
  ° Copy (Temp, T)
  ° Discard (Temp)
The trace of the above algorithm for three people and three jobs is shown as an example in Table 4.

Table 4

Trace for three people and three jobs

Round	Tube T₁	Tube T₂	Tube T₃	Tube T
i=1	P₁J₁	P₁J₂	P₁J₃	P₁J₁,P₁J₂,P₁J₃
i=2, j=1	P₁J₁P₂J₁,P₁J₂ P₂J₁,P₁J₃ P₂J₁	P₁J₁,P₁J₂,P₁J₃
i=2, j=2		P₁J₁P₂J₂,P₁J₂ P₂J₂, P₁J₃ P₂J₂		P₁J₁,P₁J₂,P₁J₃
i=2, j=3			P₁J₁P₂J₃,P₁J₂P₂J₃,P₁J₃P₂J₃	P₁J₁,P₁J₂,P₁J₃
i=3, j=1	P₁J₁P₂J₁P₃J₁,P₁ J₂P₂J₁P₃J₁, P₁J₃P₂J₁P₃J₁,P₁J₁P₂J₂P₃J₁, P₁J₂P₂J₂P₃J₁,P₁J₃P₂J₂P₃J₁, P₁J₁P₂J₃P₃J₁,P₁J₂P₂J₃P₃J₁, P₁J₃P₂J₃P₃J₁			P₁J₁P₂J₁,P₁ J₂P₂J₁, P₁J₃P₂J₁,P₁J₁P₂J₂, P₁J₂P₂J₂,P₁J₃P₂J₂, P₁J₁P₂J₃,P₁J₂P₂J₃, P₁J₃P₂J₃
i=3, j=3			P₁J₁P₂J₁P₃J₃,P₁ J₂P₂J₁P₃J₃, P₁J₃P₂J₁P₃J₃,P₁J₁P₂J₂P₃J₃, P₁J₂P₂J₂P₃J₃,P₁J₃P₂J₂P₃J₃, P₁J₁P₂J₃P₃J₃,P₁J₂P₂J₃P₃J₃, P₁J₃P₂J₃P₃J₃	P₁J₁P₂J₁,P₁ J₂P₂J₁, P₁J₃P₂J₁,P₁J₁P₂J₂, P₁J₂P₂J₂,P₁J₃P₂J₂, P₁J₁P₂J₃,P₁J₂P₂J₃, P₁J₃P₂J₃
Final	P₁J₁P₂J₁P₃J₁,P₁J₂P₂J₁P₃J₁,P₁J₃P₂J₁P₃J₁,P₁J₁P₂J₂P₃J₁,P₁J₂P₂J₂P₃J₁,P₁J₃P₂J₂P₃J₁,P₁J₁P₂J₃P₃J₁, P₁J₂P₂J₃P₃J₁,P₁J₃P₂J₃P₃J₁P₁J₁P₂J₁P₃J₂,P₁J₂P₂J₁P₃J₂,P₁J₃P₂J₁P₃J₂,P₁J₁P₂J₂P₃J₂,P₁J₂P₂J₂P₃J₂, P₁J₃P₂J₂P₃J₂,P₁J₁P₂J₃P₃J₂,P₁J₂P₂J₃P₃J₂,P₁J₃P₂J₃P₃J₂,P₁J₁P₂J₁P₃J₃,P₁J₂P₂J₁P₃J₃,P₁J₃P₂J₁P₃J₃, P₁J₁P₂J₂P₃J₃,P₁J₂P₂J₂P₃J₃,P₁J₃P₂J₂P₃J₃,P₁J₁P₂J₃P₃J₃,P₁J₂P₂J₃P₃J₃,P₁J₃P₂J₃P₃J₃
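The sample space construction traced in Table 4 can be reproduced symbolically. In this sketch, tubes are Python lists and strands are label strings such as "P1J2" rather than real 10-mers; both are simplifying assumptions, and build_sample_space is an illustrative name. Comments indicate which tube operations each line stands in for.

```python
def build_sample_space(n):
    """Return the n**n symbolic strands produced by the sample space loops."""
    # First loop: merge the n tubes holding P1J1 .. P1Jn into tube T.
    tube = [f"P1J{j}" for j in range(1, n + 1)]
    for i in range(2, n + 1):
        temp = []
        for j in range(1, n + 1):
            tj = list(tube)                      # PCR(T, Tj): copy T into Tj
            tj = [s + f"P{i}J{j}" for s in tj]   # AppendTail(Tj, PiJj)
            temp.extend(tj)                      # Merge(Temp, Tj); Discard(Tj)
        tube = temp                              # Discard(T); Copy(Temp, T); Discard(Temp)
    return tube

space = build_sample_space(3)
print(len(space))     # number of strands in the final tube
print(space[:3])      # first few strands, in the order of Table 4
```

Each pass over i multiplies the number of strands by n, so the final tube holds n^n strands even though only n^2 person-job strands are needed as building blocks.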

Eliminate all sequences that do not contain all people.
For (i in 1 to n)
  ° Extract (T, Pi, Temp)
  ° Discard (T)
  ° Copy (Temp, T)
  ° Discard (Temp)
Eliminate all sequences that do not contain all jobs.
For (i in 1 to n)
  ° Extract (T, Ji, Temp)
  ° Discard (T)
  ° Copy (Temp, T)
  ° Discard (Temp)
Final answer
SelectMin (T, Temp)
Read (Temp)
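The filtering and selection steps can be simulated end to end on the four-teacher example. This is a sketch under several assumptions: strands are symbolic labels followed by a G tail instead of real 10-mers, teachers play the role of people (P) and universities the role of jobs (J), indices are single digits, and the costs are those of Table 1 with the $20/$5 scaling rule. All names are illustrative.

```python
import re

# Table 1 prices: rows are universities (jobs), columns are teachers (people).
COST = [
    [40, 140, 90, 180],
    [120, 20, 70, 110],
    [80, 25, 25, 115],
    [80, 180, 130, 140],
]
MIN_COST, STEP = 20, 5

def pair_strand(person, job):
    """Symbolic strand for one person-job pair, with its cost as a G tail."""
    tail = (COST[job][person] - MIN_COST) // STEP
    return f"P{person + 1}J{job + 1}" + "G" * tail

def sample_space(n):
    """All n**n concatenations: person i picks any job, for i = 1..n."""
    tube = [pair_strand(0, j) for j in range(n)]
    for i in range(1, n):
        tube = [s + pair_strand(i, j) for j in range(n) for s in tube]
    return tube

def keep_containing(tube, labels):
    """One Extract round per label: keep only strands that contain it."""
    for label in labels:
        tube = [s for s in tube if label in s]
    return tube

n = 4
tube = sample_space(n)
tube = keep_containing(tube, [f"P{i}" for i in range(1, n + 1)])  # all people
tube = keep_containing(tube, [f"J{j}" for j in range(1, n + 1)])  # all jobs
winner = min(tube, key=len)                                       # SelectMin, Read
pairs = re.findall(r"P(\d)J(\d)", winner)
total = sum(COST[int(j) - 1][int(p) - 1] for p, j in pairs)
print(len(tube), pairs, total)
```

Because every label has the same length, the shortest surviving strand is the one with the fewest G nucleotides, which corresponds to the lowest total cost.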

4. Results

4.1. The solutions of the assignment problem with n jobs and n individuals can be obtained by the above-mentioned DNA molecular operations.

Proof. First, all combinations of the n job assignments are generated in the sample space. Then, the sequences containing all people are kept, which guarantees that no person is left without a job. Finally, among the remaining sequences, those containing all jobs are kept, which guarantees that every job will be done. Therefore, all possible assignments are in the final tube, and because the cost of each person-job pair was attached to its strand in the strand design phase, the strand with the minimum length among the strands of the final tube is the minimum-cost assignment.

4.2. The time complexity of the proposed algorithm for solving the assignment problem with n jobs and n individuals is O(n^2).

Proof. As the complexity of every biological operation is O(1) ( 33 ), the time complexity of the algorithm can be calculated by adding the time complexities of all steps as follows:
T (Sample space): O(n) for the first loop and O(n^2) for the nested loop.
T (Persons): Four biological operations in each step, which is O(1), and O(n) over n steps.
T (Jobs): Four biological operations in each step, which is O(1), and O(n) over n steps.
T (Last step): Two biological operations, which is O(1).
So, the time complexity of the algorithm is given by equation 1:
T(n) = O(n) + O(n^2) + O(n) + O(n) + O(1) = O(n^2)    (1)
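The operation count behind this bound can be tallied directly from the pseudocode of Section 3.4. The sketch below assumes, as the proof does, that each tube operation costs one step, and it ignores the initial placement of strands into tubes; count_operations is an illustrative name, and the closed form 4n^2 + 9n - 1 is derived here from the loops rather than taken from the article.

```python
def count_operations(n):
    """Count tube operations in the pseudocode, one unit per operation."""
    ops = 0
    for _ in range(1, n + 1):          # first loop: Merge, Discard
        ops += 2
    for _ in range(2, n + 1):          # nested loop over people 2..n
        for _ in range(1, n + 1):      # PCR, AppendTail, Merge, Discard
            ops += 4
        ops += 3                       # Discard, Copy, Discard
    for _ in range(1, n + 1):          # people filter: Extract, Discard, Copy, Discard
        ops += 4
    for _ in range(1, n + 1):          # jobs filter: Extract, Discard, Copy, Discard
        ops += 4
    ops += 2                           # SelectMin, Read
    return ops

for n in (3, 4, 10):
    print(n, count_operations(n), 4 * n * n + 9 * n - 1)
```

The quadratic term comes entirely from the nested loop that builds the sample space; the two filtering phases contribute only linear terms.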

5. Discussion

In the proposed method, using adenine, thymine, and cytosine, we produced as many distinct 10-mer sequences as there are people and jobs in the assignment problem. Then, by applying the polymerase chain reaction to them, we produced the n^2 sequences that pair each of the n persons with each of the n jobs. Finally, by appending to each of these sequences a number of guanines corresponding to the cost of assigning the job to the person, we obtained the desired sequences. Wang et al. ( 31 ), Kang et al. ( 34 ), Shu et al. ( 35 ), Ebrahim et al. ( 36 ), Tsaftaris et al. ( 37 ), and Rashid et al. ( 38 ) all used the same model to solve the assignment problem using n^n distinct sequences with a time complexity of O(n^2). The proposed algorithm greatly reduces the initial cost by using just 2n initial sequences, in comparison with the n^n initial sequences used by the other methods, and solves the problem with the same time complexity.

6. Conclusion

The ultra-efficient parallelism of DNA computing methods is considerable in contrast with the obvious limitations of electronic computers in storage, speed, intelligence, and miniaturization. Although some problems in biological technology still need further study, it remains possible to solve many NP-hard problems in linear time via DNA computing. In this article, we presented a DNA computing model with biological operations based on the Adleman-Lipton model to solve the assignment problem with a time complexity of O(n^2), buying only 2n 10-mer sequences and producing the other required sequences in the lab, in contrast to previous methods, which need many different sequences. We hope that, in the near future, molecular computers will become usable in place of electronic computers, making DNA computing solutions practical.

