Parallelizing Assignment Problem with DNA Strands.

Babak Khorsand, Abdorreza Savadi, Mahmoud Naghibzadeh.

Abstract

BACKGROUND: Many combinatorial optimization problems, for which only exponential-time solutions are known, are Non-deterministic Polynomial hard (NP-hard). The advent of parallel machines has opened new opportunities to develop effective solutions for NP-hard problems. However, solving these problems in polynomial time requires massively parallel machines and has not been practical so far.
OBJECTIVES: DNA (deoxyribonucleic acid) computing offers a promising method for solving NP-hard problems in polynomial time. One well-known problem of this kind is the assignment problem, which seeks the best assignment of n jobs to n persons so as to maximize the profit or minimize the cost.
MATERIAL AND METHODS: Using the biomolecular operations of the Adleman-Lipton model, a novel parallel DNA algorithm is proposed for solving the assignment problem.
RESULTS: The proposed algorithm solves the problem in O(n^2) time complexity and needs just O(n^2) initial DNA strands, compared with the n^n initial sequences used by other methods.
CONCLUSIONS: In this article, using DNA computing, we proposed a parallel DNA algorithm to solve the assignment problem in linear time.
Copyright: © 2019 The Author(s); Published by Iranian Journal of Biotechnology.

Keywords:  Adelman Lipton model; Assignment; DNA algorithm; DNA computing; Molecular computation

Year:  2020        PMID: 32884959      PMCID: PMC7461708          DOI: 10.30498/IJB.2020.195413.2547

Source DB:  PubMed          Journal:  Iran J Biotechnol        ISSN: 1728-3043            Impact factor:   1.671


1. Background

Deoxyribonucleic acid (DNA) is a molecule that carries most of the genetic instructions used in the development, functioning, and reproduction of all known living organisms and many viruses. Just one gram of DNA can hold about 680 petabytes (6.8 × 10^17 bytes) ( 1 ), equal to one billion Compact Discs (CDs), which would take 163,000 centuries to listen to. DNA can also process 3 × 10^14 molecules in parallel ( 2 ), and this enormous parallelism has persuaded researchers to build models that use DNA strands to solve NP-hard problems ( 3 ). Given the similarities between biological and mathematical operations, and the fact that DNA is stable and predictable in its reactions, this macromolecule can be used to encode information for mathematical systems. In 1994, Adleman ( 4 ) designed the first model of molecular computation by solving a seven-node instance of the Hamiltonian path problem, demonstrating the enormous parallel power of DNA computation. In 1995, Lipton ( 5 ) solved the satisfiability problem with Adleman's method and designed a new method, called the Adleman-Lipton model, which consists of three stages: (a) producing specific sequences and putting them in a test tube; (b) performing a number of operations on the DNA sequences inside the tube and removing unwanted sequences, which certainly cannot be the final answer to the problem; (c) reading and reporting the DNA sequences in the final tube, if any exist, as the final answer. Ouyang ( 6 ) solved the maximal clique problem in 1997 by designing the restriction enzyme model. Roweis designed the sticker model ( 7 ) in 1998. The self-assembly model was designed by Winfree in 1998 ( 8 ). The surface-based model ( 9 ) was designed by Smith in 1998. Finally, Sakamoto proposed the hairpin model in 2000 ( 10 ).
Other applications of DNA computing include Zhang's image encryption algorithm ( 11 ), Patel's multiple-image encryption ( 12 ), and the novel data encryption scheme proposed by Patro ( 13 ). Researchers have made many efforts to solve NP-hard problems with these models, such as knapsack ( 14 ), n-queens ( 15 ), bin packing ( 16 ), maximal clique ( 17 ), graph vertex coloring ( 18 ), maximal matching ( 19 ), maximal connected subgraph ( 20 ), maximum complete bipartite subgraph ( 21 ), maximum independent set ( 22 ), maximum matching ( 23 ), maximum k-colourable subgraph ( 24 ), Hamiltonian path ( 25 , 26 ), and minimum k-suppliers ( 27 ).

2. Objectives

Assignment is another famous NP-hard problem ( 28 ), different versions of which have been solved by researchers such as Wang ( 29 ), who solved the unbalanced version, and Yang ( 30 ), who solved the quadratic version. Based on the Adleman-Lipton model, we propose a parallel DNA algorithm for the assignment problem with O(n^2) time complexity and O(n) initial DNA strands, compared with the O(n^n) initial DNA strands of similar solutions such as the one proposed by Wang et al ( 31 ).

3. Materials and Methods

In the assignment problem, we have n jobs that can be performed by n people. Each person performs each job for a specific wage. The objective is to assign each job to one person in such a way that the total wage is minimized. As an example, consider four teachers (T1-T4) from four different cities (C_T1-C_T4) and four universities (U1-U4) in four other cities (C_U1-C_U4). Table 1 shows the ticket price for each flight from the cities C_T to the cities C_U. The goal is to send each teacher to a different university in such a way that the total price of the purchased tickets is minimized.
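Table 1

Ticket prices for the example described in the text.

| City | C_T1 | C_T2 | C_T3 | C_T4 |
|------|------|------|------|------|
| C_U1 | 40   | 140  | 90   | 180  |
| C_U2 | 120  | 20   | 70   | 110  |
| C_U3 | 80   | 25   | 25   | 115  |
| C_U4 | 80   | 180  | 130  | 140  |

We can represent Table 1 as a cost matrix, with one row per university (U1 to U4) and one column per teacher (T1 to T4):

\[
C = \begin{pmatrix}
40 & 140 & 90 & 180 \\
120 & 20 & 70 & 110 \\
80 & 25 & 25 & 115 \\
80 & 180 & 130 & 140
\end{pmatrix}
\]

One possible assignment sends T3 to U1 ($90), T4 to U2 ($110), T2 to U3 ($25), and T1 to U4 ($80). The total cost of this assignment is $90 + $110 + $25 + $80 = $305. The best possible assignment, however, costs $40 + $20 + $25 + $140 = $225 and sends T1 to U1, T2 to U2, T3 to U3, and T4 to U4.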
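The two totals quoted for the ticket example can be checked on a conventional computer by enumerating all 4! one-to-one assignments. The following is a minimal sketch in Python, not part of the proposed DNA method; the names COST, total, example, and best are illustrative, and the matrix is copied from Table 1 with rows as universities and columns as teachers.

```python
from itertools import permutations

# Ticket prices from Table 1: COST[u][t] is the fare for sending
# teacher t+1 to university u+1 (rows U1..U4, columns T1..T4).
COST = [
    [40, 140, 90, 180],
    [120, 20, 70, 110],
    [80, 25, 25, 115],
    [80, 180, 130, 140],
]

def total(assignment):
    """Total fare when university u receives teacher assignment[u]."""
    return sum(COST[u][t] for u, t in enumerate(assignment))

# The sample assignment from the text: T3->U1, T4->U2, T2->U3, T1->U4.
example = (2, 3, 1, 0)

# Exhaustive search over all one-to-one assignments (only viable for small n).
best = min(permutations(range(4)), key=total)

print(total(example), best, total(best))
```

Exhaustive enumeration grows as n!, which is the kind of search space the DNA approach aims to explore in parallel.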

3.1. Adleman-Lipton Model:

The Adleman-Lipton model provides the following operations:
- Merge (T1, T2): The content of tube T2 is merged into tube T1, so that T1 holds the content of both tubes.
- Copy (T1, T2): The content of T1 is copied into the empty tube T2.
- Detect (T): Determines whether any sequence exists in tube T.
- Extract (T1, X, T2): The subset of T1 that contains the sequence X is transferred into T2.
- Select (T1, X, T2): The subset of T1 containing sequences longer than X is transferred into T2.
- Selecting (T1, L, T2): Sequences of length L are selected from tube T1 and put in tube T2.
- SelectMax (T1, T2): The longest sequence in tube T1 is chosen and moved into tube T2.
- SelectMin (T1, T2): The shortest sequence in tube T1 is chosen and moved into tube T2.
- Annealing (T): ssDNA is converted to dsDNA by cooling the tube.
- Denaturing (T): dsDNA is converted to ssDNA by heating the tube.
- Discard (T): The content of tube T is eliminated.
- PCR (T, T1): Many copies of the DNA strands in tube T are made and placed in tube T1.
- AppendTail (T, Z): Sequence Z is appended to the end of all sequences in tube T.
- Read (T): A sequence of tube T is read.
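To make the semantics of these operations concrete, they can be imitated in software. The sketch below is a hypothetical model, not a laboratory protocol: it assumes a tube is a plain Python list of strand strings, and the function names simply mirror the operation names above. Only the operations used later in the algorithm are included.

```python
# Hypothetical software model: a "tube" is a Python list of strand strings.

def merge(t1, t2):
    """Merge(T1, T2): pour the content of T2 into T1, leaving T2 empty."""
    t1.extend(t2)
    t2.clear()

def copy(t1, t2):
    """Copy(T1, T2): duplicate the content of T1 into the empty tube T2."""
    assert not t2, "T2 must be empty"
    t2.extend(t1)

def detect(t):
    """Detect(T): report whether the tube holds any strand."""
    return len(t) > 0

def extract(t1, x, t2):
    """Extract(T1, X, T2): move every strand of T1 containing X into T2."""
    t2.extend(s for s in t1 if x in s)
    t1[:] = [s for s in t1 if x not in s]

def append_tail(t, z):
    """AppendTail(T, Z): append Z to the end of every strand in T."""
    t[:] = [s + z for s in t]

def select_min(t1, t2):
    """SelectMin(T1, T2): move the shortest strand(s) of T1 into T2."""
    if not t1:
        return
    shortest = min(len(s) for s in t1)
    t2.extend(s for s in t1 if len(s) == shortest)
    t1[:] = [s for s in t1 if len(s) != shortest]

def discard(t):
    """Discard(T): empty the tube."""
    t.clear()

# Small usage example with made-up strands.
tube, temp, out = ["ACTG", "TTAC", "ACGGG"], [], []
extract(tube, "GG", temp)      # temp receives the strand containing "GG"
after_extract = (list(tube), list(temp))
merge(tube, temp)              # pour temp back into tube
select_min(tube, out)          # out receives the two length-4 strands
```

In this model each call takes time proportional to the tube size, whereas the paper treats every biological operation as a single O(1) step acting on all strands at once.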

3.2. DNA Algorithm for Assignment Problem

First, all n^n combinations of people and jobs are generated as different DNA strands. Then, the strands containing all people are kept and the others are discarded. After that, among the remaining strands, those containing all jobs are kept and the others are discarded. The shortest sequences are regarded as the final solutions. The proposed algorithm consists of six steps:
Step 1. Build 2n unique 10-mer sequences (sequences of length 10), one for each of the n people and n jobs.
Step 2. Build all n^2 possible combinations of jobs with people, and append to each a Cij-mer sequence of G nucleotides (Cij is the cost or profit of person Pi performing job Jj).
Step 3. Construct all n^n possible combinations of choosing n jobs by n people.
Step 4. Eliminate all sequences that do not contain all people.
Step 5. Among the remaining sequences, keep only those that contain all jobs.
Step 6. Take the shortest sequence as the optimal solution.

3.3. Strand Design:

First, for each person (Pi) and each job (Jj) of the assignment problem, produce a distinct 10-mer sequence. Using the 3 nucleotides A, T, and C, 3^10 = 59049 different 10-mer sequences can be produced. Among them, we select the sequences that differ most from each other, put these 2n sequences in 2n tubes, and perform the polymerase chain reaction (PCR) on them. PCR, developed in 1983 by Kary Mullis ( 32 ), is a molecular biology technique that generates many copies of a single DNA sequence. We then generate n^2 sequences by concatenating each of the n person sequences with each of the n job sequences. Finally, to each of these sequences we append a run of G nucleotides whose length equals the cost of assigning that person to that job (a Cij-mer). To keep the sequences short, the costs can be rescaled by a simple rule, for example by subtracting the minimum cost from all costs. In our example, the minimum ticket price was $20 and the prices differed from each other by at least $5. So, we subtracted 20 from all ticket prices and then added one G for each $5. As Table 2 shows, we used four different 10-mers for the four teachers and four other 10-mers for the four universities.
Table 2

Sequences chosen to represent the elements of the assignment problem described in Table 1.

| Teacher | DNA Sequence     | University | DNA Sequence     |
|---------|------------------|------------|------------------|
| T1      | 5' TCTATAACTA 3' | U1         | 5' AACTACATTT 3' |
| T2      | 5' TAACACTATT 3' | U2         | 5' CATCATTTAC 3' |
| T3      | 5' ACTAATCTCT 3' | U3         | 5' ATTACTCCTA 3' |
| T4      | 5' CACACATCTA 3' | U4         | 5' TACCTCAACT 3' |

Then, as shown in Table 3, we combine every teacher sequence with every university sequence and add one G for each $5 of cost, after removing 4 Gs from each sequence to account for the subtracted $20 minimum.
Table 3

Sequences chosen to represent the elements of the assignment problem.

| Pair | Length | DNA Sequence |
|------|--------|--------------|
| T1U1 | 40 | 5' TCTATAACTAAACTACATTTGGGGGGGGGGGGGGGGGGGG 3' |
| T1U2 | 32 | 5' TCTATAACTACATCATTTACGGGGGGGGGGGG 3' |
| T1U3 | 24 | 5' TCTATAACTAATTACTCCTAGGGG 3' |
| T1U4 | 32 | 5' TCTATAACTATACCTCAACTGGGGGGGGGGGG 3' |
| T2U1 | 20 | 5' TAACACTATTAACTACATTT 3' |
| T2U2 | 52 | 5' TAACACTATTCATCATTTACGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG 3' |
| T2U3 | 44 | 5' TAACACTATTATTACTCCTAGGGGGGGGGGGGGGGGGGGGGGGG 3' |
| T2U4 | 21 | 5' TAACACTATTTACCTCAACTG 3' |
| T3U1 | 30 | 5' ACTAATCTCTAACTACATTTGGGGGGGGGG 3' |
| T3U2 | 42 | 5' ACTAATCTCTCATCATTTACGGGGGGGGGGGGGGGGGGGGGG 3' |
| T3U3 | 34 | 5' ACTAATCTCTATTACTCCTAGGGGGGGGGGGGGG 3' |
| T3U4 | 21 | 5' ACTAATCTCTTACCTCAACTG 3' |
| T4U1 | 38 | 5' CACACATCTAAACTACATTTGGGGGGGGGGGGGGGGGG 3' |
| T4U2 | 44 | 5' CACACATCTACATCATTTACGGGGGGGGGGGGGGGGGGGGGGGG 3' |
| T4U3 | 52 | 5' CACACATCTAATTACTCCTAGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG 3' |
| T4U4 | 39 | 5' CACACATCTATACCTCAACTGGGGGGGGGGGGGGGGGGG 3' |
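The strand design rule (two 10-mers over the alphabet A, T, C, followed by one G per $5 above the $20 minimum) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' procedure: the 10-mers are copied from Table 2, the helper names (hamming, g_tail_length, encode_pair, decode_cost) are hypothetical, and the $45 fare is an illustrative value that does not come from Table 1.

```python
from itertools import combinations

ALPHABET = "ATC"          # G is reserved for the cost tail
MIN_COST = 20             # minimum ticket price in the example (dollars)
STEP = 5                  # one G nucleotide per 5 dollars

# 10-mer codes copied from Table 2.
CODES = {
    "T1": "TCTATAACTA", "T2": "TAACACTATT",
    "T3": "ACTAATCTCT", "T4": "CACACATCTA",
    "U1": "AACTACATTT", "U2": "CATCATTTAC",
    "U3": "ATTACTCCTA", "U4": "TACCTCAACT",
}

def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

def g_tail_length(cost):
    """Number of G nucleotides that encode a cost after rescaling."""
    return (cost - MIN_COST) // STEP

def encode_pair(teacher, university, cost):
    """Teacher 10-mer + university 10-mer + G tail encoding the cost."""
    return CODES[teacher] + CODES[university] + "G" * g_tail_length(cost)

def decode_cost(strand):
    """Recover the cost of a single-pair strand from its G count."""
    return strand.count("G") * STEP + MIN_COST

code_space = len(ALPHABET) ** 10   # number of possible 10-mers
all_valid = all(len(c) == 10 and set(c) <= set(ALPHABET)
                for c in CODES.values())
min_distance = min(hamming(a, b)
                   for a, b in combinations(CODES.values(), 2))

strand = encode_pair("T1", "U1", 45)   # illustrative fare of 45 dollars
print(code_space, all_valid, min_distance, strand, decode_cost(strand))
```

Because the person and job codes never contain G, the cost of a strand can be read off from its length alone, which is what lets SelectMin pick the cheapest assignment.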

3.4. Sample Space:

Put each of the strands related to the first person in a separate tube.
For (i in 1 to n)
  ° Merge (T, Ti)
  ° Discard (Ti)
For (i in 2 to n)
  ° For (j in 1 to n)
    ▪ PCR (T, Tj)
    ▪ AppendTail (Tj, Pi Jj)
    ▪ Merge (Temp, Tj)
    ▪ Discard (Tj)
  ° Discard (T)
  ° Copy (Temp, T)
  ° Discard (Temp)
The trace of the above algorithm for three people and three jobs is shown as an example in Table 4.
Table 4

Trace for three people and three jobs

| Round | Tube T1 | Tube T2 | Tube T3 | Tube T |
|-------|---------|---------|---------|--------|
| i=1 | P1J1 | P1J2 | P1J3 | P1J1, P1J2, P1J3 |
| i=2, j=1 | P1J1P2J1, P1J2P2J1, P1J3P2J1 | | | P1J1, P1J2, P1J3 |
| i=2, j=2 | | P1J1P2J2, P1J2P2J2, P1J3P2J2 | | P1J1, P1J2, P1J3 |
| i=2, j=3 | | | P1J1P2J3, P1J2P2J3, P1J3P2J3 | P1J1, P1J2, P1J3 |
| i=3, j=1 | P1J1P2J1P3J1, P1J2P2J1P3J1, P1J3P2J1P3J1, P1J1P2J2P3J1, P1J2P2J2P3J1, P1J3P2J2P3J1, P1J1P2J3P3J1, P1J2P2J3P3J1, P1J3P2J3P3J1 | | | P1J1P2J1, P1J2P2J1, P1J3P2J1, P1J1P2J2, P1J2P2J2, P1J3P2J2, P1J1P2J3, P1J2P2J3, P1J3P2J3 |
| i=3, j=3 | | | P1J1P2J1P3J3, P1J2P2J1P3J3, P1J3P2J1P3J3, P1J1P2J2P3J3, P1J2P2J2P3J3, P1J3P2J2P3J3, P1J1P2J3P3J3, P1J2P2J3P3J3, P1J3P2J3P3J3 | P1J1P2J1, P1J2P2J1, P1J3P2J1, P1J1P2J2, P1J2P2J2, P1J3P2J2, P1J1P2J3, P1J2P2J3, P1J3P2J3 |
| Final | | | | P1J1P2J1P3J1, P1J2P2J1P3J1, P1J3P2J1P3J1, P1J1P2J2P3J1, P1J2P2J2P3J1, P1J3P2J2P3J1, P1J1P2J3P3J1, P1J2P2J3P3J1, P1J3P2J3P3J1, P1J1P2J1P3J2, P1J2P2J1P3J2, P1J3P2J1P3J2, P1J1P2J2P3J2, P1J2P2J2P3J2, P1J3P2J2P3J2, P1J1P2J3P3J2, P1J2P2J3P3J2, P1J3P2J3P3J2, P1J1P2J1P3J3, P1J2P2J1P3J3, P1J3P2J1P3J3, P1J1P2J2P3J3, P1J2P2J2P3J3, P1J3P2J2P3J3, P1J1P2J3P3J3, P1J2P2J3P3J3, P1J3P2J3P3J3 |
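The sample-space construction traced in Table 4 can be replayed symbolically. The sketch below is a hypothetical software model: strands are label strings such as "P1J2", tubes are Python lists, and the function name build_sample_space is illustrative. PCR is modelled as copying the list, which ignores the fact that real PCR makes many copies of each strand.

```python
def build_sample_space(n):
    """Replay the sample-space loops with symbolic strand labels."""
    # Initial tubes T1..Tn each hold one strand for the first person.
    tubes = {j: [f"P1J{j}"] for j in range(1, n + 1)}

    # First loop: Merge(T, Ti) then Discard(Ti).
    t = []
    for j in range(1, n + 1):
        t.extend(tubes[j])
        tubes[j] = []

    # Nested loop: extend every strand in T by each pair PiJj.
    for i in range(2, n + 1):
        temp = []
        for j in range(1, n + 1):
            tj = list(t)                          # PCR(T, Tj)
            tj = [s + f"P{i}J{j}" for s in tj]    # AppendTail(Tj, PiJj)
            temp.extend(tj)                       # Merge(Temp, Tj); Discard(Tj)
        t = temp                                  # Discard(T); Copy(Temp, T); Discard(Temp)
    return t

space = build_sample_space(3)
print(len(space), space[0], space[-1])
```

For n = 3 the model yields 3^3 = 27 distinct strands in the same order as the Final row of Table 4.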
Eliminate all sequences that do not contain all people.
For (i in 1 to n)
  ° Extract (T, Pi, Temp)
  ° Discard (T)
  ° Copy (Temp, T)
  ° Discard (Temp)
Eliminate all sequences that do not contain all jobs.
For (i in 1 to n)
  ° Extract (T, Ji, Temp)
  ° Discard (T)
  ° Copy (Temp, T)
  ° Discard (Temp)
Final answer
SelectMin (T, Temp)
Read (Temp)
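The filtering and selection steps can be simulated for the four-teacher example. The sketch below is a software analogy under stated assumptions, not the wet-lab procedure: a strand is modelled as a tuple of (teacher, university) index pairs instead of a nucleotide string, its length is computed from the encoding rule (two 10-mers plus one G per $5 above the $20 minimum), and the costs are taken from Table 1. All names are illustrative.

```python
from itertools import product

# Table 1 fares: COST[u][t] for university u+1 and teacher t+1.
COST = [
    [40, 140, 90, 180],
    [120, 20, 70, 110],
    [80, 25, 25, 115],
    [80, 180, 130, 140],
]
N = 4
MIN_COST, STEP, CODE_LEN = 20, 5, 10

def strand_length(strand):
    """Length in nucleotides: per pair, two 10-mers plus the G tail."""
    return sum(2 * CODE_LEN + (COST[u][t] - MIN_COST) // STEP
               for t, u in strand)

# Sample space: teacher t is paired with any university, giving n^n strands.
sample_space = [tuple(enumerate(choice))
                for choice in product(range(N), repeat=N)]

tube = list(sample_space)
for t in range(N):      # Extract loop over people
    tube = [s for s in tube if any(person == t for person, _ in s)]
for u in range(N):      # Extract loop over jobs (universities)
    tube = [s for s in tube if any(job == u for _, job in s)]

best = min(tube, key=strand_length)                 # SelectMin, then Read
best_cost = ((strand_length(best) - N * 2 * CODE_LEN) * STEP
             + N * MIN_COST)                        # undo the rescaling
print(len(sample_space), len(tube), best, best_cost)
```

After both filters only the n! one-to-one assignments remain, and the shortest strand decodes to the $225 optimum quoted for the example.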

4. Results

4.1. The solution of the assignment problem with n jobs and n individuals can be obtained by the DNA molecular operations described above.

Proof. First, all combinations of the n job assignments are generated in the sample space. Then, the sequences containing all people are kept, which guarantees that no person is left without a job. Finally, among the remaining sequences, those containing all jobs are kept, which guarantees that every job is done. Therefore, all possible assignments are in the final tube, and since the cost of each assignment was attached to its strand in the strand design phase, the strand with minimum length in the final tube is the minimum-cost assignment.

4.2. The time complexity of the proposed algorithm for solving the assignment problem with n jobs and n individuals is O(n^2).

Proof. As the complexity of every biological operation is O(1) ( 33 ), the time complexity of the algorithm can be calculated by adding the time complexities of all steps as follows:
T (Sample space): O(n) for the first loop and O(n^2) for the nested loop.
T (Persons): four biological operations in each step, which is O(1); over n steps this is O(n).
T (Jobs): four biological operations in each step, which is O(1); over n steps this is O(n).
T (Last step): two biological operations, which is O(1).
So, the time complexity of the algorithm is given by equation 1:

\[
T(n) = O(n) + O(n^2) + O(n) + O(n) + O(1) = O(n^2) \tag{1}
\]
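The step counts in the proof can be checked by counting operations directly, assuming (as the proof does) that each biological operation costs one unit. The sketch below follows the loop structure of the pseudocode in Section 3.4; the function name count_operations and the closed form 4n^2 + 9n - 1 are derived here for illustration and do not appear in the original text.

```python
def count_operations(n):
    """Count unit-cost biological operations for n people and n jobs."""
    ops = 0
    for _ in range(1, n + 1):        # first loop: Merge, Discard
        ops += 2
    for _ in range(2, n + 1):        # nested loop over people 2..n
        for _ in range(1, n + 1):    # PCR, AppendTail, Merge, Discard
            ops += 4
        ops += 3                     # Discard(T), Copy(Temp, T), Discard(Temp)
    for _ in range(1, n + 1):        # people filter: 4 operations per step
        ops += 4
    for _ in range(1, n + 1):        # jobs filter: 4 operations per step
        ops += 4
    ops += 2                         # SelectMin, Read
    return ops

for n in (3, 4, 10):
    print(n, count_operations(n), 4 * n * n + 9 * n - 1)
```

The quadratic term comes entirely from the nested loop that builds the sample space, which matches the O(n^2) bound stated above.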

5. Discussion

In the proposed method, using adenine, thymine, and cytosine, we produced as many distinct 10-mer sequences as there are people and jobs in the assignment problem. Then, by applying the polymerase chain reaction to them, we produced n^2 sequences, one for each pairing of the n persons with the n jobs. Finally, by appending to each sequence a number of guanines equal to the cost of assigning that job to that person, we obtained the desired sequences. Wang et al. ( 31 ), Kang et al. ( 34 ), Shu et al. ( 35 ), Ebrahim et al. ( 36 ), Tsaftaris et al. ( 37 ), and Rashid et al. ( 38 ) all used the same model to solve the assignment problem, using n^n distinct sequences with a time complexity of O(n^2). The proposed algorithm greatly reduces the initial cost by using just 2n initial sequences, compared with the n^n initial sequences used by the other methods, and solves the problem with the same time complexity.

6. Conclusion

The ultra-efficient parallelism of DNA computing methods is considerable when contrasted with the obvious limitations of electronic computers in storage, speed, intelligence, and miniaturization. Although some problems in biological technology still need further study, it is still possible to solve many NP-hard problems in linear time via DNA computing. In this article, we presented a DNA computing model with biological operations based on the Adleman-Lipton model that solves the assignment problem with a time complexity of O(n^2). It requires buying just 2n 10-mer sequences and making the other needed sequences in the lab, whereas previous methods need many different sequences. We hope that, in the near future, molecular computers will become usable in place of electronic computers, making DNA computing solutions applicable.

1.  Molecular computation by DNA hairpin formation.

Authors:  K Sakamoto; H Gouzu; K Komiya; D Kiga; S Yokoyama; T Yokomori; M Hagiya
Journal:  Science       Date:  2000-05-19       Impact factor: 47.728

2.  A sticker-based model for DNA computation.

Authors:  S Roweis; E Winfree; R Burgoyne; N V Chelyapov; M F Goodman; P W Rothemund; L M Adleman
Journal:  J Comput Biol       Date:  1998       Impact factor: 1.479

3.  A surface-based approach to DNA computation.

Authors:  L M Smith; R M Corn; A E Condon; M G Lagally; A G Frutos; Q Liu; A J Thiel
Journal:  J Comput Biol       Date:  1998       Impact factor: 1.479

4.  DNA-based computing of strategic assignment problems.

Authors:  Jian-Jun Shu; Qi-Wen Wang; Kian-Yan Yong
Journal:  Phys Rev Lett       Date:  2011-05-03       Impact factor: 9.161

5.  DNA solution of the maximal clique problem.

Authors:  Q Ouyang; P D Kaplan; S Liu; A Libchaber
Journal:  Science       Date:  1997-10-17       Impact factor: 47.728

6.  A parallel algorithm for solving the n-queens problem based on inspired computational model.

Authors:  Zhaocai Wang; Dongmei Huang; Jian Tan; Taigang Liu; Kai Zhao; Lei Li
Journal:  Biosystems       Date:  2015-03-27       Impact factor: 1.973

7.  DNA Fountain enables a robust and efficient storage architecture.

Authors:  Yaniv Erlich; Dina Zielinski
Journal:  Science       Date:  2017-03-03       Impact factor: 47.728

8.  DNA solution of hard computational problems.

Authors:  R J Lipton
Journal:  Science       Date:  1995-04-28       Impact factor: 47.728

9.  Molecular computation of solutions to combinatorial problems.

Authors:  L M Adleman
Journal:  Science       Date:  1994-11-11       Impact factor: 47.728

10.  A Parallel Biological Optimization Algorithm to Solve the Unbalanced Assignment Problem Based on DNA Molecular Computing.

Authors:  Zhaocai Wang; Jun Pu; Liling Cao; Jian Tan
Journal:  Int J Mol Sci       Date:  2015-10-23       Impact factor: 5.923
