Local Alignment of DNA Sequence Based on Deep Reinforcement Learning.

Yong-Joon Song, Dong-Ho Cho.

Abstract

Goal: Over the decades, sequence alignment algorithms have improved significantly in aspects such as complexity and accuracy. However, human-designed algorithms are explicitly limited in how complete their development can be. This paper introduces a novel local alignment method that obtains optimal sequence alignment based on reinforcement learning.
Methods: DQNalign is an existing algorithm that learns and performs sequence alignment through deep reinforcement learning. This paper proposes a DQN x-drop algorithm that performs local alignment without human intervention by combining the x-drop algorithm with DQNalign. The proposed algorithm performs local alignment by repeatedly observing subsequences and selecting the next alignment direction until the x-drop algorithm terminates DQNalign. Its advantage over conventional local alignment algorithms is linear computational complexity.
Results: For a fair comparison, this paper compares the alignment performance (coverage and identity) and the complexity of the proposed DQN x-drop algorithm and the conventional greedy x-drop algorithm. First, we demonstrate the proposed algorithm's superiority by comparing the two algorithms' computational complexity through numerical analysis. We then tested the alignment performance on actual HEV and E. coli sequence datasets. The proposed method shows identity and coverage performance comparable to the conventional alignment method while having linear complexity in the [Formula: see text] parameter.
Conclusions: This study confirmed the feasibility of a new local alignment algorithm that minimizes computational complexity without human intervention.
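The loop described in the Methods paragraph (observe a window of each sequence, pick the next alignment direction, stop when the x-drop criterion fires) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the scoring values, the window size, the function names, and in particular `lookahead_policy`, a hand-written stand-in for the trained DQN agent, are all hypothetical and do not come from the abstract.

```python
from typing import Callable, Tuple

# Hypothetical scoring values; the abstract does not specify any.
MATCH, MISMATCH, GAP = 1, -1, -2

# The three directions the agent can choose from at each step.
FORWARD, INSERT, DELETE = 0, 1, 2


def lookahead_policy(win_a: str, win_b: str) -> int:
    """Stand-in for the trained DQN agent (assumption, not the paper's model).

    It looks only at the first two bases of each observed window.
    """
    if win_a[0] == win_b[0]:
        return FORWARD
    if len(win_a) > 1 and win_a[1] == win_b[0]:
        return DELETE  # skip one base of a
    if len(win_b) > 1 and win_a[0] == win_b[1]:
        return INSERT  # skip one base of b
    return FORWARD     # accept the mismatch


def x_drop_extend(
    a: str,
    b: str,
    policy: Callable[[str, str], int] = lookahead_policy,
    x: int = 10,
    window: int = 8,
) -> Tuple[int, int, int]:
    """Extend an alignment from the start of a and b along a single path.

    Returns (best_score, end_in_a, end_in_b). Extension stops once the
    running score falls more than x below the best score seen so far.
    """
    i = j = 0
    score = best = 0
    best_i = best_j = 0
    while i < len(a) and j < len(b):
        # Observe fixed-size windows and let the policy pick a direction.
        action = policy(a[i:i + window], b[j:j + window])
        if action == FORWARD:
            score += MATCH if a[i] == b[j] else MISMATCH
            i += 1
            j += 1
        elif action == DELETE:
            score += GAP
            i += 1
        else:  # INSERT
            score += GAP
            j += 1
        if score > best:
            best, best_i, best_j = score, i, j
        elif best - score > x:
            break  # x-drop termination
    return best, best_i, best_j
```

Every iteration consumes at least one base from one of the sequences, so the loop in this sketch runs at most `len(a) + len(b)` times. That single-path behaviour is what gives a policy-driven extension of this kind its linear cost, in contrast to extensions that explore many cells of the dynamic-programming matrix.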

Keywords:  Deep reinforcement learning; local alignment; meta learning; sequence alignment; sequence comparison

Year:  2021        PMID: 35402982      PMCID: PMC8975175          DOI: 10.1109/OJEMB.2021.3076156

Source DB:  PubMed          Journal:  IEEE Open J Eng Med Biol        ISSN: 2644-1276

