SENSE: Siamese neural network for sequence embedding and alignment-free comparison.

Wei Zheng, Le Yang, Robert J Genco, Jean Wactawski-Wende, Michael Buck, Yijun Sun.

Abstract

MOTIVATION: Sequence analysis is arguably a foundation of modern biology. Classic approaches to sequence analysis are based on sequence alignment, which is limited when dealing with large-scale sequence data. A dozen alignment-free approaches have been developed to provide computationally efficient alternatives to alignment-based approaches. However, existing methods define sequence similarity based on various heuristics and can only provide rough approximations to alignment distances.
RESULTS: In this article, we developed a new approach, referred to as SENSE (SiamEse Neural network for Sequence Embedding), for efficient and accurate alignment-free sequence comparison. The basic idea is to use a deep neural network to learn an explicit embedding function based on a small training dataset to project sequences into an embedding space so that the mean square error between alignment distances and pairwise distances defined in the embedding space is minimized. To the best of our knowledge, this is the first attempt to use deep learning for alignment-free sequence analysis. A large-scale experiment was performed that demonstrated that our method significantly outperformed the state-of-the-art alignment-free methods in terms of both efficiency and accuracy.
AVAILABILITY AND IMPLEMENTATION: Open-source software for the proposed method is developed and freely available at https://www.acsu.buffalo.edu/~yijunsun/lab/SENSE.html.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
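
The objective described under RESULTS (one shared embedding function applied to both sequences of a pair, trained so that distances in the embedding space match alignment distances in the mean-squared-error sense) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the network, so a single linear layer over k-mer frequencies stands in for the deep network, and a Hamming fraction on equal-length toy sequences stands in for real alignment distances. All function names, parameters, and sequences below are hypothetical.

```python
import itertools
import math
import random

ALPHABET = "ACGT"
K = 2  # k-mer length for the toy featurization (an assumption, not from the paper)
KMERS = ["".join(p) for p in itertools.product(ALPHABET, repeat=K)]


def kmer_profile(seq):
    """Normalized k-mer frequency vector; a stand-in for the network input."""
    counts = {kmer: 0 for kmer in KMERS}
    for i in range(len(seq) - K + 1):
        kmer = seq[i:i + K]
        if kmer in counts:
            counts[kmer] += 1
    total = sum(counts.values()) or 1
    return [counts[kmer] / total for kmer in KMERS]


def embed(weights, seq):
    """Shared embedding function: one linear layer (stand-in for a deep network)."""
    x = kmer_profile(seq)
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def embedding_distance(weights, seq_a, seq_b):
    """Euclidean distance between two embeddings computed with the same weights."""
    a, b = embed(weights, seq_a), embed(weights, seq_b)
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def mse_loss(weights, pairs):
    """Mean squared error between embedding distances and target distances."""
    errors = [(embedding_distance(weights, a, b) - d) ** 2 for a, b, d in pairs]
    return sum(errors) / len(errors)


def gradient(weights, pairs):
    """Analytical gradient of mse_loss with respect to the linear-layer weights."""
    grad = [[0.0] * len(KMERS) for _ in weights]
    for a, b, target in pairs:
        delta = [p - q for p, q in zip(kmer_profile(a), kmer_profile(b))]
        diff = [sum(w * d for w, d in zip(row, delta)) for row in weights]
        dist = math.sqrt(sum(v * v for v in diff))
        if dist == 0.0:
            continue  # the distance is not differentiable at zero; skip this pair
        scale = 2.0 * (dist - target) / (dist * len(pairs))
        for r in range(len(weights)):
            for c in range(len(delta)):
                grad[r][c] += scale * diff[r] * delta[c]
    return grad


def train(weights, pairs, lr=1.0, steps=200):
    """Gradient descent that halves the step size whenever a step fails to help."""
    loss = mse_loss(weights, pairs)
    for _ in range(steps):
        grad = gradient(weights, pairs)
        candidate = [[w - lr * g for w, g in zip(wr, gr)]
                     for wr, gr in zip(weights, grad)]
        candidate_loss = mse_loss(candidate, pairs)
        if candidate_loss < loss:
            weights, loss = candidate, candidate_loss
        else:
            lr *= 0.5
    return weights


def init_weights(dim, seed=0):
    """Small random weights for a dim-dimensional embedding."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in KMERS] for _ in range(dim)]


def hamming_fraction(a, b):
    """Crude stand-in for an alignment distance on equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Toy usage: made-up sequences, targets from the stand-in distance above.
toy_seqs = ["ACGTACGTAC", "ACGTACGTTT", "TTTTACGGGG", "GGGGCCCCAA"]
toy_pairs = [(a, b, hamming_fraction(a, b))
             for a, b in itertools.combinations(toy_seqs, 2)]
initial = init_weights(dim=4)
trained = train(initial, toy_pairs)
print("loss before:", mse_loss(initial, toy_pairs))
print("loss after: ", mse_loss(trained, toy_pairs))
```

Once such a function is trained, comparing two new sequences needs only two embedding evaluations and one vector distance rather than an alignment, which is the efficiency argument the abstract makes.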

Year:  2019        PMID: 30346493     DOI: 10.1093/bioinformatics/bty887

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  4 in total

1.  Deep-learning approach to identifying cancer subtypes using high-dimensional genomic data.

Authors:  Runpu Chen; Le Yang; Steve Goodison; Yijun Sun
Journal:  Bioinformatics       Date:  2020-03-01       Impact factor: 6.937

2.  The Buffalo OsteoPerio Studies: Summary of our findings and the unique contributions of Robert J. Genco, DDS, PhD.

Authors:  Jean Wactawski-Wende; Michael J LaMonte; Kathy Hovey; Hailey Banack
Journal:  Curr Oral Health Rep       Date:  2020-01-27

3.  DeepNOG: Fast and accurate protein orthologous group assignment.

Authors:  Roman Feldbauer; Lukas Gosch; Lukas Lüftinger; Patrick Hyden; Arthur Flexer; Thomas Rattei
Journal:  Bioinformatics       Date:  2020-12-26       Impact factor: 6.937

4.  AutoCoV: tracking the early spread of COVID-19 in terms of the spatial and temporal patterns from embedding space by K-mer based deep learning.

Authors:  Inyoung Sung; Sangseon Lee; Minwoo Pak; Yunyol Shin; Sun Kim
Journal:  BMC Bioinformatics       Date:  2022-04-25       Impact factor: 3.307
