Romain Menegaux1,2, Jean-Philippe Vert1,2,3,4.
Abstract
We propose a new model for fast classification of DNA sequences output by next-generation sequencing machines. The model, which we call fastDNA, embeds DNA sequences in a vector space by learning continuous low-dimensional representations of the k-mers they contain. We show on metagenomics benchmarks that it outperforms state-of-the-art methods in terms of accuracy and scalability.
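The embedding idea in the abstract can be illustrated with a minimal sketch. This is not the fastDNA implementation. It assumes that a read is represented by the mean of its k-mer vectors, which the abstract does not state, and it uses randomly initialized vectors where the real model would learn them. The names `kmers`, `init_embeddings`, and `embed`, and the values of `K` and `DIM`, are hypothetical and chosen only for illustration.

```python
import random
from itertools import product

K = 3    # k-mer length (illustrative value, not taken from the paper)
DIM = 4  # embedding dimension (illustrative value, not taken from the paper)


def kmers(seq, k=K):
    """Return all overlapping k-mers of a DNA sequence, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def init_embeddings(k=K, dim=DIM, seed=0):
    """Build a lookup table with one small random vector per possible k-mer.

    In a trained model these vectors would be learned; random values are
    used here only so that the sketch runs on its own.
    """
    rng = random.Random(seed)
    return {
        "".join(letters): [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        for letters in product("ACGT", repeat=k)
    }


def embed(seq, table, k=K):
    """Embed a read as the mean of its k-mer vectors (assumed aggregation)."""
    # k-mers containing symbols outside A, C, G, T (e.g. N) are skipped.
    vectors = [table[m] for m in kmers(seq, k) if m in table]
    if not vectors:
        raise ValueError("sequence contains no known k-mers")
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


table = init_embeddings()
vector = embed("ACGTACGT", table)
```

A classifier would then operate on `vector`. Under this sketch's assumptions, the embedding step costs time linear in the read length, since each k-mer needs only one table lookup.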
Keywords: classification; embedding; metagenomics; sequencing
Year: 2019 PMID: 30785347 DOI: 10.1089/cmb.2018.0174
Source DB: PubMed Journal: J Comput Biol ISSN: 1066-5277 Impact factor: 1.479