SneakySnake: A Fast and Accurate Universal Genome Pre-Alignment Filter for CPUs, GPUs, and FPGAs.

Mohammed Alser, Taha Shahroodi, Juan Gómez-Luna, Can Alkan, Onur Mutlu.

Abstract

MOTIVATION: We introduce SneakySnake, a highly parallel and highly accurate pre-alignment filter that remarkably reduces the need for computationally costly sequence alignment. The key idea of SneakySnake is to reduce the approximate string matching (ASM) problem to the single net routing (SNR) problem in VLSI chip layout. In the SNR problem, we are interested in finding the optimal path that connects two terminals with the least routing cost on a special grid layout that contains obstacles. The SneakySnake algorithm quickly solves the SNR problem and uses the optimal path it finds to decide whether sequence alignment is necessary. Reducing the ASM problem to the SNR problem also makes SneakySnake efficient to implement on CPUs, GPUs, and FPGAs.
RESULTS: SneakySnake significantly improves the accuracy of pre-alignment filtering by up to four orders of magnitude compared to the state-of-the-art pre-alignment filters, Shouji, GateKeeper, and SHD. For short sequences, SneakySnake accelerates Edlib (state-of-the-art implementation of Myers's bit-vector algorithm) and Parasail (state-of-the-art sequence aligner with a configurable scoring function), by up to 37.7× and 43.9× (>12× on average), respectively, with its CPU implementation, and by up to 413× and 689× (>400× on average), respectively, with FPGA and GPU acceleration. For long sequences, the CPU implementation of SneakySnake accelerates Parasail and KSW2 (sequence aligner of minimap2) by up to 979× (276.9× on average) and 91.7× (31.7× on average), respectively. As SneakySnake does not replace sequence alignment, users can still obtain all capabilities (e.g., configurable scoring functions) of the aligner of their choice, unlike existing acceleration efforts that sacrifice some aligner capabilities.
AVAILABILITY: https://github.com/CMU-SAFARI/SneakySnake.
SUPPLEMENTARY INFORMATION: Supplementary data is available at Bioinformatics online.
© The Author(s) (2020). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
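
The pre-alignment filtering idea in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation (see the repository linked above for that); it is a simplified greedy filter written under several assumptions: both sequences have similar length, each diagonal shift between -E and +E is treated as one row of a grid, mismatches act as obstacles, and the filter counts how many obstacles a left-to-right path must cross. The names `passes_filter` and `max_edits` are hypothetical and chosen only for this illustration.

```python
def passes_filter(ref: str, query: str, max_edits: int) -> bool:
    """Greedy obstacle-counting filter (illustrative sketch only).

    Returns True if the pair might be within max_edits edits, meaning a
    real aligner should still be run; returns False if the pair can be
    skipped under this sketch's obstacle count.
    """
    m = len(query)
    col = 0          # current column (position in the query)
    obstacles = 0    # obstacles crossed so far (estimated edits)

    while col < m:
        # Find the longest obstacle-free run starting at this column,
        # over all diagonal shifts in [-max_edits, +max_edits].
        best = 0
        for shift in range(-max_edits, max_edits + 1):
            run = 0
            while col + run < m:
                r = col + run + shift
                # Out-of-range cells and mismatches both count as obstacles.
                if r < 0 or r >= len(ref) or ref[r] != query[col + run]:
                    break
                run += 1
            best = max(best, run)

        col += best              # travel along the longest free run
        if col >= m:
            break                # reached the far side of the grid
        obstacles += 1           # forced to cross one obstacle
        if obstacles > max_edits:
            return False         # too many obstacles: skip alignment
        col += 1                 # step past the obstacle column

    return True


# Usage: one substitution passes with a threshold of 1 but not 0.
print(passes_filter("ACGTACGT", "ACGAACGT", 1))  # True
print(passes_filter("ACGTACGT", "ACGAACGT", 0))  # False
```

Pairs that pass would still be handed to the aligner of choice (for example Edlib, Parasail, or KSW2, as named in the abstract), which is why a filter of this kind does not remove any aligner capability.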

Year:  2020        PMID: 33315064     DOI: 10.1093/bioinformatics/btaa1015

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  4 in total

1.  Fast and memory-efficient mapping of short bisulfite sequencing reads using a two-letter alphabet.

Authors:  Guilherme de Sena Brandine; Andrew D Smith
Journal:  NAR Genom Bioinform       Date:  2021-12-22

2.  Technology dictates algorithms: recent developments in read alignment. (Review)

Authors:  Mohammed Alser; Jeremy Rotman; Onur Mutlu; Serghei Mangul; Dhrithi Deshpande; Kodi Taraszka; Huwenbo Shi; Pelin Icer Baykal; Harry Taegyun Yang; Victor Xue; Sergey Knyazev; Benjamin D Singer; Brunilda Balliu; David Koslicki; Pavel Skums; Alex Zelikovsky; Can Alkan
Journal:  Genome Biol       Date:  2021-08-26       Impact factor: 13.583

3.  From molecules to genomic variations: Accelerating genome analysis via intelligent algorithms and architectures. (Review)

Authors:  Mohammed Alser; Joel Lindegger; Can Firtina; Nour Almadhoun; Haiyu Mao; Gagandeep Singh; Juan Gomez-Luna; Onur Mutlu
Journal:  Comput Struct Biotechnol J       Date:  2022-08-18       Impact factor: 6.155

4.  Nanopore Base Calling on the Edge.

Authors:  Peter Perešíni; Vladimír Boža; Broňa Brejová; Tomáš Vinař
Journal:  Bioinformatics       Date:  2021-07-26       Impact factor: 6.937
