Mohammed Alser1, Hasan Hassan2,3, Hongyi Xin4, Oguz Ergin2, Onur Mutlu3, Can Alkan1. 1. Department of Computer Engineering, Bilkent University, Bilkent, Ankara 06800, Turkey. 2. TOBB University of Economics & Technology, Sogutozu, Ankara, Turkey. 3. Department of Computer Science, ETH Zürich, 8092 Zürich, Switzerland. 4. Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
Abstract
MOTIVATION: High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments -called short reads- that cause significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms. RESULTS: We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10. AVAILABILITY AND IMPLEMENTATION: https://github.com/BilkentCompGen/GateKeeper. CONTACT: mohammedalser@bilkent.edu.tr or onur.mutlu@inf.ethz.ch or calkan@cs.bilkent.edu.tr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
MOTIVATION: High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments -called short reads- that cause significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms. RESULTS: We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10. AVAILABILITY AND IMPLEMENTATION: https://github.com/BilkentCompGen/GateKeeper. CONTACT: mohammedalser@bilkent.edu.tr or onur.mutlu@inf.ethz.ch or calkan@cs.bilkent.edu.tr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Authors: Goncalo R Abecasis; Adam Auton; Lisa D Brooks; Mark A DePristo; Richard M Durbin; Robert E Handsaker; Hyun Min Kang; Gabor T Marth; Gil A McVean Journal: Nature Date: 2012-11-01 Impact factor: 49.962
Authors: Faraz Hach; Iman Sarrafi; Farhad Hormozdiari; Can Alkan; Evan E Eichler; S Cenk Sahinalp Journal: Nucleic Acids Res Date: 2014-05-08 Impact factor: 16.971
Authors: Mohammed Alser; Jeremy Rotman; Onur Mutlu; Serghei Mangul; Dhrithi Deshpande; Kodi Taraszka; Huwenbo Shi; Pelin Icer Baykal; Harry Taegyun Yang; Victor Xue; Sergey Knyazev; Benjamin D Singer; Brunilda Balliu; David Koslicki; Pavel Skums; Alex Zelikovsky; Can Alkan Journal: Genome Biol Date: 2021-08-26 Impact factor: 13.583
Authors: Evangelina Inacio Namburete; Anzaan Dippenaar; Emilyn Costa Conceição; Cinara Feliciano; Margarida Maria Passeri do Nascimento; Kamila Chagas Peronni; Wilson Araújo Silva; Josefo João Ferro; Lee H Harrison; Robin Mark Warren; Valdes Roberto Bollela Journal: Tuberculosis (Edinb) Date: 2020-01-29 Impact factor: 2.973