
Shouji: a fast and efficient pre-alignment filter for sequence alignment.

Mohammed Alser, Hasan Hassan, Akash Kumar, Onur Mutlu, Can Alkan.

Abstract

MOTIVATION: The ability to generate massive amounts of sequencing data continues to overwhelm the processing capability of existing algorithms and compute infrastructures. In this work, we explore the use of hardware/software co-design and hardware acceleration to significantly reduce the execution time of short sequence alignment, a crucial step in analyzing sequenced genomes. We introduce Shouji, a highly parallel and accurate pre-alignment filter that remarkably reduces the need for computationally costly dynamic programming algorithms. The first key idea of our proposed pre-alignment filter is to provide high filtering accuracy by correctly detecting all common subsequences shared between two given sequences. The second key idea is to design a hardware accelerator that adopts modern field-programmable gate array (FPGA) architectures to further boost the performance of our algorithm.
RESULTS: Shouji significantly improves the accuracy of pre-alignment filtering by up to two orders of magnitude compared to the state-of-the-art pre-alignment filters, GateKeeper and SHD. Our FPGA-based accelerator is up to three orders of magnitude faster than the equivalent CPU implementation of Shouji. Using a single FPGA chip, we benchmark the benefits of integrating Shouji with five state-of-the-art sequence aligners, designed for different computing platforms. The addition of Shouji as a pre-alignment step reduces the execution time of the five state-of-the-art sequence aligners by up to 18.8×. Shouji can be adapted for any bioinformatics pipeline that performs sequence alignment for verification. Unlike most existing methods that aim to accelerate sequence alignment, Shouji does not sacrifice any of the aligner capabilities, as it does not modify or replace the alignment step.
AVAILABILITY AND IMPLEMENTATION: https://github.com/CMU-SAFARI/Shouji.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
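
The filter-then-verify structure described in the abstract can be illustrated with a minimal sketch. This is not Shouji's algorithm: the filter below is a much simpler stand-in (a character-count lower bound on edit distance), chosen only because it is easy to verify by hand. The function names `base_count_lower_bound`, `edit_distance`, and `filter_then_verify`, as well as the example pairs and threshold, are hypothetical and not taken from the paper or its repository.

```python
from collections import Counter


def base_count_lower_bound(read: str, ref: str) -> int:
    """Cheap lower bound on edit distance from per-character counts.

    Stand-in filter, NOT Shouji's method. Each edit changes the surplus
    and the deficit of character counts by at most one, so the true edit
    distance is at least max(surplus, deficit).
    """
    a, b = Counter(read), Counter(ref)
    surplus = sum((a - b).values())  # characters over-represented in read
    deficit = sum((b - a).values())  # characters over-represented in ref
    return max(surplus, deficit)


def edit_distance(a: str, b: str) -> int:
    """Plain dynamic-programming (Levenshtein) edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def filter_then_verify(pairs, threshold):
    """Run the cheap filter first; call the DP step only on survivors."""
    accepted, dp_calls = [], 0
    for read, ref in pairs:
        if base_count_lower_bound(read, ref) > threshold:
            continue  # provably above the threshold, skip the costly step
        dp_calls += 1
        if edit_distance(read, ref) <= threshold:
            accepted.append((read, ref))
    return accepted, dp_calls


# Hypothetical example pairs with an edit-distance threshold of 2.
pairs = [
    ("ACGTACGT", "ACGTACGT"),  # identical
    ("ACGTACGT", "ACGAACGT"),  # one substitution
    ("AAAAAAAA", "CCCCCCCC"),  # rejected by the filter
    ("ACGTACGT", "TGCATGCA"),  # passes the filter, rejected by the DP step
]
accepted, dp_calls = filter_then_verify(pairs, threshold=2)
print(accepted, dp_calls)
```

Because the filter only rejects pairs whose lower bound already exceeds the threshold, it never discards a pair the verification step would have accepted, and the alignment step itself is left unmodified. The last pair shows the cost of a weak filter: it passes the count check and still consumes a dynamic-programming call.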

Year:  2019        PMID: 30923804      PMCID: PMC6821304          DOI: 10.1093/bioinformatics/btz234

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


