Shifted Hamming distance: a fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping.

Hongyi Xin, John Greth, John Emmons, Gennady Pekhimenko, Carl Kingsford, Can Alkan, Onur Mutlu.

Abstract

MOTIVATION: Calculating the edit-distance (i.e. minimum number of insertions, deletions and substitutions) between short DNA sequences is the primary task performed by seed-and-extend based mappers, which compare billions of sequences. In practice, only sequence pairs with a small edit-distance provide useful scientific data. However, the majority of sequence pairs analyzed by seed-and-extend based mappers differ by significantly more errors than what is typically allowed. Such error-abundant sequence pairs needlessly waste resources and severely hinder the performance of read mappers. Therefore, it is crucial to develop a fast and accurate filter that can rapidly and efficiently detect error-abundant string pairs and remove them from consideration before more computationally expensive methods are used.
RESULTS: We present a simple and efficient algorithm, Shifted Hamming Distance (SHD), which accelerates the alignment verification procedure in read mapping by quickly filtering out error-abundant sequence pairs using bit-parallel and SIMD-parallel operations. SHD only filters string pairs that contain more errors than a user-defined threshold, making it fully comprehensive. It also maintains high accuracy with a moderate error threshold (up to 5% of the string length) while achieving a 3-fold speedup over the best previous algorithm (Gene Myers's bit-vector algorithm). SHD is compatible with all mappers that perform sequence alignment for verification.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
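
The abstract does not spell out the filtering steps, so the following is only a minimal sketch of the general idea of a shifted Hamming mask filter, not the authors' SHD implementation. It assumes that a position counts as an error only if it mismatches under every shift from -e to +e, and it uses Python integers as bit vectors in place of SIMD registers. The function names `mismatch_mask` and `passes_filter` are hypothetical. Under these assumptions, a pair whose edit distance is at most e is never rejected, because every base matched in an optimal alignment sits within e positions of its partner.

```python
def mismatch_mask(read: str, ref: str, shift: int) -> int:
    """Bit i is 1 when read[i] differs from ref[i + shift] or falls outside ref."""
    mask = 0
    for i, base in enumerate(read):
        j = i + shift
        if not (0 <= j < len(ref)) or base != ref[j]:
            mask |= 1 << i
    return mask


def passes_filter(read: str, ref: str, max_errors: int) -> bool:
    """Return False only when the pair provably needs more than max_errors edits."""
    combined = (1 << len(read)) - 1  # start with every position marked as a mismatch
    for shift in range(-max_errors, max_errors + 1):
        # Keep only the bits that mismatch under every shift tried so far.
        combined &= mismatch_mask(read, ref, shift)
    return bin(combined).count("1") <= max_errors


if __name__ == "__main__":
    print(passes_filter("ACGTACGT", "ACGAACGT", 1))  # one substitution, threshold 1
    print(passes_filter("AAAAAAAA", "CCCCCCCC", 2))  # every base differs, threshold 2
```

A pair that passes this check may still exceed the threshold, so it would still need verification by a full alignment algorithm; the filter only discards pairs that cannot be within the threshold.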

Year:  2015        PMID: 25577434      PMCID: PMC4426831          DOI: 10.1093/bioinformatics/btu856

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  7 in total

Review 1.  Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions.

Authors:  Damla Senol Cali; Jeremie S Kim; Saugata Ghose; Can Alkan; Onur Mutlu
Journal:  Brief Bioinform       Date:  2019-07-19       Impact factor: 11.622

2.  Shouji: a fast and efficient pre-alignment filter for sequence alignment.

Authors:  Mohammed Alser; Hasan Hassan; Akash Kumar; Onur Mutlu; Can Alkan
Journal:  Bioinformatics       Date:  2019-11-01       Impact factor: 6.937

3.  CARE 2.0: reducing false-positive sequencing error corrections using machine learning.

Authors:  Felix Kallenborn; Julian Cascitti; Bertil Schmidt
Journal:  BMC Bioinformatics       Date:  2022-06-13       Impact factor: 3.307

4.  GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping.

Authors:  Mohammed Alser; Hasan Hassan; Hongyi Xin; Oguz Ergin; Onur Mutlu; Can Alkan
Journal:  Bioinformatics       Date:  2017-11-01       Impact factor: 6.937

Review 5.  From molecules to genomic variations: Accelerating genome analysis via intelligent algorithms and architectures.

Authors:  Mohammed Alser; Joel Lindegger; Can Firtina; Nour Almadhoun; Haiyu Mao; Gagandeep Singh; Juan Gomez-Luna; Onur Mutlu
Journal:  Comput Struct Biotechnol J       Date:  2022-08-18       Impact factor: 6.155

6.  Gene Position Index Mutation Detection Algorithm Based on Feedback Fast Learning Neural Network.

Authors:  Zhike Zuo; Chao Tang; Yu Xu; Ying Wang; Yongzhong Wu; Jun Qi; Xiaolong Shi
Journal:  Comput Intell Neurosci       Date:  2021-07-06

7.  GRIM-Filter: Fast seed location filtering in DNA read mapping using processing-in-memory technologies.

Authors:  Jeremie S Kim; Damla Senol Cali; Hongyi Xin; Donghyuk Lee; Saugata Ghose; Mohammed Alser; Hasan Hassan; Oguz Ergin; Can Alkan; Onur Mutlu
Journal:  BMC Genomics       Date:  2018-05-09       Impact factor: 3.969
