Finding Maximal Exact Matches Using the r-Index.

Massimiliano Rossi, Marco Oliva, Paola Bonizzoni, Ben Langmead, Travis Gagie, Christina Boucher.

Abstract

Efficiently finding maximal exact matches (MEMs) between a sequence read and a database of genomes is a key first step in read alignment. But until recently, it was unknown how to build a data structure in O(r) space that supports efficient MEM finding, where r is the number of runs in the Burrows-Wheeler Transform. In 2021, Rossi et al. showed how to build a small auxiliary data structure called thresholds in addition to the r-index in O(r) space. This addition enables efficient MEM finding using the r-index. In this article, we present the tool that implements this solution, which we call MONI. Namely, we give a high-level view of the main components of the data structure and show how the source code can be downloaded, compiled, and used to find MEMs between a set of sequence reads and a set of genomes.
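
To make the two central notions in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not MONI's method: MONI relies on the r-index and thresholds, whereas this sketch uses plain substring search and sorts all rotations, so it is only practical for toy inputs. The function names `bwt_runs` and `naive_mems`, the `$` sentinel, and the `min_len` parameter are assumptions made for illustration.

```python
def bwt_runs(text: str) -> tuple[str, int]:
    """Return the BWT of text (with a '$' sentinel appended) and its run count r.

    Assumes '$' does not occur in text and sorts before every other symbol.
    """
    s = text + "$"
    # Naive construction: sort all rotations and take the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    bwt = "".join(rot[-1] for rot in rotations)
    # r = number of maximal runs of equal characters in the BWT.
    r = 1 + sum(1 for a, b in zip(bwt, bwt[1:]) if a != b)
    return bwt, r


def naive_mems(read: str, ref: str, min_len: int = 1) -> list[tuple[int, int]]:
    """Return MEMs of read against ref as (start_in_read, length) pairs.

    A MEM is a substring of read that occurs in ref and cannot be extended
    by one character to the left or right and still occur in ref.
    """
    mems = []
    n = len(read)
    prev_end = 0  # furthest (exclusive) end reached by a match from an earlier start
    for i in range(n):
        # Extend to the right as far as possible: read[i:j] is the longest
        # prefix of read[i:] that occurs in ref.
        j = i
        while j < n and read[i:j + 1] in ref:
            j += 1
        # The match is left-maximal only if it ends beyond every match that
        # started earlier; otherwise it is contained in a longer match.
        if j > i and j > prev_end and j - i >= min_len:
            mems.append((i, j - i))
        prev_end = max(prev_end, j)
    return mems
```

For example, `bwt_runs("banana")` gives the BWT `annb$aa` with r = 5, and `naive_mems("TTACGAT", "GATTACA")` reports two MEMs, `TTAC` at read position 0 and `GAT` at read position 4. On collections of highly similar genomes r tends to grow much more slowly than the total text length, which is why an O(r)-space index matters.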

Keywords:  MEM finding; r-index; run-length-encoded Burrows–Wheeler transform; thresholds

Year:  2022        PMID: 35041518      PMCID: PMC8902461          DOI: 10.1089/cmb.2021.0445

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  12 in total

1.  Minimap2: pairwise alignment for nucleotide sequences.

Authors:  Heng Li
Journal:  Bioinformatics       Date:  2018-09-15       Impact factor: 6.937

2.  The Sequence Alignment/Map format and SAMtools.

Authors:  Heng Li; Bob Handsaker; Alec Wysoker; Tim Fennell; Jue Ruan; Nils Homer; Gabor Marth; Goncalo Abecasis; Richard Durbin
Journal:  Bioinformatics       Date:  2009-06-08       Impact factor: 6.937

3.  Ultrafast and memory-efficient alignment of short DNA sequences to the human genome.

Authors:  Ben Langmead; Cole Trapnell; Mihai Pop; Steven L Salzberg
Journal:  Genome Biol       Date:  2009-03-04       Impact factor: 13.583

4.  MONI: A Pangenomic Index for Finding Maximal Exact Matches.

Authors:  Massimiliano Rossi; Marco Oliva; Ben Langmead; Travis Gagie; Christina Boucher
Journal:  J Comput Biol       Date:  2022-01-17       Impact factor: 1.479

5.  Improving PacBio long read accuracy by short read alignment.

Authors:  Kin Fai Au; Jason G Underwood; Lawrence Lee; Wing Hung Wong
Journal:  PLoS One       Date:  2012-10-04       Impact factor: 3.240

6.  A global reference for human genetic variation.

Authors:  Adam Auton; Lisa D Brooks; Richard M Durbin; Erik P Garrison; Hyun Min Kang; Jan O Korbel; Jonathan L Marchini; Shane McCarthy; Gil A McVean; Gonçalo R Abecasis
Journal:  Nature       Date:  2015-10-01       Impact factor: 49.962

7.  Introducing difference recurrence relations for faster semi-global alignment of long sequences.

Authors:  Hajime Suzuki; Masahiro Kasahara
Journal:  BMC Bioinformatics       Date:  2018-02-19       Impact factor: 3.169

8.  Prefix-free parsing for building big BWTs.

Authors:  Christina Boucher; Travis Gagie; Alan Kuhnle; Ben Langmead; Giovanni Manzini; Taher Mun
Journal:  Algorithms Mol Biol       Date:  2019-05-24       Impact factor: 1.405

9.  Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2009-05-18       Impact factor: 6.937

10.  Matching Reads to Many Genomes with the r-Index.

Authors:  Taher Mun; Alan Kuhnle; Christina Boucher; Travis Gagie; Ben Langmead; Giovanni Manzini
Journal:  J Comput Biol       Date:  2020-03-16       Impact factor: 1.479
