Toward perfect reads: self-correction of short reads via mapping on de Bruijn graphs.

Antoine Limasset, Jean-François Flot, Pierre Peterlongo.

Abstract

MOTIVATION: Short-read accuracy is important for downstream analyses such as genome assembly and hybrid long-read correction. Despite much work on short-read correction, present-day correctors either do not scale well to large datasets or treat reads as mere sequences of k-mers, without taking into account their full-length sequence information.
RESULTS: We propose a new method to correct short reads using de Bruijn graphs and implement it as a tool called Bcool. As a first step, Bcool constructs a compacted de Bruijn graph from the reads. This graph is filtered on the basis of k-mer abundance then of unitig abundance, thereby removing most sequencing errors. The cleaned graph is then used as a reference on which the reads are mapped to correct them. We show that this approach yields more accurate reads than k-mer-spectrum correctors while being scalable to human-size genomic datasets and beyond.
AVAILABILITY AND IMPLEMENTATION: The implementation is open source, available at http://github.com/Malfoy/BCOOL under the Affero GPL license and as a Bioconda package.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
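
ILLUSTRATION: The first steps described under RESULTS (building a de Bruijn graph from the reads, filtering it by k-mer abundance, and compacting it into unitigs) can be sketched on toy data as follows. This is a minimal sketch and not the Bcool implementation: the function names and the abundance threshold are hypothetical, reverse complements are ignored, and the unitig-abundance filter and the read-mapping step are left out.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in the reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def solid_kmers(counts, min_abundance):
    """Keep k-mers seen at least min_abundance times (hypothetical threshold)."""
    return {kmer for kmer, c in counts.items() if c >= min_abundance}


def successors(kmer, kmers):
    """K-mers of the set that overlap the (k-1)-suffix of kmer."""
    return [kmer[1:] + b for b in "ACGT" if kmer[1:] + b in kmers]


def predecessors(kmer, kmers):
    """K-mers of the set that overlap the (k-1)-prefix of kmer."""
    return [b + kmer[:-1] for b in "ACGT" if b + kmer[:-1] in kmers]


def compact(kmers):
    """Merge non-branching paths of the de Bruijn graph into unitigs."""
    unitigs, visited = [], set()
    for seed in sorted(kmers):
        if seed in visited:
            continue
        visited.add(seed)
        path, cur = seed, seed
        # Extend to the right while the path does not branch.
        while True:
            succ = successors(cur, kmers)
            if (len(succ) != 1 or succ[0] in visited
                    or len(predecessors(succ[0], kmers)) != 1):
                break
            cur = succ[0]
            visited.add(cur)
            path += cur[-1]
        cur = seed
        # Extend to the left under the same condition.
        while True:
            pred = predecessors(cur, kmers)
            if (len(pred) != 1 or pred[0] in visited
                    or len(successors(pred[0], kmers)) != 1):
                break
            cur = pred[0]
            visited.add(cur)
            path = cur[0] + path
        unitigs.append(path)
    return unitigs


if __name__ == "__main__":
    # Three error-free copies of a toy read plus one copy with a substitution.
    reads = ["ACGTTGCA"] * 3 + ["ACGTAGCA"]
    counts = count_kmers(reads, 4)
    print(compact(solid_kmers(counts, 2)))
```

In this toy example the k-mers introduced by the single erroneous read occur only once, so an abundance threshold of 2 removes them and the remaining k-mers compact into a single unitig. Real datasets would require canonical k-mers and a carefully chosen threshold.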

Year:  2020        PMID: 30785192     DOI: 10.1093/bioinformatics/btz102

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  11 in total

1.  Space-efficient representation of genomic k-mer count tables.

Authors:  Yoshihiro Shibuya; Djamal Belazzougui; Gregory Kucherov
Journal:  Algorithms Mol Biol       Date:  2022-03-21       Impact factor: 1.405

2.  Genome sequence assembly algorithms and misassembly identification methods. (Review)

Authors:  Yue Meng; Yu Lei; Jianlong Gao; Yuxuan Liu; Enze Ma; Yunhong Ding; Yixin Bian; Hongquan Zu; Yucui Dong; Xiao Zhu
Journal:  Mol Biol Rep       Date:  2022-09-23       Impact factor: 2.742

3.  Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2.

Authors:  Jamshed Khan; Marek Kokot; Sebastian Deorowicz; Rob Patro
Journal:  Genome Biol       Date:  2022-09-08       Impact factor: 17.906

4.  Aberration-corrected ultrafine analysis of miRNA reads at single-base resolution: a k-mer lattice approach.

Authors:  Xuan Zhang; Pengyao Ping; Gyorgy Hutvagner; Michael Blumenstein; Jinyan Li
Journal:  Nucleic Acids Res       Date:  2021-10-11       Impact factor: 16.971

5.  A Sequence Distance Graph framework for genome assembly and analysis.

Authors:  Luis Yanes; Gonzalo Garcia Accinelli; Jonathan Wright; Ben J Ward; Bernardo J Clavijo
Journal:  F1000Res       Date:  2019-08-23

6.  Bifrost: highly parallel construction and indexing of colored and compacted de Bruijn graphs.

Authors:  Guillaume Holley; Páll Melsted
Journal:  Genome Biol       Date:  2020-09-17       Impact factor: 13.583

7.  Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly.

Authors:  Guillaume Holley; Doruk Beyter; Helga Ingimundardottir; Peter L Møller; Snædis Kristmundsdottir; Hannes P Eggertsson; Bjarni V Halldorsson
Journal:  Genome Biol       Date:  2021-01-08       Impact factor: 13.583

8.  Chromosome-level genome assembly reveals homologous chromosomes and recombination in asexual rotifer Adineta vaga.

Authors:  Paul Simion; Jitendra Narayan; Antoine Houtain; Alessandro Derzelle; Lyam Baudry; Emilien Nicolas; Rohan Arora; Marie Cariou; Corinne Cruaud; Florence Rodriguez Gaudray; Clément Gilbert; Nadège Guiglielmoni; Boris Hespeels; Djampa K L Kozlowski; Karine Labadie; Antoine Limasset; Marc Llirós; Martial Marbouty; Matthieu Terwagne; Julie Virgo; Richard Cordaux; Etienne G J Danchin; Bernard Hallet; Romain Koszul; Thomas Lenormand; Jean-Francois Flot; Karine Van Doninck
Journal:  Sci Adv       Date:  2021-10-06       Impact factor: 14.136

9.  Accurate determination of node and arc multiplicities in de Bruijn graphs using conditional random fields.

Authors:  Aranka Steyaert; Pieter Audenaert; Jan Fostier
Journal:  BMC Bioinformatics       Date:  2020-09-14       Impact factor: 3.169

10.  Variable-order reference-free variant discovery with the Burrows-Wheeler Transform.

Authors:  Nicola Prezza; Nadia Pisanti; Marinella Sciortino; Giovanna Rosone
Journal:  BMC Bioinformatics       Date:  2020-09-16       Impact factor: 3.169
