Toward perfect reads: self-correction of short reads via mapping on de Bruijn graphs.

Antoine Limasset¹, Jean-François Flot¹,², Pierre Peterlongo³.

Abstract

MOTIVATION: Short-read accuracy is important for downstream analyses such as genome assembly and hybrid long-read correction. Despite much work on short-read correction, present-day correctors either do not scale well to large datasets or treat reads as mere collections of k-mers, without taking into account their full-length sequence information.
RESULTS: We propose a new method to correct short reads using de Bruijn graphs and implement it as a tool called Bcool. As a first step, Bcool constructs a compacted de Bruijn graph from the reads. This graph is filtered first by k-mer abundance and then by unitig abundance, which removes most sequencing errors. The cleaned graph is then used as a reference onto which the reads are mapped in order to correct them. We show that this approach yields more accurate reads than k-mer-spectrum correctors while scaling to human-size genomic datasets and beyond.
AVAILABILITY AND IMPLEMENTATION: The implementation is open source, available at http://github.com/Malfoy/BCOOL under the Affero GPL license and as a Bioconda package.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
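
The pipeline described under RESULTS (build a de Bruijn graph, filter by k-mer abundance, filter by unitig abundance, then map reads on the cleaned graph) can be illustrated with a toy sketch. This is not Bcool's implementation: all function names, the thresholds (2 and 3), and the greedy anchor-and-extend mapping are assumptions made for illustration. The sketch also ignores reverse complements and only corrects bases to the right of the first solid k-mer of a read.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-mer in the reads (forward strand only, for brevity)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def successors(kmer, kmers):
    """k-mers of the graph overlapping `kmer` by k-1 bases on its right."""
    return [kmer[1:] + b for b in "ACGT" if kmer[1:] + b in kmers]

def predecessors(kmer, kmers):
    """k-mers of the graph overlapping `kmer` by k-1 bases on its left."""
    return [b + kmer[:-1] for b in "ACGT" if b + kmer[:-1] in kmers]

def unitigs(kmers):
    """Compact maximal non-branching paths into unitigs (lists of k-mers).
    k-mers on isolated, branch-free cycles are not reported."""
    def starts_unitig(kmer):
        preds = predecessors(kmer, kmers)
        return len(preds) != 1 or len(successors(preds[0], kmers)) != 1

    paths = []
    for kmer in sorted(kmers):
        if not starts_unitig(kmer):
            continue
        path = [kmer]
        while True:
            succ = successors(path[-1], kmers)
            if len(succ) != 1 or len(predecessors(succ[0], kmers)) != 1:
                break
            path.append(succ[0])
        paths.append(path)
    return paths

def filter_unitigs(kmers, counts, min_mean):
    """Drop every unitig whose mean k-mer abundance is below `min_mean`."""
    kept = set(kmers)
    for path in unitigs(kmers):
        if sum(counts[kmer] for kmer in path) / len(path) < min_mean:
            kept.difference_update(path)
    return kept

def correct_read(read, kmers, k, max_mismatches=2):
    """Anchor the read on its first k-mer present in the graph, then walk
    the graph rightwards and keep the path with the fewest mismatches."""
    start = next((i for i in range(len(read) - k + 1)
                  if read[i:i + k] in kmers), None)
    if start is None:
        return read  # no anchor: leave the read untouched
    target_len = len(read) - start
    best = None  # (mismatches, path sequence)
    stack = [(read[start:start + k], 0)]
    while stack:
        path, mismatches = stack.pop()
        if len(path) == target_len:
            if best is None or mismatches < best[0]:
                best = (mismatches, path)
            continue
        pos = start + len(path)  # read position of the next base
        for nxt in successors(path[-k:], kmers):
            cost = mismatches + (nxt[-1] != read[pos])
            if cost <= max_mismatches:
                stack.append((path + nxt[-1], cost))
    return read if best is None else read[:start] + best[1]

# Toy data: five error-free reads and two copies of a read with one substitution.
k = 5
genome = "ACGTAGGCTTACCGATTGCA"
noisy = "ACGTAGGCTTGCCGATTGCA"  # position 10: A -> G
reads = [genome] * 5 + [noisy] * 2

counts = count_kmers(reads, k)
solid = {kmer for kmer, c in counts.items() if c >= 2}  # k-mer abundance filter
kept = filter_unitigs(solid, counts, min_mean=3)        # unitig abundance filter
corrected = correct_read(noisy, kept, k)
print(corrected)
```

In this toy dataset the erroneous k-mers occur twice, so they survive the k-mer abundance threshold and are only removed at the unitig stage, which is the reason for filtering in two passes. Mapping the full read as a path, rather than checking k-mers one by one, is what lets the correction use the read's full-length sequence information.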

Year: 2020 PMID: 30785192 DOI: 10.1093/bioinformatics/btz102

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited by

11 in total

1. Space-efficient representation of genomic k-mer count tables.

Authors: Yoshihiro Shibuya; Djamal Belazzougui; Gregory Kucherov
Journal: Algorithms Mol Biol Date: 2022-03-21 Impact factor: 1.405

2. Genome sequence assembly algorithms and misassembly identification methods. (Review)

Authors: Yue Meng; Yu Lei; Jianlong Gao; Yuxuan Liu; Enze Ma; Yunhong Ding; Yixin Bian; Hongquan Zu; Yucui Dong; Xiao Zhu
Journal: Mol Biol Rep Date: 2022-09-23 Impact factor: 2.742

3. Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2.

Authors: Jamshed Khan; Marek Kokot; Sebastian Deorowicz; Rob Patro
Journal: Genome Biol Date: 2022-09-08 Impact factor: 17.906

4. Aberration-corrected ultrafine analysis of miRNA reads at single-base resolution: a k-mer lattice approach.

Authors: Xuan Zhang; Pengyao Ping; Gyorgy Hutvagner; Michael Blumenstein; Jinyan Li
Journal: Nucleic Acids Res Date: 2021-10-11 Impact factor: 16.971

5. A Sequence Distance Graph framework for genome assembly and analysis.

Authors: Luis Yanes; Gonzalo Garcia Accinelli; Jonathan Wright; Ben J Ward; Bernardo J Clavijo
Journal: F1000Res Date: 2019-08-23

6. Bifrost: highly parallel construction and indexing of colored and compacted de Bruijn graphs.

Authors: Guillaume Holley; Páll Melsted
Journal: Genome Biol Date: 2020-09-17 Impact factor: 13.583

7. Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly.

Authors: Guillaume Holley; Doruk Beyter; Helga Ingimundardottir; Peter L Møller; Snædis Kristmundsdottir; Hannes P Eggertsson; Bjarni V Halldorsson
Journal: Genome Biol Date: 2021-01-08 Impact factor: 13.583

8. Chromosome-level genome assembly reveals homologous chromosomes and recombination in asexual rotifer Adineta vaga.

Authors: Paul Simion; Jitendra Narayan; Antoine Houtain; Alessandro Derzelle; Lyam Baudry; Emilien Nicolas; Rohan Arora; Marie Cariou; Corinne Cruaud; Florence Rodriguez Gaudray; Clément Gilbert; Nadège Guiglielmoni; Boris Hespeels; Djampa K L Kozlowski; Karine Labadie; Antoine Limasset; Marc Llirós; Martial Marbouty; Matthieu Terwagne; Julie Virgo; Richard Cordaux; Etienne G J Danchin; Bernard Hallet; Romain Koszul; Thomas Lenormand; Jean-Francois Flot; Karine Van Doninck
Journal: Sci Adv Date: 2021-10-06 Impact factor: 14.136

9. Accurate determination of node and arc multiplicities in de Bruijn graphs using conditional random fields.

Authors: Aranka Steyaert; Pieter Audenaert; Jan Fostier
Journal: BMC Bioinformatics Date: 2020-09-14 Impact factor: 3.169

10. Variable-order reference-free variant discovery with the Burrows-Wheeler Transform.

Authors: Nicola Prezza; Nadia Pisanti; Marinella Sciortino; Giovanna Rosone
Journal: BMC Bioinformatics Date: 2020-09-16 Impact factor: 3.169