Hybrid correction of highly noisy long reads using a variable-order de Bruijn graph.

Pierre Morisse, Thierry Lecroq, Arnaud Lefebvre.

Abstract

Motivation: The recent rise of long-read sequencing technologies such as Pacific Biosciences and Oxford Nanopore makes it possible to solve assembly problems for larger and more complex genomes than short-read technologies allowed. However, these long reads are very noisy, with error rates of around 10-15% for Pacific Biosciences and up to 30% for Oxford Nanopore. The error correction problem has been tackled either by self-correcting the long reads or by using complementary short reads in a hybrid approach. However, even though sequencing technologies promise to lower the error rate of long reads below 10%, it remains higher in practice, and correcting such noisy long reads remains an issue.
Results: We present HG-CoLoR, a hybrid error correction method built on a seed-and-extend approach: the short reads are first aligned to the long reads, and a variable-order de Bruijn graph built from the short reads is then traversed. Our experiments show that HG-CoLoR manages to efficiently correct highly noisy long reads that display an error rate as high as 44%. When compared to other state-of-the-art long read error correction methods, our experiments also show that HG-CoLoR provides the best trade-off between runtime and quality of the results, and is the only method able to efficiently scale to eukaryotic genomes.
Availability and implementation: HG-CoLoR is implemented in C++, supported on Linux platforms and freely available at https://github.com/morispi/HG-CoLoR.
Supplementary information: Supplementary data are available at Bioinformatics online.
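
The graph traversal mentioned in the Results can be illustrated with a toy sketch. This is not the HG-CoLoR implementation, which is written in C++. The function names (`build_kmer_index`, `successors`, `link_seeds`), the depth-first search, and the rule of dropping to a lower order when the current order has no outgoing edge are all assumptions made for illustration, as are the example reads and seeds. The idea shown is that two seeds (short-read alignments on a long read) are linked by walking k-mers from the short reads, and that a smaller k is tried locally when the largest k reaches a dead end.

```python
from collections import defaultdict


def build_kmer_index(reads, min_k, max_k):
    """For each order k, map every (k-1)-mer prefix to the bases that follow it."""
    index = {}
    for k in range(min_k, max_k + 1):
        succ = defaultdict(set)
        for read in reads:
            for i in range(len(read) - k + 1):
                kmer = read[i:i + k]
                succ[kmer[:-1]].add(kmer[-1])
        index[k] = succ
    return index


def successors(index, path, min_k, max_k):
    """Return next bases at the highest order that has an edge out of the path's suffix."""
    for k in range(max_k, min_k - 1, -1):
        if len(path) < k - 1:
            continue
        bases = index[k].get(path[-(k - 1):])
        if bases:
            return sorted(bases)
    return []  # dead end at every order


def link_seeds(index, source, target, min_k, max_k, max_len):
    """Depth-first search from the source seed until the path ends with the target seed."""
    stack = [source]
    while stack:
        path = stack.pop()
        if path != source and path.endswith(target):
            return path
        if len(path) >= max_len:
            continue  # hypothetical length bound to keep the search finite
        for base in reversed(successors(index, path, min_k, max_k)):
            stack.append(path + base)
    return None


# Toy short reads: the first two overlap by only 3 bases ("GCA").
reads = ["ACGTTGCA", "GCATGGAC", "GGACCTA"]
index = build_kmer_index(reads, min_k=3, max_k=5)

# Variable order (5 down to 3) bridges the short overlap.
linked = link_seeds(index, "ACGTT", "CCTA", min_k=3, max_k=5, max_len=30)
# Fixed order 5 gets stuck after "ACGTTGCA".
fixed = link_seeds(index, "ACGTT", "CCTA", min_k=5, max_k=5, max_len=30)
print(linked, fixed)
```

In this toy input, the fixed-order search fails because no 5-mer spans the junction between the first two reads, while the variable-order search falls back to a 4-mer at that position and then returns to order 5.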

Year:  2018        PMID: 29955770     DOI: 10.1093/bioinformatics/bty521

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  9 in total

1.  Resolving MiSeq-Generated Ambiguities in HLA-DPB1 Typing by Using the Oxford Nanopore Technology.

Authors:  Jamie L Duke; Timothy L Mosbruger; Deborah Ferriola; Nilesh Chitnis; Taishan Hu; Nikolaos Tairis; David J Margolis; Dimitri S Monos
Journal:  J Mol Diagn       Date:  2019-06-04       Impact factor: 5.568

Review 2.  Nanopore sequencing technology, bioinformatics and applications.

Authors:  Yunhao Wang; Yue Zhao; Audrey Bollas; Yuru Wang; Kin Fai Au
Journal:  Nat Biotechnol       Date:  2021-11-08       Impact factor: 54.908

Review 3.  Genome sequence assembly algorithms and misassembly identification methods.

Authors:  Yue Meng; Yu Lei; Jianlong Gao; Yuxuan Liu; Enze Ma; Yunhong Ding; Yixin Bian; Hongquan Zu; Yucui Dong; Xiao Zhu
Journal:  Mol Biol Rep       Date:  2022-09-23       Impact factor: 2.742

4.  Direct RNA nanopore sequencing of full-length coronavirus genomes provides novel insights into structural variants and enables modification analysis.

Authors:  Adrian Viehweger; Sebastian Krautwurst; Kevin Lamkiewicz; Ramakanth Madhugiri; John Ziebuhr; Martin Hölzer; Manja Marz
Journal:  Genome Res       Date:  2019-08-22       Impact factor: 9.043

5.  Scalable long read self-correction and assembly polishing with multiple sequence alignment.

Authors:  Pierre Morisse; Camille Marchet; Antoine Limasset; Thierry Lecroq; Arnaud Lefebvre
Journal:  Sci Rep       Date:  2021-01-12       Impact factor: 4.379

Review 6.  A comprehensive evaluation of long read error correction methods.

Authors:  Haowen Zhang; Chirag Jain; Srinivas Aluru
Journal:  BMC Genomics       Date:  2020-12-21       Impact factor: 3.969

7.  Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly.

Authors:  Guillaume Holley; Doruk Beyter; Helga Ingimundardottir; Peter L Møller; Snædis Kristmundsdottir; Hannes P Eggertsson; Bjarni V Halldorsson
Journal:  Genome Biol       Date:  2021-01-08       Impact factor: 13.583

8.  Accurate determination of node and arc multiplicities in de Bruijn graphs using conditional random fields.

Authors:  Aranka Steyaert; Pieter Audenaert; Jan Fostier
Journal:  BMC Bioinformatics       Date:  2020-09-14       Impact factor: 3.169

9.  Highly multiplexed, fast and accurate nanopore sequencing for verification of synthetic DNA constructs and sequence libraries.

Authors:  Andrew Currin; Neil Swainston; Mark S Dunstan; Adrian J Jervis; Paul Mulherin; Christopher J Robinson; Sandra Taylor; Pablo Carbonell; Katherine A Hollywood; Cunyu Yan; Eriko Takano; Nigel S Scrutton; Rainer Breitling
Journal:  Synth Biol (Oxf)       Date:  2019-10-29