Integration of string and de Bruijn graphs for genome assembly.

Yao-Ting Huang, Chen-Fu Liao.

Abstract

MOTIVATION: String and de Bruijn graphs are two graph models used by most genome assemblers. At present, none of the existing assemblers clearly outperforms the others across all datasets. We found that although a string graph can make use of entire reads for resolving repeats, de Bruijn graphs can naturally assemble through regions that are error-prone due to sequencing bias.
RESULTS: We developed a novel assembler called StriDe that combines the advantages of both string and de Bruijn graphs. First, the reads are decomposed adaptively only in error-prone regions. Second, each paired-end read is extended into a long read directly using an FM-index. The decomposed and extended reads are used to build an assembly graph. In addition, several essential components of an assembler were designed or improved. The resulting assembler was fully parallelized, tested and compared with state-of-the-art assemblers using benchmark datasets. The results indicate that the contiguity of StriDe assemblies is comparable with that of top assemblers on both short-read and long-read datasets, and that the assembly accuracy is high in comparison with the others.
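The FM-index extension step mentioned above can be illustrated with a toy sketch. This is not StriDe's implementation: it indexes a single string instead of a read collection, builds the index naively, and extends a seed only to the left, one base at a time, while exactly one base continues it. All function names (`build_fm_index`, `step`, `count`, `extend_left`) are hypothetical and chosen for illustration only.

```python
def build_fm_index(text):
    """Build a toy FM-index (C table, occurrence table, length) for text + '$'."""
    text += "$"
    # Naive suffix array; fine for a demo, far too slow for real read sets.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to '$' for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c]: number of characters in text that sort strictly before c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ, len(bwt)


def step(index, interval, c):
    """One backward-search step: map the interval of P to the interval of c + P."""
    C, occ, _ = index
    if c not in C:
        return (0, 0)  # base never occurs in the text: empty interval
    lo, hi = interval
    return (C[c] + occ[c][lo], C[c] + occ[c][hi])


def search(index, pattern):
    """Return the half-open suffix-array interval of pattern (empty if absent)."""
    interval = (0, index[2])
    for c in reversed(pattern):
        interval = step(index, interval, c)
        if interval[0] >= interval[1]:
            return (0, 0)
    return interval


def count(index, pattern):
    """Number of occurrences of pattern in the indexed text."""
    lo, hi = search(index, pattern)
    return hi - lo


def extend_left(index, seed, max_len):
    """Prepend bases to seed while exactly one of A/C/G/T continues it."""
    interval = search(index, seed)
    if interval[0] >= interval[1]:
        return seed  # seed not found: nothing to extend
    result = seed
    while len(result) < max_len:
        candidates = []
        for c in "ACGT":
            nxt = step(index, interval, c)
            if nxt[0] < nxt[1]:
                candidates.append((c, nxt))
        if len(candidates) != 1:
            break  # dead end or branch (e.g. a repeat): stop extending
        c, interval = candidates[0]
        result = c + result
    return result


index = build_fm_index("GATTACA")
print(count(index, "TA"))
print(extend_left(index, "ACA", 10))
```

Stopping at the first branch is a deliberate simplification; a real assembler needs additional evidence, such as paired-end constraints or coverage, to decide how to proceed through repeats.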
AVAILABILITY AND IMPLEMENTATION: https://github.com/ythuang0522/StriDe
CONTACT: ythuang@cs.ccu.edu.tw
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

Year:  2016        PMID: 26755626     DOI: 10.1093/bioinformatics/btw011

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  2 in total

1.  Towards precision medicine. (Review)

Authors:  Euan A Ashley
Journal:  Nat Rev Genet       Date:  2016-08-16       Impact factor: 53.242

2.  An efficient error correction algorithm using FM-index.

Authors:  Yao-Ting Huang; Yu-Wen Huang
Journal:  BMC Bioinformatics       Date:  2017-11-28       Impact factor: 3.169

