Jay Ghurye, Arang Rhie, Brian P Walenz, Anthony Schmitt, Siddarth Selvaraj, Mihai Pop, Adam M Phillippy, Sergey Koren.
Abstract
Long-read sequencing and novel long-range assays have revolutionized de novo genome assembly by automating the reconstruction of reference-quality genomes. In particular, Hi-C sequencing is becoming an economical method for generating chromosome-scale scaffolds. Despite its increasing popularity, few open-source tools are available for Hi-C scaffolding. Error rates, particularly for inversions and fusions across chromosomes, remain higher than those of alternative scaffolding technologies. We present a novel open-source Hi-C scaffolder that does not require an a priori estimate of chromosome number and minimizes errors by scaffolding with the assistance of an assembly graph. We demonstrate higher accuracy than state-of-the-art methods across a variety of Hi-C library preparations and input assembly sizes. The Python and C++ code for our method is openly available at https://github.com/machinegun/SALSA.
Year: 2019 PMID: 31433799 PMCID: PMC6719893 DOI: 10.1371/journal.pcbi.1007273
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475