Shengfeng Huang, Mingjing Kang, Anlong Xu.
Abstract
SUMMARY: De novo assembly is difficult for heterozygous diploid genomes. The advent of high-throughput short-read and long-read sequencing technologies brings both new challenges and potential solutions to this problem. Here, we present HaploMerger2 (HM2), an automated pipeline for rebuilding both haploid sub-assemblies from a polymorphic diploid genome assembly. It is designed to work on pre-existing diploid assemblies, which are typically created with de novo assemblers. HM2 can process any diploid assembly, but it is especially suitable for diploid assemblies with high heterozygosity (≥3%), which can be difficult for other tools to handle. The pipeline also implements flexible and sensitive assembly error detection, a hierarchical scaffolding procedure and a reliable gap-closing method for haploid sub-assemblies. Using HM2, we demonstrate that two haploid sub-assemblies reconstructed from a real, highly polymorphic diploid assembly show greatly improved continuity.
Year: 2017 PMID: 28407147 PMCID: PMC5870766 DOI: 10.1093/bioinformatics/btx220
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. A flowchart of the HaploMerger2 (HM2) pipeline. HM2 comprises five functionally independent modules. Each module can work on its own. Users can run any module separately or combine several of them into a specific pipeline, as suggested in the flowchart. In this specific pipeline, all but the second module are optional and can be iterated to achieve better results.
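
The module structure described in the caption can be illustrated with a minimal sketch. This is not HM2's actual interface: the function `plan_pipeline`, the `repeats` argument and the numbering of modules as 1 to 5 are hypothetical, and the sketch assumes that modules run in numeric order and that only the optional modules are iterated. It encodes just the two rules stated in the caption: the second module is required, and the others may be left out or repeated.

```python
from typing import Dict, Iterable, List, Optional

N_MODULES = 5      # the caption states that HM2 has five modules
REQUIRED = {2}     # the caption states that only the second module is mandatory


def plan_pipeline(selected: Iterable[int],
                  repeats: Optional[Dict[int, int]] = None) -> List[int]:
    """Return the order in which modules would run (hypothetical helper).

    selected: module numbers (1..5) chosen by the user.
    repeats:  optional map from an optional module to its iteration count.
    Assumption: modules run in numeric order.
    """
    chosen = sorted(set(selected))
    repeats = repeats or {}

    for module in chosen:
        if not 1 <= module <= N_MODULES:
            raise ValueError(f"unknown module: {module}")

    missing = REQUIRED - set(chosen)
    if missing:
        raise ValueError(f"required module(s) missing: {sorted(missing)}")

    plan: List[int] = []
    for module in chosen:
        count = repeats.get(module, 1)
        if count < 1:
            raise ValueError(f"repeat count for module {module} must be >= 1")
        if module in REQUIRED and count != 1:
            # Assumption of this sketch: only optional modules are iterated.
            raise ValueError(f"module {module} is not iterated in this sketch")
        plan.extend([module] * count)
    return plan
```

For example, `plan_pipeline([1, 2, 3], repeats={1: 2})` yields a plan that runs the first module twice before the second and third, while `plan_pipeline([1, 3])` raises a `ValueError` because the required second module is absent.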