
Overlap graph-based generation of haplotigs for diploids and polyploids.

Jasmijn A Baaijens, Alexander Schönhuth.

Abstract

MOTIVATION: Haplotype-aware genome assembly plays an important role in genetics, medicine and various other disciplines, yet generation of haplotype-resolved de novo assemblies remains a major challenge. Beyond distinguishing between errors and true sequential variants, one needs to assign the true variants to the different genome copies. Recent work has pointed out that the enormous quantities of traditional NGS read data have been greatly underexploited in terms of haplotig computation so far, which reflects that methodology for reference independent haplotig computation has not yet reached maturity.
RESULTS: We present POLYploid genome fitTEr (POLYTE) as a new approach to de novo generation of haplotigs for diploid and polyploid genomes of known ploidy. Our method follows an iterative scheme where in each iteration reads or contigs are joined, based on their interplay in terms of an underlying haplotype-aware overlap graph. Along the iterations, contigs grow while preserving their haplotype identity. Benchmarking experiments on both real and simulated data demonstrate that POLYTE establishes new standards in terms of error-free reconstruction of haplotype-specific sequence. As a consequence, POLYTE outperforms state-of-the-art approaches in various relevant aspects, where advantages become particularly distinct in polyploid settings.
AVAILABILITY AND IMPLEMENTATION: POLYTE is freely available as part of the HaploConduct package at https://github.com/HaploConduct/HaploConduct, implemented in Python and C++.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
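
The iterative joining idea described under RESULTS can be illustrated with a toy sketch. This is not POLYTE's actual algorithm or the HaploConduct API: all function names (`overlap_len`, `build_graph`, `merge_round`, `assemble`) and the `min_len` parameter are hypothetical. The sketch assumes error-free reads, requires exact suffix/prefix matches so that sequences differing at a variant inside the overlap are never joined, and merges only along unambiguous edges.

```python
def overlap_len(a, b, min_len):
    """Longest suffix of a that exactly equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def build_graph(seqs, min_len):
    """Toy overlap graph: edge i -> j only if the overlap matches exactly,
    so a variant inside the overlap region prevents an edge (assumption)."""
    edges = {}
    for i, a in enumerate(seqs):
        for j, b in enumerate(seqs):
            if i != j:
                k = overlap_len(a, b, min_len)
                if k:
                    edges[(i, j)] = k
    return edges


def merge_round(seqs, min_len):
    """One iteration: join pairs connected by an unambiguous edge."""
    edges = build_graph(seqs, min_len)
    succ, pred = {}, {}
    for i, j in edges:
        succ.setdefault(i, []).append(j)
        pred.setdefault(j, []).append(i)
    used, merged = set(), []
    # Visit the longest overlaps first; each sequence joins at most once per round.
    for (i, j), k in sorted(edges.items(), key=lambda e: -e[1]):
        if i in used or j in used:
            continue
        if len(succ[i]) == 1 and len(pred[j]) == 1:
            merged.append(seqs[i] + seqs[j][k:])
            used.update((i, j))
    rest = [s for idx, s in enumerate(seqs) if idx not in used]
    return merged + rest, bool(merged)


def assemble(reads, min_len=4):
    """Repeat merge rounds until no further join is possible."""
    seqs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    changed = True
    while changed:
        seqs, changed = merge_round(seqs, min_len)
    return seqs


# Two made-up haplotypes differing at a single position (index 4).
hap1 = "GATTACAGCC"
hap2 = "GATTGCAGCC"
reads = [h[s:s + 6] for h in (hap1, hap2) for s in (0, 2, 4)]
haplotigs = assemble(reads, min_len=4)
```

In this contrived example every overlap window spans the variant, so the contigs grow round by round without mixing the two haplotypes. Real data would also need error handling and a policy for reads that fall in regions shared between haplotypes, which this sketch does not attempt.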

Year:  2019        PMID: 30994902     DOI: 10.1093/bioinformatics/btz255

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  6 in total

1.  Estimating the time since admixture from phased and unphased molecular data.

Authors:  Thijs Janzen; Verónica Miró Pina
Journal:  Mol Ecol Resour       Date:  2021-10-10       Impact factor: 8.678

2.  Haploflow: Strain-resolved de novo assembly of viral genomes.

Authors:  A Fritz; A Bremges; Z-L Deng; T-R Lesker; J Götting; T Ganzenmüller; A Sczyrba; A Dilthey; F Klawonn; A C McHardy
Journal:  bioRxiv       Date:  2021-01-26

3.  Computational methods for chromosome-scale haplotype reconstruction (Review).

Authors:  Shilpa Garg
Journal:  Genome Biol       Date:  2021-04-12       Impact factor: 13.583

4.  phasebook: haplotype-aware de novo assembly of diploid genomes from long reads.

Authors:  Xiao Luo; Xiongbin Kang; Alexander Schönhuth
Journal:  Genome Biol       Date:  2021-10-27       Impact factor: 13.583

5.  Haploflow: strain-resolved de novo assembly of viral genomes.

Authors:  Adrian Fritz; Andreas Bremges; Zhi-Luo Deng; Till Robin Lesker; Jasper Götting; Tina Ganzenmueller; Alexander Sczyrba; Alexander Dilthey; Frank Klawonn; Alice Carolyn McHardy
Journal:  Genome Biol       Date:  2021-07-19       Impact factor: 13.583

6.  OGRE: Overlap Graph-based metagenomic Read clustEring.

Authors:  Marleen Balvert; Xiao Luo; Ernestina Hauptfeld; Alexander Schönhuth; Bas E Dutilh
Journal:  Bioinformatics       Date:  2021-05-17       Impact factor: 6.937
