
HapCol: accurate and memory-efficient haplotype assembly from long reads.

Yuri Pirola, Simone Zaccaria, Riccardo Dondi, Gunnar W Klau, Nadia Pisanti, Paola Bonizzoni.

Abstract

MOTIVATION: Haplotype assembly is the computational problem of reconstructing haplotypes in diploid organisms and is of fundamental importance for characterizing the effects of single-nucleotide polymorphisms on the expression of phenotypic traits. Haplotype assembly benefits greatly from the advent of 'future-generation' sequencing technologies and their capability to produce long reads at increasing coverage. Existing methods cannot handle such data in a fully satisfactory way, either because their accuracy or performance degrades as read length and sequencing coverage increase or because they rely on restrictive assumptions.
RESULTS: By exploiting a feature of future-generation technologies (the uniform distribution of sequencing errors), we designed an exact algorithm, called HapCol, that is exponential in the maximum number of corrections for each single-nucleotide polymorphism position and that minimizes the overall error-correction score. We performed an experimental analysis, comparing HapCol with the current state-of-the-art combinatorial methods on both real and simulated data. On a standard benchmark of real data, we show that HapCol is competitive with state-of-the-art methods, improving the accuracy and the number of phased positions. Furthermore, experiments on realistically simulated datasets revealed that HapCol requires significantly fewer computing resources, especially memory. Thanks to its computational efficiency, HapCol can overcome the limits of previous approaches, making it possible to phase datasets with higher coverage and without the traditional all-heterozygous assumption.
AVAILABILITY AND IMPLEMENTATION: Our source code is available under the terms of the GNU General Public License at http://hapcol.algolab.eu/
CONTACT: bonizzoni@disco.unimib.it
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
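
The objective named in the abstract, minimizing the overall error-correction score while bounding the corrections at each SNP position, can be made concrete with a small sketch. This is not the HapCol algorithm: it enumerates every split of the reads by brute force, which is only feasible for a handful of reads. The toy read matrix, the function names `column_cost` and `best_bipartition`, and the bound `k` as a plain parameter are all assumptions made for illustration. Each group of reads is allowed to settle on either allele at each position, so the sketch does not impose the all-heterozygous assumption.

```python
from itertools import product

# Hypothetical toy fragment matrix: rows are reads, columns are SNP positions.
# 0 and 1 are the two alleles; None means the read does not cover the position.
READS = [
    [0, 0, 1, None],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [None, 1, 0, 0],
    [0, 1, 1, 1],  # disagrees with the first two reads at position 1
]


def column_cost(reads, side, col):
    """Corrections needed at one SNP position for a given split of the reads.

    Within each group, the minority allele is corrected to the majority one.
    """
    cost = 0
    for group in (0, 1):
        alleles = [
            read[col]
            for read, s in zip(reads, side)
            if s == group and read[col] is not None
        ]
        ones = sum(alleles)
        cost += min(ones, len(alleles) - ones)
    return cost


def best_bipartition(reads, k):
    """Exhaustively find the split with the fewest total corrections,
    allowing at most k corrections at any single position.

    Returns (total_corrections, side) or None if no split satisfies the bound.
    """
    n_cols = len(reads[0])
    best = None
    for side in product((0, 1), repeat=len(reads)):
        if side[0] != 0:
            continue  # the two groups are interchangeable; fix the first read
        costs = [column_cost(reads, side, c) for c in range(n_cols)]
        if max(costs) > k:
            continue  # violates the per-position bound
        total = sum(costs)
        if best is None or total < best[0]:
            best = (total, side)
    return best


if __name__ == "__main__":
    print(best_bipartition(READS, k=1))  # (1, (0, 0, 1, 1, 0))
    print(best_bipartition(READS, k=0))  # None
```

With `k=1` the sketch groups the first, second and fifth reads together and spends a single correction at position 1; with `k=0` no error-free split exists, so it returns `None`. The enumeration is exponential in the number of reads, whereas the abstract states that HapCol is exponential only in the maximum number of corrections per position.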

Year:  2015        PMID: 26315913     DOI: 10.1093/bioinformatics/btv495

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  14 in total

Review 1.  DNA sequencing technologies: 2006-2016.

Authors:  Elaine R Mardis
Journal:  Nat Protoc       Date:  2017-01-05       Impact factor: 13.491

Review 2.  Multidisciplinary approaches for elucidating genetics and molecular pathogenesis of urinary tract malformations.

Authors:  Kamal Khan; Dina F Ahram; Yangfan P Liu; Rik Westland; Rosemary V Sampogna; Nicholas Katsanis; Erica E Davis; Simone Sanna-Cherchi
Journal:  Kidney Int       Date:  2021-11-12       Impact factor: 10.612

3.  Read-based phasing of related individuals.

Authors:  Shilpa Garg; Marcel Martin; Tobias Marschall
Journal:  Bioinformatics       Date:  2016-06-15       Impact factor: 6.937

4.  PWHATSHAP: efficient haplotyping for future generation sequencing.

Authors:  Andrea Bracciali; Marco Aldinucci; Murray Patterson; Tobias Marschall; Nadia Pisanti; Ivan Merelli; Massimo Torquati
Journal:  BMC Bioinformatics       Date:  2016-09-22       Impact factor: 3.169

Review 5.  Computational pan-genomics: status, promises and challenges.

Authors: 
Journal:  Brief Bioinform       Date:  2018-01-01       Impact factor: 11.622

6.  HapCHAT: adaptive haplotype assembly for efficiently leveraging high coverage in long reads.

Authors:  Stefano Beretta; Murray D Patterson; Simone Zaccaria; Gianluca Della Vedova; Paola Bonizzoni
Journal:  BMC Bioinformatics       Date:  2018-07-03       Impact factor: 3.169

7.  Haplotype phasing in single-cell DNA-sequencing data.

Authors:  Gryte Satas; Benjamin J Raphael
Journal:  Bioinformatics       Date:  2018-07-01       Impact factor: 6.937

8.  ComHapDet: a spatial community detection algorithm for haplotype assembly.

Authors:  Abishek Sankararaman; Haris Vikalo; François Baccelli
Journal:  BMC Genomics       Date:  2020-09-09       Impact factor: 3.969

9.  Sparse Tensor Decomposition for Haplotype Assembly of Diploids and Polyploids.

Authors:  Abolfazl Hashemi; Banghua Zhu; Haris Vikalo
Journal:  BMC Genomics       Date:  2018-03-21       Impact factor: 3.969

10.  Variable-order reference-free variant discovery with the Burrows-Wheeler Transform.

Authors:  Nicola Prezza; Nadia Pisanti; Marinella Sciortino; Giovanna Rosone
Journal:  BMC Bioinformatics       Date:  2020-09-16       Impact factor: 3.169
