Volodymyr Kuleshov. Department of Computer Science, Stanford University, Stanford, CA 94305, USA.
Abstract
MOTIVATION: Accurate haplotyping (determining from which parent particular portions of the genome are inherited) is still a largely unresolved problem in genomics. The problem has only recently started to become tractable, thanks to the development of new long-read sequencing technologies. Here, we introduce ProbHap, a haplotyping algorithm targeted at such technologies. The main algorithmic idea of ProbHap is a new dynamic programming algorithm that exactly optimizes a likelihood function specified by a probabilistic graphical model; this likelihood generalizes a popular objective called the minimum error correction. In addition to being accurate, ProbHap also provides confidence scores at phased positions. RESULTS: On a standard benchmark dataset, ProbHap makes 11% fewer errors than current state-of-the-art methods. This accuracy can be further increased by excluding low-confidence positions, at the cost of a small drop in haplotype completeness. AVAILABILITY: Our source code is freely available at: https://github.com/kuleshov/ProbHap.
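For readers unfamiliar with the minimum error correction (MEC) objective mentioned above, the following is a minimal sketch of how an MEC score could be computed for a candidate haplotype. It is not taken from the ProbHap source code: the function name `mec_score`, the read representation (a dictionary from variant index to a 0/1 allele), and the toy data are all assumptions made for illustration, and the sketch only scores a given haplotype rather than optimizing over haplotypes.

```python
from typing import Dict, List

# Hypothetical representation: a read maps variant index -> observed allele (0 or 1).
Read = Dict[int, int]


def mec_score(reads: List[Read], haplotype: List[int]) -> int:
    """Count the allele corrections needed to make every read consistent
    with either the given haplotype or its complement (diploid case)."""
    total = 0
    for read in reads:
        # Mismatches against the haplotype itself.
        mismatches = sum(
            1 for pos, allele in read.items() if haplotype[pos] != allele
        )
        # Mismatches against the complement are the remaining covered positions.
        total += min(mismatches, len(read) - mismatches)
    return total


# Toy data (illustrative only): three reads over four heterozygous variants.
reads = [
    {0: 0, 1: 0, 2: 1},  # agrees with the haplotype
    {1: 1, 2: 0, 3: 0},  # agrees with the complement
    {0: 0, 1: 1},        # one error under either assignment
]
score = mec_score(reads, [0, 0, 1, 1])
```

A lower score means fewer read alleles have to be flipped to explain the data; a likelihood-based model such as the one described in the abstract can additionally weight each position by its error probability, which this sketch does not attempt.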