Bonnie B Kirkpatrick. Electrical Engineering and Computer Sciences, University of California Berkeley, Berkeley, CA 94720-1776, USA. bbkirk@eecs.berkeley.edu.
Abstract
BACKGROUND: Genome sequencing will soon produce haplotype data for individuals. For pedigrees of related individuals, sequencing appears to be an attractive alternative to genotyping. However, methods for pedigree analysis with haplotype data have not yet been developed, and the computational complexity of such problems has been an open question. Furthermore, it is not clear in which scenarios haplotype data would provide better estimates than genotype data for quantities such as recombination rates.

RESULTS: To answer these questions, a reduction is given from genotype problem instances to haplotype problem instances, and it is shown that solving the haplotype problem yields the solution to the genotype problem, up to constant factors or coefficients. The pedigree analysis problems we will consider are the likelihood, maximum probability haplotype, and minimum recombination haplotype problems.

CONCLUSIONS: Two algorithms are introduced: an exponential-time hidden Markov model (HMM) for haplotype data where some individuals are untyped, and a linear-time algorithm for pedigrees having haplotype data for all individuals. Recombination estimates from the general haplotype HMM algorithm are compared to recombination estimates produced by a genotype HMM. Having haplotype data on all individuals produces better estimates. However, having several untyped individuals can drastically reduce the utility of haplotype data.
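To give a feel for why fully observed haplotypes can make recombination counting tractable, the following is a minimal sketch for a single parent-to-child transmission. It is not the algorithm from the paper. It assumes biallelic sites coded as 0/1, no mutation or sequencing error, and that the child's haplotype and both haplotypes of the transmitting parent are known. The function name `min_recombinations` and its parameters are hypothetical. A two-state dynamic program tracks which parental haplotype is being copied and runs in time linear in the number of sites.

```python
from typing import Sequence


def min_recombinations(
    parent_a: Sequence[int],
    parent_b: Sequence[int],
    child: Sequence[int],
) -> float:
    """Fewest switches between a parent's two haplotypes that reproduce
    the child's inherited haplotype; float('inf') if no copy path exists.

    Hypothetical helper: assumes no mutation and no sequencing error.
    """
    if not (len(parent_a) == len(parent_b) == len(child)):
        raise ValueError("all haplotypes must cover the same sites")

    inf = float("inf")
    # cost_a / cost_b: fewest switches so far if the current site is
    # copied from parent_a / parent_b respectively.
    cost_a, cost_b = 0.0, 0.0
    for a, b, c in zip(parent_a, parent_b, child):
        # A state is reachable only if that parental allele matches the child.
        new_a = min(cost_a, cost_b + 1) if a == c else inf
        new_b = min(cost_b, cost_a + 1) if b == c else inf
        cost_a, cost_b = new_a, new_b
    return min(cost_a, cost_b)


if __name__ == "__main__":
    # The child copies parent_a for two sites, then switches to parent_b.
    print(min_recombinations([0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]))
```

Under these assumptions, each transmission is scored independently, so summing this count over every parent-child pair in a fully typed pedigree stays linear in the size of the data. When some individuals are untyped, their haplotypes must be inferred jointly, which is where the abstract's exponential-time HMM comes in.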