Genotype calling from next-generation sequencing data using haplotype information of reads.

Degui Zhi, Jihua Wu, Nianjun Liu, Kui Zhang.

Abstract

MOTIVATION: Low-coverage sequencing provides an economical strategy for whole genome sequencing. When sequencing a set of individuals, genotype calling can be challenging due to low sequencing coverage. Linkage disequilibrium (LD)-based refinement of genotype calling is essential for improving accuracy. Current LD-based methods use read counts or genotype likelihoods at individual potential polymorphic sites (PPSs). Reads that span multiple PPSs (jumping reads) can provide additional haplotype information that current methods overlook.
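To make the notion of a jumping read concrete, the following minimal sketch separates reads that cover two or more PPSs from reads that cover exactly one. The function names, the closed-interval read representation, and the toy coordinates are assumptions for illustration only; they are not taken from HapSeq.

```python
from bisect import bisect_left, bisect_right


def sites_covered(read_start, read_end, pps_positions):
    """Return the PPS positions inside the closed interval [read_start, read_end].

    pps_positions must be sorted in ascending order.
    """
    lo = bisect_left(pps_positions, read_start)
    hi = bisect_right(pps_positions, read_end)
    return pps_positions[lo:hi]


def split_reads(reads, pps_positions):
    """Partition reads into jumping reads (>= 2 PPSs) and single-site reads."""
    jumping, single = [], []
    for start, end in reads:
        covered = sites_covered(start, end, pps_positions)
        if len(covered) >= 2:
            jumping.append(((start, end), covered))
        elif len(covered) == 1:
            single.append(((start, end), covered))
        # Reads covering no PPS carry no genotype information and are skipped.
    return jumping, single


# Hypothetical toy data: three PPSs and four reads given as (start, end).
pps = [100, 130, 400]
reads = [(90, 140), (95, 120), (380, 450), (200, 300)]
jumping, single = split_reads(reads, pps)
```

In this toy input only the read (90, 140) spans two PPSs, so it is the only one that could carry phase information between sites 100 and 130.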
RESULTS: In this article, we introduce a new Hidden Markov Model (HMM)-based method that takes into account jumping-read information across adjacent PPSs, and we implement it in the HapSeq program. Our method extends the HMM in Thunder and explicitly models jumping-read information as emission probabilities conditional on the states of adjacent PPSs. Our simulation results show that, compared to Thunder, HapSeq reduces the genotyping error rate by 30%, from 0.86% to 0.60%. Results from the 1000 Genomes Project show that HapSeq reduces the genotyping error rate by 12% and 9%, from 2.24% and 2.76% to 1.97% and 2.50%, for individuals with European and African ancestry, respectively. We expect that our program can improve the genotyping quality of the large number of ongoing and planned whole genome sequencing projects.
CONTACT: dzhi@ms.soph.uab.edu; kzhang@ms.soph.uab.edu
AVAILABILITY: The software package HapSeq and its manual can be found and downloaded at www.ssg.uab.edu/hapseq/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
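The abstract does not give the emission formula, so the sketch below is not HapSeq's model. It is a generic illustration, under assumed biallelic sites (alleles coded 0/1), a uniform per-base error rate, and a read drawn from either haplotype with equal probability, of why conditioning a jumping read on the states at two adjacent PPSs carries phase information that per-site likelihoods cannot. All function names and the error rate are hypothetical.

```python
def allele_prob(observed, true_allele, error_rate):
    """P(observed allele | true allele) at one biallelic site."""
    return 1.0 - error_rate if observed == true_allele else error_rate


def jumping_read_likelihood(read_alleles, hap_a, hap_b, error_rate=0.01):
    """P(read | haplotype pair), treating the read's sites jointly.

    The read is assumed to come from hap_a or hap_b with probability 1/2 each,
    so the alleles it shows at adjacent sites are tied to one haplotype.
    """
    def from_haplotype(hap):
        p = 1.0
        for obs, true_allele in zip(read_alleles, hap):
            p *= allele_prob(obs, true_allele, error_rate)
        return p

    return 0.5 * from_haplotype(hap_a) + 0.5 * from_haplotype(hap_b)


def per_site_likelihood(read_alleles, hap_a, hap_b, error_rate=0.01):
    """Same read, but each site is scored independently (phase is ignored)."""
    p = 1.0
    for j, obs in enumerate(read_alleles):
        p *= (0.5 * allele_prob(obs, hap_a[j], error_rate)
              + 0.5 * allele_prob(obs, hap_b[j], error_rate))
    return p


# An individual heterozygous at both sites has two possible phasings.
cis = ((0, 0), (1, 1))    # haplotypes 00 and 11
trans = ((0, 1), (1, 0))  # haplotypes 01 and 10
read = (0, 0)             # one read showing allele 0 at both sites

joint_cis = jumping_read_likelihood(read, *cis)
joint_trans = jumping_read_likelihood(read, *trans)
site_cis = per_site_likelihood(read, *cis)
site_trans = per_site_likelihood(read, *trans)
```

Under these assumptions the joint likelihood strongly favors the cis phasing, whereas the per-site likelihood is identical for both phasings, which is the kind of information a site-by-site method would miss.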

Year:  2012        PMID: 22285565      PMCID: PMC3493122          DOI: 10.1093/bioinformatics/bts047

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


