Genotype calling and phasing using next-generation sequencing reads and a haplotype scaffold.

Androniki Menelaou, Jonathan Marchini.

Abstract

MOTIVATION: Given the current costs of next-generation sequencing, large studies carry out low-coverage sequencing followed by application of methods that leverage linkage disequilibrium to infer genotypes. We propose a novel method that assumes study samples are sequenced at low coverage and genotyped on a genome-wide microarray, as in the 1000 Genomes Project (1KGP). We assume polymorphic sites have been detected from the sequencing data and that genotype likelihoods are available at these sites. We also assume that the microarray genotypes have been phased to construct a haplotype scaffold. We then phase each polymorphic site using an MCMC algorithm that iteratively updates the unobserved alleles based on the genotype likelihoods at that site and local haplotype information. We use a multivariate normal model to capture both allele frequency and linkage disequilibrium information around each site. When sequencing data are available from trios, Mendelian transmission constraints are easily accommodated into the updates. The method is highly parallelizable, as it analyses one position at a time.
RESULTS: We illustrate the performance of the method compared with other methods using data from Phase 1 of the 1KGP in terms of genotype accuracy, phasing accuracy and downstream imputation performance. We show that the haplotype panel we infer in African samples, which was based on a trio-phased scaffold, increases downstream imputation accuracy for rare variants (R² increases by >0.05 for minor allele frequency <1%), and this will translate into a boost in power to detect associations. These results highlight the value of incorporating microarray genotypes when calling variants from next-generation sequence data.
AVAILABILITY: The method (called MVNcall) is implemented in a C++ program and is available from http://www.stats.ox.ac.uk/~marchini/#software.
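
The per-site update described under MOTIVATION can be illustrated with a small sketch. This is not the MVNcall implementation: the function names, the ridge term added to the covariance, and the toy panel are all assumptions made for illustration. The sketch fits a multivariate normal to a panel of haplotypes (flanking scaffold sites plus the target site), conditions on one sample's two scaffold haplotypes, and draws a phased pair of alleles with weights that combine the genotype likelihoods with the two conditional normal densities.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def conditional_normal(panel_scaffold, panel_target, scaffold_hap, ridge=0.1):
    """Mean and variance of the target allele given one scaffold haplotype,
    under a multivariate normal fitted to the panel haplotypes.
    The ridge term (an assumption of this sketch) keeps the covariance invertible."""
    n, m = len(panel_target), len(scaffold_hap)
    mu_s = [sum(h[j] for h in panel_scaffold) / n for j in range(m)]
    mu_t = sum(panel_target) / n
    # Covariance among scaffold sites, with the ridge on the diagonal.
    S = [[sum((h[j] - mu_s[j]) * (h[k] - mu_s[k]) for h in panel_scaffold) / n
          + (ridge if j == k else 0.0) for k in range(m)] for j in range(m)]
    # Covariance between each scaffold site and the target site.
    c = [sum((h[j] - mu_s[j]) * (t - mu_t)
             for h, t in zip(panel_scaffold, panel_target)) / n for j in range(m)]
    var_t = sum((t - mu_t) ** 2 for t in panel_target) / n + ridge
    w = solve(S, c)  # regression weights S^{-1} c
    mean = mu_t + sum(w[j] * (scaffold_hap[j] - mu_s[j]) for j in range(m))
    var = var_t - sum(w[j] * c[j] for j in range(m))
    return mean, var


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def sample_alleles(gl, cond1, cond2, rng):
    """Draw a phased pair (a1, a2) with probability proportional to
    GL[a1 + a2] * N(a1 | cond1) * N(a2 | cond2); also return the weights."""
    pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
    weights = {
        (a1, a2): gl[a1 + a2] * normal_pdf(a1, *cond1) * normal_pdf(a2, *cond2)
        for a1, a2 in pairs
    }
    drawn = rng.choices(pairs, weights=[weights[p] for p in pairs])[0]
    return drawn, weights


# Toy panel (hypothetical data): two scaffold sites per haplotype; the target
# allele happens to match the first scaffold site.
PANEL_SCAFFOLD = [[0, 0], [0, 1], [1, 0], [1, 1], [1, 1], [0, 0]]
PANEL_TARGET = [0, 0, 1, 1, 1, 0]

cond1 = conditional_normal(PANEL_SCAFFOLD, PANEL_TARGET, [1, 0])
cond2 = conditional_normal(PANEL_SCAFFOLD, PANEL_TARGET, [0, 1])
gl = [0.01, 0.98, 0.01]  # genotype likelihoods for 0, 1, 2 copies of allele 1
pair, weights = sample_alleles(gl, cond1, cond2, random.Random(1))
print(pair, weights)
```

In a full sampler this draw would be repeated across samples and iterations, with the panel re-estimated from the current haplotypes. For trios, one plausible way to respect Mendelian transmission would be to give zero weight to allele pairs that are inconsistent with the parents' current alleles; that detail is likewise an assumption rather than something stated in the abstract.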

Year:  2012        PMID: 23093610     DOI: 10.1093/bioinformatics/bts632

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  24 in total

1.  Enhanced localization of genetic samples through linkage-disequilibrium correction.

Authors:  Yael Baran; Inés Quintela; Angel Carracedo; Bogdan Pasaniuc; Eran Halperin
Journal:  Am J Hum Genet       Date:  2013-05-30       Impact factor: 11.025

2.  On the design and analysis of next-generation sequencing genotyping for a cohort with haplotype-informative reads.

Authors:  Degui Zhi; Nianjun Liu; Kui Zhang
Journal:  Methods       Date:  2015-01-30       Impact factor: 3.608

3.  InPhaDel: integrative shotgun and proximity-ligation sequencing to phase deletions with single nucleotide polymorphisms.

Authors:  Anand Patel; Peter Edge; Siddarth Selvaraj; Vikas Bansal; Vineet Bafna
Journal:  Nucleic Acids Res       Date:  2016-04-21       Impact factor: 16.971

4.  Whole-genome sequence variation, population structure and demographic history of the Dutch population.

Authors: 
Journal:  Nat Genet       Date:  2014-06-29       Impact factor: 38.330

5.  Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes.

Authors:  Jana Ebler; Peter Ebert; Wayne E Clarke; Tobias Rausch; Peter A Audano; Torsten Houwaart; Yafei Mao; Jan O Korbel; Evan E Eichler; Michael C Zody; Alexander T Dilthey; Tobias Marschall
Journal:  Nat Genet       Date:  2022-04-11       Impact factor: 38.330

Review 6.  Long-read sequencing for molecular diagnostics in constitutional genetic disorders.

Authors:  Laura K Conlin; Erfan Aref-Eshghi; Deborah A McEldrew; Minjie Luo; Ramakrishnan Rajagopalan
Journal:  Hum Mutat       Date:  2022-09-18       Impact factor: 4.700

7.  Resistance to malaria through structural variation of red blood cell invasion receptors.

Authors:  Ellen M Leffler; Gavin Band; George B J Busby; Katja Kivinen; Quang Si Le; Geraldine M Clarke; Kalifa A Bojang; David J Conway; Muminatou Jallow; Fatoumatta Sisay-Joof; Edith C Bougouma; Valentina D Mangano; David Modiano; Sodiomon B Sirima; Eric Achidi; Tobias O Apinjoh; Kevin Marsh; Carolyne M Ndila; Norbert Peshu; Thomas N Williams; Chris Drakeley; Alphaxard Manjurano; Hugh Reyburn; Eleanor Riley; David Kachala; Malcolm Molyneux; Vysaul Nyirongo; Terrie Taylor; Nicole Thornton; Louise Tilley; Shane Grimsley; Eleanor Drury; Jim Stalker; Victoria Cornelius; Christina Hubbart; Anna E Jeffreys; Kate Rowlands; Kirk A Rockett; Chris C A Spencer; Dominic P Kwiatkowski
Journal:  Science       Date:  2017-05-18       Impact factor: 47.728

8.  Detecting and characterizing genomic signatures of positive selection in global populations.

Authors:  Xuanyao Liu; Rick Twee-Hee Ong; Esakimuthu Nisha Pillai; Abier M Elzein; Kerrin S Small; Taane G Clark; Dominic P Kwiatkowski; Yik-Ying Teo
Journal:  Am J Hum Genet       Date:  2013-05-23       Impact factor: 11.025

9.  Approaches to the detection of recessive effects using next generation sequencing data from outbred populations.

Authors:  David Curtis
Journal:  Adv Appl Bioinform Chem       Date:  2013-06-11

10.  An integrative variant analysis pipeline for accurate genotype/haplotype inference in population NGS data.

Authors:  Yi Wang; James Lu; Jin Yu; Richard A Gibbs; Fuli Yu
Journal:  Genome Res       Date:  2013-01-07       Impact factor: 9.043
