On the design and analysis of next-generation sequencing genotyping for a cohort with haplotype-informative reads.

Degui Zhi, Nianjun Liu, Kui Zhang.

Abstract

Next-generation sequencing (NGS) technologies, which can provide base-pair-resolution genetic information for all types of genetic variation, are increasingly used in genetics research. However, because NGS technologies and analytics are complex and relatively costly, investigators face practical challenges in both design and analysis. These challenges are further complicated by recent methodological developments that make it possible to use haplotype information in sequencing reads. In light of these developments, we conducted comprehensive simulations to evaluate the effects of sequencing coverage, insert size of paired-end reads, and sample size on genotype calling and haplotype phasing in NGS studies. In contrast to previous studies, which typically use idealized scenarios to isolate the effects of individual design and analytic decisions, we used a complete analytical pipeline, from read mapping and variant detection to genotype calling and haplotype phasing, so that we could assess the joint effects of multiple decisions and thus make more realistic recommendations to investigators. Consistent with previous studies, we found that using haplotype information in reads can improve the accuracy of genotype calling and haplotype phasing, and we also found that a mixture of short and long insert sizes of paired-end reads may offer even greater accuracy. However, this benefit is clear only in high-coverage sequencing, where variant detection is close to perfect. Finally, we observed that linkage disequilibrium (LD)-based refinement methods do not always outperform single-site methods for genotype calling. Therefore, to make use of haplotype information in sequencing reads, analytical methods should be chosen to suit the sequencing coverage and sample size.
Copyright © 2015 Elsevier Inc. All rights reserved.

Keywords:  Genotype calling; Haplotype; Human genetics; Next-generation sequencing; Resequencing; Simulation; Study design

Year:  2015        PMID: 25644447      PMCID: PMC4437872          DOI: 10.1016/j.ymeth.2015.01.016

Source DB:  PubMed          Journal:  Methods        ISSN: 1046-2023            Impact factor:   3.608


