
Accurate detection and genotyping of SNPs utilizing population sequencing data.

Vikas Bansal, Olivier Harismendy, Ryan Tewhey, Sarah S Murray, Nicholas J Schork, Eric J Topol, Kelly A Frazer.

Abstract

Next-generation sequencing technologies have made it possible to sequence targeted regions of the human genome in hundreds of individuals. Deep sequencing is a powerful approach for discovering the complete spectrum of DNA sequence variants in functionally important genomic intervals. Current methods for single nucleotide polymorphism (SNP) detection are designed to detect SNPs from the sequence data of a single individual. Here, we describe a novel method, SNIP-Seq (single nucleotide polymorphism identification from population sequence data), that leverages sequence data from a population of individuals to detect SNPs and assign genotypes to individuals. To evaluate our method, we used sequence data from a 200-kilobase (kb) region on chromosome 9p21 of the human genome. This region was sequenced in 48 individuals (five sequenced in duplicate) using the Illumina GA platform. Using this data set, we demonstrate that our method is highly accurate for detecting variants and can filter out false SNPs that are attributable to sequencing errors. The concordance of sequencing-based genotype assignments between duplicate samples was 98.8%. The 200-kb region was also sequenced independently, to a high depth of coverage, using two sequence pools containing the 48 individuals. Many of the novel SNPs identified by SNIP-Seq from the individual sequencing were validated by the pooled sequencing data and were subsequently confirmed by Sanger sequencing. We estimate that SNIP-Seq achieves a false-positive rate of approximately 2%, lower than the false-positive rate of existing methods that do not use population sequence data. Collectively, these results suggest that analysis of population sequencing data is a powerful approach for the accurate detection of SNPs and the assignment of genotypes to individual samples.
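The duplicate-sample concordance reported in the abstract can be illustrated with a minimal Python sketch. This is not the SNIP-Seq implementation, and the abstract does not specify how concordance was computed; the sketch assumes concordance means the fraction of sites called in both replicates that received the same unordered genotype. The function name `genotype_concordance`, the site positions, and the genotype calls are hypothetical and chosen only for illustration.

```python
from typing import Mapping, Optional


def genotype_concordance(
    rep_a: Mapping[int, Optional[str]],
    rep_b: Mapping[int, Optional[str]],
) -> Optional[float]:
    """Fraction of sites called in both replicates with identical genotypes.

    Genotypes are two-letter strings such as "AG"; allele order is ignored,
    so "AG" and "GA" count as the same call. Sites with a missing call
    (None) in either replicate are excluded from the comparison.
    """
    shared = [
        site
        for site in rep_a
        if rep_a[site] is not None and rep_b.get(site) is not None
    ]
    if not shared:
        return None  # nothing to compare
    matches = sum(
        1 for site in shared if sorted(rep_a[site]) == sorted(rep_b[site])
    )
    return matches / len(shared)


# Hypothetical genotype calls for one individual sequenced in duplicate.
replicate_1 = {101: "AA", 205: "AG", 317: "GG", 422: "CT", 530: None}
replicate_2 = {101: "AA", 205: "GA", 317: "AG", 422: "CT", 530: "TT"}

concordance = genotype_concordance(replicate_1, replicate_2)
print(f"concordance = {concordance:.2f}")
```

In this toy example, four sites are called in both replicates and three of them agree, so the sketch reports a concordance of 0.75. Whether uncalled sites should be excluded or counted as discordant is a design choice the abstract does not settle.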


Year:  2010        PMID: 20150320      PMCID: PMC2847757          DOI: 10.1101/gr.100040.109

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043


