
Confounded by sequencing depth in association studies of rare alleles.

Chad Garner

Abstract

Next-generation DNA sequencing technologies are facilitating large-scale association studies of rare genetic variants. The depth of sequence read coverage is an important experimental variable in next-generation technologies, and it is a major determinant of the quality of genotype calls generated from sequence data. When case and control samples are sequenced separately or in different proportions across batches, they are unlikely to be matched on sequencing read depth, and a differential misclassification of genotypes can result, causing confounding and an increased false-positive rate. Data from Pilot Study 3 of the 1000 Genomes Project were used to demonstrate that a difference between the mean sequencing read depth of case and control samples can result in false-positive association for rare and uncommon variants, even when the mean coverage depth exceeds 30× in both groups. The degree of confounding and the inflation of the false-positive rate depended on how much the mean depth differed between the case and control groups. A logistic regression model was used to test for association between case-control status and the cumulative number of alleles in a collapsed set of rare and uncommon variants. Including each individual's mean sequence read depth across the variant sites in the logistic regression model nearly eliminated the confounding effect and the inflated false-positive rate. Furthermore, accounting for potential genotyping error by modeling the probability of the heterozygote genotype calls in the regression analysis had a relatively minor but beneficial effect on the statistical results.
© 2011 Wiley-Liss, Inc.
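
The adjustment described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's analysis or data: the group depths (30× vs. 40×), the 20 variant sites, the 0.1 carrier probability, and the linear call-probability function `depth / 50` are all hypothetical values chosen only to make the effect visible. It fits a logistic regression of case-control status on a collapsed allele count, first without and then with each individual's read depth as a covariate, using a small hand-written Newton-Raphson fitter so that no third-party library is assumed.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def score_and_information(X, y, beta):
    """Gradient of the log-likelihood and the Fisher information matrix."""
    p = len(beta)
    grad = [0.0] * p
    info = [[0.0] * p for _ in range(p)]
    for xi, yi in zip(X, y):
        mu = sigmoid(sum(b * x for b, x in zip(beta, xi)))
        w = mu * (1.0 - mu)
        for j in range(p):
            grad[j] += (yi - mu) * xi[j]
            for k in range(p):
                info[j][k] += w * xi[j] * xi[k]
    return grad, info


def fit_logistic(X, y, iters=25):
    """Newton-Raphson fit; returns (coefficients, Wald z statistics)."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad, info = score_and_information(X, y, beta)
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    _, info = score_and_information(X, y, beta)
    z = []
    for j in range(p):
        unit = [1.0 if k == j else 0.0 for k in range(p)]
        var_j = solve(info, unit)[j]  # j-th diagonal entry of the inverse information
        z.append(beta[j] / math.sqrt(var_j))
    return beta, z


def simulate(n_per_group=1000, seed=1):
    """Null simulation: no true association, but cases are sequenced deeper."""
    rng = random.Random(seed)
    rows = []
    for status, group_depth in ((0, 30.0), (1, 40.0)):  # hypothetical depths
        for _ in range(n_per_group):
            depth = rng.gauss(group_depth, 4.0)
            # True rare-allele count: same distribution in cases and controls.
            true_alleles = sum(rng.random() < 0.1 for _ in range(20))
            # Hypothetical call probability that rises with depth.
            p_call = min(1.0, max(0.0, depth / 50.0))
            called = sum(rng.random() < p_call for _ in range(true_alleles))
            rows.append((status, called, depth))
    return rows


rows = simulate()
y = [r[0] for r in rows]
mean_depth = sum(r[2] for r in rows) / len(rows)

# Model 1: status ~ collapsed allele count (depth ignored).
X_naive = [[1.0, r[1]] for r in rows]
# Model 2: status ~ collapsed allele count + centered read depth.
X_adj = [[1.0, r[1], r[2] - mean_depth] for r in rows]

_, z_naive = fit_logistic(X_naive, y)
_, z_adj = fit_logistic(X_adj, y)

print("Wald z for allele count, depth ignored: ", round(z_naive[1], 2))
print("Wald z for allele count, depth adjusted:", round(z_adj[1], 2))
```

Because the simulated allele counts differ between groups only through depth-dependent calling, the unadjusted model is expected to report a spurious association, while the depth-adjusted model should bring the allele-count statistic back toward its null distribution. The exaggerated call-probability function overstates the size of the effect relative to real data at 30× or more.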

Year: 2011; PMID: 21328616; PMCID: PMC3129358; DOI: 10.1002/gepi.20574

Source DB: PubMed; Journal: Genet Epidemiol; ISSN: 0741-0395; Impact factor: 2.135


