Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data.

Goo Jun, Matthew Flickinger, Kurt N Hetrick, Jane M Romm, Kimberly F Doheny, Gonçalo R Abecasis, Michael Boehnke, Hyun Min Kang.

Abstract

DNA sample contamination is a serious problem in DNA sequencing studies and may result in systematic genotype misclassification and false positive associations. Although methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify within-species DNA sample contamination based on (1) a combination of sequencing reads and array-based genotype data, (2) sequence reads alone, and (3) array-based genotype data alone. Analysis of sequencing reads allows contamination detection after sequence data is generated but prior to variant calling; analysis of array-based genotype data allows contamination detection prior to generation of costly sequence data. Through a combination of analysis of in silico and experimentally contaminated samples, we show that our methods can reliably detect and estimate levels of contamination as low as 1%. We evaluate the impact of DNA contamination on genotype accuracy and propose effective strategies to screen for and prevent DNA contamination in sequencing studies.
Copyright © 2012 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
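
To make the idea of estimating contamination from sequence reads plus array genotypes concrete, here is a minimal sketch in Python. It is not the authors' model: it is a deliberately simplified illustration, and every name in it (`alt_read_prob`, `estimate_contamination`, `simulate_sites`) is hypothetical. It assumes that only sites where the array genotype is homozygous reference are used, that the per-base sequencing error rate `err` is known, and that each contaminating read carries the alternate allele independently with probability equal to the population allele frequency (as if the contaminating DNA came from a large pool). Under those assumptions, the contamination fraction `alpha` can be estimated by maximizing a binomial likelihood over a grid.

```python
import math
import random


def alt_read_prob(alpha, freq, err):
    """Probability that one read shows the alternate allele at a site
    where the sample itself is homozygous reference (assumed model)."""
    own = err  # the sample's own DNA shows the alternate allele only by error
    contam = freq * (1 - err) + (1 - freq) * err  # contaminating DNA
    return (1 - alpha) * own + alpha * contam


def log_likelihood(alpha, sites, err):
    """Binomial log-likelihood (constant terms dropped).
    Each site is a tuple (allele_frequency, alt_reads, total_reads)."""
    total = 0.0
    for freq, n_alt, n_total in sites:
        p = alt_read_prob(alpha, freq, err)
        total += n_alt * math.log(p) + (n_total - n_alt) * math.log(1 - p)
    return total


def estimate_contamination(sites, err=0.01, grid_steps=250):
    """Grid-search maximum-likelihood estimate of alpha on [0, 0.5]."""
    if not 0 < err < 0.5:
        raise ValueError("err must lie strictly between 0 and 0.5")
    best_alpha, best_ll = 0.0, -math.inf
    for i in range(grid_steps + 1):
        alpha = 0.5 * i / grid_steps
        ll = log_likelihood(alpha, sites, err)
        if ll > best_ll:
            best_alpha, best_ll = alpha, ll
    return best_alpha


def simulate_sites(alpha, n_sites=2000, depth=30, err=0.01, seed=0):
    """Simulate read counts under the same simplified model (toy data only)."""
    rng = random.Random(seed)
    sites = []
    for _ in range(n_sites):
        freq = rng.uniform(0.05, 0.95)
        p = alt_read_prob(alpha, freq, err)
        n_alt = sum(rng.random() < p for _ in range(depth))
        sites.append((freq, n_alt, depth))
    return sites


if __name__ == "__main__":
    contaminated = simulate_sites(alpha=0.03)
    print("estimated alpha:", estimate_contamination(contaminated))
```

Because the simulated data follow the same simplified model used for estimation, the recovered value should land close to the simulated 3%. Real data would additionally need to account for the contaminating individual's genotype, site-specific error rates, and mapping artifacts.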

Year: 2012; PMID: 23103226; PMCID: PMC3487130; DOI: 10.1016/j.ajhg.2012.09.004

Source DB: PubMed; Journal: Am J Hum Genet; ISSN: 0002-9297; Impact factor: 11.025


