Correcting for Sample Contamination in Genotype Calling of DNA Sequence Data.

Matthew Flickinger, Goo Jun, Gonçalo R Abecasis, Michael Boehnke, Hyun Min Kang.

Abstract

DNA sample contamination is a frequent problem in DNA sequencing studies and can result in genotyping errors and reduced power for association testing. We recently described methods to identify within-species DNA sample contamination based on sequencing read data, showed that our methods can reliably detect and estimate contamination levels as low as 1%, and suggested strategies to identify and remove contaminated samples from sequencing studies. Here we propose methods to model contamination during genotype calling as an alternative to removal of contaminated samples from further analyses. We compare our contamination-adjusted calls to calls that ignore contamination and to calls based on uncontaminated data. We demonstrate that, for moderate contamination levels (5%-20%), contamination-adjusted calls eliminate 48%-77% of the genotyping errors. For lower levels of contamination, our contamination correction methods produce genotypes nearly as accurate as those based on uncontaminated data. Our contamination correction methods are useful generally, but are particularly helpful for sample contamination levels from 2% to 20%.
Copyright © 2015 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
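
The abstract describes modeling contamination during genotype calling but gives no formulas. As an illustration only, the following is a minimal sketch of one common way to express such a model: reads at a biallelic site are treated as a mixture of the intended sample (fraction 1 - alpha) and a contaminating sample (fraction alpha), and the unknown contaminant genotype is summed out using Hardy-Weinberg proportions at the population allele frequency. The function names, the symmetric ref/alt error model, and the Hardy-Weinberg prior on the called genotype are assumptions made for this sketch, not details taken from the paper.

```python
def hwe_prior(alt_freq):
    """Hardy-Weinberg genotype probabilities for 0, 1, 2 copies of the alt allele."""
    ref_freq = 1.0 - alt_freq
    return [ref_freq ** 2, 2.0 * alt_freq * ref_freq, alt_freq ** 2]


def read_prob(is_alt, err, alt_fraction):
    """Probability of one observed base, given the fraction of alt DNA in the mixture.

    Assumption: a base error simply swaps ref and alt (other bases are ignored).
    """
    p_alt_observed = alt_fraction * (1.0 - err) + (1.0 - alt_fraction) * err
    return p_alt_observed if is_alt else 1.0 - p_alt_observed


def genotype_likelihoods(reads, alpha, alt_freq):
    """Likelihood of the reads for each sample genotype g1 in (0, 1, 2).

    reads: list of (is_alt, err) pairs; alpha: contamination fraction;
    alt_freq: population alt allele frequency used for the contaminant genotype.
    """
    prior = hwe_prior(alt_freq)
    likelihoods = []
    for g1 in (0, 1, 2):
        total = 0.0
        for g2 in (0, 1, 2):  # sum over the unknown contaminant genotype
            alt_fraction = (1.0 - alpha) * g1 / 2.0 + alpha * g2 / 2.0
            prod = 1.0
            for is_alt, err in reads:
                prod *= read_prob(is_alt, err, alt_fraction)
            total += prior[g2] * prod
        likelihoods.append(total)
    return likelihoods


def call_genotype(reads, alpha, alt_freq):
    """Return (most probable genotype, normalized posteriors) under the sketch model."""
    prior = hwe_prior(alt_freq)
    likes = genotype_likelihoods(reads, alpha, alt_freq)
    unnormalized = [p * l for p, l in zip(prior, likes)]
    norm = sum(unnormalized)
    posteriors = [u / norm for u in unnormalized]
    best = max(range(3), key=lambda g: posteriors[g])
    return best, posteriors


if __name__ == "__main__":
    # Toy data: 5 alt and 25 ref reads, each with a 1% error rate.
    toy_reads = [(True, 0.01)] * 5 + [(False, 0.01)] * 25
    for alpha in (0.0, 0.10):
        print(alpha, call_genotype(toy_reads, alpha, alt_freq=0.2))
```

Setting alpha to 0 reduces the sketch to an ordinary call that ignores contamination. In the toy data, the small excess of alt reads yields a heterozygous call at alpha = 0 but a homozygous-reference call at alpha = 0.10, where the contaminant is a more plausible source of those reads.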

Year:  2015        PMID: 26235984      PMCID: PMC4573246          DOI: 10.1016/j.ajhg.2015.07.002

Source DB:  PubMed          Journal:  Am J Hum Genet        ISSN: 0002-9297            Impact factor:   11.025


  6 in total

1.  Simultaneous genotype calling and haplotype phasing improves genotype accuracy and reduces false-positive associations for genome-wide association studies.

Authors:  Brian L Browning; Zhaoxia Yu
Journal:  Am J Hum Genet       Date:  2009-12       Impact factor: 11.025

2.  Low-coverage sequencing: implications for design of complex trait association studies.

Authors:  Yun Li; Carlo Sidore; Hyun Min Kang; Michael Boehnke; Gonçalo R Abecasis
Journal:  Genome Res       Date:  2011-04-01       Impact factor: 9.043

3.  Base-calling of automated sequencer traces using phred. II. Error probabilities.

Authors:  B Ewing; P Green
Journal:  Genome Res       Date:  1998-03       Impact factor: 9.043

4.  Detecting and estimating contamination of human DNA samples in sequencing and array-based genotype data.

Authors:  Goo Jun; Matthew Flickinger; Kurt N Hetrick; Jane M Romm; Kimberly F Doheny; Gonçalo R Abecasis; Michael Boehnke; Hyun Min Kang
Journal:  Am J Hum Genet       Date:  2012-10-25       Impact factor: 11.025

5.  The Sequence Alignment/Map format and SAMtools.

Authors:  Heng Li; Bob Handsaker; Alec Wysoker; Tim Fennell; Jue Ruan; Nils Homer; Gabor Marth; Goncalo Abecasis; Richard Durbin
Journal:  Bioinformatics       Date:  2009-06-08       Impact factor: 6.937

6.  An integrated map of genetic variation from 1,092 human genomes.

Authors:  Goncalo R Abecasis; Adam Auton; Lisa D Brooks; Mark A DePristo; Richard M Durbin; Robert E Handsaker; Hyun Min Kang; Gabor T Marth; Gil A McVean
Journal:  Nature       Date:  2012-11-01       Impact factor: 49.962

  10 in total

Review 1.  Detecting Somatic Mutations in Normal Cells.

Authors:  Yanmei Dou; Heather D Gold; Lovelace J Luquette; Peter J Park
Journal:  Trends Genet       Date:  2018-05-03       Impact factor: 11.639

2.  Estimation of DNA contamination and its sources in genotyped samples.

Authors:  Gregory J M Zajac; Lars G Fritsche; Joshua S Weinstock; Susan L Dagenais; Robert H Lyons; Chad M Brummett; Gonçalo R Abecasis
Journal:  Genet Epidemiol       Date:  2019-08-26       Impact factor: 2.135

3.  The human "contaminome": bacterial, viral, and computational contamination in whole genome sequences from 1000 families.

Authors:  Brianna Chrisman; Chloe He; Jae-Yoon Jung; Nate Stockham; Kelley Paskov; Peter Washington; Dennis P Wall
Journal:  Sci Rep       Date:  2022-06-14       Impact factor: 4.996

4.  Patterns of cross-contamination in a multispecies population genomic project: detection, quantification, impact, and solutions.

Authors:  Marion Ballenghien; Nicolas Faivre; Nicolas Galtier
Journal:  BMC Biol       Date:  2017-03-29       Impact factor: 7.431

5.  Peptidomic and transcriptomic profiling of four distinct spider venoms.

Authors:  Vera Oldrati; Dominique Koua; Pierre-Marie Allard; Nicolas Hulo; Miriam Arrell; Wolfgang Nentwig; Frédérique Lisacek; Jean-Luc Wolfender; Lucia Kuhn-Nentwig; Reto Stöcklin
Journal:  PLoS One       Date:  2017-03-17       Impact factor: 3.240

6.  Indel detection from RNA-seq data: tool evaluation and strategies for accurate detection of actionable mutations.

Authors:  Zhifu Sun; Aditya Bhagwate; Naresh Prodduturi; Ping Yang; Jean-Pierre A Kocher
Journal:  Brief Bioinform       Date:  2017-11-01       Impact factor: 11.622

7.  ConFindr: rapid detection of intraspecies and cross-species contamination in bacterial whole-genome sequence data.

Authors:  Andrew J Low; Catherine D Carrillo; Adam G Koziol; Paul A Manninger; Burton Blais
Journal:  PeerJ       Date:  2019-05-31       Impact factor: 2.984

8.  Ancestry-agnostic estimation of DNA sample contamination from sequence reads.

Authors:  Fan Zhang; Matthew Flickinger; Sarah A Gagliano Taliun; Gonçalo R Abecasis; Laura J Scott; Steven A McCaroll; Carlos N Pato; Michael Boehnke; Hyun Min Kang
Journal:  Genome Res       Date:  2020-01-24       Impact factor: 9.043

9.  Integration, abundance, and transmission of mutations and transgenes in a series of CRISPR/Cas9 soybean lines.

Authors:  Jean-Michel Michno; Kamaldeep Virdi; Adrian O Stec; Junqi Liu; Xiaobo Wang; Yer Xiong; Robert M Stupar
Journal:  BMC Biotechnol       Date:  2020-02-24       Impact factor: 2.563

10.  Forensic validation of a panel of 12 SNPs for identification of Mongolian wolf and dog.

Authors:  Hong Hui Jiang; Bo Li; Yue Ma; Su Ying Bai; Thomas D Dahmer; Adrian Linacre; Yan Chun Xu
Journal:  Sci Rep       Date:  2020-08-06       Impact factor: 4.379
