Assessing single nucleotide variant detection and genotype calling on whole-genome sequenced individuals.

Anthony Youzhi Cheng, Yik-Ying Teo, Rick Twee-Hee Ong.

Abstract

MOTIVATION: Whole-genome sequencing (WGS) is now routinely used to detect and identify genetic variants, particularly single nucleotide polymorphisms (SNPs) in humans, and has provided valuable new insights into human diversity, population histories and genetic association studies of traits and diseases. However, these applications rely on accurate detection and genotype calling of the polymorphisms present in the sequenced samples. To minimize cost, the majority of current WGS studies, including the 1000 Genomes Project (1 KGP), have adopted low-coverage sequencing of a large number of samples, and such designs have inadvertently influenced the development of variant calling methods for WGS data. Assessments of variant accuracy are usually performed on the same set of low-coverage individuals or on a smaller number of deeply sequenced individuals. It is thus unclear how these variant calling methods would fare on a dataset of ∼100 samples, drawn from a population that is not part of the 1 KGP and sequenced at various coverage depths.
AVAILABILITY AND IMPLEMENTATION: Using down-sampling of the sequencing reads obtained from the Singapore Sequencing Malay Project (SSMP), together with a set of SNP calls from the same individuals genotyped on the Illumina Omni1-Quad array, we assessed the sensitivity of SNP detection, the accuracy of genotype calls and the variant accuracy of six commonly used variant calling methods: GATK, SAMtools, Consensus Assessment of Sequence and Variation (CASAVA), VarScan, glfTools and SOAPsnp. The results indicate that at 5× coverage depth, the multi-sample callers GATK and SAMtools yield the best accuracy, particularly if the study samples are called together with a large number of individuals such as those from the 1000 Genomes Project. If study samples are sequenced at a high coverage depth such as 30×, CASAVA has the highest variant accuracy among the variant callers assessed.
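As an illustration of the kind of comparison described above, the following is a minimal sketch of how SNP detection sensitivity and genotype concordance might be computed for one caller against array genotypes. It is not the authors' pipeline: the function names, the genotype encoding (alternate-allele counts 0, 1 or 2) and the exact metric definitions are assumptions made for this example, and the abstract does not specify how the study defined them.

```python
from math import nan
from typing import Dict, Tuple

Site = Tuple[str, int]        # (chromosome, position); hypothetical site key
Genotypes = Dict[Site, int]   # assumed encoding: alternate-allele count 0, 1 or 2


def sensitivity(array_gt: Genotypes, seq_gt: Genotypes) -> float:
    """Fraction of array sites carrying an alternate allele that the
    sequencing-based caller also reports as variant (assumed definition)."""
    variant_sites = [site for site, gt in array_gt.items() if gt > 0]
    if not variant_sites:
        return nan
    # A site missing from the caller's output is treated as not detected.
    detected = sum(1 for site in variant_sites if seq_gt.get(site, 0) > 0)
    return detected / len(variant_sites)


def genotype_concordance(array_gt: Genotypes, seq_gt: Genotypes) -> float:
    """Fraction of sites genotyped by both platforms whose genotypes agree
    (assumed definition)."""
    shared = [site for site in array_gt if site in seq_gt]
    if not shared:
        return nan
    matches = sum(1 for site in shared if array_gt[site] == seq_gt[site])
    return matches / len(shared)


# Toy data for one individual; the values are illustrative, not study results.
array_calls = {("1", 100): 1, ("1", 200): 2, ("1", 300): 0, ("2", 50): 1}
caller_calls = {("1", 100): 1, ("1", 200): 1, ("1", 300): 0}

print(f"sensitivity = {sensitivity(array_calls, caller_calls):.3f}")
print(f"concordance = {genotype_concordance(array_calls, caller_calls):.3f}")
```

In a comparison of this kind, the same two functions would be applied to each caller's output at each down-sampled coverage depth, so that callers can be ranked under identical conditions.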
© The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Year:  2014        PMID: 24558117     DOI: 10.1093/bioinformatics/btu067

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  35 in total

1.  Simple, rapid and accurate genotyping-by-sequencing from aligned whole genomes with ArrayMaker.

Authors:  Cali E Willet; Bianca Haase; Michael A Charleston; Claire M Wade
Journal:  Bioinformatics       Date:  2014-10-21       Impact factor: 6.937

Review 2.  From next-generation resequencing reads to a high-quality variant data set.

Authors:  S P Pfeifer
Journal:  Heredity (Edinb)       Date:  2016-10-19       Impact factor: 3.821

Review 3.  Toward better understanding of artifacts in variant calling from high-coverage samples.

Authors:  Heng Li
Journal:  Bioinformatics       Date:  2014-06-27       Impact factor: 6.937

4.  Reply to Wang et al.: Sequencing datasets do not refute Central Asian domestication origin of dogs.

Authors:  Laura M Shannon; Ryan H Boyko; Marta Castelhano; Elizabeth Corey; Jessica J Hayward; Corin McLean; Michelle E White; Mounir R Abi Said; Baddley A Anita; Nono Ikombe Bondjengo; Jorge Calero; Ana Galov; Marius Hedimbi; Bulu Imam; Rajashree Khalap; Douglas Lally; Andrew Masta; Kyle C Oliveira; Lucía Pérez; Julia Randall; Nguyen Minh Tam; Francisco J Trujillo-Cornejo; Carlos Valeriano; Nathan B Sutter; Rory J Todhunter; Carlos D Bustamante; Adam R Boyko
Journal:  Proc Natl Acad Sci U S A       Date:  2016-04-20       Impact factor: 11.205

5.  Set-theory based benchmarking of three different variant callers for targeted sequencing.

Authors:  Jose Arturo Molina-Mora; Mariela Solano-Vargas
Journal:  BMC Bioinformatics       Date:  2021-01-07       Impact factor: 3.169

Review 6.  Paediatric genomics: diagnosing rare disease in children.

Authors:  Caroline F Wright; David R FitzPatrick; Helen V Firth
Journal:  Nat Rev Genet       Date:  2018-02-05       Impact factor: 53.242

7.  MycoSNP: A Portable Workflow for Performing Whole-Genome Sequencing Analysis of Candida auris.

Authors:  Ujwal R Bagal; John Phan; Rory M Welsh; Elizabeth Misas; Darlene Wagner; Lalitha Gade; Anastasia P Litvintseva; Christina A Cuomo; Nancy A Chow
Journal:  Methods Mol Biol       Date:  2022

8.  Whole-Genome Sequencing to Evaluate the Resistance Landscape Following Antimalarial Treatment Failure With Fosmidomycin-Clindamycin.

Authors:  Ann M Guggisberg; Sesh A Sundararaman; Miguel Lanaspa; Cinta Moraleda; Raquel González; Alfredo Mayor; Pau Cisteró; David Hutchinson; Peter G Kremsner; Beatrice H Hahn; Quique Bassat; Audrey R Odom
Journal:  J Infect Dis       Date:  2016-07-20       Impact factor: 5.226

9.  Simulation of African and non-African low and high coverage whole genome sequence data to assess variant calling approaches.

Authors:  Shatha Alosaimi; Noëlle van Biljon; Denis Awany; Prisca K Thami; Joel Defo; Jacquiline W Mugo; Christian D Bope; Gaston K Mazandu; Nicola J Mulder; Emile R Chimusa
Journal:  Brief Bioinform       Date:  2021-07-20       Impact factor: 11.622

Review 10.  Whole-genome sequencing as a first-tier diagnostic framework for rare genetic diseases.

Authors:  Haseeb Nisar; Bilal Wajid; Samiah Shahid; Faria Anwar; Imran Wajid; Asia Khatoon; Mian Usman Sattar; Saima Sadaf
Journal:  Exp Biol Med (Maywood)       Date:  2021-09-15