Set-theory based benchmarking of three different variant callers for targeted sequencing.

Jose Arturo Molina-Mora, Mariela Solano-Vargas.

Abstract

BACKGROUND: Next-generation sequencing (NGS) technologies have improved the study of hereditary diseases. Because evaluating bioinformatics pipelines is not straightforward, NGS demands effective strategies for analyzing data, which is of paramount importance for decision making in a clinical scenario. Following the benchmarking framework of the Global Alliance for Genomics and Health (GA4GH), we implemented a new, simple, and user-friendly set-theory based method to assess variant callers using a gold standard variant set and high confidence regions. As a model, we used TruSight Cardio kit sequencing data of the reference genome NA12878. This targeted sequencing kit is used to identify variants in key genes related to Inherited Cardiac Conditions (ICCs), a group of cardiovascular diseases with high rates of morbidity and mortality.
RESULTS: We implemented and compared three variant calling pipelines (Isaac, Freebayes, and VarScan). Performance metrics computed with our set-theory approach showed high-resolution pipelines and revealed: (1) a perfect recall of 1.000 for all three pipelines, (2) very high precision values, i.e. 0.987 for Freebayes, 0.928 for VarScan, and 1.000 for Isaac, when compared with the reference material, and (3) a ROC curve analysis with AUC > 0.94 in all cases. Moreover, significant differences were observed among the three pipelines. In general, the results indicate that the three pipelines were able to recognize the expected variants in the gold standard data set.
CONCLUSIONS: Using our set-theory approach to calculate metrics, the three selected pipelines identified the expected ICC-related variants, but the results were completely dependent on the algorithms. We emphasize the importance of assessing pipelines with gold standard materials to achieve the most reliable results for clinical application.
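
The abstract does not spell out how the set-theory metrics are computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each variant can be represented as a (chromosome, position, ref, alt) tuple, that high confidence regions are closed intervals, and that true positives, false positives, and false negatives are the intersection and the two differences of the call set and the gold standard set. All names and the toy data are hypothetical.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical variant key: two variants match only if all fields agree."""
    chrom: str
    pos: int
    ref: str
    alt: str


def in_regions(variant, regions):
    """Return True if the variant lies inside any (chrom, start, end) interval."""
    return any(
        chrom == variant.chrom and start <= variant.pos <= end
        for chrom, start, end in regions
    )


def benchmark(calls, truth, regions):
    """Compare a call set with a gold standard set using set operations."""
    # Restrict both sets to the high confidence regions first.
    calls_hc = {v for v in calls if in_regions(v, regions)}
    truth_hc = {v for v in truth if in_regions(v, regions)}

    tp = calls_hc & truth_hc   # called and in the gold standard
    fp = calls_hc - truth_hc   # called but absent from the gold standard
    fn = truth_hc - calls_hc   # in the gold standard but not called

    precision = len(tp) / len(calls_hc) if calls_hc else 0.0
    recall = len(tp) / len(truth_hc) if truth_hc else 0.0
    return {"tp": len(tp), "fp": len(fp), "fn": len(fn),
            "precision": precision, "recall": recall}


# Toy data (made up for illustration; not taken from the study).
regions = [("chr1", 100, 200), ("chr2", 500, 600)]
truth = {
    Variant("chr1", 150, "A", "G"),
    Variant("chr1", 180, "C", "T"),
    Variant("chr2", 550, "G", "A"),
    Variant("chr3", 10, "T", "C"),    # outside the regions, ignored
}
calls = {
    Variant("chr1", 150, "A", "G"),
    Variant("chr1", 180, "C", "T"),
    Variant("chr2", 550, "G", "A"),
    Variant("chr1", 120, "T", "C"),   # inside the regions, not in truth
    Variant("chr2", 900, "A", "T"),   # outside the regions, ignored
}

result = benchmark(calls, truth, regions)
print(result)
```

In this toy case every gold standard variant inside the regions is recovered, so recall is 1.0, while the single extra call lowers precision to 0.75. Exact tuple matching is a simplification: equivalent variants written with different representations would not match without prior normalization.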

Keywords:  Freebayes; Inherited cardiac conditions; Isaac; Next-generation sequencing; Set-theory; VarScan; Variant calling

Year:  2021        PMID: 33413082      PMCID: PMC7791862          DOI: 10.1186/s12859-020-03926-3

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


