Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays.

Vandhana Krishnan, Sowmithri Utiramerur, Zena Ng, Somalee Datta, Michael P Snyder, Euan A Ashley.

Abstract

BACKGROUND: Benchmarking the performance of complex analytical pipelines is an essential part of developing Lab Developed Tests (LDT). Reference samples and benchmark calls published by the Genome in a Bottle (GIAB) consortium have enabled the evaluation of analytical methods. The performance of such methods is not uniform across genomic regions of interest and variant types. Several benchmarking tools, such as hap.py, vcfeval, and vcflib, are available to assess the analytical performance characteristics of variant calling algorithms. However, assessing the performance characteristics of an overall LDT assay still requires stringing together several such tools and relies on experienced bioinformaticians to interpret the results. In addition, these tools depend on the hardware, operating system, and other software libraries, making it impossible to reliably repeat the analytical assessment when any underlying dependency of the assay changes. Here we present a scalable, reproducible, cloud-based benchmarking workflow that is independent of the laboratory, the technician executing the workflow, and the underlying compute hardware. The workflow can be used to rapidly and continually assess the performance of LDT assays across their regions of interest and reportable range, using a broad set of benchmarking samples.
RESULTS: The benchmarking workflow was used to evaluate the performance characteristics of secondary analysis pipelines commonly used by clinical genomics laboratories in their LDT assays, such as the GATK HaplotypeCaller v3.7 and the SpeedSeq workflow based on FreeBayes v0.9.10. Five reference sample truth sets generated by the Genome in a Bottle (GIAB) consortium, six samples from the Personal Genome Project (PGP), and several samples with validated clinically relevant variants from the Centers for Disease Control were used in this work. The performance characteristics were evaluated and compared for multiple reportable ranges, such as the whole exome and the clinical exome.
CONCLUSIONS: We have implemented a benchmarking workflow for clinical diagnostic laboratories that generates metrics such as specificity, precision, and sensitivity for germline SNPs and InDels within a reportable range, using whole exome or whole genome sequencing data. By combining these benchmarking results with validation against known variants of clinical significance in publicly available cell lines, we were able to establish the performance of variant calling pipelines in a clinical setting.
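As an illustration of the metrics named above, the following minimal Python sketch computes precision and sensitivity (recall) for variant calls restricted to a reportable range. It is not the authors' workflow. The `Variant` type, the `in_range` and `benchmark` helpers, and the example data are all hypothetical, and the sketch assumes exact matching on chromosome, position, and alleles. Tools such as hap.py and vcfeval perform a more careful comparison that accounts for different representations of the same variant. Specificity is omitted because it additionally requires a count of true negatives, which this sketch does not model.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a real VCF parser)."""
    chrom: str
    pos: int  # 1-based position, as in VCF
    ref: str
    alt: str


# Reportable range as BED-style intervals: (chrom, start, end),
# 0-based and half-open. This layout is an assumption of the sketch.
Region = Tuple[str, int, int]


def in_range(variant: Variant, regions: List[Region]) -> bool:
    """Return True if the variant's 1-based position lies in any region."""
    return any(
        chrom == variant.chrom and start < variant.pos <= end
        for chrom, start, end in regions
    )


def benchmark(
    truth: Iterable[Variant],
    calls: Iterable[Variant],
    regions: List[Region],
) -> Dict[str, float]:
    """Compare calls against a truth set inside the reportable range."""
    truth_in = {v for v in truth if in_range(v, regions)}
    calls_in = {v for v in calls if in_range(v, regions)}

    tp = len(truth_in & calls_in)  # called and present in the truth set
    fp = len(calls_in - truth_in)  # called but absent from the truth set
    fn = len(truth_in - calls_in)  # in the truth set but not called

    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    truth = [Variant("chr1", 100, "A", "G"), Variant("chr1", 200, "C", "T"),
             Variant("chr2", 50, "G", "A")]
    calls = [Variant("chr1", 100, "A", "G"), Variant("chr1", 150, "T", "C"),
             Variant("chr2", 50, "G", "A")]
    print(benchmark(truth, calls, [("chr1", 0, 1000)]))
```

Restricting both the truth set and the call set to the same intervals before counting keeps the metrics specific to the reportable range, so variants outside it (here, the one on chr2) do not affect the result.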

Keywords:  Benchmarking; Docker; GIAB reference genomes; Germline variants; Lab developed tests; Precision; Recall; Truth set; Workflow

Year:  2021        PMID: 33627090      PMCID: PMC7903625          DOI: 10.1186/s12859-020-03934-3

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169

