Best practices for benchmarking germline small-variant calls in human genomes.

Peter Krusche, Len Trigg, Paul C Boutros, Christopher E Mason, Francisco M De La Vega, Benjamin L Moore, Mar Gonzalez-Porta, Michael A Eberle, Zivana Tezak, Samir Lababidi, Rebecca Truty, George Asimenos, Birgit Funke, Mark Fleharty, Brad A Chapman, Marc Salit, Justin M Zook.

Abstract

Standardized benchmarking approaches are required to assess the accuracy of variants called from sequence data. Although variant-calling tools and the metrics used to assess their performance continue to improve, important challenges remain. Here, as part of the Global Alliance for Genomics and Health (GA4GH), we present a benchmarking framework for variant calling. We provide guidance on how to match variant calls with different representations, define standard performance metrics, and stratify performance by variant type and genome context. We describe limitations of high-confidence calls and regions that can be used as truth sets (for example, single-nucleotide variant concordance of two methods is 99.7% inside versus 76.5% outside high-confidence regions). Our web-based app enables comparison of variant calls against truth sets to obtain a standardized performance report. Our approach has been piloted in the PrecisionFDA variant-calling challenges to identify the best-in-class variant-calling methods within high-confidence regions. Finally, we recommend a set of best practices for using our tools and evaluating the results.
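The abstract refers to standard performance metrics stratified by variant type but does not spell them out. As an illustration only, the sketch below assumes the conventional definitions of precision, recall, and F1 computed from true-positive, false-positive, and false-negative counts. The `StratumCounts` class, the `compute_metrics` function, and the counts themselves are hypothetical and are not taken from the paper or its tools; using a single true-positive count for both precision and recall is also a simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StratumCounts:
    """Hypothetical per-stratum counts from comparing calls to a truth set."""
    tp: int  # calls that match a truth variant
    fp: int  # calls with no matching truth variant
    fn: int  # truth variants with no matching call


def compute_metrics(c: StratumCounts) -> dict[str, Optional[float]]:
    """Return precision, recall, and F1; None where a metric is undefined."""
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) > 0 else None
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) > 0 else None
    if precision is None or recall is None or (precision + recall) == 0:
        f1 = None
    else:
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up counts, stratified by variant type (illustrative values only).
counts_by_type = {
    "SNV": StratumCounts(tp=990, fp=10, fn=10),
    "INDEL": StratumCounts(tp=80, fp=20, fn=20),
}

metrics_by_type = {name: compute_metrics(c) for name, c in counts_by_type.items()}

for name, m in metrics_by_type.items():
    print(name, m)
```

Reporting each stratum separately, rather than pooling all counts, keeps a large and easy class such as SNVs from masking weaker performance in a smaller class such as indels.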

Year:  2019        PMID: 30858580      PMCID: PMC6699627          DOI: 10.1038/s41587-019-0054-x

Source DB:  PubMed          Journal:  Nat Biotechnol        ISSN: 1087-0156            Impact factor:   54.908


  67 in total

1.  Assessment of human diploid genome assembly with 10x Linked-Reads data.

Authors:  Lu Zhang; Xin Zhou; Ziming Weng; Arend Sidow
Journal:  Gigascience       Date:  2019-11-01       Impact factor: 6.524

2.  Set-theory based benchmarking of three different variant callers for targeted sequencing.

Authors:  Jose Arturo Molina-Mora; Mariela Solano-Vargas
Journal:  BMC Bioinformatics       Date:  2021-01-07       Impact factor: 3.169

3.  Accurate, scalable cohort variant calls using DeepVariant and GLnexus.

Authors:  Taedong Yun; Helen Li; Pi-Chuan Chang; Michael F Lin; Andrew Carroll; Cory Y McLean
Journal:  Bioinformatics       Date:  2021-01-05       Impact factor: 6.937

4.  Assessing reproducibility of inherited variants detected with short-read whole genome sequencing.

Authors:  Bohu Pan; Luyao Ren; Vitor Onuchic; Meijian Guan; Rebecca Kusko; Steve Bruinsma; Len Trigg; Andreas Scherer; Baitang Ning; Chaoyang Zhang; Christine Glidewell-Kenney; Chunlin Xiao; Eric Donaldson; Fritz J Sedlazeck; Gary Schroth; Gokhan Yavas; Haiying Grunenwald; Haodong Chen; Heather Meinholz; Joe Meehan; Jing Wang; Jingcheng Yang; Jonathan Foox; Jun Shang; Kelci Miclaus; Lianhua Dong; Leming Shi; Marghoob Mohiyuddin; Mehdi Pirooznia; Ping Gong; Rooz Golshani; Russ Wolfinger; Samir Lababidi; Sayed Mohammad Ebrahim Sahraeian; Steve Sherry; Tao Han; Tao Chen; Tieliu Shi; Wanwan Hou; Weigong Ge; Wen Zou; Wenjing Guo; Wenjun Bao; Wenzhong Xiao; Xiaohui Fan; Yoichi Gondo; Ying Yu; Yongmei Zhao; Zhenqiang Su; Zhichao Liu; Weida Tong; Wenming Xiao; Justin M Zook; Yuanting Zheng; Huixiao Hong
Journal:  Genome Biol       Date:  2022-01-03       Impact factor: 13.583

5.  Establishing analytical validity of BeadChip array genotype data by comparison to whole-genome sequence and standard benchmark datasets.

Authors:  Praveen F Cherukuri; Melissa M Soe; David E Condon; Shubhi Bartaria; Kaitlynn Meis; Shaopeng Gu; Frederick G Frost; Lindsay M Fricke; Krzysztof P Lubieniecki; Joanna M Lubieniecka; Robert E Pyatt; Catherine Hajek; Cornelius F Boerkoel; Lynn Carmichael
Journal:  BMC Med Genomics       Date:  2022-03-14       Impact factor: 3.063

6.  Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies.

Authors:  Arang Rhie; Brian P Walenz; Sergey Koren; Adam M Phillippy
Journal:  Genome Biol       Date:  2020-09-14       Impact factor: 13.583

7.  xGAP: A python based efficient, modular, extensible and fault tolerant genomic analysis pipeline for variant discovery.

Authors:  Aditya Gorla; Brandon Jew; Luke Zhang; Jae Hoon Sul
Journal:  Bioinformatics       Date:  2021-01-08       Impact factor: 6.937

8.  Halcyon: an accurate basecaller exploiting an encoder-decoder model with monotonic attention.

Authors:  Hiroki Konishi; Rui Yamaguchi; Kiyoshi Yamaguchi; Yoichi Furukawa; Seiya Imoto
Journal:  Bioinformatics       Date:  2021-06-09       Impact factor: 6.937

9.  GeneBreaker: Variant simulation to improve the diagnosis of Mendelian rare genetic diseases.

Authors:  Phillip A Richmond; Tamar V Av-Shalom; Oriol Fornes; Bhavi Modi; Alison M Elliott; Wyeth W Wasserman
Journal:  Hum Mutat       Date:  2021-02-10       Impact factor: 4.878

10.  Reducing Sanger confirmation testing through false positive prediction algorithms.

Authors:  James M Holt; Melissa Kelly; Brett Sundlof; Ghunwa Nakouzi; David Bick; Elaine Lyon
Journal:  Genet Med       Date:  2021-03-25       Impact factor: 8.822
