Adrian Tan (1), Gonçalo R Abecasis (1), Hyun Min Kang (1)
(1) Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA.
Abstract
A genetic variant can be represented in the Variant Call Format (VCF) in multiple different ways. Inconsistent representation of variants between variant callers and analyses magnifies discrepancies between them and complicates variant filtering and duplicate removal. We present a software tool, vt normalize, that normalizes the representation of genetic variants in the VCF. We formally define variant normalization as the consistent representation of genetic variants in an unambiguous and concise way, and we derive a simple general algorithm to enforce it. We demonstrate the inconsistent representation of variants across existing sequence analysis tools and show that our tool facilitates integration of diverse variant types and call sets.
AVAILABILITY AND IMPLEMENTATION: The source code is available for download at http://github.com/atks/vt. More detailed documentation is available at http://genome.sph.umich.edu/wiki/Variant_Normalization.
CONTACT: hmkang@umich.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
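The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way to make a variant left-aligned and parsimonious (as short as possible without leaving an allele empty). It is not the vt source code. The function name `normalize`, its arguments, and the toy reference sequence are assumptions made for illustration; positions are 1-based as in the VCF, and the first allele is taken to be the reference allele.

```python
def normalize(pos, alleles, ref_seq):
    """Return (pos, alleles) in a left-aligned, parsimonious form.

    Illustrative sketch only (hypothetical helper, not the vt implementation).
    pos     : 1-based position of the first base of the reference allele
    alleles : [ref, alt1, alt2, ...] as strings; alleles[0] must match ref_seq
    ref_seq : reference sequence of the contig as a string
    """
    alleles = list(alleles)
    if len(set(alleles)) < 2:
        raise ValueError("need at least two distinct alleles")

    while True:
        changed = False
        # If every allele ends with the same base, drop that base.
        if all(alleles) and len({a[-1] for a in alleles}) == 1:
            alleles = [a[:-1] for a in alleles]
            changed = True
        # If an allele became empty, extend all alleles one base to the left.
        if not all(alleles):
            if pos == 1:
                # No base to the left at the contig start: pad with the
                # following reference base instead (VCF convention) and stop.
                nxt = ref_seq[len(alleles[0])]
                alleles = [a + nxt for a in alleles]
                break
            pos -= 1
            left = ref_seq[pos - 1]
            alleles = [left + a for a in alleles]
            changed = True
        if not changed:
            break

    # Drop shared leading bases while every allele keeps at least one base.
    while min(len(a) for a in alleles) >= 2 and len({a[0] for a in alleles}) == 1:
        alleles = [a[1:] for a in alleles]
        pos += 1
    return pos, alleles


# Toy reference (an assumption for this example): a CA repeat flanked by Gs.
REF = "GGGCACACACAGGG"

# Three different encodings of the same CA deletion.
a = normalize(6, ["CAC", "C"], REF)      # not left-aligned
b = normalize(3, ["GCACA", "GCA"], REF)  # extra bases on the right
c = normalize(2, ["GGCA", "GG"], REF)    # extra base on the left
```

In this sketch all three encodings reduce to the same record, which is what makes duplicate removal and call-set comparison straightforward once every variant has been normalized.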