Anwoy Kumar Mohanty (1), Dana Vuzman (1), Laurent Francioli (2,3), Christopher Cassa (1), Agnes Toth-Petroczy (1), Shamil Sunyaev (1,4)
1. Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
2. Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
3. Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.
4. Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
Abstract
MOTIVATION: De novo mutations (i.e. newly occurring mutations) are a predominant cause of sporadic dominant monogenic diseases and play a significant role in the genetics of complex disorders. De novo mutation studies also inform population genetics models and shed light on the biology of DNA replication and repair. Despite this broad interest, the accuracy of de novo mutation calling can still be improved.
RESULTS: We designed novoCaller, a Bayesian variant calling algorithm that uses read-level data from both the pedigree and unrelated samples. The method was extensively tested using large trio-sequencing studies, and it consistently achieved over 97% sensitivity. We applied the algorithm to 48 trio cases of suspected rare Mendelian disorders as part of the Brigham Genomic Medicine gene discovery initiative. Its application significantly reduced the resources required for manual inspection and experimental validation of the calls. Three de novo variants were found in genes known to be associated with rare disorders, leading to rapid genetic diagnosis of the probands. Another 14 variants were found in genes that are likely to explain the phenotype and could lead to novel disease-gene discovery.
AVAILABILITY AND IMPLEMENTATION: Source code implemented in C++ and Python can be downloaded from https://github.com/bgm-cwg/novoCaller.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
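To make the idea of Bayesian de novo calling from read-level trio data concrete, the following is a minimal textbook-style sketch, not novoCaller's actual model. It assumes a binomial read likelihood with a fixed sequencing error rate (`err`), Hardy-Weinberg priors on the parental genotypes with a hypothetical population alternate-allele frequency (`af`), and Mendelian transmission with a per-allele mutation rate (`mu`). All function names and default values are illustrative assumptions, and the use of unrelated samples described in the abstract is not modeled here.

```python
from itertools import product
from math import comb

GENOTYPES = (0, 1, 2)  # number of alternate alleles carried


def read_likelihood(ref, alt, genotype, err=0.01):
    """Binomial probability of the observed read counts given a genotype."""
    p_alt = {0: err, 1: 0.5, 2: 1.0 - err}[genotype]
    return comb(ref + alt, alt) * p_alt**alt * (1.0 - p_alt)**ref


def hwe_prior(genotype, af):
    """Hardy-Weinberg prior for a parental genotype (assumed, illustrative)."""
    return {0: (1 - af) ** 2, 1: 2 * af * (1 - af), 2: af**2}[genotype]


def child_prior(child, father, mother, mu):
    """Mendelian transmission; a transmitted ref allele mutates to alt with rate mu."""
    pf = father / 2 + (1 - father / 2) * mu  # P(father transmits alt)
    pm = mother / 2 + (1 - mother / 2) * mu  # P(mother transmits alt)
    return {
        0: (1 - pf) * (1 - pm),
        1: pf * (1 - pm) + (1 - pf) * pm,
        2: pf * pm,
    }[child]


def de_novo_posterior(counts, af=1e-4, mu=1e-8, err=0.01):
    """Posterior that parents are hom-ref and the child is het, given read counts.

    counts maps 'father', 'mother', 'child' to (ref_reads, alt_reads).
    """
    joint = {}
    for f, m, c in product(GENOTYPES, repeat=3):
        prior = hwe_prior(f, af) * hwe_prior(m, af) * child_prior(c, f, m, mu)
        lik = (
            read_likelihood(*counts["father"], f, err)
            * read_likelihood(*counts["mother"], m, err)
            * read_likelihood(*counts["child"], c, err)
        )
        joint[(f, m, c)] = prior * lik
    return joint[(0, 0, 1)] / sum(joint.values())


# Clean trio: no alternate reads in either parent, balanced reads in the child.
clean = {"father": (30, 0), "mother": (30, 0), "child": (15, 15)}
# Same child, but the father shows alternate reads, suggesting inheritance.
inherited = {"father": (25, 5), "mother": (30, 0), "child": (15, 15)}

print(de_novo_posterior(clean))
print(de_novo_posterior(inherited))
```

In this toy setting the first trio yields a posterior close to 1 and the second a posterior close to 0, which illustrates why read-level evidence in the parents matters: a handful of alternate reads in one parent is enough to make inheritance far more plausible than a new mutation.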