Qiang Wei, Xiaowei Zhan, Xue Zhong, Yongzhuang Liu, Yujun Han, Wei Chen, Bingshan Li. Affiliations: Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN, USA; Quantitative Biomedical Research Center, University of Texas Southwestern Medical Center, Dallas, TX, USA; Center for Quantitative Sciences, Vanderbilt University, Nashville, TN, USA; Center for Human Genetic Variation, Duke University, Durham, NC, USA; School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China; and Department of Pediatrics, University of Pittsburgh, Pittsburgh, PA, USA.
Abstract
MOTIVATION: Spontaneous (de novo) mutations play an important role in the etiology of a range of complex diseases. Identifying de novo mutations (DNMs) in sporadic cases provides an effective strategy to find genes or genomic regions implicated in the genetics of disease. High-throughput next-generation sequencing enables genome- or exome-wide detection of DNMs by sequencing parents-proband trios. However, it is challenging to sift true mutations from the massive amount of noise caused by sequencing errors and alignment artifacts. A critical limitation of existing methods is that they assume the same pre-specified mutation rate for all genomic regions, which has a significant impact on DNM calling accuracy.

RESULTS: In this study, we developed and implemented a novel Bayesian framework for DNM calling in trios (TrioDeNovo), which overcomes this limitation by disentangling prior mutation rates from the evaluation of the likelihood of the data, so that flexible priors can be adjusted post hoc at different genomic sites. Through extensive simulations and application to real data, we showed that this new method has improved sensitivity and specificity over existing methods, and provides a flexible framework to further improve the efficiency by incorporating proper priors. The accuracy is further improved using effective filtering based on sequence alignment characteristics.

AVAILABILITY AND IMPLEMENTATION: The C++ source code implementing TrioDeNovo is freely available at https://medschool.vanderbilt.edu/cgg.

CONTACT: bingshan.li@vanderbilt.edu

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
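The idea of separating the prior mutation rate from the data likelihood can be illustrated with a small Python sketch. It is not TrioDeNovo's implementation and does not reproduce its likelihood model. It only assumes that a data-only evidence score (here a hypothetical log10 Bayes factor comparing "de novo" against "no mutation") has already been computed for a site, and shows how a site-specific prior could then be applied afterwards through the standard relation posterior odds = Bayes factor × prior odds. The function names and all numeric values are illustrative assumptions.

```python
import math


def log10_odds_to_probability(log10_odds: float) -> float:
    """Convert log10 odds to a probability, guarding against float overflow."""
    if log10_odds > 300:
        return 1.0
    if log10_odds < -300:
        return 0.0
    return 1.0 / (1.0 + 10.0 ** (-log10_odds))


def posterior_dnm_probability(log10_bayes_factor: float,
                              prior_mutation_rate: float) -> float:
    """Combine a data-only Bayes factor with a prior chosen after the fact.

    log10_bayes_factor: hypothetical evidence score from the read data alone,
        i.e. log10 of P(data | de novo) / P(data | no mutation).
    prior_mutation_rate: assumed prior probability of a DNM at this site.
    """
    if not 0.0 < prior_mutation_rate < 1.0:
        raise ValueError("prior_mutation_rate must lie strictly between 0 and 1")
    prior_odds = prior_mutation_rate / (1.0 - prior_mutation_rate)
    # Posterior odds = Bayes factor * prior odds (addition in log10 space).
    log10_posterior_odds = log10_bayes_factor + math.log10(prior_odds)
    return log10_odds_to_probability(log10_posterior_odds)


# The same data-only evidence, re-scored under two illustrative priors.
evidence = 8.0  # hypothetical log10 Bayes factor for one candidate site
p_baseline = posterior_dnm_probability(evidence, 1e-8)  # assumed uniform rate
p_elevated = posterior_dnm_probability(evidence, 1e-7)  # assumed higher-rate site
print(f"baseline prior: {p_baseline:.3f}, elevated prior: {p_elevated:.3f}")
```

With these made-up numbers, the same evidence gives a posterior of about 0.5 under the lower prior and about 0.91 under the tenfold higher prior. Because the evidence score does not depend on the prior, the likelihood evaluation need not be repeated when the prior changes.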