MOTIVATION: Population-scale sequenced cohorts are foundational resources for genetic analyses, but processing raw reads into analysis-ready cohort-level variants remains challenging. RESULTS: We introduce an open-source cohort-calling method that uses the highly-accurate caller DeepVariant and scalable merging tool GLnexus. Using callset quality metrics based on variant recall and precision in benchmark samples and Mendelian consistency in father-mother-child trios, we optimized the method across a range of cohort sizes, sequencing methods, and sequencing depths. The resulting callsets show consistent quality improvements over those generated using existing best practices with reduced cost. We further evaluate our pipeline in the deeply sequenced 1000 Genomes Project (1KGP) samples and show superior callset quality metrics and imputation reference panel performance compared to an independently-generated GATK Best Practices pipeline. AVAILABILITY AND IMPLEMENTATION: We publicly release the 1KGP individual-level variant calls and cohort callset (https://console.cloud.google.com/storage/browser/brain-genomics-public/research/cohort/1KGP) to foster additional development and evaluation of cohort merging methods as well as broad studies of genetic variation. Both DeepVariant (https://github.com/google/deepvariant) and GLnexus (https://github.com/dnanexus-rnd/GLnexus) are open-sourced, and the optimized GLnexus setup discovered in this study is also integrated into GLnexus public releases v1.2.2 and later. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
MOTIVATION: Population-scale sequenced cohorts are foundational resources for genetic analyses, but processing raw reads into analysis-ready cohort-level variants remains challenging. RESULTS: We introduce an open-source cohort-calling method that uses the highly-accurate caller DeepVariant and scalable merging tool GLnexus. Using callset quality metrics based on variant recall and precision in benchmark samples and Mendelian consistency in father-mother-child trios, we optimized the method across a range of cohort sizes, sequencing methods, and sequencing depths. The resulting callsets show consistent quality improvements over those generated using existing best practices with reduced cost. We further evaluate our pipeline in the deeply sequenced 1000 Genomes Project (1KGP) samples and show superior callset quality metrics and imputation reference panel performance compared to an independently-generated GATK Best Practices pipeline. AVAILABILITY AND IMPLEMENTATION: We publicly release the 1KGP individual-level variant calls and cohort callset (https://console.cloud.google.com/storage/browser/brain-genomics-public/research/cohort/1KGP) to foster additional development and evaluation of cohort merging methods as well as broad studies of genetic variation. Both DeepVariant (https://github.com/google/deepvariant) and GLnexus (https://github.com/dnanexus-rnd/GLnexus) are open-sourced, and the optimized GLnexus setup discovered in this study is also integrated into GLnexus public releases v1.2.2 and later. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Authors: S T Sherry; M H Ward; M Kholodov; J Baker; L Phan; E M Smigielski; K Sirotkin Journal: Nucleic Acids Res Date: 2001-01-01 Impact factor: 16.971
Authors: Frederick E Dewey; Michael F Murray; John D Overton; Lukas Habegger; Joseph B Leader; Samantha N Fetterolf; Colm O'Dushlaine; Cristopher V Van Hout; Jeffrey Staples; Claudia Gonzaga-Jauregui; Raghu Metpally; Sarah A Pendergrass; Monica A Giovanni; H Lester Kirchner; Suganthi Balasubramanian; Noura S Abul-Husn; Dustin N Hartzel; Daniel R Lavage; Korey A Kost; Jonathan S Packer; Alexander E Lopez; John Penn; Semanti Mukherjee; Nehal Gosalia; Manoj Kanagaraj; Alexander H Li; Lyndon J Mitnaul; Lance J Adams; Thomas N Person; Kavita Praveen; Anthony Marcketta; Matthew S Lebo; Christina A Austin-Tse; Heather M Mason-Suares; Shannon Bruse; Scott Mellis; Robert Phillips; Neil Stahl; Andrew Murphy; Aris Economides; Kimberly A Skelding; Christopher D Still; James R Elmore; Ingrid B Borecki; George D Yancopoulos; F Daniel Davis; William A Faucett; Omri Gottesman; Marylyn D Ritchie; Alan R Shuldiner; Jeffrey G Reid; David H Ledbetter; Aris Baras; David J Carey Journal: Science Date: 2016-12-23 Impact factor: 47.728
Authors: Gonçalo R Abecasis; David Altshuler; Adam Auton; Lisa D Brooks; Richard M Durbin; Richard A Gibbs; Matt E Hurles; Gil A McVean Journal: Nature Date: 2010-10-28 Impact factor: 49.962
Authors: Ryan Poplin; Pi-Chuan Chang; David Alexander; Scott Schwartz; Thomas Colthurst; Alexander Ku; Dan Newburger; Jojo Dijamco; Nam Nguyen; Pegah T Afshar; Sam S Gross; Lizzie Dorfman; Cory Y McLean; Mark A DePristo Journal: Nat Biotechnol Date: 2018-09-24 Impact factor: 54.908
Authors: Xiuwen Zheng; Stephanie M Gogarten; Michael Lawrence; Adrienne Stilp; Matthew P Conomos; Bruce S Weir; Cathy Laurie; David Levine Journal: Bioinformatics Date: 2017-08-01 Impact factor: 6.937
Authors: Monkol Lek; Konrad J Karczewski; Eric V Minikel; Kaitlin E Samocha; Eric Banks; Timothy Fennell; Anne H O'Donnell-Luria; James S Ware; Andrew J Hill; Beryl B Cummings; Taru Tukiainen; Daniel P Birnbaum; Jack A Kosmicki; Laramie E Duncan; Karol Estrada; Fengmei Zhao; James Zou; Emma Pierce-Hoffman; Joanne Berghout; David N Cooper; Nicole Deflaux; Mark DePristo; Ron Do; Jason Flannick; Menachem Fromer; Laura Gauthier; Jackie Goldstein; Namrata Gupta; Daniel Howrigan; Adam Kiezun; Mitja I Kurki; Ami Levy Moonshine; Pradeep Natarajan; Lorena Orozco; Gina M Peloso; Ryan Poplin; Manuel A Rivas; Valentin Ruano-Rubio; Samuel A Rose; Douglas M Ruderfer; Khalid Shakir; Peter D Stenson; Christine Stevens; Brett P Thomas; Grace Tiao; Maria T Tusie-Luna; Ben Weisburd; Hong-Hee Won; Dongmei Yu; David M Altshuler; Diego Ardissino; Michael Boehnke; John Danesh; Stacey Donnelly; Roberto Elosua; Jose C Florez; Stacey B Gabriel; Gad Getz; Stephen J Glatt; Christina M Hultman; Sekar Kathiresan; Markku Laakso; Steven McCarroll; Mark I McCarthy; Dermot McGovern; Ruth McPherson; Benjamin M Neale; Aarno Palotie; Shaun M Purcell; Danish Saleheen; Jeremiah M Scharf; Pamela Sklar; Patrick F Sullivan; Jaakko Tuomilehto; Ming T Tsuang; Hugh C Watkins; James G Wilson; Mark J Daly; Daniel G MacArthur Journal: Nature Date: 2016-08-18 Impact factor: 49.962
Authors: Kristina Stenløkk; Marie Saitou; Live Rud-Johansen; Torfinn Nome; Michel Moser; Mariann Árnyasi; Matthew Kent; Nicola Jane Barson; Sigbjørn Lien Journal: Philos Trans R Soc Lond B Biol Sci Date: 2022-06-13 Impact factor: 6.671
Authors: Michelle D Noyes; William T Harvey; David Porubsky; Arvis Sulovari; Ruiyang Li; Nicholas R Rose; Peter A Audano; Katherine M Munson; Alexandra P Lewis; Kendra Hoekzema; Tuomo Mantere; Tina A Graves-Lindsay; Ashley D Sanders; Sara Goodwin; Melissa Kramer; Younes Mokrab; Michael C Zody; Alexander Hoischen; Jan O Korbel; W Richard McCombie; Evan E Eichler Journal: Am J Hum Genet Date: 2022-03-14 Impact factor: 11.043
Authors: Charles Markello; Charles Huang; Alex Rodriguez; Andrew Carroll; Pi-Chuan Chang; Jordan Eizenga; Thomas Markello; David Haussler; Benedict Paten Journal: Genome Res Date: 2022-04-28 Impact factor: 9.438
Authors: Jared O'Connell; Taedong Yun; Meghan Moreno; Helen Li; Nadia Litterman; Alexey Kolesnikov; Elizabeth Noblin; Pi-Chuan Chang; Anjali Shastri; Elizabeth H Dorfman; Suyash Shringarpure; Adam Auton; Andrew Carroll; Cory Y McLean Journal: Commun Biol Date: 2021-11-05
Authors: Mine Koprulu; Yajie Zhao; Eleanor Wheeler; Liang Dong; Nuno Rocha; Chen Li; John D Griffin; Satish Patel; Marcel Van de Streek; Craig A Glastonbury; Isobel D Stewart; Felix R Day; Jian'an Luan; Nicholas Bowker; Laura B L Wittemans; Nicola D Kerrison; Lina Cai; Debora M E Lucarelli; Inês Barroso; Mark I McCarthy; Robert A Scott; Vladimir Saudek; Kerrin S Small; Nicholas J Wareham; Robert K Semple; John R B Perry; Stephen O'Rahilly; Luca A Lotta; Claudia Langenberg; David B Savage Journal: J Clin Endocrinol Metab Date: 2022-03-24 Impact factor: 5.958
Authors: Brent S Pedersen; Joe M Brown; Harriet Dashnow; Amelia D Wallace; Matt Velinder; Martin Tristani-Firouzi; Joshua D Schiffman; Tatiana Tvrdik; Rong Mao; D Hunter Best; Pinar Bayrak-Toydemir; Aaron R Quinlan Journal: NPJ Genom Med Date: 2021-07-15 Impact factor: 8.617
Authors: Tristan V de Jong; Panjun Kim; Victor Guryev; Megan K Mulligan; Robert W Williams; Eva E Redei; Hao Chen Journal: Sci Rep Date: 2021-07-20 Impact factor: 4.996