
Thousands of missing variants in the UK Biobank are recoverable by genome realignment.

Tongqiu Jia, Brenton Munson, Hana Lango Allen, Trey Ideker, Amit R Majithia.

Abstract

The UK Biobank is an unprecedented resource for human disease research. In March 2019, 49,997 exomes were made publicly available to investigators. Here we note that thousands of variant calls are unexpectedly absent from this dataset, with 641 genes showing zero variation. We show that the reason for this was an erroneous read alignment to the GRCh38 reference. The missing variants can be recovered by modifying read alignment parameters to correctly handle the expanded set of contigs available in the human genome reference. Given the size and complexity of such population scale datasets, we propose a simple heuristic that can uncover systematic errors using summary data accessible to most investigators.
© 2020 John Wiley & Sons Ltd/University College London.
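
The abstract does not spell out the proposed heuristic. One plausible reading, given the observation that 641 genes showed zero variation, is to count variant sites per gene from summary data and flag genes that have none. The Python sketch below illustrates that reading only; the function name, the input layout (0-based, half-open gene intervals and 0-based variant positions), and the toy data are assumptions made for illustration and are not taken from the paper.

```python
from bisect import bisect_left
from collections import defaultdict


def genes_without_variants(genes, variants):
    """Return the sorted names of genes that contain no variant site.

    genes:    iterable of (name, chrom, start, end), 0-based half-open.
    variants: iterable of (chrom, pos), 0-based positions.
    Both layouts are illustrative assumptions, not a published format.
    """
    # Group variant positions by chromosome and sort them for binary search.
    positions = defaultdict(list)
    for chrom, pos in variants:
        positions[chrom].append(pos)
    for sites in positions.values():
        sites.sort()

    empty = []
    for name, chrom, start, end in genes:
        sites = positions.get(chrom, [])
        # Index of the first variant at or after the gene start.
        i = bisect_left(sites, start)
        # No such variant, or it lies at or beyond the gene end.
        if i == len(sites) or sites[i] >= end:
            empty.append(name)
    return sorted(empty)


if __name__ == "__main__":
    # Toy data with hypothetical gene names and coordinates.
    toy_genes = [
        ("GENE_A", "chr1", 100, 200),
        ("GENE_B", "chr1", 300, 400),
        ("GENE_C", "chr2", 50, 150),
    ]
    toy_variants = [("chr1", 150), ("chr1", 400), ("chr2", 60)]
    print(genes_without_variants(toy_genes, toy_variants))
```

In a large cohort, a gene of ordinary length with no variant sites at all is unexpected, so a flagged list of this kind would be a starting point for inspecting alignments rather than evidence of an error in itself.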

Keywords:  DNA; exome; genetics; sequence alignment; sequence analysis

Year:  2020        PMID: 32232836      PMCID: PMC7402360          DOI: 10.1111/ahg.12383

Source DB:  PubMed          Journal:  Ann Hum Genet        ISSN: 0003-4800            Impact factor:   1.670


