Lei Song (1,2), Aiyi Liu (3), Jianxin Shi (1).
1. Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA.
2. Cancer Genomics Research Laboratory, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA.
3. Biostatistics and Bioinformatics Branch, Division of Intramural Population Health Research, National Institute of Child Health and Human Development, Bethesda, MD, USA.
Abstract
MOTIVATION: Polygenic risk score (PRS) methods based on genome-wide association studies (GWAS) have the potential to predict the risk of developing complex diseases and are expected to become more accurate with larger training datasets and innovative statistical methods. The area under the ROC curve (AUC) is often used to evaluate the performance of PRSs, but computing it requires individual-level genotypic and phenotypic data from an independent GWAS validation dataset. We are motivated to develop methods for approximating the AUC of PRSs from the summary-level data of the validation dataset, which will greatly facilitate the development of PRS models for complex diseases.
RESULTS: We develop statistical methods and an R package, SummaryAUC, for approximating the AUC of a PRS and its variance when only the summary-level data of the validation dataset are available. SummaryAUC can be applied to PRSs whose SNPs are either genotyped or imputed in the validation dataset. We examined the performance of SummaryAUC using a large-scale GWAS of schizophrenia. SummaryAUC provides accurate approximations to AUCs and their variances; the bias of the AUC is <0.5% in most analyses. SummaryAUC cannot be applied to PRSs that use all SNPs in the genome because doing so is computationally prohibitive.
AVAILABILITY AND IMPLEMENTATION: https://github.com/lsncibb/SummaryAUC.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
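For orientation, the conventional AUC evaluation that the abstract describes as requiring individual-level data can be sketched as follows. This is not the SummaryAUC method and does not use the package; it is a minimal illustration, assuming the PRS is a weighted sum of allele counts and the AUC is estimated by the Mann-Whitney statistic. The function names, SNP weights, and genotypes are all hypothetical toy values.

```python
from itertools import product


def polygenic_score(genotypes, weights):
    """Weighted sum of allele counts (0/1/2) for one individual."""
    return sum(g * w for g, w in zip(genotypes, weights))


def empirical_auc(case_scores, control_scores):
    """Mann-Whitney estimate of P(case score > control score); ties count 1/2."""
    wins = 0.0
    for s_case, s_ctrl in product(case_scores, control_scores):
        if s_case > s_ctrl:
            wins += 1.0
        elif s_case == s_ctrl:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical toy data: 3 SNPs, made-up effect-size weights.
weights = [0.5, -0.25, 1.0]
cases = [[2, 0, 1], [1, 1, 2], [0, 0, 1]]      # individual-level genotypes
controls = [[0, 1, 0], [1, 2, 0], [0, 0, 1]]

case_scores = [polygenic_score(g, weights) for g in cases]
control_scores = [polygenic_score(g, weights) for g in controls]
auc = empirical_auc(case_scores, control_scores)
print(f"empirical AUC = {auc:.4f}")
```

Every step here touches each individual's genotypes and case/control status, which is the data requirement that a summary-level approximation is meant to avoid.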