
PhenoScanner: a database of human genotype-phenotype associations.

James R Staley, James Blackshaw, Mihir A Kamat, Steve Ellis, Praveen Surendran, Benjamin B Sun, Dirk S Paul, Daniel Freitag, Stephen Burgess, John Danesh, Robin Young, Adam S Butterworth.

Abstract

PhenoScanner is a curated database of publicly available results from large-scale genetic association studies. This tool aims to facilitate 'phenome scans', the cross-referencing of genetic variants with many phenotypes, to aid understanding of disease pathways and biology. The database currently contains over 350 million association results and over 10 million unique genetic variants, mostly single nucleotide polymorphisms. It is accompanied by a web-based tool that queries the database for associations with user-specified variants, providing results according to the same effect and non-effect alleles for each input variant. The tool provides the option of searching for trait associations with proxies of the input variants, calculated using the European samples from 1000 Genomes and HapMap.
AVAILABILITY AND IMPLEMENTATION: PhenoScanner is available at www.phenoscanner.medschl.cam.ac.uk
CONTACT: jrs95@medschl.cam.ac.uk
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press.

Year:  2016        PMID: 27318201      PMCID: PMC5048068          DOI: 10.1093/bioinformatics/btw373

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Genome-wide association studies (GWAS) have discovered thousands of associations between genetic variants and a wide range of human phenotypes, yielding novel insights into disease aetiology. However, a key challenge for the human genomics community is to develop methods that enable efficient cross-referencing of a genetic variant with a wide range of phenotypes, such as disease states, physiological parameters, cellular traits and other characteristics. Such ‘phenome scans’ could help inform a range of analyses, such as Mendelian randomization analyses, in which genetic variants are used as proxies for modifiable risk factors to attempt to infer causality between traits and diseases (Burgess and Thompson, 2015). Identifying the broad phenotypic consequences of perturbing a particular pathway (indexed by a genetic variant) could also enhance biological understanding and provide insights relevant to the identification and prioritization of potential therapeutic targets, such as the re-purposing of existing therapies to new disease indications and the anticipation of safety and efficacy signals in clinical trials. One notable example has been our demonstration, following a phenome scan across a wide range of traits and diseases, that genetic variants that upregulate the interleukin-1 receptor antagonist are associated with a higher risk of coronary artery disease, partly mediated through elevation of pro-atherogenic lipids (Interleukin 1 Genetics Consortium).

So far, however, it has been difficult to generalize this approach, partly because collating associations with many phenotypes can be time-consuming, especially if information is sought about multiple variants. Catalogues of GWAS results already exist, such as the NHGRI-EBI GWAS catalog (Welter et al.), as well as data repositories, e.g. dbGaP (Mailman et al.). However, these resources focus on variants robustly associated with a particular trait (and hence do not take advantage of the wide range of publicly available full GWAS results), do not contain estimates or directions of effect, and/or are difficult to search in a systematic way. Also, the results often have inconsistent formats and the output for each variant is not necessarily given according to the same effect allele. In addition, most catalogues of GWAS do not identify associations with proxy variants, which means that if an association between the variant of interest and a trait is unavailable, a suitable proxy must be found using a separate resource and then searched in the catalogue. Some of the latest variant annotation tools (which include proxy look-ups) do contain results from the NHGRI-EBI GWAS catalog (e.g. SNiPA; Arnold et al.); however, they return only P values. To help address these issues, we developed a web-based tool, ‘PhenoScanner’, that extracts and aligns associations for user-specified variants and proxies across a large curated database.

2 Methods

PhenoScanner consists of a Perl interface (with an R command-line tool) that connects to a MySQL database. To develop the initial database, we collated 137 genotype–phenotype association datasets, including results for anthropometric traits, blood pressure, lipids, cardiometabolic diseases, renal function measures, glycemic traits, inflammatory diseases, psychiatric diseases and smoking phenotypes (Supplementary Table). We also included the NHGRI-EBI GWAS catalog, NHLBI GRASP (Leslie et al.) and dbGaP catalogues of associations. To ensure consistent formatting, we aligned alleles to the plus strand, added or updated chromosome positions to build 37 using dbSNP (release 138) (Sherry et al.) and liftOver (https://genome.ucsc.edu/cgi-bin/hgLiftOver), and updated old rsIDs to dbSNP release 141 (Supplementary Data).

Linkage disequilibrium (LD) measures between neighbouring variants on the autosomal chromosomes were calculated using the phased haplotypes from European samples in 1000 Genomes phase 3 (N = 503) (1000 Genomes Project Consortium). Variants with minor allele frequencies <0.5% were removed along with multiallelic variants and large indels (≥5 bases). For each remaining variant, we calculated D′ and r2 for variants within 500 kb in either direction, and kept LD statistics only for pairs of variants meeting a minimum LD threshold. LD statistics based on the CEU population from HapMap 2 release 24 (Frazer et al.) are also available (Supplementary Data).

The user may either enter one variant into the text box on the website or upload up to 50 variants in a text file. The Perl interface annotates the variant alleles using dbSNP, identifies proxies of the specified variants (if requested) in the database according to a user-specified pairwise r2 threshold, and queries the catalogue of genotype–phenotype associations for the specified variants and their proxies. Association results are collated and presented with respect to the same effect and non-effect alleles for each variant. For ease of interpretation, the associations with proxies are aligned according to the effect and non-effect alleles of the corresponding primary variant of interest. The output is a file of associations, which is made available to download. There is also a P value filter option that retains only results with study-specific P values less than the selected threshold.
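As a rough illustration of the pairwise LD statistics described above, the following Python sketch computes D′ and r2 for two biallelic variants from phased haplotypes and scans pairs within a distance window. It is not the authors' implementation: the function names, the 0/1 haplotype encoding, the use of the absolute value of D′ and the `min_r2` default are all assumptions made for the example.

```python
from math import nan


def ld_stats(hap_a, hap_b):
    """Return (D', r2) for two biallelic variants.

    hap_a and hap_b are equal-length sequences of 0/1 alleles, one entry per
    phased haplotype, in the same haplotype order (hypothetical input layout).
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at variant A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at variant B
    p_ab = sum(x * y for x, y in zip(hap_a, hap_b)) / n  # 1-1 haplotype frequency
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return nan, nan                       # a monomorphic variant: LD undefined
    r2 = d * d / denom
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d) / d_max, r2                 # D' reported as an absolute value


def ld_pairs(positions, haplotypes, window=500_000, min_r2=0.8):
    """List (i, j, D', r2) for variant pairs at most `window` bp apart.

    positions must be sorted; haplotypes[i] holds the alleles of variant i.
    min_r2 is an illustrative retention threshold, not a value from the paper.
    """
    kept = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if positions[j] - positions[i] > window:
                break                         # sorted positions: no further pairs
            d_prime, r2 = ld_stats(haplotypes[i], haplotypes[j])
            if r2 >= min_r2:                  # NaN compares False, so it is dropped
                kept.append((i, j, d_prime, r2))
    return kept
```

Storing only pairs above a threshold keeps the proxy look-up table small; a later query can then filter the stored pairs by any stricter user-specified r2 cut-off.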

3 Results

To illustrate the use of PhenoScanner, we ran the program with rs10840293 (an intronic variant in SWAP70) using proxies from 1000 Genomes and an r2 cut-off of 0.8. The program found and aligned over 1000 associations with either rs10840293 or a proxy of rs10840293 (r2 ≥ 0.8) in <10 s (Fig. 1 and Supplementary Data). Hence, even though associations between rs10840293 and phenotypes are mostly unavailable, we were able to obtain a range of related associations using proxies (e.g. rs93138 in Fig. 1).
Fig. 1. Association results for rs10840293 with a subset of the traits (a) and diseases (b) available in PhenoScanner. An asterisk indicates the use of a proxy variant (rs93138) in reporting the association. SD, standard deviation; OR, odds ratio
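The alignment of results to a common effect allele, and the P value filter, can be sketched in a few lines of Python. This is a hedged illustration rather than PhenoScanner's code: the record layout (`effect_allele`, `other_allele`, `beta`, `p`), the function names and the example values are hypothetical, alleles are assumed to be already on the plus strand, and effects are assumed to be on an additive scale (for example, log odds ratios), so that swapping alleles simply flips the sign.

```python
def align_association(record, effect_allele, other_allele):
    """Return a copy of `record` expressed per copy of `effect_allele`.

    If the study reported the effect for the opposite allele, the alleles are
    swapped and the sign of beta is flipped.
    """
    reported = (record["effect_allele"], record["other_allele"])
    if reported == (effect_allele, other_allele):
        return dict(record)                   # already aligned
    if reported == (other_allele, effect_allele):
        aligned = dict(record)
        aligned["effect_allele"] = effect_allele
        aligned["other_allele"] = other_allele
        aligned["beta"] = -record["beta"]     # same effect, opposite allele
        return aligned
    raise ValueError("record alleles do not match the requested alleles")


def filter_by_p(records, threshold):
    """Keep only records with a study-specific P value below `threshold`."""
    return [r for r in records if r["p"] < threshold]


# Made-up records for illustration only; they are not PhenoScanner results.
records = [
    {"trait": "trait_1", "effect_allele": "G", "other_allele": "A",
     "beta": 0.12, "p": 1e-6},
    {"trait": "trait_2", "effect_allele": "A", "other_allele": "G",
     "beta": 0.05, "p": 0.2},
]
aligned = [align_association(r, "A", "G") for r in records]
significant = filter_by_p(aligned, 1e-5)
```

Aligning a proxy's result to the primary variant additionally requires knowing which proxy allele travels with the primary effect allele on the same haplotype; that mapping would come from the phased reference data and is not shown here.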

4 Conclusion

In summary, PhenoScanner is a large curated database of publicly available summary results from genetic association studies. This database extends current catalogues of genetic data by including all available results as opposed to filtering on strength of association. Moreover, PhenoScanner aligns genotype–phenotype associations across traits and proxies, providing the user with an easily interpretable formatted output file. We anticipate that this tool will make cross-referencing genetic variants with many phenotypes faster and more efficient.

Funding

This work was supported by the UK Medical Research Council [G66840, G0800270], Pfizer [G73632], British Heart Foundation [SP/09/002], UK National Institute for Health Research Cambridge Biomedical Research Centre, European Research Council [268834], and European Commission Framework Programme 7 [HEALTH-F2-2012-279233].

Conflict of Interest: none declared.

1.  dbSNP: the NCBI database of genetic variation.

Authors:  S T Sherry; M H Ward; M Kholodov; J Baker; L Phan; E M Smigielski; K Sirotkin
Journal:  Nucleic Acids Res       Date:  2001-01-01       Impact factor: 16.971

2.  The NCBI dbGaP database of genotypes and phenotypes.

Authors:  Matthew D Mailman; Michael Feolo; Yumi Jin; Masato Kimura; Kimberly Tryka; Rinat Bagoutdinov; Luning Hao; Anne Kiang; Justin Paschall; Lon Phan; Natalia Popova; Stephanie Pretel; Lora Ziyabari; Moira Lee; Yu Shao; Zhen Y Wang; Karl Sirotkin; Minghong Ward; Michael Kholodov; Kerry Zbicz; Jeffrey Beck; Michael Kimelman; Sergey Shevelev; Don Preuss; Eugene Yaschenko; Alan Graeff; James Ostell; Stephen T Sherry
Journal:  Nat Genet       Date:  2007-10       Impact factor: 38.330

3.  GRASP: analysis of genotype-phenotype results from 1390 genome-wide association studies and corresponding open access database.

Authors:  Richard Leslie; Christopher J O'Donnell; Andrew D Johnson
Journal:  Bioinformatics       Date:  2014-06-15       Impact factor: 6.937

4.  A second generation human haplotype map of over 3.1 million SNPs.

Authors:  Kelly A Frazer; Dennis G Ballinger; David R Cox; David A Hinds; Laura L Stuve; Richard A Gibbs; John W Belmont; Andrew Boudreau; Paul Hardenbol; Suzanne M Leal; Shiran Pasternak; David A Wheeler; Thomas D Willis; Fuli Yu; Huanming Yang; Changqing Zeng; Yang Gao; Haoran Hu; Weitao Hu; Chaohua Li; Wei Lin; Siqi Liu; Hao Pan; Xiaoli Tang; Jian Wang; Wei Wang; Jun Yu; Bo Zhang; Qingrun Zhang; Hongbin Zhao; Hui Zhao; Jun Zhou; Stacey B Gabriel; Rachel Barry; Brendan Blumenstiel; Amy Camargo; Matthew Defelice; Maura Faggart; Mary Goyette; Supriya Gupta; Jamie Moore; Huy Nguyen; Robert C Onofrio; Melissa Parkin; Jessica Roy; Erich Stahl; Ellen Winchester; Liuda Ziaugra; David Altshuler; Yan Shen; Zhijian Yao; Wei Huang; Xun Chu; Yungang He; Li Jin; Yangfan Liu; Yayun Shen; Weiwei Sun; Haifeng Wang; Yi Wang; Ying Wang; Xiaoyan Xiong; Liang Xu; Mary M Y Waye; Stephen K W Tsui; Hong Xue; J Tze-Fei Wong; Luana M Galver; Jian-Bing Fan; Kevin Gunderson; Sarah S Murray; Arnold R Oliphant; Mark S Chee; Alexandre Montpetit; Fanny Chagnon; Vincent Ferretti; Martin Leboeuf; Jean-François Olivier; Michael S Phillips; Stéphanie Roumy; Clémentine Sallée; Andrei Verner; Thomas J Hudson; Pui-Yan Kwok; Dongmei Cai; Daniel C Koboldt; Raymond D Miller; Ludmila Pawlikowska; Patricia Taillon-Miller; Ming Xiao; Lap-Chee Tsui; William Mak; You Qiang Song; Paul K H Tam; Yusuke Nakamura; Takahisa Kawaguchi; Takuya Kitamoto; Takashi Morizono; Atsushi Nagashima; Yozo Ohnishi; Akihiro Sekine; Toshihiro Tanaka; Tatsuhiko Tsunoda; Panos Deloukas; Christine P Bird; Marcos Delgado; Emmanouil T Dermitzakis; Rhian Gwilliam; Sarah Hunt; Jonathan Morrison; Don Powell; Barbara E Stranger; Pamela Whittaker; David R Bentley; Mark J Daly; Paul I W de Bakker; Jeff Barrett; Yves R Chretien; Julian Maller; Steve McCarroll; Nick Patterson; Itsik Pe'er; Alkes Price; Shaun Purcell; Daniel J Richter; Pardis Sabeti; Richa Saxena; Stephen F Schaffner; Pak C Sham; Patrick Varilly; David Altshuler; Lincoln D Stein; Lalitha Krishnan; Albert Vernon Smith; Marcela K Tello-Ruiz; Gudmundur A Thorisson; Aravinda Chakravarti; Peter E Chen; David J Cutler; Carl S Kashuk; Shin Lin; Gonçalo R Abecasis; Weihua Guan; Yun Li; Heather M Munro; Zhaohui Steve Qin; Daryl J Thomas; Gilean McVean; Adam Auton; Leonardo Bottolo; Niall Cardin; Susana Eyheramendy; Colin Freeman; Jonathan Marchini; Simon Myers; Chris Spencer; Matthew Stephens; Peter Donnelly; Lon R Cardon; Geraldine Clarke; David M Evans; Andrew P Morris; Bruce S Weir; Tatsuhiko Tsunoda; James C Mullikin; Stephen T Sherry; Michael Feolo; Andrew Skol; Houcan Zhang; Changqing Zeng; Hui Zhao; Ichiro Matsuda; Yoshimitsu Fukushima; Darryl R Macer; Eiko Suda; Charles N Rotimi; Clement A Adebamowo; Ike Ajayi; Toyin Aniagwu; Patricia A Marshall; Chibuzor Nkwodimmah; Charmaine D M Royal; Mark F Leppert; Missy Dixon; Andy Peiffer; Renzong Qiu; Alastair Kent; Kazuto Kato; Norio Niikawa; Isaac F Adewole; Bartha M Knoppers; Morris W Foster; Ellen Wright Clayton; Jessica Watkin; Richard A Gibbs; John W Belmont; Donna Muzny; Lynne Nazareth; Erica Sodergren; George M Weinstock; David A Wheeler; Imtaz Yakub; Stacey B Gabriel; Robert C Onofrio; Daniel J Richter; Liuda Ziaugra; Bruce W Birren; Mark J Daly; David Altshuler; Richard K Wilson; Lucinda L Fulton; Jane Rogers; John Burton; Nigel P Carter; Christopher M Clee; Mark Griffiths; Matthew C Jones; Kirsten McLay; Robert W Plumb; Mark T Ross; Sarah K Sims; David L Willey; Zhu Chen; Hua Han; Le Kang; Martin Godbout; John C Wallenburg; Paul L'Archevêque; Guy Bellemare; Koji Saeki; Hongguang Wang; Daochang An; Hongbo Fu; Qing Li; Zhen Wang; Renwu Wang; Arthur L Holden; Lisa D Brooks; Jean E McEwen; Mark S Guyer; Vivian Ota Wang; Jane L Peterson; Michael Shi; Jack Spiegel; Lawrence M Sung; Lynn F Zacharia; Francis S Collins; Karen Kennedy; Ruth Jamieson; John Stewart
Journal:  Nature       Date:  2007-10-18       Impact factor: 49.962

5.  The NHGRI GWAS Catalog, a curated resource of SNP-trait associations.

Authors:  Danielle Welter; Jacqueline MacArthur; Joannella Morales; Tony Burdett; Peggy Hall; Heather Junkins; Alan Klemm; Paul Flicek; Teri Manolio; Lucia Hindorff; Helen Parkinson
Journal:  Nucleic Acids Res       Date:  2013-12-06       Impact factor: 16.971

6.  SNiPA: an interactive, genetic variant-centered annotation browser.

Authors:  Matthias Arnold; Johannes Raffler; Arne Pfeufer; Karsten Suhre; Gabi Kastenmüller
Journal:  Bioinformatics       Date:  2014-11-26       Impact factor: 6.937

7.  An integrated map of genetic variation from 1,092 human genomes.

Authors:  Goncalo R Abecasis; Adam Auton; Lisa D Brooks; Mark A DePristo; Richard M Durbin; Robert E Handsaker; Hyun Min Kang; Gabor T Marth; Gil A McVean
Journal:  Nature       Date:  2012-11-01       Impact factor: 49.962

8.  Cardiometabolic effects of genetic upregulation of the interleukin 1 receptor antagonist: a Mendelian randomisation analysis.

Authors:  Interleukin 1 Genetics Consortium
Journal:  Lancet Diabetes Endocrinol       Date:  2015-02-26       Impact factor: 32.069
