Target genes, variants, tissues and transcriptional pathways influencing human serum urate levels.

Adrienne Tin^1,2, Jonathan Marten³, Victoria L Halperin Kuhns⁴, Yong Li⁵, Matthias Wuttke⁵, Holger Kirsten^6,7, Karsten B Sieber⁸, Chengxiang Qiu⁹, Mathias Gorski^10,11, Zhi Yu^12,13, Ayush Giri^14,15, Gardar Sveinbjornsson¹⁶, Man Li¹⁷, Audrey Y Chu¹⁸, Anselm Hoppmann⁵, Luke J O'Connor¹⁹, Bram Prins²⁰, Teresa Nutile²¹, Damia Noce²², Masato Akiyama^23,24, Massimiliano Cocca²⁵, Sahar Ghasemi^26,27, Peter J van der Most²⁸, Katrin Horn^6,7, Yizhe Xu¹⁷, Christian Fuchsberger²², Sanaz Sedaghat²⁹, Saima Afaq^30,31, Najaf Amin²⁹, Johan Ärnlöv^32,33, Stephan J L Bakker³⁴, Nisha Bansal^35,36, Daniela Baptista³⁷, Sven Bergmann^38,39,40, Mary L Biggs^41,42, Ginevra Biino⁴³, Eric Boerwinkle⁴⁴, Erwin P Bottinger⁴⁵, Thibaud S Boutin³, Marco Brumat⁴⁶, Ralph Burkhardt^7,47,48, Eric Campana⁴⁶, Archie Campbell⁴⁹, Harry Campbell⁵⁰, Robert J Carroll⁵¹, Eulalia Catamo²⁵, John C Chambers^{30,52,53,54,55}, Marina Ciullo^21,56, Maria Pina Concas²⁵, Josef Coresh¹², Tanguy Corre^38,39,57, Daniele Cusi^58,59, Sala Cinzia Felicita⁶⁰, Martin H de Borst³⁴, Alessandro De Grandi²², Renée de Mutsert⁶¹, Aiko P J de Vries⁶², Graciela Delgado⁶³, Ayşe Demirkan^29,64, Olivier Devuyst⁶⁵, Katalin Dittrich^66,67, Kai-Uwe Eckardt^68,69, Georg Ehret³⁷, Karlhans Endlich^27,70, Michele K Evans⁷¹, Ron T Gansevoort³⁴, Paolo Gasparini^25,46, Vilmantas Giedraitis⁷², Christian Gieger^73,74,75, Giorgia Girotto^25,46, Martin Gögele²², Scott D Gordon⁷⁶, Daniel F Gudbjartsson¹⁶, Vilmundur Gudnason^77,78, Toomas Haller⁷⁹, Pavel Hamet^80,81, Tamara B Harris⁸², Caroline Hayward³, Andrew A Hicks²², Edith Hofer^83,84, Hilma Holm¹⁶, Wei Huang^85,86, Nina Hutri-Kähönen^87,88, Shih-Jen Hwang^89,90, M Arfan Ikram²⁹, Raychel M Lewis⁴, Erik Ingelsson^91,92,93,94, Johanna Jakobsdottir^77,95, Ingileif Jonsdottir¹⁶, Helgi Jonsson^96,97, Peter K Joshi⁵⁰, Navya Shilpa Josyula⁹⁸, Bettina Jung¹⁰, Mika Kähönen⁹⁹, Yoichiro Kamatani^23,100, Masahiro Kanai^23,101, Shona M Kerr³, Wieland Kiess^7,66,67, Marcus E Kleber⁶³, Wolfgang Koenig^102,103,104, Jaspal S Kooner^{53,54,105,106}, Antje Körner^7,66,67, Peter Kovacs¹⁰⁷, Bernhard K Krämer⁶³, Florian Kronenberg¹⁰⁸, Michiaki Kubo¹⁰⁹, Brigitte Kühnel⁷³, Martina La Bianca²⁵, Leslie A Lange¹¹⁰, Benjamin Lehne³⁰, Terho Lehtimäki⁸⁷, Jun Liu^29,111, Markus Loeffler^6,7, Ruth J F Loos^112,113, Leo-Pekka Lyytikäinen⁸⁷, Reedik Magi⁷⁹, Anubha Mahajan^114,115, Nicholas G Martin⁷⁶, Winfried März^63,116,117, Deborah Mascalzoni²², Koichi Matsuda¹¹⁸, Christa Meisinger^119,120, Thomas Meitinger^103,121,122, Andres Metspalu⁷⁹, Yuri Milaneschi¹²³, Christopher J O'Donnell^124,125, Otis D Wilson¹²⁶, J Michael Gaziano^125,127, Pashupati P Mishra⁸⁷, Karen L Mohlke¹²⁸, Nina Mononen⁸⁷, Grant W Montgomery¹²⁹, Dennis O Mook-Kanamori^61,130, Martina Müller-Nurasyid^{103,131,132,133}, Girish N Nadkarni^112,134, Mike A Nalls^135,136, Matthias Nauck^27,137, Kjell Nikus^138,139, Boting Ning¹⁴⁰, Ilja M Nolte²⁸, Raymond Noordam¹⁴¹, Jeffrey R O'Connell¹⁴², Isleifur Olafsson¹⁴³, Sandosh Padmanabhan¹⁴⁴, Brenda W J H Penninx¹²³, Thomas Perls¹⁴⁵, Annette Peters^74,75,103, Mario Pirastu¹⁴⁶, Nicola Pirastu⁵⁰, Giorgio Pistis¹⁴⁷, Ozren Polasek^148,149, Belen Ponte¹⁵⁰, David J Porteous^49,151, Tanja Poulain⁷, Michael H Preuss¹¹², Ton J Rabelink^62,152, Laura M Raffield¹²⁸, Olli T Raitakari^153,154,155, Rainer Rettig¹⁵⁶, Myriam Rheinberger¹⁰, Kenneth M Rice⁴², Federica Rizzi^157,158, Antonietta Robino²⁵, Igor Rudan⁵⁰, Alena Krajcoviechova^159,160, Renata Cifkova^159,161, Rico Rueedi^38,39, Daniela Ruggiero^21,56, Kathleen A Ryan¹⁶², Yasaman Saba¹⁶³, Erika Salvi^157,164, Helena Schmidt¹⁶⁵, Reinhold Schmidt⁸³, Christian M Shaffer⁵¹, Albert V Smith⁷⁸, Blair H Smith¹⁶⁶, Cassandra N Spracklen¹²⁸, Konstantin Strauch^131,132, Michael Stumvoll¹⁶⁷, Patrick Sulem¹⁶, Salman M Tajuddin⁷¹, Andrej Teren^7,168, Joachim Thiery^7,47, Chris H L Thio²⁸, Unnur Thorsteinsdottir¹⁶, Daniela Toniolo⁶⁰, Anke Tönjes¹⁶⁹, Johanne Tremblay^80,170, André G Uitterlinden¹⁷¹, Simona Vaccargiu¹⁴⁶, Pim van der Harst^172,173,174, Cornelia M van Duijn^29,111,175, Niek Verweij^172,176, Uwe Völker^27,177, Peter Vollenweider¹⁷⁸, Gerard Waeber¹⁷⁸, Melanie Waldenberger^73,74,103, John B Whitfield⁷⁶, Sarah H Wild¹⁷⁹, James F Wilson^3,50, Qiong Yang¹⁴⁰, Weihua Zhang^30,53, Alan B Zonderman⁷¹, Murielle Bochud⁵⁷, James G Wilson¹⁸⁰, Sarah A Pendergrass¹⁸¹, Kevin Ho^182,183, Afshin Parsa^184,185, Peter P Pramstaller²², Bruce M Psaty^186,187, Carsten A Böger^10,188, Harold Snieder²⁸, Adam S Butterworth¹⁸⁹, Yukinori Okada^190,191, Todd L Edwards^192,193, Kari Stefansson¹⁶, Katalin Susztak⁹, Markus Scholz^6,7, Iris M Heid¹¹, Adriana M Hung^126,193, Alexander Teumer^26,27, Cristian Pattaro²², Owen M Woodward⁴, Veronique Vitart³, Anna Köttgen^194,195.

Abstract

Elevated serum urate levels cause gout and correlate with cardiometabolic diseases via poorly understood mechanisms. We performed a trans-ancestry genome-wide association study of serum urate in 457,690 individuals, identifying 183 loci (147 previously unknown) that improve the prediction of gout in an independent cohort of 334,880 individuals. Serum urate showed significant genetic correlations with many cardiometabolic traits, with genetic causality analyses supporting a substantial role for pleiotropy. Enrichment analysis, fine-mapping of urate-associated loci and colocalization with gene expression in 47 tissues implicated the kidney and liver as the main target organs and prioritized potentially causal genes and variants, including the transcriptional master regulators in the liver and kidney, HNF1A and HNF4A. Experimental validation showed that HNF4A transactivated the promoter of ABCG2, encoding a major urate transporter, in kidney cells, and that HNF4A p.Thr139Ile is a functional variant. Transcriptional coregulation within and across organs may be a general mechanism underlying the observed pleiotropy between urate and cardiometabolic traits.

Year: 2019; PMID: 31578528; PMCID: PMC6858555; DOI: 10.1038/s41588-019-0504-x

Source DB: PubMed; Journal: Nat Genet; ISSN: 1061-4036; Impact factor: 38.330

Serum urate levels reflect a balance between uric acid production and its renal and intestinal excretion. Elevated serum urate levels define hyperuricemia, which is associated with metabolic, cardiovascular and kidney-related conditions. Hyperuricemia can cause kidney stones and gout, the most common inflammatory arthritis[1,2]. Gout attacks are a highly painful response to the deposition of urate crystals, and are a significant cause of morbidity and related health care costs[3]. Although gout has become a major public health issue, it is undertreated due to low awareness, poor patient adherence[4], and inappropriate prescription practices of the most commonly used drug, allopurinol[5]. A better understanding of the mechanisms controlling serum urate may help to develop novel medications for gout treatment and prevention and provide insights into regulatory mechanisms shared between urate and cardio-metabolic traits. Heritability estimates for serum urate range between 30% and 60%[6-11]. Candidate gene and genome-wide association studies (GWAS) have identified three genes as major determinants of urate levels: SLC2A9, ABCG2, and SLC22A12[7,12-18]. While SLC2A9 and ABCG2 harbor common variants of relatively large effect[19], SLC22A12 contains many rare or low-frequency variants[20]. The largest GWAS meta-analyses performed to date identified 28 loci among individuals of European ancestry (EA)[21] and 27 among Japanese individuals[22]. Many genes in the associated loci encode renal and intestinal urate transporters or their regulators, while others are relevant to glucose and lipid metabolism, functions of the liver, where uric acid is generated. With increased public availability of large annotation and gene expression datasets[23,24], fine-mapping associated loci to prioritize target tissues, pathways, and potentially causal genes and variants has become possible.
Here, we perform a trans-ethnic meta-analysis of GWAS of serum urate among 457,690 individuals and identify 183 associated loci that improve gout risk prediction in an independent sample of 334,880 UK Biobank (UKBB) participants. We evaluate the genetic correlation of serum urate with hundreds of cardio-metabolic traits and diseases, and use a recently developed latent causal variable model to examine the contribution of causality versus pleiotropy. We prioritize target variants, genes, tissues and pathways that contribute to the complex regulation of urate levels through comprehensive data integration. Lastly, we conduct proof-of-principle experimental studies showing that HNF4A, a transcriptional master regulator in liver and kidney proximal tubule, can regulate transcription of the major urate transporter ABCG2 in kidney cells and that the fine-mapped HNF4A variant p.Thr139Ile is functional. Transcriptional co-regulation of processes linked to energy metabolism within and across organs may underlie the pleiotropy observed between urate levels and numerous cardio-metabolic traits.

Results

Trans-ethnic meta-analysis identifies 183 urate-associated loci

Trans-ethnic meta-analyses were conducted to maximize the sample size for locus discovery, and EA-specific analyses were used where population-specific linkage disequilibrium (LD) was required to characterize loci (Supplementary Fig. 1). The primary trans-ethnic meta-analysis included 457,690 individuals (EA, n = 288,649; East Asian ancestry (EAS), n = 125,725; African Americans (AA), n = 33,671; South Asian ancestry (SA), n = 9,037; and Hispanics (HIS), n = 608) from 74 studies. Mean urate levels ranged from 4.2 to 7.2 mg/dl (Supplementary Table 1). GWAS were performed based on genotypes imputed using the 1000 Genomes Project or Haplotype Reference Consortium reference panels (Methods and Supplementary Table 2). Results were combined through inverse-variance weighted fixed effect meta-analysis after central study-specific quality control. There was no evidence of inflation due to unmodeled population structure (LD score regression intercept = 1.01; genomic inflation factor λGC = 1.04). Post-meta-analysis variant filtering left 8,249,849 high-quality SNPs for downstream analyses (Methods). We identified 183 loci that contained at least one genome-wide significant SNP (P ≤ 5 × 10−8, Fig. 1 and Supplementary Table 3). Of these, 36 contained an index SNP reported in previous GWAS of serum urate[13,15,17,18,21,22,25,26], and 147 were considered novel (Fig. 1). Allelic effects on serum urate ranged from 0.28 to 0.017 mg/dl (mean 0.038 mg/dl, standard deviation (SD) 0.033). Regional association plots are shown in the Supplementary Data Set.
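As an illustration of the inverse-variance weighted fixed-effect combination named above, the following minimal sketch pools per-study effect estimates for a single SNP. The function name and the example numbers are hypothetical and are not taken from the study; the actual analysis pipeline and software are described in the Methods and may differ in detail.

```python
import math


def ivw_fixed_effect(betas, ses):
    """Pool per-study SNP effects with inverse-variance weights.

    betas: per-study effect estimates (e.g. mg/dl per allele)
    ses:   per-study standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided P-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical example: two studies with equal precision.
beta, se, z, p = ivw_fixed_effect([0.10, 0.20], [0.10, 0.10])
```

With equal standard errors the pooled estimate is simply the average of the two effects, and the pooled standard error shrinks by a factor of the square root of two.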

Figure 1 ∣

Trans-ethnic GWAS meta-analysis identifies 183 loci associated with serum urate.

Outer ring: Dot size represents the genetic effect size of the index SNP at each labeled locus on serum urate. Blue band: −log10(two-sided meta-analysis P-value) for association with serum urate (n = 457,690), by chromosomal position (GRCh37 (hg19) reference build). Red line indicates genome-wide significance (P = 5 × 10−8). Blue gene labels indicate novel loci, gray labels indicate loci reported in previous GWAS of serum urate. Green band: −log10(two-sided meta-analysis P-value) for association with gout (n = 763,813), by chromosomal position. Red line indicates genome-wide significance (P = 5 × 10−8). Inner band: Dots represent index SNPs with significant heterogeneity and are color-coded according to the source of heterogeneity: green for ancestry-related heterogeneity (Panc-het < 2.7 × 10−4 (0.05/183)), red for residual heterogeneity (Pres-het < 2.7 × 10−4), and yellow for both (Panc-het and Pres-het < 2.7 × 10−4). Loci are labeled with the gene closest to the index SNP. Panc-het and Pres-het were generated by MR-MEGA (Methods).

The index SNPs at all 183 loci explained 7.7% of the serum urate variance (Methods), compared to 5.3% explained by index SNPs previously reported from GWAS in EA populations[21]. In a large participating general population-based pedigree study, the 183 index SNPs explained 17% of serum urate genetic heritability (h2 = 37%, 95% credible interval: 29%, 45%), with 5% attributed to the index SNPs at SLC2A9, ABCG2 and SLC22A12 (Supplementary Fig. 2 and Methods).

Characterization of ancestry-related heterogeneity

For the 183 index SNPs, we observed no evidence of systematic between-study heterogeneity (median I2 = 2%, interquartile range 0-14%; Supplementary Table 3). Fourteen index SNPs showed significant evidence of ancestry-associated heterogeneity (Panc-het < 2.7 × 10−4 = 0.05/183) when tested using meta-regression (Supplementary Fig. 3 and Methods), consistent with their higher measures of between-study heterogeneity (I2 > 25%, Fig. 1 and Supplementary Table 3). The most significant ancestry-associated heterogeneity was observed for rs3775947 at SLC2A9 (Panc-het = 1.5 × 10−127, allelic effect 0.34 (EA), 0.26 (AA), 0.17 (EAS), 0.41 (HIS), and 0.21 (SA) mg/dl), consistent with previous reports of population heterogeneity at this locus[27]. Nine genome-wide significant loci identified through meta-regression did not overlap with the 183 loci, including SLC2A2 and KCNQ1, which were genome-wide significant in EAS (Supplementary Table 4). Ancestry-specific meta-analyses of EA, AA, EAS and SA are summarized in Supplementary Tables 5-8, respectively, and in the Supplementary Note.
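The between-study heterogeneity measure I2 quoted above is conventionally derived from Cochran's Q statistic. The sketch below uses that standard definition; it is an assumption that the study computed I2 in exactly this way, and the function name and inputs are hypothetical. The ancestry-related and residual heterogeneity P-values come from MR-MEGA meta-regression, which is not reproduced here.

```python
def cochran_q_and_i2(betas, ses):
    """Return (Q, I2) for per-study effects of one SNP.

    Q  = sum_i w_i * (beta_i - pooled_beta)^2 with w_i = 1 / se_i^2
    I2 = max(0, (Q - df) / Q), the share of variation beyond chance.
    """
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need at least two studies with matching inputs")
    weights = [1.0 / (se * se) for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2


# Hypothetical example: two studies with clearly different effects.
q, i2 = cochran_q_and_i2([0.0, 0.4], [0.1, 0.1])
```

In this toy case Q is 8 on one degree of freedom, giving an I2 of 87.5%, far above the 25% level mentioned for the heterogeneous index SNPs.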

Sex-stratified meta-analyses of serum urate GWAS

Mean serum urate levels and gout risk are higher in men than in women[28]. We therefore tested whether the 183 urate-associated index SNPs showed sex-specific differences. Six SNPs showed significant effect differences (Pdiff < 2.7 × 10−4 = 0.05/183), at SLC2A9, ABCG2, CAPN1, GCKR, IDH2, and SLC22A12 (Supplementary Table 9). The genome-wide test for differences in genetic effects on urate levels between men and women identified only SNPs at SLC2A9 and ABCG2 (Pdiff < 5 × 10−8, Methods and Supplementary Fig. 4), consistent with previous reports[7,14,15,21], and several suggestive loci (Pdiff < 1 × 10−5, Supplementary Table 10).
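A common way to test for a difference between sex-specific effect estimates is a z-test on the difference of the two estimates. The sketch below assumes the male and female strata are independent, so that the variance of the difference is the sum of the two variances; whether the study used exactly this statistic is described in the Methods and is treated here as an assumption. All numbers are hypothetical.

```python
import math


def sex_difference_test(beta_men, se_men, beta_women, se_women):
    """Two-sided z-test for beta_men != beta_women.

    Assumes the two estimates come from independent (non-overlapping)
    samples, so no covariance term is subtracted.
    """
    diff = beta_men - beta_women
    se_diff = math.sqrt(se_men ** 2 + se_women ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical SNP with a larger effect in men than in women.
z, p_diff = sex_difference_test(0.30, 0.03, 0.10, 0.04)
significant = p_diff < 0.05 / 183  # Bonferroni threshold used for the index SNPs
```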

Urate index SNPs are associated with gout

We next assessed the association of the 183 trans-ethnic urate index SNPs with gout in a trans-ethnic meta-analysis of 20 studies comprising 763,813 participants with 13,179 gout cases (Methods, Fig. 1 and Supplementary Table 1). Consistent with the causal role of hyperuricemia in gout, genetic effects were highly correlated (Spearman correlation coefficient 0.87, Supplementary Fig. 5a; 0.82 for SNPs with urate association P-values between 5 × 10−8 and 1 × 10−8). Fifty-five SNPs were significantly associated with gout (P < 2.7 × 10−4 = 0.05/183). In agreement with previous findings[29], the largest odds ratio (OR) for gout was observed at ABCG2 (OR 2.04, 95% confidence interval (CI) 1.96-2.12, P = 7.7 × 10−299). Genetic effects were generally larger among index SNPs with lower minor allele frequency (MAF), with the exception of a few common large-effect SNPs in known major urate loci SLC2A9, ABCG2, and SLC22A12[30] (Supplementary Fig. 5b).

A genetic risk score for urate improves gout risk prediction

We evaluated whether a weighted urate genetic risk score (GRS) improved gout risk prediction when added to demographic information in a large, independent sample of 334,880 UKBB participants, including 4,908 gout cases (Methods). Across categories of the GRS, gout prevalence increased from 0.1% to 12.9% (Fig. 2a and Supplementary Table 11). Compared to the most common GRS category, the age- and sex-adjusted OR of gout ranged from 0.09 (95% CI 0.02-0.37, P = 7.8 × 10−4) in the lowest to 13.6 (95% CI 7.2-25.7, P = 1.4 × 10−15) in the highest GRS category (Fig. 2b and Supplementary Table 11). The 3.5% of individuals in the three highest GRS categories had a >3-fold increase in gout risk compared to individuals in the most common GRS category. This risk is comparable to a monogenic disease of modest effect size[31], but affects a higher proportion of the population.
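A weighted GRS of this kind can be sketched as the sum, over index SNPs, of the number of urate-increasing alleles carried multiplied by the per-allele effect on serum urate. This matches the interpretation in the Fig. 2 legend (genetically predicted urate in mg/dl relative to carrying no urate-increasing alleles), but the function below is only an illustration; the dosages and weights shown are hypothetical, and the exact construction is given in the Methods.

```python
def weighted_grs(dosages, effects_mg_dl):
    """Weighted genetic risk score for one individual.

    dosages:       count of the urate-increasing allele per SNP (0 to 2;
                   imputed dosages may be fractional)
    effects_mg_dl: per-allele effect of that allele on serum urate (mg/dl)
    """
    if len(dosages) != len(effects_mg_dl):
        raise ValueError("one effect size is required per SNP")
    if any(d < 0 or d > 2 for d in dosages):
        raise ValueError("allele dosages must lie between 0 and 2")
    return sum(d * w for d, w in zip(dosages, effects_mg_dl))


# Hypothetical individual typed at three SNPs.
grs = weighted_grs([2, 1, 0], [0.28, 0.05, 0.017])  # about 0.61 mg/dl
```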

Figure 2 ∣

A genetic risk score (GRS) for serum urate improves gout risk prediction.

a, Histogram of the urate GRS among 334,880 European ancestry participants of the UK Biobank. The y-axes show the number of individuals (left) and the prevalence of gout (right), the x-axis shows categories of the urate GRS. The units on the x-axis represent genetically predicted serum urate levels (mg/dl) compared to individuals without any urate-increasing alleles. b, Age- and sex-adjusted odds ratio of gout (y-axis) by GRS category (x-axis) among 334,880 European-ancestry participants of the UK Biobank, comparing each category to the most prevalent category (4.74 < GRS ≤ 5.02) with error bars representing 95% confidence intervals; * denotes logistic regression two-sided P-value < 0.05, ** denotes P < 5 × 10−10, and *** P < 5 × 10−100. c, Comparison of the receiver operating characteristic (ROC) curves of different prediction models of gout: genetic (GRS only; red), demographic (age + sex; green), and combined (GRS + age + sex; blue). y-axis: sensitivity, x-axis: specificity. At the optimal cut points determined by the maximum of the Youden’s index, the sensitivity of the combined model was 84% and specificity was 68%.

We additionally constructed gout risk prediction models in the UKBB sample, which was not part of the discovery analysis of serum urate-associated variants. Gout status was regressed on the GRS alone (“genetic model”), on age and sex (“demographic model”), and on the GRS, age, and sex (“combined model”) in a model development subset of 90% of the individuals to obtain precise estimates. These models were then used to predict gout status in the remaining 10%, the validation sample. The genetic model was a weaker predictor (area under the receiver operating characteristic curve (AUC) = 0.68) than the demographic model (AUC = 0.79). Addition of the GRS (combined model) significantly increased prediction accuracy (AUC = 0.84, DeLong’s test P < 2.2 × 10−16; Fig. 2c) and achieved a sensitivity of 84% and specificity of 68%. Ten-fold cross-validation of the regression models provided mean AUCs of 0.67 (s.d. 0.011), 0.78 (s.d. 0.006) and 0.83 (s.d. 0.008) for the genetic, demographic and combined models, respectively (Methods). The GRS represents a life-long predisposition to higher urate levels and can be calculated at birth. Thus, the GRS may help to identify individuals with a high genetic predisposition for gout, allowing for compensatory lifestyle choices to reduce the risk of gout.
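To make the AUC and Youden's index concrete, the sketch below computes both from predicted scores and observed case status. It is a plain-Python illustration with hypothetical inputs, not the software used in the study; the pairwise AUC computation is quadratic in sample size and is meant for small examples only.

```python
def auc(scores, labels):
    """Probability that a random case scores higher than a random control
    (ties count one half); equivalent to the area under the ROC curve."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def youden_optimal(scores, labels):
    """Return (threshold, sensitivity, specificity) maximizing
    Youden's J = sensitivity + specificity - 1.
    An individual is predicted to be a case when score >= threshold."""
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        point = (t, tp / n_cases, tn / n_controls)
        if best is None or point[1] + point[2] > best[1] + best[2]:
            best = point
    return best


# Hypothetical predicted risks for two controls (0) and two cases (1).
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
threshold, sens, spec = youden_optimal([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

For the combined model reported above, a sensitivity of 84% and a specificity of 68% correspond to a Youden's J of 0.84 + 0.68 − 1 = 0.52.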

High genetic correlations of serum urate with cardio-metabolic traits

Serum urate is positively correlated with many cardio-metabolic risk factors and diseases[32]. We assessed genetic correlations between urate and 748 complex traits using cross-trait LD score regression (Methods). Serum urate levels were significantly (P < 6.6 × 10−5 = 0.05/748) genetically correlated with 214 complex traits and diseases (Supplementary Table 12). The highest positive genetic correlation (rg) was with gout (rg = 0.92, P = 3.3 × 10−70), followed by traits representing components of the metabolic syndrome such as HOMA-IR (rg = 0.49) and fasting insulin (rg = 0.45, Fig. 3). The largest negative correlations were observed with HDL cholesterol-related measurements (rg up to −0.46), and with estimated glomerular filtration rate (rg = −0.38 and −0.26 for cystatin C and creatinine-based estimated glomerular filtration rate (eGFR), respectively), consistent with the known role of the kidneys in urate excretion. Overall, the genetic correlations were consistent with observational associations from epidemiological studies[32].
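The significance thresholds used throughout follow a Bonferroni correction, dividing a family-wise error rate of 0.05 by the number of tests. The short sketch below reproduces this arithmetic for the two test counts that appear in the text; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold controlling the family-wise error rate."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


threshold_index_snps = bonferroni_threshold(0.05, 183)  # about 2.7e-4
threshold_traits = bonferroni_threshold(0.05, 748)      # about 6.7e-5
```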

Figure 3 ∣

Serum urate shows widespread genetic correlations with cardio-metabolic risk factors and diseases.

The Circos plot shows significant genome-wide genetic correlations between serum urate and 214 complex traits or diseases (genetic correlation P < 6.6 × 10−5 = 0.05/748 traits tested), with bar height proportional to the genetic correlation coefficient (rg) estimate for each trait and coloring according to its direction (dark blue, rg > 0; light blue, rg < 0). Traits and diseases are labeled on the outside of the plot and grouped into nine different categories. Each category is color-coded (inner ring, inset). The greatest genetic correlation was observed with gout (rg = 0.92, P = 3.3 × 10−70). Genetic correlations with multiple cardio-metabolic risk factors and diseases reflect their known directions from observational studies. The serum urate association statistics for estimating genetic correlations were from the European-ancestry meta-analysis (n = 288,649).

To examine whether these genetic correlations reflect causal relationships or pleiotropy, we applied a recently developed latent causal variable (LCV) model to estimate the genetic causality proportion (GCP) for seven commonly studied cardio-metabolic traits (Methods). As a positive control, we analyzed gout, confirming a genetically causal effect of urate on gout (GCP = 0.79; Supplementary Table 13), consistent with Mendelian randomization (MR) studies[33,34]. The seven cardio-metabolic traits showed a GCP range consistent with mostly or partially genetically causal effects on serum urate. The largest GCP estimates were observed for adiposity-related traits (e.g. GCP = −0.84 for waist circumference; Supplementary Table 13), where higher cell numbers should result in higher purine and consequently urate production. A bi-directional MR study reported a causal effect of adiposity on serum urate levels[35]. While the GCP and MR methods estimate different quantities to assess causality, the direction of effect can be compared and was consistent with a positive causal effect of obesity on serum urate. Smaller GCP estimates for HDL cholesterol levels (GCP < 0.5; Supplementary Table 13) on the other hand suggest the existence of a genetic process with a causal effect on both HDL cholesterol and serum urate, for example co-regulated metabolic processes in the liver. These processes may explain a large fraction of heritability for cholesterol levels and a modest fraction for urate, a type of asymmetry expected to produce a partially genetically causal relationship consistent with the one observed. MR studies did not support a causal relationship between cholesterol levels and serum urate[36].

Enriched tissues and pathways

To identify tissues and molecular mechanisms relevant for urate metabolism and handling, and to provide potential clues to the observed genetic correlations, we investigated which tissues, cell types and systems were significantly enriched for the expression of genes mapping into urate-associated loci (Methods). Based on all SNPs with P < 1 × 10−5, we identified significant enrichment (false discovery rate (FDR) < 0.01) for 19 physiological systems, three tissues, and two cell types (Supplementary Table 14). The strongest enrichment was observed for kidney (P = 9.5 × 10−9) and urinary tract (P = 9.9 × 10−9), consistent with the kidney’s prominent role in controlling urate levels. Additional significant enrichments were observed for endocrine and digestive systems, including liver, the major site of urate production. Interestingly, a novel significant enrichment was also observed in the musculoskeletal system, specifically for synovial membrane, joint capsule, and joints (Fig. 4a), the sites of gout attacks.

Figure 4 ∣

Genes expressed in urate-associated loci are enriched in kidney tissue and pathways.

a, Grouped physiological systems (x-axis) that were tested individually for enrichment of expression of genes in urate-associated loci among European-ancestry individuals (n = 288,649) using DEPICT are shown as a bar plot, with the −log10(enrichment P-value) on the y-axis. Significantly enriched systems are labeled and highlighted in blue (enrichment false discovery rate (FDR) < 0.01). b, Correlated (r > 0.2) meta-gene sets that were strongly enriched (enrichment FDR < 0.01) for genes mapping into urate-associated loci among European-ancestry individuals (n = 288,649). Thickness of the edges represents the magnitude of the correlation coefficient, node size, color and intensity represent the number of clustered gene sets, gene set origin, and enrichment P-value, respectively.

We next tested for cell-type groups with evidence for enriched heritability based on cell-type-specific functional genomic elements using stratified LD score regression (Methods). The strongest enrichment was observed for kidney (11.5-fold), followed by liver (5.39-fold; Supplementary Table 15). Lastly, we tested whether any gene sets were enriched for variants associated with urate at P < 10−5 (Methods). Significant enrichment (FDR < 0.01) was observed for 383 reconstituted gene sets (Supplementary Table 16). Since many of these contained overlapping groups of genes, we used affinity propagation clustering to identify 57 meta gene sets (Methods and Supplementary Table 17), including a prominent group of inter-correlated gene sets related to kidney and liver development, morphology and function (Fig. 4b). Together, these results underscore the prominent roles of the kidney and liver in regulating serum urate levels and implicate the kidney as a major target organ for lowering serum urate.

Prioritization via fine-mapping, functional annotation, and gene expression

We established a workflow that combined fine mapping of urate-associated loci with functional annotation and a systematic evaluation of tissue-specific differential gene expression to prioritize target SNPs and genes for translational research.

Statistical fine-mapping prioritizes candidate SNPs.

Statistical fine-mapping was performed starting from the 123 genome-wide significant loci identified in the EA-specific meta-analysis, because the workflow included methods that used LD estimates from an ancestry-matched reference panel (Methods)[37]. After LD-based combination into 99 larger genomic regions, stepwise model selection in each region identified 114 independent SNPs (r2 < 0.01, Methods). Overall, 87 regions contained one independent SNP, ten contained two independent SNPs, the ABCG2 locus contained three and the SLC2A9 locus four independent SNPs (Supplementary Table 18). We computed 99% credible sets representing the smallest set of SNPs which collectively account for 99% posterior probability of containing the variant(s) driving the association signal (PPA)[38]. The 99% credible sets contained a median of 16 SNPs (Q1, Q3: 6, 57), and six of them only a single SNP, mapping in or near INSR, RBM8A, MPPED2, HNF4A, CPT1C, and SLC2A9 (Supplementary Table 18). Among 28 small credible sets (≤ 5 SNPs), several mapped in or near genes with an established role in urate handling such as SLC2A9, PDZK1, ABCG2, SLC22A11, and SLC16A9[20]. These credible sets contain the most supported SNPs and greatly reduce the number of candidate variants for experimental follow-up. Credible set SNPs were annotated for their functional consequence and regulatory potential (Methods). Missense SNPs with PPA > 50% or belonging to small credible sets were identified in ABCG2, UNC5CL, HNF1A, HNF4A, CPS1, and GCKR (Fig. 5a and Supplementary Table 19). All missense SNPs except the one in GCKR had a CADD score > 15, supporting them as potentially deleterious. Indeed, functional effects have already been demonstrated experimentally for rs2231142 (Gln141Lys, r2 = 1 to the index SNP rs74904971) in ABCG2, rs742493 (p.Arg432Gly) in UNC5CL, and rs1260326 (p.Leu446Pro) in GCKR (Table 1). 
Non-exonic variants with PPA > 90% and mapping into open chromatin in enriched tissues were identified in RBM8A, SLC2A9, INSR, HNF4A, PDZK1, NRG4, UNC5CL, and AAK1 (Methods, Supplementary Fig. 6 and Supplementary Table 19). When complemented by evidence of gene expression co-localization, these SNPs may represent causal regulatory variants and highlight their potential effector genes.
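The construction of a 99% credible set can be sketched as follows: an approximate Bayes factor is computed for each SNP in a region, Bayes factors are normalized to posterior probabilities (PPA), and SNPs are added in order of decreasing PPA until the cumulative probability reaches 99%. The sketch uses Wakefield's approximate Bayes factor and assumes a single causal variant per signal, equal prior probabilities across SNPs, and a prior effect variance of 0.04; the prior variance and all function names are assumptions made for illustration, and the conditional analysis for regions with several independent SNPs is not shown.

```python
import math


def wakefield_log_bf(beta, se, prior_var=0.04):
    """Log approximate Bayes factor in favour of association.

    prior_var (W) is the assumed prior variance of the true effect;
    0.04 is a placeholder value, not taken from the study.
    """
    v = se * se
    z2 = (beta / se) ** 2
    shrink = prior_var / (v + prior_var)
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z2 * shrink


def posterior_probabilities(log_bfs):
    """Normalize Bayes factors to PPAs, assuming exactly one causal SNP
    and equal priors; subtracting the maximum avoids overflow."""
    m = max(log_bfs)
    rel = [math.exp(x - m) for x in log_bfs]
    total = sum(rel)
    return [r / total for r in rel]


def credible_set(ppas, coverage=0.99):
    """Indices of the smallest set of SNPs whose PPAs sum to >= coverage."""
    order = sorted(range(len(ppas)), key=lambda i: ppas[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += ppas[i]
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical region with four SNPs and already-computed PPAs.
members = credible_set([0.6, 0.3, 0.095, 0.005])  # indices of the first three SNPs
```

A credible set containing a single SNP, as reported for the HNF4A locus, corresponds to one variant carrying at least 99% of the posterior probability on its own.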

Figure 5 ∣

Prioritization of p.Thr139Ile at HNF4A and functional study of HNF4A regulation of ABCG2 transcription.

a, Graph shows credible set size (x-axis) against the posterior probability of association (PPA; y-axis) for each of 1,453 SNPs with PPA > 1% in 114 99% credible sets. Triangles mark missense SNPs, with size proportional to their Combined Annotation Dependent Depletion (CADD) score. Blue triangles indicate missense variants mapping into small (≤ 5 SNPs) credible sets or with high PPA (≥ 50%). b, Predicted HNF1A or HNF4A binding sites in the promoter region of ABCG2 using LASAGNA 2.0, the consensus affinity sequence, and the P-value of likely matches based on nucleotide position within a consensus transcription factor binding site (Methods). c, Relative luciferase activity and transactivation of ABCG2 promoter in cells transfected with variable amount of HNF1A or HNF4A constructs (mean (line) ± s.e.m. (whiskers), n = 3 independent experiments, P-values calculated with ordinary one-way ANOVA with Tukey’s multiple comparison test). d, Position of p.Thr139Ile (T139I) in DNA binding domain/hinge region within HNF4A homodimer structure (PDB 4IQR). e, Relative luciferase activity and transactivation of ABCG2 promoter in cells transfected with variable amount of constructs (ng’s of transfected DNA) of wild-type HNF4A (threonine) or isoleucine at position 139 (± s.e.m., n = 3 independent experiments, P-values calculated with ordinary one-way ANOVA with Tukey’s multiple comparison test).

Table 1

Genes implicated as causal via identification of missense variants with high probability of driving the urate association signal.

Genes are included if they contain a missense variant with posterior probability of association of >50% or mapping into a small credible set (≤5 SNPs).

Gene	SNP	#SNPs in set	SNP PP	Consequence	CADD	DHS	Gout meta-analysis P-value (EA)	Brief summary of literature and gene function
ABCG2	rs2231142	4	0.41	p.Gln141Lys(NP_004818.2)	18.2	ENCODE epithelial	1.21E-290	Encodes a xenobiotic and high-capacity urate membrane transporter expressed in kidney, liver and gut. Causal variants have been reported for gout susceptibility (#138900) and the Junior Jr(a-) blood group phenotype (#614490). The locus was first identified in association with serum urate through GWAS (PMID:18834626) and confirmed in many studies since. The common causal variant Q141K has been experimentally confirmed (PMID:19506252) as a partial loss of function.
UNC5CL	rs742493	4	0.95	p.Arg432Gly (NP_775832.2)(within Death domain)	21.0	ENCODE epithelial	2.73E-01	Encodes for the death-domain-containing Unc-5 Family C-Terminal-Like membrane-bound protein. Suggested as a candidate gene for mucosal diseases, with a role in epithelial inflammation and immunity (PMID:22158417). Experiments using human HEK293 cells showed that UNC5CL can transduce pro-inflammatory programs via activation of NF-κB, with the 432Gly variant less potent to do so than the 432Arg one (PMID:22158417).
HNF1A	rs1800574	2	0.92	p.Ala98Val (NP_000536.5)	23.4		1.83E-02	Encodes a transcription factor with strong expression in liver, guts and kidney. Rare mutations cause autosomal-dominant MODY type III (#600496). Locus found in GWAS of T2D (PMID:22325160) and blood urea nitrogen (PMID:29403010). Together with HNF4-alpha, it was first recognized as master regulator of hepatocyte and islet transcription. Knockout mice show proximal tubular dysfunction (Fanconi syndrome). HNF1A enhanced promoter activity of PDZK1, URAT1, NPT4 and OAT4 in human renal proximal tubule cell-based assays (PMID:28724612), supporting a role in the coordinated expression of components of the urate “transportosome”.
HNF4A	rs1800961	1	1.00	p.Thr139Ile (NP_000448.3)	24.7	ENCODE pancreas	7.43E-03	Encodes another nuclear receptor and transcription factor that controls expression of many genes, including HNF1A and other overlapping target genes. Rare mutations cause autosomal-dominant MODY type I (#125850) and autosomal-dominant renal Fanconi syndrome 4 (# 616026). Shown to regulate expression of SLC2A9 and other members of the urate "transportosome" in cell-based assays (PMID 25209865, PMID:30124855). The GWAS locus has been reported for multiple cardio-metabolic traits and T2D (PMID:21874001).
CPS1	rs1047891	84	0.84	p.Thr1412Asn (NP_001116105.1)	22.1		5.66E-02	Encodes mitochondrial carbamoyl phosphate synthetase I, which catalyzes the first committed step of the urea cycle by synthesizing carbamoyl phosphate from ammonia, bicarbonate, and 2 molecules of ATP. Rare mutations cause autosomal-recessive carbamoylphosphate synthetase I deficiency (#237300). In addition to hyperammonemia, this disease features increased synthesis of glutamine, a precursor of purines. Elevated uric acid excretion has been reported in patients with hyperammonemia (PMID:6771064). GWAS locus for eGFR (PMID:26831199), homocysteine (PMID:23824729), urinary glycine concentrations (PMID: 26352407).
GCKR	rs1260326	2	0.67	p.Leu446Pro (NP_001477.2)	0.1	ENCODE kidney	4.09E-41	Encodes a regulatory protein prominently expressed in the liver that inhibits glucokinase. Identified in previous GWAS of urate (PMID:23263486) and multiple other cardio-metabolic traits. The 446L protein was shown to be less activated than 446Pro by physiological concentrations of fructose-6-phosphate, leading to reduced glucokinase inhibitory ability (PMID:19643913).

Abbreviation: PP, posterior probability; DHS, DNase-I hypersensitivity site; CADD, Combined Annotation Dependent Depletion phred score; EA, European ancestry.

Gout meta-analysis P-values were two-sided (n = 763,813). Posterior probabilities were estimated from statistical fine-mapping using the Wakefield approach (Methods).

We compared our fine-mapping workflow (“Wakefield”), established in previous studies[39,40], to an alternative approach implemented in FINEMAP (Methods)[41]. FINEMAP identified 152 credible sets (median of 7 SNPs). With respect to known causal variants in ABCG2 (rs2231142), GCKR (rs1260326), HNF4A (rs1800961) and PDZK1 (rs1967017), the Wakefield approach identified the causal variants in ABCG2, GCKR, and HNF4A as credible set members, whereas FINEMAP found those in ABCG2 and HNF4A. A comparison of all SNPs mapping into small credible sets (≤ 5 SNPs) identified through both approaches found highly correlated posterior probabilities (Pearson correlation coefficient 0.86, Supplementary Table 19).

Gene prioritization via gene expression co-localization analyses.

The urate association signals were next tested for co-localization with expression quantitative trait loci (eQTL) in cis across three kidney tissue resources and 44 GTEx tissues (Methods). High posterior probability of co-localization (H4 ≥ 0.8, Methods) supports a trait-associated variant acting through gene expression in the tissue where co-localization is identified. We identified co-localization with the expression of 13 genes in kidney (Fig. 6), the organ with the strongest enrichment for urate-associated variants. Whereas co-localization of some genes was only observed in kidney (SLC17A4, BICC1, UMOD, GALNTL5, NCOA7), others showed co-localization in several tissues (e.g., ARL6IP5). The direction of change in gene expression with higher urate levels could vary for the same gene across tissues. For instance, the allele associated with higher serum urate at SLC16A9 was associated with higher gene expression in kidney, consistent with a regulatory variant in a transporter mediating the reabsorption of urate. This same allele was associated with lower gene expression in other tissues such as aorta, pointing towards tissue-specific regulatory mechanisms[42]. Details of the 13 genes with evidence for co-localization with gene expression in kidney are summarized in Supplementary Table 20. Significant co-localizations across all 47 tissues (Supplementary Fig. 7) revealed additional insights such as co-localization of the urate association signal with NFAT5 expression in subcutaneous adipose tissue, emphasizing its role in adipogenesis[43], or PDZK1 expression in colon and ileum, important sites of urate excretion.

Figure 6 ∣ Co-localization of urate-association signals with gene expression in cis in kidney tissues.

Serum urate association signals identified among European ancestry individuals (n = 288,649) were tested for co-localization with all eQTLs where the eQTL cis-window overlapped (±100 kb) the index SNP. Genes with ≥1 positive co-localization (posterior probability of one common causal variant, H4, ≥ 0.80) in a kidney tissue are illustrated with the respective index SNP and transcript (y-axis). Co-localizations across all tissues (x-axis) are illustrated as dots, where the size of the dots indicates the posterior probability of the co-localization. Negative co-localizations (posterior probability of H4 < 0.80) are marked in gray, while the positive co-localizations are color-coded relative to the change in expression with a color gradient as indicated in the legend.

Lastly, we investigated whether any trans-ethnic index SNPs or their proxies (r2 > 0.8) were reproducibly associated with gene expression in trans in several large eQTL studies (Supplementary Table 21 and Supplementary Note). We identified inter-chromosomal associations between five index SNPs and 16 transcripts that were enriched in the term “cardiovascular disease” based on the Human Disease Ontology database (Supplementary Note and Supplementary Table 22).

HNF4A activates ABCG2 transcription and HNF4A p.Thr139Ile is a functional variant

The gene and variant prioritization workflow was validated using the identified candidates HNF1A and HNF4A. Co-regulation of target genes by these transcriptional master regulators in kidney proximal tubule and liver could potentially explain observed genetic correlations[44]. We first tested whether HNF1A and HNF4A affect transcription of ABCG2, which encodes for a major human urate transporter and represented the locus with the highest gout risk in our screen. The ABCG2 promoter region contains several predicted HNF1A and HNF4A binding sites (Fig. 5b). A luciferase reporter assay in the human embryonic kidney cell line HEK 293 was used to assess transactivation of the human ABCG2 promoter by HNF4A and HNF1A proteins (Methods and Supplementary Fig. 8a). Co-expression of HNF4A significantly increased the ABCG2 promoter-driven luciferase activity in a transfection dose- and HNF4A protein abundance-dependent manner (Fig. 5c and Supplementary Fig. 8b). No increase of luciferase activity occurred with the negative-control vector devoid of the ABCG2 promoter (Supplementary Fig. 8d,e). Results for HNF1A indicated that the observed association with serum urate is unlikely to occur via activation of ABCG2 in kidney cells (Fig. 5c), but HNF1A has been reported to activate transcription of PDZK1, which encodes a regulatory protein for several other renal urate transporters[45,46] also identified in this study. Next, we tested the functional relevance of the prioritized p.Thr139Ile allele in HNF4A (NM_178849.2, isoform 1, Methods). Its location within the hinge/DNA binding domain (Fig. 5d and Supplementary Fig. 8f) supports potentially altered interactions with targeted promoter regions. The isoleucine substitution at position 139 significantly increased the transactivation of the ABCG2 promoter as compared to the wild-type threonine (Fig. 5e), without altering HNF4A protein abundance (Supplementary Fig. 8c). 
Thus, HNF4A can activate ABCG2 transcription in a kidney cell line, and HNF4A p.Thr139Ile is a functional variant. Increased activation of the urate excretory protein ABCG2 by the allele encoding the isoleucine residue should result in lower serum urate levels, consistent with the observed negative association in our GWAS.

Discussion

This trans-ethnic GWAS meta-analysis of serum urate based on 457,690 individuals represents a four-fold increase in sample size over previous studies[21,22,47] and identified 183 urate-associated loci, 147 of which are novel. A genetic urate risk score led to significant improvements of gout risk prediction among 334,880 UKBB participants: 3.5% had a risk of gout comparable to a Mendelian disease effect size. Genetic correlation and causality analyses confirmed the causal effect of urate on gout, and were consistent with transcriptional co-regulation as a source of pleiotropy in the widespread genetic correlations between serum urate and cardio-metabolic traits. Tissue and cell type-specific enrichment analyses supported kidney and liver, the sites of urate excretion and generation, as key target tissues. Comprehensive fine-mapping and co-localization analyses with gene expression across 47 tissues delivered an extensive list of target genes and SNPs for follow-up studies, of which we experimentally confirmed HNF4A p.Thr139Ile as a functional allele involved in transcriptional regulation of urate homeostasis. Major challenges of GWAS are to pinpoint causal genes and variants, and to provide actionable insights into disease-relevant mechanisms. This study developed a comprehensive resource of urate-related candidate SNPs, genes, tissues and pathways that will enable a wide range of follow-up studies. Out of the many novel and biologically plausible findings, we highlight two instances in which co-localization analyses provided new insights. First, co-localization helped to prioritize genes in association peaks that previous GWAS could not resolve. For example, the locus at chromosome 6p22.2 contains genes encoding for four members of the SLC17 transporter family (SLC17A1-SLC17A4). Systematic testing of co-localization across genes and tissues identified evidence only for SLC17A4 in kidney, with higher expression associated with higher serum urate. 
Previous experimental studies have implicated SLC17A4 as a urate exporter in intestine[48], and our data support its yet unappreciated role in renal urate transport. Second, co-localization with MUC1, BICC1 and UMOD expression in kidney suggests a shared biological mechanism. Rare mutations in all three genes underlie monogenic cystic kidney diseases[49-51]. Another noteworthy finding is the significant genetic correlations with many cardio-metabolic traits, consistent with observational associations[52]. Many of these traits are influenced by liver metabolism. The estimated genetic causality proportions supported their genetic correlations to be partly driven by overlapping or co-regulated metabolic pathways and not only by a fully causal effect of e.g. cholesterol or insulin levels on urate. Likewise, significant genetic correlations with kidney-related traits such as eGFR may reflect shared regulatory processes in the kidney. The observed pleiotropic effects of many urate-associated variants could thus be the potential manifestation of co-regulation of processes that occur within and across tissues relevant to the implicated traits, a mechanism likely to be prevailing in metabolic but also other traits. In the kidney, nuclear HNF4A is exclusively detected in the proximal tubule[53], where it has been reported to regulate the expression of SLC2A9 isoform 1[54] and PDZK1[55]. Kidney-specific deletion of Hnf4a in mice phenocopies Fanconi renotubular syndrome[56]. Transcriptomic analyses support HNF4A to drive a proximal tubule signature cluster of 221 co-expressed genes, including many candidate genes for urate metabolism and transport[53]. In addition to HNF4A, HNF4G, and HNF1A, ten genes in this cluster also map into urate-associated loci we identified (A1CF, CUBN, LRP2, PDZK1, SERPINF2, SLC2A9, SLC16A9, SLC17A1, SLC22A12 and SLC47A1). 
In addition, our study establishes that HNF4A can trans-activate transcription of ABCG2 in a kidney cell line, the key urate secretory transporter in gut and kidney epithelium[57]. The genetic variant encoding the p.Thr139Ile substitution is located in a region of the HNF4A protein harboring many causative mutations for monogenic maturity onset diabetes of the young (MODY type 1)[58]. Yet, unlike the severe MODY1 missense mutations p.Arg127Trp, p.Asp126Tyr, p.Arg125Trp,[59] p.Thr139Ile has not been reported to cause MODY1. Instead, it has been reported to increase the risk of type 2 diabetes, possibly through a liver-specific loss of HNF4A phosphorylation at p.Thr139, and to associate with HDL-cholesterol levels[58,60]. These data point to additional complexities when interpreting pleiotropic effects, because there may be several tissue-specific mechanisms by which genetic variants in transcriptional regulators influence metabolic pathways and urate homeostasis. Some limitations warrant mention. The numbers of individuals of ancestries other than European or East Asian were small, and the generalizability of the gout prediction models should be assessed in future independent studies of non-European ancestry. Focusing on SNPs present in the majority of studies emphasizes those that may be of greatest importance globally over population-specific variants. General limitations of the field include that statistical fine-mapping approaches based on meta-analysis summary statistics cannot clearly prioritize functional variants in regions of tight LD, and that they are influenced by the availability and imputation quality of SNPs in the contributing studies. Only few regulatory maps from important target tissues such as synovial membrane and kidney are available, but we were able to evaluate differential gene expression in three kidney datasets. 
Generating additional regulatory and expression datasets across disease states, developmental stages and additional cell types in kidney and other metabolically active organs constitutes an important future research avenue. Lastly, a large independent sample for adequately powered replication testing was unavailable and represents a future endeavor. However, high correlations between genetic effects on serum urate and gout even for SNPs with the weakest significant urate associations as well as no indication of significant heterogeneity reduce concerns about false positives. In summary, this large-scale study generated an atlas of candidate SNPs, genes, tissues and pathways involved in urate metabolism and its shared regulation with multiple cardio-metabolic traits that will enable a wide range of follow-up studies.

Online Methods

Phenotype definition, genotyping and imputation in participating studies

The primary study outcome was serum urate in mg/dl. The laboratory methods for measuring serum urate in each study are reported in Supplementary Table 1. Prevalent gout was analyzed as a secondary outcome to examine whether urate-associated SNPs conferred gout risk. Gout cases were ascertained based on self-report, intake of urate-lowering medications, or International Statistical Classification of Diseases and Related Health Problems (ICD) codes for gout (Supplementary Table 1). The participants of all studies provided written informed consent. Each study had its research protocol approved by the corresponding local ethics committee. Each study performed genotyping separately and imputed the genotypes to reference panels of the Haplotype Reference Consortium (HRC) version 1.1[61], 1000 Genomes Project (1000G) phase 3 v5 ALL, or the 1000G phase 1 v3 ALL[62]. Study-specific quality filters, and software used for phasing and imputation are provided in Supplementary Table 2 and the Supplementary Note. Variants were annotated using NCBI b37 (hg19).

Study-specific association analysis

Phenotype generation was standardized across studies using a common script, and study-specific association analyses followed a centrally developed analysis plan. GWAS summary statistics were checked centrally using GWAtoolbox[63] and custom scripts (Supplementary Note). Each study performed ancestry-specific association analysis of serum urate by generating age- and sex-adjusted residuals of serum urate and regressing the residuals on SNP dosage levels, adjusting for study-specific covariates such as study centers and genetic principal components, assuming an additive genetic model. Gout was analyzed as a binary outcome adjusting for age, sex, genetic principal components, and study-specific covariates. The software packages used for these regression analyses were EPACTS (q.emmax for family-based studies and q.linear otherwise; https://genome.sph.umich.edu/wiki/EPACTS), SNPTest[64], RegScan[65], RVTEST[66], PLINK 1.90[67], Probabel[68], GWAF[69], GEMMA[70], mach2qtl[71] and R. Family-based studies used methods that accounted for relatedness.

Trans-ethnic, ancestry-specific, and sex-stratified meta-analyses

GWAS results from each study were pre-filtered to retain bi-allelic SNPs with imputation quality score > 0.6 and minor allele count (MAC) > 10 before inclusion into meta-analysis. Fixed effects inverse-variance weighted meta-analysis was performed using METAL[72] with modifications to output higher precision (six decimal places). Genomic control was applied for each study. The genomic inflation factor λGC[73] was calculated to assess inflation of the test statistics. For each meta-analysis result (trans-ethnic, ancestry-specific, and sex-specific), we excluded SNPs that were present in <50% of the studies and with a total MAC < 400. For ancestry-specific meta-analysis, we additionally excluded SNPs with a heterogeneity I²-statistic[74] > 95%. Genome-wide significance was defined as P-value < 5 × 10−8. The LD score regression intercept was calculated to assess the evidence for associations driven by population structure[75]. For downstream characterization, 8,249,849 and 8,217,339 autosomal SNPs were retained in the trans-ethnic and European ancestry meta-analysis, respectively. Ancestry-specific meta-analyses were conducted for European ancestry (EA), African Americans (AA), East Asian (EAS) ancestry, and South Asian (SA) ancestry using the same methods and variant filters as the trans-ethnic meta-analysis. Secondary meta-analyses were performed separately in men and women, using the same analytical approaches. To test for significant difference of association between males and females, we used a two-sample t-test: $t = (\beta_M - \beta_F) / \sqrt{SE_M^2 + SE_F^2}$, where βM and βF were beta coefficients in males and females, respectively, and SEM and SEF were the standard errors among males and females, respectively.
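As an illustration of the two calculations described in this section, the following minimal Python sketch computes a fixed-effects inverse-variance weighted estimate and the sex-difference statistic. It is not the METAL implementation used in the study. The function names are hypothetical, the sex-difference statistic assumes the usual form for independent strata (difference of the betas divided by the square root of the summed squared standard errors), and the P-value uses a normal approximation, which is an assumption appropriate only for large samples.

```python
import math


def ivw_fixed_effects(betas, ses):
    """Fixed-effects inverse-variance weighted estimate and its standard error.

    betas, ses: per-study effect estimates and standard errors for one SNP.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def sex_difference_test(beta_m, se_m, beta_f, se_f):
    """Statistic for beta_M - beta_F, assuming independent male and female strata.

    Returns the statistic and a two-sided P-value from a normal approximation
    (an assumption made here for illustration).
    """
    t = (beta_m - beta_f) / math.sqrt(se_m ** 2 + se_f ** 2)
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p
```

With hypothetical inputs, `sex_difference_test(0.2, 0.03, 0.1, 0.04)` gives a statistic of about 2.0, since the combined standard error is 0.05.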

Initial determination and annotation of genome-wide significant loci

For each meta-analysis result, the SNP with the lowest P-value per chromosome was selected as an initial index SNP and, together with the surrounding ±500 kb, was defined as one 1-Mb locus. This procedure was repeated with the SNP with the lowest P-value not yet assigned to a locus, until no genome-wide significant SNPs outside 1-Mb loci remained. To visualize loci, the genomic region ±500 kb around each index SNP was plotted; a plot can contain two index SNPs when these were > 500 kb but < 1 Mb apart. An ancestry-specific locus was defined as a genome-wide significant locus in an ancestry-specific meta-analysis whose index SNP did not map within the ±500 kb interval of any genome-wide significant locus in the trans-ethnic meta-analysis. Index SNPs were annotated using their position and the nearest gene based on hg19, RefSeq genes, and dbSNP147 downloaded from ftp://hgdownload.soe.ucsc.edu/mysql/hg19/ on 23 March 2017.
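The iterative locus definition can be sketched as a greedy selection over SNPs sorted by P-value. The sketch below is illustrative only. The function name and the tuple layout of the input are hypothetical, and it treats a SNP as assigned when it lies within ±500 kb of an already selected index SNP on the same chromosome.

```python
def define_loci(snps, window=500_000, p_threshold=5e-8):
    """Greedy selection of index SNPs.

    snps: iterable of (snp_id, chrom, pos, p_value) tuples (hypothetical layout).
    Returns the index SNPs in order of increasing P-value.
    """
    # Only genome-wide significant SNPs can seed a locus.
    significant = sorted((s for s in snps if s[3] < p_threshold),
                         key=lambda s: s[3])
    index_snps = []
    for snp in significant:
        _, chrom, pos, _ = snp
        # A SNP is already assigned if it falls inside an existing 1-Mb locus.
        assigned = any(chrom == c and abs(pos - q) <= window
                       for _, c, q, _ in index_snps)
        if not assigned:
            index_snps.append(snp)
    return index_snps
```

Because candidates are visited in order of increasing P-value, each newly selected SNP is the most significant one not yet assigned to a locus, which mirrors the repeated step in the text. Two index SNPs on the same chromosome can therefore lie between 500 kb and 1 Mb apart.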

Proportion of phenotypic variance explained and estimated heritability

The proportion of phenotypic variance explained by index SNPs was calculated as the sum of the variance explained by each index SNP based on this formula: $\beta^2 \cdot 2p(1-p) / \mathrm{var}$, where β is the beta coefficient and p is the MAF of the SNP, and var is the phenotypic variance. For this study, we used the variance of the age- and sex-adjusted residuals of serum urate in EA participants of the ARIC study as the estimate of the phenotypic variance (variance = 1.767). Genetic heritability of age- and sex-adjusted urate levels was estimated using the R package ‘MCMCglmm’[76] in the Cooperative Health Research In South Tyrol (CHRIS) study[77], a participating EA study with 4,373 individuals split into 186 up-to-five generation pedigrees[78]. Genetic heritability was estimated overall, after accounting for the index SNPs of the three major urate loci (SLC2A9, ABCG2, and SLC22A12), and after accounting for the index SNPs of all genome-wide significant loci for both the trans-ethnic and EA-specific meta-analyses. Estimates were obtained by running 1,000,000 MCMC iterations (burn in = 500,000) based on previously described settings[78]. The difference between the overall heritability and the heritability excluding the index SNPs represents the heritability explained by the identified loci.
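A minimal Python sketch of the variance-explained calculation follows. It assumes the commonly used per-SNP expression β² · 2p(1 − p) / var under an additive model. The function names are hypothetical, and any beta and MAF values passed in would be placeholders rather than study results. Only the default phenotypic variance of 1.767 is taken from the text.

```python
def variance_explained(beta, maf, phenotypic_variance):
    """Proportion of phenotypic variance explained by one SNP (additive model)."""
    return beta ** 2 * 2.0 * maf * (1.0 - maf) / phenotypic_variance


def total_variance_explained(index_snps, phenotypic_variance=1.767):
    """Sum the per-SNP contributions over (beta, maf) pairs of index SNPs.

    The default variance is the ARIC estimate quoted in the text.
    """
    return sum(variance_explained(beta, maf, phenotypic_variance)
               for beta, maf in index_snps)
```

Summing per-SNP contributions in this way implicitly treats the index SNPs as uncorrelated.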

Trans-ethnic meta-regression

Prior to conducting trans-ethnic meta-regression, we applied the same study-specific SNP filters as those applied to the fixed effects trans-ethnic meta-analysis (imputation quality score > 0.6 and MAC > 10). An additional filter for MAF > 0.0025 was also applied to reduce the influence of rare SNPs that passed the MAC filter in very large studies. Trans-ethnic meta-regression was conducted using the MR-MEGA software package[79], which models ancestry-associated heterogeneity in the allelic effect as a function of principal components (PCs) generated from a matrix of mean pairwise allele frequency differences between studies. Three principal components generated from a matrix of mean pairwise allele frequency differences between studies were sufficient to separate the self-reported ancestry groups. Due to software requirements, the minimum number of cohorts for each SNP had to be greater than the number of PCs plus two, resulting in the exclusion of SNPs present in five or fewer cohorts. In addition to genome-wide SNP associations with urate, MR-MEGA reports ancestry-associated (Panc-het) and residual heterogeneity (Pres-het). Index SNPs from the fixed effects meta-analysis with Panc-het < 2.7 × 10−4 (0.05/183) in MR-MEGA were considered to have significant ancestry-associated heterogeneity.

Effect of urate-associated index SNPs on gout and risk prediction for gout

To evaluate the association of the trans-ethnic urate-associated index SNPs with gout, we conducted a trans-ethnic meta-analysis of gout with the same study-specific filtering criteria as for the urate trans-ethnic meta-analysis. The association between a genetic urate risk score constructed from the 114 independent serum urate-associated SNPs identified among European individuals (see fine-mapping section below) and gout was assessed in a large, independent sample from the UKBB (Projects 19655 and 20272)[80]. We selected 334,880 unrelated individuals (pairwise kinship coefficient < 0.0313) of White British ancestry with sex chromosome euploidy and concordance of phenotypic and genotypic sex, including 4,908 with gout identified by self-report at the inclusion visit. Individuals with an ICD10 for gout (M10) in hospital admissions who did not self-report gout were excluded from the analysis. A genetic risk score (GRS) was constructed as the sum of the imputed dosage of the allele associated with higher urate levels (“risk alleles”) over all SNPs, multiplied by the genetic effect of the risk allele on serum urate levels. The GRS distribution was divided into ten evenly spaced categories, and individuals assigned to a category based on their GRS. The category with the lowest GRS did not contain any gout cases and so was combined with its adjacent category. Gout status was regressed on GRS category in a logistic model, including age and sex as covariates, with the category containing the largest number of individuals (genetically predicted mean urate levels 4.74-5.02 mg/dl higher compared to individuals without any urate-increasing alleles) as the reference group. The performance of the GRS for risk prediction of gout was first evaluated in a randomly selected model development sample comprising 90% of the participants to obtain precise estimates, and tested in a validation sample of the remaining 10%. 
Logistic regression was used to regress gout on the GRS alone (genetic model), age and sex (demographic model) and GRS with age and sex (combined model) in the model development sample. Each of these models was then used to predict gout status in the validation sample. Model performance was assessed by comparing predicted and true gout status using Area Under the Curve (AUC) in a Receiver Operating Characteristic (ROC) curve. A cutoff of the ROC curve to report sensitivity and specificity of a combined GRS-based diagnostic test was determined by the maximum of Youden’s index (sensitivity + specificity - 1). Ten-fold cross-validation of the models was performed by randomly dividing the UKBB sample into ten equally sized groups. Each group in turn was used as the validation sample for the estimates developed on the remaining data. The AUC of the ROC curve was calculated for each of the three models for all ten validation samples, and the means and standard deviations are reported.
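The weighted risk score and the two performance measures named above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the analysis code of the study. The function names are hypothetical, the scores passed to the evaluation functions stand in for model-predicted risks, and the pairwise AUC computation is quadratic in sample size and therefore suited only to small examples.

```python
def genetic_risk_score(dosages, effects):
    """Sum of risk-allele dosages weighted by their effects on serum urate."""
    return sum(d * e for d, e in zip(dosages, effects))


def roc_auc(scores, labels):
    """AUC as the probability that a case scores higher than a control.

    Ties between a case and a control count as 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def youden_cutoff(scores, labels):
    """Cutoff maximizing sensitivity + specificity - 1.

    A sample is predicted to be a case when its score is >= the cutoff.
    Returns (cutoff, Youden's index); the lowest cutoff wins on ties.
    """
    n_case = sum(labels)
    n_ctrl = len(labels) - n_case
    best = (None, -1.0)
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        j = tp / n_case + tn / n_ctrl - 1.0
        if j > best[1]:
            best = (cut, j)
    return best
```

In a cross-validation setting, these functions would be applied to the predictions for each held-out group in turn, and the resulting AUC values summarized by their mean and standard deviation.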

Genetic correlation

To assess the genetic correlation between serum urate and other traits in EA, we conducted cross-trait LD score regression[81] using LD Hub[82] with the EA-specific urate meta-analysis results as input. Genetic correlation estimates with 746 traits were obtained from LD Hub, excluding two previous serum urate GWAS results. For presentation, the 212 significantly correlated traits (P < 6.7×10−5 = 0.05/746) were grouped into 9 categories based on the trait names and labels and presented in a circos plot. To determine whether observed genetic correlations between serum urate and cardio-metabolic traits are likely to represent causal relationships, we used the recently developed latent causal variable (LCV) method to estimate the genetic causality proportion (GCP) between serum urate and another trait[83]. Compared to MR, the LCV method produces fewer false positive results in the setting of high genetic correlation and large sample sizes, a situation applicable to our analysis[83]. The GCP describes what proportion of the genetic component of one trait also affects the other trait; a positive GCP value indicates that a proportion of the genetic component of urate affects the other trait, and vice versa for a negative GCP value. LCV produces posterior mean and standard deviation estimates of the GCP using mixed fourth moments of the bivariate effect size distribution, based on GWAS summary statistics and LD scores. When using summary statistics of cardio-metabolic traits generated from the UKBB, we assumed non-overlapping populations, and overlapping populations otherwise. We selected six unique continuous cardio-metabolic traits commonly examined in epidemiological studies with high genetic correlation with serum urate (∣rg∣ > 0.35). We additionally included gout as a positive control and creatinine-based glomerular filtration rate. 
EA-specific GWAS summary statistics were used as input to match the ancestry of the LD scores used with the method (https://data.broadinstitute.org/alkesgroup/LDSCORE/eur_w_ld_chr.tar.bz2).

Functional enrichment

To assess gene-set and tissue enrichment, we used the Data-Driven Expression Prioritized Integration for Complex Traits analysis (DEPICT) version 1 release 194[84], which performs gene set enrichment analysis by testing whether genes in 14,461 reconstituted gene sets were enriched for urate-associated SNPs (P-values < 1 × 10−5) from the trans-ethnic meta-analysis results. Affinity propagation clustering (APC)[85], implemented in the R package ‘APCluster’[86], was applied to all urate-associated reconstituted gene sets with false discovery rates (FDR)-corrected enrichment P-value < 0.01 to cluster gene sets containing similar combinations of genes. More details on the methods of DEPICT and APC are provided in the Supplementary Note. The methods for using stratified LD score regression[81] based on cell type–specific genomic annotations to identify cell type and tissue-specific enrichments of serum urate heritability are reported in the Supplementary Note.

Statistical fine-mapping of genome-wide significant loci in European ancestry

Statistical fine-mapping to identify potentially causal variants was performed for the genome-wide significant loci from the EA-specific meta-analysis. LD was estimated based on 16,969,363 SNPs from 13,558 unrelated UKBB participants after quality control (Supplementary Note). The analyses were based on a previously described workflow[39,40,87] using GCTA (cojo-slct option) to identify independent index SNPs in each region, followed by using GCTA (cojo-cond option) to obtain conditional beta and standard errors for regions with >1 independent signal. Next, approximate Bayes factors (ABF) were calculated using Wakefield’s formula[38], as implemented in the R package ‘gtx’ version 2.0.1 (https://github.com/tobyjohnson/gtx). The posterior probability for a variant being the driver of the association signal was calculated as the ABF of the variant divided by the sum of the ABF in the region. The 99% credible set of a region was derived by summing the posterior probabilities in descending order until the cumulative posterior probability was > 99%. We prioritized variants in credible sets containing ≤ 5 SNPs or SNPs with posterior probabilities > 0.5. More details on statistical fine-mapping are provided in the Supplementary Note.

Annotation of the variants in the credible sets

We annotated SNPs in the credible sets for their exonic effect, Combined Annotation Dependent Depletion (CADD) score, and mapping into DNaseI-hypersensitive sites (DHS) from the Encyclopedia of DNA Elements (ENCODE) and Roadmap Epigenomics Consortium projects[88,89]. The exonic effect and CADD score were obtained using SNiPA v3.2 (March 2017)[90]. SNiPA presented the CADD score as PHRED-like transformation of the C score, which was based on CADD release v1.3 downloaded from http://cadd.gs.washington.edu/download. A CADD score of 15 is used to distinguish potentially deleterious variants from background noise in clinical genetics, and represents the median value of all non-synonymous variants in CADD v1.0[91,92]. As opposed to posterior probabilities of causing the association signal, CADD scores represent an integrative measure of predicted deleteriousness based on an ensemble of variant annotations derived by contrasting common variants that survived natural selection with simulated mutations. Based on known pathogenic variants in the ClinVar database, the performance of the CADD score had an AUC of 0.88[93].

Co-localization analysis of cis-eQTL and urate-associated loci

Co-localization analysis of urate-associated loci with gene expression was conducted using EA meta-analysis results, cis-eQTL results from micro-dissected human glomerular and tubulo-interstitial kidney portions from 187 individuals in the NEPTUNE study[94], as well as from 44 tissues in the GTEx Project version 6p release[95]. For each urate locus, we identified all transcripts and all tissue-transcript pairs with reported eQTLs within ±100 kb of each GWAS index SNP. The region for each co-localization test was defined as the eQTL cis window in the underlying studies[94,95]. We used the default parameters and prior definitions set in the ‘coloc.fast’ function from the R package ‘gtx’ (https://github.com/tobyjohnson/gtx), which is an adapted implementation of Giambartolomei’s co-localization method[24]. Evidence for co-localization was defined as H4 ≥ 0.8, which represents the posterior probability that the association with serum urate and gene expression is due to the same underlying variant. In addition, co-localization of urate-associated loci was also performed with gene expression quantified using RNA sequencing of the healthy tissue portion of 99 kidney cortex samples from the Cancer Genome Atlas (TCGA)[96]. First, all transcripts that shared eQTL variants with urate index SNPs within ±100 kb were extracted. Then the posterior probability of co-localization was calculated including eQTLs within the cis-window (±1 Mb from the transcription start site) for each gene using the R coloc package[24] with default values for the three prior probabilities. The methods for trans-eQTL annotation are reported in the Supplementary Note.

Experimental study

Promoter binding site predictions.

For promoter binding site predictions, we used the JASPAR 2018 database[97,98]. The frequency matrices were downloaded for transcription factor binding sites of both vertebrate and human sequences (HNF1A: MA0046.1 and MA0046.2; HNF4A: MA0114.1 and MA0114.2). These matrices were then used to query the promoter region of ABCG2 (−1285/+362, that is, from 1,285 base pairs upstream to 362 base pairs downstream of the transcription start site)[99] by means of the LASAGNA 2.0 transcription factor binding site search tool with default parameters and a P-value cutoff of 0.01[100].
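As an illustration of how a frequency matrix is used to query a promoter sequence, the sketch below converts counts to log-odds scores and scans both strands. This is a plain position weight matrix scan, not the LASAGNA algorithm, and it applies a score threshold rather than a P-value cutoff. The toy matrix, sequence, pseudocount, and function names are all hypothetical and do not reproduce MA0046 or MA0114.

```python
import math
from typing import Dict, List, Sequence, Tuple

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def log_odds_matrix(counts: Sequence[Sequence[float]],
                    pseudocount: float = 1.0,
                    background: float = 0.25) -> List[Dict[str, float]]:
    """Turn per-position A/C/G/T counts into log2 odds against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column) + 4 * pseudocount
        pwm.append({base: math.log2(((count + pseudocount) / total) / background)
                    for base, count in zip(BASES, column)})
    return pwm


def score_window(pwm: List[Dict[str, float]], window: str) -> float:
    """Sum the per-position log-odds scores for one candidate site."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan(pwm: List[Dict[str, float]], sequence: str,
         threshold: float) -> List[Tuple[int, str, float]]:
    """Return (start, strand, score) for every window scoring at or above threshold."""
    sequence = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        reverse = window.translate(COMPLEMENT)[::-1]
        for strand, site in (("+", window), ("-", reverse)):
            score = score_window(pwm, site)
            if score >= threshold:
                hits.append((start, strand, round(score, 3)))
    return hits


if __name__ == "__main__":
    # Hypothetical 4-bp motif with consensus GTTA (counts in A, C, G, T order).
    toy_counts = [(0, 0, 10, 0), (0, 0, 0, 10), (0, 0, 0, 10), (10, 0, 0, 0)]
    pwm = log_odds_matrix(toy_counts)
    for hit in scan(pwm, "CCGTTACCTAACCC", threshold=6.0):
        print(hit)
```

Scanning the reverse complement of each window is what lets a single matrix detect sites on either strand of the promoter.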

Site-directed mutagenesis.

HNF1A and HNF4A clones were purchased from GeneCopoeia (EX-A7792-M02 and EX-Z5283-M02, respectively) and were mutagenized using the QuikChange Lightning Site-Directed Mutagenesis kit (Agilent Technologies, #210518) per manufacturer’s instructions using PAGE-purified primers, which are reported in the Supplementary Note.

Luciferase assay.

HEK293T cells were seeded in white-walled 96-well plates coated with Poly-L-lysine at roughly 12,500 cells per well. Cells were transfected 18 hours later with either the ABCG2 promoter (−1285/+362) upstream of a firefly luciferase in the pGL4.14 vector (a generous gift from Douglas D. Ross, University of Maryland School of Medicine), or the pGL4.14 vector (Promega, #E699A) without promoter construct, as well as a GFP-expressing vector used as an internal negative control (pEGFP-C1, Clontech)[101], using X-tremeGene™ 9 DNA Transfection Reagent (Roche Diagnostics, #6365787001). Transfection cocktails were prepared per manufacturer’s specifications either with or without transcription factor using the following ratio: 0.6 μg promoter construct, 0.2, 0.1, or 0.05 μg transcription factor, and 0.05 μg GFP. When no transcription factor was used, pcDNA3.1 was substituted. Approximately 48 hours after transfection, cells were rinsed with 1x PBS, then lysed using Passive Lysis Buffer (Promega, #E194A) for 15 minutes. During this incubation, GFP measurements were taken using a CLARIOstar microplate reader (BMG Labtech). Next, 30 μl of Luciferase Reagent (Promega, #E297A&B) were added to each well, and the plate was incubated for an additional 20 minutes at room temperature. Finally, luciferase activity was measured using the CLARIOstar microplate reader, taking the average over 6 seconds. To evaluate the significance of transactivation of the ABCG2 promoter, we compared cells expressing transcription factors to those transfected with the empty vector (pcDNA3.1). To evaluate TF dose responses or differences between TF variants, all experimental conditions from one plate were compared using an ordinary one-way ANOVA, accounting for multiple comparisons with Tukey’s multiple comparison test. Statistical analysis was performed using Prism 7 (GraphPad Software Inc, USA).
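The omnibus step of the one-way ANOVA can be sketched as follows. The analysis itself was run in Prism 7; this block only shows how the F statistic is formed from between-group and within-group variation, and it omits both the p-value and the Tukey post hoc comparisons. The readings and the function name are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for an ordinary one-way ANOVA."""
    k = len(groups)
    n = sum(len(group) for group in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(group) for group in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(group) * (mean(group) - grand_mean) ** 2 for group in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(group)) ** 2 for group in groups for x in group)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical luciferase readings (arbitrary units) from one plate:
    # empty vector, then two transcription factor doses.
    empty_vector = [1.0, 2.0, 3.0]
    low_dose = [2.0, 3.0, 4.0]
    high_dose = [5.0, 6.0, 7.0]
    print(one_way_anova_f([empty_vector, low_dose, high_dose]))
```

A large F indicates that at least one condition differs from the others; identifying which pairs differ is the role of the Tukey test named in the text.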

Western blots.

Equal volumes of deoxycholate-RIPA buffer were added to wells containing desired lysates following the luciferase assay and plates were then incubated at 4 °C overnight. Equal volumes of sample + 5x SDS loading dye + 10% β-mercaptoethanol were then loaded into 10% Mini-PROTEAN® TGX Stain-Free™ Precast Gels (Bio-Rad, #4568033) and run per manufacturer’s specifications. Gels were then cross-linked for 45 seconds and imaged to reveal total protein load, which was used as the loading control for each lane (representative images of these protein gels are found in Supplementary Fig. 8). Gels were then transferred onto nitrocellulose membranes using the Trans-Blot® Turbo™ Transfer System (Bio-Rad), blocked for 2 hours at room temperature in 5% milk in TBS-T, and incubated overnight at 4 °C with primary antibody. Membranes were then washed 3 times with TBS-T and incubated at room temperature for 1 hour with Donkey anti-rabbit secondary antibody (Jackson ImmunoResearch, #111-035-144) diluted 1:5,000 in 2.5% milk in TBS-T. Membranes were then washed again, developed using SuperSignal™ West Pico PLUS Chemiluminescent Substrate (Thermo Scientific, #34577), and imaged on the ChemiDoc MP imaging system (Bio-Rad). All primary antibodies were diluted 1:1,000 in 2.5% milk in TBS-T. Antibodies used included HNF4α (Cell Signaling Technology, #3113) and HNF1α (Cell Signaling Technology, #89670). Antibodies were validated using lysates of overexpressing HEK293T cells transfected with either HNF construct, demonstrating bands at the appropriate sizes (Supplementary Fig. 8).

105 in total

1. Heritability of measures of kidney disease among Zuni Indians: the Zuni Kidney Project.

Authors: Jean W MacCluer; Marina Scavini; Vallabh O Shah; Shelley A Cole; Sandra L Laston; V Saroja Voruganti; Susan S Paine; Alfred J Eaton; Anthony G Comuzzie; Francesca Tentori; Dorothy R Pathak; Arlene Bobelu; Jeanette Bobelu; Donica Ghahate; Mildred Waikaniwa; Philip G Zager
Journal: Am J Kidney Dis Date: 2010-06-19 Impact factor: 8.860

Review 2. Global epidemiology of gout: prevalence, incidence and risk factors.

Authors: Chang-Fu Kuo; Matthew J Grainge; Weiya Zhang; Michael Doherty
Journal: Nat Rev Rheumatol Date: 2015-07-07 Impact factor: 20.543

3. Genome-wide search for genes affecting serum uric acid levels: the Framingham Heart Study.

Authors: Qiong Yang; Chao-Yu Guo; L Adrienne Cupples; Daniel Levy; Peter W F Wilson; Caroline S Fox
Journal: Metabolism Date: 2005-11 Impact factor: 8.694

4. Molecular identification of a renal urate anion exchanger that regulates blood urate levels.

Authors: Atsushi Enomoto; Hiroaki Kimura; Arthit Chairoungdua; Yasuhiro Shigeta; Promsuk Jutabha; Seok Ho Cha; Makoto Hosoyamada; Michio Takeda; Takashi Sekine; Takashi Igarashi; Hirotaka Matsuo; Yuichi Kikuchi; Takashi Oda; Kimiyoshi Ichida; Tatsuo Hosoya; Kaoru Shimokata; Toshimitsu Niwa; Yoshikatsu Kanai; Hitoshi Endou
Journal: Nature Date: 2002-04-14 Impact factor: 49.962

5. Trends in Emergency Department Visits and Charges for Gout in the United States between 2006 and 2012.

Authors: Sadao Jinno; Kohei Hasegawa; Tuhina Neogi; Tadahiro Goto; Maureen Dubreuil
Journal: J Rheumatol Date: 2016-06-01 Impact factor: 4.666

6. SLC2A9 is a newly identified urate transporter influencing serum urate concentration, urate excretion and gout.

Authors: Veronique Vitart; Igor Rudan; Caroline Hayward; Nicola K Gray; James Floyd; Colin N A Palmer; Sara A Knott; Ivana Kolcic; Ozren Polasek; Juergen Graessler; James F Wilson; Anthony Marinaki; Philip L Riches; Xinhua Shu; Branka Janicijevic; Nina Smolej-Narancic; Barbara Gorgoni; Joanne Morgan; Susan Campbell; Zrinka Biloglav; Lovorka Barac-Lauc; Marijana Pericic; Irena Martinovic Klaric; Lina Zgaga; Tatjana Skaric-Juric; Sarah H Wild; William A Richardson; Peter Hohenstein; Charley H Kimber; Albert Tenesa; Louise A Donnelly; Lynette D Fairbanks; Martin Aringer; Paul M McKeigue; Stuart H Ralston; Andrew D Morris; Pavao Rudan; Nicholas D Hastie; Harry Campbell; Alan F Wright
Journal: Nat Genet Date: 2008-03-09 Impact factor: 38.330

7. Genome-wide linkage analysis for uric acid in families enriched for hypertension.

Authors: Andrew D Rule; Brooke L Fridley; Steven C Hunt; Yan Asmann; Eric Boerwinkle; James S Pankow; Thomas H Mosley; Stephen T Turner
Journal: Nephrol Dial Transplant Date: 2009-03-03 Impact factor: 5.992

8. Heritability of cardiovascular and personality traits in 6,148 Sardinians.

Authors: Giuseppe Pilia; Wei-Min Chen; Angelo Scuteri; Marco Orrú; Giuseppe Albai; Mariano Dei; Sandra Lai; Gianluca Usala; Monica Lai; Paola Loi; Cinzia Mameli; Loredana Vacca; Manila Deiana; Nazario Olla; Marco Masala; Antonio Cao; Samer S Najjar; Antonio Terracciano; Timur Nedorezov; Alexei Sharov; Alan B Zonderman; Gonçalo R Abecasis; Paul Costa; Edward Lakatta; David Schlessinger
Journal: PLoS Genet Date: 2006-07-10 Impact factor: 5.917

Review 9. Serum uric acid levels and multiple health outcomes: umbrella review of evidence from observational studies, randomised controlled trials, and Mendelian randomisation studies.

Authors: Xue Li; Xiangrui Meng; Maria Timofeeva; Ioanna Tzoulaki; Konstantinos K Tsilidis; John PA Ioannidis; Harry Campbell; Evropi Theodoratou
Journal: BMJ Date: 2017-06-07

10. Rising burden of gout in the UK but continuing suboptimal management: a nationwide population study.

Authors: Chang-Fu Kuo; Matthew J Grainge; Christian Mallen; Weiya Zhang; Michael Doherty
Journal: Ann Rheum Dis Date: 2014-01-15 Impact factor: 19.103
