Literature DB >> 31451708

Mapping eGFR loci to the renal transcriptome and phenome in the VA Million Veteran Program.

Jacklyn N Hellwege^1,2, Digna R Velez Edwards^1,3,4, Ayush Giri^1,3, Chengxiang Qiu⁵, Jihwan Park⁵, Eric S Torstenson^1,6, Jacob M Keaton^1,6, O D Wilson⁷, Cassianne Robinson-Cohen⁷, Cecilia P Chung^1,8, Christianne L Roumie^9,10, Derek Klarin^11,12,13,14, Scott M Damrauer^15,16, Scott L DuVall^17,18, Edward Siew¹⁰, Elvis A Akwo⁷, Matthias Wuttke¹⁹, Mathias Gorski^20,21, Man Li^11,22, Yong Li¹⁹, J Michael Gaziano^13,14,23, Peter W F Wilson^24,25, Philip S Tsao^26,27, Christopher J O'Donnell^11,28, Csaba P Kovesdy^29,30, Cristian Pattaro³¹, Anna Köttgen^19,32, Katalin Susztak⁵, Todd L Edwards^33,34, Adriana M Hung^35,36.

Abstract

Chronic kidney disease (CKD), defined by low estimated glomerular filtration rate (eGFR), contributes to global morbidity and mortality. Here we conduct a transethnic Genome-Wide Association Study of eGFR in 280,722 participants of the Million Veteran Program (MVP), with replication in 765,289 participants from the Chronic Kidney Disease Genetics (CKDGen) Consortium. We identify 82 previously unreported variants, confirm 54 loci, and report interesting findings including association of the sickle cell allele of betaglobin among non-Hispanic blacks. Our transcriptome-wide association study of kidney function in healthy kidney tissue identifies 36 previously unreported and nine known genes, and maps gene expression to renal cell types. In a Phenome-Wide Association Study in 192,868 MVP participants using a weighted genetic score we detect associations with CKD stages and complications and kidney stones. This investigation reinterprets the genetic architecture of kidney function to identify the gene, tissue, and anatomical context of renal homeostasis and the clinical consequences of dysregulation.

Entities: Chemical Disease Gene Mutation Species

Mesh：

Year: 2019 PMID： 31451708 PMCID： PMC6710266 DOI： 10.1038/s41467-019-11704-w

Source DB: PubMed Journal: Nat Commun ISSN： 2041-1723 Impact factor: 14.919

Introduction

Chronic kidney disease (CKD), defined by an estimated glomerular filtration rate (eGFR) lower than 60 ml/min/1.73 m2 for three or more months[1], is a global health concern and is associated with premature death[2,3]. CKD has a prevalence of 15% and affects 30 million people in the US[4,5]. In addition to the risk of progressing to end-stage-renal disease (ESRD), CKD is associated with significant cardiovascular morbidity and mortality[6,7]. Patients with an eGFR of <15 ml/min/1.73 m2, for example, have a threefold increased mortality than those with normal renal function[8]. Diabetes is the most common comorbidity associated with ESRD worldwide, occurring in 44–60% of ESRD cases[4,5]. CKD in diabetics is multifactorial and also related to hyperglycemia, hypertension, atherosclerosis, and aging. Currently there are few therapies that slow CKD progression and life-extending treatments for ESRD are restricted to dialysis and transplantation. There is a great need to understand the biological mechanisms that lead to CKD so that treatments that target those biological factors can be developed. Kidney function, as measured by eGFR, is a heritable trait[9] that has been studied in genetic association studies, and over 50 eGFR loci have been definitively identified[10-13]. A small proportion of variance in eGFR is explained by the subtle effects of previously reported variants, and the genetic etiology of this trait is highly complex. The inferences of what genes influence eGFR are often unclear from loci detected by genome-wide association studies (GWAS) of common variants, which comprise a large proportion of known genetic determinants. Regulatory effects may account for significant additional heritability in GWAS, and GWAS results are enriched for regulatory single-nucleotide polymorphisms (SNPs) compared with the proportion of the genome containing regulatory elements[14-16]. In recent large-scale GWAS meta-analyses of eGFR, it was shown that many significant SNPs map into tissue-specific regulatory regions, and that gene expression may mediate many of the associations between genetic variants and eGFR[10,12]. Several methods were recently developed to leverage multiple variants to perform gene-based tests of association between imputed gene expression levels and phenotypes[17-20]. These tests are tissue-specific and provide effect estimates with interpretable direction and magnitude compared with studies that only evaluate associations between SNPs and traits. Thus, to capitalize on our large electronic health record-based study of patients receiving care in a standardized setting and to leverage available analytical possibilities, we conduct a transethnic GWAS of eGFR in 280,722 participants of the U.S. Veteran’s Administration Million Veteran Program, with replication in an additional 765,289 participants from the Chronic Kidney Disease Genetics (CKDGen) Consortium. In addition, because of the important role of diabetes in nephropathy, we stratify analysis by diabetes status. We also evaluate associations between genetically predicted gene expression (GPGE) in a human healthy kidney expression quantitative trait locus (eQTL) reference panel[21] and eGFR, followed by comparison of significant GPGE associations with gene expression profiles in murine kidney cell types[22] to identify the specific cells where gene expression effects likely arise. Finally, we evaluate clinical translation by performing a phenome-wide association study (PheWAS)[23] of a weighted genetic risk score (GRS) of eGFR in the electronic health records across 192,868 MVP participants.

Results

MVP characteristics

A total of 280,722 participants were available from MVP for analyses of eGFR, ~33% were diabetic (Table 1). Most participants were non-Hispanic whites (80%), male (93%), and hypertensive (70%). When evaluating the data stratified by diabetes status, there were more diabetics than non-diabetics who were hypertensive within both race groups (non-Hispanic white diabetics and hypertensive 91%; non-Hispanic black diabetics and hypertensive 93%). Across both race groups, eGFR was lower in diabetics than in non-diabetics, and eGFR was higher in non-Hispanic blacks than in non-Hispanic whites.

Table 1

Characteristics of Million Veteran Program discovery sample

Characteristic	Overall	Non-Hispanic white diabetic (n = 70,762)	Non-Hispanic white non-diabetic (n = 152,624)	Non-Hispanic black diabetic (n = 20,967)	Non-Hispanic black non-diabetic (n = 36,369)
Hypertension (n, [% Yes])	198,137 (71%)	64,493 (91%)	90,875 (59%)	19,459 (93%)	23,310 (64%)
Gender (n, [% Female])	20,733 (7.4%)	3081 (4.4%)	12,986 (8.5%)	1781 (8.5%)	2885 (16%)
Age (mean [SD], years)	62.8 (13.2)	67.4 (9.8)	62.5 (14.3)	62.1 (9.9)	55.4 (12.5)
BMI (mean [SD], kg/m²)	30.1 (6.0)	32.1 (6.4)	29.0 (5.4)	31.9 (6.5)	29.4 (5.9)
eGFR (mean [SD], mL/min/1.73 m²)	79.1 (20.7)	72.3 (20.6)	79.0 (18.8)	80.4 (24.6)	89.4 (21.6)

SD standard deviation, BMI body mass index, kg kilograms, m2 meters squared

Characteristics of Million Veteran Program discovery sample SD standard deviation, BMI body mass index, kg kilograms, m2 meters squared

Transethnic GWAS

Meta-analyses of GWAS data were performed on all variants (minor allele frequency, MAF, ≥1%), with data stratified by race, diabetes status, and hypertension status. Meta-analysis of both race groups across all diabetes and hypertension strata identified 136 genome-wide significant loci, of which 122 were available and evaluated for replication in a transethnic GWAS meta-analysis of eGFR from the CKDGen Consortium (n = 765,289). Fourteen of these SNPs or proxies from the all-ancestry analyses were unavailable for replication. Of the 122 variants, 79 were replicated at the Tier 1 definition (p ≤ 5 × 10−8) and an additional 28 were Tier 2 replicated (5 × 10−8 < p ≤ Bonferroni correction for 122 SNPs = 0.0004), and 11 were Tier 3 replicated (0.0004 < p ≤ 0.05) (Supplementary Data 1, Table 2, Fig. 1). Only four showed no evidence of association in CKDGen. Meta-analysis of p-values from MVP and CKDGen results identified all Tier 1 and seven Tier 2 results as genome-wide significant. Across the three tiers of successful replication, 2.75% of variance in eGFR was explained. Fifty-seven of these results have been previously detected in GWAS of eGFR, while six others have associated with overt kidney disease, microalbuminuria, or other related phenotypes in prior studies. Among 64 novel variants, the most significantly associated was rs2823139 near NRIP1 (pdiscovery = 1.82 × 10−18, effect = -0.45 mL/min/1.73 m2 [standard error (SE) = 0.051]; preplication = 5.24 × 10−16). Evaluation of conditionally independent signals within the discovery meta-analysis results using GCTA[24] identified 18 SNPs from 15 loci (five novel) that are significantly independently associated with eGFR (Supplementary Table 1).

Table 2

Summary of significant known and novel loci from analysis of common variants

	Number of loci	Average effect^a (SD)	P-values*
Known loci	54	0.49 (0.28)	0.0025
Novel loci	64	0.38 (0.17)
Tier 1	79	0.45 (0.26)
Tier 2	28	0.41 (0.19)
Tier 3	11	0.36 (0.10)^b
Diabetics	35	0.90 (0.61)	0.001
Non-diabetics	98	0.52 (0.28)

Tier 1 = First tier significance criteria: GWAS significance at discovery + replicationpassing Bonferroni threshold + consistent directions of associations between discovery and replication sets

Tier 2 = Second tier significance criteria: GWAS significance at discovery + replication p between 5 × 10−8 and 0.00040004 + consistent directions of associations between discovery and replication sets

Tier 3 = Third tier significance criteria: GWAS significance at discovery + replication p between 0.0004 and 0.05 + consistent directions of associations between discovery and replication sets

*P-value from two-sample t-test comparing mean effect estimates

aAverage effect = average and standard deviation of the absolute value of beta-estimates from MVP, in mL/min/1.73 m2

bExcluding two extreme outliers

Fig. 1

Manhattan plot summarizing transethnic discovery meta-analysis of eGFR. The y axis shows the –log10 P-values and the x axis shows the chromosomal positions. The horizontal red line represents the thresholds of P-value = 5 × 10−8 for genome-wide significance. SNPs in red are in previously-identified kidney function loci, whereas SNPs in orange are in novel loci

Summary of significant known and novel loci from analysis of common variants Tier 1 = First tier significance criteria: GWAS significance at discovery + replicationpassing Bonferroni threshold + consistent directions of associations between discovery and replication sets Tier 2 = Second tier significance criteria: GWAS significance at discovery + replication p between 5 × 10−8 and 0.00040004 + consistent directions of associations between discovery and replication sets Tier 3 = Third tier significance criteria: GWAS significance at discovery + replication p between 0.0004 and 0.05 + consistent directions of associations between discovery and replication sets *P-value from two-sample t-test comparing mean effect estimates aAverage effect = average and standard deviation of the absolute value of beta-estimates from MVP, in mL/min/1.73 m2 bExcluding two extreme outliers Manhattan plot summarizing transethnic discovery meta-analysis of eGFR. The y axis shows the –log10 P-values and the x axis shows the chromosomal positions. The horizontal red line represents the thresholds of P-value = 5 × 10−8 for genome-wide significance. SNPs in red are in previously-identified kidney function loci, whereas SNPs in orange are in novel loci We also conducted analyses limiting to non-Hispanic white individuals. Similar to the trans-ancestry analyses, the majority of variants replicated in CKDGen (n = 567,401; Supplementary Data 2). Six variants failed to replicate (p > 0.05). Of the 105 replicated SNPs, 61 were Tier 1 (p ≤ 5 × 10−8), 29 were Tier 2 (5 × 10−8 < p ≤ 0.00048), and 9 were Tier 3 (0.00048 < p ≤ 0.05). We identified 14 variants that were significant in non-Hispanic blacks (Supplementary Table 2) within our MVP discovery population. Among these four were novel (rs75113983 near OLFR690, rs200950799, which is intergenic on chromosome 12, rs144803907 near C15orf43, rs10084572 near AGPAT3). Only one of these novel variants was also identified in non-Hispanic white analyses (rs13230509). An interesting finding among the non-Hispanic blacks is the strong association with rs334 (p = 1.54 × 10−18) in HBB, the variant which encodes the sickle cell allele of beta globin. The origins of this variant have been recently described[25]. The derived A allele on the + strand that encodes for the sickle version of the beta globin protein is associated with lower kidney function. This allele has also been previously associated with increased urinary albumin-to-creatinine ratio[26]. We compared the effect sizes for significant discovery GWAS results across non-Hispanic black and non-Hispanic white subjects MVP (Supplementary Fig. 1, Supplementary Table 3). Most of the effects were in the same direction except for three variants. One was a novel variant (rs142314590 in LDB2) and two were known variants (rs532086 and rs2235826 near C2 and PRK1, respectively). The effect sizes across non-Hispanic black and non-Hispanic white subjects were more correlated among known (r2 = 0.25) loci than novel (r2 = 0.13) variants (Supplementary Table 3). The overall allele frequencies across ancestry groups for significant variants were highly correlated (r2 = 0.67).

Diabetes stratified GWAS

There were 91,279 patients with diabetes. In our GWAS for this group we identified 32 variants reaching genome-wide significance, which make up a subset of the loci discovered in combined analysis. Seventeen of these variants were in known loci and 15 were novel loci (Supplementary Data 3, Supplementary Figs. 2 and 3). The top six hits were near UMOD (p = 2.43 × 10−82), PRKAG2 (p = 7.89 × 10−23), MPPED2/DCDC5 (p = 9.85 × 10−22) GATM/SPATA5L1 (p = 2.21 × 10−17), SHROOM3 (p = 1.63 × 10−13), and HBB (p = 7.95 × 10−13). Comparison of the association effect sizes in the stratified analysis in subjects with (Supplementary Data 3) and without diabetes (Supplementary Data 4) demonstrated generally consistent effects reflecting the shared pathways between diabetic and non-diabetic kidney disease (Supplementary Fig. 4).

LD score regression

Subsequently, we utilized the LD Score Regression approach[27] in each contributing group to ascertain whether inflation was due to residual population stratification or polygenicity. Calculation of the intercept in the MVP non-Hispanic white discovery analysis datasets were 1.02 (SE = 0.01), 1.04 (SE = 0.01), 1.03 (SE = 0.01), and 0.99 (SE = 0.01), for diabetics with hypertension, non-diabetics with hypertension, non-diabetics without hypertension and diabetics without hypertension, respectively, suggesting that little of the observed inflation in the lambda is due to population stratification (Supplementary Table 4). Similarly, intercepts in the MVP non-Hispanic black discovery analysis datasets were 1.01 (SE = 0.003), 1.02 (SE = 0.004), 1.00 (SE = 0.003), and 1.00 (SE = 0.003), for diabetics with hypertension, non-diabetics with hypertension, non-diabetics without hypertension, and diabetics without hypertension, respectively.

eGFR heritability estimation

Evaluation of SNP-based heritability was assessed using LD score regression across groups and indicated that eGFR was most heritable among non-diabetics and non-hypertensive individuals in both non-Hispanic whites (h2 = 0.15, SE = 0.02) and non-Hispanic blacks (h2 = 0.13, SE = 0.04) (Supplementary Table 4). In general, heritability was higher in non-Hispanic whites than in non-Hispanic blacks.

Human kidney genetically predicted gene expression (GPGE)

Common variants from transethnic meta-analyses were used to evaluate the association between eGFR and GPGE in human healthy kidney tissue using S-PrediXcan[17] with the healthy kidney eQTL reference described by Ko et al.[21] (Fig. 2, Supplementary Data 5). We identified 45 significant results for transethnic analyses, among these 36 were novel genes which had not been identified by GWAS, either in the original reports or by the GWAS catalog mapping. The strongest result was with a known CKD locus, SPATA5L1 (effect = −2.38; p = 1.01 × 10−110). The strongest novel result was at TPRKB, a protein coding gene for the TP53RK-binding protein (effect = −10.78; p = 3.40 × 10−22). Twenty-five of the 45 genes with significant results were associated with a decrease in kidney function with increasing renal gene expression. Among all significant results, 19 also had SNPs likely to be causal for both gene expression and eGFR differences, as identified through the COLOC approach.

Fig. 2

Juxtaposed mirror plots for S-PrediXcan and GWAS for eGFR. Log10(p-values) for associations between genetically predicted gene expression (GPGE) analyses with eGFR in kidney tissue are juxtaposed with –log10(p-values) from transethnic GWAS analyses (triangles represent odd numbered chromosomes, circles represent even numbered chromosomes). GWAS plot represents transethnic discovery meta-analysis results (chromosomes alternate by light gray and black coloration) We also conducted secondary GPGE analyses stratifying subjects by diabetes status (Supplementary Data 5). SPATA5L1 was the strongest result in analyses of diabetic and non-diabetic participants, with stronger effects estimated in the non-diabetics. In analyses of diabetic participants, we identified seven significant associations. Two of the genes (HLA-H, UBD) identified in analyses of diabetic participants were not observed in overall analyses (combining diabetics and non-diabetics) or in analyses limiting to non-diabetic subjects. Both genes were associated with a decrease in kidney function with increasing expression and neither gene has been previously associated with eGFR. HLA-H has been associated with non-albumin protein levels and UBD with blood proteins[28,29]. All significant results in analyses of non-diabetics were also identified in overall analyses combining diabetics and non-diabetics.

Evaluation of GPGE results in murine kidney cells

We also evaluated homologs of genes identified in GPGE analyses using an atlas of kidney cell type-specific RNA expression from single cell sequencing of murine kidney cells[22] (Fig. 3, Supplementary Table 5). Cells were clustered into 13 types that represent structural features and other cell types of the kidney. We identified six genes (NARS2, ARNT, TPRKB, RNF152, BST2, and RGS14) across five cell types (podocyte, proximal tubule, collecting duct principal cell, fibroblast, and neutrophil) that had a 1.96 or greater fold increase in gene expression, though none of these were significant after accounting for multiple tests. Cross-referencing protein expression levels in the Human Protein Atlas confirmed findings from murine kidney, including higher expression of TPRKB protein in tubules compared with glomeruli (Supplementary Table 6).

Fig. 3

Mapping eGFR-associated genes to kidney cell type clusters. Average expression level of GWAS/eQTL defined genes across overall transethnic (ALL), diabetic (DM), and non-diabetic (Non DM) analyses. Mean expression values of the genes were calculated in each cluster. Color scheme is based on z-score distribution. Each row represents one gene and each column is single cell type cluster (as defined by Park et al.[22]) on the heatmap. Endo: endothelial, vascular, descending loop of Henle, Podo: podocyte, PT: proximal tubule, LOH: ascending loop of Henle, DCT: distal convoluted tubule, CD-PC: collecting duct principal cell, CD-IC: CD intercalated cell, Fib: fibroblast, Macro: macrophage, Neutro: neutrophil, NK: natural killer cell

eGFR risk score PheWAS

To assess the potential pleiotropic effects of associated eGFR variants we tested the association between an eGFR weighted GRS (w-GRS) and diseases throughout the phenome using 63 SNPs with independent weights identified previously in the CKDGen Consortium[10,30] (Supplementary Table 7) and clinical phenotype data from 192,868 self-reported/administratively identified non-Hispanic white MVP individuals (Supplementary Table 8). We regressed PheWAS outcomes onto the w-GRS, adjusting for sex and the top ten principal components of ancestry. We identified nine conditions that were significantly associated with the eGFR w-GRS. The majority were diseases related to genitourinary systems (n = 6), the strongest result was with chronic renal failure (p = 3.55 × 10−57, OR per SD of w-GRS = 0.88). Interesting results included calculus of kidney, calculus of ureter, and urinary calculus, which were all significant and positively associated with the weighted GRS (i.e., increased eGFR [improved kidney function] associated with increased risk of kidney stones). In addition, we observed significant associations with stage III of CKD and renal failure.

Discussion

We present the results from a multi-omic transethnic GWAS of eGFR with discovery and replication in over a million participants. We identified several novel loci using multiple statistical and bioinformatics approaches and validated previously reported loci, with replicated SNPs explaining 2.75% of eGFR variance. Significant strengths of this study include our large discovery and replication populations, transethnic analyses of diverse populations, incorporation of GPGE from a healthy human kidney reference, identification of cell-specific eGFR-associated gene expression from murine kidney, and evaluation of the clinical phenome through PheWAS of an eGFR w-GRS. We also assessed heterogeneity of effect estimates at GWAS loci between diabetic and non-diabetic populations. In addition to identifying novel and known loci, our discovery analyses using the MVP population showed heritability estimates consistent with published studies of eGFR, supporting the quality of our EHR phenotyping[30]. These data provide insight into the genetic architecture and clinical factors of eGFR. The limitations of estimating GFR from creatinine and/or cystatin C include low sensitivity in detecting early CKD and poor prediction of the course of CKD. These limitations have been described previously in detail[31-33]. However, despite these limitations, eGFR is the outcome most often used in genetic association studies of kidney function because of its clinical utility and the translational potential of inferences, as well as the availability of large numbers relative to alternative measures. We identified several novel common variants associated with eGFR, some of which may tag genes implicated in Mendelian forms of kidney disease. For instance, we detected a common (and reportedly benign) missense variant in PKD1 (polycystin 1) associated with eGFR. Mutations in PKD1 can cause autosomal dominant polycystic kidney disease (OMIM #173900). Variants in NOS3 (nitric oxide synthase 3), where we also detected an association with eGFR, may act as a modifier among those with polycystic kidney disease[34] through the nitric oxide pathway. We also identified associations near the NRIP1 gene (nuclear receptor interacting protein 1), with supporting GPGE evidence. NRIP1 mutations have recently been implicated in congenital anomalies of the kidney and urinary tract (OMIM# 610805), which are a leading cause of CKD among those under 30[35]. It has been suggested that mutations in this gene cause CKD through disruption of retinoic acid signaling. Novel SNP associations were found at or nearby genes previously reported in GWAS of other renal and urological phenotypes. Specifically, loci were identified with urate/urea/uric acid levels (KLHDC7A,[36] MTX1,[37] RREB1,[38] and MIR1538[28,38]), urine albumin-to-creatinine ratio (C9orf3[39]), IgA Nephropathy (MTMR3[40]), and frequency of urinary tract infection (ZNF165[41]). We also observed an association at a pharmacogenetic locus strongly related to tacrolimus dose in renal transplant patients[42], as well as development of new-onset diabetes after transplant[42]. However, transplant patients were excluded from this analysis. Loci implicated in platelet and red blood cell phenotypes were also represented among novel eGFR-associated SNPs, including TET2, HIST1H1C, UBE2H, TRIB1-LOC105375746, FAM53B, TPM1, NRIP1, A4GALT, DOCK7, PLCB1, and SERTAD2[28,43]. It has been previously shown that mean platelet volume, platelet counts, and platelet distribution width are associated with eGFR[44,45]. Higher values of mean platelet volume are observed among CKD patients[46], while platelet count and distribution width decrease along with eGFR. CKD patients also demonstrate an attenuated response to antiplatelet therapy relative to those without renal insufficiency[47]. Ischemic stroke is a common secondary occurrence in CKD patients, which may be due to shared risk factors[48]. Reduced kidney function among stroke patients is a risk factor for mortality[49,50]. We identified four novel eGFR-associated loci which have been previously associated with ischemic stroke in other studies, specifically USP38, ZFHX3, PMF1-BGLAP, and TTBK1[51-53]. A previous analysis of the APOL1 high risk variants showed increased risk of ischemic stroke[54]. Combined with our results, this suggests that the genetic architecture of kidney function may predispose to other small vessel disease outcomes. We examined the APOL1 locus extensively for association with eGFR in black MVP participants in anticipation that there would be a relationship there with eGFR and did not observe an association with an additive genotype coding. It has been suggested in multiple studies where participants had predominantly normal kidney function that APOL1 is a kidney disease gene and has not been significantly associated with eGFR[55-57]. This suggests that the lack of association between APOL1 genetic variants and eGFR is likely not due to low power, as has been true in previous studies of African-descent populations. We evaluated our transethnic sentinel SNPs for consistency of effect size across racial groups and found that in general effects were of similar direction, with only three variants being of opposite effect in non-Hispanic blacks (Supplementary Fig. 1). These included two known variants: rs532086 at C2, rs2235826 at PCK2 and a novel variant, rs12646572 at LDB2. Effect sizes of these three SNPs in non-Hispanic blacks were small (<0.01) and likely represented null effects. Comparison of variants identified in non-Hispanic blacks with non-Hispanic whites demonstrated consistency of direction across known variants with effect sizes generally similar or larger in non-Hispanic blacks. When eGFR effects across the entire genome were compared between white and black MVP participants using the Popcorn software package[58], the genetic effects were highly correlated and not significantly different from no difference (r = 0.94, p = 0.45). Novel variants detected in non-Hispanic blacks were not observed in non-Hispanic whites. Our analysis of eGFR in non-Hispanic whites was largely consistent with previous European-ancestry cohorts, replicating many strongly associated loci among our top hits, such as UMOD, GATM/SPATA5L1, SHROOM3, CPS1, PRKAG2, and SLC31A1[10,11,59]. An interesting finding among the non-Hispanic blacks is the strong association with rs334 (p = 1.54 × 10−18, SNPTEST info score = 0.75) in HBB, which encodes the sickle cell allele of beta globin. This variant has been previously associated with increased urinary albumin-to-creatinine ratio[26]. The sickling allele was present at a frequency of 5.7% among MVP non-Hispanic blacks, despite sickle cell trait carriers being restricted from serving in certain roles in the United States armed forces[60]. This observed frequency is higher than that in the African Americans from the southwestern United States in 1000 Genomes (MAF = 1.6%), but substantially lower than in continental African populations (MAF 10.1–13.9%). Screening for sickle cell in the United States Military has undergone several changes since the initial policy was introduced in 1969, in which the Navy began screening all recruits[60,61]. Currently universal sickle cell screening is performed prior to military ascension in the U.S. Navy, Air Force, and Marine Corps, while in the U.S. Army screening is performed only in specific scenarios related to deployment and certain occupational hazards[62,63]. Similarly, in other countries’ militaries, sickle cell allele carriers may be barred from diving, submarine, and aircrew service[64]. A recent study found that sickle cell trait was not associated with increased risk of mortality among black soldiers in the U.S. Army, but that it was associated with increased risk of exertional rhabdomyolysis (breakdown of skeletal-muscle tissue)[65]. However, the effect of sickle cell trait was less than that of recent statin use. Our results suggest that this allele also negatively affects kidney function, which may indicate that carriers should be monitored clinically for decreases in eGFR. Associations were also identified near loci previously implicated in diabetes and insulin/glucose homeostasis (CDKAL1, SLC9B2, RREB1, RAI1, and PPARG)[28,66-68]. CDKAL1, RREB1, and PPARG were not significant in either arm of the diabetes stratified analysis, while RAI1 and SLC9B2 were significant in the non-diabetes group. We identified 36 SNPs that reached genome-wide significance in our diabetes strata, 12 were novel and not significant in non-diabetics, but were observed in the overall transethnic analysis combining diabetic groups. We also did an exploratory comparison of the association effect size in the subjects with and without diabetes. The effects were generally consistent across groups (Supplementary Fig. 3). Uromodulin (UMOD) has been recently detected in a GWAS of diabetic eGFR that limited to diabetic subjects[69]. We observed similar associations for UMOD in separate analyses within diabetic and non-diabetic participants (sentinel SNP [A effect allele] effect size for UMOD diabetic 0.11 and non-diabetic 0.07), suggesting the association is not specific to diabetics. UMOD is known to affect kidney function and mutations in that gene cause several syndromic kidney disorders. Few variants have reached genome wide significance for studies of eGFR in patients with diabetes, potentially due to power and heterogeneity of renal disease phenotypes in diabetics[70,71]. We leveraged the human healthy kidney eQTL reference first described by Ko et al.[21] to investigate the relationship between GPGE and eGFR using S-PrediXcan. We detected associations with 45 genes, 18 of which are not annotated for any trait in the GWAS catalog, either by the original report or by the catalog mapping. When we restricted the GWAS catalog to genes identified by studies of eGFR, CKD, kidney disease, urinary albumin to creatinine ratio, or urinary metabolites, 36 of the 45 genes we detected were novel. The most statistically significant association with GPGE was for the spermatogenesis associated protein 5-like 1 gene (SPATA5L1). This locus has been previously reported to be associated with eGFR[10,72]. However, there are several genes in the region and Köttgen et al[72] suggested the nearby gene glycine amidinotransferase (GATM), based on a pathway-level connection of that gene to creatinine biosynthesis. The sentinel non-coding SNP rs2467853 in this region is within the SPATA5L1 gene and is more strongly associated with SPATA5L1 expression in human kidney than GATM (rs2467853 SPATA5L1 beta = 0.77, p = 8.2 × 10−8; GATM beta = 2 × 10−4, p = 0.99)[73]. In our evaluation of RNAseq and genotypes in cis with GATM from healthy human kidney, no models could be constructed that sufficiently predict GATM expression to use in GPGE analysis. We detected an association between GPGE and eGFR for SPATA5L1 and not for GATM, as well as a COLOC signal with a high posterior probability (P4 = 0.98) of several SNPs in the region being causal for both SPATA5L1 expression levels and the eGFR association signal, and no COLOC or GPGE evidence supporting GATM. The mechanism underlying the relationship between SPATA5L1 gene product abundance and kidney function is unclear. The most statistically significant novel gene detected in S-PrediXcan analysis was the TP53RK-binding protein (TPRKB), which is significantly highly expressed in murine proximal tubules. Consanguineous families with homozygous missense mutations in TPRKB have been diagnosed with Galloway-Mowat syndrome-5 (OMIM #617731), in which affected people exhibit early-onset nephrotic syndrome, as well as dysmorphologies and delayed psychomotor development[74]. The NARS2 gene is also novel and significantly highly expressed in murine podocytes. Mutations of this gene have been observed in a phenotype similar to Alpers syndrome that included renal dysfunction (OMIM #612803)[75]. It is also notable that UMOD from the chr16p12.3 region was not associated in the GPGE association analysis. No SNPs in the UMOD region were sufficiently associated with UMOD expression in the human kidney reference data to create predictive models for S-PrediXcan analysis. This suggests that the relationship between UMOD genotypes and kidney function is mediated by a mechanism other than gene expression, or that the proportion of cells from which UMOD is expressed is small and difficult to detect in bulk RNA sequencing of whole kidney. The sentinel SNP nearby UMOD from our study (rs77924615) is a nominally significant eQTL within NephQTL[76] (a database of nephrotic syndrome human kidneys) in the kidney tubules (p = 0.0014) and has a non-significant effect in the glomerulus (p = 0.079). The conditionally independently associated SNP rs111285796 in UMOD is also nominally associated in NephQTL with UMOD expression in tubules (p = 3.6 × 10−4) and less associated with other genes at this locus (PDILT p = 0.047), and less significant for glomerulus (UMOD p = 0.021). Within a healthy kidney eQTL atlas[77] (Online Methods) rs111285796 is not associated with UMOD in tubules (p = 0.214) but is nominally associated in glomerulus (p = 0.004). This demonstrates that there are important differences between the nephrotic and healthy kidney reference data with regard to cis-eQTL effects at this locus and suggests that SNPs in this locus influence gene expression in a context-dependent manner. We constructed a w-GRS of eGFR and performed a PheWAS using EHRs from 192,868 non-Hispanic white MVP participants. To our knowledge this is the first PheWAS of an eGFR w-GRS. We observed several associations between our w-GRS and kidney disease-related phenotypes. The most significant PheWAS association (p = 3.55 × 10−57) was with chronic renal failure, followed by related renal failure phenotypes, and CKD stage III (p = 6.13 × 10−24), all diseases defined by eGFR. Our w-GRS was associated with hypertensive kidney disease but not essential hypertension. In addition, we observed an association with kidney stones that was positively associated with our w-GRS; a relationship that has not been previously reported with eGFR. Our results are consistent with a prior study that reported an association between increased eGFR and hypercalciuria[78]. It is known that some individual genes that influence eGFR are also important for kidney stone formation. For example, decreased production of UMOD is associated with kidney stone formation as uromodulin impairs the aggregation of calcium oxalate crystals[79], hypertension, arterial stiffness, and CKD. In conclusion, we identified multiple novel loci associated with eGFR levels after conducting a transethnic GWAS and several post-GWAS bioinformatic analyses. We identified several novel loci and genes with additional confirmation and refinement through GPGE analyses, including cell-specific expression. We observed consistent effects across racial groups for associated GWAS loci. Furthermore, we evaluated the clinical phenome associated with our eGFR w-GRS and identified associations among diseases related to kidney and endocrine disease phenotypes. Overall, our study leveraged a racially diverse clinical population to identify novel eGFR loci common across racial groups and remapped previously reported loci using GPGE, leading to a greater understanding of the genetic architecture of kidney function.

Methods

Ethics statement

The central Veterans Affairs (VA) institutional review board (IRB) and site-specific IRBs approved the Million Veteran Program study.

The Million Veteran Program

The Million Veteran Program (MVP) is a large cohort of fully consented participants who were recruited from the patient populations of 63 VA medical facilities. MVP recruitment was initiated in 2011 and conducted in-person, after responding to an invitation letter. Full MVP participation includes completion of baseline and lifestyle surveys, providing access to medical records, a blood sample, and giving permission to recontact. Informed consent is provided after counseling by research staff and access to informational materials. All study materials and protocols are approved by the VA Central Institutional Review Board. Genotyping was performed on the Affymetrix Axiom Biobank Array chip, with custom content included to provide better coverage of African and Hispanic haplotypes. All samples are de-identified for research purposes, and investigators are not permitted or able to link study data to a participant’s identity. Blood samples were obtained from MVP participants by phlebotomists and shipped to a central biorepository in Boston, Massachusetts for biobanking. DNA was extracted and provided to two external sites for genotyping. Standardized quality control and genotype calling algorithms using the Affymetrix Power Tools Suite (v1.18) were applied to the data in batches by the MVP genomics working group. Quality control pipelines included the exclusion of duplicate samples, those with discordant reported and genotyped sex, and samples with more heterozygosity than expected. One of each pair of related individuals as measured by the KING software[80](halfway between 2nd and 3rd degree relatives or closer) were excluded from genetic analysis. Prior to imputation, variants that were poorly called or that deviated from their expected allele frequency based on reference data from the 1000 Genomes Project[81] were excluded. After pre-phasing using EAGLE v2[82], genotypes from the 1000 Genomes Project phase 3, version 5 reference panel were imputed into Million Veteran Program (MVP) participants via Minimac3 software[83]. Principal component analysis was performed using the FlashPCA[84], to generate the top ten genetic principal components explaining the greatest variability. Information on race and ethnicity (Hispanic: Yes or No) were extracted from standardized survey forms (self-report), or from the corporate data warehouse (CDW), or observational medical outcomes partnership (OMOP) data, when information from self-report was unavailable, and data were combined to form administratively assigned variables. Race and ethnicity categories used in this study included non-Hispanic whites and non-Hispanic blacks. Baseline estimated glomerular filtration rate (eGFR) was determined using the creatinine closest to enrollment. For the vast majority of the patients creatinine was measured using the IDMS reference method. GFR was calculated using chronic kidney disease epidemiology collaboration (CKD-EPI) serum creatinine equation[85]. The eGFR CKD-EPI equation is (1): GFR = 141 ×min(Scr ×κ−1,1) × max(Scr × κ−1, 1)−1.209 × 0.993Age × 1.018 [if female] × 1.159 [if Black] where Scr is serum creatinine (mg/dL), κ is 0.7 for females and 0.9 for males, α is −0.329 for females and −0.411 for males, min indicates the minimum of Scr × κ−1 or 1, and max indicates the maximum of Scr × κ−1 or 1. We excluded individuals that were on dialysis, had a kidney transplant, amputees, individuals on HIV medications which may increase creatinine clearance, BMI < 18, and Scr values less than 0.4 mg/dl as they may have represented lab errors. Diabetic patients were defined as those subjects on any anti-diabetic medications or those who had at least two outpatient ICD-9 codes for diabetes (ICD9 250.*) on separate dates within 365 days prior to enrollment. Subjects lacking codes and not on anti-diabetes medications were categorized as non-diabetics. Hypertension was defined as the presence of a hypertension code, being on antihypertensive drug or having two SBP’s > 140 mmHg and/or two DBP’s > 90. BMI was estimated using the closest weight to the GFR measure, and the height mode as weight (kg) × (height (m))−2.

MVP GWAS analysis

For the MVP GWAS we performed linear regression association tests with additive models for untransformed eGFR. We adjusted linear regression models analyzing SNP associations for age at eGFR measure, age2, sex, BMI, and the top ten genetic principal components (PCs) in analyses. All primary analyses for the MVP were conducted by stratum of administratively assigned race, as well as by the presence or absence of diabetes and hypertension. All regression-based analyses were conducted in SNPTEST-v2.5.4-beta[86]. Inference was limited to genotyped and imputed variants with SNPTEST Info scores of 0.4 or higher, with Hardy-Weinberg equilibrium p-value > 5 × 10−8 for common variant analysis (MAF > 0.1). Meta-analyses across race and strata were performed using fixed-effects, inverse variance-weighted meta-analysis implemented in METAL[87]. Genomic inflation factors were calculated, and λ for the discovery from MVP were 1.11, 1.16, 1.15, and 1.01 for the diabetic hypertensive participants, non-diabetic hypertensive participants, non-diabetic non-hypertensive participants, and diabetic non-hypertensive participants in whites, respectively, 1.03, 1.03, 1.02, and 0.99 for the diabetic hypertensive participants, non-diabetic hypertensive participants, non-diabetic non-hypertensive participants, and diabetic non-hypertensive participants in blacks, respectively, and 1.36 in the overall discovery analysis (Supplementary Fig. 3). Genome-wide significant loci were defined by one or more SNPs attaining genome-wide significance (p < 5 × 10−8) which were at least 1MB away from other signals, or if within that distance, were not in linkage disequilibrium (r2 < 0.1). Sentinel variants and up to two proxies (r2 > 0.8 and prioritized by correlation and then by distance from lead SNP) were selected for replication. We approximated the proportion of variance explained in the transethnic meta-analysis by all independent sentinel SNPs (novel and known) and novel SNPs, separately. Variance explained by each SNP was first estimated by the following equation (2):The sum of the variances of the independent sentinel SNPs for common variants provided estimates for the proportion of variance explained for all SNPs, and novel SNPs for eGFR. The transformation of the relationship between t-statistic and r2 to statistic to r2 is described in Supplementary Note 1.

CKDGen transethnic and European ancestry GWAS meta-analyses

The Chronic Kidney Disease Genetics (CKDGen) Consortium is a collaborative effort that includes mainly population-based studies from different ethnicities to perform GWAS of renal function traits aimed at uncovering the genetic basis of CKD. We interrogated data from the most recent CKDGen meta-analysis that included 121 GWAS, totaling 765,289 individuals of European (n = 567,401), East Asian (n = 165,726), South Asian (n = 13,359), African American (n = 13,842), and Hispanic (n = 4961) ancestries[88] . Following a centralized analysis plan, participating studies estimated the eGFR based on serum creatinine using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI)[85] or the Schwartz’s[89] equations, depending on whether adults or ≤18 year old children were concerned. eGFR values were winsorized at 15 and 200 ml/min/1.73 m2. Studies had a median mean eGFR of 89 ml/min/1.73 m2 (interquartile range: 81–94) and a median mean age of 54 years. Overall, 50% participants were females. Each study performed genotype imputation based on the Haplotype Reference Consortium (HRC) v1.1 or the 1000 Genomes Project phase 3 v5 ALL or phase 1 v3 ALL panels. Sex- and age-adjusted linear regression models were fitted to the logarithm of eGFR. GWAS were performed by regressing the residuals of the linear models on SNP dosage levels, assuming additive genetic effects. Family-based studies accounted for relatedness including kinship estimation or genetic principal components into the linear models. After selecting SNPs with imputation quality score >0.6 and minor allele count >10, genomic control correction was applied in case of an inflation factor λ > 1. GWAS were pooled using fixed effects inverse-variance weighted meta-analysis. After meta-analysis, SNPs that were not present in at least 50% of the studies were discarded, leaving 8,221,591 variants from ≥61 GWAS across all ancestries and 8,834,748 variants from ≥42 GWAS in the European ancestry subset. No further genomic control correction was applied. Genome-wide significance level was set at 5 × 10−8. Summary statistics for selected variants from the CKDGen consortium were used for replication. 105 variants were assessed for the non-Hispanic white-only meta-analysis in a maximum of 567,453 individuals from up to 84 studies. As many as 120 studies provided summary statistics for 122 variants for a maximum of 765,346 participants for the all-ancestry analysis. We performed a sample-size weighted z-score linear combination meta-analysis of MVP and CKDgen results (Supplementary Data 1).

DEPICT methods

Enrichment analyses in DEPICT[90] were conducted using significant GWAS sentinel SNPs from three separate analyses as input: (1) transethnic analyses of all MVP subjects, (2) transethnic analyses of MVP subjects with DM, and (3) transethnic analyses of MVP subjects without DM. DEPICT incorporates predefined phenotypic gene sets from multiple databases with expression microarray data (Affymetrix HGU133a2.0) from more than37k subjects to provide gene sets with high expression for Medical Subject Heading (MeSH) tissue and cell type annotations. DEPICT output includes enrichment p-values for both tissue level and gene-set features, as well as an indicator for whether each enrichment test had an FDR q-value of <0.05.

LD score regression and Popcorn

Subsequently, we utilized the LD Score Regression approach[27] in each contributing group to ascertain whether inflation was due to residual population stratification or polygenicity. Among whites we used 1000 Genomes precomputed LD Scores and for non-Hispanic blacks we calculated LD Scores in 2217 African American participants from BioVU using Illumina Mega Array (Illumina, Inc) genotype data imputed to 1000 Genomes phase 3 haplotypes. Calculation of the intercept in the MVP non-Hispanic white discovery analysis datasets were 1.02 SE = 0.01), 1.04 (SE = 0.01), 1.03 (SE = 0.01), and 0.99 (SE = 0.01), for the diabetic hypertensive participants, non-diabetics hypertensive participants, non-diabetics non-hypertensive participants, and diabetic non-hypertensive participants, respectively, suggesting that little of the observed inflation in the lambda is due to population stratification. Similarly, intercepts in the MVP Blacks discovery analysis datasets were 1.01 (SE = 0.003), 1.02 (SE = 0.004), 1.00 (SE = 0.003), and 1.00 (SE = 0.003), for the diabetic hypertensive participants, non-diabetic hypertensive participants, non-diabetic non-hypertensive participants, and diabetic non-hypertensive participants, respectively. Heritability was also assessed within each strata of the MVP data using LD Score regression. We also used the related Popcorn[58] software to evaluate the transethnic genetic effect correlation across the entire genome.

Conditional analysis

For conditional analysis of common variants we used two parallel approaches implemented in the genome-wide complex traits analysis (GCTA) software:[24] (i) genome-wide joint conditional analysis; and (ii) locus-specific conditional analysis. (i)Genome-wide joint conditional analysis Conditional analysis was conducted within GCTA software, using the –cojo method, which performs iterative conditional and joint analysis simultaneously with stepwise model selection[24]. The summary statistics from the GWAS discovery whites was used as the input summary data, and the imputed, hard-called BioVU EA genetic data (n = 19,726) was used as the reference genotype-level data, in PLINK format. Combination of these two input data files restricted the GCTA analysis to the imputed SNPs in common to the GWAS discovery meta-analysis (which was itself restricted to MAF > 1%). Within the BioVU genetic data, LD was calculated between all pairwise SNPs. A p-value cutoff of 5 × 10−8 was used as the selection threshold within GCTA, and the collinearity threshold was set at the default value of 0.9, so that SNPs are not selected if the multiple regression with the current SNPs in the model has R2 ≥ 0.9. After combining results across all 22 chromosomes, each trait-specific analysis resulted in a distinct set of jointly independent significant signals. Hence all final SNPs are pairwise-LD-independent. (ii)Locus-specific conditional analysis Analysis of each significant or previously reported locus was performed across all regional (1 Mb locus region centered ± 500 kb around the sentinel SNP) imputed SNPs with MAF ≥ 1%, conditioning on the sentinel SNP, using the --cojo-cond command in GCTA. As in the genome-wide approach, GWAS discovery meta-analysis results were used as the input summary data, with the BioVU EA imputed genetic data used as the reference PLINK dataset for LD computation. This approach provides conditional analysis results for each SNP within the implicated regions after conditioning on the sentinel SNP(s). These results yield a list of potential secondary SNPs which are evaluated according to the criteria below to identify those which are both significant and independent: (a)p < 5 × 10−8 from original (unconditioned) GWAS discovery analysis, (i.e., SNP is significantly associated with eGFR itself) (b)pc < 5 × 10−8 from the conditional analysis (i.e., the SNP is significantly associated with eGFR after conditioning on the sentinel/published SNPs) (c)independent of any of the sentinel SNPs (r2 < 0.1) Significant independent SNPs meeting the above criteria, from any locuswere combined into a single list. This list is more comprehensive than that from approach (i), as it contains all possible secondary SNPs, rather than a single lead SNP at each independent signal. These secondary SNPs may be in LD with each other within a given locus. The outputs from the two different approaches were then combined to identify those SNPs which are genome-wide significant in the discovery dataset and jointly independent on a genome-wide level. For robustness, a secondary signal was only claimed if the SNP is validated from both approaches.

S-PrediXcan analysis

Genetically predicted gene expression was evaluated for the discovery GWAS with S-PrediXcan[9], a gene-level approach which estimates the genetically determined component of gene expression in each tissue and tests it for association with SNP-level summary statistics. We utilized a human kidney gene expression atlas published by Ko et al.[21] to conduct our genetically predicted gene expression analyses with our eGFR outcome.

Human kidney compartment cis-eQTL analysis

Human kidney tissues (n = 151) were microdissected in RNAlater to obtain glomerular and tubular renal compartments. An unbiased review of the tissue section was performed by a renal pathologist by scoring of multiple parameters. One microgram of total RNA from each compartment was used for isolation of poly(A) purified mRNA using the Illumina TruSeq RNA Preparation Kit. Samples were sequenced in single-end 100 bp reads, and annotated RNA counts were calculated by Illumina’s CASAVA 1.8.2 to generate fastq. Alignment of trimmed reads to the Gencode human genome (GRCh37) was performed using STAR-2.4.1d[91,92]. RNA-seq data are available at Gene Expression Omnibus (GSE115098). Compartmental eQTL data sets were generated from 121 tubule samples and 119 glomerulus samples, respectively[77]. The cis expression window was defined as 1 Mb on either side of the gene transcriptional start site. eQTL analyses were performed using linear regression in FastQTL software[93] under an additive model and adjusted for six genetic PCs.

Murine kidney single cell sequencing analysis

Homologs of human genes detected in S-PrediXcan analyses of healthy human kidney eQTL reference and GWAS summary statistics were further investigated for kidney cell type-specific RNA expression using single cell sequencing in murine kidney cells. Cells were clustered by expression profiles into groups representing kidney structural features and additional cell types found in the kidney[22]. Confirmation of murine kidney findings was performed through cross-reference of protein expression levels available in the Human Protein Atlas[94].

Genetic risk score construction

We constructed a weighted genetic risk score (w-GRS) for eGFR by calculating a linear combination of weights derived from previous publications from CKDGen[10,30] for index SNPs at each of the 63 statistically significant replicated loci (Supplementary Table 7). GRSs were constructed for self-reported/administratively assigned non-Hispanic white individuals in the MVP only.

Phenome wide association study analysis

We performed a phenome wide association study (PheWAS) of GRS in MVP non-Hispanic whites (Nmax = 192,868), leveraging the full catalog of ICD-9 diagnosis codes. We used logistic regression to separately model each of 1813 PheWAS traits as a function of GRS, adjusted for sex and ten PCs. We report the results from these analyses as odds ratios where the estimate is the average change in odds of the PheWAS trait per weighted eGFR-increasing allele. Multiple testing thresholds for significance were set to p ≤ 2.75 × 10−5(0.05/1813).

90 in total

Review 1. Estimated Glomerular Filtration Rate: Fit for What Purpose?

Authors: David G Warnock
Journal: Nephron Date: 2016-02-05 Impact factor: 2.847

2. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets.

Authors: Zhihong Zhu; Futao Zhang; Han Hu; Andrew Bakshi; Matthew R Robinson; Joseph E Powell; Grant W Montgomery; Michael E Goddard; Naomi R Wray; Peter M Visscher; Jian Yang
Journal: Nat Genet Date: 2016-03-28 Impact factor: 38.330

3. Integrative approaches for large-scale transcriptome-wide association studies.

Authors: Alexander Gusev; Arthur Ko; Huwenbo Shi; Gaurav Bhatia; Wonil Chung; Brenda W J H Penninx; Rick Jansen; Eco J C de Geus; Dorret I Boomsma; Fred A Wright; Patrick F Sullivan; Elina Nikkola; Marcus Alvarez; Mete Civelek; Aldons J Lusis; Terho Lehtimäki; Emma Raitoharju; Mika Kähönen; Ilkka Seppälä; Olli T Raitakari; Johanna Kuusisto; Markku Laakso; Alkes L Price; Päivi Pajukanta; Bogdan Pasaniuc
Journal: Nat Genet Date: 2016-02-08 Impact factor: 38.330

4. A Dominant Mutation in Nuclear Receptor Interacting Protein 1 Causes Urinary Tract Malformations via Dysregulation of Retinoic Acid Signaling.

Authors: Asaf Vivante; Nina Mann; Hagith Yonath; Anna-Carina Weiss; Maike Getwan; Michael M Kaminski; Tobias Bohnenpoll; Catherine Teyssier; Jing Chen; Shirlee Shril; Amelie T van der Ven; Hadas Ityel; Johanna Magdalena Schmidt; Eugen Widmeier; Stuart B Bauer; Simone Sanna-Cherchi; Ali G Gharavi; Weining Lu; Daniella Magen; Rachel Shukrun; Richard P Lifton; Velibor Tasic; Horia C Stanescu; Vincent Cavaillès; Robert Kleta; Yair Anikster; Benjamin Dekel; Andreas Kispert; Soeren S Lienkamp; Friedhelm Hildebrandt
Journal: J Am Soc Nephrol Date: 2017-04-05 Impact factor: 10.121

5. Mean platelet volume is independently associated with renal dysfunction in stable coronary artery disease.

Authors: Hakan Uçar; Mustafa Gür; Nermin Yildiz Koyunsever; Taner Şeker; Caner Türkoğlu; Onur Kaypakli; Durmuş Yildiray Sahin; Zafer Elbasan; Murat Çayli
Journal: Platelets Date: 2013-06-17 Impact factor: 3.862

6. Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes.

Authors: Rainer Malik; Ganesh Chauhan; Matthew Traylor; Muralidharan Sargurupremraj; Yukinori Okada; Kari Stefansson; Bradford B Worrall; Steven J Kittner; Sudha Seshadri; Myriam Fornage; Hugh S Markus; Joanna M M Howson; Yoichiro Kamatani; Stephanie Debette; Martin Dichgans; Aniket Mishra; Loes Rutten-Jacobs; Anne-Katrin Giese; Sander W van der Laan; Solveig Gretarsdottir; Christopher D Anderson; Michael Chong; Hieab H H Adams; Tetsuro Ago; Peter Almgren; Philippe Amouyel; Hakan Ay; Traci M Bartz; Oscar R Benavente; Steve Bevan; Giorgio B Boncoraglio; Robert D Brown; Adam S Butterworth; Caty Carrera; Cara L Carty; Daniel I Chasman; Wei-Min Chen; John W Cole; Adolfo Correa; Ioana Cotlarciuc; Carlos Cruchaga; John Danesh; Paul I W de Bakker; Anita L DeStefano; Marcel den Hoed; Qing Duan; Stefan T Engelter; Guido J Falcone; Rebecca F Gottesman; Raji P Grewal; Vilmundur Gudnason; Stefan Gustafsson; Jeffrey Haessler; Tamara B Harris; Ahamad Hassan; Aki S Havulinna; Susan R Heckbert; Elizabeth G Holliday; George Howard; Fang-Chi Hsu; Hyacinth I Hyacinth; M Arfan Ikram; Erik Ingelsson; Marguerite R Irvin; Xueqiu Jian; Jordi Jiménez-Conde; Julie A Johnson; J Wouter Jukema; Masahiro Kanai; Keith L Keene; Brett M Kissela; Dawn O Kleindorfer; Charles Kooperberg; Michiaki Kubo; Leslie A Lange; Carl D Langefeld; Claudia Langenberg; Lenore J Launer; Jin-Moo Lee; Robin Lemmens; Didier Leys; Cathryn M Lewis; Wei-Yu Lin; Arne G Lindgren; Erik Lorentzen; Patrik K Magnusson; Jane Maguire; Ani Manichaikul; Patrick F McArdle; James F Meschia; Braxton D Mitchell; Thomas H Mosley; Michael A Nalls; Toshiharu Ninomiya; Martin J O'Donnell; Bruce M Psaty; Sara L Pulit; Kristiina Rannikmäe; Alexander P Reiner; Kathryn M Rexrode; Kenneth Rice; Stephen S Rich; Paul M Ridker; Natalia S Rost; Peter M Rothwell; Jerome I Rotter; Tatjana Rundek; Ralph L Sacco; Saori Sakaue; Michele M Sale; Veikko Salomaa; Bishwa R Sapkota; Reinhold Schmidt; Carsten O Schmidt; Ulf Schminke; Pankaj Sharma; Agnieszka Slowik; Cathie L M Sudlow; Christian Tanislav; Turgut Tatlisumak; Kent D Taylor; Vincent N S Thijs; Gudmar Thorleifsson; Unnur Thorsteinsdottir; Steffen Tiedt; Stella Trompet; Christophe Tzourio; Cornelia M van Duijn; Matthew Walters; Nicholas J Wareham; Sylvia Wassertheil-Smoller; James G Wilson; Kerri L Wiggins; Qiong Yang; Salim Yusuf; Joshua C Bis; Tomi Pastinen; Arno Ruusalepp; Eric E Schadt; Simon Koplev; Johan L M Björkegren; Veronica Codoni; Mete Civelek; Nicholas L Smith; David A Trégouët; Ingrid E Christophersen; Carolina Roselli; Steven A Lubitz; Patrick T Ellinor; E Shyong Tai; Jaspal S Kooner; Norihiro Kato; Jiang He; Pim van der Harst; Paul Elliott; John C Chambers; Fumihiko Takeuchi; Andrew D Johnson; Dharambir K Sanghera; Olle Melander; Christina Jern; Daniel Strbian; Israel Fernandez-Cadenas; W T Longstreth; Arndt Rolfs; Jun Hata; Daniel Woo; Jonathan Rosand; Guillaume Pare; Jemma C Hopewell; Danish Saleheen
Journal: Nat Genet Date: 2018-03-12 Impact factor: 38.330

7. A method to predict the impact of regulatory variants from DNA sequence.

Authors: Dongwon Lee; David U Gorkin; Maggie Baker; Benjamin J Strober; Alessandro L Asoni; Andrew S McCallion; Michael A Beer
Journal: Nat Genet Date: 2015-06-15 Impact factor: 38.330

8. Whole exome sequencing reveals mutations in NARS2 and PARS2, encoding the mitochondrial asparaginyl-tRNA synthetase and prolyl-tRNA synthetase, in patients with Alpers syndrome.

Authors: Kalliopi Sofou; Gittan Kollberg; Maria Holmström; Marcela Dávila; Niklas Darin; Claes M Gustafsson; Elisabeth Holme; Anders Oldfors; Már Tulinius; Jorge Asin-Cayuela
Journal: Mol Genet Genomic Med Date: 2014-10-23 Impact factor: 2.183

9. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics.

Authors: Alvaro N Barbeira; Scott P Dickinson; Rodrigo Bonazzola; Jiamao Zheng; Heather E Wheeler; Jason M Torres; Eric S Torstenson; Kaanan P Shah; Tzintzuni Garcia; Todd L Edwards; Eli A Stahl; Laura M Huckins; Dan L Nicolae; Nancy J Cox; Hae Kyung Im
Journal: Nat Commun Date: 2018-05-08 Impact factor: 14.919

10. Identification of new susceptibility loci for type 2 diabetes and shared etiological pathways with coronary heart disease.

Authors: Wei Zhao; Asif Rasheed; Emmi Tikkanen; Jung-Jin Lee; Adam S Butterworth; Joanna M M Howson; Themistocles L Assimes; Rajiv Chowdhury; Marju Orho-Melander; Scott Damrauer; Aeron Small; Senay Asma; Minako Imamura; Toshimasa Yamauch; John C Chambers; Peng Chen; Bishwa R Sapkota; Nabi Shah; Sehrish Jabeen; Praveen Surendran; Yingchang Lu; Weihua Zhang; Atif Imran; Shahid Abbas; Faisal Majeed; Kevin Trindade; Nadeem Qamar; Nadeem Hayyat Mallick; Zia Yaqoob; Tahir Saghir; Syed Nadeem Hasan Rizvi; Anis Memon; Syed Zahed Rasheed; Fazal-Ur-Rehman Memon; Khalid Mehmood; Naveeduddin Ahmed; Irshad Hussain Qureshi; Wasim Iqbal; Uzma Malik; Narinder Mehra; Jane Z Kuo; Wayne H-H Sheu; Xiuqing Guo; Chao A Hsiung; Jyh-Ming J Juang; Kent D Taylor; Yi-Jen Hung; Wen-Jane Lee; Thomas Quertermous; I-Te Lee; Chih-Cheng Hsu; Erwin P Bottinger; Sarju Ralhan; Yik Ying Teo; Tzung-Dau Wang; Dewan S Alam; Emanuele Di Angelantonio; Steve Epstein; Sune F Nielsen; Børge G Nordestgaard; Anne Tybjaerg-Hansen; Robin Young; Marianne Benn; Ruth Frikke-Schmidt; Pia R Kamstrup; J Wouter Jukema; Naveed Sattar; Roelof Smit; Ren-Hua Chung; Kae-Woei Liang; Sonia Anand; Dharambir K Sanghera; Samuli Ripatti; Ruth J F Loos; Jaspal S Kooner; E Shyong Tai; Jerome I Rotter; Yii-Der Ida Chen; Philippe Frossard; Shiro Maeda; Takashi Kadowaki; Muredach Reilly; Guillaume Pare; Olle Melander; Veikko Salomaa; Daniel J Rader; John Danesh; Benjamin F Voight; Danish Saleheen
Journal: Nat Genet Date: 2017-09-04 Impact factor: 38.330

42 in total

1. Systematic integrated analysis of genetic and epigenetic variation in diabetic kidney disease.

Authors: Xin Sheng; Chengxiang Qiu; Hongbo Liu; Caroline Gluck; Jesse Y Hsu; Jiang He; Chi-Yuan Hsu; Daohang Sha; Matthew R Weir; Tamara Isakova; Dominic Raj; Hernan Rincon-Choles; Harold I Feldman; Raymond Townsend; Hongzhe Li; Katalin Susztak
Journal: Proc Natl Acad Sci U S A Date: 2020-11-03 Impact factor: 11.205

2. Genome-Wide Association Studies of CKD and Related Traits.

Authors: Adrienne Tin; Anna Köttgen
Journal: Clin J Am Soc Nephrol Date: 2020-05-14 Impact factor: 8.237

3. Dnmt3a and Dnmt3b-Decommissioned Fetal Enhancers are Linked to Kidney Disease.

Authors: Yuting Guan; Hongbo Liu; Ziyuan Ma; Szu-Yuan Li; Jihwan Park; Xin Sheng; Katalin Susztak
Journal: J Am Soc Nephrol Date: 2020-03-03 Impact factor: 10.121

4. Mendelian pathway analysis of laboratory traits reveals distinct roles for ciliary subcompartments in common disease pathogenesis.

Authors: Theodore George Drivas; Anastasia Lucas; Xinyuan Zhang; Marylyn DeRiggi Ritchie
Journal: Am J Hum Genet Date: 2021-02-25 Impact factor: 11.025

5. Integration of GWAS Summary Statistics and Gene Expression Reveals Target Cell Types Underlying Kidney Function Traits.

Authors: Yong Li; Stefan Haug; Pascal Schlosser; Alexander Teumer; Adrienne Tin; Cristian Pattaro; Anna Köttgen; Matthias Wuttke
Journal: J Am Soc Nephrol Date: 2020-08-06 Impact factor: 10.121

Review 6. Unravelling the complex genetics of common kidney diseases: from variants to mechanisms.

Authors: Katie Marie Sullivan; Katalin Susztak
Journal: Nat Rev Nephrol Date: 2020-06-08 Impact factor: 28.314

Review 7. Air, Land, and Sea: Gene-Environment Interaction in Chronic Disease.

Authors: Russell A Wilke; Eric A Larson
Journal: Am J Med Date: 2021-07-31 Impact factor: 4.965

8. Discovery and fine-mapping of kidney function loci in first genome-wide association study in Africans.

Authors: Segun Fatumo; Tinashe Chikowore; Robert Kalyesubula; Rebecca N Nsubuga; Gershim Asiki; Oyekanmi Nashiru; Janet Seeley; Amelia C Crampin; Dorothea Nitsch; Liam Smeeth; Pontiano Kaleebu; Stephen Burgess; Moffat Nyirenda; Nora Franceschini; Andrew P Morris; Laurie Tomlinson; Robert Newton
Journal: Hum Mol Genet Date: 2021-07-28 Impact factor: 6.150

9. Kidney disease genetic risk variants alter lysosomal beta-mannosidase (MANBA) expression and disease severity.

Authors: Xiangchen Gu; Hongliu Yang; Xin Sheng; Yi-An Ko; Chengxiang Qiu; Jihwan Park; Shizheng Huang; Rachel Kember; Renae L Judy; Joseph Park; Scott M Damrauer; Girish Nadkarni; Ruth J F Loos; Vy Thi Ha My; Kumardeep Chaudhary; Erwin P Bottinger; Ishan Paranjpe; Aparna Saha; Christopher Brown; Shreeram Akilesh; Adriana M Hung; Matthew Palmer; Aris Baras; John D Overton; Jeffrey Reid; Marylyn Ritchie; Daniel J Rader; Katalin Susztak
Journal: Sci Transl Med Date: 2021-01-13 Impact factor: 17.956

10. Cooperative Studies Program (CSP) #572: A Study of Serious Mental Illness in Veterans as a Pathway to personalized medicine in Schizophrenia and Bipolar Illness.

Authors: Philip D Harvey; Tim B Bigdeli; Ayman H Fanous; Yuli Li; Nallakkandi Rajeevan; Frederick Sayward; Krishnan Radhakrishnan; Grant Huang; Mihaela Aslan
Journal: Pers Med Psychiatry Date: 2021-07