Heekyoung Lee1,2,3,4, Kun Qian1,2,3,4,5, Christine von Toerne4,5, Lena Hoerburger2,6, Melina Claussnitzer1,2,3,4,7, Christoph Hoffmann2,8, Viktoria Glunk1,2,3,4, Simone Wahl4,9,10, Michaela Breier4,9,10, Franziska Eck5, Leili Jafari5, Sophie Molnos4,9,10, Harald Grallert3,4,9,10, Ingrid Dahlman11, Peter Arner11, Cornelia Brunner12, Hans Hauner1,2,3,4,13, Stefanie M Hauck4,5, Helmut Laumen1,2,3,4,5,6,14.
Abstract
Genome-wide association studies have identified numerous disease risk loci. Delineating molecular mechanisms influenced by cis-regulatory variants is essential to understand gene regulation and ultimately disease pathophysiology. Combining bioinformatics and public domain chromatin information with quantitative proteomics supports prediction of cis-regulatory variants and enabled identification of allele-dependent binding of both transcription factors and coregulators at the type 2 diabetes associated PPARG locus. We found rs7647481A nonrisk allele binding of Yin Yang 1 (YY1), confirmed by allele-specific chromatin immunoprecipitation in primary adipocytes. Quantitative proteomics also found the coregulator RING1 and YY1 binding protein (RYBP), whose mRNA levels correlate with improved insulin sensitivity in primary adipose cells carrying the rs7647481A nonrisk allele. Our findings support a concept with diverse cis-regulatory variants contributing to disease pathophysiology at one locus. Proteome-wide identification of both transcription factors and coregulators can profoundly improve understanding of mechanisms underlying genetic associations.
Year: 2017 PMID: 28334807 PMCID: PMC5389726 DOI: 10.1093/nar/gkx105
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Discovery of allele-specific binding proteins at cis-regulatory variants. (A) Workflow: (1) cis-regulatory variant prediction at disease-associated variants (PPARG) in high LD (r² ≥ 0.7 (6)) by integrating bioinformatics phylogenetic TFBS module complexity analysis and regulatory chromatin marks; (2) protein–DNA binding assessed by Cy5-labeled oligonucleotides matching the risk and nonrisk allele, respectively, in electrophoretic mobility shift assay (EMSA); (3) protein enrichment with biotin (bio)-labeled oligonucleotides on streptavidin beads (str) and elution of native protein complexes with increasing concentration of NaCl; (4) protein–DNA binding in eluted fractions; (5) protein identification and quantification by LC–MS/MS and subsequent label-free quantitative analysis; and (6) molecular mechanisms, experimental and genetic verification of significant allele-specific binding transcription factors and related coregulators. (B–D) Bioinformatics and public domain epigenomic marks of regulatory regions infer the cis-regulatory variant rs7647481 at the PPARG locus (related to Supplementary Figure S1). (B) PMCA analysis of cross-species TFBS pattern conservation predicted six indicated candidate cis-regulatory SNPs at complex regions (6) (red) out of 23 noncoding proxy SNPs (r² ≥ 0.7 (6)) at the type 2 diabetes (T2D) associated PPARG locus (tagSNP rs1801282). (C) Overlap of the six variants identified in (B) with H3K27ac (histone H3 lysine 27 acetylation), H3K4me1 and H3K4me2 (histone H3 lysine 4 mono- and di-methylation) histone modification regions at the PPARG locus during adipogenic differentiation of primary human adipocyte stem cells (36), GSE21366; genomic coordinates conform to hg19. (D) Localization of cis-regulatory (red) and non cis-regulatory (grey) variants subjected to workflow (A2–6) relative to the transcriptional start site of the PPARG1–3 mRNA isoforms. rs7647481 overlaps with histone modification regions at both tested late stages of adipogenesis (day 3 and day 9; Figure 1C) and with adipocyte DNase-seq regions (see Supplementary Figure S1). * rs4684847 was previously identified as specifically overlapping with homeobox TFBS (6). Blue boxes = coding exons, dashed white boxes = untranslated exons, blue lines = introns, black arrows = promoters.
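The proxy selection in panel B relies on a linkage disequilibrium cutoff of r² ≥ 0.7 relative to the tagSNP. As a minimal sketch of how r² follows from haplotype and allele frequencies, the snippet below uses the standard definition r² = D² / (pA(1 − pA) pB(1 − pB)). The function names `ld_r2` and `is_proxy` and all frequencies are hypothetical illustration values, not data or tooling from the study.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Return r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def is_proxy(p_ab: float, p_a: float, p_b: float, threshold: float = 0.7) -> bool:
    """True if the pair passes the LD cutoff (0.7 mirrors the caption's threshold)."""
    return ld_r2(p_ab, p_a, p_b) >= threshold


# Made-up frequencies for two candidate proxies of a tagSNP with allele frequency 0.12.
candidates = {
    "hypothetical_snp_1": (0.11, 0.12, 0.12),  # nearly always inherited together
    "hypothetical_snp_2": (0.05, 0.12, 0.30),  # weakly correlated
}
for name, (p_ab, p_a, p_b) in candidates.items():
    print(name, round(ld_r2(p_ab, p_a, p_b), 3), is_proxy(p_ab, p_a, p_b))
```

In practice r² is usually taken from a reference panel rather than computed by hand; the sketch only makes the cutoff concrete.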
Figure 4. rs7647481A nonrisk allele-specific binding and transcriptional activity of the transcription factor YY1 inferred from proteomics analysis. (A) The rs7647481G risk allele abrogates the core of a YY1 consensus binding site (Matbase Matrix Library 9.1, Genomatix, Munich, Germany). (B) Competition and supershift EMSA experiments using risk (R) and nonrisk (NR) allele-specific Cy5-labeled probes of the predicted cis-regulatory (red) and non cis-regulatory (grey) variants reveal specific binding of YY1 at the rs7647481A nonrisk allele. Competition (comp.) assays used a 33-fold excess of unlabeled YY1 probe; supershift assays added anti-YY1 (αYY1) or IgG isotype control antibody, respectively. (C) Reporter assays in 293T cells with constructs harbouring the risk and nonrisk allele of predicted cis-regulatory (red) and non cis-regulatory (grey) variants reveal allele-specific activation from the rs7647481A nonrisk allele upon YY1 overexpression. Mean ± SD from five experiments. (D) rs7647481A nonrisk allele-specific activation of reporter gene activity in 293T cells, INS-1 β-cells, C2C12 cells (undifferentiated myoblasts, differentiated myocytes) and Huh7 hepatocytes, assessing the effect of endogenous transcriptional regulators. Reporter assays with luciferase constructs containing the respective allele at midposition as indicated; for each cell line the TK-promoter control vector was co-transfected separately and set to one. Mean ± SD from seven experiments. (E) Increased in vivo YY1 binding at the rs7647481A nonrisk allele. The result shows allele-specific binding at the A-nonrisk/G-risk allele, determined by ChIP experiments in primary human adipose tissue cells, preadipocytes and in vitro differentiated adipocytes heterozygous for rs7647481G/A, using αYY1 or IgG isotype control, respectively, followed by allele-specific qPCR detecting the rs7647481A nonrisk and G risk allele for each ChIP experiment (see also Materials and Methods). Mean ± SD from ChIP experiments using chromatin–DNA from three donors; P-values from Wilcoxon's signed rank test.
Figure 5. Interaction network analysis of YY1 with cofactors infers RYBP contribution to the nonrisk allele-specific effect on insulin resistance. (A) Interaction network of the YY1 transcription factor identified at the rs7647481A nonrisk allele with all transcriptional coregulators identified in the same label-free proteomics analysis. Associations by co-citation (dotted lines) or expert curation (lines) from GePS tool analysis (Genomatix, see Materials and Methods). Proteins with direct interaction to the transcription factor YY1 (green dotted line) and with positive correlation of adipose mRNA levels to insulin sensitivity (green line) are shown (Table 2). (B and C) YY1 and RYBP (B), PPARG1 and PPARG2 (C) mRNA expression levels measured by qPCR, standardized to GAPDH, in SGBS preadipocytes treated with different siRNAs for 72 h, labeled as siYY1, siRYBP or siYY1+siRYBP/siNT (non-targeting control). Mean ± SD from five experiments. P-values from one-sample t-test. (D) Impact of proteins identified at the nonrisk and risk alleles on the PPARG locus phenotype insulin resistance. The rs7647481A nonrisk allele promotes binding of YY1 and its cofactor RYBP, which activate PPARG expression, thereby improving insulin sensitivity. The rs4684847C risk allele binds the PPARG suppressor PRRX1, resulting in insulin resistance (6).
Figure 2. Enrichment of risk and nonrisk allele-specific binding proteins at predicted cis-regulatory SNPs. (A) Representative EMSA experiments with allele-specific Cy5-labeled probes on nuclear extracts from HIB 1B cells (triangle = allele-specific band) demonstrated allele-specific differential binding affinity of proteins at the risk/nonrisk allele of predicted cis-regulatory rs4684847/rs7647481 variants (red), respectively, and no binding at predicted non cis-regulatory SNPs (grey). Bar charts illustrate the allelic fold change of protein–DNA complex signal intensity (allele with highest binding/allele with lowest binding; allele with lowest binding set to one), mean ± SD of five experiments. *P < 0.05. P-value by paired t-test. (B) Risk and nonrisk allele protein–DNA interaction at predicted cis-regulatory and non cis-regulatory SNPs in human preadipocytes and adipocytes. EMSA with allelic Cy5-labeled probes for the indicated predicted cis-regulatory (red) and non cis-regulatory (grey) SNPs using nuclear extracts from undifferentiated primary human preadipocytes (left panel), the human SGBS preadipocyte cell line (mid panel) and SGBS cells in vitro differentiated to adipocytes for 14 days (right panel). (C) Enrichment of allele-specific differential binding proteins. EMSA with binding-allele-specific Cy5-labeled probes of predicted cis-regulatory SNPs using protein from affinity chromatography with the respective biotin-labeled risk/nonrisk allelic probes. Triangle = allele-specific band; Input: nuclear protein used for affinity chromatography; Sn: supernatant after incubation with biotin-labeled allelic-probe–magnetic bead conjugates; Wash: low NaCl concentration wash eluates; E200/E300: 200 and 300 mM NaCl protein eluates used for LC–MS/MS. Protein eluates E200 and E300 with differential protein–DNA binding contain the prioritized transcription factors YY1 at the rs7647481A nonrisk allele and PRRX1 at the rs4684847C risk allele (Table 1). All experiments were performed in triplicate. For enrichments at predicted non cis-regulatory SNPs, see Supplementary Figure S2.
Table 1. Allele-specific differentially binding transcription factors

| SNP | Gene symbol | Allelic ratio | Allelic fold change | P-value | Quantified peptides |
|---|---|---|---|---|---|
| rs7647481 | YY1 | A/G | 6.6 | 2.94 × 10⁻³ | 9 |
| | NFATC4 | | 2.6 | 0.01 | 2 |
| rs4684847 | PRRX1 | C/T | 2.6 | 0.01 | 5 |
| | ILF3 | | 4.2 | 0.01 | 4 |

Shown are proteins annotated as transcription factors that were identified by LC–MS/MS in the fractions with the highest allelic protein–DNA binding EMSA signal intensity after affinity-chromatography enrichment at the rs7647481A nonrisk and rs4684847C risk allele (Figure 2C, 200 and 300 mM elution, respectively) and that showed significant differential binding (allelic fold change ≥ 2.0, P-value ≤ 0.01; illustrated in Figure 3 and Supplementary Figure S3).
Figure 3. Label-free quantitative LC–MS/MS proteomics identified risk versus nonrisk allele-specific binding proteins at predicted cis-regulatory and non cis-regulatory SNPs. Volcano plots for the indicated variants illustrate the distribution of risk (blue) and nonrisk (green) allele-specific binding proteins identified by LC–MS/MS (results from 200 mM NaCl eluates illustrated by EMSA in Figure 2C and Supplementary Figure S2; for results from 300 mM eluates see Supplementary Figure S3) at predicted cis-regulatory (red) and non cis-regulatory SNPs (grey). Proteins with significant (P ≤ 0.05, red line) allele-specific differential binding (allelic ratio ≤ 0.5 or ≥ 2) at the risk allele = blue dots, at the nonrisk allele = green dots; proteins with no significant allele-specific binding = grey dots; n of proteins per quadrant = italic numbers. Arrows highlight proteins from Table 1 annotated as transcription factors with fold change ≥ 2, P ≤ 0.01, number of identified peptides ≥ 2. Mean protein levels (log2 ratio of indicated alleles) and P-value from unpaired t-test of three independent experiments.
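The colour coding of the volcano plots can be read as a simple decision rule on the allelic ratio and the P-value. The following is a minimal sketch of that rule, assuming the ratio and P-value have already been computed upstream; the function name `classify_binding` and its parameters are hypothetical, and the example inputs are the fold changes and P-values listed in Table 1.

```python
import math


def classify_binding(ratio, p_value, numerator, denominator,
                     min_fold=2.0, alpha=0.05):
    """Classify one protein from its allelic ratio (numerator/denominator) and P-value.

    A protein counts as allele-specific when P <= alpha and the ratio is
    >= min_fold or <= 1/min_fold, i.e. |log2(ratio)| >= log2(min_fold).
    """
    if ratio <= 0:
        raise ValueError("allelic ratio must be positive")
    log2_ratio = math.log2(ratio)
    if p_value > alpha or abs(log2_ratio) < math.log2(min_fold):
        return "not significant"
    # Positive log2 ratio: more binding at the numerator allele.
    return f"{numerator}-specific" if log2_ratio > 0 else f"{denominator}-specific"


# rs7647481 is reported as A/G (nonrisk/risk), rs4684847 as C/T (risk/nonrisk).
print(classify_binding(6.6, 2.94e-3, "nonrisk", "risk", alpha=0.01))  # YY1
print(classify_binding(2.6, 0.01, "risk", "nonrisk", alpha=0.01))     # PRRX1
```

Passing `alpha=0.01` mirrors the stricter Table 1 cutoff, while the default of 0.05 corresponds to the red line in the plots. The additional Table 1 requirement of at least two identified peptides is not modelled here.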
Table 2. Risk and nonrisk allele-specific correlation of adipose tissue PPARG, YY1 and RYBP mRNA expression levels (log transformed) with the type 2 diabetes trait insulin resistance

| mRNA | Allele | Adj | β (HOMA-IR) | SE | P-value |
|---|---|---|---|---|---|
| PPARG | All | — | −3.47 | 1.19 | 6.03 × 10⁻³ |
| | | a | −3.52 | 1.19 | 5.48 × 10⁻³ |
| | | a.b | −1.57 | 0.87 | 0.08 |
| | Nonrisk | — | −6.27 | 1.33 | 2.27 × 10⁻⁴ |
| | | a | −6.25 | 1.32 | 2.18 × 10⁻⁴ |
| | | a.b | −3.26 | 1.50 | 0.05 |
| | Risk | — | 0.25 | 1.68 | 0.88 |
| | | a | 0.23 | 1.73 | 0.89 |
| | | a.b | 0.11 | 1.10 | 0.92 |
| YY1 | All | — | 0.07 | 1.59 | 0.96 |
| | | a | −0.01 | 1.69 | 1.00 |
| | | a.b | −0.25 | 1.08 | 0.82 |
| | Nonrisk | — | −2.87 | 2.63 | 0.29 |
| | | a | −3.97 | 2.79 | 0.17 |
| | | a.b | −2.33 | 1.70 | 0.19 |
| | Risk | — | 2.18 | 1.77 | 0.23 |
| | | a | 2.36 | 1.86 | 0.22 |
| | | a.b | 1.15 | 1.23 | 0.36 |
| RYBP | All | — | −1.98 | 1.11 | 0.084 |
| | | a | −1.98 | 1.11 | 0.083 |
| | | a.b | −1.14 | 0.73 | 0.13 |
| | Nonrisk | — | −5.52 | 1.49 | 1.89 × 10⁻³ |
| | | a | −5.71 | 1.44 | 1.15 × 10⁻³ |
| | | a.b | −3.38 | 1.09 | 7.04 × 10⁻³ |
| | Risk | — | 0.55 | 1.31 | 0.68 |
| | | a | 0.57 | 1.31 | 0.67 |
| | | a.b | 0.16 | 0.84 | 0.85 |

Gene expression was measured in adipose tissue from a lean/obese patient cohort (38 subjects, BMI mean ± SD 24.2 ± 9.1 kg/m²). rs7647481 and rs4684847 risk and nonrisk allele genotypes were determined by Sequenom assay. Nonrisk: subjects heterozygous or homozygous (n = 18) for the rs7647481A (YY1/RYBP binding) and rs4684847T nonrisk allele; risk: subjects homozygous (n = 20) for the rs7647481G and rs4684847C risk allele. P-values and β-estimates are from linear regression analysis of total PPARG mRNA levels (from microarray data measuring exons shared by both PPARG1 and PPARG2), YY1 and RYBP mRNA expression levels with the insulin resistance measure HOMA-IR (homeostasis model assessment of insulin resistance). Adj = adjustment: none (—), age (a), or age and BMI (a.b).
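The three adjustment levels in the table correspond to fitting the same linear model with zero, one or two covariates. As a rough sketch of that idea, the snippet below fits ordinary least squares in plain Python and reports the β-estimate and standard error for the expression term. It assumes HOMA-IR is the outcome and log-transformed mRNA level the predictor, which the footnote does not state explicitly. All data values are synthetic, the helper names `invert` and `ols` are hypothetical, and P-values are omitted because they would need a t distribution with n − k degrees of freedom.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def ols(y, predictors):
    """Least-squares fit of y on the predictor columns plus an intercept.

    Returns (coefficients, standard_errors), intercept first.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(X[0])
    if n <= k:
        raise ValueError("need more observations than parameters")
    xtx = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y[r] - sum(X[r][j] * beta[j] for j in range(k)) for r in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    se = [math.sqrt(sigma2 * inv[i][i]) for i in range(k)]
    return beta, se


# Synthetic illustration values only (NOT data from the study).
log_mrna = [7.9, 8.4, 8.1, 9.0, 8.7, 9.3, 8.2, 8.8]
age = [34, 51, 45, 29, 62, 38, 57, 41]
bmi = [22.1, 31.5, 27.0, 20.4, 33.8, 23.9, 35.2, 25.6]
homa_ir = [2.9, 2.6, 3.1, 1.2, 2.4, 1.0, 3.5, 1.8]

for label, covariates in [("unadjusted", []), ("a", [age]), ("a.b", [age, bmi])]:
    beta, se = ols(homa_ir, [log_mrna] + covariates)
    print(f"{label}: beta = {beta[1]:.2f}, SE = {se[1]:.2f}")
```

A statistics package would normally be used for this; the hand-rolled solver only keeps the example free of dependencies.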