Yuan He, Surya B Chhetri, Marios Arvanitis, Kaushik Srinivasan, François Aguet, Kristin G Ardlie, Alvaro N Barbeira, Rodrigo Bonazzola, Hae Kyung Im, Christopher D Brown, Alexis Battle.
Abstract
Genetic regulation of gene expression, revealed by expression quantitative trait loci (eQTLs), exhibits complex patterns of tissue-specific effects. Characterization of these patterns may allow us to better understand mechanisms of gene regulation and disease etiology. We develop a constrained matrix factorization model, sn-spMF, to learn patterns of tissue-sharing and apply it to 49 human tissues from the Genotype-Tissue Expression (GTEx) project. The learned factors reflect tissues with known biological similarity and identify transcription factors that may mediate tissue-specific effects. sn-spMF, available at https://github.com/heyuan7676/ts_eQTLs, can be applied to learn biologically interpretable patterns of eQTL tissue-specificity and generate testable mechanistic hypotheses.
Keywords: Matrix factorization; Tissue-specific eQTLs; Transcription factors; Ubiquitous eQTLs
Year: 2020 PMID: 32912314 PMCID: PMC7488540 DOI: 10.1186/s13059-020-02129-6
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Matrix factorization model to dissect eQTL effects across tissues. a Simplified examples of the relationship between eQTL effect sizes and factors. eQTL1: the effect of an eQTL in the spleen can be represented by a spleen-specific factor. eQTL2: the effect of an eQTL in all nine tissues can be summarized as a ubiquitous effect across all tissues. eQTL3: the effect of an eQTL in four brain tissues and three skin tissues can be summarized as the sum of a brain-specific effect and a skin-specific effect. b Learning factors underlying eQTL effects from GTEx. The matrix X represents the effect sizes of eQTLs across tissues (see the “Methods” section). Patterns of tissue-sharing and tissue-specificity are observed in X. Matrix factorization is used to learn the factor matrix F, where each factor captures a pattern of eQTL effect sizes across tissues. c The matrix W holds the weight for each eQTL in each tissue. Each weight is the reciprocal of the standard error of the corresponding effect size. d The objective function in sn-spMF, where α and λ are sparsity penalty parameters, and D is the number of eQTLs
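The caption of Fig. 1 names the ingredients of the objective (X, W, F, α, λ, D) but the formula itself is not reproduced in this record. The following is a minimal sketch of one plausible form, assuming a weighted squared reconstruction error scaled by 1/(2D) plus L1 penalties on a loading matrix L (weighted by α) and on F (weighted by λ). The function name, the loading matrix L, the 1/(2D) scaling, and the assignment of α and λ to L and F are assumptions made for illustration, not the authors' stated definition; any constraint on F in the actual model is not enforced here.

```python
def snspmf_objective(X, W, L, F, alpha, lam):
    """Evaluate an ASSUMED sn-spMF-style objective (illustrative only).

    X: D x T effect sizes (eQTLs x tissues)
    W: D x T weights, each the reciprocal of the standard error
    L: D x K loadings (hypothetical name), F: T x K factors
    """
    D = len(X)       # number of eQTLs
    T = len(X[0])    # number of tissues
    K = len(F[0])    # number of factors
    sq_err = 0.0
    for i in range(D):
        for t in range(T):
            # Fitted effect size is a linear combination of factors.
            fitted = sum(L[i][k] * F[t][k] for k in range(K))
            sq_err += (W[i][t] * (X[i][t] - fitted)) ** 2
    # L1 (sparsity) penalties on loadings and factors.
    l1_L = sum(abs(v) for row in L for v in row)
    l1_F = sum(abs(v) for row in F for v in row)
    return sq_err / (2 * D) + alpha * l1_L + lam * l1_F


# Toy data: two tissues, one tissue-specific factor and one ubiquitous factor.
F = [[1.0, 1.0],   # tissue 0: in the specific factor and the ubiquitous factor
     [0.0, 1.0]]   # tissue 1: in the ubiquitous factor only
L = [[1.0, 0.0],   # eQTL 0 loads on the tissue-specific factor
     [0.0, 0.5]]   # eQTL 1 loads on the ubiquitous factor
X = [[1.0, 0.0],
     [0.5, 0.5]]
W = [[1.0, 1.0],
     [1.0, 1.0]]

objective = snspmf_objective(X, W, L, F, alpha=0.1, lam=0.1)
```

In this toy case the reconstruction is exact, so the objective value comes entirely from the two penalty terms; larger α or λ would push an optimizer toward sparser loadings or factors.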
Fig. 2 Assignment of eQTLs to factors. Effect sizes and 95% confidence intervals of four eQTLs across 49 tissues are illustrated. The fitted linear combination of factors for each eQTL is displayed in gray scale at the right of each panel. Faded colors indicate factors whose coefficients have FDR ≥ 0.05. An asterisk on a tissue indicates that the eQTL was significant at FDR < 0.05 in that tissue. a A liver-specific eQTL (GLT1D1-rs1012994). b An eQTL (AATF-rs76014915) with activity in brain tissues and tibial nerve. c A ubiquitous eQTL (U2AF1-rs234719). d An eQTL (CD14-rs2563249) with ubiquitous and testis-specific effects
Fig. 3 Identification of tissue-specific and ubiquitous eQTLs. a Fraction of tested eQTLs that load on each factor. b Fraction of eQTLs that load on ubiquitous and tissue-specific factors. c The overlap of tested eQTLs that load on the ubiquitous factor (u-eQTLs) and on any tissue-specific factor (ts-eQTLs). d Fraction of eQTLs that load on different numbers of tissue-specific factors (ts-factors). eQTLs that load on a given number of ts-factors fall into one of two categories: those that also load on the ubiquitous factor and those that load only on ts-factors. The figure shows the fraction of tested eQTLs that load on each number of ts-factors, with colors showing the contribution of each category. e Fraction of eQTLs with activity in different numbers of tissues. The numbers of unique tissues represented in the set of factors for each eQTL are summed
Fig. 4 Enriched GO terms for eQTL genes from sn-spMF at FDR < 0.1. Color represents the level of enrichment (−log10 P value). The significantly enriched GO terms are annotated with numbers representing the odds ratio (OR). To compute the OR for each factor, background genes include all genes tested for the tissues represented in the factor. GO terms and factors are ordered by hierarchical clustering. Examples of relevant GO terms in related tissues are annotated
Fig. 5 Enrichment of TFBS for u-eQTLs and ts-eQTLs. a Number of TFs whose binding sites are enriched for eQTLs across factors at FDR < 0.05 for sn-spMF, flashr, and heuristic 1 methods. Enh, enhancers; TssA, active transcription start sites. b Total number of TFs with binding sites enriched for only u-eQTLs, only ts-eQTLs, or both. c Distribution of the number of tissue-specific factors each TF is enriched in. d–f Enrichment for example TFs among eQTLs across each factor (−log10 P value) where the TF was expressed in corresponding tissues for d FOSL2, e GATA4, and f HNF4A. Black bars indicate a BH-corrected P value < 0.05
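The BH-corrected P values referred to in Fig. 5 come from the Benjamini-Hochberg procedure. As a minimal sketch of that standard procedure (not the authors' code; the function name and the example P values are hypothetical), adjusted values can be computed as follows:

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical enrichment P values for one TF across four factors.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [a < 0.05 for a in adj]
```

With these made-up inputs, only the first factor remains below 0.05 after correction, which is the kind of threshold the black bars in panels d–f encode.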
Fig. 6 Example liver-specific eQTL, TNKS-rs9987289, in a TFBS of HNF4A that co-localizes with liver-specific phenotypes. a Effect size and 95% confidence interval of TNKS-rs9987289 across 49 tissues in GTEx. b Allele-specific HNF4A ChIP-seq reads over rs9987289 in the liver (see the “Methods” section, two-sided binomial test P value = 8.8 × 10^−5). c Normalized expression levels of TNKS in the liver among individuals with different genotypes at rs9987289. P value = 3.4 × 10^−4 from GTEx eQTL analysis. d Schematic illustration of hypothesized mechanism: allele-specific binding of HNF4A at rs9987289 and altered levels of expression of TNKS. e Manhattan plot (LocusZoom v0.4.8) [54] of TNKS expression levels in the liver around rs9987289. f Manhattan plot for LDL GWAS around rs9987289
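Panel b of Fig. 6 reports a two-sided binomial test on allele-specific ChIP-seq reads. The sketch below shows how such a test can be computed from allele read counts under a null of equal binding to both alleles (p = 0.5). The function name and the read counts are hypothetical, since this record does not list the actual counts, and the two-sided rule used (summing all outcomes no more likely than the observed one) is an assumption about the test variant.

```python
from math import comb


def binom_two_sided(k, n, p=0.5):
    """Two-sided exact binomial P value for k successes out of n trials.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one (assumed two-sided convention).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Small relative tolerance so ties are not lost to rounding.
    threshold = pmf[k] * (1 + 1e-7)
    return min(1.0, sum(q for q in pmf if q <= threshold))


# Hypothetical counts: 1 read on the reference allele out of 10 total reads.
p_imbalanced = binom_two_sided(1, 10)
# Perfectly balanced reads give no evidence of allele-specific binding.
p_balanced = binom_two_sided(5, 10)
```

A small P value from a test of this kind indicates that reads are distributed unevenly between the two alleles, which is the evidence the caption cites for allele-specific HNF4A binding.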