Nicole R Gay, Michael Gloudemans, Margaret L Antonio, Nathan S Abell, Brunilda Balliu, YoSon Park, Alicia R Martin, Shaila Musharoff, Abhiram S Rao, François Aguet, Alvaro N Barbeira, Rodrigo Bonazzola, Farhad Hormozdiari, Kristin G Ardlie, Christopher D Brown, Hae Kyung Im, Tuuli Lappalainen, Xiaoquan Wen, Stephen B Montgomery.
Abstract
BACKGROUND: Population structure among study subjects may confound genetic association studies, and lack of proper correction can lead to spurious findings. The Genotype-Tissue Expression (GTEx) project largely contains individuals of European ancestry, but the v8 release also includes up to 15% of individuals of non-European ancestry. Assessing ancestry-based adjustments in GTEx improves portability of this research across populations and further characterizes the impact of population structure on GWAS colocalization.
Keywords: Admixture; Colocalization; GTEx; Gene expression; Local ancestry; Population structure; eQTL
Year: 2020 PMID: 32912333 PMCID: PMC7488497 DOI: 10.1186/s13059-020-02113-0
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Population admixture in the GTEx v8 cohort. a Genotype principal components (gPCs) reflect global ancestry. Points are colored by self-reported ancestry. Circled points indicate the 117 individuals defined as admixed (117AX). b A subset of GTEx v8 tissues has an 117AX sample size of at least 30. The seven tissues selected for cis-eQTL mapping in 117AX are colored and shown in bold. c LA tracts collapse consecutive variants on a single parental chromosome with the same ancestry assignment into contiguous haplotype blocks. The fine spatial resolution of local ancestry contrasts with the global ancestry proportions indicated in the legend. Haplotypes (columns) are paired by individuals; rows are autosomal chromosomes. Individuals are sorted from left to right by decreasing proportions of European admixture. d gPCs are highly correlated with global ancestry proportions averaged from genome-wide local ancestry. e Local (or global) ancestry explains a fraction of variance in residual gene expression after correcting for global (or local) ancestry. Local ancestry is defined as the local ancestry at the transcription start site of each gene; global ancestry is the first five gPCs. Points are colored by tissue; colors correspond with b. Subc., subcutaneous; NSE, not sun-exposed; VE, variance explained; LA, local ancestry; GA, global ancestry
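The tract collapsing described for panel c of Fig. 1 amounts to a run-length grouping along one haplotype. The following is a minimal Python sketch of that idea, assuming per-variant ancestry calls for a single parental chromosome are already available as position-ordered lists. The function name `collapse_tracts`, the positions, and the ancestry labels are hypothetical illustrations and are not taken from the study's pipeline.

```python
from itertools import groupby

def collapse_tracts(positions, ancestries):
    """Collapse consecutive variants with the same ancestry call into tracts.

    positions  -- variant coordinates on one haplotype, sorted ascending
    ancestries -- ancestry label assigned to each variant, same order
    Returns a list of (start, end, ancestry) tuples, one per contiguous block.
    """
    if len(positions) != len(ancestries):
        raise ValueError("positions and ancestries must have equal length")
    tracts = []
    idx = 0
    for ancestry, run in groupby(ancestries):
        run_length = len(list(run))
        start = positions[idx]
        end = positions[idx + run_length - 1]
        tracts.append((start, end, ancestry))
        idx += run_length
    return tracts

# Toy haplotype (made-up coordinates and labels, for illustration only)
toy_positions = [100, 200, 300, 400, 500]
toy_ancestries = ["EUR", "EUR", "AFR", "AFR", "EUR"]
toy_tracts = collapse_tracts(toy_positions, toy_ancestries)
```

In this toy example the five variants collapse into three tracts, which is why local ancestry can vary along a chromosome even when an individual's genome-wide ancestry proportions are fixed.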
Fig. 2 Comparison of cis-eQTLs called by LocalAA or GlobalAA. Cis-eQTL mapping was performed in seven tissues. A nominal P value threshold of 1e−6 was applied to identify significant associations. a A Q-Q plot of nominal P values for all tests indicates a modest improvement of power in most tissues when using LocalAA. b LocalAA identifies more eGenes than GlobalAA in all seven tissues (P value = 0.0078, binomial probability). c The majority of eGenes are identified by both ancestry adjustment methods (gray + purple). The two methods report different eVariants for a small fraction of these eGenes (purple). Numbers indicate eGenes uniquely called by one of the ancestry adjustment methods, which are plotted in d. d The majority of eGenes unique to one ancestry adjustment method fall near the significance threshold, as indicated by the rug plot. Dotted lines demarcate the region outside of which eGenes in one method have a nominal P value at least two orders of magnitude more significant than the alternate method. Points are colored by tissue
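The binomial probability quoted for panel b of Fig. 2 can be checked by hand. The sketch below assumes the null hypothesis is that either method is equally likely (probability 0.5) to identify more eGenes in a given tissue, and that the reported value is the one-sided probability of LocalAA winning in all seven tissues. That reading is an assumption; the caption does not spell out the test. The function name `binomial_upper_tail` is hypothetical.

```python
from math import comb

def binomial_upper_tail(successes, trials, p=0.5):
    """One-sided probability of observing at least `successes` out of `trials`."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Seven of seven tissues favor LocalAA under a fair-coin null
p_value = binomial_upper_tail(7, 7)
print(round(p_value, 4))  # 0.0078
```

Under this assumption the probability is 0.5 raised to the seventh power, or 0.0078125, which matches the rounded value in the caption.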
Fig. 3 Impact of eQTL ancestry adjustment methods on colocalization with GWAS. a, b We performed colocalization for a subset of loci where LocalAA and GlobalAA called eQTLs with different lead eVariants (nominal P value threshold of 1e−4). Each point represents a GWAS/eQTL colocalization test near a single eGene (colored by eQTL tissue). The x- and y-axes respectively show the posterior probabilities of colocalization using either GlobalAA or LocalAA eQTL signals. The same 31 points highlighted in both plots correspond to loci where one ancestry-adjusted eQTL signal colocalized but the other did not, with concordant results between two colocalization methods. a Colocalization was performed with COLOC for all loci where LocalAA and GlobalAA called eQTLs with different lead eVariants (nominal P value threshold of 1e−4). A posterior probability of colocalization (PP4) threshold of 0.5 was used to identify colocalization events with COLOC. b For the subset of loci for which COLOC reported a colocalization (i.e., colored points in a), colocalization was also performed with FINEMAP. Colocalization posterior probabilities (CLPPs) are shown on a log10 scale. A CLPP threshold of 0.01 was used to identify colocalization events with FINEMAP. c Colocalization posterior probabilities are provided for the 31 loci highlighted in a and b. Larger values indicate stronger colocalization. The associated eQTL tissues are indicated with colored circles and tick marks below the x-axis. SR, self-reported; DBD, diagnosed by doctor; N, count
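One way to read the highlighting rule in Fig. 3 is as a simple thresholding step: a locus is highlighted when exactly one of the two ancestry-adjusted eQTL signals colocalizes, and COLOC and FINEMAP agree on which one. The Python sketch below encodes that interpretation using the thresholds from the caption (PP4 of 0.5, CLPP of 0.01). The function name, argument names, the use of "greater than or equal to" at the threshold, and the example probabilities are all assumptions made for illustration; they are not the authors' code or data.

```python
PP4_THRESHOLD = 0.5    # COLOC threshold stated in the caption
CLPP_THRESHOLD = 0.01  # FINEMAP threshold stated in the caption

def is_highlighted(pp4_global, pp4_local, clpp_global, clpp_local):
    """Return True if one adjustment method colocalizes and the other does
    not, with COLOC and FINEMAP giving the same call for each method.

    Assumption: a value equal to the threshold counts as a colocalization.
    """
    coloc_global = pp4_global >= PP4_THRESHOLD
    coloc_local = pp4_local >= PP4_THRESHOLD
    finemap_global = clpp_global >= CLPP_THRESHOLD
    finemap_local = clpp_local >= CLPP_THRESHOLD

    discordant_between_adjustments = coloc_global != coloc_local
    concordant_between_methods = (
        coloc_global == finemap_global and coloc_local == finemap_local
    )
    return discordant_between_adjustments and concordant_between_methods

# Made-up probabilities: only the LocalAA signal colocalizes, in both methods
example_call = is_highlighted(
    pp4_global=0.10, pp4_local=0.80, clpp_global=0.001, clpp_local=0.05
)
```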
Fig. 4 Correlation between genotype and local ancestry in GTEx v8 eVariants. For all eVariants reported by the overall GTEx v8 eQTL calling pipeline, we calculated the correlation between genotypes and local ancestry using the full GTEx v8 cohort. a The majority of GTEx v8 eVariants are not confounded by local ancestry when all 838 genotyped individuals are considered. b Local ancestry explains more than 70% of the variance in genotypes for a subset of GTEx v8 eVariants. Unlike a, b considers only individuals with matched genotype and gene expression data for each tissue, which reflects the sample used to call these significant associations. eQTLs with posterior probabilities of GWAS colocalization of at least 0.5 (COLOC PP4 > 0.5) are labeled with the eGene and GWAS trait
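The quantity summarized in Fig. 4, the share of genotype variance explained by local ancestry, can be illustrated with a squared Pearson correlation. The sketch below is a simplification: it assumes a single local ancestry covariate per individual (for example, the number of haplotypes, 0 to 2, assigned to one ancestry at the variant), whereas the study may have used more than one ancestry covariate. The function name `variance_explained` and the toy vectors are hypothetical and are not study data.

```python
def variance_explained(genotypes, local_ancestry):
    """Squared Pearson correlation between genotype dosage and a single
    local ancestry count; returns 0.0 if either vector is constant."""
    n = len(genotypes)
    if n != len(local_ancestry) or n == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_g = sum(genotypes) / n
    mean_a = sum(local_ancestry) / n
    cov = sum((g - mean_g) * (a - mean_a)
              for g, a in zip(genotypes, local_ancestry))
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    var_a = sum((a - mean_a) ** 2 for a in local_ancestry)
    if var_g == 0 or var_a == 0:
        return 0.0
    return cov * cov / (var_g * var_a)

# Toy example (made-up values): dosage closely tracks local ancestry
toy_genotypes = [0, 0, 1, 1, 2, 2, 0, 1]
toy_local_ancestry = [0, 0, 1, 1, 2, 2, 1, 0]
toy_r2 = variance_explained(toy_genotypes, toy_local_ancestry)

# Flag the variant if local ancestry explains more than 70% of the variance
is_confounded = toy_r2 > 0.7
```

A variant flagged this way is one where an eQTL association could reflect ancestry at the locus rather than the variant itself, which is the concern panel b raises.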