Jia Wen1, Munan Xie1, Bryce Rowland2, Jonathan D Rosen2, Quan Sun2, Jiawen Chen2, Amanda L Tapia2, Huijun Qian3, Madeline H Kowalski2, Yue Shan2, Kristin L Young4, Marielisa Graff4, Maria Argos5, Christy L Avery4, Stephanie A Bien6, Steve Buyske7, Jie Yin8, Hélène Choquet8, Myriam Fornage9, Chani J Hodonsky10, Eric Jorgenson11, Charles Kooperberg6, Ruth J F Loos12, Yongmei Liu13, Jee-Young Moon14, Kari E North4, Stephen S Rich10, Jerome I Rotter15, Jennifer A Smith16, Wei Zhao16, Lulu Shang17, Tao Wang14, Xiang Zhou17, Alexander P Reiner18, Laura M Raffield1, Yun Li1,2.
Abstract
BACKGROUND: Thousands of genetic variants have been associated with hematological traits, though target genes remain unknown at most loci. Moreover, limited analyses have been conducted in African ancestry and Hispanic/Latino populations; hematological trait-associated variants more common in these populations have likely been missed.
Keywords: TWAS (transcriptome-wide association study); ancestry; expression analysis; non-European populations
Year: 2021 PMID: 34356065 PMCID: PMC8307403 DOI: 10.3390/genes12071049
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. Study design for blood cell trait-focused TWAS analyses in African ancestry (AA) and Hispanic/Latino (HL) populations. TWAS: Transcriptome-wide association study; DGN: Depression Genes and Networks, n = 922; MESA: Multi-Ethnic Study of Atherosclerosis, n = 229 for AA and n = 381 for HL; ARIC: Atherosclerosis Risk in Communities, n = 2874; BioMe: BioMe™ Biobank, n = 3550 for AA and n = 4730 for HL; CARDIA: Coronary Artery Risk Development in Young Adults, n = 953; GERA: Genetic Epidemiology Research on Adult Health and Aging, n = 3699 for AA and n = 7348 for HL; UKB: UK Biobank, n = 8262; WHI: Women’s Health Initiative, n = 8617 for AA and n = 4359 for HL; HCHS/SOL: Hispanic Community Health Study/Study of Latinos, n = 11,887.
Figure 2. Venn diagram showing the overlap of well-predicted genes by Depression Genes and Networks (DGN) European ancestry and Multi-Ethnic Study of Atherosclerosis (MESA) African ancestry (AA) and Hispanic/Latino (HL) reference panels.
Figure 3. The smooth scatter plots show the model R² distribution of common genes available in both the Depression Genes and Networks (DGN) European and Multi-Ethnic Study of Atherosclerosis (MESA) reference eQTL datasets. The dashed line denotes the threshold value (model R² = 0.05) for well-predicted genes. (a) Comparison of model R² of common genes found in both the DGN and MESA African ancestry (AA) reference panels; (b) comparison of model R² of common genes between the DGN and MESA Hispanic/Latino (HL) reference panels; (c) comparison of model R² of common genes between the MESA AA and MESA HL reference panels; (d) histograms showing the model R² distribution of all genes in each reference panel (without model R² filtering). The blue solid line denotes the median of model R²; the blue dashed line denotes the mean of model R². All genes, including those which do not meet a model R² = 0.05 cut-off, are displayed.
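The well-predicted gene filter and the three-panel overlap described in Figures 2 and 3 can be illustrated with a minimal sketch. The gene names and R² values below are made up for illustration, and the function names `well_predicted` and `venn_counts` are hypothetical; only the 0.05 threshold comes from the captions.

```python
R2_THRESHOLD = 0.05  # cut-off for "well-predicted" genes, taken from the caption


def well_predicted(model_r2, threshold=R2_THRESHOLD):
    """Return the set of genes whose model R2 exceeds the threshold."""
    return {gene for gene, r2 in model_r2.items() if r2 > threshold}


def venn_counts(a, b, c):
    """Count genes in each of the seven regions of a three-set Venn diagram."""
    return {
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
        "a_b": len((a & b) - c),
        "a_c": len((a & c) - b),
        "b_c": len((b & c) - a),
        "a_b_c": len(a & b & c),
    }


# Illustrative model R2 values per reference panel (not real data).
dgn = {"GENE_A": 0.30, "GENE_B": 0.02, "GENE_C": 0.10}
mesa_aa = {"GENE_A": 0.12, "GENE_B": 0.08, "GENE_D": 0.01}
mesa_hl = {"GENE_A": 0.07, "GENE_C": 0.04, "GENE_D": 0.20}

panel_sets = [well_predicted(panel) for panel in (dgn, mesa_aa, mesa_hl)]
counts = venn_counts(*panel_sets)
print(counts)
```

Here a gene counts toward a panel only if that panel's own model passes the threshold, so the same gene can be well predicted in one panel and not in another.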
Figure 4. Smooth scatter plots comparing true gene expression from Genetic Epidemiology Network of Arteriopathy (GENOA) lymphoblastoid cell line (LCL) data to predicted gene expression using Depression Genes and Networks (DGN) and Multi-Ethnic Study of Atherosclerosis (MESA) African ancestry (AA) reference eQTL datasets. (a) True R² distribution using the DGN and MESA eQTL reference panels for all genes (# genes = 4043); (b) true R² distribution for genes with model R² > 0.05 in each reference eQTL dataset (# genes = 3426); (c–f) prediction performance of four examples of well-predicted genes. The scatter plots show the observed expression in GENOA versus predicted expression using prediction models built using MESA AA (c,d) and DGN (e,f). The observed expression in GENOA is from array data in LCLs. The red line is the diagonal line with slope = 1 and the blue line is the fitted line between the observed expression and predicted expression; (g) Venn diagram of well-predicted genes (true R² > 0.05) by DGN and MESA AA reference panels in GENOA.
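One way to read the "true R²" in Figure 4 is as the squared Pearson correlation between observed and predicted expression for a gene. That definition is an assumption here, since the caption does not spell it out, and the helper name `squared_correlation` is hypothetical. A minimal sketch:

```python
def squared_correlation(observed, predicted):
    """Squared Pearson correlation between observed and predicted expression.

    Assumed here to stand in for the "true R2" of a gene; returns 0.0 when
    either vector has no variance, since the correlation is then undefined.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs == 0 or var_pred == 0:
        return 0.0
    return cov * cov / (var_obs * var_pred)


# Illustrative values only: a gene would count as well predicted in the
# validation data if this quantity exceeds 0.05.
r2 = squared_correlation([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(r2 > 0.05)
```

Because the correlation is squared, a model whose predictions are perfectly anti-correlated with the observed values would also score 1, so the sign of the fitted slope is worth checking separately.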
Table 1. Marginally significant results discovered with the MESA HL reference panel for all four phenotypes. “NA” denotes that there are no known variants within ±1 Mb of the gene.
| Gene | Chr | Start_hg38 | End_hg38 | Phenotype | Meta_beta | Meta_se | Direction | Marginal | Conditional | Model R² | Cross-Validation R² | TWAS Reference Panel | Discovery Population |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ADAM15 | 1 | 155050566 | 155062775 | HCT | −0.078 | 0.018 | (−−−−+) | 8.74 | 2.52 | 0.301 | 0.395 | MESA | HL |
| THBS3 | 1 | 155195588 | 155209051 | HCT | 0.212 | 0.039 | (+++++) | 5.40 | 9.22 | 0.1 | 0.060 | MESA | HL |
| GTF2IRD2B | 7 | 75092573 | 75149817 | HCT | −0.249 | 0.056 | (−−−−−) | 9.18 | NA | 0.138 | 0.004 | MESA | HL |
| AGAP6 | 10 | 49982190 | 50010499 | HCT | −0.231 | 0.059 | (−−−−−) | 8.44 | NA | 0.119 | 0.017 | MESA | HL |
| SMAD6 | 15 | 66702228 | 66782849 | HCT | −0.356 | 0.081 | (−−−−+) | 1.14 | 8.55 | 0.074 | 0.005 | MESA | HL |
| ADAM15 | 1 | 155050566 | 155062775 | HGB | −0.066 | 0.018 | (−−−−+) | 3.95 | 1.51 | 0.301 | 0.395 | MESA | HL |
| THBS3 | 1 | 155195588 | 155209051 | HGB | 0.159 | 0.039 | (++++−) | 2.88 | 7.71 | 0.1 | 0.060 | MESA | HL |
| ARHGAP19 | 10 | 97222173 | 97292673 | HGB | −0.215 | 0.056 | (−−−−+) | 7.15 | NA | 0.1 | 0.011 | MESA | HL |
| CCDC15 | 11 | 124954121 | 125041489 | HGB | 0.059 | 0.016 | (++++−) | 5.84 | NA | 0.342 | 0.235 | MESA | HL |
| SMAD6 | 15 | 66702228 | 66782849 | HGB | −0.344 | 0.080 | (−−−−+) | 9.0 | 4.77 | 0.074 | 0.005 | MESA | HL |
| IL6R | 1 | 154405193 | 154469450 | PLT | −0.232 | 0.058 | (−−−−+) | 6.06 | 7.63 | 0.065 | 0.014 | MESA | HL |
| BAK1 | 6 | 33572547 | 33580293 | PLT | −0.118 | 0.029 | (−−−+−) | 4.95 | 2.42 | 0.167 | 0.088 | MESA | HL |
| PAQR8 | 6 | 52361421 | 52407777 | PLT | 0.080 | 0.019 | (+++++) | 4.79 | 1.27 | 0.268 | 0.165 | MESA | HL |
| TNFAIP2 | 14 | 103123442 | 103137439 | PLT | −0.265 | 0.065 | (−−−−−) | 6.85 | 8.10 | 0.095 | 0.013 | MESA | HL |
| SLC22A4 | 5 | 132294394 | 132344190 | WBC | 0.117 | 0.027 | (+++++) | 1.73 | 1.53 | 0.172 | 0.126 | MESA | HL |
| BAK1 | 6 | 33572547 | 33580293 | WBC | −0.110 | 0.028 | (−−−−+) | 9.47 | 1.46 | 0.167 | 0.088 | MESA | HL |
| GRINA | 8 | 143990056 | 143993415 | WBC | −0.298 | 0.077 | (−−+−−) | 9.70 | NA | 0.066 | 0.004 | MESA | HL |
| ATXN2 | 12 | 111443485 | 111599676 | WBC | −0.338 | 0.071 | (−−+−+) | 1.56 | 2.12 | 0.079 | 0.003 | MESA | HL |
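
The Meta_beta, Meta_se, and Direction columns summarize per-cohort TWAS effects. A minimal sketch of how such columns could be produced, assuming a fixed-effect inverse-variance weighted meta-analysis (the record does not state which method was used, and the function names and input numbers are hypothetical):

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance weighted estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    return beta, math.sqrt(1.0 / total)


def two_sided_p(beta, se):
    """Z-score and two-sided normal p-value for an effect estimate."""
    z = beta / se
    return z, math.erfc(abs(z) / math.sqrt(2.0))


def direction(betas):
    """Per-cohort sign string, e.g. '(--+)' for two negative and one positive effect."""
    return "(" + "".join("+" if b > 0 else "-" for b in betas) + ")"


# Illustrative per-cohort effects and standard errors (not real data).
cohort_betas = [-0.10, -0.05, 0.02]
cohort_ses = [0.04, 0.03, 0.05]

meta_beta, meta_se = inverse_variance_meta(cohort_betas, cohort_ses)
z, p = two_sided_p(meta_beta, meta_se)
print(direction(cohort_betas), round(meta_beta, 3), round(meta_se, 3), p)
```

Cohorts with smaller standard errors receive more weight, which is why a single cohort with an opposite sign in the Direction column does not necessarily change the sign of the pooled estimate.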
Figure 5. Mirror plot for the TWAS and GWAS results for platelets (PLT) in Multi-Ethnic Study of Atherosclerosis (MESA) Hispanic/Latino (HL). The upper panel shows TWAS marginal results and […] for TWAS and 5 […] for GWAS.
Figure 6. TNFAIP2 locus. (a) Mirror plots showing the conditional analysis for TNFAIP2, predicted using the Multi-Ethnic Study of Atherosclerosis (MESA) Hispanic/Latino (HL) reference panel, for platelet association meta-analysis in Hispanic/Latino cohorts. The red dots in the bottom panel denote the nearby GWAS signals conditioned on. TNFAIP2 is still significant when conditioned on nearby GWAS signals (as listed in Supplementary Table S5); (b) gene expression for TNFAIP2 and the other three GWAS annotated genes in this locus from platelet-producing megakaryocytes (MK) from BLUEPRINT [43].
Figure 7. ENG locus. The marginal and conditional results for ENG for (a) hematocrit (HCT) and (b) hemoglobin (HGB), predicted using the Multi-Ethnic Study of Atherosclerosis (MESA) reference panel in African ancestry (AA) meta-analysis. The green dot denotes the gene ENG; the red dots denote other genes within this locus region.
Figure 8. Fine-mapping of the THBS3 and MFN2 loci. Blue dots denote genes in the causal gene set configuration; red dots denote the genes outside of the causal gene set configuration. Dot size is proportional to the marginal posterior inclusion probability of each gene in the 95% credible set within the locus. The red dashed line denotes the TWAS significance threshold value. (a,c) THBS3 locus for hematocrit using the Depression Genes and Networks (DGN) reference panel in African ancestry (AA) and the posterior inclusion probability of each gene in the 95% credible set within this locus; (b,d) MFN2 locus for platelet count using the DGN reference panel in African ancestry (AA) and the posterior inclusion probability of each gene in the 95% credible set within this locus.
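A 95% credible set of the kind referenced in Figure 8 is commonly built by ranking genes by posterior inclusion probability (PIP) and adding them until the cumulative normalized probability reaches 0.95. The sketch below follows that common construction; it is an assumption rather than the fine-mapping software's exact procedure, and the gene labels, PIP values, and function name `credible_set` are hypothetical.

```python
def credible_set(pips, level=0.95):
    """Smallest set of top-ranked genes whose normalized PIPs sum to >= level."""
    total = sum(pips.values())
    if total <= 0:
        raise ValueError("posterior inclusion probabilities must sum to a positive value")
    ranked = sorted(pips.items(), key=lambda item: item[1], reverse=True)
    chosen, cumulative = [], 0.0
    for gene, pip in ranked:
        chosen.append(gene)
        cumulative += pip / total  # normalize so the locus sums to 1
        if cumulative >= level:
            break
    return chosen


# Illustrative PIPs for three genes at one locus (not real data).
locus_pips = {"GENE_1": 0.90, "GENE_2": 0.06, "GENE_3": 0.04}
print(credible_set(locus_pips))
```

With these illustrative values the top gene alone covers 0.90 of the posterior mass, so a second gene is needed to reach the 0.95 level.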