Chunyu Liu, Roby Joehanes, Jiantao Ma, Yuxuan Wang, Xianbang Sun, Amena Keshawarz, Meera Sooda, Tianxiao Huan, Shih-Jen Hwang, Helena Bui, Brandon Tejada, Peter J Munson, Demirkale Cumhur, Nancy L Heard-Costa, Achilleas N Pitsillides, Gina M Peloso, Michael Feolo, Nataliya Sharopova, Ramachandran S Vasan, Daniel Levy.
Abstract
To create a scientific resource of expression quantitative trait loci (eQTL), we conducted a genome-wide association study (GWAS) using genotypes obtained from whole genome sequencing (WGS) of DNA and gene expression levels from RNA sequencing (RNA-seq) of whole blood in 2622 participants in the Framingham Heart Study. We identified 6,778,286 cis-eQTL variant-gene transcript (eGene) pairs at p < 5e-8 (2,855,111 unique cis-eQTL variants and 15,982 unique eGenes) and 1,469,754 trans-eQTL variant-eGene pairs at p < 1e-12 (526,056 unique trans-eQTL variants and 7,233 unique eGenes). In addition, 442,379 cis-eQTL variants were associated with expression of 1518 long non-protein coding RNAs (lncRNAs). Gene Ontology (GO) analyses revealed that the top GO terms for cis-eGenes are enriched for immune functions (FDR < 0.05). The cis-eQTL variants are enriched for SNPs reported to be associated with 815 traits in prior GWAS, including cardiovascular disease risk factors. As proof of concept, we used this eQTL resource in conjunction with genetic variants from public GWAS databases in causal inference testing (e.g., COVID-19 severity). After Bonferroni correction, Mendelian randomization analyses identified putative causal associations of 60 eGenes with systolic blood pressure, 13 genes with coronary artery disease, and seven genes with COVID-19 severity. This study created a comprehensive eQTL resource that will be made available to the scientific community via BioData Catalyst. This will advance understanding of the genetic architecture of gene expression underlying a wide range of diseases.
Year: 2022 PMID: 35547845 PMCID: PMC9094109 DOI: 10.1101/2022.04.13.22273841
Source DB: PubMed Journal: medRxiv
Participant characteristics
| Variable, mean (SD) or % | Offspring cohort (n=720) | Third Generation cohort (n=1902) |
|---|---|---|
| Women | 58.6 | 52.3 |
| Age, years | 71.3 (8.2) | 46.5 (8.7) |
| BMI, kg/m2 | 28.5 (5.6) | 27.7 (5.6) |
| SBP, mmHg | 126.8 (16.7) | 115.9 (14.1) |
| DBP, mmHg | 73.6 (9.8) | 74.2 (9.4) |
| Fasting glucose, mg/dL | 105.7 (19.8) | 96.5 (19.7) |
| TC, mg/dL | 189.2 (36.4) | 186.3 (33.2) |
| HDL, mg/dL | 57.8 (18.7) | 60.0 (17.8) |
| Trig, mg/dL | 119.1 (73.5) | 110.4 (70.9) |
| LDL, mg/dL | 108.0 (31.9) | 104.2 (29.8) |
| Current smoking | 8.2 | 10.8 |
| Hypertension | 48.1 | 33.6 |
| Diabetes | 12.6 | 5.7 |
| HRX | 44.1 | 19.9 |
| LIPIDRX | 41.4 | 29.0 |
| DMRX | 9.1 | 5.2 |
BMI, body mass index; SBP/DBP, systolic/diastolic blood pressure; TC, total cholesterol; HDL, high density lipoprotein; Trig, triglyceride; LDL, low-density lipoprotein; HRX, treatment for hypertension; LIPIDRX, treatment for high lipid level; DMRX, treatment for diabetes.
Cis- and trans-eQTL in the Framingham Heart Study
| | eQTLs at gene level | lncRNA eQTLs |
|---|---|---|
| Cis-eQTL-eGene | | |
| Number of pairs | 6,778,286 | 442,379 |
| Trans-eQTL-eGene | | |
| Number of pairs | 1,469,754 | 117,862 |
Top 25 cis-eQTLs (p < 5e-8)
| Gene Symbol | SNP | Chr | SNP position | Gene start position | R2 | Beta | log10P | OA | EA | EAF | Type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PPIE | rs7513045 | 1 | 39738494 | 39692182 | 0.84 | 11.40 | −1029.25 | G | T | 0.36 | protein_coding |
| CCDC163 | rs4660860 | 1 | 45480561 | 45493866 | 0.90 | −3.10 | −1286.89 | T | A | 0.30 | protein_coding |
| CYP26B1 | rs13430651 | 2 | 72215195 | 72129238 | 0.81 | 1.98 | −920.005 | G | A | 0.15 | protein_coding |
| MAP3K2-DT | rs2276683 | 2 | 127389186 | 127389130 | 0.88 | −1.61 | −1176.37 | G | C | 0.23 | lincRNA |
| SLC12A7 | rs35188965 | 5 | 1104823 | 1050384 | 0.81 | −29.87 | −915.459 | C | T | 0.44 | protein_coding |
| ENC1 | rs112772452 | 5 | 74631048 | 74627406 | 0.83 | 14.53 | −986.798 | CA | C | 0.11 | protein_coding |
| ERAP2 | rs2910686 | 5 | 96916885 | 96875939 | 0.85 | 36.98 | −1044.91 | T | C | 0.43 | protein_coding |
| BTNL3 | rs72494581 | 5 | 181003797 | 180988845 | 0.82 | 13.52 | −950.405 | T | C | 0.30 | protein_coding |
| HLA-DRB5 | rs68176300 | 6 | 32558713 | 32517353 | 0.83 | −178.13 | −1003.76 | T | G | 0.15 | protein_coding |
| AL512625.3 | rs1845054 | 9 | 62906092 | 62856999 | 0.83 | −1.19 | −993.655 | T | C | 0.13 | lincRNA |
| CUTALP | rs13299616 | 9 | 120832525 | 120824828 | 0.86 | −23.25 | −1092.88 | T | C | 0.40 | transcribed_unitary_pseudogene |
| LDHC | rs201993031 | 11 | 18412985 | 18412318 | 0.82 | 0.16 | −946.833 | CCCTTCCTT | C | 0.12 | protein_coding |
| ACCS | rs2074038 | 11 | 44066439 | 44065925 | 0.83 | 16.69 | −997.26 | G | T | 0.11 | protein_coding |
| FADS2 | rs968567 | 11 | 61828092 | 61792980 | 0.88 | 31.41 | −1186.37 | C | T | 0.17 | protein_coding |
| XRRA1 | rs10899051 | 11 | 74931506 | 74807739 | 0.91 | 5.38 | −1327.88 | G | A | 0.26 | protein_coding |
| B4GALNT3 | rs1056008 | 12 | 553672 | 460364 | 0.85 | 6.71 | −1043.34 | T | C | 0.25 | protein_coding |
| DDX11 | rs3891006 | 12 | 31073506 | 31073860 | 0.86 | −13.25 | −1102.08 | A | G | 0.44 | protein_coding |
| RPS26 | rs1131017 | 12 | 56042145 | 56041351 | 0.81 | −134.34 | −929.902 | C | G | 0.39 | protein_coding |
| C17orf97 | rs7503725 | 17 | 410351 | 410325 | 0.85 | 1.89 | −1055.68 | G | T | 0.25 | protein_coding |
| AC126544.2 | rs2696531 | 17 | 46278268 | 45586452 | 0.86 | 1.04 | −1097.79 | C | A | 0.21 | lincRNA |
| SPATA20 | rs9890200 | 17 | 50547162 | 50543058 | 0.81 | −11.01 | −934.173 | A | C | 0.37 | protein_coding |
| CEACAMP3 | rs3745936 | 19 | 41586462 | 41599735 | 0.84 | 1.11 | −1040.05 | A | T | 0.22 | transcribed_unprocessed_pseudogene |
| PWP2 | rs2277806 | 21 | 44089769 | 44107373 | 0.87 | 3.16 | −1139.85 | A | C | 0.19 | protein_coding |
| GATD3A | rs3788104 | 21 | 44092213 | 44133610 | 0.86 | 4.25 | −1104.35 | G | A | 0.18 | protein_coding |
| FAM118A | rs576259663 | 22 | 45363712 | 45308968 | 0.86 | 43.45 | −1108.47 | T | TA | 0.12 | protein_coding |
EA, effect allele; OA, the other allele
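The log10P values in this table lie far below the smallest positive double-precision number (roughly 1e-308), so the corresponding p-values cannot be stored as ordinary floats without underflowing to zero. A minimal sketch of applying the significance thresholds on the log10 scale instead; the function name and the hard-coded rows are illustrative assumptions, not part of the study's pipeline:

```python
import math

# Thresholds stated in the abstract, expressed on the log10 scale.
CIS_LOG10_THRESHOLD = math.log10(5e-8)   # about -7.30
TRANS_LOG10_THRESHOLD = -12.0            # log10(1e-12)


def is_significant(log10_p: float, log10_threshold: float) -> bool:
    """Return True when p < threshold, compared on the log10 scale."""
    return log10_p < log10_threshold


# A few (gene, log10P) values copied from the cis-eQTL table above.
rows = [("PPIE", -1029.25), ("XRRA1", -1327.88), ("RPS26", -929.902)]

for gene, log10_p in rows:
    # Converting back to a plain p-value underflows to 0.0 and loses the ranking.
    naive_p = 10.0 ** log10_p
    print(gene, naive_p, is_significant(log10_p, CIS_LOG10_THRESHOLD))
```

Keeping results on the log10 scale also preserves the ordering of the strongest associations, which would otherwise all collapse to the same value.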
Figure 1. Variance in eGenes explained by significant cis-eQTLs in relation to the distance of significant cis-eQTLs to the transcription start site of the cis-gene.
Top 25 trans-eQTLs (p < 1e-12)
| Gene Symbol | SNP | Gene Chr | SNP Chr | SNP Pos | Gene Start Pos | R2 | Beta | t value | log10P | OA | EA | EAF | Gene type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EMBP1 | rs4549528 | 1 | 5 | 50372700 | 121519112 | 0.70 | 1.63 | 78.04 | −677.53 | T | C | 0.48 | transcribed_unprocessed_pseudogene |
| AL365357.1 | rs4841 | 1 | 5 | 150446963 | 178411616 | 0.64 | 2.79 | 67.64 | −570.81 | C | T | 0.25 | processed_pseudogene |
| AL591846.1 | rs13161099 | 1 | 5 | 150442799 | 206695837 | 0.62 | 1.87 | 65.67 | −549.988 | G | A | 0.25 | processed_pseudogene |
| AC004057.1 | rs1131017 | 4 | 12 | 56042145 | 113214046 | 0.61 | −0.42 | −63.03 | −521.823 | C | G | 0.39 | transcribed_processed_pseudogene |
| RPL10P9 | rs6655287 | 5 | X | 154396528 | 168616352 | 0.64 | 4.11 | 68.52 | −580.03 | A | G | 0.10 | processed_pseudogene |
| PSPHP1 | rs34945686 | 7 | 7 | 65809663 | 55764797 | 0.61 | 0.05 | 64.11 | −533.329 | C | G | 0.18 | unprocessed_pseudogene |
| AC104692.2 | rs6593279 | 7 | 7 | 55736277 | 152366763 | 0.60 | 0.05 | 62.97 | −521.195 | G | A | 0.20 | processed_pseudogene |
| RNF5P1 | rs8365 | 8 | 6 | 32180626 | 38600661 | 0.78 | 0.97 | 96.24 | −850.788 | G | C | 0.19 | processed_pseudogene |
| TUBB8 | rs28652789 | 10 | 16 | 33807 | 46892 | 0.61 | 0.32 | 63.35 | −525.289 | G | C | 0.25 | protein_coding |
| COX20P1 | rs10927332 | 10 | 1 | 244837362 | 68632371 | 0.62 | 0.10 | 64.57 | −538.221 | C | T | 0.19 | processed_pseudogene |
| EIF2S3B | rs16997659 | 12 | X | 24057745 | 10505602 | 0.81 | 0.99 | 106.39 | −939.701 | A | G | 0.17 | protein_coding |
| RPS2P5 | rs2286466 | 12 | 16 | 1964282 | 118246084 | 0.80 | 71.71 | 101.17 | −894.683 | A | G | 0.21 | processed_pseudogene |
| LINC00431 | rs41288614 | 13 | 13 | 112486035 | 110965704 | 0.70 | 0.20 | 76.92 | −666.36 | A | G | 0.15 | transcribed_unprocessed_pseudog |
| NPIPB15 | rs3927943 | 16 | 16 | 69977282 | 74377878 | 0.80 | 3.79 | 103.12 | −911.688 | T | A | 0.40 | protein_coding |
| TUBB8P7 | rs28652789 | 16 | 16 | 33807 | 90093154 | 0.75 | 0.51 | 88.54 | −779.687 | G | C | 0.25 | transcribed_unprocessed_pseudog |
| RPL13P12 | rs2280370 | 17 | 16 | 89561052 | 17383377 | 0.69 | 36.16 | 75.78 | −654.808 | T | G | 0.19 | processed_pseudogene |
| LRRC37A2 | rs56328224 | 17 | 17 | 45495053 | 46511511 | 0.80 | 5.91 | 101.76 | −899.821 | C | T | 0.24 | protein_coding |
| POLRMTP1 | rs14155 | 17 | 19 | 619021 | 62136972 | 0.69 | 0.62 | 75.32 | −650.176 | G | C | 0.50 | processed_pseudogene |
| TUBB8P12 | rs2562131 | 18 | 16 | 33887 | 47390 | 0.65 | 0.47 | 68.64 | −581.244 | C | A | 0.25 | protein_coding |
| AP001005.3 | rs28652789 | 18 | 16 | 33807 | 49815 | 0.61 | 0.15 | 64.25 | −534.859 | G | C | 0.25 | lincRNA |
| RPSAP58 | rs74987185 | 19 | 3 | 39414963 | 23827162 | 0.84 | 10.17 | 117.60 | −1031.88 | G | GCT | 0.31 | processed_pseudogene |
| GATD3B | rs2277806 | 21 | 21 | 44089769 | 5079294 | 0.74 | −3.83 | −84.78 | −743.85 | A | C | 0.19 | protein_coding |
| FP565260.1 | rs2277806 | 21 | 21 | 44089769 | 5130871 | 0.76 | −2.96 | −90.65 | −799.469 | A | C | 0.19 | protein_coding |
| SIRPAP1 | rs115287948 | 22 | 20 | 1915413 | 30542536 | 0.75 | 1.12 | 89.28 | −786.711 | G | A | 0.36 | processed_pseudogene |
| GPX1P1 | rs7643586 | X | 3 | 49394214 | 13378735 | 0.61 | 16.44 | 64.25 | −534.823 | C | G | 0.43 | processed_pseudogene |
EA, effect allele; OA, the other allele
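The R2 and t value columns can be cross-checked against each other. For a regression coefficient with residual degrees of freedom df, the implied partial R² is t² / (t² + df). The sketch below assumes df ≈ n − 2 with n = 2622 participants from the abstract; the study's actual models presumably include covariates, which would lower df slightly, so this is an approximation and not the authors' calculation. The function name is hypothetical.

```python
def r2_from_t(t_value: float, df: int) -> float:
    """Partial R^2 implied by a regression t statistic with `df` residual df."""
    t_squared = t_value * t_value
    return t_squared / (t_squared + df)


N_PARTICIPANTS = 2622             # sample size from the abstract
ASSUMED_DF = N_PARTICIPANTS - 2   # assumption: intercept and SNP only

# (gene, t value, reported R2) copied from the trans-eQTL table above.
rows = [
    ("EMBP1", 78.04, 0.70),
    ("RNF5P1", 96.24, 0.78),
    ("RPSAP58", 117.60, 0.84),
]

for gene, t_value, reported_r2 in rows:
    implied = r2_from_t(t_value, ASSUMED_DF)
    print(gene, round(implied, 2), reported_r2)
```

Under this assumption, the implied values agree with the reported R2 to two decimal places for these three rows.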
Figure 2. Cis-long noncoding RNA, MAP3K2-DT, and the lead cis-eQTL, rs2276683.
Top results in Mendelian randomization analyses
| Exposure | Chr | Gene type | Outcome | Beta | SE | P | N SNPs |
|---|---|---|---|---|---|---|---|
| | 1 | Protein coding | CHD | −0.084 | 0.0075 | 4.8E-29 | 7 |
| | 6 | Protein coding | CHD | −0.069 | 0.011 | 1.3E-09 | 5 |
| | 6 | miRNA | CHD | 1.72 | 0.28 | 2.0E-09 | 25 |
| | 10 | Protein coding | CHD | 0.0033 | 0.00039 | 2.9E-17 | 18 |
| | 12 | Protein coding | CHD | −0.078 | 0.013 | 4.7E-09 | 3 |
| | 5 | Protein coding | COVID-19 | 0.19 | 0.064 | 0.0025 | 4 |
| | 19 | Protein coding | COVID-19 | −0.044 | 0.017 | 0.0078 | 3 |
| | 6 | Protein coding | COVID-19 | 0.00099 | 0.00018 | 1.9E-08 | 35 |
| | 21 | Protein coding | COVID-19 | −0.023 | 0.0037 | 1.8E-06 | 11 |
| | 12 | Protein coding | COVID-19 | −0.0086 | 0.0022 | 1.6E-04 | 1 |
| | 12 | Protein coding | COVID-19 | 0.32 | 0.11 | 0.0029 | 13 |
| | 21 | Protein coding | COVID-19 | 0.011 | 0.0021 | 2.8E-08 | 3 |
| | 2 | Bidirectional promoter lncRNA | SBP | −5.60 | 0.55 | 2.3E-24 | 3 |
| | 3 | Protein coding | SBP | 0.092 | 0.0086 | 4.6E-27 | 4 |
| | 12 | Protein coding | SBP | −0.92 | 0.058 | 1.9E-58 | 3 |
| | 16 | Protein coding | SBP | −0.82 | 0.066 | 5.3E-35 | 21 |
| | 17 | Protein coding | SBP | −0.035 | 0.0030 | 1.5E-31 | 3 |
Beta, SE, and p-values were obtained by the inverse variance weighted (IVW) MR method.
Heterogeneity was observed in MR analyses. Sensitivity analyses with median-based and mode-based MR methods are reported in Supplemental Table 9.
MR analysis was performed at the gene level. At the splice variation level (rs10774671), the MR p = 4E-06.
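The inverse variance weighted method named in the table notes can be sketched from per-SNP summary statistics. The version below is a fixed-effect IVW with first-order weights, which is one common formulation; the record does not say which variant or software the authors used, so treat this as an illustrative assumption. The function names and the three-SNP input are hypothetical and not taken from the study.

```python
import math
from typing import Sequence, Tuple


def two_sided_p(beta: float, se: float) -> float:
    """Two-sided normal p-value for the Wald statistic beta / se."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))


def ivw_mr(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float, float]:
    """Fixed-effect IVW estimate (beta, SE, p) from per-SNP summary statistics.

    Each SNP j contributes the ratio beta_outcome[j] / beta_exposure[j],
    weighted by beta_exposure[j]**2 / se_outcome[j]**2 (first-order weights).
    """
    weights = [bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return beta, se, two_sided_p(beta, se)


# Hypothetical summary statistics for three SNPs (not from the study).
bx = [0.5, 0.4, 0.6]        # SNP effects on gene expression (eQTL betas)
by = [0.05, 0.04, 0.06]     # SNP effects on the outcome (GWAS betas)
se_by = [0.01, 0.01, 0.01]  # standard errors of the outcome effects

beta, se, p = ivw_mr(bx, by, se_by)
print(beta, se, p)
```

As a rough consistency check, applying `two_sided_p` to the rounded Beta and SE of the first table row (−0.084 and 0.0075) gives a p-value on the order of 1e-29, in line with the reported 4.8E-29.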