Aabida Saferali1,2, Jeong H Yun1,2,3, Margaret M Parker1,2, Phuwanat Sakornsakolpat1, Robert P Chase1, Andrew Lamb1, Brian D Hobbs1,2,3, Marike H Boezen4,5, Xiangpeng Dai2,6, Kim de Jong4,5, Terri H Beaty7, Wenyi Wei2,6, Xiaobo Zhou1, Edwin K Silverman1,2,3, Michael H Cho1,2,3, Peter J Castaldi1,2,8, Craig P Hersh1,2,3.
Abstract
While many disease-associated single nucleotide polymorphisms (SNPs) are associated with gene expression (expression quantitative trait loci, eQTLs), a large proportion of complex disease genome-wide association study (GWAS) variants are of unknown function. Some of these SNPs may contribute to disease by regulating gene splicing. Here, we investigate whether SNPs that are associated with alternative splicing (splice QTL or sQTL) can identify novel functions for existing GWAS variants or suggest new associated variants in chronic obstructive pulmonary disease (COPD). RNA sequencing was performed on whole blood from 376 subjects from the COPDGene Study. Using linear models, we identified 561,060 unique sQTL SNPs associated with 30,333 splice sites corresponding to 6,419 unique genes. Similarly, 708,928 unique eQTL SNPs involving 15,913 genes were detected at 10% FDR. While there is overlap between sQTLs and eQTLs, 55.3% of sQTLs are not eQTLs. Co-localization analysis revealed that 7 out of 21 loci associated with COPD (p < 1x10−6) in a published GWAS have at least one shared causal variant between the GWAS and sQTL studies. Among the genes identified to have splice sites associated with top GWAS SNPs was FBXO38, in which a novel exon was discovered to be protective against COPD. Importantly, the sQTL in this locus was validated by qPCR in both blood and lung tissue, demonstrating that splice variants relevant to lung tissue can be identified in blood. Other identified genes included CDK11A and SULT1A2. Overall, these data indicate that analysis of alternative splicing can provide novel insights into disease mechanisms. In particular, we demonstrated that SNPs in a known COPD GWAS locus on chromosome 5q32 influence alternative splicing in the gene FBXO38.
Year: 2019 PMID: 31269066 PMCID: PMC6634423 DOI: 10.1371/journal.pgen.1008229
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
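Both QTL analyses are reported at a 10% FDR. This record does not say which FDR procedure was used, so the following is only a minimal sketch of one common choice, the Benjamini-Hochberg adjustment. The function name and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five SNP/splice-site tests (assumption, not study data).
p_values = [0.001, 0.04, 0.03, 0.5, 0.2]
q_values = benjamini_hochberg(p_values)

# Keep associations whose adjusted p-value is at or below the 10% FDR threshold.
significant = [q <= 0.10 for q in q_values]
```

In a real analysis the adjustment would run over all tested SNP/feature pairs, and the set of tests included strongly affects which associations pass.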
COPD genome-wide association study loci containing cis-eQTLs and cis-sQTLs at 10% FDR.
| Locus | Window | Minimum GWAS p-value | sQTL Top SNP | sQTL FDR | sQTL Top Cluster | eQTL Top SNP | eQTL FDR | eQTL Top Gene |
|---|---|---|---|---|---|---|---|---|
| 15q25.1 | 15:76100000- | 9.54x10−24 | 15:78826180 | 0.0012 | 15:78834561-78836532 | 15:78857986 | 0.0012 | |
| 4q31.21 | 4:141700000- | 4.05x10−22 | 4:145257681 | 0.028 | 4:145041741-145059314 | 4:145270867 | 0.066 | |
| 5q32 | 5:143100000- | 2.59x10−14 | 5:147790860 | 1.56 x 10−7 | 5:147790328-147793699 | |||
| 4q22.1 | 4:88200000- | 1.32x10−12 | 4:89900452 | 0.0069 | 4:89319596-89326028 | 4:89930392 | 0.013 | |
| 14q32.12 | 14:90500000- | 2.91x10−12 | ||||||
| 5q33.3 | 5:155600000- | 1.09x10−10 | 5:156928008 | 0.0043 | 5:156957891-156964921 | |||
| 3p24.2 | 3:23800000- | 1.80x10−9 | ||||||
| 2q36.3 | 2:225800000- | 7.19x10−9 | ||||||
| 6p24.3 | 6:7000000- | 1.78x10−8 | ||||||
| 4q24 | 4:102500000- | 2.14x10−8 | 4:106622190 | 4.54 x 10−6 | ||||
| 8q22.3 | 8:101600000- | 3.29x10−8 | ||||||
| 1q41 | 1:212100000- | 4.14x10−8 | ||||||
| 4p15.1 | 4:27900000- | 5.39x10−8 | ||||||
| 20q11.21 | 20:28400000- | 7.84x10−8 | ||||||
| 16p11.2 | 16:27600000- | 8.12x10−8 | 16:28539848 | 1.888 x 10−18 | 16:28603764-28606688 | 16:28539848 | 2.18x10−30 | |
| 3q21.3 | 3:127700000- | 1.58x10−7 | ||||||
| 6q24.1 | 6:139100000- | 2.79x10−7 | 6:142655490 | 0.0057 | ||||
| 1p36.32 | 1:2300000- | 5.33x10−7 | 1:2316315 | 0.0250 | 1:1643839-1647785 | |||
| 18q21.33 | 18:57100000- | 5.79x10−7 | ||||||
| 6p23 | 6:13500000- | 8.10x10−7 | ||||||
| 6q16.3 | 6:99900000- | 9.19x10−7 | ||||||
1 A cluster is defined as a set of overlapping spliced junctions or introns. One or more junctions/introns within a cluster may be associated with genotype to give an sQTL.
2 SNPs with positions located within a given window were grouped into the corresponding locus.
*Top Gene or Top Cluster is the gene or cluster with the minimum p-value. Many SNPs in this analysis were associated with more than one gene or cluster. For results from all genes/clusters, see S6 and S7 Tables.
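The cluster definition in footnote 1 can be illustrated with a short sketch. It assumes that "overlapping" means overlap in genomic coordinates on a single chromosome; the record does not say whether the original pipeline instead required shared splice sites. The function name and the coordinates are hypothetical.

```python
def cluster_introns(introns):
    """Group (start, end) introns on one chromosome into clusters of overlapping intervals."""
    clusters = []
    for start, end in sorted(introns):
        # Open a new cluster when this intron starts after the current cluster ends.
        if not clusters or start > clusters[-1]["end"]:
            clusters.append({"start": start, "end": end, "introns": [(start, end)]})
        else:
            # Otherwise the intron overlaps the current cluster: add it and extend the span.
            clusters[-1]["introns"].append((start, end))
            clusters[-1]["end"] = max(clusters[-1]["end"], end)
    return clusters


# Hypothetical intron coordinates (assumption, not study data).
introns = [(100, 200), (150, 300), (400, 500), (250, 350), (450, 480)]
clusters = cluster_introns(introns)

for cluster in clusters:
    print(cluster["start"], cluster["end"], len(cluster["introns"]))
```

Each resulting cluster corresponds to one "Top Cluster" style interval, within which individual junctions or introns would be tested against genotype.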
Functional annotation of cis expression quantitative trait loci (eQTLs) and splice QTLs (10% FDR), based on RefSeq annotation.
| Location | Cis-eQTL number | Percentage of total eQTLs | Cis-sQTL number | Percentage of total sQTLs | p-value |
|---|---|---|---|---|---|
| Downstream | 9,110 | 1.29 | 6,843 | 1.22 | 1.05x10−3 |
| Exonic | 9,901 | 1.40 | 7,867 | 1.40 | 0.80 |
| Exonic; splicing | 3 | 4.23x10−4 | 7 | 1.24x10−3 | 0.18 |
| Intergenic | 293,661 | 41.42 | 212,616 | 37.89 | <2.20x10−16 |
| Intronic | 323,666 | 45.66 | 279,762 | 49.86 | <2.20x10−16 |
| ncRNA_exonic | 4,986 | 0.70 | 3,670 | 0.65 | 8.51x10−4 |
| ncRNA_exonic; splicing | 2 | 2.82x10−4 | 1 | 1.78x10−4 | 1 |
| ncRNA_intronic | 43,617 | 6.15 | 32,152 | 5.73 | <2.20x10−16 |
| ncRNA_splicing | 27 | 3.81x10−3 | 20 | 3.56x10−3 | 0.94 |
| Splicing | 74 | 0.01 | 68 | 0.01 | 0.42 |
| Upstream | 8,841 | 1.25 | 6,218 | 1.11 | <7.52x10−13 |
| Upstream; downstream | 429 | 0.06 | 336 | 0.06 | 0.92 |
| UTR3 | 11,563 | 1.63 | 9,111 | 1.62 | 0.76 |
| UTR5 | 3,039 | 0.43 | 2,383 | 0.42 | 0.75 |
| UTR5, UTR3 | 9 | 1.27x10−3 | 6 | 1.07x10−3 | 0.95 |
| Total | 708,928 | 561,060 | |||
* ncRNA: non-coding RNA; UTR: untranslated region.
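The percentage columns follow directly from each count and its column total, and the p-value column compares the eQTL and sQTL proportions for each annotation class. The record does not state which test produced those p-values; the sketch below uses a pooled two-proportion z-test as a stand-in, so its results need not match the table exactly. The function names are illustrative; the counts are taken from the table rows.

```python
import math


def percent(count, total):
    """Share of `count` in `total`, expressed as a percentage."""
    return 100.0 * count / total


def two_proportion_p_value(count_a, total_a, count_b, total_b):
    """Two-sided p-value from a pooled two-proportion z-test (assumed test, see lead-in)."""
    p_a = count_a / total_a
    p_b = count_b / total_b
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


TOTAL_EQTL = 708_928  # "Total" row, eQTL column
TOTAL_SQTL = 561_060  # "Total" row, sQTL column

# Intergenic row: percentage of each QTL set.
intergenic_eqtl_pct = percent(293_661, TOTAL_EQTL)
intergenic_sqtl_pct = percent(212_616, TOTAL_SQTL)

# Downstream row: do the two proportions differ?
downstream_p = two_proportion_p_value(9_110, TOTAL_EQTL, 6_843, TOTAL_SQTL)
```

With samples this large, even small differences in proportion give very small p-values, which is why classes with nearly identical percentages can still differ significantly.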
Colocalization of sQTLs and eQTLs with COPD case-control GWAS data.
| Locus | 500kb window tested | GWAS window | Top colocalized sQTL SNP | sQTL colocalization posterior probability | Top colocalized eQTL SNP | eQTL colocalization posterior probability |
|---|---|---|---|---|---|---|
| 15q25.1 | 78,576,180 – | 78,712,101 – | rs931794 | 0.2303 | rs8034191 | 0.1897 |
| 4q31.21 | 145,007,681 – | 145,227,600 – | rs13141641 | 0.2232 | rs6857262 | 0.0840 |
| 5q32 | 147,540,860 – | 147,685,952- | rs7730971 | 0.8448 | NA | NA |
| 4q22.1 | 89,650,452 – | 89,750,361 – | rs7671261 | 0.0845 | rs2869966 | 0.0918 |
| 5q33.3 | 156,678,008 – | 156,824,546 – | rs56168343 | 0.1176 | NA | NA |
| 16p11.2 | 28,289,848 – | 28,513,403 – | rs750155 | 0.2644 | rs79039694 | 0.2096 |
| 1p36.32 | 2,066,315 - | 2,315,680- | rs2843128 | 0.1553 | NA | NA |
1 Colocalization posterior probability from eCAVIAR.
2 NA indicates that the locus does not contain SNPs that are significant eQTLs at the 10% FDR.
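For orientation, eCAVIAR's colocalization posterior probability for a variant is the probability that the variant is causal in both studies, obtained as the product of its causal posterior in the GWAS and its causal posterior in the QTL study. The sketch below shows only that final step, under the assumption that per-study posteriors are already available. The function name, SNP labels, and probabilities are hypothetical placeholders and do not come from the study or from eCAVIAR's own interface.

```python
def colocalization_posteriors(gwas_posteriors, qtl_posteriors):
    """Per-SNP colocalization posterior: product of the two per-study causal posteriors."""
    # Only SNPs present in both studies can be scored.
    shared = gwas_posteriors.keys() & qtl_posteriors.keys()
    return {snp: gwas_posteriors[snp] * qtl_posteriors[snp] for snp in shared}


# Hypothetical per-study causal posteriors (assumption, not study data).
gwas = {"snp_a": 0.60, "snp_b": 0.30, "snp_c": 0.10}
sqtl = {"snp_a": 0.50, "snp_b": 0.45, "snp_d": 0.05}

scores = colocalization_posteriors(gwas, sqtl)

# The "top colocalized SNP" for a locus is the one with the highest score.
top_snp = max(scores, key=scores.get)
```

Because the score is a product, a variant must be well supported in both studies to rank highly; strong evidence in only one study is not enough.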
Fig 1. Colocalization analysis of sQTLs and COPD GWAS data at 5q32.
a) Locus zoom plot of the GWAS association at 5q32. The secondary GWAS association at this locus consists of two SNPs, rs7730971 and rs4597955. The primary association is in moderate LD (R² = 0.4–0.6) with this secondary association. b) Locus zoom plot of sQTL data for the association between FBXO38 splicing and genotype. SNPs associated with FBXO38 splicing are located in HTR4 and FBXO38. c) Visualization of the FBXO38 splice site associated with rs7730971 genotype. The y axis shows read depth and the x axis shows genomic position on chromosome 5. Arched lines indicate junction-spanning reads. Fewer junctional reads support the presence of the cryptic exon in the CC genotype (8%) than in the GG genotype (13%). d) Boxplot of qPCR results showing the fold change of the isoform containing the cryptic exon relative to the CC genotype in whole blood (n = 30, selected based on expression levels) and lung tissue (n = 90, selected based on genotype). The center line indicates the median, the box captures the interquartile range, and the whiskers show the range excluding outliers. P-values are from linear regression, which was performed to test for an additive relationship between genotype and splicing ratio.
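The additive test mentioned in panel d can be sketched as a simple least-squares fit of splicing ratio on allele dosage, with genotypes coded 0, 1, 2. This is an illustration under assumptions: the coding (number of G alleles), the function name, and all data values are hypothetical, and any covariates used in the study are omitted. The p-value would come from a t-test on the slope, which is left out to keep the sketch dependency-free.

```python
def additive_regression(dosages, ratios):
    """Least-squares slope and intercept of splicing ratio on allele dosage (0, 1, 2)."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(ratios) / n
    # Sum of squares of dosage and cross-product with the splicing ratio.
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: dosage = number of G alleles (CC = 0, CG = 1, GG = 2).
dosages = [0, 0, 1, 1, 2, 2]
ratios = [0.07, 0.09, 0.10, 0.11, 0.12, 0.14]

slope, intercept = additive_regression(dosages, ratios)
print(f"change in splicing ratio per G allele: {slope:.3f}")
```

Under the additive model the slope is the expected change in splicing ratio for each additional copy of the allele, so heterozygotes are assumed to fall midway between the two homozygous groups.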
Fig 2. Colocalization analysis of sQTLs and eQTLs with GWAS data at 16p11.2.
a) Locus zoom plot of the GWAS association at 16p11.2. b) Locus zoom plot of sQTL data for the association between SULT1A2 splicing and genotype. c) Locus zoom plot of eQTL data for the association between SULT1A2 whole-gene expression and genotype. d) Visualization of the SULT1A2 splice site associated with rs4788084 genotype. The y axis shows read depth and the x axis shows genomic position on chromosome 16. Arched lines indicate junction-spanning reads.
Fig 3. Colocalization analysis of sQTLs and COPD GWAS data at 1p36.32.
a) Locus zoom plot of the GWAS association at 1p36.32. b) Locus zoom plot of sQTL data for the association between CDK11A splicing and genotype. c) Visualization of the CDK11A splice site associated with rs2843128. The y axis shows read depth and the x axis shows genomic position on chromosome 1. Arched lines indicate junction-spanning reads.