Xinyi Jiang, Nefeli Dellepiane, Erola Pairo-Castineira, Thibaud Boutin, Yatendra Kumar, Wendy A Bickmore, Veronique Vitart.
Abstract
Corneal resistance factor (CRF) is altered during the progression of corneal diseases. Genome-wide association studies (GWAS) have indicated a potential overlap between the genetics of CRF and of disease. Here, we characterise 135 CRF loci following GWAS in 76,029 UK Biobank participants. Enrichment of extracellular matrix gene sets, genetic correlation with corneal thickness (70% (SE = 5%)), and reported keratoconus risk variants at 13 loci all support relevance to corneal stroma biology. Fine-mapping identifies a subset of 55 highly likely causal variants, 91% of which are non-coding. Genomic feature enrichments, using all associated variants, also indicate a prominent regulatory causal role. We newly established open chromatin landscapes in two widely used immortalised human cornea cell lines using ATAC-seq. Variants associated with CRF were significantly enriched in regulatory regions from the corneal stroma-derived cell line, and enrichment increases to over 5-fold for variants prioritised by fine-mapping, including at the GAS7, SMAD3 and COL6A1 loci. Our analysis generates many hypotheses for future functional validation of aetiological mechanisms.
Year: 2020 PMID: 33311554 PMCID: PMC7732848 DOI: 10.1038/s42003-020-01497-w
Source DB: PubMed Journal: Commun Biol ISSN: 2399-3642
Fig. 1 Manhattan plot of the corneal resistance factor GWAS in UK Biobank white-British participants (n = 76,029).
The genome-wide significance threshold (P value = 5 × 10⁻⁸) is displayed by the horizontal dotted black line. Only variants with P value < 0.001 are represented. Genomic risk loci overlapping known central cornea thickness and Fuchs corneal dystrophy loci are indicated in blue and orange, respectively. The test statistics inflation factor λgc, 1.147, is mostly due to polygenicity (ratio (LDscore intercept − 1)/(mean(χ²) − 1) = 14.7%; LDscore intercept = 1.057 (SE = 0.007)).
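The ratio quoted in the Fig. 1 caption can be worked through numerically. The sketch below is only an illustration under stated assumptions: the mean χ² statistic is not reported in the caption, so it is back-calculated from the two figures that are given (intercept 1.057, ratio 14.7%), and reading 1 − ratio as the share of inflation attributable to polygenicity is the usual LD score regression interpretation rather than something the caption states. The function names are hypothetical.

```python
def ldsc_ratio(intercept: float, mean_chi2: float) -> float:
    """Ratio (intercept - 1) / (mean(chi2) - 1), as written in the caption."""
    return (intercept - 1.0) / (mean_chi2 - 1.0)


def implied_mean_chi2(intercept: float, ratio: float) -> float:
    """Invert the ratio to recover the mean chi2 it implies (not reported in the caption)."""
    return 1.0 + (intercept - 1.0) / ratio


INTERCEPT = 1.057  # LD score regression intercept from the caption
RATIO = 0.147      # 14.7% from the caption

mean_chi2 = implied_mean_chi2(INTERCEPT, RATIO)
polygenic_share = 1.0 - RATIO  # assumed interpretation: remainder is polygenic signal

print(f"implied mean chi2: {mean_chi2:.3f}")
print(f"share of inflation attributed to polygenicity: {polygenic_share:.1%}")
```

Under these assumptions, roughly 85% of the inflation above 1 would be polygenic signal, which is consistent with the caption's statement that λgc is "mostly due to polygenicity".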
Fig. 2 Colocalisation with GTEx v7 eQTL signals at prioritized corneal resistance factor loci.
Genes: candidate target genes based on GTEx eQTL look-up for prioritized candidate causal variants at CRF loci (grey box). Each dot denotes significant association signals for both CRF and GTEx v7 GWAS, colour-coded as independent (grey) or identical (blue if CRF increasing allele increases gene expression, red if it decreases it). The dot size is proportional to the probability of a colocalisation (pp4); pp4 > 0.75 denotes strong evidence in favour of colocalisation.
Six noncoding highly likely causal variants for CRF associations colocalising with a single gene expression signal.
| Prioritised and eQTL | Locus | MAF | PIP | log10(BF) | COLOC pp | Implicated gene | Protein function |
|---|---|---|---|---|---|---|---|
| rs759509948 | 101 | 0.360 | 0.495 | 3.324 | 0.916 | | Transcription regulator, histone acetyl-lysine reader with zinc and tandem PHD fingers. Component of tissue and stage specific chromatin remodelling complexes with roles in neuro (nBAF[ |
| rs12913547 | 106 | 0.209 | 0.415 | 3.117 | 0.539 | | Transcription factor involved in TGFβ signal transduction. Several |
| rs62257617ᵃ | Ext3 | 0.052 | 0.297 | 3.320 | 0.959 | | Shares high homology and UDP-glucose:glycoprotein glucosyltransferase potential with GLT8D2[ |
| rs9913911 | 114 | 0.375 | 0.842 | 3.532 | 0.998 | | Growth-arrest specific 7 gene expression was first identified in serum-starved cultured murine fibroblasts[ |
| rs141144358 | 116 | 0.344 | 0.978 | 4.411 | 0.608 | | Component of a complex responsible for histone H4 acetylation at lysine 16, which by promoting open chromatin structure impacts on gene transcriptional regulation[ |
| rs10010325ᵇ | Ext4 | 0.486 | 0.489 | 3.195 | 0.68 | | Methyl cytosine dioxygenase 2 which drives DNA demethylation to cytosine hydroxymethylation, e.g. at enhancers during cellular reprogramming[ |
Gene expression data used for colocalisation tests correspond to all cells/tissues represented in GTEx v7.
MAF: minor allele frequency; COLOC pp: posterior probability that the signals are identical; PIP: posterior probability of inclusion in the list of candidate variants causing the association signal; log10(BF): log10 of the Bayes factor quantifying how likely the variant is to be causal rather than non-causal.
ᵃ Other variant in same credible set with high log10(BF), rs186144945, is not a significant eQTL.
ᵇ Other variant in same credible set with high log10(BF), rs2903385, is also an eQTL.
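As a small worked reading of the table above, the sketch below applies the pp4 > 0.75 cut-off mentioned in the Fig. 2 caption to the tabulated COLOC pp values and converts a log10(BF) back to a plain Bayes factor. It assumes that the table's "COLOC pp" column is the same quantity as the pp4 of the figure caption, which the source does not state explicitly; the constant and function names are hypothetical.

```python
# (variant, log10(BF), COLOC pp) copied from the table above
ROWS = [
    ("rs759509948", 3.324, 0.916),
    ("rs12913547", 3.117, 0.539),
    ("rs62257617", 3.320, 0.959),
    ("rs9913911", 3.532, 0.998),
    ("rs141144358", 4.411, 0.608),
    ("rs10010325", 3.195, 0.68),
]

PP_STRONG = 0.75  # threshold quoted in the Fig. 2 caption for pp4


def strong_colocalisations(rows, threshold=PP_STRONG):
    """Return variants whose colocalisation posterior exceeds the threshold."""
    return [variant for variant, _log10_bf, pp in rows if pp > threshold]


def bayes_factor(log10_bf: float) -> float:
    """Convert a log10 Bayes factor to the Bayes factor itself."""
    return 10.0 ** log10_bf


strong = strong_colocalisations(ROWS)
print(strong)
print(f"BF for rs759509948: {bayes_factor(3.324):.0f}")
```

With the assumed equivalence, three of the six variants pass the strong-evidence threshold; the others remain candidate colocalisations with weaker support.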
Prioritised CRF causal candidate variants with potential to disrupt a REST or RXRA binding site.
| TF | Variant | Ref/Alt | Altered motif | Motif sourceᵃ | Stronger bind (P value) | Stronger bind CRF association | MM | Annot | Locus |
|---|---|---|---|---|---|---|---|---|---|
| REST | rs3132302 | A/G | ggcggagtgagcAg | ENCODE REST_disc8 | G (3.4 × 10⁻⁴) | Decreasing | 0.9315 | intergenic | 68 |
| | rs58933977 | T/G | Tgtgcacacaatgg | ENCODE REST_disc8 | G (1.7 × 10⁻⁴) | Decreasing | 0.9472 | intergenic | 102 |
| | rs7635832 | T/G | ctcagcacatttTtt | ENCODE REST_disc1 | G (4.8 × 10⁻⁴) | Decreasing | 0.7364 | intronic | 31 |
| | rs79126727 | G/A | Ggccctgt | ENCODE REST_disc6 | A (6.1 × 10⁻⁵) | Decreasing | 0.9297 | intronic | 135 |
| | rs182804464 | C/G | cttctCctcagcttgctgt | ENCODE REST_known3 | G (4.3 × 10⁻⁴) | Decreasing | 0.6829 | intronic | 135 |
| | rs74948688ᵃ | C/T | ccatCacacccagc | ENCODE REST_disc9 | C (2.3 × 10⁻⁴)ᵃ | Increasing | 0.9108 | intronic and 3’ | 96 |
| | rs2875238 | T/C | gccactggttcTggt | ENCODE REST_disc1 | T (2 × 10⁻⁴) | Increasing | 0.7685 | intronic and 5’ | 84 |
| RXRA | rs2035835 | G/C | cGccatct | ENCODE RXRA_disc5 | G (3 × 10⁻⁵) | Decreasing | 0.9832 | intronic | 121 |
| | rs34869ᵃ | G/C | aagggtcggaggaGatg | ENCODE RXRA_known6 | C (1 × 10⁻⁴)ᵃ | Increasing | 0.7247 | 5’ | 45 |
| | rs10868299 | C/A | gggtaCcttagagacca | ENCODE RXRA_known8 | A (1.2 × 10⁻⁴) | Increasing | 0.8288 | intergenic | 65 |
| | rs74948688ᵃ | C/T | tttccatCac | ENCODE RXRA_disc4 | T (4.1 × 10⁻⁴)ᵃ | Decreasing | 0.9043 | intronic and 3’ | 96 |
| | rs143038218 | T/A | cgTgagtaaa | ENCODE RXRA_disc3 | T (3 × 10⁻⁴) | Increasing | 0.9395 | intergenic | 1 |
| | rs6081765 | A/G | caAgttca | ENCODE RXRA_known4 | A (1.7 × 10⁻⁴) | Increasing | 0.98899 | intergenic | 130 |
| | rs7863424ᵃ | G/A | agagccagagGgga | SwissRegulon | G (2.3 × 10⁻⁴)ᵃ | Decreasing | 0.7977 | intronic | 63 |
In silico prediction of significant differential binding for reference (Ref) and alternate (Alt) allele at fine-mapped prioritised SNPs was computed using transcription factor motif weight matrices from various sources (Motif Source) using MotifBreakR.
MM: motif match as percentage of perfect match.
ᵃ Denotes a corresponding transcription factor occupancy peak in the ENCODE 3 ChIP-seq data survey of 338 factors in 130 cell types.
Cell-specific open chromatin regions enrichment analyses for CRF-associated variants.
| ATAC-seq annotation | Tag variants | OR | CI95_L | CI95_U | P value | NAnnotThr | NAnnot | NThr | N |
|---|---|---|---|---|---|---|---|---|---|
| Cranial neural crest cells CNCC_donor2 | GWAS | 1.45 | 1.14 | 1.83 | 2.10E-03 | 77 | 53,001 | 3,692 | 4,428,625 |
| | Likely causal BF ≥ 3 | 0.98 | 0.13 | 7.61 | 9.82E-01 | 1 | 53,812 | 26 | 4,429,172 |
| Adult skin fibroblasts DermF | GWAS | 1.57 | 1.37 | 1.80 | | 251 | 172,059 | 3,692 | 4,428,625 |
| | Likely causal BF ≥ 3 | 3.36 | 1.29 | 8.79 | 1.35E-02 | 7 | 173,842 | 26 | 4,429,172 |
| Neonate dermal fibroblasts nDF | GWAS | 1.54 | 1.34 | 1.78 | | 236 | 160,782 | 3,692 | 4,428,625 |
| | Likely causal BF ≥ 3 | 2.24 | 0.78 | 6.47 | 1.35E-01 | 5 | 162,447 | 26 | 4,429,172 |
| Primary cornea epithelium CEC | GWAS | 1.42 | 1.21 | 1.66 | | 194 | 143,339 | 3,692 | 4,428,625 |
| | Likely causal BF ≥ 3 | 1.91 | 0.61 | 6.01 | 2.67E-01 | 4 | 145,008 | 26 | 4,429,172 |
| Cornea epithelium cell line hTCEpi | GWAS | 1.07 | 0.90 | 1.27 | 4.22E-01 | 155 | 144,979 | 3,692 | 4,428,625 |
| | Likely causal BF ≥ 3 | 0.71 | 0.16 | 3.22 | 6.59E-01 | 2 | 146,425 | 26 | 4,429,172 |
| hTCEpi-specificᵃ | GWAS | 0.67 | 0.47 | 0.95 | 2.65E-02 | 34 | 49,520 | 3,692 | 4,428,625 |
| | Likely causal BF ≥ 3 | 0.997 | 0.13 | 7.76 | 9.97E-01 | 1 | 50,324 | 26 | 4,429,172 |
| Cornea keratocyte cell line hTK | GWAS | 1.51 | 1.31 | 1.73 | | 243 | 168,314 | 3,692 | 4,428,625 |
| | Likely causal BF ≥ 3 | 2.85 | 1.05 | 7.73 | 3.92E-02 | 6 | 170,143 | 26 | 4,429,172 |
| hTK-specificᵃ | GWAS | 1.82 | 1.47 | 2.25 | | 107 | 60,099 | 3,692 | 4,428,625 |
| | Likely causal BF ≥ 3 | 5.68 | 1.81 | 17.80 | 2.88E-03 | 4 | 61,152 | 26 | 4,429,172 |
| Myelogenous leukemia cell line K562 | GWAS | 1.39 | 1.09 | 1.76 | 7.17E-03 | 74 | 52,657 | 3,692 | 4,428,625 |
| | Likely causal BF ≥ 3 | 0.90 | 0.11 | 7.05 | 9.20E-01 | 1 | 53,554 | 26 | 4,429,172 |
| Lymphoblastoid cell line LCL_NA18504 | GWAS | 1.25 | 1.01 | 1.54 | 3.93E-02 | 99 | 77,529 | 3,692 | 4,428,625 |
| | Likely causal BF ≥ 3 | 1.45 | 0.32 | 6.57 | 6.28E-01 | 2 | 78,638 | 26 | 4,429,172 |
| Retinal pigmented epithelium RPE_donor2ᵇ | GWAS | 1.40 | 1.13 | 1.73 | 1.90E-03 | 95 | 67,560 | 3,692 | 4,428,625 |
| | Likely causal BF ≥ 3 | 2.55 | 0.70 | 9.23 | 1.55E-01 | 3 | 68,519 | 26 | 4,429,172 |
Enrichment and significance were assessed using GARFIELD[33] with associated variants (tag variants) thresholded based on P value or Bayes factor for causality. Significant enrichments are highlighted in bold.
BF: log10 of the Bayes factor; NAnnotThr: the number of variants passing CRF association threshold criteria that are annotated to an ATAC-seq feature; NAnnot: the total number of variants annotated to the feature; NThr: the total number of independent variants passing CRF association threshold criteria; N: the total number of variants analysed following LD pruning.
ᵃ Results vary depending on the peak-calling algorithm used, as presented in Supplementary Note 3.
ᵇ Three independent datasets were tested: RPE_Cell line yielded nearly identical results, and RPE_Donor3 had a lower NAnnot and a less significant P value.
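The odds ratios, confidence limits and P values in the enrichment table can be cross-checked against one another. The sketch below is a consistency check only, not a description of how GARFIELD computes its statistics: it assumes a symmetric 95% Wald interval on the log-odds scale and a two-sided normal test, and the function name is hypothetical. The inputs are the "Likely causal" rows for hTK-specific and DermF peaks.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal


def wald_p_from_ci(odds_ratio: float, ci_lower: float, ci_upper: float) -> float:
    """Two-sided P value implied by an odds ratio and its 95% Wald interval.

    Assumes the interval is symmetric on the log-odds scale.
    """
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * Z95)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2.0))


p_htk_specific = wald_p_from_ci(5.68, 1.81, 17.80)  # tabulated P: 2.88E-03
p_dermf = wald_p_from_ci(3.36, 1.29, 8.79)          # tabulated P: 1.35E-02

print(f"hTK-specific, likely causal: {p_htk_specific:.2e}")
print(f"DermF, likely causal:        {p_dermf:.2e}")
```

Under these assumptions both recomputed values agree with the tabulated P values to within the rounding of the published odds ratios and confidence limits, which suggests the three columns are mutually consistent.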
Fig. 3 Regulatory genomic annotations for associated CRF-GWAS variants.
a Overlap of open chromatin regions mapped to GWAS variants (P-value threshold > 10⁻⁸ or tagging, r² > 0.8, variants) across four ATAC-seq datasets significantly enriched in CRF-GWAS variants. Cell origins: adult cornea epithelium primary tissue (CEC); skin fibroblasts (DermFb); neonate skin fibroblasts (nDF); immortalized corneal keratocytes (hTK). b Overlap details at two CRF loci. Variants named are those selected for enrichment analysis based on P-value threshold (tag variants); bold indicates that they belong to 95% credible sets of causal variants and either fall in an open chromatin region (OCR) themselves (*) or tag another credible set variant that does. **Different credible set variants map to OCRs: rs4646785 in CEC and nDF, rs12939864 in DermFb. c ATAC-seq profiles in immortalised corneal epithelial and stromal cell lines (hTCEpi and hTK, respectively, each in duplicate) around variant rs9913911, the prioritised causal variant at locus 114, credible set 2. Screenshot from the UCSC genome browser, with the annotation coordinates used in the enrichment analysis shown in the top tracks. d eQTL data from GTEx v8 and the predicted significantly disrupted transcription factor binding motif at rs9913911 (T > C).