Marco Cavalli, Gang Pan, Helena Nord, Ola Wallerman, Emelie Wallén Arzt, Olof Berggren, Ingegerd Elvers, Maija-Leena Eloranta, Lars Rönnblom, Kerstin Lindblad Toh, Claes Wadelius.
Abstract
Genome-wide association studies (GWAS) have identified a large number of disease-associated SNPs, but in only a few cases have the functional variant and the gene it controls been identified. To systematically identify candidate regulatory variants, we sequenced ENCODE cell lines and used public ChIP-seq data to look for transcription factors binding preferentially to one allele. We found 9962 candidate regulatory SNPs, of which 16% were rare and showed evidence of a larger functional effect than common ones. Functionally rare variants may explain divergent GWAS results between populations and are candidates for a partial explanation of the missing heritability. The majority of allele-specific variants (96%) were specific to a cell type. Furthermore, by examining GWAS loci we found >400 allele-specific candidate SNPs, 141 of which were highly relevant in our cell types. Functionally validated SNPs support the identification of an SNP in SYNGR1 that may confer risk of rheumatoid arthritis and primary biliary cirrhosis, as well as an SNP in the last intron of COG6 conferring risk of psoriasis. We propose that by repeating the ChIP-seq experiments of 20 selected transcription factors in three to ten people, the most common polymorphisms can be interrogated for allele-specific binding. Our strategy may help to remove the current bottleneck in functional annotation of the genome.
Year: 2016 PMID: 26993500 PMCID: PMC4835527 DOI: 10.1007/s00439-016-1654-x
Source DB: PubMed Journal: Hum Genet ISSN: 0340-6717 Impact factor: 4.132
AS-SNPs detected in four different cell lines
| Cell line | Total SNPs called from WGS | Heterozygous SNPs | TFs ChIP-seq-ed | AS-SNPs (total) | Common AS-SNPs (% of total) | Rare AS-SNPs (% of total) |
|---|---|---|---|---|---|---|
| GM12878 | 3,657,092 | 2,250,735 | 86 | 3265 | 2631 (81) | 634 (19) |
| H1-hESC | 3,377,147 | 2,014,114 | 51 | 1014 | 858 (85) | 156 (15) |
| SK-N-SH | 4,625,100 | 2,363,834 | 27 | 1384 | 1158 (84) | 226 (16) |
| K562 | 4,374,809 | 1,864,966 | 100 | 4299 | 3752 (87) | 547 (13) |
| Total | | | | 9962 | 8399 (84) | 1563 (16) |
Fig. 1 Allelic effects of rare variants. Allele-specific effect (see “Materials and methods”) at heterozygous SNPs with a significant difference between the ChIP-seq reads of alleles G1 and G2 in four different cell lines
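The idea of a significant difference between the reads of the two alleles can be illustrated with an exact binomial test on the read counts at one heterozygous SNP. This is a minimal sketch, not the authors' pipeline: the choice of test, the null ratio of 0.5, the function name `binomial_two_sided_p`, and the read counts are all assumptions made for illustration.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k out of n reads, under null proportion p.
    Suitable for modest read depths (hypothetical helper, not from the source).
    """
    if n == 0:
        return 1.0
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so floating-point ties count as equal.
    total = sum(q for q in probs if q <= observed * (1 + 1e-7))
    return min(1.0, total)


# Invented read counts at one heterozygous SNP: 18 reads carry allele G1
# and 4 reads carry allele G2.
reads_g1, reads_g2 = 18, 4
p_value = binomial_two_sided_p(reads_g1, reads_g1 + reads_g2)
print(f"G1 fraction = {reads_g1 / (reads_g1 + reads_g2):.2f}, p = {p_value:.4f}")
```

In a genome-wide setting such p-values would also need correction for multiple testing, which this sketch leaves out.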
Fig. 2 Coverage of transcription factors and alleles in the population. (a) Network representing the top 20 TFs, polymerases or coactivators whose ChIP-seq reads detect most AS-SNPs in four different cell lines. The TFs detecting most AS-SNPs in several cell lines are clustered at the center with the more cell-specific ones in the outer layers. (b) The likelihood of finding a heterozygous SNP as a function of the allele frequency considering one or more individuals. The AUC represents the proportion of heterozygous SNPs a population of n individuals
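The relationship described in panel b can be sketched with a simple population-genetics calculation. The formula below is an assumption, not taken from the source: it supposes Hardy-Weinberg proportions at a biallelic SNP and unrelated individuals, so that one person is heterozygous with probability 2p(1 - p) and at least one of n people is heterozygous with probability 1 - (1 - 2p(1 - p))^n. The function name is hypothetical.

```python
def prob_at_least_one_het(allele_freq: float, n_individuals: int) -> float:
    """Probability that at least one of n unrelated individuals is
    heterozygous at a biallelic SNP, assuming Hardy-Weinberg proportions."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must be between 0 and 1")
    if n_individuals < 0:
        raise ValueError("n_individuals must be non-negative")
    het_one = 2.0 * allele_freq * (1.0 - allele_freq)  # one individual
    return 1.0 - (1.0 - het_one) ** n_individuals


# The abstract proposes assaying three to ten people; compare with one person.
for n in (1, 3, 10):
    print(n, prob_at_least_one_het(0.5, n))
# prints 0.5, 0.875 and 0.9990234375 for n = 1, 3 and 10
```

Under these assumptions a SNP with allele frequency 0.5 is almost certain to be heterozygous in at least one of ten people, while lower-frequency SNPs need more individuals to reach the same coverage.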
AS-SNPs detected in different cell lines associated with cell-specific GWAS traits
| GWAS-associated traits | Number of AS-SNPs | Number of AS-loci | Number of reported loci ¤ | % of reported loci with AS-SNPs |
|---|---|---|---|---|
| GM12878 and K562 | | | | |
| Systemic lupus erythematosus | 28 (18a) | 6 | 63 | 9.5 |
| Crohn’s disease / inflammatory bowel disease / ulcerative colitis | 22 (3a) | 14 | 182 | 7.7 |
| Type 1 diabetes | 14 (7a) | 4 | 65 | 6.2 |
| Chronic lymphocytic leukemia | 14 (14a) | 1 | 25 | 4.0 |
| Asthma | 13 (8a) | 3 | 74 | 4.1 |
| IgG glycosylation | 11 (1a) | 9 | 295 | 3.1 |
| Schizophrenia | 18 | 3 | 101 | 3.0 |
| Self-reported allergy | 12 | 3 | 34 | 8.8 |
| Rheumatoid arthritis | 9 | 4 | 151 | 2.6 |
| Psoriasis | 6 | 3 | 27 | 11.1 |
| Systemic sclerosis | 5 | 2 | 27 | 7.4 |
| Celiac disease | 5 | 3 | 61 | 4.9 |
| Primary biliary cirrhosis | 4 | 1 | 24 | 4.2 |
| Adverse response to chemotherapy (neutropenia/leucopenia) (carboplatin) | 4 | 2 | 9 | 22.2 |
| Chronic myeloid leukemia | 3 | 1 | 28 | 3.6 |
| Multiple sclerosis | 3 | 3 | 138 | 2.2 |
| Amyotrophic lateral sclerosis | 2 | 1 | 142 | 0.7 |
| Restless legs syndrome | 2 | 1 | 6 | 16.7 |
| Vitiligo | 2 | 1 | 33 | 3.0 |
| Acute lymphoblastic leukemia (childhood) | 1 | 1 | 33 | 3.0 |
| Ankylosing spondylitis | 1 | 1 | 21 | 4.8 |
| Atopic dermatitis | 1 | 1 | 18 | 5.6 |
| Endometriosis | 1 | 1 | 20 | 5.0 |
| IgE grass sensitization | 1 | 1 | 9 | 11.1 |
| Primary sclerosing cholangitis | 1 | 1 | 2 | 50.0 |
| SK-N-SH | | | | |
| Obesity-related traits | 3 | 3 | 572 | 0.5 |
| Parkinson’s disease | 2 | 2 | 50 | 4.0 |
| Schizophrenia | 1 | 1 | 101 | 1.0 |
| Bipolar disorder and schizophrenia | 1 | 1 | 113 | 0.9 |
| Attention deficit hyperactivity disorder (time to onset) | 1 | 1 | 128 | 0.8 |
| Alzheimer’s disease (late onset) | 1 | 1 | 48 | 2.1 |
| Inattentive symptoms | 1 | 1 | 14 | 7.1 |
| Amyotrophic lateral sclerosis | 1 | 1 | 142 | 0.7 |
| White matter integrity | 1 | 1 | 14 | 7.1 |
| Anorexia nervosa | 1 | 1 | 26 | 3.8 |
| Migraine | 1 | 1 | 60 | 1.7 |
¤ Loci defined as GWAS-SNPs within 1 Mb regions
a In HLA region
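The last column of the table above appears to be the number of AS-loci divided by the number of reported loci, expressed as a percentage with one decimal. The short sketch below checks this reading against three rows of the table; the function name `percent_loci_with_as_snps` is hypothetical, and the rounding rule is an assumption.

```python
def percent_loci_with_as_snps(as_loci: int, reported_loci: int) -> float:
    """Share of reported GWAS loci that contain at least one AS-SNP,
    as a percentage rounded to one decimal place."""
    if reported_loci <= 0:
        raise ValueError("reported_loci must be positive")
    return round(100.0 * as_loci / reported_loci, 1)


# (trait, AS-loci, reported loci) copied from the table
rows = [
    ("Systemic lupus erythematosus", 6, 63),
    ("Psoriasis", 3, 27),
    ("Primary sclerosing cholangitis", 1, 2),
]
for trait, as_loci, reported in rows:
    print(trait, percent_loci_with_as_snps(as_loci, reported))
# prints 9.5, 11.1 and 50.0 respectively
```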
Fig. 3 AS-SNPs associated with GWAS-SNPs. (a) Top: AS-SNPs associated with GWAS-SNPs, obtained by intersecting the cell-specific collections of AS-SNPs with the full GWAS catalog. Numbers are reported for common and rare AS-SNPs that are direct hits in the GWAS catalog or in LD with GWAS-SNPs. Bottom: AS-SNPs associated with GWAS-SNPs, obtained by intersecting the cell-specific collections of AS-SNPs with GWAS-SNPs associated with cell-specific traits. (b) Model representation of the networks of interactions observed between AS-SNPs and GWAS-SNPs. The dotted red box highlights the simplest scenario with one AS-SNP in LD with a GWAS-SNP. The tables report the numbers of instances observed in each cell line where one AS-SNP is in LD with several GWAS-SNPs or one GWAS-SNP is in LD with different AS-SNPs, or where GWAS-SNPs were also AS-SNPs. (c) Four AS-SNPs, located in three different regulatory elements, interact with several GWAS-SNPs associated with autoimmune diseases
Fig. 4 Functional effects of AS-SNPs in primary B cells. (a) Stimulation of B cells with the oligonucleotide ODN2216 increases the expression of TF EBF1. (b) Top: rs909685 is a GWAS-SNP with AS behavior detected by EBF1 ChIP-seq reads. rs909685 is in LD with the eQTL SNP rs2069235 which is associated with the expression of the SYNGR1 gene. Bottom: SYNGR1 expression in an individual homozygous for the A- or T-allele, respectively, at rs909685, unstimulated (blue) or stimulated with ODN2216 (yellow). Middle: motif for PITX3 at rs909685 which alters the TF-binding motif. (c) Top: AS-SNP rs9603612 is located in an intron of COG6 and is in LD with the GWAS-SNP rs7993214. rs9603612 showed AS behavior with EBF1 ChIP-seq reads covering the SNP with significant difference. Bottom: expression of COG6 in an individual homozygous for the C- and G-alleles, respectively, of rs9603612, unstimulated (blue) or stimulated with ODN2216 (yellow). Middle: motif for CREB1 at rs9603612 which alters the TF-binding motif