André M Ribeiro-dos-Santos, Vandeclécio L da Silva, Jorge E S de Souza, Sandro J de Souza.
Abstract
BACKGROUND: Differences in gene expression play a significant role in the diversity of human phenotypes. Here we integrated public human data from ENCODE, 1000 Genomes and Geuvadis to explore the population landscape of INDELs affecting transcription factor-binding sites (TFBS). A significant fraction of TFBS close to the transcription start site (TSS) of known genes is affected by INDELs, with a consequent effect on the expression of the associated gene.
Year: 2015 PMID: 26194008 PMCID: PMC4509691 DOI: 10.1186/s12864-015-1744-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1. Analysis overview. Schematic representation of the strategy used to identify and analyse TFBS affected by polymorphic INDELs in human populations.
Fig. 2. Relative position of TFBS-IDs within the 5 kb window adjacent to the TSS of human genes. Overall distribution of TFBS-IDs in both the 5 kb window upstream of the TSS (A) and the 5 kb window flanking the TSS (B).
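The two windows named in this caption can be illustrated with a minimal sketch. It assumes that "upstream" means the 5 kb immediately 5′ of the TSS on the gene's own strand and that "flanking" means 5 kb on either side of the TSS; the caption does not spell out either definition, and the function names are hypothetical.

```python
WINDOW = 5_000  # 5 kb, the window size named in the caption


def relative_position(site_pos: int, tss: int, strand: str) -> int:
    """Signed distance from the TSS; negative values lie upstream of the gene.

    On the minus strand the genomic axis is mirrored, so the sign is flipped.
    """
    offset = site_pos - tss
    return offset if strand == "+" else -offset


def in_upstream_window(site_pos: int, tss: int, strand: str,
                       window: int = WINDOW) -> bool:
    """True if the site lies within `window` bases upstream of the TSS (assumed definition)."""
    rel = relative_position(site_pos, tss, strand)
    return -window <= rel < 0


def in_flanking_window(site_pos: int, tss: int, strand: str,
                       window: int = WINDOW) -> bool:
    """True if the site lies within `window` bases on either side of the TSS (assumed definition)."""
    rel = relative_position(site_pos, tss, strand)
    return -window <= rel <= window
```

Under these assumptions a site 100 bases past the TSS in genomic coordinates is upstream only for a minus-strand gene, which is why the strand has to be part of the calculation.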
Transcription factors enriched in the set of TFBS-IDs close to the TSS of known human genes. “TF” refers to the name of the transcription factor; “Number of TFBS” (N) refers to the number of binding sites for the respective TF within the TFBS-ID set; “p-value” refers to the significance of the enrichment of the respective TF in the final TFBS set relative to all TFBS near genes.
| TF | N (5 kb upstream) | p-value (5 kb upstream) | N (5 kb flanking) | p-value (5 kb flanking) |
|---|---|---|---|---|
| | 1818 | <10⁻⁴ | 4505 | <10⁻⁴ |
| | 1368 | <10⁻⁴ | 1982 | <10⁻⁴ |
| | 879 | <10⁻⁴ | 1941 | <10⁻⁴ |
| | 684 | <10⁻⁴ | 1877 | <10⁻⁴ |
| | 666 | <10⁻⁴ | 1244 | <10⁻⁴ |
| | 606 | <10⁻⁴ | 1238 | <10⁻⁴ |
| | 370 | <10⁻⁴ | 623 | <10⁻⁴ |
| | 237 | <10⁻⁴ | 475 | <10⁻⁴ |
| | 364 | 0.002 | 802 | <10⁻⁴ |
| | 580 | 0.097 | 1134 | <10⁻⁴ |
| | 511 | 0.162 | 962 | <10⁻⁴ |
| | 721 | 0.217 | 929 | <10⁻⁴ |
| | 1264 | 0.246 | 2532 | <10⁻⁴ |
| | 534 | 0.315 | 1090 | <10⁻⁴ |
| | 905 | 0.577 | 2120 | <10⁻⁴ |
| | 538 | 0.668 | 872 | <10⁻⁴ |
| | 385 | 0.752 | 915 | <10⁻⁴ |
| | 693 | 0.993 | 1517 | <10⁻⁴ |
| | 613 | 0.998 | 1164 | <10⁻⁴ |
| | 827 | 1.000 | 1523 | <10⁻⁴ |
| | 505 | 1.000 | 1129 | <10⁻⁴ |
| | 524 | 1.000 | 1026 | <10⁻⁴ |
| | 434 | 1.000 | 872 | <10⁻⁴ |
| | 432 | <10⁻⁴ | 747 | 1.000 |
| | 351 | <10⁻⁴ | 684 | 1.000 |
| | 373 | <10⁻⁴ | 546 | 1.000 |
| | 155 | <10⁻⁴ | 450 | 1.000 |
| | 280 | <10⁻⁴ | 444 | 1.000 |
| | 202 | <10⁻⁴ | 359 | 1.000 |
| | 28 | <10⁻⁴ | 41 | 1.000 |
| | 193 | <10⁻⁴ | 246 | 0.494 |
| | 97 | 0.001 | 225 | 1.000 |
| | 216 | 0.001 | 506 | 1.000 |
| | 95 | 0.004 | 127 | 1.000 |
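The kind of enrichment question this table answers (is a TF over-represented among INDEL-affected sites relative to all TFBS near genes?) can be sketched with a one-sided hypergeometric test. This is an assumption used for illustration only: the section does not state which test produced the p-values, and the function name and toy counts below are invented.

```python
from math import comb


def hypergeom_enrichment_p(hits_in_subset: int, subset_size: int,
                           hits_in_background: int, background_size: int) -> float:
    """Upper-tail probability P(X >= hits_in_subset) for a hypergeometric X.

    background_size    : all TFBS near genes (assumed background)
    hits_in_background : TFBS in the background that belong to the TF of interest
    subset_size        : TFBS affected by INDELs (the TFBS-ID set)
    hits_in_subset     : TFBS-ID sites that belong to the TF of interest
    """
    upper = min(subset_size, hits_in_background)
    total = comb(background_size, subset_size)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(
        comb(hits_in_background, i)
        * comb(background_size - hits_in_background, subset_size - i)
        for i in range(hits_in_subset, upper + 1)
    )
    return tail / total


# Toy numbers, invented for illustration (not taken from the table):
# 4 of 5 affected sites belong to a TF that covers 5 of 10 background sites.
p = hypergeom_enrichment_p(4, 5, 5, 10)
print(f"enrichment p-value: {p:.4f}")
```

Exact integer arithmetic keeps the sketch dependency-free; for genome-scale counts a library routine or a resampling scheme would likely be more practical.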
TFBS-IDs within the 5 kb window upstream of the TSS with the highest δ in AFR, ASN or EUR.
| Population | dbSNP id | Gene | Type | Size | Population Frequency | δ |
|---|---|---|---|---|---|---|
| | rs113103282 | | DEL | 1 | 0.88 | 0.71 |
| | rs111659599 | | DEL | 6 | 0.73 | 0.70 |
| | rs201685762 | | DEL | 3 | 0.75 | 0.69 |
| | rs200228600 | | DEL | 2 | 0.83 | 0.68 |
| | rs60963584 | | INS | 1 | 0.79 | 0.68 |
| | rs34107968 | | DEL | 3 | 0.08 | -0.67 |
| | rs3842412 | | DEL | 14 | 0.19 | -0.66 |
| | rs201075641 | | DEL | 4 | 0.71 | 0.65 |
| | rs60602324 | | INS | 1 | 0.88 | 0.65 |
| | rs59484263 | | DEL | 1 | 0.89 | 0.64 |
| | | | | | | |
| | rs28366020 | | DEL | 3 | 0.06 | -0.62 |
| | rs5840961 | | INS | 1 | 0.08 | -0.57 |
| | - | | INS | 1 | 0.08 | -0.55 |
| | rs34313783 | | DEL | 1 | 0.62 | 0.53 |
| | rs66822811 | | DEL | 38 | 0.78 | 0.52 |
| | rs5820777 | | DEL | 1 | 0.66 | 0.50 |
| | rs200692689 | | INS | 2 | 0.88 | 0.49 |
| | - | | INS | 1 | 0.88 | 0.49 |
| | rs199953326 | | DEL | 1 | 0.88 | 0.49 |
| | rs75077631 | | INS | 1 | 0.77 | 0.49 |
| | | | | | | |
| | rs201884277 | | DEL | 2 | 0.87 | 0.75 |
| | rs75244934 | | DEL | 2 | 0.83 | 0.69 |
| | rs139938620 | | DEL | 13 | 0.79 | 0.68 |
| | rs34692283 | | DEL | 2 | 0.74 | 0.67 |
| | rs55726149 | | INS | 3 | 0.20 | -0.60 |
| | rs77949675 | | DEL | 2 | 0.78 | 0.60 |
| | rs35231579 | | DEL | 1 | 0.16 | -0.58 |
| | rs139775692 | | DEL | 11 | 0.79 | 0.58 |
| | rs61077744 | | DEL | 1 | 0.70 | 0.57 |
| | rs149347369 | | INS | 5 | 0.79 | 0.55 |
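This section does not define δ. The sketch below assumes it is the INDEL allele frequency in the focal population minus the frequency in all other populations pooled, which would explain why δ is negative when the population frequency is low. That definition, the function name and the allele counts are assumptions for illustration, not the authors' stated method.

```python
from typing import Dict, Tuple

# population -> (INDEL allele count, total allele count)
AlleleCounts = Dict[str, Tuple[int, int]]


def delta(counts: AlleleCounts, focal: str) -> float:
    """Frequency in the focal population minus the pooled frequency elsewhere.

    Assumed definition of delta; the source section does not state one.
    """
    focal_alt, focal_total = counts[focal]
    other_alt = sum(alt for pop, (alt, _) in counts.items() if pop != focal)
    other_total = sum(total for pop, (_, total) in counts.items() if pop != focal)
    return focal_alt / focal_total - other_alt / other_total


# Toy allele counts, invented for illustration only.
toy_counts = {"AFR": (80, 100), "ASN": (10, 100), "EUR": (30, 100)}
for population in toy_counts:
    print(population, round(delta(toy_counts, population), 2))
```

Pooling the other populations by allele count weights them by sample size; averaging their frequencies instead would be an equally plausible reading and gives different values when sample sizes differ.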
Fig. 3. APIP expression is likely adapted in AFR. A TFBS-ID (rs139999735) with δ = 0.22 in Africans and associated with the APIP gene affects gene expression, as seen in A. In B, individuals homozygous for the TFBS-ID (continuous line) had lower genetic heterogeneity around the INDEL position (vertical dashed line) than individuals homozygous for the absence of the INDEL (dashed line).
Fig. 4. Ontology analysis for genes associated with TFBS-IDs with δ ≥ 0.2 in the respective population (5 kb window upstream of the TSS). The color of each circle indicates the p-value of the enrichment, while its size indicates the number of genes within that GO category.
Fig. 5. GO enrichment analysis for TFBS-IDs matching regions known to be under selection in the human genome (5 kb window upstream of the TSS). The color of each bar indicates the p-value of the respective enrichment, and its length indicates the number of genes within the respective GO category.