José Luis Villanueva-Cañas, Vivien Horvath, Laura Aguilera, Josefa González.
Abstract
Although transposable elements are an important source of regulatory variation, their genome-wide contribution to the transcriptional regulation of stress-response genes has not yet been studied. Stress is a major aspect of natural selection in the wild, leading to changes in the transcriptional regulation of a variety of genes that are often triggered by one or a few transcription factors. In this work, we take advantage of the wealth of information available for Drosophila melanogaster and humans to analyze the role of transposable elements in six stress regulatory networks: immune, hypoxia, oxidative, xenobiotic, heat shock, and heavy metal. We found that transposable elements were enriched for caudal, dorsal, HSF, and tango binding sites in D. melanogaster and for NFE2L2 binding sites in humans. Taking into account the D. melanogaster population frequencies of transposable elements with predicted binding motifs and/or binding sites, we showed that those containing three or more binding motifs/sites are more likely to be functional. For a representative subset of these TEs, we performed in vivo transgenic reporter assays in different stress conditions. Overall, our results showed that TEs are relevant contributors to the transcriptional regulation of stress-response genes.
Year: 2019 PMID: 31175824 PMCID: PMC6649756 DOI: 10.1093/nar/gkz490
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Transcription factors analyzed in this study. Description of the stress-related transcription factors, including the identifier (ID) of the position weight matrix (PWM) or transcription factor flexible model (TFFM) used. The stresses analyzed were: HSE: Heat shock element; ARE: Antioxidant response element; HRE: Hypoxia response element; IRE: Immunity response element; MRE: Metal response element; and XRE: Xenobiotic response element. In D. melanogaster, TFFM IDs are provided only for models built from D. melanogaster PWMs. *For HSF, HIF1, MTF-1, and XBP1 a vertebrate PWM was used.
| Stress | D. melanogaster transcription factors | D. melanogaster PWM ID | D. melanogaster score threshold (PWM) | D. melanogaster TFFM ID | Human transcription factors | Human PWM ID | Human score threshold (PWM) | Human TFFM ID |
|---|---|---|---|---|---|---|---|---|
| HSE/ARE/HRE/IRE/MRE/XRE | HSF | MA0486.2* | 10.04 | NA | HSF1 | MA0486.2 | 10.09 | TFFM0048.1 |
| ARE/IRE | DL | MA0022.1 | 8.48 | TFFM0158 | NFKB1 | MA0105.4 | 10.35 | |
| HRE | HIF1 (HIF1A, tango-HIF1B) | MA0259.1* | 9.43 | NA | EGR1 | MA0162.2 | 9.79 | TFFM0020.1 |
| | | | | | SP1 | MA0079.3 | 10.48 | TFFM0097.1 |
| MRE | MTF-1 | PB0044.1* | 9.27 | NA | – | – | – | – |
| IRE | CAD | MA0216.2 | 10.12 | TFFM0159 | – | – | – | – |
| | DEAF1 | MA0185.1 | 8.33 | NA | – | – | – | – |
| | NUB | MA0197.2 | 8.99 | NA | – | – | – | – |
| | XBP1 | MA0844.1* | 9.69 | NA | – | – | – | – |
| ARE/XRE | CNC | MA0530.1 | 9.97 | NA | – | – | – | – |
| ARE | – | – | – | – | NFE2L2 | MA0150.2 | 9.56 | TFFM0071.1 |
| ARE/HRE | – | – | – | – | NRF1 | MA0506.1 | 10.04 | TFFM0082.1 |
| | – | – | – | – | CREB1 | MA0018.2 | 7.96 | TFFM0012.1 |
| HRE/XRE | – | – | – | – | AP1 (FOS) | MA0476.1 | 10.46 | TFFM0032.1 |
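The score thresholds in the table above are applied to PWM scans of genomic sequence. The following is a minimal sketch of that general idea, not the pipeline used in the study: the count matrix, the uniform background, the pseudocount, the threshold of 5.0, and the helper names `log_odds` and `scan` are all assumptions made for illustration.

```python
import math

# Hypothetical 4-bp count matrix (rows: positions, columns: A, C, G, T).
# These counts are invented for illustration; they are not a JASPAR matrix.
COUNTS = [
    [8, 1, 1, 0],
    [0, 9, 0, 1],
    [1, 0, 9, 0],
    [0, 0, 1, 9],
]
BASES = "ACGT"


def log_odds(counts, background=0.25, pseudocount=0.5):
    """Convert counts to a log2-odds PWM against a uniform background."""
    pwm = []
    for row in counts:
        total = sum(row) + 4 * pseudocount
        pwm.append(
            [math.log2(((c + pseudocount) / total) / background) for c in row]
        )
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[i][BASES.index(base)] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, round(score, 2)))
    return hits


pwm = log_odds(COUNTS)
hits = scan("TTACGTGGACGTAA", pwm, threshold=5.0)
print(hits)
```

Raising the threshold trades sensitivity for specificity, which is presumably why each matrix in the table carries its own cut-off. The sketch scans one strand only.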
Prediction of binding motifs (TFBMs) and binding sites (TFBSs) in D. melanogaster and humans
| Transcription factors | TFBMs, PWMs: Number (TEs/genome) | TFBMs, PWMs: % | TFBMs, PWMs: P-value | TFBMs, TFFMs: TEs | TFBMs, TFFMs: Ratio TE/background | TFBSs, ChIP-seq: Number (TEs/genome) | TFBSs, ChIP-seq: % | TFBSs, ChIP-seq: P-value | Merged TFBMs/TFBSs |
|---|---|---|---|---|---|---|---|---|---|
| (A) D. melanogaster | | | | | | | | | |
| CNC | 1832 / 34 558 | 5.3 | 1 | – | – | – | – | – | 1573 |
| DEAF1 | 10 735 / 219 557 | 4.89 | 5.72e−31 | – | – | – | – | – | 9042 |
| MTF-1* | 2223 / 29 964 | 7.42 | 2.62e−45 | – | – | – | – | – | 1839 |
| NUB | 8666 / 181 721 | 4.77 | 5.70e−38 | – | – | – | – | – | 7335 |
| XBP1* | 528 / 10 402 | 5.08 | 0.86 | – | – | – | – | – | 458 |
| caudal | 7068 / 123 046 | 5.74 | 5.87e−05 | 1519 | 0.64 | 5907 / 35 630 | 16.58 | >1e−323 | 8567 |
| dorsal | 5427 / 116 125 | 4.67 | 7.46e−32 | 4579 | 1.16 | 985 / 2883 | 34.17 | >1e−323 | 7555 |
| HSF* | 480 / 7354 | 6.52 | 6.78e−4 | 734 | 1.86 | 1643 / 4493 | 36.57 | >1e−323 | 2191 |
| tango (HIF1B)* | 2754 / 62 228 | 4.43 | 3.32e−30 | 1119 | 1.97 | 4349 / 15 238 | 28.54 | >1e−323 | 4382 |
| Total | 39 713 / 784 955 | 5.06 | 2.2e−16 | 7995 | – | 12 884 / 58 244 | 22.33 | 2.2e−16 | 42 942 |
| (B) Humans | | | | | | | | | |
| CREB1 | 1 462 850 / 2 434 226 | 60.10 | 3.95e−322 | 308 156 | 0.89 | 2317 / 15 908 | 14.56 | 3.95e−323 | 1 627 554 |
| EGR1 | 434 593 / 1 169 693 | 37.15 | 2.77e−322 | 196 187 | 1.15 | 9972 / 36 982 | 26.96 | 3.95e−323 | 509 377 |
| FOS | 324 072 / 747 204 | 43.37 | 1.36e−309 | 630 618 | 0.89 | 45 748 / 92 352 | 49.54 | 4.4e−130 | 370 407 |
| HSF1 | 83 286 / 211 771 | 39.33 | 1.18e−322 | 338 290 | 0.69 | 343 / 1432 | 23.95 | 3.82e−63 | 325 915 |
| NFE2L2 | 298 168 / 571 695 | 52.16 | 1.98e−322 | 377 740 | 0.95 | 639 / 744 | 85.89 | 1.42e−115 | 505 947 |
| NFKB1 | 30 447 / 49 199 | 61.89 | 3.95e−323 | 180 383 | 1.47 | 12 638 / 28 678 | 44.07 | 4.62e−6 | 161 213 |
| NRF1 | 26 327 / 127 953 | 20.58 | 7.9e−323 | 28 857 | 0.88 | 259 / 4511 | 5.74 | 3.95e−323 | 37 708 |
| SP1 | 903 287 / 1 929 185 | 46.82 | 1.53e−279 | 138 185 | 1.94 | 4463 / 15 104 | 29.55 | 3.95e−323 | 847 478 |
| Total | 3 563 030 / 7 240 926 | 45.54 | 2.2e−16 | 2 198 416 | – | 76 379 / 195 711 | 39.02 | 2.2e−16 | 4 385 599 |
*TFs for which a vertebrate PWM was used.
Number of PWM motifs and ChIP-seq peaks (TFBSs) predicted in TEs/number predicted in the genome. For TFFMs, the number of predictions in TEs and the ratio of predictions in TEs versus background sequences are given. The merged TFBMs/TFBSs column shows the number of unique motifs/sites after accounting for overlapping coordinates among the PWM, TFFM and ChIP-seq peak predictions.
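As a worked check of the arithmetic behind the PWM columns, the short sketch below recomputes the percentage of predictions falling in TEs from the D. melanogaster counts in panel (A). The counts are copied from the table; the names `PWM_COUNTS` and `percent_in_tes` are introduced here for illustration only, and the statistical test behind the P-values is not reproduced.

```python
# (hits in TEs, hits in genome) for PWM predictions, copied from panel (A).
PWM_COUNTS = {
    "CNC": (1832, 34558),
    "DEAF1": (10735, 219557),
    "MTF-1": (2223, 29964),
    "NUB": (8666, 181721),
    "XBP1": (528, 10402),
    "caudal": (7068, 123046),
    "dorsal": (5427, 116125),
    "HSF": (480, 7354),
    "tango": (2754, 62228),
}


def percent_in_tes(in_tes, in_genome):
    """Percentage of genome-wide predictions that fall inside TEs."""
    return round(100 * in_tes / in_genome, 2)


percents = {tf: percent_in_tes(*counts) for tf, counts in PWM_COUNTS.items()}

# Column totals, to compare against the unlabeled summary row of panel (A).
total_tes = sum(in_tes for in_tes, _ in PWM_COUNTS.values())
total_genome = sum(in_genome for _, in_genome in PWM_COUNTS.values())

for tf, pct in percents.items():
    print(f"{tf}: {pct}%")
print(f"all factors: {total_tes} / {total_genome} "
      f"= {percent_in_tes(total_tes, total_genome)}%")
```

The per-factor counts sum to the figures in the summary row of panel (A), and the recomputed percentages agree with the tabulated ones to within rounding.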
Figure 1. Percentage of transcription factor binding motifs (TFBMs) and ChIP-seq peaks (TFBSs) located in TEs in (A) Drosophila melanogaster and (B) humans. In green, motif predictions using position weight matrices (PWMs). The vertical dotted line depicts the expected percentage of motifs in TEs in D. melanogaster (5.45%) and humans (45.54%). In blue, ratio of the number of motifs predicted in TEs to the number predicted in background sequences with the same properties as TEs. The expected ratio is 1 (vertical dotted line). In orange, percentage of ChIP-seq peaks located in TEs. The expected percentages of TFBSs falling in TEs are represented as vertical dotted lines, as in the PWM predictions.
Figure 2. Several TE families are enriched for stress-related transcription factor motifs and binding sites. (A) The number of genomic copies for D. melanogaster TE families with at least 25 copies. Families are colored according to whether they are enriched for motifs, ChIP-seq peaks, or both (C+M). Absent columns for a particular TF indicate that the score could not be calculated because there were too few motifs or peaks. (B) Equivalent figure for humans. The number of copies is given on a log scale because some families have very high copy numbers. Only families with more than 5000 copies are plotted.
Figure 3. Overlap of TFBM and TFBS predictions. Venn diagrams showing the overlap in the predictions across methods (PWM, TFFM, and ChIP-seq) within TEs for representative transcription factors in (A) D. melanogaster and (B) humans. A motif/peak is considered shared if their coordinates overlap. Note that a ChIP-seq peak can overlap with several motifs.
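The coordinate-overlap criterion can be sketched in a few lines. This is an illustrative example under stated assumptions, not the study's code: intervals are taken to be half-open `(chromosome, start, end)` tuples, the coordinates are invented, and `overlaps` and `motifs_per_peak` are hypothetical helper names.

```python
def overlaps(a, b):
    """True if half-open intervals (chrom, start, end) share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def motifs_per_peak(peaks, motifs):
    """Map each peak to the list of motifs whose coordinates overlap it."""
    return {peak: [m for m in motifs if overlaps(peak, m)] for peak in peaks}


# Hypothetical coordinates, invented for illustration only.
peaks = [("2L", 100, 400), ("2L", 900, 1000)]
motifs = [
    ("2L", 120, 132),   # inside the first peak
    ("2L", 390, 402),   # straddles the end of the first peak
    ("2L", 500, 512),   # between peaks
    ("3R", 100, 112),   # same coordinates, different chromosome
]

shared = motifs_per_peak(peaks, motifs)
for peak, found in shared.items():
    print(peak, "->", found)
```

In this toy input the first peak overlaps two motifs, which is the one-to-many situation the caption notes; the all-pairs comparison is quadratic, so a genome-scale analysis would normally use sorted intervals or an interval tree instead.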
Figure 4. Enhancer/promoter genomic characteristics in TEs with predicted TFBMs/TFBSs in D. melanogaster. Percentage of TEs with at least one TFBM/TFBS, for each of the nine transcription factors studied, that (A) overlap with open chromatin regions, (B) contain a CBP peak, (C) are enriched for active histone marks, or (D) are located in a regulatory region. In purple, merged dataset of TFBMs/TFBSs; in orange, dataset with evidence from ChIP-seq. The vertical dotted line shows the expected percentage for each feature.
Number of TEs containing one or more, or three or more, TFBMs/TFBSs present at high population frequencies or fixed
| Dataset | TE # | High freq TEs: TE # | High freq TEs: % | High freq TEs: P-value | Fixed TEs (non-INE-1): TE # | Fixed TEs (non-INE-1): % | Fixed TEs (non-INE-1): P-value | Fixed TEs (INE-1): TE # | Fixed TEs (INE-1): % | Fixed TEs (INE-1): P-value |
|---|---|---|---|---|---|---|---|---|---|---|
| | 3894 | 424 | 11 | NA | 855 | 22 | NA | 2234 | 58 | NA |
| | 2438 | 396 | 16.2 | <2.2e−16 | 621 | 25.5 | <2.2e−16 | 1086 | 44.5 | <2.2e−16 |
| | 1314 | 340 | 25.9 | <2.2e−16 | 386 | 29.4 | <2.2e−16 | 275 | 20.9 | <2.2e−16 |
| | 311 | 131 | 42.1 | <2.2e−16 | 12 | 3.9 | <2.2e−16 | 0 | 0 | <2.2e−16 |
Figure 5. Characteristics of TEs containing three or more TFBMs/TFBSs present at (A) high frequency or (B) fixed (non-INE-1). Histone mod: TE bears H3K4me3 or H3K36me3 marks associated with active chromatin. Open chromatin: TE is located in an open chromatin region. Evidence of selection: TE shows evidence of selection (41). Reg. region: TE is located in the proximal regulatory region of a gene (promoter, 5′UTR or first intron). Gene association: TE is located near stress-associated genes. Ratio TFBS: TE contains 20% more TFBSs than expected given its length.
Results summary for the in vivo enhancer assays performed
| TE | TFBS/TFBM in TE | TFBS/TFBM in reg. region | Additional evidence | Experimental design | Stress tested | q-PCR result: Control | q-PCR result: Treated | Reference |
|---|---|---|---|---|---|---|---|---|
| FBti0019386 | DEAF1: 1 | CAD: 2, NUB: 1, DEAF1: 1, dorsal: 1 | Regulatory region, CBP, TFBS ratio, Histone marks, Selection evidence | Intergenic | IRE | No | Up-regulation (8.91E-05) | This work |
| FBti0019082 | CAD: 1, DEAF1: 3, MTF-1: 3, CNC: 3, dorsal: 2 | NUB: 2, XBP1: 1 | Regulatory region, Open chromatin, CBP, Histone marks, Selection evidence | Intergenic | IRE | No | Down-regulation (0.033) | This work |
| FBti0019985 | DEAF1: 1, NUB: 1, MTF-1: 1, dorsal: 1 | DEAF1: 1 | Regulatory region, Selection evidence, TFBS ratio | TE/antisense | IRE | No | Up-regulation (0.0126) | This work |
| | | | | TE/sense | | No | Up-regulation | ( |
| tdn8 | NUB: 2, DEAF1: 4 | NA | Regulatory region | Intergenic | IRE | No | Up-regulation (0.046) | ( |
| FBti0020057 | NUB: 1 | NUB: 3, CAD: 1, DEAF1: 1 | Regulatory region, Open chromatin, Gene: | Intergenic | IRE | Down-regulation (0.0193) | Down-regulation (0.0161) | ( |
| FBti0019453 | NUB: 1, CAD: 1 | NUB: 2, DEAF1: 1 | Regulatory region, Open chromatin, Selection evidence | Intergenic | XRE | No | No | This work |
| | | | | | IRE | No | Down-regulation (0.007) | This work |
| FBti0019012 | NUB: 4 | NUB: 3, DEAF1: 1, XBP1: 1 | Regulatory region, Gene: | TFBS/sense | IRE | No | No | This work |
| | | | | | HSE | No | No | This work |
| | | | | TFBS/antisense | HSE | No | No | This work |
| FBti0019309 | DEAF1: 2, NUB: 3, MTF-1: 2 | NUB: 3, CAD: 1, DEAF1: 2, dorsal: 1 | Regulatory region, TFBS ratio | TFBS/sense | IRE | No | No | This work |
| | | | | TFBS/antisense | HSE | No | No | This work |
| FBti0018880 | CNC: 1, DEAF1: 3, NUB: 2, MTF-1: 1 | MTF-1: 1, DEAF1: 2 | Regulatory region, TFBS ratio, Gene: | TFBS | ARE | No | No | This work |
| | | | | Intergenic | ARE | No | No | This work |
| | | | | | IRE | No | No | This work |
| FBti0061428 | dorsal: 3, DEAF1: 3 | NA | Open chromatin, Histone marks, Gene: | TFBS | IRE | No | No | This work |
| | | | | | HSE | No | No | This work |
| FBti0019197* | tango: 1, dorsal: 1 | MTF-1: 1, NUB: 1 | Regulatory region, Histone marks | TE | IRE | No | No | This work |
| | | | | | ARE | No | No | This work |
| FBti0019978 | MTF-1: 2 | MTF-1: 1, CAD: 1, DEAF1: 1 | Regulatory region, Open chromatin, Histone marks | Intergenic | XRE | No | No | This work |
| FBti0061578* | DEAF1: 2, tango: 1 | CAD: 1, DEAF1: 1, dorsal: 1 | Regulatory region, Histone marks, TFBS ratio, Gene: | Intergenic | ARE | No | No | This work |
| | | | | | IRE | No | No | This work |
| FBti0018868 | DEAF1: 1, NUB: 1, CAD: 1 | NA | Regulatory region, TFBS ratio, Gene: | TE | IRE | No | No | ( |
*Fixed TEs; In bold, TFs for which the evidence for the presence of TFBSs in that particular TE comes from ChIP-seq data. Experimental design indicates the region that was cloned in front of the reporter gene. We also included the data for three reporter assays performed previously in the lab (44).
Figure 6. Four TE insertions analyzed in this work affect the expression of a reporter gene. qRT-PCR experiments comparing the expression of the gfp reporter gene in transgenic flies containing the genomic region under study without the TE insertion (gray) and with the TE insertion (red), in stress and non-stress conditions. The error bars represent the standard deviation of three biological replicates. Significant results are indicated with *.