Mingchao Xie, Chibo Hong, Bo Zhang, Rebecca F Lowdon, Xiaoyun Xing, Daofeng Li, Xin Zhou, Hyung Joo Lee, Cecile L Maire, Keith L Ligon, Philippe Gascard, Mahvash Sigaroudinia, Thea D Tlsty, Theresa Kadlecek, Arthur Weiss, Henriette O'Geen, Peggy J Farnham, Pamela A F Madden, Andrew J Mungall, Angela Tam, Baljit Kamoh, Stephanie Cho, Richard Moore, Martin Hirst, Marco A Marra, Joseph F Costello, Ting Wang.
Abstract
Transposable element (TE)-derived sequences comprise half of the human genome and DNA methylome and are presumed to be densely methylated and inactive. Examination of genome-wide DNA methylation status within 928 TE subfamilies in human embryonic and adult tissues identified unexpected tissue-specific and subfamily-specific hypomethylation signatures. Genes proximal to tissue-specific hypomethylated TE sequences were enriched for functions important for the relevant tissue type, and their expression correlated strongly with hypomethylation within the TEs. When hypomethylated, these TE sequences gained tissue-specific enhancer marks, including monomethylation of histone H3 at lysine 4 (H3K4me1) and occupancy by p300, and a majority exhibited enhancer activity in reporter gene assays. Many such TEs also harbored binding sites for transcription factors that are important for tissue-specific functions and showed evidence of evolutionary selection. These data suggest that sequences derived from TEs may be responsible for wiring tissue type-specific regulatory networks and may have acquired tissue-specific epigenetic regulation.
Year: 2013 PMID: 23708189 PMCID: PMC3695047 DOI: 10.1038/ng.2649
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330
Figure 1. Clustering of TE families based on their DNA methylation profile reveals tissue specificity
TE families (rows) were clustered based on their MeDIP-seq (a) or MRE-seq (b) enrichment values across 29 samples (see Online Methods). The samples (columns) clustered into four major groups, which were consistent with their tissue types: ESC H1 (gray), Brain (orange), Breast (blue), and Blood (purple). The vertical bar on the right side of each heat map indicates TE class: LTR (blue), DNA transposon (purple), SINE (orange), and LINE (black). The color scale for the methylation enrichment values is shown as a horizontal gradient bar at the bottom of each panel.
GO enrichment of genes associated with hypomethylated TEs.
| TE | GO Biological | P-Value | FDR | Gene | Fold |
|---|---|---|---|---|---|
| LFSINE | Telencephalon | 1.49E-05 | 2.74E-03 | 19/87 | 3.55 |
| | Pallium development | 9.35E-05 | 1.24E-02 | 12/56 | 3.48 |
| | Neuron migration | 1.50E-04 | 1.79E-02 | 16/69 | 3.77 |
| UCON29 | Generation of neurons | 6.6031E-23 | 3.6419E-20 | 11/656 | 4.9126 |
| | Neuron differentiation | 3.3780E-22 | 1.4247E-19 | 10/500 | 5.8593 |
| | Neuron recognition | 5.01E-5 | 4.49E-2 | 5/23 | 11.04 |
| LTR12 | Oxidation reduction | 3.73E-06 | 2.67E-02 | 17/647 | 2.24 |
| | Antigen processing and | 7.40E-06 | 2.65E-02 | 2/20 | 8.53 |
| LTR77 | Homophilic cell | 7.0555E-7 | 5.0588E-3 | 10/105 | 11.70 |
| | Cell-cell adhesion | 4.5389E-6 | 1.6272E-2 | 12/266 | 5.55 |
Genomic coordinates of individual TE copies of the TE families were used as input for GREAT analysis[55]. Each gene was assigned a basal regulatory domain of 5kb upstream and 1kb downstream of the TSS (regardless of other nearby genes). The gene regulatory domain was extended in both directions to the nearest gene’s basal domain but no more than a maximum of 1Mb extension in one direction. GO enrichment, p-values and FDR values were computed by GREAT.
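The association rule described above can be sketched in a few lines of Python. This is a minimal illustration and not GREAT's own code: the `Gene` class and the function names are hypothetical, all genes are assumed to lie on one chromosome, the 1 Mb cap is assumed to be measured from the TSS, and the chromosome end is not clamped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    tss: int      # transcription start site (bp)
    strand: str   # "+" or "-"


def basal_domain(gene, upstream=5_000, downstream=1_000):
    """Basal domain: 5 kb upstream and 1 kb downstream of the TSS, strand-aware."""
    if gene.strand == "+":
        return gene.tss - upstream, gene.tss + downstream
    return gene.tss - downstream, gene.tss + upstream


def regulatory_domains(genes, max_extension=1_000_000):
    """Extend each basal domain to the nearest neighbouring basal domain,
    but never further than max_extension from the TSS (assumption)."""
    ordered = sorted(genes, key=lambda g: g.tss)
    basal = [basal_domain(g) for g in ordered]
    n = len(ordered)

    # Furthest-right basal end among genes to the left of gene i.
    left_block = [None] * n
    running = None
    for i in range(n):
        left_block[i] = running
        running = basal[i][1] if running is None else max(running, basal[i][1])

    # Furthest-left basal start among genes to the right of gene i.
    right_block = [None] * n
    running = None
    for i in range(n - 1, -1, -1):
        right_block[i] = running
        running = basal[i][0] if running is None else min(running, basal[i][0])

    domains = {}
    for i, gene in enumerate(ordered):
        left_limit = gene.tss - max_extension
        if left_block[i] is not None:
            left_limit = max(left_limit, left_block[i])
        right_limit = gene.tss + max_extension
        if right_block[i] is not None:
            right_limit = min(right_limit, right_block[i])
        # The basal domain is always kept, regardless of neighbouring genes.
        start = max(0, min(basal[i][0], left_limit))
        end = max(basal[i][1], right_limit)
        domains[gene.name] = (start, end)
    return domains


genes = [Gene("A", 100_000, "+"), Gene("B", 200_000, "-"), Gene("C", 3_000_000, "+")]
domains = regulatory_domains(genes)
```

In this toy input, gene B is extended leftward only as far as A's basal domain, while its rightward extension stops at the 1 Mb cap because C lies further away. A TE copy would then be associated with every gene whose domain contains it.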
Figure 2. Tissue-specific enhancer signatures of LTR77 and LFSINE
LTR77 (a-d) and LFSINE (e-h) are specifically hypomethylated in blood samples and brain samples, respectively. (a) Boxplots of MeDIP-seq and MRE-seq enrichment scores of LTR77 in multiple cell/tissue types. (b) Histone modification signatures of LTR77 in CD8+ Naïve cells. (c) Comparison of H3K4me1 signal of LTR77 between fetal brain sample and CD8+ Naïve cells. (d) p300 binding signal on LTR77 in four cell lines. (e) Boxplots of MeDIP-seq and MRE-seq enrichment scores of LFSINE in multiple cell/tissue types. (f) Histone modification signatures of LFSINE in fetal brain sample. (g) Comparison of H3K4me1 signal of LFSINE between fetal brain sample and CD8+ Naïve cells. (h) p300 binding signal on LFSINE in four cell lines. For each histone modification and for p300 binding, signals were averaged in 5 bp tiling windows across every genomic copy of the TE family, including 3 kb upstream and downstream flanking regions. Error bars represent 1 standard deviation.
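The windowed averaging behind these profile plots can be illustrated with a short Python sketch. It is not the authors' pipeline: the function names are hypothetical, the input is assumed to be one per-base signal list per TE copy (already extracted with its flanks and aligned to a common length), and the sample standard deviation is used for the error bars.

```python
from statistics import mean, stdev


def tile_means(signal, window=5):
    """Mean signal in consecutive, non-overlapping windows of `window` bp."""
    return [
        mean(signal[i:i + window])
        for i in range(0, len(signal) - window + 1, window)
    ]


def aggregate_profile(per_copy_signals, window=5):
    """Average the tiled signal across TE copies, window by window.

    Returns a list of (mean, standard deviation) pairs, one per window.
    """
    tiled = [tile_means(signal, window) for signal in per_copy_signals]
    n_windows = min(len(t) for t in tiled)
    profile = []
    for w in range(n_windows):
        values = [t[w] for t in tiled]
        spread = stdev(values) if len(values) > 1 else 0.0
        profile.append((mean(values), spread))
    return profile


# Two toy "copies", each 10 bp long, giving two 5 bp windows.
copies = [
    [0.0] * 5 + [10.0] * 5,
    [2.0] * 5 + [20.0] * 5,
]
profile = aggregate_profile(copies)
```

For a real element with 3 kb flanks on each side, each list would span the element plus 6,000 bp, and the resulting means and standard deviations would be plotted against window position.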
Figure 3. Tissue-specific hypomethylated TEs correlate with gene expression
(a) Genome Browser view of an LTR77 element upstream of the ERAP1 gene. Displayed tracks include: DNA methylation (MeDIP-seq) for human ESC H1, breast, brain and blood samples; histone modification (H3K4me1 and H3K4me3) tracks for a CD8+ naïve cell sample and a fetal brain sample; transcription factor binding tracks (ENCODE) for NFkB, Pol2, and TCF12 in three cell lines; gene annotation and RepeatMasker. (b) Bisulfite sequencing validation of DNA methylation status of the LTR77 element (5 CpG sites) in human ESC H1, breast, brain and blood samples. Black circles represent methylated CpG sites and white circles represent unmethylated CpG sites. (c) Boxplots of expression levels of ERAP1 in 4 different tissues. (d) Genome Browser view of an LFSINE element upstream of the GFRA1 gene. Displayed tracks include: DNA methylation (MeDIP-seq) for human ESC H1, breast, brain, and blood samples; histone modification (H3K4me3 and H3K4me1) tracks for a fetal brain sample and a CD8+ naïve cell sample; gene annotation and RepeatMasker. (e) Bisulfite sequencing validation of DNA methylation status of the LFSINE element (4 CpG sites) in human ESC H1, breast, brain, and blood samples. (f) Boxplots of expression levels of GFRA1 in 4 different tissues.
Figure 4. Correlation between cell type-specific enhancer marks, binding of transcription factors, and sequence motifs
Histone modification, transcription factor binding, and sequence motif prediction data are displayed for individual genomic copies of LTR77 and LFSINE. Each row represents one element. Data were obtained from the UCSC ENCODE portal[33]. For H3K4me1 histone modification and p300 ChIP-seq data, RPKM values at 50 bp resolution were plotted for a 10 kb region centered on the TE copy. For transcription factor binding data, a red tick indicates that the TE copy overlaps a peak called from ChIP-seq data for the given transcription factor in the given cell type. For sequence motif data, each TE copy was scored using the position-specific weight matrix of the given transcription factor. A blue tick indicates the log-transformed e-value of observing the sequence motif by chance.
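Scoring a TE copy against a position-specific weight matrix can be sketched as follows. This is a generic log-odds scan, not the motif tool used in the study, and it does not reproduce the e-value calculation. The function names, the uniform background of 0.25, the pseudocount, and the toy three-column motif are all assumptions made for illustration.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def to_log_odds(freq_matrix, background=0.25, pseudocount=0.01):
    """Convert per-position base frequencies into log2 odds against a uniform background."""
    log_odds = []
    for column in freq_matrix:
        total = sum(column.get(b, 0.0) for b in BASES) + 4 * pseudocount
        log_odds.append({
            b: math.log2((column.get(b, 0.0) + pseudocount) / total / background)
            for b in BASES
        })
    return log_odds


def best_match(sequence, log_odds):
    """Return (score, strand, offset) of the highest-scoring window on either strand.

    Offsets on the "-" strand refer to the reverse-complemented sequence.
    Windows containing non-ACGT characters are skipped.
    """
    width = len(log_odds)
    forward = sequence.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]
    best = None
    for strand, seq in (("+", forward), ("-", reverse)):
        for offset in range(len(seq) - width + 1):
            window = seq[offset:offset + width]
            if any(base not in BASES for base in window):
                continue
            score = sum(col[base] for col, base in zip(log_odds, window))
            if best is None or score > best[0]:
                best = (score, strand, offset)
    return best


# Hypothetical motif that strongly prefers "ACG".
toy_motif = [
    {"A": 0.97, "C": 0.01, "G": 0.01, "T": 0.01},
    {"A": 0.01, "C": 0.97, "G": 0.01, "T": 0.01},
    {"A": 0.01, "C": 0.01, "G": 0.97, "T": 0.01},
]
hit = best_match("ttacgaa", to_log_odds(toy_motif))
```

A motif scanner would additionally convert such scores into p-values or e-values under a background model; that step is tool-specific and is left out here.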
| Experiment | Sample | GEO ID |
|---|---|---|
| MeDIP-seq | H1Es Batch1 | GSM543016 |
| | H1Es Batch2 | GSM456941 |
| | Breast Luminal Epithelial Cells RM066 | GSM613856 |
| | Breast Luminal Epithelial Cells RM070 | GSM613843 |
| | Breast Luminal Epithelial Cells RM071 | GSM613852 |
| | Breast MyoEpithelial Cells RM066 | GSM613857 |
| | Breast MyoEpithelial Cells RM070 | GSM613846 |
| | Breast MyoEpithelial Cells RM071 | GSM613850 |
| | Breast Stem Cells RM066 | GSM613859 |
| | Breast Stem Cells RM070 | GSM613847 |
| | Breast Stem Cells RM071 | GSM613853 |
| | CD4 Memory Primary Cells TC003 | GSM613862 |
| | CD4 Memory Primary Cells TC007 | GSM613914 |
| | CD4 Memory Primary Cells TC009 | GSM669608 |
| | CD4 Naive Primary Cells TC003 | GSM543025 |
| | CD4 Naive Primary Cells TC007 | GSM613913 |
| | CD4 Naive Primary Cells TC009 | GSM669607 |
| | CD8 Naive Primary Cells TC003 | GSM543027 |
| | CD8 Naive Primary Cells TC007 | GSM613917 |
| | CD8 Naive Primary Cells TC009 | GSM669609 |
| | Fetal Brain HuFNSC01 | GSM669614 |
| | Fetal Brain HuFNSC02 | GSM669615 |
| | Neurosphere Cultured Cells, Cortex Derived HuFNSC01 | GSM669610 |
| | Neurosphere Cultured Cells, Cortex Derived HuFNSC02 | GSM669612 |
| | Neurosphere Cultured Cells, Ganglionic Eminence Derived HuFNSC01 | GSM669611 |
| | Neurosphere Cultured Cells, Ganglionic Eminence Derived HuFNSC02 | GSM669613 |
| | Peripheral Blood Mononuclear Primary Cells TC03 | GSM543023 |
| | Peripheral Blood Mononuclear Primary Cells TC007 | GSM613911 |
| | Peripheral Blood Mononuclear Primary Cells TC009 | GSM669606 |
| MRE-seq | H1Es Batch1 | GSM428286 |
| | H1Es Batch2 | GSM450236 |
| | Breast Luminal Epithelial Cells RM066 | GSM613833 |
| | Breast Luminal Epithelial Cells RM070 | GSM613818 |
| | Breast Luminal Epithelial Cells RM071 | GSM613826 |
| | Breast MyoEpithelial Cells RM066 | GSM613834 |
| | Breast MyoEpithelial Cells RM070 | GSM613821 |
| | Breast MyoEpithelial Cells RM071 | GSM613908 |
| | Breast Stem Cells RM066 | GSM613837 |
| | Breast Stem Cells RM070 | GSM613907 |
| | Breast Stem Cells RM071 | GSM613829 |
| | CD4 Memory Primary Cells TC003 | GSM613842 |
| | CD4 Memory Primary Cells TC007 | GSM613903 |
| | CD4 Memory Primary Cells TC009 | GSM669599 |
| | CD4 Naive Primary Cells TC003 | GSM543011 |
| | CD4 Naive Primary Cells TC007 | GSM613901 |
| | CD4 Naive Primary Cells TC009 | GSM613920 |
| | CD8 Naive Primary Cells TC003 | GSM543013 |
| | CD8 Naive Primary Cells TC007 | GSM613905 |
| | CD8 Naive Primary Cells TC009 | GSM613923 |
| | Fetal Brain HuFNSC01 | GSM669604 |
| | Fetal Brain HuFNSC02 | GSM669605 |
| | Neurosphere Cultured Cells, Cortex Derived HuFNSC01 | GSM669600 |
| | Neurosphere Cultured Cells, Cortex Derived HuFNSC02 | GSM669602 |
| | Neurosphere Cultured Cells, Ganglionic Eminence Derived HuFNSC01 | GSM669601 |
| | Neurosphere Cultured Cells, Ganglionic Eminence Derived HuFNSC02 | GSM669603 |
| | Peripheral Blood Mononuclear Primary Cells TC03 | GSM543009 |
| | Peripheral Blood Mononuclear Primary Cells TC007 | GSM613898 |
| | Peripheral Blood Mononuclear Primary Cells TC009 | GSM613919 |
| Histone | CD8 Naive Primary Cells TC001 H3K4me1 | GSM613814 |
| | CD8 Naive Primary Cells TC001 H3K4me3 | GSM613811 |
| | CD8 Naive Primary Cells TC001 H3K36me3 | GSM669593 |
| | CD8 Naive Primary Cells TC001 H3K27me3 | GSM613815 |
| | CD8 Naive Primary Cells TC001 H3K9me3 | GSM613812 |
| | Fetal Brain HuFNSC01 H3K4me1 | GSM806942 |
| | Fetal Brain HuFNSC01 H3K4me3 | GSM806943 |
| | Fetal Brain HuFNSC01 H3K36me3 | GSM806946 |
| | Fetal Brain HuFNSC01 H3K27me3 | GSM806945 |
| | Fetal Brain HuFNSC01 H3K9me3 | GSM806944 |
| p300 | GM12878 rep1 | GSM803387 |
| | GM12878 rep2 | GSM803387 |
| | H1 | GSM803542 |
| | HepG2 | GSM803499 |
| | SK-N-SH RA rep1 | GSM803495 |
| | SK-N-SH RA rep2 | GSM803495 |
| mRNA-seq | Breast Luminal Epithelial Cells RM035 | GSM543029 |
| | Breast Luminal Epithelial Cells RM080 | GSM669620 |
| | Breast MyoEpithelial Cells RM035 | GSM543031 |
| | Breast MyoEpithelial Cells RM080 | GSM669621 |
| | CD4 Memory TC014 | GSM669618 |
| | CD4 Naïve TC014 | GSM669617 |
| | CD8 Naïve TC014 | GSM669619 |
| | Fetal Brain HuFNSC01 | GSM751274 |
| | Neurosphere Cultured Cells, Ganglionic Eminence Derived HuFNSC01 | GSM751271 |
| | Neurosphere Cultured Cells, Ganglionic Eminence Derived HuFNSC02 | GSM751273 |
| | H1ES | GSM484408 |
| TF ChIP-seq | RAD21 GM12878 Rep1 | GSM803416 |
| | RAD21 SK-N-SH RA Rep1 | GSM803497 |
| | YY1 GM12878 Rep | GSM803406 |
| | YY1 SK-N-SH RA Rep | GSM803498 |
| | NFKB GM12878 Rep1 | GSM935478 |