DNA hypomethylation within specific transposable element families associates with tissue-specific enhancer landscape.

Mingchao Xie, Chibo Hong, Bo Zhang, Rebecca F Lowdon, Xiaoyun Xing, Daofeng Li, Xin Zhou, Hyung Joo Lee, Cecile L Maire, Keith L Ligon, Philippe Gascard, Mahvash Sigaroudinia, Thea D Tlsty, Theresa Kadlecek, Arthur Weiss, Henriette O'Geen, Peggy J Farnham, Pamela A F Madden, Andrew J Mungall, Angela Tam, Baljit Kamoh, Stephanie Cho, Richard Moore, Martin Hirst, Marco A Marra, Joseph F Costello, Ting Wang.

Abstract

Transposable element (TE)-derived sequences comprise half of the human genome and DNA methylome and are presumed to be densely methylated and inactive. Examination of genome-wide DNA methylation status within 928 TE subfamilies in human embryonic and adult tissues identified unexpected tissue-specific and subfamily-specific hypomethylation signatures. Genes proximal to tissue-specific hypomethylated TE sequences were enriched for functions important for the relevant tissue type, and their expression correlated strongly with hypomethylation within the TEs. When hypomethylated, these TE sequences gained tissue-specific enhancer marks, including monomethylation of histone H3 at lysine 4 (H3K4me1) and occupancy by p300, and a majority exhibited enhancer activity in reporter gene assays. Many such TEs also harbored binding sites for transcription factors that are important for tissue-specific functions and showed evidence of evolutionary selection. These data suggest that sequences derived from TEs may be responsible for wiring tissue type-specific regulatory networks and may have acquired tissue-specific epigenetic regulation.

Year:  2013        PMID: 23708189      PMCID: PMC3695047          DOI: 10.1038/ng.2649

Source DB:  PubMed          Journal:  Nat Genet        ISSN: 1061-4036            Impact factor:   38.330


A large portion of eukaryotic genomes is derived from transposable elements (TEs)[1]. TEs have been described as parasitic or junk DNA. However, there is mounting evidence for their evolutionary contribution to the wiring of gene regulatory networks[2-7], a theory rooted in Barbara McClintock’s discovery that TEs can control gene expression[3,8,9]. TEs contain functional binding sites for transcription factors[6,10,11], yet TE DNA is presumed to be methylated in somatic cells to suppress transposition and TE-mediated changes in gene expression[12-14]. The extent to which DNA methylation silences TEs, and how such silencing is reconciled with the known regulatory function of TE sequences, remain unexplored.
To construct TE DNA methylation profiles, we assayed 29 human samples representing 11 cell types using two complementary DNA methylomics methods: MeDIP-seq and MRE-seq[15,16]. Tissue and cell types included embryonic stem cells (ESC H1); fetal brain tissue and primary neural progenitor cells (derived from cortex or ganglionic eminence regions); primary adult breast epithelial cells (luminal epithelial cells, myoepithelial cells, and a progenitor cell-enriched population); unfractionated peripheral blood mononuclear cells (PBMC); and adult immune cells including CD4+ naïve, CD4+ memory, and CD8+ naïve cells. Mapping short-read data to TEs is difficult because of the high copy number of these elements: standard mapping approaches often discard or misalign high-quality reads derived from TEs (Supplementary Note). We developed a computational strategy termed Repeat Analysis Pipeline (RAP) that allows mapping of reads derived from repetitive elements to one of 1,395 specific families of human repeats, including 928 TE families (Supplementary Fig. 1-5, Supplementary Note). RAP includes features of three previously published methods[17-20] combined with novel technical modifications (Methods).
As expected, sequences of the majority of TE families were methylated in all samples examined. The total MeDIP-seq signal, which represents the proportion of individual TE families that are methylated, correlated tightly with the total number of CpGs in that TE family, consistent with the high level of DNA methylation in TEs (R² = 0.95, Supplementary Fig. 6-9). In contrast to TE families, total MeDIP-seq signal was 4.9% in promoter CpG islands after normalizing for CpG content, consistent with the unmethylated status of promoter CpG islands. Conversely, MRE-seq signal, which measures unmethylated DNA, was 6.7-fold more enriched over promoter CpG islands than in TEs (Supplementary Fig. 6-9).
Strikingly, we found sequences of numerous TE families that were differentially methylated in specific cell types. Unsupervised clustering of samples based on TE methylation revealed a clear relationship among tissue types, indicating that TE methylation is a signature that can distinguish tissue types and possibly cell types (Fig. 1a,b). We identified 14 TE families with significant (p<0.05, ANOVA) hypomethylation patterns in brain samples, 55 in breast samples, 13 in blood samples, and 13 in ESC (95 TE families in total). More than 800 other families were consistently methylated across cell types in these 29 samples (Supplementary Note). Most tissue-specific hypomethylated TEs belonged to the ERV/LTR class (69/95), whereas 12 were DNA transposon families (Supplementary Table 1). These findings are consistent with previous studies showing that LTR elements participate in the regulation of mammalian genes[3,21-24], and support the hypothesis that LTRs might play a role in the epigenetic regulation of cell type-specific gene expression. For each TE family, we identified individual copies that were uniquely mappable and were tissue-specifically hypomethylated.
The complete list of TE families and coordinates of individual elements are provided at our website (Supplementary Note).
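The per-family ANOVA screen for tissue-specific methylation can be illustrated with a minimal sketch. It assumes one enrichment score per sample for a given TE family, with samples grouped by tissue. The function name and the example scores are hypothetical and are not taken from RAP or from the study data. Only the F statistic is computed here; converting it to a p-value requires the F distribution (for example via `scipy.stats.f_oneway`), which is omitted to keep the sketch dependency-free.

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of scores)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-group sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of samples around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical enrichment scores for one TE family in three tissue groups.
scores_by_tissue = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = anova_f(scores_by_tissue)
```

A large F value indicates that the family's methylation signal differs more between tissues than within them; which tissue is hypomethylated still has to be read from the group means.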
Figure 1

Clustering of TE families based on their DNA methylation profile reveals tissue specificity

TE families (rows) were clustered based on their MeDIP-seq (a) or MRE-seq (b) enrichment values across 29 samples (see Online Methods). The samples (columns) were clustered into four major groups, which were consistent with their tissue types: ESC H1 (gray), Brain (orange), Breast (blue), and Blood (purple). The vertical bar on the right side of the heat map represents TE classes: LTR (blue), DNA transposon (purple), SINE (orange), and LINE (black). The corresponding methylation enrichment values are represented as a horizontal color-gradient bar at the bottom of each panel.

We next investigated the genomic distribution of members of TE families showing tissue-specific hypomethylation. Their proximity to known genes did not differ from that expected by chance (Supplementary Fig. 10). However, genes near members of these TE families were significantly enriched for functions specific to the tissue type in which they were hypomethylated (Table 1 and Supplementary Table 2). For example, hypomethylation of the UCON29 DNA transposon was restricted to fetal brain, and 11 of the 60 genes with a nearby UCON29 element are involved in neuron development (p < 6.6×10⁻²³, binomial test). Another brain-specific hypomethylated retroelement, LFSINE, was located near 19 out of 87 genes involved in telencephalon development (p < 1.5×10⁻⁵, binomial test). Similarly, genes associated with LTR12 and LTR77, two ERVs hypomethylated in immune cells, were enriched for immune-related functions, including ‘antigen processing and presentation of peptide or polysaccharide antigen via MHC class II’ (p < 7.4×10⁻⁶, binomial test) and ‘oxidation reduction’ (p < 3.7×10⁻⁶, binomial test). While antigen processing and presentation is a known function of lymphocytes and other antigen-presenting hematopoietic cells, the enrichment of genes in the oxidation-reduction process was interesting because T-cell activation, differentiation and proliferation are sensitive to the redox potential[25,26].
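A binomial enrichment p-value of this kind can be sketched as follows. The sketch assumes the region-based formulation commonly used for such tests: `n` input regions (here, TE copies), a fraction `p` of the genome covered by the regulatory domains of genes carrying a given annotation, and `k` regions observed inside those domains. The function names and the example numbers are hypothetical and do not reproduce the values reported here.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more hits among n regions."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def fold_enrichment(k, n, p):
    """Observed hits divided by the number expected by chance (n * p)."""
    return k / (n * p)

# Hypothetical example: 100 TE copies, 2% of the genome annotated, 10 copies hit.
p_value = binomial_upper_tail(10, 100, 0.02)
fold = fold_enrichment(10, 100, 0.02)
```

The upper tail is used because the question is whether the annotation is hit more often than expected, not merely differently.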
Table 1

GO enrichment of genes associated with hypomethylated TEs.

| TE | GO Biological Process | P-value | FDR | Gene hits | Fold enrichment |
|---|---|---|---|---|---|
| LFSINE | Telencephalon development | 1.49E-05 | 2.74E-03 | 19/87 | 3.55 |
| LFSINE | Pallium development | 9.35E-05 | 1.24E-02 | 12/56 | 3.48 |
| LFSINE | Neuron migration | 1.50E-04 | 1.79E-02 | 16/69 | 3.77 |
| UCON29 | Generation of neurons | 6.6031E-23 | 3.6419E-20 | 11/656 | 4.9126 |
| UCON29 | Neuron differentiation | 3.3780E-22 | 1.4247E-19 | 10/500 | 5.8593 |
| UCON29 | Neuron recognition | 5.01E-5 | 4.49E-2 | 5/23 | 11.04 |
| LTR12 | Oxidation reduction | 3.73E-06 | 2.67E-02 | 17/647 | 2.24 |
| LTR12 | Antigen processing and presentation of peptide via MHC class II | 7.40E-06 | 2.65E-02 | 2/20 | 8.53 |
| LTR77 | Homophilic cell adhesion | 7.0555E-7 | 5.0588E-3 | 10/105 | 11.70 |
| LTR77 | Cell-cell adhesion | 4.5389E-6 | 1.6272E-2 | 12/266 | 5.55 |

Genomic coordinates of individual TE copies of the TE families were used as input for GREAT analysis[55]. Each gene was assigned a basal regulatory domain of 5kb upstream and 1kb downstream of the TSS (regardless of other nearby genes). The gene regulatory domain was extended in both directions to the nearest gene’s basal domain but no more than a maximum of 1Mb extension in one direction. GO enrichment, p-values and FDR values were computed by GREAT.
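The regulatory-domain rule described above can be sketched in a few lines. This is an illustration of the rule as stated in the text, not GREAT's own code. The `Gene` class and function names are hypothetical, all genes are assumed to lie on one chromosome, and the 1Mb cap is measured from the edge of the basal domain, which is an assumption.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Gene:
    name: str
    tss: int      # transcription start site coordinate
    strand: str   # "+" or "-"

def basal_domain(gene, upstream=5_000, downstream=1_000):
    """Basal domain: 5 kb upstream and 1 kb downstream of the TSS, strand-aware."""
    if gene.strand == "+":
        return gene.tss - upstream, gene.tss + downstream
    return gene.tss - downstream, gene.tss + upstream

def regulatory_domains(genes, max_extension=1_000_000):
    """Map gene name -> (start, end) for genes on a single chromosome."""
    basal = {g.name: basal_domain(g) for g in genes}
    domains = {}
    for gene in genes:
        start, end = basal[gene.name]
        # Basal-domain edges of the other genes on either side of this TSS.
        left_edges = [basal[o.name][1] for o in genes if o.tss < gene.tss]
        right_edges = [basal[o.name][0] for o in genes if o.tss > gene.tss]
        # Extend to the nearest neighbouring basal domain, capped at max_extension.
        left = max([start - max_extension] + left_edges)
        right = min([end + max_extension] + right_edges)
        # Never shrink the basal domain itself; clamp at the chromosome start.
        domains[gene.name] = (max(0, min(start, left)), max(end, right))
    return domains

# Hypothetical genes used only to exercise the rule.
example = regulatory_domains([
    Gene("A", 10_000, "+"), Gene("B", 50_000, "-"), Gene("C", 3_000_000, "+"),
])
```

A TE copy is then associated with every gene whose regulatory domain it overlaps, which is why a single element can contribute to more than one gene's annotation.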

DNA hypomethylation has been associated with distal regulatory regions[27]. We next asked if TE sequences with tissue-specific DNA hypomethylation possessed other tissue-specific epigenetic signatures. We generated histone modification data (H3K4me1, H3K4me3, H3K27me3, H3K36me3 and H3K9me3) from these same tissues, and collected p300 genome-wide locations from related tissues[28] (Fig. 2). Sequences within hypomethylated TE families displayed remarkably strong tissue-specific H3K4me1 signals. For example, LTR77, a TE of the ERV class, had the lowest methylated (MeDIP-seq) signal and the highest unmethylated (MRE-seq) signal in blood (Fig. 2a). When we applied RAP to H3K4me3 and H3K4me1 ChIP-seq data from the same samples, we found much stronger signals within the LTR77 family in T cells than in the three other cell and tissue types (Supplementary Fig. 11). Using data from CD8+ naïve cells, we identified a “histone signature” for all 148 LTR77 copies along with a 3kb region flanking the LTR (Fig. 2b,c). We observed a strong H3K4me1 peak over the LTR element itself, suggesting that at least some LTR77 elements carried this enhancer mark. The H3K4me3 peak detected 3kb downstream suggested nearby promoter activity, potentially from genes regulated by enhancers embedded in LTR77. LFSINE and UCON29 displayed H3K4me1 enrichment specifically in fetal brain (Fig. 2f,g, and Supplementary Fig. 12). Moreover, LFSINE and UCON29 both accumulated p300 binding signal in the neuroblastoma cell line SK-N-SH, but not in any of the non-neural cell lines examined, including ESC, HepG2, and GM12878 (Fig. 2h, Supplementary Fig. 12). Similarly, the T cell-specific hypomethylated TE LTR77 accumulated p300 binding signal in GM12878 (a lymphoblastoid cell line), but not in any other cell type (Fig. 2d). These results suggested that hypomethylated DNA sequences derived from TEs might serve as tissue-specific enhancers.
Figure 2

Tissue-specific enhancer signatures of LTR77 and LFSINE

LTR77 (a-d) and LFSINE (e-h) are specifically hypomethylated in blood samples and brain samples, respectively. (a) Boxplots of MeDIP-seq and MRE-seq enrichment scores of LTR77 in multiple cell/tissue types. (b) Histone modification signatures of LTR77 in CD8+ Naïve cells. (c) Comparison of H3K4me1 signal of LTR77 between fetal brain sample and CD8+ Naïve cells. (d) p300 binding signal on LTR77 in four cell lines. (e) Boxplots of MeDIP-seq and MRE-seq enrichment scores of LFSINE in multiple cell/tissue types. (f) Histone modification signatures of LFSINE in fetal brain sample. (g) Comparison of H3K4me1 signal of LFSINE between fetal brain sample and CD8+ Naïve cells. (h) p300 binding signal on LFSINE in four cell lines. Signals of different histone modification or p300 binding for each genomic copy of the TE family including 3kb upstream and downstream flanking regions were averaged in 5bp tiling windows. Error bar represents 1 standard deviation.
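The averaging used for these aggregate profiles can be sketched as below. The sketch assumes each TE copy has already been reduced to a per-base signal vector of identical length and orientation (element plus flanks); how copies of different lengths were aligned is not described here, so that step is left out. Function names and the toy signals are hypothetical.

```python
from statistics import mean, stdev

def tile(signal, window=5):
    """Average a per-base signal in consecutive, non-overlapping windows."""
    return [mean(signal[i:i + window])
            for i in range(0, len(signal) - window + 1, window)]

def aggregate_profile(copies, window=5):
    """Return (mean, standard deviation) per window across all TE copies."""
    tiled = [tile(c, window) for c in copies]
    columns = list(zip(*tiled))  # one tuple per window, one value per copy
    return [mean(col) for col in columns], [stdev(col) for col in columns]

# Two hypothetical copies, 10 bp each, giving two 5 bp windows.
copies = [[0] * 5 + [10] * 5, [2] * 5 + [20] * 5]
means, sds = aggregate_profile(copies)
```

The sample standard deviation is used, so at least two copies are needed per window.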

We next asked if any of these hypomethylated, enhancer-like sequences within TEs might contribute to tissue-specific gene expression. We selected candidate TEs that could be uniquely mapped using our data. As a proof of principle, we focused on two putative target genes: ERAP1, a gene involved in the generation of most HLA class I-binding peptides, and GFRA1, which encodes the glial cell line-derived neurotrophic factor (GDNF) family receptor alpha-1, a receptor for GDNF, a neurotrophic factor involved in the control of neuron survival and differentiation[29] (Fig. 3a,d). An LTR77 element was detected 2kb upstream of an ERAP1 alternative transcription start site. Our genome-wide data suggested that this element was hypomethylated in T cells, a prediction confirmed by locus-specific bisulfite sequencing (Fig. 3b). In addition to the enhancer-like signature, NF-kB and Pol2 ChIP-seq peaks were observed in a lymphoblastoid cell line (GM12878), but not in a non-lymphoblastoid cell line (HepG2). Consistently, ERAP1 exhibited the highest expression in T cells (Fig. 3c). This LTR77 element exhibited modest enhancer activity in 293T, SK-N-SH, and GM12878 cells in reporter assays (Supplementary Fig. 13, LTR77-1). In the brain samples, GFRA1 appeared as a putative target of an LFSINE element (Fig. 3d). We observed tissue-specific H3K4me1 marks and an H3K4me3 mark in the promoter region in fetal brain, but not in T cells (Fig. 3d). Transcription factor binding motifs, such as that for SOX10, a regulator of neural crest and glial cell development[30,31], were identified in the hypomethylated LFSINE element upstream of GFRA1. Consistent with the hypothesis that LFSINE is a tissue-specific enhancer, GFRA1 was highly and specifically expressed in neuronal cells (Fig. 3f). This element exhibited enhancer activity in 293T and SK-N-SH cells but not in GM12878 (Supplementary Fig. 13, LFSINE-1).
Hypomethylation of these TEs did not appear to be a result of increased expression of nearby genes, since hypomethylation was not observed for other TE families in the same genomic neighborhood (Fig. 3a,d). Additional members of the LTR77, LTR12, UCON29 and LFSINE subfamilies were validated and shown to exhibit tissue-specific hypomethylation and to associate with nearby tissue-specific gene expression (Supplementary Fig. 14, 15). Of the 36 TE-derived candidates tested in reporter gene assays, 26 showed enhancer activity, ranging from a 5-fold to a 1000-fold increase in at least one of the three cell lines tested (Supplementary Fig. 13). These hypomethylated TE sequences have not been previously annotated as functional elements, but our results suggest that they may influence tissue-specific gene expression.
Figure 3

Tissue-specific hypomethylated TEs correlate with gene expression

(a) Genome Browser view of an LTR77 element upstream of the ERAP1 gene. Displayed tracks include: DNA methylation (MeDIP-seq) for human ESC H1, breast, brain and blood samples; histone modification (H3K4me1 and H3K4me3) tracks for a CD8 naïve sample and a fetal brain cell sample; transcription factor binding tracks (ENCODE) for NFkB, Pol2, and TCF12 in three cell lines; gene annotation and RepeatMasker. (b) Bisulfite sequencing validation of DNA methylation status of the LTR77 element (5 CpG sites) in human ESC H1, breast, brain and blood samples. Black circle represents methylated CpG sites and white circle represents unmethylated CpG sites. (c) Boxplots of expression levels of ERAP1 in 4 different tissues. (d) Genome Browser view of an LFSINE element upstream of the GFRA1 gene. Displayed tracks include: DNA methylation (MeDIP-seq) for human ESC H1, breast, brain, and blood samples; histone modification (H3K4me3 and H3K4me1) tracks for a fetal brain sample and a CD8+ naïve cell sample; gene annotation and RepeatMasker. (e) Bisulfite sequencing validation of DNA methylation status of the LFSINE element (4 CpG sites) in human ESC H1, breast, brain, and blood samples. (f) Boxplots of expression levels of GFRA1 in 4 different tissues.

We next examined the relationship between sequences of TEs, their epigenetic status, and transcription factor binding. We analyzed histone modification and transcription factor binding data for two cell lines (GM12878 and SK-N-SH) published by ENCODE[32,33]. We focused on individual copies of two TE families that exhibited tissue-specific hypomethylation in either blood (LTR77) or fetal brain (LFSINE). Consistent with our previous findings, members of these two TE families were enriched for enhancer marks in a cell type-specific manner (Fig. 4): LTR77 exhibited the H3K4me1 mark and p300 binding in GM12878, but not in SK-N-SH; LFSINE exhibited p300 binding in SK-N-SH, but was not enriched for H3K4me1 or p300 signal in GM12878. Binding sites of several transcription factors were enriched in LTR77 and LFSINE and showed cell type specificity (Fig. 4). For example, NF-kB binding overlapped specifically with LTR77 in GM12878; Rad21 bound within LFSINE more than within LTR77; and Rad21 bound within LFSINE more in SK-N-SH than in GM12878 (Fig. 4). Not surprisingly, many TEs were predicted to contain a sequence motif when scanned using position-specific weight matrices of transcription factors (Fig. 4). However, having a motif was neither necessary nor sufficient for actual binding, which correlated strongly with the cell type-specific enhancer mark. Taken together, ENCODE data confirmed that sequences of specific TE families exhibited cell type-specific enhancer signatures and cell type-specific transcription factor binding. Whether there is a causal relationship between the TEs’ epigenetic marks and transcription factor binding awaits further investigation.
Figure 4

Correlation between cell type-specific enhancer marks, binding of transcription factors, and sequence motifs

Histone modification, transcription factor binding, and sequence motif prediction data are displayed for individual genomic copies of LTR77 and LFSINE. Each row represents one element. Data were obtained from the UCSC ENCODE portal[33]. For H3K4me1 histone modification and p300 ChIP-seq data, RPKM values at 50bp resolution were plotted for a 10kb region centered on the TE copy. For transcription factor binding data, a red tick indicates that the TE copy overlaps with a peak predicted using ChIP-seq data of the given transcription factor in the given cell type. For sequence motif data, each TE copy was scored using the position-specific weight matrix of the given transcription factor. A blue tick indicates the log-transformed e-value of observing a sequence motif by chance.
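Scoring a TE copy against a position-specific weight matrix can be sketched as follows. This is a generic log-odds scan, not the scoring tool used for the figure: it searches the forward strand only, assumes an uppercase A/C/G/T sequence and a uniform background, and returns a raw score rather than the log-transformed e-value shown in the figure. The count matrix and sequence are hypothetical.

```python
from math import log2

def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log-odds scores against the background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({base: log2((column[base] + pseudocount) / total / background)
                    for base in "ACGT"})
    return pwm

def best_hit(sequence, pwm):
    """Return (score, start) of the highest-scoring window on the forward strand."""
    width = len(pwm)
    return max((sum(pwm[j][sequence[i + j]] for j in range(width)), i)
               for i in range(len(sequence) - width + 1))

# Hypothetical 3 bp motif that strongly prefers "ACG".
counts = [
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 8, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 8, "T": 0},
]
score, start = best_hit("TTACGTT", pwm_from_counts(counts))
```

A complete scan would also score the reverse complement and convert scores to a significance value, which is where the distinction between a predicted motif and actual binding arises.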

For decades, TEs have been regarded as parasitic DNA because of the impact of their transposition on the genome[34,35]. Transposition of TEs may be deleterious when it disrupts coding sequences or normal gene expression, resulting in human diseases[36-38]. Thus, it is believed that cells have acquired epigenetic mechanisms to cope with TEs so that transposon-derived sequences are completely methylated and transcriptionally silent in somatic tissues[14,39]. However, TE transpositions might provide diverse genetic material for natural selection, which would contribute to the evolution of species-specific traits and population biodiversity[40,41]. Many functional elements were born by “exaptation”, a process in which the DNA of a transposon is co-opted to benefit the host[42-44]. TE insertions with regulatory functions have been described in mammals[4,5,7,45]. A substantial proportion of constrained non-coding sequences arose from TEs[46,47], pointing to transposons as a driving force in the evolution of regulatory networks. Some hypomethylated TE subfamilies identified here were conserved based on their PhastCons and PhyloP scores, suggesting that this conservation might be a consequence of selection (Supplementary Fig. 16, 17). While we do not know how many TEs could have regulatory functions, previous reports indicate that 5% of TEs are under evolutionary constraint[46,47]. TE sequences were incorporated into gene networks under the control of transcription factors including TP53[6], OCT4[4,7] and CTCF[48], and MER20 was reported to have contributed to the origin of pregnancy in placental mammals[5]. TE-derived sequences can also directly regulate expression. For example, ISL1 is regulated by a SINE element[49], and so is FGF8 in the forebrain[50]. In both cases, TEs provide distal enhancers that help control expression of host genes, and their hypomethylation status in brain cells was confirmed by our genome-wide data (Supplementary Fig. 14).
Our findings help to resolve the conflicting observations that TE sequences are globally suppressed by epigenetic mechanisms, including DNA methylation, but that they can mediate gene regulation in some instances. In this study, we challenge the general notion that TEs are constitutively methylated by examining the extent to which TE methylation differs between cell types and the relationship between epigenetic silencing and the potential of TE sequences to impact gene regulation. Epigenetic control of TEs may contribute to developmental stage-specific, cell type-specific, and perhaps health condition-specific gene regulation. Distal regulatory regions are methylated at low levels, display enhancer chromatin marks, and are occupied by cell type-specific transcription factors[27]. Our results suggest that some TE sequences match this profile of distal enhancers. With a few exceptions[51,52], the majority of human TEs are fixed and no longer active. Sequences within these TEs, however, could be adapted to serve as enhancers, and these sequences might be the reason for their epigenetic regulation. The mechanisms through which DNA within TEs is demethylated and obtains enhancer chromatin marks, and the relationship between TE-derived enhancers and other regulatory elements, remain to be elucidated. A recent report demonstrated that transposons on a human chromosome acquired activating histone modifications and changed DNA methylation status in mouse cells[53]. In rodents, some endogenous retroviruses function as species-specific enhancers in the placenta[54]. Therefore, as a source of new regulatory elements, TEs’ regulatory potential could be controlled by tissue- or cell type-specific epigenetic regulation. In our study, examination of DNA methylation in four distinct tissue types showed that while sequences of many TE families are globally hypermethylated, about 10% of TE families are hypomethylated in a tissue-specific manner and gain distal enhancer signatures.
Analysis of a more extensive panel of tissues may reveal that a much larger portion of TE-derived sequences harbors gene regulatory function.

Online Methods

Further details for computational analyses are provided in the Supplementary Note.

1. Sample preparation

Blood

Buffy coats were obtained from the Stanford Blood Center (Palo Alto, CA). Blood was drawn and processed on the same day. Peripheral blood mononuclear cells (PBMC) were isolated by Histopaque 1077 (Sigma-Aldrich, St. Louis, MO) density gradient centrifugation according to the manufacturer’s protocol. Further purification of CD4 memory, CD4 naïve, and CD8 naïve T lymphocytes was performed using a Robosep instrument and isolation kits for each subpopulation as listed below (STEMCELL Technologies, Vancouver, BC, Canada). Total PBMC were karyotyped (Molecular Diagnostic Services Inc., San Diego, CA) and analyzed for cell cycle. PBMC and T cell subpopulations were stained with antibodies and analyzed by FACS for purity. Cells were aliquoted for DNA and RNA samples, and were washed in PBS. Cell pellets for RNA samples were resuspended in 1 ml TRIzol reagent (Invitrogen, Carlsbad, CA), and frozen at −80°C. Cell pellets for DNA samples were flash frozen in liquid nitrogen and stored at −80°C. Reagents and antibodies: Anti-CD3 TRI-COLOR (Invitrogen); Anti-CD4 PE (BD Biosciences); Anti-CD8 FITC (BD Biosciences); Anti-CD4 TRI-COLOR (Invitrogen); Anti-CD45RO PE (Invitrogen); Anti-CD45RA FITC (BD Biosciences); Anti-CD8 TRI-COLOR (Invitrogen); EasySep® Human Memory CD4 T Cell Enrichment Kit, EasySep® Human Naive CD4+ T Cell Enrichment Kit, and Custom Human Naïve CD8 T cell Enrichment Kit (STEMCELL Technologies).

Breast

Breast tissues were obtained from disease-free pre-menopausal women undergoing reduction mammoplasty in accordance with institutionally approved IRB protocol # 10-01563 (previously CHR # 8759-34462-01). All tissues were obtained as de-identified samples and linked only with a minimal dataset (age, ethnicity and in some cases parity/gravidity). Tissue was dissociated mechanically and enzymatically, as previously described[56]. Briefly, tissue was minced and dissociated in RPMI 1640 with L-glutamine and 25 mM HEPES (Fisher, cat # MT10041CV) supplemented with 10% fetal bovine serum (JR Scientific, Inc, cat # 43603), 100 units/ml penicillin, 100μg/ml streptomycin sulfate, 0.25μg/ml fungizone, gentamycin (Lonza, cat # CC4081G), 200U/ml collagenase 2 (Worthington, cat # CLS-2) and 100U/ml hyaluronidase (Sigma-Aldrich, cat # H3506-SG) at 37°C for 16h. The cell suspension was centrifuged at 1,400rpm for 10min followed by a wash with RPMI 1640/10% FBS. Clusters enriched in epithelial cells (referred to as organoids) were recovered after serial filtration through a 150-μm nylon mesh (Fisher, cat # NC9445658) and a 40-μm nylon mesh (Fisher, cat # NC9860187). The final filtrate contained primarily mammary stromal cells (fibroblasts, immune cells and endothelial cells) and some single epithelial cells. Following centrifugation at 1,200rpm for 5min, the epithelial organoids and filtrate were frozen for long-term storage. On the day of cell sorting, epithelial organoids were thawed and further digested with 0.5g/L 0.05% trypsin-EDTA and dispase-DNAse I (STEMCELL Technologies, cats # 7913 and # 7900, respectively). Generation of single cell suspensions was monitored visually. Single cell suspensions were filtered through a 40-μm cell strainer (Fisher, cat # 087711), spun down and allowed to “regenerate” in MEGM medium (Lonza) supplemented with 2% fetal calf serum for 60-90min at 37°C.
This “regeneration” step enables quenching of trypsin and re-expression of cell surface markers prior to staining, because their extracellular domains had been cleaved by trypsin. The single cell suspension obtained as described above was stained for cell sorting with three human-specific primary antibodies, anti-CD10 labeled with PE-Cy7 (BD Biosciences, cat # 341092) to isolate myoepithelial cells, anti-CD227/MUC1 labeled with FITC (BD Biosciences, cat # 559774) to isolate luminal epithelial cells or anti-CD73 labeled with PE (BD Biosciences, cat # 550257) to isolate a stem cell-enriched cell population, and with biotinylated antibodies for lineage markers, anti-CD2, CD3, CD16, CD64 (BD Biosciences, cat # 555325, 555338, 555405 and 555526), CD31 (Invitrogen, cat # MHCD3115), CD45, CD140b (BioLegend, cat #s 304003 and 323604) to specifically remove hematopoietic, endothelial and leukocyte lineage cells, respectively, by negative selection. Sequential incubation with primary antibodies was performed for 20min at room temperature in PBS with 1% bovine serum albumin (BSA), followed by washing in PBS with 1% BSA. Biotinylated primary antibodies were revealed with an anti-human secondary antibody labeled with streptavidin-Pacific Blue conjugate (Invitrogen, cat # S11222). After incubation, cells were washed once in PBS with 1% BSA and cell sorting was performed using a FACSAria II cell sorter (BD Biosciences).

Fetal Brain

Post-mortem human fetal neural tissues were obtained from a case of twin non-syndromic fetuses whose death was attributed to environmental/placental etiology. Tissues were obtained with appropriate patient consent according to Partner’s Healthcare/Brigham and Women’s Hospital IRB guidelines (Protocol #2010P001144). All samples and tissues were de-identified and linked only with a minimal dataset (age, gender, brain location). Fetal brain tissue and fetal neural progenitor cells were derived from manually dissected regions of the brain (telencephalon), specifically the neocortex (pallium; GSM669614, GSM669615, GSM669610, GSM669612) and ganglionic eminences (subpallium; GSM669611, GSM669613). The tissues were minced and dissociated by a combination of mechanical agitation (gentleMACS device) and enzymatic treatment with papain according to the manufacturer’s protocol (Miltenyi Biotec, Neural tissue dissociation kit #130-092-628). Cell suspensions were then washed twice in DMEM and plated at low density in human NeuroCult NS-A media (STEMCELL Technologies # 05751) supplemented with heparin, EGF (20ng/ml) and FGF (10ng/ml) in ultra-low attachment cell culture flasks (Corning #3814).

ESC H1

Data were obtained from a previous publication[15].

2. High-throughput sequencing assays

All assays were performed as part of the NIH Roadmap Epigenomics Mapping Centers’ repository for human reference epigenome atlas[57]. Experiments were performed under the guidelines of Roadmap Epigenomics project (http://www.roadmapepigenomics.org/protocols). Specifically, MeDIP-seq and MRE-seq were performed as previously described[16]. ChIP-seq was performed as described in [58]. All data have been submitted to NCBI (Supplementary Table 3).

3. Bisulfite validation

Total genomic DNA underwent bisulfite conversion following an established protocol[59], modified as follows: 95 °C for 1 min and 50 °C for 59 min, for a total of 16 cycles. Regions of interest were amplified with PCR primers (see below) and were subsequently cloned using pCR2.1/TOPO (Invitrogen). Individual bacterial colonies were subjected to PCR using vector-specific primers and sequenced using an ABI 3700 automated DNA sequencer. The data were analyzed with the online software BISMA[60]. Results are summarized in Supplementary Fig. 13. Genomic locations of candidates and primer information are summarized in Supplementary Table 4.
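The logic behind reading methylation from bisulfite clones can be sketched as follows. This is not the BISMA implementation: it assumes each clone sequence is already aligned to the unconverted reference without gaps, and it treats a retained C at a CpG as methylated and a T as unmethylated. Function names and sequences are hypothetical.

```python
def cpg_positions(reference):
    """Indices of the cytosine in every CpG dinucleotide of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]

def methylation_calls(reference, clones):
    """Per clone, call each CpG: True (methylated), False (unmethylated), None (unclear)."""
    sites = cpg_positions(reference)
    calls = []
    for clone in clones:
        row = []
        for i in sites:
            # Bisulfite converts unmethylated C to U (read as T); methylated C is retained.
            row.append(True if clone[i] == "C" else False if clone[i] == "T" else None)
        calls.append(row)
    return sites, calls

def site_fractions(calls):
    """Fraction of informative clones methylated at each CpG site."""
    fractions = []
    for column in zip(*calls):
        informative = [c for c in column if c is not None]
        fractions.append(sum(informative) / len(informative) if informative else None)
    return fractions

# Hypothetical reference with two CpG sites and two sequenced clones.
sites, calls = methylation_calls("ACGTTCGA", ["ACGTTTGA", "ATGTTTGA"])
fractions = site_fractions(calls)
```

Each row of `calls` corresponds to one row of filled and open circles in a bisulfite validation plot.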

4. Reporter gene assay

TE candidates were amplified from genomic DNA using Pfu polymerase (Agilent) and primers containing KpnI or BglII restriction sites. PCR products were gel-purified using the Qiagen Gel purification kit, and then digested with the corresponding restriction enzymes (NEB). The digested PCR products were cloned into the pGL4.23[luc2/minP] vector (Promega, E8411) using T4 ligase (NEB) and transformed into chemically competent DH5α cells. Positive clones were verified by enzyme digestion and sequencing. 800 ng of reporter plasmid (or empty pGL4.23[luc2/minP] vector control) was transfected in triplicate, using X-tremeGENE (Roche), into three cell lines: 293T, GM12878, and SK-N-SH_RA (SK-N-SH cells differentiated with 6 μM retinoic acid for 48 hours). To normalize for transfection efficiency, 200 ng of renilla luciferase plasmid driven by a TK promoter was co-transfected. Luciferase activity was measured after 48 hours and normalized to the renilla control. Genomic locations of candidates and primer information are summarized in Supplementary Table 5.
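The normalization described above can be sketched as a short calculation. The sketch assumes that enhancer activity is expressed as the mean firefly/renilla ratio of a construct divided by the same ratio for the empty vector in the same cell line; that fold-over-empty-vector convention, the function names, and the readings are assumptions for illustration only.

```python
from statistics import mean

def relative_activity(firefly, renilla):
    """Firefly signal divided by the co-transfected renilla signal, per replicate well."""
    return [f / r for f, r in zip(firefly, renilla)]

def fold_over_empty(construct, empty_vector):
    """Mean normalized activity of a construct relative to the empty vector.

    Each argument is a (firefly_readings, renilla_readings) pair of replicate lists.
    """
    return mean(relative_activity(*construct)) / mean(relative_activity(*empty_vector))

# Hypothetical triplicate readings in arbitrary luminescence units.
te_construct = ([1000.0, 1200.0, 1100.0], [10.0, 10.0, 10.0])
empty_control = ([50.0, 50.0, 50.0], [10.0, 10.0, 10.0])
fold = fold_over_empty(te_construct, empty_control)
```

Dividing by renilla first removes well-to-well differences in transfection efficiency before constructs are compared.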
| Experiment | Sample | GEO ID |
|---|---|---|
| MeDIP-seq | H1Es Batch1 | GSM543016 |
| | H1Es Batch2 | GSM456941 |
| | Breast Luminal Epithelial Cells RM066 | GSM613856 |
| | Breast Luminal Epithelial Cells RM070 | GSM613843 |
| | Breast Luminal Epithelial Cells RM071 | GSM613852 |
| | Breast MyoEpithelial Cells RM066 | GSM613857 |
| | Breast MyoEpithelial Cells RM070 | GSM613846 |
| | Breast MyoEpithelial Cells RM071 | GSM613850 |
| | Breast Stem Cells RM066 | GSM613859 |
| | Breast Stem Cells RM070 | GSM613847 |
| | Breast Stem Cells RM071 | GSM613853 |
| | CD4 Memory Primary Cells TC003 | GSM613862 |
| | CD4 Memory Primary Cells TC007 | GSM613914 |
| | CD4 Memory Primary Cells TC009 | GSM669608 |
| | CD4 Naive Primary Cells TC003 | GSM543025 |
| | CD4 Naive Primary Cells TC007 | GSM613913 |
| | CD4 Naive Primary Cells TC009 | GSM669607 |
| | CD8 Naive Primary Cells TC003 | GSM543027 |
| | CD8 Naive Primary Cells TC007 | GSM613917 |
| | CD8 Naive Primary Cells TC009 | GSM669609 |
| | Fetal Brain HuFNSC01 | GSM669614 |
| | Fetal Brain HuFNSC02 | GSM669615 |
| | Neurosphere Cultured Cells, Cortex Derived HuFNSC01 | GSM669610 |
| | Neurosphere Cultured Cells, Cortex Derived HuFNSC02 | GSM669612 |
| | Neurosphere Cultured Cells, Ganglionic Eminence Derived HuFNSC01 | GSM669611 |
| | Neurosphere Cultured Cells, Ganglionic Eminence Derived HuFNSC02 | GSM669613 |
| | Peripheral Blood Mononuclear Primary Cells TC03 | GSM543023 |
| | Peripheral Blood Mononuclear Primary Cells TC007 | GSM613911 |
| | Peripheral Blood Mononuclear Primary Cells TC009 | GSM669606 |
| MRE-seq | H1Es Batch1 | GSM428286 |
| | H1Es Batch2 | GSM450236 |
| | Breast Luminal Epithelial Cells RM066 | GSM613833 |
| | Breast Luminal Epithelial Cells RM070 | GSM613818 |
| | Breast Luminal Epithelial Cells RM071 | GSM613826 |
| | Breast MyoEpithelial Cells RM066 | GSM613834 |
| | Breast MyoEpithelial Cells RM070 | GSM613821 |
| | Breast MyoEpithelial Cells RM071 | GSM613908 |
| | Breast Stem Cells RM066 | GSM613837 |
| | Breast Stem Cells RM070 | GSM613907 |
| | Breast Stem Cells RM071 | GSM613829 |
| | CD4 Memory Primary Cells TC003 | GSM613842 |
| | CD4 Memory Primary Cells TC007 | GSM613903 |
| | CD4 Memory Primary Cells TC009 | GSM669599 |
| | CD4 Naive Primary Cells TC003 | GSM543011 |
| | CD4 Naive Primary Cells TC007 | GSM613901 |
| | CD4 Naive Primary Cells TC009 | GSM613920 |
| | CD8 Naive Primary Cells TC003 | GSM543013 |
| | CD8 Naive Primary Cells TC007 | GSM613905 |
| | CD8 Naive Primary Cells TC009 | GSM613923 |
| | Fetal Brain HuFNSC01 | GSM669604 |
| | Fetal Brain HuFNSC02 | GSM669605 |
| | Neurosphere Cultured Cells, Cortex Derived HuFNSC01 | GSM669600 |
| | Neurosphere Cultured Cells, Cortex Derived HuFNSC02 | GSM669602 |
| | Neurosphere Cultured Cells, Ganglionic Eminence Derived HuFNSC01 | GSM669601 |
| | Neurosphere Cultured Cells, Ganglionic Eminence Derived HuFNSC02 | GSM669603 |
| | Peripheral Blood Mononuclear Primary Cells TC03 | GSM543009 |
| | Peripheral Blood Mononuclear Primary Cells TC007 | GSM613898 |
| | Peripheral Blood Mononuclear Primary Cells TC009 | GSM613919 |
| Histone ChIP-seq | CD8 Naive Primary Cells TC001 H3K4me1 | GSM613814 |
| | CD8 Naive Primary Cells TC001 H3K4me3 | GSM613811 |
| | CD8 Naive Primary Cells TC001 H3K36me3 | GSM669593 |
| | CD8 Naive Primary Cells TC001 H3K27me3 | GSM613815 |
| | CD8 Naive Primary Cells TC001 H3K9me3 | GSM613812 |
| | Fetal Brain HuFNSC01 H3K4me1 | GSM806942 |
| | Fetal Brain HuFNSC01 H3K4me3 | GSM806943 |
| | Fetal Brain HuFNSC01 H3K36me3 | GSM806946 |
| | Fetal Brain HuFNSC01 H3K27me3 | GSM806945 |
| | Fetal Brain HuFNSC01 H3K9me3 | GSM806944 |
| p300 (ENCODE/HAIB) | GM12878 rep1 | GSM803387 |
| | GM12878 rep2 | GSM803387 |
| | H1 | GSM803542 |
| | HepG2 | GSM803499 |
| | SK-N-SH RA rep1 | GSM803495 |
| | SK-N-SH RA rep2 | GSM803495 |
| mRNA-seq | Breast Luminal Epithelial Cells RM035 | GSM543029 |
| | Breast Luminal Epithelial Cells RM080 | GSM669620 |
| | Breast MyoEpithelial Cells RM035 | GSM543031 |
| | Breast MyoEpithelial Cells RM080 | GSM669621 |
| | CD4 Memory TC014 | GSM669618 |
| | CD4 Naïve TC014 | GSM669617 |
| | CD8 Naïve TC014 | GSM669619 |
| | Fetal Brain HuFNSC01 | GSM751274 |
| | Neurosphere Cultured Cells, Ganglionic Eminence Derived HuFNSC01 | GSM751271 |
| | Neurosphere Cultured Cells, Ganglionic Eminence Derived HuFNSC02 | GSM751273 |
| | H1ES | GSM484408 |
| TF ChIP-seq (ENCODE) | RAD21 GM12878 Rep1 | GSM803416 |
| | RAD21 SK-N-SH RA Rep1 | GSM803497 |
| | YY1 GM12878 Rep | GSM803406 |
| | YY1 SK-N-SH RA Rep | GSM803498 |
| | NFKB GM12878 Rep1 | GSM935478 |