Quy Xiao Xuan Lin, Stephanie Sian, Omer An, Denis Thieffry, Sudhakar Jha, Touati Benoukraf.
Abstract
Several recent studies have portrayed DNA methylation as a new player in the recruitment of transcription factors (TFs) within chromatin, highlighting a need to connect TF binding sites (TFBSs) with their respective DNA methylation profiles. However, current TFBS databases are restricted to DNA binding motif sequences. Here, we present MethMotif, a two-dimensional TFBS database that records TFBS position weight matrices along with cell type specific CpG methylation information computed from a combination of ChIP-seq and whole genome bisulfite sequencing (WGBS) datasets. Integrating TFBS motifs with TFBS DNA methylation better portrays the features of DNA loci recognised by TFs. In particular, we found that DNA methylation patterns within TFBSs can be cell specific (e.g. MAFF). Furthermore, for a given TF, different DNA methylation profiles are associated with different DNA binding motifs (e.g. REST). To date, the MethMotif database records over 500 TFBSs computed from over 2000 ChIP-seq datasets in 11 different cell types. The MethMotif portal is accessible through an open source web interface (https://bioinfo-csi.nus.edu.sg/methmotif) that allows users to intuitively explore the entire dataset and perform both single and batch queries.
Year: 2019 PMID: 30380113 PMCID: PMC6323897 DOI: 10.1093/nar/gky1005
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Construction of the MethMotif database. Integrative analysis of ChIP-seq and WGBS datasets, downloaded from the ENCODE and GEO databases, allows the profiling of the DNA methylation landscapes surrounding transcription factor binding regions in various cell types (A). First, the methylation levels (beta scores or methylation scores) of CpGs located within all ChIP-ed protein peak regions are captured over the 200 bp surrounding peak summits. The distribution of the corresponding CpG methylation scores is then profiled in a heatmap for each cell type, to present the DNA methylation levels surrounding peak regions across all ChIP-ed proteins (a methylation score below 10% is defined as homogeneously unmethylated, while a methylation score above 90% is regarded as homogeneously hypermethylated). These heatmaps are accessible from the ‘Explore’ section of the MethMotif website (B). Finally, the direct binding motifs of sequence-specific TFs are identified. The DNA methylation level within each TFBS is captured and shown in a MethMotif logo. MethMotif logos are collected in the ‘MethMotif database’ section of the MethMotif website (C). The methylation levels within each logo can be displayed according to three states: (i) all methylation levels compiled, (ii) methylated only, and (iii) unmethylated only (D).
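The capture and thresholding steps described in the Figure 1 caption can be illustrated with a minimal Python sketch. It is not the MethMotif pipeline: the function names, the `(position, score)` input layout, and the "intermediate" label for scores between the two thresholds are hypothetical, and the 200 bp region is assumed here to mean 100 bp on either side of the summit. Only the 10% and 90% cut-offs come from the caption.

```python
from collections import Counter

def cpgs_near_summit(summit, cpgs, half_window=100):
    """Keep (position, score) pairs within half_window bp of the summit.

    Assumption: the 200 bp region is centred on the summit (+/- 100 bp).
    """
    return [(pos, score) for pos, score in cpgs if abs(pos - summit) <= half_window]

def methylation_state(score, low=0.10, high=0.90):
    """Label a beta score using the caption's 10% / 90% cut-offs."""
    if score < low:
        return "unmethylated"
    if score > high:
        return "hypermethylated"
    return "intermediate"  # hypothetical label; the caption does not name this class

def summarise_peak(summit, cpgs, half_window=100):
    """Count CpGs per methylation state around one peak summit."""
    nearby = cpgs_near_summit(summit, cpgs, half_window)
    return Counter(methylation_state(score) for _, score in nearby)

# Toy data: CpG positions on one chromosome with beta scores in [0, 1].
example_cpgs = [(950, 0.05), (1000, 0.95), (1080, 0.50), (1200, 0.99)]
example_counts = summarise_peak(1000, example_cpgs)
```

In this toy input the CpG at position 1200 lies outside the assumed window and is ignored; the remaining three CpGs fall into one class each.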
ChIP-seq datasets used in MethMotif database
| Cell ID | Organism | Cell type/tissue | Number of ChIP-seq experiments | Source | Download date |
|---|---|---|---|---|---|
| HeLa-S3 | Human | Cervix | 57 | ENCODE | 20 August 2017 |
| HEK293 | Human | Kidney | 199 | ENCODE | 12 March 2018 |
| IMR-90 | Human | Lung | 15 | ENCODE | 26 March 2018 |
| SK-N-SH | Human | Brain | 30 | ENCODE | 26 March 2018 |
| A549 | Human | Lung | 42 | ENCODE | 26 March 2018 |
| K562 | Human | Blood | 279 | ENCODE | 2 April 2018 |
| HepG2 | Human | Liver | 138 | 137-ENCODE, 1-GEO | 7 April 2018 |
| GM12878 | Human | Blood | 143 | ENCODE | 12 April 2018 |
| MCF-7 | Human | Breast | 99 | ENCODE | 13 April 2018 |
| H1-hESC | Human | Stem cell | 65 | ENCODE | 16 April 2018 |
| HCT116 | Human | Colorectum | 22 | ENCODE | 16 April 2018 |
WGBS datasets used in MethMotif database
| Cell ID | Organism | Cell type/tissue | Source ID | Release date |
|---|---|---|---|---|
| HeLa-S3 | Human | Cervix | GSM2175341 | 30 January 2017 |
| HEK293 | Human | Kidney | GSM1254259 | 11 December 2015 |
| IMR-90 | Human | Lung | ENCSR888FON | 31 July 2013 |
| SK-N-SH | Human | Brain | ENCSR145HNT | 13 December 2017 |
| A549 | Human | Lung | ENCSR481JIW | 4 December 2017 |
| K562 | Human | Blood | ENCSR765JPC | 22 March 2016 |
| HepG2 | Human | Liver | ENCSR881XOU | 13 October 2015 |
| GM12878 | Human | Blood | ENCSR890UQO | 23 February 2016 |
| MCF-7 | Human | Breast | GSM1328112 | 3 July 2014 |
| H1-hESC | Human | Stem cell | ENCSR617FKV | 13 October 2015 |
| HCT116 | Human | Colorectum | GSM3317488 | 10 August 2018 |
Figure 2. Workflow of MethMotif Batch Query. MethMotif Batch Query, available via the MethMotif website, allows users to study the occurrences of TFBSs along with their DNA methylation states in a given list of genomic loci. Users upload the coordinates of their regions of interest in BED format (A). The MethMotif database is then queried (B), and the presence of TFBSs in the given regions, together with their respective DNA methylation levels, is analysed using the MethMotif Enrichment Tool (C). If loci from the input regions overlap with any TFBS present in the MethMotif database, the Batch Query tool outputs these overlapping loci along with their respective TFBS DNA methylation information via a beeswarm boxplot (where each dot represents the methylation level of a CpG site within the TFBS) and a MethMotif logo generated de novo (D).
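The overlap step of this workflow can be sketched in Python as follows. This is an illustration only, not the MethMotif Enrichment Tool or any MethMotif API: the `tfbs_records` dictionaries, their field names, and the toy coordinates and methylation values are all hypothetical. The sketch assumes standard BED semantics (0-based start, exclusive end) and uses a simple pairwise comparison, which is adequate for small inputs.

```python
def parse_bed(text):
    """Parse the first three BED columns into (chrom, start, end) tuples."""
    regions = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        regions.append((chrom, int(start), int(end)))
    return regions

def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def batch_query(regions, tfbs_records):
    """Return one hit per (input region, overlapping TFBS) pair."""
    hits = []
    for region in regions:
        for site in tfbs_records:
            if overlaps(region, (site["chrom"], site["start"], site["end"])):
                hits.append({
                    "region": region,
                    "tf": site["tf"],
                    "cpg_methylation": site["cpg_methylation"],
                })
    return hits

# Hypothetical TFBS records with per-CpG beta scores (invented values).
tfbs_records = [
    {"tf": "REST", "chrom": "chr1", "start": 150, "end": 170, "cpg_methylation": [0.9, 0.8]},
    {"tf": "MAFF", "chrom": "chr1", "start": 200, "end": 220, "cpg_methylation": [0.1]},
]
bed_text = "chr1\t100\t200\nchr2\t50\t60\n"
hits = batch_query(parse_bed(bed_text), tfbs_records)
```

Because BED ends are exclusive, the region chr1:100-200 does not overlap the site starting at position 200, so only the first record is reported. For genome-scale inputs an interval tree or a sorted sweep would be a more suitable design than the nested loop.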
Figure 3. Distinct motifs and DNA methylation levels in motifs across cell types. Examples of MethMotif logos from the MethMotif database show that TF binding propensities can be altered in terms of motifs (A and B), DNA methylation levels inside the motifs (C), or even both (D), across different cell types.