Qiangwei Zhou1,2, Pengpeng Guan1,2, Zhixian Zhu1,2, Sheng Cheng1,2, Cong Zhou1,2, Huanhuan Wang1,2, Qian Xu1,2, Wing-Kin Sung2,3,4, Guoliang Li1,2.
Abstract
DNA methylation is known to be the most stable epigenetic modification and has been extensively studied in relation to cell differentiation, development, X chromosome inactivation and disease. Allele-specific DNA methylation (ASM) is a well-established mechanism for genomic imprinting and regulates imprinted gene expression. Previous studies have confirmed that certain special regions with ASM are susceptible and closely related to human carcinogenesis and plant development. In addition, recent studies have proven ASM to be an effective tumour marker. However, research on the functions of ASM in diseases and development is still extremely scarce. Here, we collected 4400 BS-Seq datasets and 1598 corresponding RNA-Seq datasets from 47 species, including human and mouse, to establish a comprehensive ASM database. We obtained the data on DNA methylation level, ASM and allele-specific expressed genes (ASEGs) and further analysed the ASM/ASEG distribution patterns of these species. In-depth ASM distribution analysis and differential methylation analysis conducted in nine cancer types showed results consistent with the reported changes in ASM in key tumour genes and revealed several potential ASM tumour-related genes. Finally, integrating these results, we constructed the first well-resourced and comprehensive ASM database for 47 species (ASMdb, www.dna-asmdb.com).
Year: 2022 PMID: 34664666 PMCID: PMC8728259 DOI: 10.1093/nar/gkab937
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Procedure used for ASMdb construction. The ASMdb database was constructed with MySQL and Django tools. BatMeth2 was used to map BS-Seq data, calculate the DNA methylation level and visualize the methylation patterns. MethHaplo was used to detect allele-specific DNA methylation. Hisat2 was used for RNA-Seq data mapping. ASEQ was used to detect allele-specific expressed genes. For annotation purposes, we used MethylSeekR to divide the genome into four categories of regions: unmethylated regions (UMRs), low-methylation regions (LMRs), partially methylated domains (PMDs) and highly methylated domains (HMDs) according to the methylation level.
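As a rough illustration of the two quantities this pipeline reports, the sketch below computes a methylation level as the fraction of reads showing a methylated cytosine, then compares two alleles with a two-sided Fisher's exact test. It is not the BatMeth2 or MethHaplo algorithm: the function names, the read-count inputs and the choice of Fisher's test are assumptions made purely for illustration.

```python
from math import comb


def methylation_level(methylated: int, unmethylated: int) -> float:
    """Fraction of reads supporting a methylated cytosine at one site."""
    total = methylated + unmethylated
    if total == 0:
        raise ValueError("site has no read coverage")
    return methylated / total


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def allele_methylation_test(m1: int, u1: int, m2: int, u2: int):
    """Return (level difference, p-value) for two alleles at one site.

    m*/u* are hypothetical methylated/unmethylated read counts per allele.
    """
    diff = methylation_level(m1, u1) - methylation_level(m2, u2)
    return diff, fisher_exact_two_sided(m1, u1, m2, u2)
```

A site where one allele is fully methylated and the other fully unmethylated gives a large level difference and a small p-value, whereas two alleles with identical counts give a p-value of 1.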
Statistics of BS-Seq and RNA-Seq datasets in ASMdb
| Species | BS-Seq projects | BS-Seq samples | BS-Seq categories | RNA-Seq projects | RNA-Seq samples | RNA-Seq categories |
|---|---|---|---|---|---|---|
| | 174 | 1484 | 417 | 41 | 758 | 105 |
| | 227 | 2026 | 681 | 55 | 575 | 162 |
| | 40 | 416 | 198 | 15 | 140 | 40 |
| | 2 | 48 | 11 | 1 | 3 | 3 |
| | 4 | 39 | 12 | 1 | 16 | 8 |
| | 1 | 38 | 8 | 1 | 16 | 8 |
| | 2 | 29 | 2 | 0 | 0 | 0 |
| | 4 | 23 | 14 | 2 | 10 | 5 |
| | 1 | 20 | 8 | 0 | 0 | 0 |
| | 2 | 20 | 7 | 2 | 11 | 5 |
| Others | 67 | 257 | 142 | 22 | 69 | 43 |
| Total | 524 | 4400 | 1500 | 140 | 1598 | 379 |
Note. Categories represent different tissues, stages, or conditions.
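The Total row can be checked against the individual rows with a few lines of Python. The numbers below are copied from the table in the order given; species names are omitted because they are not available in this text, and the variable names are arbitrary.

```python
# Columns: BS-Seq projects, samples, categories; RNA-Seq projects, samples, categories.
rows = [
    (174, 1484, 417, 41, 758, 105),
    (227, 2026, 681, 55, 575, 162),
    (40, 416, 198, 15, 140, 40),
    (2, 48, 11, 1, 3, 3),
    (4, 39, 12, 1, 16, 8),
    (1, 38, 8, 1, 16, 8),
    (2, 29, 2, 0, 0, 0),
    (4, 23, 14, 2, 10, 5),
    (1, 20, 8, 0, 0, 0),
    (2, 20, 7, 2, 11, 5),
    (67, 257, 142, 22, 69, 43),  # "Others" row
]
reported_total = (524, 4400, 1500, 140, 1598, 379)

# Sum each column across all rows and compare with the reported totals.
column_sums = tuple(sum(column) for column in zip(*rows))
print(column_sums == reported_total)  # prints True
```

The column sums reproduce the 4400 BS-Seq and 1598 RNA-Seq datasets quoted in the abstract.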
Figure 2. Overview of ASMdb. (A) Main species included in ASMdb. (B) Proportion of BS-Seq data from various species in ASMdb. (C) Proportion of BS-Seq data from each tissue in humans. (D) Main functional modules in ASMdb. (E) An example of a genome browser screenshot around the FOXD3 gene region in human liver tissue (chr1:63321858–63325268, 3.41 kb).
Figure 3. Allele-specific analysis. (A) The distribution of ASM on chromosomes and the list of ASM obtained from human neural progenitor cells. (B) The distribution of ASEGs on chromosomes and the list of ASEGs obtained from human neural progenitor cells. (C) The overlap between ASM and ASEGs. For each methylation dataset, we calculated the percentage of ASEGs, obtained from the corresponding RNA-Seq dataset, that overlap with the ASM obtained from that methylation dataset. For statistical credibility, we removed datasets with fewer than 500 ASM or ASEGs.
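The overlap statistic in panel (C) can be sketched as follows. The threshold of 500 comes from the caption; everything else is an assumption for illustration, including the function name, the input layout and the rule that an ASEG counts as overlapping when at least one ASM position falls within its gene coordinates.

```python
from bisect import bisect_left


def percent_asegs_with_asm(aseg_genes, asm_positions, min_count=500):
    """Percentage of ASEGs containing at least one ASM position.

    aseg_genes: list of (chrom, start, end) tuples (hypothetical layout).
    asm_positions: list of (chrom, pos) tuples (hypothetical layout).
    Returns None when either list is below min_count, mirroring the
    filtering step described in the caption.
    """
    if len(aseg_genes) < min_count or len(asm_positions) < min_count:
        return None

    # Group ASM positions by chromosome and sort them for binary search.
    by_chrom = {}
    for chrom, pos in asm_positions:
        by_chrom.setdefault(chrom, []).append(pos)
    for positions in by_chrom.values():
        positions.sort()

    hits = 0
    for chrom, start, end in aseg_genes:
        positions = by_chrom.get(chrom, [])
        # First ASM position at or after the gene start, if any.
        i = bisect_left(positions, start)
        if i < len(positions) and positions[i] <= end:
            hits += 1
    return 100.0 * hits / len(aseg_genes)
```

Sorting the positions once per chromosome keeps each gene lookup logarithmic, which matters when a dataset holds many thousands of ASM positions.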
Figure 4. Screenshots of representative functional modules in ASMdb. (A) The distribution of high-frequency ASMG on chromosomes in humans. (B) The distribution of high-frequency ASEG on chromosomes in humans. (C) An example of the average DNA methylation level profile across samples from humans. (D) DNA methylation profile around the ERBB2 gene across samples from humans. The red box highlights the DNA methylation level of primordial germ cells.
Figure 5. The ERBB2 gene was used as an example to show the representative functional modules in ASMdb. (A) The location information of ERBB2 and its expression level from the GEPIA2 database. (B) The DNA methylation levels of the ERBB2 gene in normal and cancer samples. The red box indicates the differential DNA methylation level around the promoter. (C) Differential DNA methylation genes detected between lung cancer and normal lung data. The red box indicates that ERBB2 was detected as a significantly differentially methylated gene in lung cancer.
Analysis of the corresponding disease types in ASMdb tissues
| Tissue | Disease type(s) | Tissue | Disease type(s) |
|---|---|---|---|
| Blood | ALL, AML3, CLL, Colon-cancer, Lung-cancer | Brain | Alzheimer, Cancer, Schizophrenia |
| Breast | Cancer | Colon | Cancer |
| Liver | Cancer | Lung | Cancer |
| Prostate | Cancer | Pancreas | Cancer |
Figure 6. Application examples of ASMdb. (A) The distribution of high-frequency ASM-related genes in liver cancer and normal data. (B) The distribution of high-frequency ASEG in lung cancer and normal data. (C) Genome browser screenshot of the KCNQ1 gene in human liver cancer and normal data. The green box highlights the differential DNA methylation levels and ASM between cancer and normal data. (D) Genome browser screenshot of the AVPR1A gene in human liver cancer and normal data. The green box highlights the differential DNA methylation levels and ASM between cancer and normal data.
| Abbreviation | Definition |
|---|---|
| ASEG | Allele-specific expressed gene |
| ASM | Allele-specific DNA methylation |
| ASMG | Allele-specific DNA methylation related gene |
| BS-Seq | Bisulfite sequencing |
| DCIS | Ductal carcinoma in situ |
| GEO | Gene Expression Omnibus |
| HMD | Highly methylated domain |
| ICR2 | Imprinting control region 2 |
| IGF2 | Insulin-like growth factor II |
| LMR | Lowly methylated region |
| PGC | Primordial germ cell |
| PMD | Partially methylated domain |
| UMR | Unmethylated region |