
EWASdb: epigenome-wide association study database.

Di Liu1,2, Linna Zhao1,2, Zhaoyang Wang1,2, Xu Zhou1,2, Xiuzhao Fan1,2, Yong Li2, Jing Xu1,2, Simeng Hu1,2, Miaomiao Niu1,2, Xiuling Song1,2, Ying Li1,2, Lijiao Zuo1,2, Changgui Lei1,2, Meng Zhang2,3, Guoping Tang4, Min Huang2,3, Nan Zhang1,2, Lian Duan1, Hongchao Lv1, Mingming Zhang1, Jin Li1, Liangde Xu1,2, Fanwu Kong5, Rennan Feng2,3, Yongshuai Jiang1,2.   

Abstract

DNA methylation, the most intensively studied epigenetic modification, plays an important role in understanding the molecular basis of diseases. Furthermore, epigenome-wide association study (EWAS) provides a systematic approach to identify epigenetic variants underlying common diseases/phenotypes. However, there is no comprehensive database to archive the results of EWASs. To fill this gap, we developed the EWASdb, which is a part of 'The EWAS Project', to store the epigenetic association results of DNA methylation from EWASs. In its current version (v 1.0, up to July 2018), the EWASdb has curated 1319 EWASs associated with 302 diseases/phenotypes. There are three types of EWAS results curated in this database: (i) EWAS for single marker; (ii) EWAS for KEGG pathway and (iii) EWAS for GO (Gene Ontology) category. As the first comprehensive EWAS database, EWASdb has been searched or downloaded by researchers from 43 countries to date. We believe that EWASdb will become a valuable resource and significantly contribute to the epigenetic research of diseases/phenotypes and have potential clinical applications. EWASdb is freely available at http://www.ewas.org.cn/ewasdb or http://www.bioapp.org/ewasdb.

Year:  2019        PMID: 30321400      PMCID: PMC6323898          DOI: 10.1093/nar/gky942

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

DNA methylation plays a critical role in regulating chromatin structure (1) and gene expression (2,3), and takes part in a variety of key biological processes including human development (4–7), aging (8,9), genomic imprinting (10,11) and the inactivation of tumor suppressor genes (12). Recently, a systematic method, the epigenome-wide association study (EWAS), has been developed to identify epigenetic variants associated with complex diseases (13–16) or phenotypes (17). EWAS provides an effective means to understand the molecular basis for disease risk. Meanwhile, the development of high-throughput technology, including the generation of Illumina HumanMethylation450 Bead Chip data, makes it possible to complete a large-scale EWAS scan (13,18). To date, many complex diseases/phenotypes, including body mass index (BMI), adiposity (19), autism spectrum disorder (ASD) (20) and cancer (13,21,22), have been successfully analyzed using EWASs, and progress has been made in identifying epigenetic risk variations for diseases/phenotypes. However, compared with the genome-wide association study (GWAS, which identifies common genetic variants associated with diseases/phenotypes), EWAS has lagged behind. To help close this gap, we proposed ‘The EWAS Project’ in 2015 to develop EWAS analysis tools and data resources. Previously, we designed the EWAS2.0 Java software to identify associations between DNA methylation levels and complex diseases (23–25). However, our investigation of EWAS data resources revealed few databases that archive epigenetic association results from EWASs. This has made it difficult for researchers to view associations between diseases/phenotypes and epigenetic markers. Here, as an important part of ‘The EWAS Project’, we developed the EWASdb (epigenome-wide association study database, http://www.ewas.org.cn/ewasdb or http://www.bioapp.org/ewasdb) to store and utilize the significant epi-markers identified in EWASs.
The EWASdb can be used to query DNA methylation markers, genes, KEGG pathways, and GO categories which have significant associations with some diseases or phenotypes. In addition, the EWASdb can help to reveal the mechanism of complex diseases at the epigenetic level.

DATABASE CONTENT

DNA methylation data

The Illumina HumanMethylation450 Bead Chip (450K) and Illumina Infinium MethylationEPIC Bead Chip (850K) methylation microarrays are the most widely used platforms for epigenome-wide association studies, covering 482 421 CpG sites and 864 935 CpG sites, respectively. To construct a comprehensive and practical database, we downloaded 907 DNA methylation 450K datasets and 64 DNA methylation 850K datasets, updated to the end of July 2018 and involving 78 180 samples in matrix format, from the GPL13534 and GPL21145 platforms of the GEO (Gene Expression Omnibus) database (26).

Classification of diseases/phenotypes

We employed two groups of independent experts to classify the downloaded EWAS datasets. In brief, one group of experts extracted effective information and classified the data based on the published literature corresponding to the EWAS datasets. The other group of experts was responsible for checking the obtained information to ensure the accuracy of the classification. Finally, we curated 1319 (Table 1) EWASs and divided them into seven classifications: ‘disease’, ‘trait’, ‘drug treatment’, ‘tissue’, ‘cell line’, ‘stem cell’, and ‘other’. Among the seven classifications, there are 302 sub-classifications, of which 165 are related to ‘complex disease’, 38 are related to ‘trait’, 22 are related to ‘drug treatment’, 19 are related to ‘tissue’, 31 are related to ‘cell line’, seven are related to ‘stem cell’ and 20 ambiguous sub-classifications were placed in the class of ‘other’ (Figure 1A). The number of the EWASs relevant to each of the seven classifications is shown in Figure 1B.
Table 1. The summary of the data curated in EWASdb

Data type                                          Data number
EWAS studies                                       1319
EWAS results for single epi-marker (P < 1e-7)      18 538 029
EWAS results for single epi-marker (P < 1e-3)      52 292 604
EWAS results for KEGG pathway (P < 1e-3)           49 967
EWAS results for GO categories (P < 1e-3)          930 609
EWAS results for BP GO categories (P < 1e-3)       686 552
EWAS results for MF GO categories (P < 1e-3)       141 979
EWAS results for CC GO categories (P < 1e-3)       102 078
Gene                                               27 918
Diseases/phenotypes                                302

Figure 1. General statistical situation of the diseases/phenotypes. (A) percentage of sub-classifications relevant to each of the seven classifications; (B) number of the EWASs relevant to each of the seven classifications.

Identification of disease/phenotype related epi-markers

For each EWAS, we identified the association between epigenetic variations and disease/phenotype using EWAS v2.0 software (25). Using a moderate P-value (less than 1.0 × 10⁻³), we obtained a total of 52 292 604 disease/phenotype related markers (Table 1). Meanwhile, using a strictly significant P-value level (less than 1.0 × 10⁻⁷), 18 538 029 CpG loci were obtained (Table 1). In addition, we mapped the gene and chromosomal locations of these significant markers based on the Illumina HumanMethylation450/850 Bead Chip annotation information.
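
As a rough illustration of the per-marker comparison and the two significance tiers described above, the sketch below computes a t-statistic for one CpG site and maps a P-value to a tier. It assumes Welch's unequal-variance t-statistic; the source does not state which test EWAS v2.0 applies, and the function names and beta-values are hypothetical. Converting the statistic to a P-value needs a t-distribution, which is left out here.

```python
from math import sqrt
from statistics import mean, variance

STRICT_P = 1.0e-7    # strict threshold quoted in the text
MODERATE_P = 1.0e-3  # moderate threshold quoted in the text


def welch_t(group_a, group_b):
    """Welch t-statistic for two groups of beta-values (an assumed choice of test)."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def significance_tier(p_value):
    """Return the most stringent tier a P-value satisfies, or None if not reported.

    Results in the strict tier also satisfy the moderate cutoff.
    """
    if p_value < STRICT_P:
        return "strict"
    if p_value < MODERATE_P:
        return "moderate"
    return None


# Hypothetical beta-values (fraction methylated, 0..1) at a single CpG site.
cases = [0.82, 0.79, 0.85, 0.81]
controls = [0.40, 0.45, 0.38, 0.43]
t_stat = welch_t(cases, controls)
```

The per-group means computed inside `welch_t` correspond to the 'Average beta-value in two groups' field that the database reports for each epi-marker.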

Identification of KEGG pathways

Kyoto Encyclopedia of Genes and Genomes (KEGG) consists of graphical diagrams of biochemical pathways, including metabolic pathways and some known regulatory pathways (27). Gene functions can be systematically analyzed in terms of the networks of genes and molecules. Hence, we performed KEGG pathway analysis of the significant genes to reveal the biochemical pathways they are involved in, and to understand the interactions between these genes. Susceptible biological pathways were identified by performing hypergeometric test analysis. We acquired 49 967 risk KEGG pathways at a significance level with P-values less than 1.0 × 10⁻³ (Table 1).
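
The hypergeometric test used for pathway enrichment can be sketched as follows. This is a minimal illustration, assuming the usual upper-tail formulation (the probability of seeing at least the observed number of significant genes in a pathway by chance). The function name, argument names and gene counts are hypothetical and are not taken from EWASdb.

```python
from math import comb


def hypergeom_upper_tail(total_genes, pathway_genes, significant_genes, overlap):
    """P(X >= overlap) when `significant_genes` genes are drawn without replacement
    from `total_genes`, of which `pathway_genes` belong to the pathway."""
    denom = comb(total_genes, significant_genes)
    upper = min(pathway_genes, significant_genes)
    numer = sum(
        comb(pathway_genes, i) * comb(total_genes - pathway_genes, significant_genes - i)
        for i in range(overlap, upper + 1)
    )
    return numer / denom


# Hypothetical counts: 20 000 annotated genes, a 100-gene pathway,
# 500 significant genes, 12 of which fall in the pathway.
p_value = hypergeom_upper_tail(20_000, 100, 500, 12)
is_enriched = p_value < 1.0e-3  # threshold quoted in the text
```

Exact integer binomial coefficients avoid the rounding problems of factorial-based floating-point formulas, at the cost of speed on very large gene sets.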

Identification of GO (Gene Ontology) Category

To further understand the functional characteristics of disease/phenotype susceptible epigenetic variations, we performed Gene Ontology (GO) annotation for each risk gene (28). We used the hypergeometric test method to screen related GO terms for risk CpG loci at a P-value less than 1.0 × 10⁻³. Finally, we obtained 930 609 disease/phenotype associated GO terms including 686 552 biological processes (BP), 141 979 molecular functions (MF) and 102 078 cellular components (CC) (Table 1).

DATABASE ORGANIZATION AND WEB INTERFACE

Database construction

The EWASdb was built in PHP, and the web interface was built using HTML and JavaScript. All data are stored in MySQL. The database has been tested using the Google Chrome, Firefox, and Internet Explorer web browsers.

Search interface

To make the querying convenient and effective for users, we provide three kinds of search interfaces: (i) EWAS for single marker; (ii) EWAS for KEGG Pathway and (iii) EWAS for GO Category.

EWAS for single marker

In this search module, EWASdb provides five different ways to search for detailed information on single DNA methylation markers associated with diseases/phenotypes: (i) Search by disease/phenotype: users can enter a disease/phenotype, like ‘glioma’, to get a list of all the EWASs related to this disease or phenotype; (ii) Search by EWAS ID: users can enter any item from EWAS1 to EWAS1319 to obtain all the significant loci in this EWAS; (iii) Search by gene: users can input their gene of interest, such as ‘TTTY18’ or ‘TMSB4Y’, to acquire the loci mapped to this gene from all the EWASs; (iv) Search by cg#: users can query by CpG locus to obtain the results for this CpG locus from every EWAS and (v) Search by region#: users can fill in the chromosome number and the chromosomal starting and ending positions in the search box to view the significant loci in this region from all of the EWASs. For each EWAS, users can obtain detailed information, such as ‘GSE ID’, ‘EWAS Title’, ‘Disease/Phenotype’, ‘Group name’, ‘Sample size’, ‘Summary of EWAS’, ‘Contributor’, ‘Public date’ and ‘Citation (PubMed ID)’. For each epi-marker, users can obtain detailed information, such as ‘EWAS ID’, ‘cg#’, ‘chr#’, ‘Position’, ‘Gene’, ‘Disease/Phenotype’, ‘Classification’, ‘Sample size’, ‘Average beta-value in two groups’, ‘T-statistics’ and ‘P-value’.
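
The ‘search by region’ option amounts to filtering marker records by chromosome and position range. The sketch below is a minimal illustration over an in-memory list, assuming inclusive start and end positions. The `Marker` type is hypothetical, with fields that loosely mirror the columns listed above, and the cg identifiers, coordinates, gene names and P-values are made up; the real database answers such queries from MySQL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    ewas_id: str
    cg_id: str
    chrom: str
    position: int
    gene: str
    p_value: float


def search_by_region(markers, chrom, start, end):
    """Return markers on `chrom` with start <= position <= end, sorted by position."""
    hits = [m for m in markers if m.chrom == chrom and start <= m.position <= end]
    return sorted(hits, key=lambda m: m.position)


# Made-up records for illustration only.
markers = [
    Marker("EWAS1", "cg00000001", "1", 15_000, "GENE_A", 2.0e-8),
    Marker("EWAS2", "cg00000002", "1", 90_000, "GENE_B", 4.0e-4),
    Marker("EWAS1", "cg00000003", "2", 20_000, "GENE_C", 1.0e-9),
]
hits = search_by_region(markers, "1", 10_000, 50_000)
```

The other search modes (by EWAS ID, gene or cg#) follow the same pattern with an equality test on a different field.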

EWAS for KEGG Pathway

The EWASdb contains interfaces for users to acquire KEGG pathways significantly associated with diseases/phenotypes. Users can input a pathway ID or pathway name, such as ‘hsa04510’ or ‘Galactose metabolism’, to obtain the details of EWASs involving this pathway. In addition, users can enter any item from EWAS1 to EWAS1319 to obtain the pathways associated with the particular EWAS. For each KEGG pathway, the following information is displayed: ‘EWAS ID’, ‘KEGG ID’, ‘Classification’, ‘Disease/Phenotype’, ‘Sample groups’, ‘KEGG Pathway Name’, ‘Pathway Gene Number’, ‘Gene Number Annotated In The KEGG Pathway’ and ‘P-value’.

EWAS for GO Category

The database allows users to get the GO terms related to the diseases/phenotypes of interest. Similarly, users can input a GO term name or GO term ID, such as ‘mitochondrial genome maintenance’ or ‘GO:0000902’, to obtain the EWASs associated with this GO term. Furthermore, any item from EWAS1 to EWAS1319 can be entered to obtain the significant GO terms associated with that particular EWAS. For each GO term, the EWASdb provides detailed information for users, such as: ‘EWAS ID’, ‘GO ID’, ‘Classification’, ‘Disease/Phenotype’, ‘Sample groups’, ‘Categories’, ‘GO Term Name’, ‘GO Term Gene Number’, ‘Gene Number Annotated In The GO Term’ and ‘P-value’.

Browse and download

The database allows users to browse data ‘by class’ and ‘by alph’. For ‘browse by class’, seven classifications are shown on the browse tree. Users can choose any one of them to view the disease/phenotype and the associated EWASs. Furthermore, by clicking the EWAS ID, users can obtain detailed information about the specific EWAS. For ‘browse by alph’, users can get an overview of the EWAS results by clicking a letter of the alphabet on the browse tree. All data can be downloaded freely from the ‘Download’ page. In addition, users can download the EWAS v2.0 software (25) to conduct their own EWAS by clicking ‘EWAS Software’.

SUMMARY AND DISCUSSION

Over the past few years, GWAS has identified a large number of genomic variants associated with diseases/phenotypes, many of which are in meaningful biological pathways (29). Although GWAS has contributed to our understanding of the occurrence, development and prognosis of diseases, it has failed to fully explain the risk factors that contribute to diseases/phenotypes. The epigenome provides new insight into the risk factors affecting diseases/phenotypes by considering both genetic and environmental perspectives. With the rapid growth in DNA methylation data and the maturity of EWAS methods, increasing numbers of epigenetic variants associated with complex diseases/phenotypes have been identified (28,30,31). A useful database resource, which systematically integrates the results of EWASs, will be of great benefit to researchers. Therefore, we collected the available DNA methylation datasets focusing on different diseases/phenotypes to develop the EWASdb. The EWASdb can be queried in a wide range of ways to meet the requirements of different users. The database will provide a means for researchers to explore and understand the pathogenesis of complex diseases at the epigenetic level.

FUTURE DIRECTIONS

EWAS is one of the most useful methods to identify genome-wide epigenetic variations associated with diseases/phenotypes. However, it still has some limitations. First, robust findings require large-scale samples and rigorous scientific methods (32). In subsequent EWASdb versions, we will collect multiple sets of DNA methylation data for a phenotype or disease and use a meta-analysis strategy to identify more stable and reliable epigenetic markers. In addition, EWAS is more powerful for identifying robust epigenetic markers among common variations than among rare variations (32). The current EWASdb version contains only the analysis results of common variations. For rare epigenetic variations, we will develop novel analysis methods to improve analytical performance in the future. The EWASdb is an important part of ‘The EWAS Project’ and has been widely used and downloaded by researchers from more than 40 countries. Our team has the ability to continuously update and maintain the database as more 450K or 850K DNA methylation data become available. We believe that the EWASdb, a comprehensive database, will become a valuable resource and a useful tool in the future.
  32 in total

Review 1.  Five years of GWAS discovery.

Authors:  Peter M Visscher; Matthew A Brown; Mark I McCarthy; Jian Yang
Journal:  Am J Hum Genet       Date:  2012-01-13       Impact factor: 11.025

2.  Epigenetic changes in the expression of the maize A1 gene in Petunia hybrida: role of numbers of integrated gene copies and state of methylation.

Authors:  F Linn; I Heidmann; H Saedler; P Meyer
Journal:  Mol Gen Genet       Date:  1990-07

3.  DNA methyltransferases Dnmt3a and Dnmt3b are essential for de novo methylation and mammalian development.

Authors:  M Okano; D W Bell; D A Haber; E Li
Journal:  Cell       Date:  1999-10-29       Impact factor: 41.582

Review 4.  DNA methylation and its basic function.

Authors:  Lisa D Moore; Thuc Le; Guoping Fan
Journal:  Neuropsychopharmacology       Date:  2012-07-11       Impact factor: 7.853

5.  Targeted mutation of the DNA methyltransferase gene results in embryonic lethality.

Authors:  E Li; T H Bestor; R Jaenisch
Journal:  Cell       Date:  1992-06-12       Impact factor: 41.582

6.  An Environment-Wide Association Study (EWAS) on type 2 diabetes mellitus.

Authors:  Chirag J Patel; Jayanta Bhattacharya; Atul J Butte
Journal:  PLoS One       Date:  2010-05-20       Impact factor: 3.240

Review 7.  Stability and flexibility of epigenetic gene regulation in mammalian development.

Authors:  Wolf Reik
Journal:  Nature       Date:  2007-05-24       Impact factor: 49.962

8.  NCBI GEO: archive for functional genomics data sets--update.

Authors:  Tanya Barrett; Stephen E Wilhite; Pierre Ledoux; Carlos Evangelista; Irene F Kim; Maxim Tomashevsky; Kimberly A Marshall; Katherine H Phillippy; Patti M Sherman; Michelle Holko; Andrey Yefanov; Hyeseung Lee; Naigong Zhang; Cynthia L Robertson; Nadezhda Serova; Sean Davis; Alexandra Soboleva
Journal:  Nucleic Acids Res       Date:  2012-11-27       Impact factor: 16.971

9.  Epigenome-Wide Association Studies (EWAS) in Cancer.

Authors:  Mukesh Verma
Journal:  Curr Genomics       Date:  2012-06       Impact factor: 2.236

10.  A novel CpG island set identifies tissue-specific methylation at developmental gene loci.

Authors:  Robert Illingworth; Alastair Kerr; Dina Desousa; Helle Jørgensen; Peter Ellis; Jim Stalker; David Jackson; Chris Clee; Robert Plumb; Jane Rogers; Sean Humphray; Tony Cox; Cordelia Langford; Adrian Bird
Journal:  PLoS Biol       Date:  2008-01       Impact factor: 8.029

