
EWAS Atlas: a curated knowledgebase of epigenome-wide association studies.

Mengwei Li1,2,3, Dong Zou1,2, Zhaohua Li1,2,4, Ran Gao3,5, Jian Sang1,2,3, Yuansheng Zhang1,2,3, Rujiao Li1,2, Lin Xia1,2,3, Tao Zhang1,2,3, Guangyi Niu1,2,3, Yiming Bao1,2,3,4, Zhang Zhang1,2,3,5.   

Abstract

Epigenome-Wide Association Study (EWAS) has become increasingly significant in identifying associations between epigenetic variations and different biological traits. In this study, we develop EWAS Atlas (http://bigd.big.ac.cn/ewas), a curated knowledgebase of EWAS that provides a comprehensive collection of EWAS knowledge. Unlike extant data-oriented epigenetic resources, EWAS Atlas features manual curation of EWAS knowledge from extensive publications. In the current implementation, EWAS Atlas focuses on DNA methylation, one of the key epigenetic marks; it integrates 329 172 high-quality EWAS associations, involving 112 tissues/cell lines and covering 305 traits, 1830 cohorts and 390 ontology entities, all manually curated from 649 studies reported in 401 publications. In addition, it is equipped with a powerful trait enrichment analysis tool, which is capable of profiling trait-trait and trait-epigenome relationships. Future developments include regular curation of recent EWAS publications, incorporation of more epigenetic marks and possible integration of EWAS with GWAS. Collectively, EWAS Atlas is dedicated to the curation, integration and standardization of EWAS knowledge and has great potential to help researchers dissect molecular mechanisms of epigenetic modifications associated with biological traits.

Year:  2019        PMID: 30364969      PMCID: PMC6324068          DOI: 10.1093/nar/gky1027

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Epigenome-Wide Association Study (EWAS) has become a powerful approach to identify epigenetic variations associated with biological traits (1,2). With the rapid advancement of high-throughput sequencing technologies, epigenetic variations, especially variations in DNA methylation, have been extensively reported to be associated with a wide range of traits in humans, including not only phenotypes and diseases but also behaviors and environmental exposures. Specifically, phenotypic traits, such as aging and Body Mass Index (BMI), have been found to be strongly associated with DNA methylation (3,4). Environmental exposures and behaviors, such as air pollution and smoking, can affect global as well as gene-specific DNA methylation (5,6). Furthermore, some disease-related epigenetic signatures have been used as early diagnostic and/or prognostic biomarkers (7–9). As a growing body of EWAS evidence supports the use of epigenetic modifications as biomarkers for human healthcare and disease treatment, there is a pressing need for a comprehensive collection of EWAS knowledge that integrates associations of epigenetic variations with diverse biological traits. Toward this end, valuable efforts have been made in developing epigenetic databases that integrate different types of epigenetic data. However, none of them is specialized for integrating EWAS knowledge and its associated information. For example, DiseaseMeth and MethHC collect annotations of aberrant DNA methylation only in human diseases, ignoring associations with other important phenotypes, behaviors and environmental exposures (10–12). MethBank and the International Human Epigenome Consortium (IHEC) only incorporate epigenome data, without collecting associations of epigenetic data with traits (13–15). GWAS Catalog, a popular resource for genetic research, focuses only on published Genome-Wide Association Study (GWAS) data (16,17).
Moreover, to our knowledge (as of 20 September 2018), there are two unpublished database resources, namely, EWAS Catalog (http://www.ewascatalog.org/) and EWASdb (http://www.bioapp.org/ewasdb/), which are devoted to the integration of EWAS associations. However, the former appears to be still under construction (as accessed on 20 September 2018) and the latter provides only summary-level EWAS information for different traits or genes. Although more and more studies have shown the significant potential of epigenetic modifications in precision medicine, there is still no specialized resource for systematically integrating EWAS associations, a gap made more pressing by the ever-increasing number of EWAS publications (Figure 1).
Figure 1.

(A) Statistics of EWAS publications. (B) Word cloud of traits.

Here we present EWAS Atlas (http://bigd.big.ac.cn/ewas/), a curated knowledgebase of EWAS. As one of the core resources in the BIG Data Center (18–20), EWAS Atlas is devoted to providing a comprehensive collection of high-quality EWAS associations in support of systematic investigations of complex molecular mechanisms associated with different biological traits. Unlike extant data-oriented epigenetic resources, EWAS Atlas features manual curation of EWAS knowledge from extensive publications and accordingly incorporates a large number of high-quality EWAS data and a diversity of traits and ontology entities. EWAS Atlas provides open access to all curated data and is thus intended to serve as a valuable resource for the global research community.

IMPLEMENTATION

EWAS Atlas is built with Spring Boot (http://spring.io/), a mature, convention-over-configuration Model-View-Controller (MVC) framework, and is deployed in a CentOS Linux 6.4 environment. On the back end, EWAS data are stored in MySQL (https://www.mysql.com/), a free and popular relational database management system. The web pages are constructed using HTML5 and rendered using Thymeleaf (https://www.thymeleaf.org/). Front-end interfaces are built using Bootstrap (https://getbootstrap.com/) with jQuery (https://jquery.com/) to provide responsive and user-friendly web pages. Furthermore, ECharts (http://echarts.baidu.com/) is used to provide interactive charting and data visualization. The trait enrichment tool is implemented in Python 3.5 (https://www.python.org/).

DATA CURATION AND DATABASE CONTENTS

To provide high-quality information curated from EWAS publications, we set up a standardized curation process involving three major steps, viz., literature search, study curation and association curation (Figure 2). First, we perform a literature search in PubMed using pre-defined keywords. Publications are eligible for inclusion in EWAS Atlas only if they contain the necessary description of the involved traits and significant EWAS associations. A total of 401 publications qualify, and their other basic information (e.g. abstract, citation) is obtained automatically through the Europe PMC API (https://europepmc.org/developers/) (21). The second step is study curation, viz., manual curation of detailed study information from publications, including trait name(s), a brief description of case and control groups, and clinical and pathological characteristics of study populations. To unify the representation of biological traits, entities are mapped to the standardized terms in Experimental Factor Ontology (EFO) (22), which combines parts of several biological ontologies, such as Disease Ontology (DO) and Gene Ontology (GO). Finally, we conduct association curation to manually collect information on eligible associations (those with P-value < 1.0E–4 or adjusted P-value < 0.05), including correlations between DNA methylation levels and experimental variables as well as their ranks in specific studies. Furthermore, because different array platforms adopt different annotation systems, all probes on the 27K, 450K and 850K arrays are re-annotated based on GENCODE release 28 (GRCh37) to maintain their consistency (23). Data curation in EWAS Atlas can be carried out by multiple curators through user-friendly web interfaces, enabling collaborative curation and enhancing the efficiency of the curation process.
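The eligibility rule for association curation can be illustrated with a minimal Python sketch. Only the two thresholds (P-value < 1.0E–4 or adjusted P-value < 0.05) come from the curation process described above; the `Association` record, its field names and the demo probe IDs are hypothetical and do not reflect the actual EWAS Atlas schema.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds taken from the curation criteria described in the text.
RAW_P_CUTOFF = 1.0e-4
ADJ_P_CUTOFF = 0.05


@dataclass
class Association:
    # Hypothetical record layout, for illustration only.
    probe_id: str
    trait: str
    p_value: Optional[float] = None      # raw P-value, if reported
    adj_p_value: Optional[float] = None  # adjusted P-value, if reported


def is_eligible(assoc: Association) -> bool:
    """Return True if either reported significance value passes its cutoff."""
    raw_ok = assoc.p_value is not None and assoc.p_value < RAW_P_CUTOFF
    adj_ok = assoc.adj_p_value is not None and assoc.adj_p_value < ADJ_P_CUTOFF
    return raw_ok or adj_ok


def filter_eligible(records: List[Association]) -> List[Association]:
    """Keep only the associations that satisfy the eligibility rule."""
    return [r for r in records if is_eligible(r)]


# Made-up demo records (not real EWAS results).
demo_records = [
    Association("cg_demo_1", "smoking", p_value=3e-6),
    Association("cg_demo_2", "smoking", p_value=2e-3, adj_p_value=0.04),
    Association("cg_demo_3", "smoking", p_value=2e-3),
    Association("cg_demo_4", "smoking"),  # no significance value reported
]
eligible_ids = [r.probe_id for r in filter_eligible(demo_records)]
print(eligible_ids)
```

In this sketch a record with no reported significance value is treated as ineligible, which is an assumption rather than a documented curation rule.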
Figure 2.

The curation model adopted by EWAS Atlas. It is noted that eligible associations should have P-value < 1.0E–4 or adjusted P-value < 0.05.

Based on the standardized curation process, EWAS Atlas integrates 329 172 high-quality associations manually curated from 649 studies reported in 401 publications, involving 112 tissues/cell lines and covering 305 traits, 1830 cohorts and 390 ontology terms. In the current collection, the most extensively studied trait is smoking, involving 49 studies and 26 996 associations (http://bigd.big.ac.cn/ewas/browse?traitList=smoking), and PTPRN2, which encodes receptor-type tyrosine-protein phosphatase, is associated with >90 different traits (http://bigd.big.ac.cn/ewas/browse?gene=PTPRN2).
To facilitate browsing, EWAS Atlas provides five panels, in which data are organized and presented by trait, probe, gene, study and publication, respectively. The trait panel provides an overview of all collected traits documented in EWAS Atlas. For each trait, both general details (name and mapped EFO terms) and summary data related to this trait are listed in a table (Figure 3A). Specifically, the percentages of different DNA methylation signatures (hyper/hypo) and CpG island relations (island, shore, shelf, open sea) are displayed graphically. One trait may have multiple related probes that are associated with different genes. To reveal the biological processes related to a specific trait, all relevant genes are collected for GO enrichment analysis, and EWAS Atlas provides the resulting list of significantly enriched GO terms. The probe panel provides not only basic probe annotations (e.g. genomic coordinate, related transcripts, CpG island relation), but also summary-level descriptions of association data (e.g. occurrence frequency of a specific probe, percentages of hypo-/hyper-methylation). For each probe, details of individual associations, including study IDs, correlations, effect sizes and associated traits, are presented in a tab-separated table (Figure 3B). The gene panel contains not only general information (e.g. gene ID, genomic coordinate, tissue expression) but also summary-level EWAS association information (e.g. percentages of associations on promoter or gene body, related traits, number of associations) (Figure 3C). The study panel displays an overview of all EWAS studies, including reported traits as well as brief descriptions of case and control groups. More importantly, an abundant collection of clinical and pathological characteristics of study populations, consisting of sample size, age, sex ratio and ancestry, is also curated from the literature and integrated into EWAS Atlas (Figure 3D). Additionally, publications and their bibliographic details (title, year, journal, PubMed ID and citation) are summarized in the publication panel (Figure 3E). Across these panels, hyperlinks to external databases, such as Gene Expression Omnibus (GEO) (24), European Nucleotide Archive (ENA) (25) and PubMed, are provided to offer convenient access to additional information.
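The kind of summary-level description shown in the probe panel (occurrence frequency and percentages of hypo-/hyper-methylation) could be derived from individual association records along the following lines. This is only a sketch: the dictionary keys (`probe_id`, `trait`, `direction`) and the demo rows are hypothetical and are not the EWAS Atlas data model.

```python
from collections import Counter


def summarize_probe(associations, probe_id):
    """Summarize associations for one probe.

    `associations` is an iterable of dicts with the hypothetical keys
    'probe_id', 'trait' and 'direction' ('hyper' or 'hypo').
    """
    rows = [a for a in associations if a["probe_id"] == probe_id]
    total = len(rows)
    directions = Counter(a["direction"] for a in rows)
    # Percentages are reported as 0.0 when the probe has no associations.
    percent = {
        d: (100.0 * directions[d] / total if total else 0.0)
        for d in ("hyper", "hypo")
    }
    return {
        "probe_id": probe_id,
        "occurrences": total,
        "traits": sorted({a["trait"] for a in rows}),
        "percent": percent,
    }


# Made-up demo rows (not real EWAS results).
demo_rows = [
    {"probe_id": "cg_demo_1", "trait": "smoking", "direction": "hypo"},
    {"probe_id": "cg_demo_1", "trait": "smoking", "direction": "hypo"},
    {"probe_id": "cg_demo_1", "trait": "aging", "direction": "hyper"},
    {"probe_id": "cg_demo_1", "trait": "BMI", "direction": "hypo"},
    {"probe_id": "cg_demo_2", "trait": "aging", "direction": "hyper"},
]
summary = summarize_probe(demo_rows, "cg_demo_1")
print(summary)
```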
Figure 3.

Screenshot of web pages for (A) Trait, (B) Probe, (C) Gene, (D) Study and (E) Publication.

Powered by this large body of curated EWAS knowledge, EWAS Atlas provides an online tool for trait enrichment analysis (TEA). It allows users to submit customized probe(s) or a trait as input and then explore trait-trait and trait-epigenome relationships. In the current version, a weighted Fisher's exact test is used to calculate the co-occurrence probability between input DNA methylation probes and trait-related DNA methylation probes, where the weight of each probe is the number of studies that reported this probe-trait association, or 1 if no study reported it. With this tool, users can obtain TEA results, including significantly enriched traits and ontology terms. For example, when reported DNA methylation sites related to cardiovascular risk are used as input (data originally from (26)), the TEA tool shows that the most related trait is smoking; this result is consistent with previous findings that smoking is one of the major cardiovascular risk factors (27,28), demonstrating the tool's potential utility in profiling the relationships among diverse traits.
To browse and query EWAS data in an efficient and friendly manner, EWAS Atlas is equipped with multiple filters that enable users to easily find traits, probes, associations, studies and publications of interest. Specifically, these filters cover a wide range of data items, including trait, gene symbol, probe ID, ontology, genomic coordinate, study ID, PMID, etc., and thus help users to efficiently narrow down the query results. In addition, EWAS Atlas provides auto-suggestion functionality, which offers a list of candidate terms according to users’ inputs.
To fully benefit the global scientific community, all relevant data in EWAS Atlas are open access and publicly available at http://bigd.big.ac.cn/ewas/downloads. To ease data downloading, all query results that are displayed in web pages can be exported as a tab-delimited file. Furthermore, to enable programmatic access to EWAS data, a series of RESTful APIs (http://bigd.big.ac.cn/ewas/api) are implemented for automatic data retrieval, simply by specifying different types of identifiers (e.g. study ID, gene symbol, PMID, probe ID).
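The weighted Fisher's exact test used for trait enrichment analysis is described above only in outline, so the following Python sketch should be read as one possible interpretation rather than the actual TEA implementation. It assumes that probe weights (number of reporting studies, or 1 if absent) are summed into the cells of a 2x2 contingency table and that a one-sided (enrichment) P-value is computed from the hypergeometric upper tail. The function names, argument names and demo probe IDs are hypothetical, and `math.comb` requires Python 3.8 or later (newer than the Python 3.5 mentioned in the text).

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def weighted_trait_enrichment(input_probes, trait_study_counts, background_probes):
    """One-sided enrichment P-value of a trait among the input probes.

    trait_study_counts maps each trait-associated probe to the number of
    studies reporting that probe-trait association; any other probe gets
    weight 1. Summing weights into the 2x2 table is an assumption.
    """
    input_set = set(input_probes)
    trait_set = set(trait_study_counts)
    background = set(background_probes) | input_set | trait_set

    def weight(probe):
        return trait_study_counts.get(probe, 1)

    a = sum(weight(p) for p in input_set & trait_set)  # input, trait-related
    b = len(input_set - trait_set)                     # input, not trait-related
    c = sum(weight(p) for p in trait_set - input_set)  # trait-related, not input
    d = len(background - input_set - trait_set)        # neither
    population = a + b + c + d
    return hypergeom_upper_tail(a, population, a + c, a + b)


# Made-up demo data: ten background probes, two of them trait-related.
demo_background = ["cg_demo_%d" % i for i in range(10)]
demo_trait = {"cg_demo_0": 2, "cg_demo_1": 1}  # probe -> number of studies
demo_input = ["cg_demo_0", "cg_demo_2"]
p_value = weighted_trait_enrichment(demo_input, demo_trait, demo_background)
print(p_value)
```

Under this interpretation a probe reported by several studies counts more heavily toward the overlap, so traits supported by replicated associations rank higher than traits supported by single reports.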

DISCUSSION AND FUTURE DEVELOPMENTS

EWAS Atlas, to our knowledge, is the first knowledgebase integrating a comprehensive collection of epigenome variations associated with phenotypes, diseases, behaviors and environmental exposures. It features manual curation of EWAS knowledge from extensive publications and accordingly incorporates a large number of high-quality EWAS data and a diversity of traits and ontology entities. Considering the great potential of epigenetic modifications in precision medicine, EWAS Atlas should be of great utility in dissecting complex molecular mechanisms associated with various diseases and in promoting the development of novel diagnostics and therapeutics. Given the rapid increase in EWAS studies, future efforts include regular updates to incorporate EWAS associations from the latest published literature. Meanwhile, EWAS Atlas will be expanded to include other types of epigenetic marks (e.g. RNA modification (29), histone modification). Moreover, efforts will also be devoted to linking EWAS associations with GWAS associations. We also call for worldwide collaboration to build EWAS Atlas into a valuable resource covering more comprehensive associations and traits and further providing potential guidance for human healthcare and disease treatment.
  29 in total

1.  Relationship between cigarette smoking and novel risk factors for cardiovascular disease in the United States.

Authors:  Lydia A Bazzano; Jiang He; Paul Muntner; Suma Vupputuri; Paul K Whelton
Journal:  Ann Intern Med       Date:  2003-06-03       Impact factor: 25.391

2.  Modeling sample variables with an Experimental Factor Ontology.

Authors:  James Malone; Ele Holloway; Tomasz Adamusiak; Misha Kapushesky; Jie Zheng; Nikolay Kolesnikov; Anna Zhukova; Alvis Brazma; Helen Parkinson
Journal:  Bioinformatics       Date:  2010-03-03       Impact factor: 6.937

3.  Epigenome-wide association studies for common human diseases.

Authors:  Vardhman K Rakyan; Thomas A Down; David J Balding; Stephan Beck
Journal:  Nat Rev Genet       Date:  2011-07-12       Impact factor: 53.242

4.  The pathophysiology of cigarette smoking and cardiovascular disease: an update.

Authors:  John A Ambrose; Rajat S Barua
Journal:  J Am Coll Cardiol       Date:  2004-05-19       Impact factor: 24.094

5.  Epigenome-wide association study in the European Prospective Investigation into Cancer and Nutrition (EPIC-Turin) identifies novel genetic loci associated with smoking.

Authors:  Natalie S Shenker; Silvia Polidoro; Karin van Veldhoven; Carlotta Sacerdote; Fulvio Ricceri; Mark A Birrell; Maria G Belvisi; Robert Brown; Paolo Vineis; James M Flanagan
Journal:  Hum Mol Genet       Date:  2012-11-21       Impact factor: 6.150

6.  GENCODE: the reference human genome annotation for The ENCODE Project.

Authors:  Jennifer Harrow; Adam Frankish; Jose M Gonzalez; Electra Tapanari; Mark Diekhans; Felix Kokocinski; Bronwen L Aken; Daniel Barrell; Amonida Zadissa; Stephen Searle; If Barnes; Alexandra Bignell; Veronika Boychenko; Toby Hunt; Mike Kay; Gaurab Mukherjee; Jeena Rajan; Gloria Despacio-Reyes; Gary Saunders; Charles Steward; Rachel Harte; Michael Lin; Cédric Howald; Andrea Tanzer; Thomas Derrien; Jacqueline Chrast; Nathalie Walters; Suganthi Balasubramanian; Baikang Pei; Michael Tress; Jose Manuel Rodriguez; Iakes Ezkurdia; Jeltje van Baren; Michael Brent; David Haussler; Manolis Kellis; Alfonso Valencia; Alexandre Reymond; Mark Gerstein; Roderic Guigó; Tim J Hubbard
Journal:  Genome Res       Date:  2012-09       Impact factor: 9.043

7.  NCBI GEO: archive for functional genomics data sets--update.

Authors:  Tanya Barrett; Stephen E Wilhite; Pierre Ledoux; Carlos Evangelista; Irene F Kim; Maxim Tomashevsky; Kimberly A Marshall; Katherine H Phillippy; Patti M Sherman; Michelle Holko; Andrey Yefanov; Hyeseung Lee; Naigong Zhang; Cynthia L Robertson; Nadezhda Serova; Sean Davis; Alexandra Soboleva
Journal:  Nucleic Acids Res       Date:  2012-11-27       Impact factor: 16.971

8.  DiseaseMeth: a human disease methylation database.

Authors:  Jie Lv; Hongbo Liu; Jianzhong Su; Xueting Wu; Hui Liu; Boyan Li; Xue Xiao; Fang Wang; Qiong Wu; Yan Zhang
Journal:  Nucleic Acids Res       Date:  2011-12-01       Impact factor: 16.971

9.  The NHGRI GWAS Catalog, a curated resource of SNP-trait associations.

Authors:  Danielle Welter; Jacqueline MacArthur; Joannella Morales; Tony Burdett; Peggy Hall; Heather Junkins; Alan Klemm; Paul Flicek; Teri Manolio; Lucia Hindorff; Helen Parkinson
Journal:  Nucleic Acids Res       Date:  2013-12-06       Impact factor: 16.971

10.  DNA methylation age of human tissues and cell types.

Authors:  Steve Horvath
Journal:  Genome Biol       Date:  2013       Impact factor: 13.583

