Anthony P Fejes, Meaghan J Jones, Michael S Kobor.
Abstract
One of the challenges in the analysis of large data sets, particularly in a population-based setting, is performing comparisons across projects. This has to be done in such a way that the integrity of each individual project is maintained, while ensuring that the data are comparable across projects. These issues are beginning to be observed in human DNA methylation studies, as the Illumina 450k platform and next generation sequencing-based assays grow in popularity and decrease in price. This increase in productivity is enabling new insights into epigenetics, but also requires the development of pipelines and software capable of handling the large volumes of data. The specific problems inherent in creating a platform for the storage, comparison, integration, and visualization of DNA methylation data include data storage, algorithm efficiency, and the ability to interpret the results to derive biological meaning from them. Databases provide a ready-made solution to these issues, but as yet no tools exist that leverage these advantages while providing an intuitive user interface for interpreting results in a genomic context. We have addressed this void by integrating a database to store DNA methylation data with a web interface to query and visualize the database and a set of libraries for more complex analysis. The resulting platform is called DaVIE: Database for the Visualization and Integration of Epigenetics data. DaVIE can use data culled from a variety of sources, and the web interface includes the ability to group samples by sub-type, compare multiple projects, and visualize genomic features in relation to sites of interest. We have used DaVIE to identify patterns of DNA methylation in specific projects and across different projects, identify outlier samples, and cross-check differentially methylated CpG sites identified in specific projects across large numbers of samples.
A demonstration server has been set up using GEO data at http://echelon.cmmt.ubc.ca/dbaccess/, with login "guest" and password "guest." Groups may download and install their own version of the server following the instructions on the project's wiki.
Keywords: 450k methylation array; bioinformatics; database; epigenetics; visualization
Year: 2014 PMID: 25278960 PMCID: PMC4166999 DOI: 10.3389/fgene.2014.00325
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. Example of uses of visualization methods in DaVIE using the LEP gene as an example. All panels show DNA methylation from 0 to 1 on the y-axis at the LEP promoter, including high-density CpG islands (dark gray) and intermediate density islands (light blue), with gene structure indicated below. Dark blue lines below indicate unreliable probes. (A) Points view shows overall level and general distribution of methylation. (B) Grouped by sex, no obvious differences are observed. (C) Grouped by tissue, one of the four different tissues in this project shows clear distinction.
Figure 2. Visualization in DaVIE across projects. See Figure 1 for key. (A) In a total of over 500 samples in 12 different tissues, distinct groups are observed but separation is not achieved when the samples are colored by project. (B) When colored by tissue type, separation between tissues is improved. (C) With the trace view, subtler differences between the tissues are made visible.
Figure 3. Validation of visualization methods using examples from the X chromosome. See Figure 1 for key. In all panels, female samples are in gray and male samples in green. (A) XIST promoter shows high methylation in males and approximately 50% in females. (B) MAOA, a gene subject to XCI, shows 50% methylation in females and low methylation in males. (C) RPS4X, a gene that escapes XCI, shows low island methylation regardless of sex.
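The grouping comparisons described in the figure captions (by sex, tissue, or project) amount to joining per-sample annotations to per-probe methylation values and aggregating by a chosen sample attribute. The following is a minimal sketch of that idea using Python's built-in sqlite3 module. It is not DaVIE's actual schema or code: the table names, column names, the probe identifier "probe_x", the helper functions, and all beta values are hypothetical toy inputs chosen only to illustrate the query pattern.

```python
import sqlite3

# Hypothetical schema for illustration only; not DaVIE's actual tables.
SCHEMA = """
CREATE TABLE samples (
    sample_id TEXT PRIMARY KEY,
    project   TEXT NOT NULL,
    sex       TEXT NOT NULL,
    tissue    TEXT NOT NULL
);
CREATE TABLE methylation (
    sample_id TEXT NOT NULL REFERENCES samples(sample_id),
    probe_id  TEXT NOT NULL,
    beta      REAL NOT NULL CHECK (beta BETWEEN 0 AND 1),
    PRIMARY KEY (sample_id, probe_id)
);
"""

# Column names cannot be bound as SQL parameters, so restrict them to a whitelist.
GROUP_COLUMNS = {"project", "sex", "tissue"}


def build_toy_db():
    """Create an in-memory database filled with made-up toy values (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO samples VALUES (?, ?, ?, ?)",
        [
            ("s1", "projA", "F", "blood"),
            ("s2", "projA", "M", "blood"),
            ("s3", "projB", "F", "buccal"),
            ("s4", "projB", "M", "buccal"),
        ],
    )
    conn.executemany(
        "INSERT INTO methylation VALUES (?, ?, ?)",
        [
            ("s1", "probe_x", 0.50),
            ("s2", "probe_x", 0.10),
            ("s3", "probe_x", 0.54),
            ("s4", "probe_x", 0.06),
        ],
    )
    return conn


def mean_beta_by_group(conn, probe_id, group_col):
    """Return (group, mean beta, sample count) rows for one probe, sorted by group."""
    if group_col not in GROUP_COLUMNS:
        raise ValueError(f"unsupported grouping column: {group_col!r}")
    query = f"""
        SELECT s.{group_col}, AVG(m.beta), COUNT(*)
        FROM methylation AS m
        JOIN samples AS s ON s.sample_id = m.sample_id
        WHERE m.probe_id = ?
        GROUP BY s.{group_col}
        ORDER BY s.{group_col}
    """
    return conn.execute(query, (probe_id,)).fetchall()


if __name__ == "__main__":
    conn = build_toy_db()
    for col in ("sex", "project"):
        print(col, mean_beta_by_group(conn, "probe_x", col))
```

With these toy values, grouping by sex separates the samples while grouping by project does not, which is the kind of contrast the colored views in the figures are meant to expose. Keeping annotations and measurements in separate tables means a new grouping only requires a new sample column, not a change to the stored methylation values.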