Florin Chelaru, Héctor Corrada Bravo.
Abstract
BACKGROUND: Computational and visual data analysis for genomics has traditionally involved a combination of tools and resources. The most ubiquitous of these are genome browsers, focused mainly on integrative visualization of large numbers of big datasets, and computational environments, focused on data modeling of a small number of moderately sized datasets. Workflows that involve the integration and exploration of multiple heterogeneous data sources, small and large, public and user-specific, have been poorly addressed by these tools. In our previous work, we introduced Epiviz, which bridges the gap between the two types of tools, simplifying these workflows.
Year: 2015 PMID: 26328750 PMCID: PMC4559604 DOI: 10.1186/1471-2105-16-S11-S4
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Statistical analysis of colon cancer methylome. Top-left displays CpG-level methylation measurements for colon normal and tumor tissue for a sample of TCGA [12] data; top-right displays expression data from the Gene Expression Barcode Project across multiple cancer types as a heatmap, in which the hierarchical clustering is dynamically updated as the user navigates across the genome; the bottom display shows the smooth function, described in the main text, corresponding to differences in methylation between normal and tumor; the "Colon Blocks" track displays statistically significant DMRs inferred from the smoothed methylation difference function. The brushing interaction (in yellow) links data from Bioconductor objects produced at different stages of a statistical analysis pipeline, from measurement preprocessing to statistically significant regions of interest, allowing effective exploration of the statistical properties of these genomic findings. http://epiviz.cbcb.umd.edu/?ws=sp9ShCJdS3c
Figure 2. Screenshot of Epiviz. It shows the main elements available within the tool: 1) the main toolbar, featuring all UI controls; 2) a scatter plot, showing two computed measurements: the average of, and the difference between, colon gene expression for normal and cancer tissues; the code for this plot is customized in the UI to show a line at y=0 that separates genes with positive and negative differences; 3) a heatmap, showing values from the Gene Expression Barcode [4] comparing normal and cancer expression for different tissues. Using its clustering feature, we notice that tumors tend to group separately from normal tissues; in addition, the clustering result seems to be determined by a small number of genes, namely MMP1, MMP3 and MMP10; 4) a stacked plot, showing two columns for normal and cancer gene expression; it uses the "color by" transformation to highlight genes with various expression differences. This plot offers several insights: first, overall expression tends to be higher for cancer than for normal tissues; second, the differentially expressed genes can be spotted immediately by brushing over the corresponding blocks, which are colored in deep red; 5) a custom track defined in a plugin hosted on GitHub Gist, showing blocks aligned to the genome, with height corresponding to the expression of the genes; 6) a stacked track, showing a computed measurement corresponding to the difference between normal and cancer methylation; this track offers insight into the hypo- and hyper-methylated blocks; 7) a lines track, showing DNA methylation for normal and cancer colon tissues; the track uses the "group by" transformation to aggregate three normal samples and three tumor samples, and displays error bars to show the variation of methylation for each group at each data point; in addition, it uses basis interpolation to smoothly connect the available data points; 8) a genes track, showing human genome genes fetched from the UCSC database [10] using a data provider plugin stored externally on GitHub Gist; 9) a tooltip showing details on demand for the gene MMP1. The highlighted items correspond to the brushing feature, triggered while hovering over the MMP1 gene in the genes track. The feature links all visualizations together by genomic location. http://epiviz.cbcb.umd.edu/?gist[]=160e8b84795603961b9f&gist[]=5a88f39caa801e58b8ae&ws=GJU2bfURaUd
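The "group by" aggregation described for the lines track in Figure 2 can be illustrated with a minimal Python sketch. This is not Epiviz code: the function name `group_summary` and the sample values are hypothetical, and the use of the sample standard deviation for the error bars is an assumption, since the caption only says the bars show the variation within each group.

```python
from statistics import mean, stdev

def group_summary(samples):
    """Aggregate several samples measured at the same positions.

    `samples` is a list of equal-length lists, one list per sample.
    Returns one (mean, spread) pair per position, where spread is the
    sample standard deviation (an assumed choice of error-bar statistic).
    """
    return [(mean(values), stdev(values)) for values in zip(*samples)]

# Hypothetical methylation values: three samples, two genomic positions.
normal_samples = [[0.80, 0.75], [0.82, 0.70], [0.78, 0.80]]

for position, (center, spread) in enumerate(group_summary(normal_samples)):
    print(f"position {position}: mean={center:.3f}, error bar=+/-{spread:.3f}")
```

Each group (normal, tumor) would be summarized separately, giving one line with error bars per group.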
Figure 3. Custom . This code computes the absolute difference between the two measurements in the plot (for example, normal and cancer gene expression) and splits it into increments of 4. The resulting plot will colour genes with different colours, each corresponding to its expression difference.
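The binning described in the Figure 3 caption can be sketched in a few lines of Python. This is an illustration only, not the code shown in the figure: the names `color_bin`, `color_for`, and `PALETTE`, as well as the specific colours, are hypothetical. Only the absolute difference and the increment of 4 come from the caption.

```python
import math

# Hypothetical palette, from light to deep red; the last colour is reused
# for any difference beyond the final bin.
PALETTE = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]

def color_bin(normal, tumor, step=4.0):
    """Return the bin index of the absolute difference, in increments of `step`."""
    return int(math.floor(abs(normal - tumor) / step))

def color_for(normal, tumor, step=4.0):
    """Map a pair of measurements to a colour, clamping to the last palette entry."""
    index = min(color_bin(normal, tumor, step), len(PALETTE) - 1)
    return PALETTE[index]

# Hypothetical expression values for one gene.
print(color_bin(10.0, 1.0), color_for(10.0, 1.0))
```

Genes whose two measurements differ by less than 4 fall in bin 0, those differing by 4 up to (but not including) 8 fall in bin 1, and so on.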
Figure 4. An overview of gene expression in chromosome 11. The scatter plot shows colon normal and tumour gene expression average on the x axis, and difference on the y axis. The genes track shows genes fetched from the UCSC database, using a data provider plugin hosted on GitHub Gist (http://gist.github.com/5a88f39caa801e58b8ae). The highlighted data point in the scatter plot corresponds to a gene expression difference outlier. Using the brushing feature of Epiviz, we link this outlier to its corresponding gene in the genes track. http://epiviz.cbcb.umd.edu/?gist[]=5a88f39caa801e58b8ae&ws=gdmUH1ANl3m
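The two computed measurements plotted in Figure 4, the average and the difference of normal and tumour expression, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names, gene labels, and values are hypothetical, the difference is taken as tumour minus normal (the caption does not give the direction), and "outlier" is simplified here to the gene with the largest absolute difference.

```python
def average_and_difference(normal, tumor):
    """Return (average, difference) for one gene; difference is tumor - normal (assumed)."""
    return ((normal + tumor) / 2.0, tumor - normal)

def largest_difference_gene(expression):
    """Return the gene whose absolute expression difference is largest.

    `expression` maps a gene label to a (normal, tumor) pair.
    """
    return max(
        expression,
        key=lambda gene: abs(average_and_difference(*expression[gene])[1]),
    )

# Hypothetical expression values, not taken from the paper's data.
expression = {
    "GENE_A": (5.0, 5.5),
    "GENE_B": (2.0, 9.0),
    "GENE_C": (7.0, 6.0),
}

for gene, (normal, tumor) in expression.items():
    x, y = average_and_difference(normal, tumor)
    print(f"{gene}: x(average)={x:.2f}, y(difference)={y:.2f}")
print("largest difference:", largest_difference_gene(expression))
```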