HemI: a toolkit for illustrating heatmaps.

Wankun Deng, Yongbo Wang, Zexian Liu, Han Cheng, Yu Xue.

Abstract

Recent high-throughput techniques have generated a flood of biological data. Transforming and visualizing multi-dimensional, numerical gene or protein expression data in a single heatmap can provide a concise but comprehensive presentation of molecular dynamics under different conditions. In this work, we developed an easy-to-use tool named HemI (Heat map Illustrator), which can visualize either gene or protein expression data in heatmaps. The heatmaps can be recolored, rescaled or rotated in a customized manner, and HemI provides multiple clustering strategies for analyzing the data. Publication-quality figures can be exported directly. We propose that HemI can be a useful toolkit for conveniently visualizing and manipulating heatmaps. The stand-alone packages of HemI were implemented in Java and can be accessed at http://hemi.biocuckoo.org/down.php.

Year:  2014        PMID: 25372567      PMCID: PMC4221433          DOI: 10.1371/journal.pone.0111988

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

A good picture is worth a thousand words. Recent progress in high-throughput techniques, such as DNA microarray, next-generation sequencing (NGS) and quantitative proteomics, has increased the demand for the visualization of multi-dimensional and numeric data [1]–[3]. As an intuitive strategy, a heatmap graphically visualizes matrix data by representing individual values with different colors. To estimate how many papers contain heatmaps, we carefully curated all original research articles published in 2012 in five leading journals, namely Nature Biotechnology, Cancer Cell, Genome Research, Genome Biology, and Molecular & Cellular Proteomics, and found that ∼30.4% (202 out of 664 papers) contain at least one heatmap figure (Table 1). We also manually checked the 202 papers, and observed that the methods for drawing heatmaps were not mentioned in ∼66% (134/202) of them (Table 2). Of the 68 remaining papers in which the tools were clearly stated, ∼46% visualized heatmaps with the R language package (Table 2). However, using R requires considerable programming skills, which some researchers do not possess. We also found that Java Treeview, an illustrator of microarray data [4], was used in ∼24% of the 68 papers. Because Java Treeview does not perform any clustering analyses itself, a clustered data file in CDT format generated by other tools must be provided as input. To overcome this limitation, Seo et al. developed an interactive tool, Hierarchical Clustering Explorer (HCE), for both visualizing heatmaps and clustering the numeric data [5]. The heatmaps in HCE can be easily manipulated, but the artwork quality has yet to be improved. Moreover, although heatmaps can be produced by a number of commercial or non-commercial software packages such as GeneSpring GX, Mayday [6], Cytobank [7] and D3 (http://d3js.org/), these tools were not designed specifically for generating heatmaps.
Recently, an interactive heatmap viewer called jHeatmap was developed [8]. The tool is useful for the intuitive and interactive visualization of complex data in the form of heatmaps. However, no further manipulations, such as recoloring and rotation, can be performed. Also, the visualized heatmaps cannot be exported for publication purposes. Thus, the development of an easy-to-use toolkit for conveniently illustrating heatmaps and exporting publication-quality figures will be a great help for both bioinformaticians and experimentalists.
Table 1

Frequency of heatmap usage.

Journal                            Num. of papers   Num. of heatmaps (a)   Per. (b)
Nature Biotechnology               81               19                     23.46%
Cancer Cell                        106              40                     37.74%
Genome Research                    144              58                     40.28%
Genome Biology                     92               26                     28.26%
Molecular & Cellular Proteomics    241              59                     24.48%

To estimate how many papers contain heatmaps, we went through all original research papers (excluding reviews and other articles) published in 2012 in the five leading journals listed above.

(a) Num. of heatmaps, the number of papers containing at least one heatmap figure;

(b) Per., the percentage of papers containing at least one heatmap figure.

Table 2

Summary of the methods for illustrating heatmaps among the 202 papers published in 2012 in five leading journals.

Tools (a)                      Num. (b)   Web link (c)
R                              31         http://www.r-project.org/
Java Treeview                  16         http://jtreeview.sourceforge.net/
MATLAB                         7          http://www.mathworks.cn/products/matlab/
SPSS                           4          http://www-01.ibm.com/software/analytics/spss/
GeneSpring                     2          http://genespring-support.com/
MultiExperiment Viewer         2          http://www.tm4.org/mev.html
Cytobank                       1          https://www.cytobank.org/
Heatmap Builder                1          http://ashleylab.stanford.edu/tools/tools-scripts.html
Integrative Genomics Viewer    1          http://www.broadinstitute.org/igv/
Matrix2png                     1          http://www.chibi.ubc.ca/matrix2png/
Mayday                         1          http://microarray-analysis.org
Processing                     1          http://processing.org/
N/A (d)                        134        N/A
Total                          202

(a) Tools, the name of the tool used;

(b) Num., the number of papers that used the tool;

(c) Web link, the website of the tool;

(d) N/A, not mentioned in the corresponding papers.

Method

In this work, we presented a novel software package, HemI (Heatmap Illustrator, version 1.0), which uses a red, green, and blue (RGB) tricolor scheme in 256-color mode. Given a selected color scale, the total color space is automatically processed by Java into a numerical matrix (768 rows × 3 columns). The input gene or protein expression data can then be linearly normalized (Eq. 1). More frequently, researchers prefer to visualize the logarithmic relations between different conditions and molecular expression levels; thus, the original data can also be logarithmically normalized (Eq. 2). In these equations:

NV = normalized value
OV = original value
Max = the maximum of all OVs
Min = the minimum of all OVs
a = 2 (default), and can also be user-defined as 10 or e

In both equations, Max cannot be equal to Min, and in Eq. 2 both OV and Min must be greater than 0. The calculated NVs were then mapped to the color matrix, and the tricolor values of the nearest row were visualized. For further analysis of the data in heatmaps, several clustering approaches, such as the hierarchical and k-means clustering algorithms, were also integrated. To calculate the distance, three types of linkage criteria (Table 3) and seven kinds of metrics (Table 4) were adopted for the two algorithms, respectively.
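The normalization equations themselves are not reproduced in this text, so the following Java sketch only illustrates one form that is consistent with the stated definitions and constraints: min-max scaling for the linear case and the same scaling applied to base-a logarithms for the logarithmic case. The class and method names, the three anchor colors, and the piecewise-linear construction of the 768-row color matrix are all assumptions made for illustration, not HemI's actual implementation. Note that in the ratio form assumed here the base a cancels out, so the published Eq. 2 may well differ.

```java
/** Illustrative sketch only; names and formulas are assumptions, not HemI's code. */
final class HeatmapColorSketch {
    static final int ROWS = 768; // 768 rows x 3 columns (R, G, B), as stated in the text

    /** Assumed construction: linear interpolation low -> mid -> high over 768 rows. */
    static int[][] buildColorMatrix(int[] low, int[] mid, int[] high) {
        int[][] matrix = new int[ROWS][3];
        int half = ROWS / 2;
        for (int r = 0; r < ROWS; r++) {
            boolean lowerHalf = r < half;
            int[] from = lowerHalf ? low : mid;
            int[] to = lowerHalf ? mid : high;
            double t = lowerHalf
                    ? r / (double) (half - 1)
                    : (r - half) / (double) (ROWS - half - 1);
            for (int c = 0; c < 3; c++) {
                matrix[r][c] = (int) Math.round(from[c] + t * (to[c] - from[c]));
            }
        }
        return matrix;
    }

    /** Assumed linear form: NV = (OV - Min) / (Max - Min), giving a value in [0, 1]. */
    static double linear(double ov, double min, double max) {
        if (max == min) {
            throw new IllegalArgumentException("Max must not equal Min");
        }
        return (ov - min) / (max - min);
    }

    /** Assumed logarithmic form: min-max scaling applied to log_a of each value. */
    static double logarithmic(double ov, double min, double max, double a) {
        if (max == min) {
            throw new IllegalArgumentException("Max must not equal Min");
        }
        if (ov <= 0 || min <= 0) {
            throw new IllegalArgumentException("OV and Min must be greater than 0");
        }
        double logOv = Math.log(ov) / Math.log(a);
        double logMin = Math.log(min) / Math.log(a);
        double logMax = Math.log(max) / Math.log(a);
        return (logOv - logMin) / (logMax - logMin);
    }

    /** Maps a normalized value in [0, 1] to the RGB triple of the nearest row. */
    static int[] colorFor(double nv, int[][] matrix) {
        int row = (int) Math.round(nv * (matrix.length - 1));
        row = Math.max(0, Math.min(matrix.length - 1, row));
        return matrix[row];
    }
}
```

With a green-black-red scale, for example, `buildColorMatrix(new int[]{0, 255, 0}, new int[]{0, 0, 0}, new int[]{255, 0, 0})` yields the lookup table, and `colorFor(linear(ov, min, max), matrix)` returns the color of one cell.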
Table 3

Three commonly used linkage criteria for hierarchical clustering.

Linkage criterion                       Equation
Average linkage clustering (default)
Minimum linkage clustering
Maximum linkage clustering

To calculate the pairwise distances for hierarchical clustering, three commonly used linkage criteria were taken from Wikipedia (http://en.wikipedia.org/wiki/Hierarchical_clustering).
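As the equations of Table 3 are not shown in this text, the sketch below uses the textbook definitions of the three linkage criteria: the mean, the minimum, and the maximum of all pairwise distances between the members of two clusters. The class name, the enum, and the representation of a cluster as an array of item indices are hypothetical choices for illustration and are not taken from HemI.

```java
/** Illustrative sketch of the three linkage criteria; names are hypothetical. */
final class LinkageSketch {
    enum Linkage { AVERAGE, MINIMUM, MAXIMUM }

    /**
     * Distance between two clusters, given a symmetric matrix d of pairwise
     * item distances and the item indices belonging to each cluster.
     */
    static double clusterDistance(double[][] d, int[] a, int[] b, Linkage linkage) {
        double sum = 0.0;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (int i : a) {
            for (int j : b) {
                double v = d[i][j];
                sum += v;
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
        }
        switch (linkage) {
            case MINIMUM:
                return min; // distance between the closest pair of members
            case MAXIMUM:
                return max; // distance between the farthest pair of members
            default:
                return sum / ((double) a.length * b.length); // mean over all pairs
        }
    }
}
```

Agglomerative hierarchical clustering would repeatedly merge the two clusters with the smallest such distance; only the choice of criterion changes how that distance is computed.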

Table 4

Seven distance metrics for clustering.

Distance                      Equation
Euclidean distance
Squared Euclidean distance
Manhattan distance
Maximum distance
Pearson distance (default)
Spearman distance
Kendall's tau distance

To calculate the distances for the hierarchical and k-means clustering approaches, seven commonly used distance metrics were adopted.
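The equations of Table 4 are likewise not shown in this text. As a rough illustration, the following Java sketch implements the standard definitions of the first four distances and one common convention for the Pearson distance, namely 1 minus the Pearson correlation coefficient; whether HemI uses exactly this convention is an assumption. The Spearman and Kendall's tau distances are omitted because they require rank and tie handling that the text does not specify. All names are hypothetical.

```java
/** Illustrative distance functions for equal-length vectors; names are hypothetical. */
final class DistanceSketch {
    /** Sum of squared coordinate differences. */
    static double squaredEuclidean(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double diff = x[i] - y[i];
            sum += diff * diff;
        }
        return sum;
    }

    /** Square root of the squared Euclidean distance. */
    static double euclidean(double[] x, double[] y) {
        return Math.sqrt(squaredEuclidean(x, y));
    }

    /** Sum of absolute coordinate differences. */
    static double manhattan(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += Math.abs(x[i] - y[i]);
        }
        return sum;
    }

    /** Largest absolute coordinate difference (Chebyshev distance). */
    static double maximum(double[] x, double[] y) {
        double max = 0.0;
        for (int i = 0; i < x.length; i++) {
            max = Math.max(max, Math.abs(x[i] - y[i]));
        }
        return max;
    }

    /** Assumed convention: 1 - r, where r is the Pearson correlation (NaN if a vector is constant). */
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0.0;
        double sxx = 0.0;
        double syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return 1.0 - sxy / Math.sqrt(sxx * syy);
    }
}
```

Under this convention the Pearson distance ranges from 0 (perfectly correlated profiles) to 2 (perfectly anti-correlated profiles), which groups genes by the shape of their expression profile rather than by absolute level.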

HemI 1.0 was written in Java 1.6 (J2SE 6.0) and packaged with Install4j 4.0.8. We developed six packages to support three major x86/x64 operating systems (OSs): Windows, Unix/Linux, and Mac. The stability and applicability of HemI were rigorously tested under Windows XP/7, Ubuntu, and Apple Mac OS X 10.5 (Leopard).

Usage

HemI was developed to be easy to use. Here, we took data from a previously published study [9] as a demo to describe the usage of HemI. Androgen receptor (AR), a hormone-activated transcription factor, regulates prostate development, function and malignant transformation as an essential transcriptional repressor [9]. To characterize potentially AR-regulated genes, LNCaP prostate cancer cells were first hormone-starved for 3 days. Then, the gene expression levels were profiled after 3, 6, 12, 24 and 48 hours of androgen treatment [9]. In total, Zhao et al. identified 428 androgen-repressed genes [9], and the corresponding data set was used as an example for HemI. First, numerical data in one of three file formats, Microsoft Excel spreadsheet (.xls), Tab Separated Value (TSV) or Comma Separated Value (CSV), can be loaded by clicking on the “LOAD” button of the main interface. Then, users can select the numerical data area for visualizing a heatmap by mouse-dragging or by holding SHIFT and clicking. The titles for the X-axis and Y-axis can be specified by entering the corresponding row and column numbers in the data sheet (Figure 1A). For convenience, an “Auto Fill” button is provided, which treats the first column and the first row as the titles of the Y-axis and X-axis, respectively. A heatmap is automatically generated after clicking on the finish button.
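The “Auto Fill” convention described above can be sketched in a few lines of Java. The example below reads delimited text, treats the first row as the X-axis titles and the first column as the Y-axis titles, and parses the remaining cells as numbers. It is a minimal illustration under stated assumptions (no quoted fields, no missing values, a hypothetical class name) and does not reflect how HemI itself reads files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative loader for simple CSV/TSV text; names are hypothetical. */
final class MatrixLoaderSketch {
    final String[] columnTitles; // X-axis titles, taken from the first row
    final String[] rowTitles;    // Y-axis titles, taken from the first column
    final double[][] values;     // numeric data area

    private MatrixLoaderSketch(String[] columnTitles, String[] rowTitles, double[][] values) {
        this.columnTitles = columnTitles;
        this.rowTitles = rowTitles;
        this.values = values;
    }

    /** Parses text whose cells are separated by the given literal delimiter ("," or "\t"). */
    static MatrixLoaderSketch parse(String text, String delimiter) {
        String separator = Pattern.quote(delimiter);
        List<String[]> rows = new ArrayList<String[]>();
        for (String line : text.split("\\r?\\n")) {
            if (line.trim().length() > 0) {
                rows.add(line.split(separator, -1)); // -1 keeps trailing empty cells
            }
        }
        if (rows.size() < 2) {
            throw new IllegalArgumentException("Need a title row and at least one data row");
        }
        String[] header = rows.get(0);
        int nCols = header.length - 1;
        String[] columnTitles = new String[nCols];
        for (int c = 0; c < nCols; c++) {
            columnTitles[c] = header[c + 1].trim();
        }
        String[] rowTitles = new String[rows.size() - 1];
        double[][] values = new double[rows.size() - 1][nCols];
        for (int r = 1; r < rows.size(); r++) {
            String[] cells = rows.get(r);
            if (cells.length != header.length) {
                throw new IllegalArgumentException("Row " + r + " has an unexpected number of cells");
            }
            rowTitles[r - 1] = cells[0].trim();
            for (int c = 1; c < cells.length; c++) {
                values[r - 1][c - 1] = Double.parseDouble(cells[c].trim());
            }
        }
        return new MatrixLoaderSketch(columnTitles, rowTitles, values);
    }
}
```

For a TSV file the same method can be called with `"\t"` as the delimiter; the top-left cell of the sheet is ignored in both cases.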
Figure 1

Usage of HemI 1.0.

(A) The numerical data in one of three file formats can be directly loaded, and the data area can be selected by dragging or by holding SHIFT and clicking. Titles for the X-axis or Y-axis can also be specified; (B) Multiple options for manipulating the heatmap; (C) The numeric data can be clustered along either or both of the X-axis and Y-axis; (D) Publication-quality figures can be exported, and two figure formats are supported.

The generated heatmap can be easily manipulated in a customized manner. For example, the width and height of the artwork can be adjusted, and the blank space around the heatmap can also be changed (Figure 1B). The picture can be recolored and rotated, and the X-axis and Y-axis can be interchanged (Figure 1B). Moreover, the corresponding data can be clustered along either or both of the X-axis and Y-axis by clicking on the clustering options of the main panel (Figure 1C). After all configurations are finalized, the heatmap can be updated and displayed by clicking on the “REFRESH” button. To obtain publication-quality figures, users can export heatmaps by right-clicking on the canvas and choosing the export option. Users can select different resolutions for the output figures, such as 72 dpi, 300 dpi and 600 dpi (Figure 1D). Two picture formats, .png and .tiff, are provided to satisfy different requirements. The whole procedure is demonstrated in a ∼4-minute video on our website (http://hemi.biocuckoo.org/faq.php).

Discussion

The heatmap of potentially AR-regulated genes [9] was re-illustrated by HemI (Figure 2A). Also, poly-ADP-ribose polymerase (PARP) family proteins are involved in a variety of cellular pathways such as DNA repair and cell death, and are regarded as a class of important drug targets in cancer therapeutics [10]. As a sub-family of PARP, tankyrases also play an essential role in telomere length regulation [10]. Recently, a differential scanning fluorimetry (DSF) approach was adopted for rapid profiling of 185 known and potential PARP chemical compounds for their binding ability to 13 PARP family proteins including two tankyrases, TNKS1 and TNKS2 [10]. We redrew the heatmap of thermal shifts measured by DSF for all 185 inhibitors against 13 PARP members (Figure 2B). Our results are consistent with the previous analysis, which demonstrated that most inhibitors lack specificity and mainly target PARP1-4 as primary hits, while several compounds can efficiently inhibit both PARP1-4 and the two tankyrases [10].
Figure 2

Illustrating heatmaps by HemI 1.0.

(A) Thermal shifts, which indicate binding affinities of 185 compounds to 13 PARPs, were measured by DSF. A higher value represents a stronger binding affinity. (B) In total, 428 androgen-repressed genes were identified from LNCaP cells after treatment with 1 nM synthetic androgen R1881 for 3, 6, 12, 24 and 48 hours. Values shown were normalized to 0 hour and log2 transformed.

Taken together, we propose that HemI 1.0 can be a useful tool for both experimentalists and bioinformaticians, allowing users to draw, manipulate and export publication-quality heatmaps in a user-friendly manner. The software packages of HemI will be continuously maintained and improved based on users' comments and feedback.
References

1.  Java Treeview--extensible visualization of microarray data.

Authors:  Alok J Saldanha
Journal:  Bioinformatics       Date:  2004-06-04       Impact factor: 6.937

2.  Data mining the human gut microbiota for therapeutic targets. (Review)

Authors:  Matthew Collison; Robert P Hirt; Anil Wipat; Sirintra Nakjang; Philippe Sanseau; James R Brown
Journal:  Brief Bioinform       Date:  2012-03-24       Impact factor: 11.622

3.  An interactive power analysis tool for microarray hypothesis testing and generation.

Authors:  Jinwook Seo; Heather Gordish-Dressman; Eric P Hoffman
Journal:  Bioinformatics       Date:  2006-01-17       Impact factor: 6.937

4.  Tools for managing and analyzing microarray data.

Authors:  André Koschmieder; Karin Zimmermann; Silke Trissl; Thomas Stoltmann; Ulf Leser
Journal:  Brief Bioinform       Date:  2011-03-21       Impact factor: 11.622

5.  jHeatmap: an interactive heatmap viewer for the web.

Authors:  Jordi Deu-Pons; Michael P Schroeder; Nuria Lopez-Bigas
Journal:  Bioinformatics       Date:  2014-02-23       Impact factor: 6.937

6.  Family-wide chemical profiling and structural analysis of PARP and tankyrase inhibitors.

Authors:  Elisabet Wahlberg; Tobias Karlberg; Ekaterina Kouznetsova; Natalia Markova; Antonio Macchiarulo; Ann-Gerd Thorsell; Ewa Pol; Åsa Frostell; Torun Ekblad; Delal Öncü; Björn Kull; Graeme Michael Robertson; Roberto Pellicciari; Herwig Schüler; Johan Weigelt
Journal:  Nat Biotechnol       Date:  2012-02-19       Impact factor: 54.908

7.  Cooperation between Polycomb and androgen receptor during oncogenic transformation.

Authors:  Jonathan C Zhao; Jianjun Yu; Christine Runkle; Longtao Wu; Ming Hu; Dayong Wu; Jun S Liu; Qianben Wang; Zhaohui S Qin; Jindan Yu
Journal:  Genome Res       Date:  2011-12-16       Impact factor: 9.043

8.  Mayday--integrative analytics for expression data.

Authors:  Florian Battke; Stephan Symons; Kay Nieselt
Journal:  BMC Bioinformatics       Date:  2010-03-09       Impact factor: 3.169

9.  Cytobank: providing an analytics platform for community cytometry data analysis and collaboration. (Review)

Authors:  Tiffany J Chen; Nikesh Kotecha
Journal:  Curr Top Microbiol Immunol       Date:  2014       Impact factor: 4.291

10.  A survey of tools for variant analysis of next-generation genome sequencing data.

Authors:  Stephan Pabinger; Andreas Dander; Maria Fischer; Rene Snajder; Michael Sperk; Mirjana Efremova; Birgit Krabichler; Michael R Speicher; Johannes Zschocke; Zlatko Trajanoski
Journal:  Brief Bioinform       Date:  2013-01-21       Impact factor: 11.622
