Shilin Zhao, Yan Guo, Quanhu Sheng, Yu Shyr.
Abstract
Heat maps and clustering are used frequently in expression analysis studies for data visualization and quality control. Simple clustering and heat maps can be produced with the "heatmap" function in R. However, the "heatmap" function lacks certain functionality and customizability, which prevents it from generating advanced heat maps and dendrograms. To address these limitations, we have developed an R package, "heatmap3", which improves on the original "heatmap" function by adding several more powerful and convenient features. The "heatmap3" package allows users to produce highly customizable, state-of-the-art heat maps and dendrograms. It is based on the "heatmap" function in R and is completely compatible with it. The new features of "heatmap3" include highly customizable legends and side annotation, a wider range of color selections, new labeling features that allow users to define multiple layers of phenotype variables, and association tests conducted automatically on the phenotypes provided. Additional clustering features, such as different agglomeration methods for estimating the distance between two samples, are also included.
Year: 2014 PMID: 25143956 PMCID: PMC4124803 DOI: 10.1155/2014/986048
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. An example heat map produced by the “heatmap3” package. The heat map was generated from 30 samples in the TCGA BRCA dataset. The dendrogram of samples (top) was divided into two clusters based on the correlation between the samples' gene expression, and each cluster was labeled. The categorical annotation bars (above the heat map) show the annotation for age, TN, HER2, PR, and ER. The color bar on the left side shows the log2 fold changes and negative log10 P values from the comparison of triple negative patients versus non-triple negative patients.
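The caption says the samples were split according to the correlation between their expression profiles. As a minimal sketch of that idea, the snippet below computes a correlation-based distance, defined here as 1 minus the Pearson correlation. This definition is an assumption (a common choice for expression data), the sample vectors are hypothetical, and the code is plain Python for illustration rather than code from the R package.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_distance(x, y):
    """Assumed distance: 0 for perfectly correlated profiles, 2 for anti-correlated."""
    return 1.0 - pearson(x, y)


# Hypothetical expression profiles (one value per gene), not data from the paper.
sample_a = [1.0, 2.0, 3.0, 4.0]
sample_b = [2.0, 4.0, 6.0, 8.0]  # same shape as sample_a, different scale
sample_c = [4.0, 3.0, 2.0, 1.0]  # reversed trend

dist_ab = correlation_distance(sample_a, sample_b)
dist_ac = correlation_distance(sample_a, sample_c)
print(dist_ab, dist_ac)
```

Under this definition, samples with the same expression pattern sit close together regardless of overall scale, which is why a correlation-based distance tends to group samples by expression shape rather than magnitude.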
The statistical test result for categorical annotation in different groups.
| | Cluster1 | Cluster2 | P value |
|---|---|---|---|
| ER | | | |
| Negative | 2 | 11 | 0.003 |
| Positive | 13 | 4 | |
| Positive Percent | 0.87 | 0.27 | |
| PR | | | |
| Negative | 4 | 13 | 0.003 |
| Positive | 11 | 2 | |
| Positive Percent | 0.73 | 0.13 | |
| HER2 | | | |
| Negative | 13 | 13 | 0.023 |
| Positive | 2 | 2 | |
| Positive Percent | 0.13 | 0.13 | |
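To make the ER rows of the table concrete, the sketch below recomputes the "Positive Percent" values from the counts and runs a two-sided Fisher exact test on the same 2x2 table. The choice of Fisher's exact test is an assumption made for illustration, since the test used is not named here, and the code is plain Python rather than the R package's own implementation.

```python
from math import comb


def positive_fraction(negative, positive):
    """Fraction of positive samples in one cluster."""
    return positive / (negative + positive)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / total

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        p
        for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# ER counts from the table: (Cluster1, Cluster2)
er_negative = (2, 11)
er_positive = (13, 4)

er_p = fisher_exact_two_sided(*er_negative, *er_positive)
er_pct_cluster1 = positive_fraction(er_negative[0], er_positive[0])
er_pct_cluster2 = positive_fraction(er_negative[1], er_positive[1])
print(round(er_pct_cluster1, 2), round(er_pct_cluster2, 2), round(er_p, 3))
```

For the ER counts, the fractions round to 0.87 and 0.27, and the exact test gives roughly 0.0025, which rounds to the 0.003 listed in the table.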
The statistical test result for age in different groups; ANOVA P value: 0.429.
| Age | Cluster1 | Cluster2 |
|---|---|---|
| Min. | 41.00 | 46.00 |
| 1st Qu. | 51.00 | 49.00 |
| Median | 61.00 | 55.00 |
| Mean | 60.00 | 57.13 |
| 3rd Qu. | 64.25 | 62.50 |
| Max. | 89.00 | 80.00 |
Figure 2. The dendrograms and clusters generated from the top 500 genes, the top 3000 genes, and all genes, with genes ranked by standard deviation. The triple negative samples were more enriched in one group when genes with larger standard deviation were used. The results demonstrate that the “heatmap3” package can be helpful in selecting genes that best represent the phenotypes of samples.
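Selecting genes by standard deviation, as described in the caption, can be sketched as ranking each gene by the spread of its expression across samples and keeping the top N. The snippet below is a minimal illustration in plain Python; the gene names, values, and the `top_variable_genes` helper are hypothetical and are not part of the R package.

```python
from statistics import stdev


def top_variable_genes(expression, n):
    """Return the names of the n genes with the largest sample standard deviation."""
    ranked = sorted(expression, key=lambda gene: stdev(expression[gene]), reverse=True)
    return ranked[:n]


# Hypothetical toy matrix: gene name -> expression values across four samples.
expression = {
    "geneA": [5.0, 5.1, 4.9, 5.0],  # nearly constant across samples
    "geneB": [1.0, 9.0, 2.0, 8.0],  # highly variable
    "geneC": [3.0, 4.0, 3.5, 4.5],  # mildly variable
    "geneD": [0.0, 6.0, 0.5, 5.5],  # highly variable
}

top2 = top_variable_genes(expression, 2)
print(top2)
```

Genes that barely change across samples carry little information for separating phenotypes, so restricting the clustering input to the most variable genes is one way to sharpen the resulting groups.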