Huakun Wang1,2, Zhenzhen Wang1, Xia Li1, Binsheng Gong1, Lixin Feng2, Ying Zhou2.
Abstract
BACKGROUND: Clustering is a widely used technique for analyzing gene expression data. Most clustering methods group genes by distance, while few group genes according to the similarity of the distributions of their expression levels. Furthermore, as biological annotation resources have accumulated, an increasing number of genes have been annotated into functional categories. As a result, evaluating the performance of clustering methods in terms of the functional consistency of the resulting clusters is of great interest.
Year: 2011 PMID: 21624141 PMCID: PMC3118357 DOI: 10.1186/1748-7188-6-14
Source DB: PubMed Journal: Algorithms Mol Biol ISSN: 1748-7188 Impact factor: 1.405
Figure 1. Lung cancer data set clustered using the WDCM. (A) Distribution parameters scatter plot. The horizontal axis corresponds to the shape parameter a, and the vertical axis corresponds to the scale parameter b. Parameter pairs in different clusters are drawn in different colors. (B) Cluster profile plots.
Figure 2. Follicular lymphoma data set clustered using the WDCM. (A) Distribution parameters scatter plot. (B) Cluster profile plots.
Figure 3. Bladder carcinoma data set clustered using the WDCM. (A) Distribution parameters scatter plot. (B) Cluster profile plots.
Figure 4. Biological annotation ratios of clustering results. (A) Final annotation ratios of lung cancer clusters found by three different methods in GO biological processes (BP), cellular components (CC) and molecular functions (MF). (B) Final annotation ratios of follicular lymphoma clusters found by three different methods in GO biological processes (BP), cellular components (CC) and molecular functions (MF). (C) Final annotation ratios of bladder carcinoma clusters.
ARI values of WDCM, k-means and SOM algorithms for the lung cancer, B-cell follicular lymphoma and bladder carcinoma gene expression data sets
| Algorithm | Lung cancer | Follicular lymphoma | Bladder carcinoma |
|---|---|---|---|
| WDCM | | | |
| k-means | 0.2478 | 0.3481 | 0.1623 |
| SOM | 0.3681 | 0.2647 | 0.0926 |
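For readers unfamiliar with the metric, the following is a minimal sketch of how an ARI value can be computed for a pair of partitions, assuming ARI here denotes the adjusted Rand index in its usual Hubert and Arabie form. The function name `adjusted_rand_index` and the toy label lists are illustrative assumptions, not taken from the paper, and this record does not state which reference partition the authors compared against.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items.

    Illustrative helper (hypothetical name), not the authors' code.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no item pairs to disagree on

    # Contingency counts: items sharing each (label_a, label_b) combination.
    cell_counts = Counter(zip(labels_a, labels_b))
    row_counts = Counter(labels_a)
    col_counts = Counter(labels_b)

    # Number of item pairs placed together in both, in A only, in B only.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_rows = sum(comb(c, 2) for c in row_counts.values())
    sum_cols = sum(comb(c, 2) for c in col_counts.values())
    total_pairs = comb(n, 2)

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2         # agreement of a perfect match
    if max_index == expected:
        return 1.0  # both partitions are trivial and identical
    return (sum_cells - expected) / (max_index - expected)


# Toy example: four genes, two clusterings that differ only in label names.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

A value of 1 indicates identical partitions, values near 0 indicate agreement no better than chance, and negative values indicate agreement worse than chance.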
COR indices with respect to the specified percentages of missing values for the lung cancer, B-cell follicular lymphoma and bladder carcinoma data sets
| Percentage of missing | Lung cancer | Follicular lymphoma | Bladder carcinoma |
|---|---|---|---|
| 5% | 0.9140 | 0.9495 | 0.9823 |
| 10% | 0.8654 | 0.9078 | 0.9702 |
| 15% | 0.8220 | 0.8738 | 0.9565 |
| 20% | 0.7892 | 0.8418 | 0.9450 |
| 25% | 0.7649 | 0.8120 | 0.9335 |