Junwan Liu, Zhoujun Li, Xiaohua Hu, Yiming Chen, E K Park.
Abstract
BACKGROUND: New microarray technologies yield large-scale datasets. Microarray datasets are usually presented as 2D matrices, where rows represent genes and columns represent experimental conditions. Systematic analysis of these datasets provides an increasing amount of information, which is urgently needed in the post-genomic era. Biclustering, a technique that clusters the rows and columns of a dataset simultaneously, may help extract more accurate information from these datasets. Biclustering requires the optimization of two conflicting objectives (residue and volume), which makes it a natural fit for a multi-objective artificial immune system capable of performing a multi-population search. As a heuristic search technique, artificial immune systems (AISs) can be considered a new computational paradigm inspired by the immunological system of vertebrates and designed to solve a wide range of optimization problems. Because several conflicting objectives have to be optimized simultaneously during biclustering, a multi-objective optimization model is suitable for solving the biclustering problem.
Year: 2011 PMID: 21989068 PMCID: PMC3194232 DOI: 10.1186/1471-2164-12-S2-S11
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Information of biclusters found on yeast dataset
| Bicluster | Genes | Conditions | Residue | Row Variance |
|---|---|---|---|---|
| 1 | 1 | 16 | 238.54 | 789.25 |
| 22 | 91 | 17 | 210.58 | 685.36 |
| 24 | 563 | 12 | 201.55 | 875.65 |
| 29 | 1233 | 9 | 275.69 | 896.35 |
| 78 | 145 | 13 | 225.11 | 745.65 |
| 98 | 874 | 11 | 207.98 | 874.01 |
Table 1 shows the number of genes and conditions, the mean squared residue and the row variance of six biclusters out of the one hundred biclusters found on the yeast dataset.
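The three quantities reported per bicluster can be illustrated with a short sketch. It assumes the usual Cheng and Church definitions of mean squared residue and row variance, which this record does not restate, and takes volume to be genes × conditions. The function name `bicluster_stats` and the toy matrix are hypothetical and are not taken from the paper.

```python
def bicluster_stats(sub):
    """Return (volume, mean squared residue, row variance) for a bicluster.

    `sub` is the bicluster submatrix as a list of rows (genes), each row a
    list of expression values (conditions). Definitions are assumed to be
    the standard Cheng and Church ones.
    """
    n_rows, n_cols = len(sub), len(sub[0])
    volume = n_rows * n_cols

    row_means = [sum(row) / n_cols for row in sub]
    col_means = [sum(row[j] for row in sub) / n_rows for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows

    squared_residue = 0.0
    squared_row_dev = 0.0
    for i, row in enumerate(sub):
        for j, value in enumerate(row):
            # Residue: what remains after removing row and column effects.
            squared_residue += (value - row_means[i] - col_means[j] + overall_mean) ** 2
            # Row deviation: how much a gene fluctuates across the conditions.
            squared_row_dev += (value - row_means[i]) ** 2

    return volume, squared_residue / volume, squared_row_dev / volume


# Toy bicluster: every row is the first row plus a constant shift,
# so the residue vanishes while the rows still vary across conditions.
toy = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [4.0, 5.0, 6.0],
]
volume, msr, row_var = bicluster_stats(toy)
```

Under these definitions, a low residue together with a high row variance indicates genes that change coherently across the selected conditions rather than genes that are simply flat.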
Figure 1. Small biclusters of size 26×15 on the yeast dataset. Figure 1 shows the expression values of 26 genes under 15 conditions from a small bicluster (bicluster 22).
Biclusters found on human dataset
| Bicluster | Genes | Conditions | Residue | Row Variance |
|---|---|---|---|---|
| 1 | 597 | 49 | 855.69 | 3584.54 |
| 3 | 611 | 45 | 911.58 | 2875.12 |
| 8 | 1024 | 31 | 887.54 | 3012.25 |
| 10 | 478 | 39 | 812.88 | 6854.54 |
| 22 | 874 | 29 | 874.96 | 8740.24 |
| 31 | 698 | 37 | 800.74 | 4870.91 |
Table 2 shows the number of genes and conditions, the mean squared residue and the row variance of six biclusters out of the one hundred biclusters found on the human dataset.
Comparative study of three algorithms
| Algorithm | Dataset | Avg. MSR | Avg. size | Avg. time |
|---|---|---|---|---|
| DMOIOB | Yeast | 201.86 | 2841.08 | 88.02 |
| DMOIOB | Human | 832.79 | 7106.51 | 258.48 |
| MOIB | Yeast | 202.32 | 2638.74 | 108.12 |
| MOIB | Human | 839.74 | 6918.29 | 280.76 |
Table 3 compares the performance of the two algorithms. It gives the average mean squared residue and the average size of the biclusters found, along with the computation cost of each algorithm.
Significant GO terms of genes in three biclusters
| Cluster No. | No. of genes | Process | Function | Component |
|---|---|---|---|---|
| 1 | 99 | Response to DNA damage stimulus (n=21,p=0.0016) | RNA polymerase II transcription factor activity (n=11,p=0.0064) | Intracellular membrane-bound organelle (n=16,p=0.0025) |
| 22 | 91 | Physiological process (n=23,p=0.0014) | MAP kinase activity (n=6,p=0.0023) | Cytosolic ribosome (n=17,p=0.0042) |
| 78 | 145 | Protein biosynthesis (n=52,p=0.0024) | Protein transporter activity (n=9,p=0.0021) | Cytosolic ribosome (n=12,p=0.0032) |
Table 4 lists the significant shared GO terms which are used to describe genes in each bicluster for the process, function and component ontology.
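This record does not say how the p-values in Table 4 were computed. A common choice for GO term enrichment is the upper tail of the hypergeometric distribution, and the sketch below shows that test only as an assumption about what such a p-value typically means. The function name `enrichment_p_value`, its parameter names, and the toy numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, cluster_size, hits):
    """Probability of seeing at least `hits` annotated genes by chance.

    population   -- total number of genes in the background set
    annotated    -- background genes that carry the GO term
    cluster_size -- number of genes in the bicluster
    hits         -- bicluster genes that carry the GO term
    """
    upper = min(cluster_size, annotated)
    # Count the ways to draw `cluster_size` genes with i annotated ones,
    # for every i from `hits` up to the largest possible value.
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, cluster_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / comb(population, cluster_size)


# Toy example: 10 background genes, 5 of them annotated, and a cluster of 5
# genes that are all annotated. Only one of the 252 possible clusters does that.
p = enrichment_p_value(population=10, annotated=5, cluster_size=5, hits=5)
```

A small p-value means the term is over-represented in the bicluster relative to the background, which is the sense in which the listed GO terms are called significant.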