Abstract
BACKGROUND: In functional analysis of gene expression data, biclustering can provide crucial information by revealing correlated gene expression patterns under a subset of conditions. However, conventional biclustering algorithms are still limited in their ability to produce comprehensive and stable outputs.
Year: 2013 PMID: 23496895 PMCID: PMC3618306 DOI: 10.1186/1471-2164-14-144
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Table 1. Comparison of average recovery scores for simulated datasets with various correlated patterns
| BICLIC | 1 | 1 | 1 |
| BCCA | 0.141 | 0.181 | 0.168 |
| CPB | 1 | 0.996 | 0.915 |
| QUBIC | 0.431 | 0.169 | 0.466 |
The average recovery score ranges from 0 (minimum) to 1 (maximum). Each value in Table 1 is the mean of the average recovery scores from 10 independent datasets.
Table 2. Comparison of average relevance scores for simulated datasets with various correlated patterns
| BICLIC | 1 | 1 | 1 |
| BCCA | 0.060 | 0.109 | 0.094 |
| CPB | 0.143 | 0.297 | 0.258 |
| QUBIC | 0.038 | 0.043 | 0.107 |
The average relevance score ranges from 0 (minimum) to 1 (maximum). Each value in Table 2 is the mean of the average relevance scores from 10 independent datasets.
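The record above does not define the recovery and relevance scores. As a minimal sketch, assuming they follow the commonly used gene-set match score (average best Jaccard overlap between two sets of biclusters), they could be computed as below. The names `jaccard`, `match_score`, `recovery`, and `relevance` are hypothetical, and the authors' exact definition may differ, for example by also comparing condition sets.

```python
from typing import FrozenSet, List, Tuple

# A bicluster is a pair (gene indices, condition indices).
Bicluster = Tuple[FrozenSet[int], FrozenSet[int]]


def jaccard(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Jaccard index of two index sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_score(source: List[Bicluster], target: List[Bicluster]) -> float:
    """Average, over biclusters in `source`, of the best gene-set match in `target`.

    Assumption: only the gene dimension is compared.
    """
    if not source or not target:
        return 0.0
    best = [max(jaccard(g1, g2) for g2, _ in target) for g1, _ in source]
    return sum(best) / len(source)


def recovery(true_bcs: List[Bicluster], found_bcs: List[Bicluster]) -> float:
    """How well each implanted (true) bicluster is recovered by the output."""
    return match_score(true_bcs, found_bcs)


def relevance(true_bcs: List[Bicluster], found_bcs: List[Bicluster]) -> float:
    """How well each reported bicluster corresponds to an implanted one."""
    return match_score(found_bcs, true_bcs)


# Toy data (not from the paper): one implanted bicluster, two reported ones.
true_bcs = [(frozenset({0, 1, 2, 3}), frozenset({0, 1}))]
found_bcs = [
    (frozenset({0, 1, 2, 3}), frozenset({0, 1})),  # exact hit
    (frozenset({0, 1}), frozenset({0})),           # partial extra bicluster
]

print(recovery(true_bcs, found_bcs))   # 1.0
print(relevance(true_bcs, found_bcs))  # 0.75
```

Under this assumed definition, an algorithm that reports the implanted bicluster plus additional partial ones keeps a recovery of 1 while its relevance drops, which is one way a method can score high in Table 1 and low in Table 2.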
Figure 1. Effect of column fraction level on the average recovery score for the shifting, scaling, and shifting-scaling patterns. Each average recovery score is the mean of the average recovery scores from 10 independent datasets.
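The three pattern names in the Figure 1 caption are not defined in this record. The sketch below assumes their usual meaning in the biclustering literature: a shifting pattern adds a constant to a base profile, a scaling pattern multiplies it by a constant, and a shifting-scaling pattern does both. The helper `pearson` and the toy values are illustrative only. With a positive scaling factor, each derived profile is perfectly correlated with the base profile.

```python
import math
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy base expression profile across four conditions (not from the paper).
base = [1.0, 3.0, 2.0, 5.0]

shifted = [v + 2.0 for v in base]             # shifting: base + beta
scaled = [3.0 * v for v in base]              # scaling: alpha * base
shift_scaled = [3.0 * v + 2.0 for v in base]  # shifting-scaling: alpha * base + beta

for name, row in [("shifting", shifted),
                  ("scaling", scaled),
                  ("shifting-scaling", shift_scaled)]:
    # Each correlation equals 1 up to floating-point rounding.
    print(name, round(pearson(base, row), 6))
```

Because all three transformations preserve correlation with the base profile, a correlation-based criterion can in principle detect each of them, whereas a criterion based on absolute differences between rows would only be expected to capture some.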
Table 3. Summary statistics of biclustering algorithms for the yeast stress dataset
| Algorithm | Count | Average \|I x J\| | Gene cov. | Condition cov. | Cell cov. |
| BICLIC | 14791 | 2249.3 | 1 | 1 | 0.999 |
| | (11172) | (7.2) | (0.905) | (1) | (0.109) |
| BCCA | 8163 | 2936.8 | 0.776 | 1 | 0.317 |
| CPB | 3634 | 8413.6 | 0.512 | 1 | 0.185 |
| QUBIC | 2146 | 847.4 | 0.884 | 0.746 | 0.112 |
Values in parentheses are those of the seed biclusters of BICLIC. The columns “Count”, “Average |I x J|”, “Gene cov.”, “Condition cov.”, and “Cell cov.” give, respectively, the number of biclusters, the average bicluster size, the coverage of the biclusters in the gene dimension, their coverage in the condition dimension, and their coverage of all cells in the matrix.
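The column definitions above can be sketched in code. This is a minimal illustration, assuming that each coverage value is the fraction of genes, conditions, or matrix cells that belong to at least one bicluster. The function name `coverage_summary` and the toy input are hypothetical and not taken from the paper.

```python
from typing import Dict, Iterable, List, Set, Tuple


def coverage_summary(
    biclusters: List[Tuple[Iterable[int], Iterable[int]]],
    n_genes: int,
    n_conditions: int,
) -> Dict[str, float]:
    """Summary statistics for biclusters given as (gene indices I, condition indices J)."""
    genes: Set[int] = set()
    conditions: Set[int] = set()
    cells: Set[Tuple[int, int]] = set()
    total_size = 0

    for gene_idx, cond_idx in biclusters:
        I, J = set(gene_idx), set(cond_idx)
        genes |= I
        conditions |= J
        cells.update((i, j) for i in I for j in J)  # cells covered by this bicluster
        total_size += len(I) * len(J)               # |I x J|

    count = len(biclusters)
    return {
        "count": count,
        "avg_size": total_size / count if count else 0.0,
        "gene_cov": len(genes) / n_genes,
        "condition_cov": len(conditions) / n_conditions,
        "cell_cov": len(cells) / (n_genes * n_conditions),
    }


# Toy 4-gene x 3-condition matrix with two overlapping biclusters.
stats = coverage_summary([({0, 1}, {0, 1}), ({1, 2}, {1, 2})], n_genes=4, n_conditions=3)
print(stats)
```

In the toy example the two biclusters share one cell, so seven of the twelve cells are covered even though the bicluster sizes sum to eight. Storing covered cells in a set is simple but memory-hungry; for matrices of realistic size a boolean mask would be a more practical choice.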
Table 4. Summary statistics of biclustering algorithms for the lung cancer dataset
| Algorithm | Count | Average \|I x J\| | Gene cov. | Condition cov. | Cell cov. |
| BICLIC | 6019 | 2302.8 | 1 | 1 | 0.999 |
| | (3734) | (4.2) | (0.389) | (1) | (0.021) |
| CPB | 386 | 4594.8 | 0.672 | 1 | 0.344 |
| QUBIC | 1355 | 68.2 | 0.543 | 1 | 0.048 |
Values in parentheses are those of the seed biclusters of BICLIC. The columns “Count”, “Average |I x J|”, “Gene cov.”, “Condition cov.”, and “Cell cov.” give, respectively, the number of biclusters, the average bicluster size, the coverage of the biclusters in the gene dimension, their coverage in the condition dimension, and their coverage of all cells in the matrix.
Figure 2. Proportion of biclusters remaining after removal of overlapping biclusters, for each biclustering algorithm on the yeast stress dataset.
Figure 3. Number of significantly enriched biological terms for the four biclustering algorithms in four functional categories at the 1% significance threshold for the yeast stress dataset.
Figure 4. Number of significantly enriched biological terms for the three biclustering algorithms in four functional categories at the 1% significance threshold for the lung cancer dataset.
Figure 5. Schematic diagram of determining seed biclusters.
Figure 6. Schematic diagram of expanding seed biclusters.