Hamid Alavi Majd, Soodeh Shahsavari, Ahmad Reza Baghestani, Seyyed Mohammad Tabatabaei, Naghme Khadem Bashi, Mostafa Rezaei Tavirani, Mohsen Hamidpour.
Abstract
Background. Biclustering algorithms have been proposed for the analysis of high-dimensional gene expression data. Among them, the plaid model is arguably one of the most flexible biclustering models to date.

Objective. The main goal of this study is to evaluate the plaid model. To that end, we investigate the model on both simulated data and real gene expression datasets.

Methods. Two simulated matrices with different degrees of overlap and noise are generated, and the intrinsic structure of these data is compared with the biclustering results. We also assess the biological significance of the discovered biclusters by GO analysis.

Results. When there is no noise, the algorithm discovers almost all of the biclusters. When there is moderate noise in the dataset, the algorithm does not perform well in finding overlapping biclusters, and when the noise is large, the biclustering result is not reliable.

Conclusion. The plaid model needs to be modified, because it cannot find good biclusters when there is moderate or large noise in the data. It is a statistical model and a quite flexible one, so the model can be modified and the error distribution changed in order to reduce the errors.
Year: 2016 PMID: 27051553 PMCID: PMC4804094 DOI: 10.1155/2016/3059767
Source DB: PubMed Journal: Scientifica (Cairo) ISSN: 2090-908X
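The simulation design described in the Methods (a data matrix containing two biclusters, with varying overlap and noise, whose known structure is then compared with the discovered biclusters) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, bicluster sizes, effect sizes, and the Gaussian noise model are all assumptions chosen for the example. Overlapping cells receive the sum of both effects, which mirrors the additive layering idea behind the plaid model.

```python
import numpy as np


def simulate_two_biclusters(n_rows=50, n_cols=20, noise_sd=0.5,
                            row_overlap=2, col_overlap=1, seed=0):
    """Build a matrix with two additive biclusters (hypothetical sizes)."""
    rng = np.random.default_rng(seed)

    # Assumed bicluster layout: 20 x 6 and 12 x 5, shifted to create overlap.
    rows1, cols1 = np.arange(0, 20), np.arange(0, 6)
    rows2 = np.arange(20 - row_overlap, 32 - row_overlap)
    cols2 = np.arange(6 - col_overlap, 11 - col_overlap)

    # Background is pure Gaussian noise (an assumption of this sketch).
    data = rng.normal(0.0, noise_sd, size=(n_rows, n_cols))

    # Each bicluster adds a constant effect; overlapping cells get both.
    data[np.ix_(rows1, cols1)] += 3.0
    data[np.ix_(rows2, cols2)] += 5.0
    return data, [(rows1, cols1), (rows2, cols2)]


def count_correct(true_idx, found_idx):
    """Number of true row (or column) indices that were also discovered."""
    return int(np.intersect1d(true_idx, found_idx).size)


data, truth = simulate_two_biclusters(noise_sd=0.0)
# Pretend an algorithm reported rows 5..24 for the first bicluster.
print(count_correct(truth[0][0], np.arange(5, 25)))
```

Counting the intersection of true and discovered indices, separately for rows and columns, gives the kind of "number of correct rows/columns" summary reported in the tables below.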
Mean and MSR of numbers of discovered rows and columns of biclusters in matrix 50 × 20.
| Noise % | Overlap | Num of true biclusters | Dimensions of true biclusters | Num of discovered biclusters (MSR) | Num of correct rows, Bic1 (MSR) | Num of correct rows, Bic2 (MSR) | Num of correct columns, Bic1 (MSR) | Num of correct columns, Bic2 (MSR) | Num of overlap discovered (MSR) |
|---|---|---|---|---|---|---|---|---|---|
| | | | | 2.31 (1.09) | 21.0 (0.00) | 12.0 (0.00) | 6 (0.00) | 5 (0.00) | 0 (0.00) |
| | | | | 2.29 (0.84) | 20.0 (0.00) | 12.0 (0.00) | 6 (0.00) | 5 (0.00) | 6 (0.00) |
| | | | | 2.45 (0.68) | 20.0 (0.00) | 22.0 (0.05) | 6 (0.00) | 5 (0.00) | 10 (0.00) |
| | | | | | | | | | |
| | | | | 1.95 (2.27) | 10.6 (52.08) | 10.6 (8.09) | 4.7 (2.16) | 4.6 (1.08) | 0 (0.00) |
| | | | | 2.00 (2.32) | 11.5 (64.47) | 10.6 (9.93) | 4.4 (3.03) | 4.6 (2.95) | 0 (0.00) |
| | | | | 1.80 (1.38) | 18.7 (128.19) | 10.5 (25.78) | 4.7 (2.38) | 4.8 (4.21) | 0 (0.00) |
Mean and MSR of numbers of discovered rows and columns of biclusters in matrix 500 × 50.
| Noise % | Overlap | Num of biclusters | Dimensions of biclusters | Num of discovered biclusters (MSR) | Num of correct rows, Bic1 (MSR) | Num of correct rows, Bic2 (MSR) | Num of correct columns, Bic1 (MSR) | Num of correct columns, Bic2 (MSR) | Num of overlap discovered (MSR) |
|---|---|---|---|---|---|---|---|---|---|
| | | | | 2.82 (1.93) | 101.00 (0.00) | 121.00 (0.00) | 20.00 (0.00) | 19.44 (1.42) | 0.00 (0.00) |
| | | | | 4.29 (9.59) | 110.21 (54.15) | 195.07 (74.43) | 19.94 (1.20) | 15.56 (8.73) | 28.81 (6.19) |
| | | | | 4.32 (7.05) | 185.26 (87.97) | 151.79 (82.26) | 20.00 (0.00) | 18.37 (12.65) | 37.16 (38.84) |
| | | | | | | | | | |
| | | | | 3.39 (3.05) | 99.92 (2.53) | 119.46 (0.00) | 20.00 (0.00) | 18.51 (10.63) | 0.00 (0.00) |
| | | | | 6.03 (16.56) | 101.91 (64.86) | 179.22 (68.22) | 19.94 (1.16) | 18.29 (17.07) | 20.38 (14.62) |
| | | | | 4.92 (11.15) | 167.81 (98.41) | 135.50 (83.37) | 20.00 (0.00) | 17.37 (22.19) | 23.85 (52.15) |
| | | | | | | | | | |
| | | | | 4.73 (9.08) | 92.91 (8.55) | 112.19 (12.58) | 20.00 (0.00) | 16.24 (29.54) | 0.113 (0.113) |
| | | | | 6.21 (21.02) | 86.72 (140.16) | 156.87 (156.38) | 19.78 (4.14) | 17.59 (16.75) | 11.65 (23.35) |
| | | | | 5.22 (13.08) | 136.57 (412.83) | 127.25 (289.13) | 19.99 (0.00) | 14.74 (32.36) | 15.64 (60.36) |
| | | | | | | | | | |
| | | | | 4.77 (9.21) | 81.66 (420.92) | 102.06 (586.21) | 19.93 (0.07) | 14.33 (85.61) | 0.87 (0.86) |
| | | | | 6.05 (20.15) | 75.95 (490.59) | 140.39 (392.47) | 19.67 (5.19) | 15.66 (34.91) | 8.75 (26.25) |
| | | | | 5.16 (12.88) | 109.19 (956.23) | 123.07 (883.16) | 19.73 (0.03) | 10.35 (59.34) | 11.70 (64.29) |
| | | | | | | | | | |
| | | | | 4.16 (7.03) | 48.92 (276.52) | 94.98 (426.82) | 19.52 (1.25) | 8.96 (192.06) | 4.22 (6.26) |
| | | | | 5.06 (12.67) | 56.53 (434.02) | 137.14 (513.55) | 19.53 (4.49) | 12.19 (151.78) | 13.93 (21.07) |
| | | | | 4.38 (9.73) | 73.65 (784.51) | 90.06 (963.49) | 19.52 (1.04) | 8.92 (218.00) | 3.08 (72.92) |
Figure 1: Correct rows and columns of discovered biclusters in the matrix with dimensions 500 × 50.
Information about the biclustering result.
| Label | Genes | Conditions | MSR |
|---|---|---|---|
| A | 189 | 2 | 56.98 |
| B | 78 | 6 | 21.06 |
| C | 14 | 9 | 25.73 |
| D | 30 | 6 | 0.23 |
| E | 3 | 13 | 0.00 |
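
The source does not define the MSR column. Assuming it denotes the mean squared residue commonly used to score bicluster coherence (each cell's deviation from its row mean, column mean, and overall mean, squared and averaged), it could be computed as in the sketch below. The function name and the example matrices are hypothetical and are not taken from the study's data.

```python
import numpy as np


def mean_squared_residue(submatrix):
    """Mean squared residue of a bicluster given as a genes x conditions array.

    Assumes the definition residue_ij = a_ij - rowmean_i - colmean_j + mean.
    """
    sub = np.asarray(submatrix, dtype=float)
    row_means = sub.mean(axis=1, keepdims=True)   # one mean per gene
    col_means = sub.mean(axis=0, keepdims=True)   # one mean per condition
    overall = sub.mean()
    residue = sub - row_means - col_means + overall
    return float(np.mean(residue ** 2))


# A perfectly additive bicluster (row effect + column effect) scores 0.
additive = np.add.outer([1.0, 2.0, 3.0], [10.0, 20.0])
print(mean_squared_residue(additive))

# A checkerboard pattern is not additive, so its score is positive.
print(mean_squared_residue([[1.0, 0.0], [0.0, 1.0]]))
```

Under this definition a value near zero, as reported for biclusters D and E, would indicate that the selected genes vary almost perfectly in parallel across the selected conditions.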
Biological significance of the biclustering results.
| Bicluster | Number of GO terms | Ontology | <0.05 | <0.01 | <0.005 | <0.001 |
|---|---|---|---|---|---|---|
| A | 748 | Biological process | 87.5 | 63.1 | 57 | 42.5 |
| | | Molecular function | 85.5 | 67.3 | 60 | 53.6 |
| | | Cellular component | 84.2 | 58.4 | 47.5 | 28.7 |
| B | 94 | Biological process | 76.5 | 58.8 | 51 | 7.8 |
| | | Molecular function | 81 | 38.1 | 38.1 | 23.8 |
| | | Cellular component | 63.6 | 50 | 36.4 | 9.1 |
| C | 43 | Biological process | 57.7 | 19.2 | 15.4 | 11.5 |
| | | Molecular function | 40 | 20 | 20 | 10 |
| | | Cellular component | 71.4 | 14.3 | 14.3 | 14.3 |
| D | 171 | Biological process | 96 | 58 | 41.7 | 22.5 |
| | | Molecular function | 88.9 | 72.2 | 61.1 | 50 |
| | | Cellular component | 87.9 | 60.6 | 54.5 | 33.3 |
| E | 10 | Biological process | 80 | 0 | 0 | 0 |
| | | Molecular function | - | - | - | - |
| | | Cellular component | 60 | 20 | 0 | 0 |