Işık Barış Fidaner, Ayca Cankorur-Cetinkaya, Duygu Dikicioglu, Betul Kirdar, Ali Taylan Cemgil, Stephen G Oliver.
Abstract
MOTIVATION: Simple bioinformatic tools are frequently used to analyse time-series datasets regardless of their ability to deal with transient phenomena, limiting the meaningful information that may be extracted from them. This situation requires the development and exploitation of tailor-made, easy-to-use and flexible tools designed specifically for the analysis of time-series datasets.
Year: 2015 PMID: 26411869 PMCID: PMC4734040 DOI: 10.1093/bioinformatics/btv532
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. An example piecewise linear sequence model
Fig. 2. Parameter distribution and convergence. Histograms of K, α, d, a, b through the iterations for GLU (a), SPO (b) and MUS (c)
Summary of the clustering analysis of the datasets given at the marginal parameter settings for m and e
| Merge threshold (m) | 0.1 | 0.1 | 0.9 | 0.9 |
|---|---|---|---|---|
| Extension threshold (e) | 0.1 | 0.9 | 0.1 | 0.9 |
| **GLU** | | | | |
| Total number of unique clusters | 30 | 30 | 34 | 126 |
| Size of the largest cluster | 34 | 32 | 34 | 21 |
| Number of singletons | 2 | 2 | 2 | 70 |
| Number of clusters enriched with ≥1 GO term* | 13 | 10 | 18 | 12 |
| % of clusters enriched with ≥1 GO term** | 46 | 36 | 56 | 21 |
| **SPO** | | | | |
| Total number of unique clusters | 39 | 39 | 52 | 694 |
| Size of the largest cluster | 127 | 103 | 128 | 30 |
| Number of singletons | 0 | 0 | 0 | 545 |
| Number of clusters enriched with ≥1 GO term* | 14 | 10 | 16 | 33 |
| % of clusters enriched with ≥1 GO term** | 36 | 26 | 31 | 22 |
| **MUS** | | | | |
| Total number of unique clusters | 117 | 117 | 184 | 2249 |
| Size of the largest cluster | 142 | 121 | 142 | 35 |
| Number of singletons | 0 | 0 | 0 | 1906 |
| Number of clusters enriched with ≥1 GO term* | 2 | 3 | 2 | 18 |
| % of clusters enriched with ≥1 GO term** | 2 | 3 | 1 | 5 |
(*) The number of clusters with two or more members that are significantly enriched with at least one Biological Process GO Term (for GLU and SPO) or Molecular Function GO Term (for MUS) (P-value < 0.01). (**) The same count expressed as a percentage of all clusters with two or more members.
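The percentage rows can be reproduced from the other rows of the table. The sketch below assumes that the percentage is the number of enriched clusters divided by the number of non-singleton clusters (total minus singletons), which matches the rounded values reported. The function name `enriched_percentage` is illustrative and not taken from the paper; the inputs are the first block of rows in the table above.

```python
def enriched_percentage(total_clusters, singletons, enriched):
    """Percentage of clusters with two or more members that are GO-enriched.

    Assumption: the denominator is the number of non-singleton clusters,
    i.e. total_clusters - singletons.
    """
    multi_member = total_clusters - singletons
    if multi_member <= 0:
        raise ValueError("no clusters with two or more members")
    return 100.0 * enriched / multi_member


# (total clusters, singletons, enriched clusters) for the first block of rows,
# one tuple per (m, e) column: (0.1, 0.1), (0.1, 0.9), (0.9, 0.1), (0.9, 0.9).
first_block = [(30, 2, 13), (30, 2, 10), (34, 2, 18), (126, 70, 12)]

percentages = [round(enriched_percentage(*column)) for column in first_block]
print(percentages)
```

Under this assumption, the jump in singletons at m = 0.9, e = 0.9 shrinks the denominator far less than it inflates the total cluster count, which is why the percentage there is computed over 56 clusters rather than 126.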
Fig. 3. Variation in the number of clusters. The number of clusters (a, e, i), the number of singleton clusters (b, f, j), the number of clusters with two or more members (c, g, k), and the percentage of clusters with two or more members among the total number of clusters (d, h, l), in GLU, SPO and MUS, respectively, are displayed as a function of m and e. Both the total number of unique clusters and the number of singletons increase as m and e get higher, whereas the percentage of clusters with two or more members among the total number of clusters begins to drop considerably at values higher than 0.6 for both m and e.
GO Term coverage performance of CnG in comparison to preceding clustering algorithms
| | GLU | SPO | MUS |
|---|---|---|---|
| No. of terms identified by CRC in total | 43 | 92 | 1 |
| No. of terms identified by CnG in total | 85 | 135 | 10 |
| % of terms CnG identifies in CRC results | 91% | 82% | 100% |
| % of terms CRC identifies in CnG results | 46% | 56% | 10% |
| No. of terms identified by CRC only | 4 | 17 | 0 |
| No. of terms identified by CnG only | 46 | 60 | 9 |
| No. of terms identified by CnG and CRC | 39 | 75 | 1 |
| No. of terms identified by GIMM in total | 79 | 64 | 7 |
| No. of terms identified by CnG in total | 85 | 135 | 10 |
| % of terms CnG identifies in GIMM results | 78% | 80% | 71% |
| % of terms GIMM identifies in CnG results | 73% | 38% | 50% |
| No. of terms identified by GIMM only | 17 | 13 | 2 |
| No. of terms identified by CnG only | 23 | 84 | 5 |
| No. of terms identified by GIMM and CnG | 62 | 51 | 5 |
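The percentage rows in this table follow from the shared and total term counts. As a minimal sketch, assuming each percentage is the number of shared terms divided by the relevant method's total, the values can be recomputed as below. The helper name `coverage` and the dictionary layout are illustrative only; the numbers are copied from the table.

```python
def coverage(shared, total_other, total_cng):
    """Return two rounded percentages for one dataset.

    first  : share of the other method's terms that CnG also identifies
    second : share of CnG's terms that the other method also identifies
    """
    return (round(100 * shared / total_other),
            round(100 * shared / total_cng))


# dataset -> (terms shared with CnG, other method's total, CnG total)
crc = {"GLU": (39, 43, 85), "SPO": (75, 92, 135), "MUS": (1, 1, 10)}
gimm = {"GLU": (62, 79, 85), "SPO": (51, 64, 135), "MUS": (5, 7, 10)}

crc_cov = {name: coverage(*counts) for name, counts in crc.items()}
gimm_cov = {name: coverage(*counts) for name, counts in gimm.items()}

for name in ("GLU", "SPO", "MUS"):
    print(name, "CRC:", crc_cov[name], "GIMM:", gimm_cov[name])
```

The "only" rows are the complements of the shared counts: for GLU against CRC, 43 - 39 = 4 terms are found by CRC only and 85 - 39 = 46 by CnG only.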