Ujjwal Maulik1, Saurav Mallik2, Anirban Mukhopadhyay3, Sanghamitra Bandyopadhyay2.
Abstract
Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify a special type of rules and potential biomarkers from biological datasets by integrating statistical and binary inclusion-maximal biclustering techniques. First, a novel statistical strategy is used to eliminate the insignificant, low-significant, or redundant genes in such a way that the significance level satisfies the data distribution property (viz., either normal or non-normal distribution). The data is then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. The corresponding special type of rules are then extracted from the selected itemsets. Our proposed rule mining method performs better than the other rule mining algorithms because it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we classify the data to determine how accurately the evolved rules describe the remaining test (unknown) data. Subsequently, we compare the average classification accuracy and other related factors with those of other rule-based classifiers. Statistical significance tests are also performed to verify the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers also starts with the same post-discretized data matrix. Finally, we include an integrated analysis of gene expression and methylation to determine the epigenetic effect (viz., the effect of methylation) on the gene expression level.
Year: 2015 PMID: 25830807 PMCID: PMC4382191 DOI: 10.1371/journal.pone.0119448
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Flowchart of the proposed methodology (StatBicRM) for rule mining.
Here, the terms TOTALDESET, TOTALDESET, TOTALDESET are described in the last paragraph of the subsection “Identification of differentially expressed/methylated genes using Statistical tests”. For the methylation dataset, these terms are replaced by TOTALDMSET, TOTALDMSET, TOTALDMSET, respectively.
Fig 2. Flowchart of the proposed methodology (StatBicRM) for classification.
Here, the terms TOTALDESET, TOTALDESET, TOTALDESET are described in the last paragraph of the subsection “Identification of differentially expressed/methylated genes using Statistical tests”. For the methylation dataset, these terms are replaced by TOTALDMSET, TOTALDMSET, TOTALDMSET, respectively.
Fig 3. An example of generating special rules from the data matrix of the differentially expressed genes.
Here, up-regulation (i.e., ‘+’) and down-regulation (i.e., ‘-’) are denoted by ‘1’ and ‘0’ in (b), and by red and green colors in (c), respectively. Here, s and s denote experimental/diseased/treated and control/normal samples, respectively.
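The encoding described in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `encode_items` and `rule_fires` and the sample values are hypothetical, and the input is assumed to be already discretized to 1 (up-regulated) and 0 (down-regulated).

```python
def encode_items(genes, calls):
    """Map one sample's 1/0 calls to signed items, e.g. 'FBXO2+' or 'DDB2-'."""
    if len(genes) != len(calls):
        raise ValueError("genes and calls must have the same length")
    return {f"{gene}{'+' if call == 1 else '-'}" for gene, call in zip(genes, calls)}


def rule_fires(antecedent, sample_items):
    """A rule fires when every item of its antecedent is present in the sample."""
    return set(antecedent) <= sample_items


genes = ["FBXO2", "DDB2", "KIF11"]
sample_calls = [1, 0, 0]  # hypothetical discretized sample (1 = up, 0 = down)
sample_items = encode_items(genes, sample_calls)

# A rule of the form {FBXO2+, DDB2- => class = AC}
antecedent, label = {"FBXO2+", "DDB2-"}, "AC"
if rule_fires(antecedent, sample_items):
    print("rule fires, predicted class:", label)
```

Representing each sample as a set of signed items keeps the rule check a plain subset test.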
Fig 4. An example of classification with the evolved rules by majority voting using the weighted sum.
Here, ‘r’ and ‘w’ denote the rank and weight of the rule (computed by Equation 18), respectively. A tick mark/cross mark in the ‘Q’ column indicates that the test point (ts) satisfies/does not satisfy the corresponding rule.
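A weighted-sum vote of this kind might look like the sketch below. It assumes the rule weights are already available (Equation 18 is not reproduced on this page, so the weights here are made-up numbers), and the function name `classify_by_weighted_vote` is hypothetical.

```python
from collections import defaultdict


def classify_by_weighted_vote(test_items, rules):
    """Sum the weights of all rules the test point satisfies, per class label.

    rules: iterable of (antecedent_items, class_label, weight).
    Returns the label with the largest weighted sum, or None if no rule fires.
    """
    totals = defaultdict(float)
    for antecedent, label, weight in rules:
        if set(antecedent) <= test_items:  # the 'Q' column: rule satisfied?
            totals[label] += weight
    if not totals:
        return None
    return max(totals, key=totals.get)


# Hypothetical weights; the rule forms follow the rule tables of Dataset 1.
rules = [
    ({"KIF11-", "TTK-"}, "AC", 0.5),
    ({"KIF11-", "BUB1B-"}, "AC", 0.2),
    ({"NCAPH+", "AURKB+", "KIF15+"}, "SCC", 0.6),
]
test_point = {"KIF11-", "TTK-", "BUB1B-", "NCAPH+", "AURKB+", "KIF15+"}
print(classify_by_weighted_vote(test_point, rules))
```

Here both AC rules fire with a combined weight of 0.7, which outweighs the single SCC rule at 0.6. Ties are broken by whichever label was seen first, which is an arbitrary choice in this sketch.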
Information on the real datasets (DS) used.
| Dataset | Description |
|---|---|
| DS1 | Expression dataset (NCBI ref. id: GSE10245) of lung cancer subtypes [ |
| DS2 | Expression dataset (NCBI ref. id: GSE31699) of Uterine Leiomyoma [ |
| DS3 | Methylation dataset (NCBI ref. id: GSE31699) of Uterine Leiomyoma with 18 UL samples and 18 MM samples. |
Number of differentially expressed genes identified by different statistical tests for Dataset 1, where #G , #G denote up- and down-regulated genes, respectively. Here, Pearson’s correlation test cannot be used because the number of experimental samples is not equal to the number of control samples.
| | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| # | 616 | 586 | 376 | 344 | 115 | 136 | 188 | 176 | 93 |
| # | 619 | 642 | 481 | 403 | 325 | 387 | 615 | 582 | 320 |
Number of differentially expressed genes by different statistical tests for Dataset 2, where #G , #G denote up and down-regulated genes, respectively.
| | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| # | 391 | 391 | 329 | 62 | 54 | 86 | 86 | 86 | 86 | 86 |
| # | 576 | 576 | 491 | 97 | 82 | 70 | 70 | 70 | 70 | 70 |
Number of differentially methylated genes by different statistical tests for Dataset 3, where #G and #G refer to hyper and hypo-methylated genes, respectively.
| | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| # | 652 | 652 | 507 | 87 | 74 | 185 | 185 | 181 | 186 | 181 |
| # | 676 | 676 | 600 | 129 | 118 | 165 | 165 | 163 | 166 | 163 |
Fig 5. The clustergram of the common differentially expressed genes (by different statistical tests) for DS1.
Here, red colour denotes up-regulation of genes across the specific samples/conditions, and green colour denotes down-regulation of genes across the specific samples/conditions.
Fig 6. Volcano plot for identifying differentially up- and down-regulated genes from Dataset 1 by SAM.
Number of differentially expressed genes by different statistical tests for the artificial Dataset 4, where #G , #G denote up-regulated and down-regulated genes, respectively.
| | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| # | 90 | 89 | 87 | 56 | 51 | 570 | 573 | 594 | 575 | 532 |
| # | 29 | 29 | 28 | 23 | 23 | 342 | 342 | 365 | 373 | 336 |
Fig 7. A graphical representation of the gene expression of a maximal homogeneous bicluster (i.e., a MFCHOI) over different samples.
Comparative performance analysis of the rule-based classifiers on Dataset 1 (4-fold cross-validation repeated 10 times); bold font denotes the highest value in each column.
| | | | | |
|---|---|---|---|---|
| | 88.25 (6.46) | 78.33 (15.81) | 85.18 (4.00) | 0.67 (0.086) |
| | 94.75 (2.19) | 77.22 (12.13) | 89.31 (2.27) | 0.75 (0.057) |
| | 94.25 (1.21) | 79.45 (6.95) | 89.66 (1.41) | 0.75 (0.037) |
| | 92.5 (2.04) | 78.33 (8.05) | 88.11 (1.51) | 0.72 (0.039) |
| | 92 (2.58) | 83.89 (4.86) | 89.48 (3.19) | 0.76 (0.074) |
| | 91.75 (5.90) | 79.44 (17.18) | 87.93 (2.82) | 0.73 (0.072) |
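The figure captions and the table footnote on this page refer to specificity, accuracy and MCC (Matthews correlation coefficient). As a reminder of how such measures are computed from a confusion matrix, here is a small sketch using the standard definitions. The counts are hypothetical, the function name is an assumption, and returning 0.0 when a denominator is zero is one common convention rather than something stated in the source.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Standard confusion-matrix measures; the ratios are fractions, not percentages."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total if total else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for one cross-validation fold.
m = classification_metrics(tp=9, fn=1, tn=7, fp=3)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Multiplying the first three values by 100 gives percentages; MCC stays on its usual scale from -1 to 1.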
Comparative performance analysis of the rule-based classifiers on Dataset 2 (4-fold cross-validation repeated 10 times); bold font denotes the highest value in each column.
| | | | | |
|---|---|---|---|---|
| 53.75 (3.23) |
| | 71.88 (8.46) | 63.75 (8.23) | 67.81 (3.91) | 0.36 (0.081) |
| | 76.88 (3.02) | 62.5 (5.10) | 69.69 (1.51) | 0.40 (0.027) |
| | 70.63 (3.02) | 62.5 (5.10) | 66.56 (3.92) | 0.33 (0.080) |
| | 73.13 (3.02) | 58.75 (10.70) | 65.93 (4.53) | 0.33 (0.087) |
| | 66.25 (6.04) | | 65.31 (2.74) | 0.31 (0.057) |
| | 76.88 (3.02) | 58.75 (3.23) | 67.81 (1.51) | 0.37 (0.031) |
Comparative performance analysis of the rule-based classifiers on Dataset 3 (4-fold cross-validation repeated 10 times); bold font denotes the highest value in each column.
| | | | | |
|---|---|---|---|---|
| 86.67 (2.87) |
| | 70.56 (2.68) | 90.56 (6.95) | 80.56 (2.27) | 0.63 (0.062) |
| | 84.44 (3.51) | 82.78 (4.86) | 83.61 (0.88) | 0.68 (0.019) |
| | 75.56 (2.87) | 92.78 (2.68) | 84.16 (1.34) | 0.69 (0.027) |
| | 76.67 (7.31) | 88.33 (4.86) | 82.50 (1.33) | 0.66 (0.014) |
| | 76.11 (13.11) | | 85.28 (6.55) | 0.72 (0.113) |
| | 83.33 (4.54) | 80.00 (9.51) | 81.67 (2.68) | 0.64 (0.048) |
* The standard deviation of this specificity is zero. On investigation, we identified a particular data point belonging to the normal class in Dataset 3 for which the “PART” classifier, as well as the other classifiers including the proposed one, always produces a false positive result.
Comparative performance analysis of the rule-based classifiers on Dataset 4 (4-fold cross-validation repeated 10 times); bold font denotes the highest value in each column.
| | | | | |
|---|---|---|---|---|
| 83.62 (0.37) |
| | 83.47 (2.32) | 82.12 (0.98) | 82.56 (1.22) | 0.62 (0.04) |
| | 84.06 (3.82) | 81.32 (0.78) | 82.78 (0.87) | 0.63 (0.02) |
| | 79.12 (2.89) | 83.37 (0.97) | 81.55 (1.37) | 0.59 (0.04) |
| | 79.63 (3.48) | 81.75 (1.48) | 81.10 (1.78) | 0.57 (0.06) |
| | 80.95 (2.96) | | 81.86 (1.27) | 0.60 (0.04) |
| | 84.07 (2.57) | 82.56 (1.05) | 83.17 (1.42) | 0.63 (0.05) |
Fig 8. Bar charts: (a) comparison of dataset-wise average accuracies, and (b) comparison of dataset-wise average MCCs, among our proposed and other existing rule-based classifiers for the four datasets.
Fig 9. Boxplots of significance tests (i.e., one-way ANOVA) for identifying the significance levels (i.e., p-values) of accuracies between the proposed and other rule-based classifiers (pairwise) for Dataset 1 [in (a).(i-vi)], Dataset 2 [in (b).(i-vi)], Dataset 3 [in (c).(i-vi)] and Dataset 4 [in (d).(i-vi)]; where (i) proposed vs ConjunctiveRule, (ii) proposed vs DecisionTable, (iii) proposed vs JRip, (iv) proposed vs OneR, (v) proposed vs PART and (vi) proposed vs Ridor (here, the vertical axis denotes the accuracy of the classifier).
p-values of one-way ANOVA between the average accuracies of the proposed and other classifiers (pairwise) in DS1, DS2, DS3 and DS4, where ‘S’ and ‘NS’ denote significant (p-value ≤ 0.05) and non-significant (p-value > 0.05) results, respectively.
| | DS1 | DS2 | DS3 | DS4 |
|---|---|---|---|---|
| | 2.41e-06 (S) | 8.68e-06 (S) | 4.53e-07 (S) | 0.0139 (S) |
| | 3.40e-06 (S) | 2.88e-08 (S) | 8.93e-06 (S) | 0.0106 (S) |
| | 1.78e-07 (S) | 1.36e-06 (S) | 8.16e-05 (S) | 0.0002 (S) |
| | 6.53e-09 (S) | 3.22e-06 (S) | 1.68e-06 (S) | 0.0003 (S) |
| | 0.0001 (S) | 2.89e-09 (S) | 0.1491 (NS) | 0.0005 (S) |
| | 1.57e-06 (S) | 4.81e-10 (S) | 9.83e-06 (S) | 0.2497 (NS) |
Top 10 frequent genes in the evolved rules of the two class labels for DS1, DS2 and DS3, respectively.
Rule and Rule denote the set of evolved rules for the experimental class label and the set of evolved rules for the control class label, respectively.
Fig 10. Two examples of how significant biomarkers are identified from the maximal homogeneous biclusters (i.e., MFCHOI) for each class label of each dataset.
Here, we show the intersection of only four maximal homogeneous biclusters for (a) the class label AC and (b) the class label SCC, individually (for Dataset 1). For the class AC, CENPA-, TTK-, KIF11-, KIF18B- and ZNF367- are the top frequent genes because they occur in all four biclusters (see (a)); similarly, for the class SCC, SHROOM3- is the top frequent gene because it occurs in all four biclusters (see (b)).
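The frequency analysis in this caption amounts to counting how many biclusters each signed gene appears in. The sketch below illustrates this under stated assumptions: the function names are hypothetical, and so is the membership of the four biclusters. Only the five shared AC items come from the caption; the extra items are invented for illustration.

```python
from collections import Counter


def item_frequencies(biclusters):
    """Count, for each signed gene, the number of biclusters that contain it."""
    return Counter(item for bicluster in biclusters for item in set(bicluster))


def items_in_all(biclusters):
    """Items present in every bicluster (the intersection)."""
    return set.intersection(*(set(b) for b in biclusters))


shared = {"CENPA-", "TTK-", "KIF11-", "KIF18B-", "ZNF367-"}
# Hypothetical biclusters for class AC: the shared items plus made-up extras.
biclusters = [
    shared | {"BUB1B-"},
    shared | {"SMC2-", "KIF2C-"},
    shared | {"BUB1B-", "TIMELESS-"},
    shared | {"CENPN-"},
]

freq = item_frequencies(biclusters)
print(freq.most_common(6))
print(sorted(items_in_all(biclusters)))
```

Ranking by `most_common` gives a frequency-ordered candidate list, while the intersection picks out the items found in every bicluster.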
KEGG pathway, GO:BP, GO:CC and GO:MF analysis of the corresponding genes of the evolved rules from the three datasets.
Here, a ‘satisfiable rule’ (SRule) for a KEGG pathway (i.e., Path)/GO:BP/GO:CC/GO:MF term means that all the genes of the rule (i.e., its antecedent) occur together in that pathway/GO term.
| Dataset | Category | | | | | |
|---|---|---|---|---|---|---|
| DS1 | KEGG pathway | 0.0386 | 7 | MGRN1, FBXO2, KLHL13, DDB2, RHOBTB2, MID1, UBE2S | 1 | rule id 5233 |
| | GO:BP | 8.69E-10 | 34 | PRC1, BLM, TTK, PKMYT1, CEP55, AURKB, RHOU, GTSE1, SPC24, KIF2C, CDCA8, NCAPH, NCAPG, CENPA, SKA1, ZWILCH, TXNL4B, CDK1 etc. | 21 | rule id 327, 2231, 2232, 2914, 7360 etc. |
| | | 1.02E-11 | 30 | PRC1, BLM, TTK, PKMYT1, SPC24, KIF15, BIRC5, CENPE, NDC80, SMC2, CDK2, MAD2L1, TIMELESS, PLK1, BUB1B, SETD8 etc. | 19 | rule id 327, 2231, 2232, 2914, 7360 etc. |
| | | 6.04E-12 | 32 | PRC1, BLM, TTK, PKMYT1, CEP55, AURKB, RHOU, GTSE1, BIRC5, CENPE, NDC80, SMC2, CDK2, MAD2L1, TIMELESS, PLK1, BUB1B, RAD54B etc. | 18 | rule id 327, 2231, 2232, 2914, 7360 etc. |
| | | 4.86E-04 | 14 | KIF11, PRC1, KIF15, KIF18B, TTK, NDC80, CENPE, MID1, MARK1, GTSE1, KIF2C, CENPA, BUB1B, KIF13B | 17 | rule id 327, 2232, 7360 etc. |
| | GO:CC | 5.43E-05 | 70 | MTSS1, FOSL2, PRC1, CEP78, TTK, AURKB, SENP5, RHOU, GTSE1, SLC1A4, KIF2C, CDCA8, FRMD6, PBXIP1, FANCI, SNTB1, KIF13B, CDK1, MYO6, KIF11 etc. | 85 | rule id 151, 253, 298, 327, 415, 888, 1261, 1462, 1970, 2232 etc. |
| | | 5.43E-05 | 70 | MTSS1, FOSL2, PRC1, CEP78, TTK, AURKB, SENP5, RHOU, GTSE1, SLC1A4, KIF2C, CDCA8, FRMD6 etc. | 85 | rule id 151, 253, 298, 327, 415, 888, 1261, 1462, 1970, 2232 etc. |
| | | 0.0128 | 52 | DLC1, IL27RA, TSPAN4, RHOU, SLC1A4, FRMD6, CD44, LTB4R, SNTB1, CEACAM6, SLC22A3, RAB27A, ARHGEF4, ICAM1, PLD1, MYO6, LIFR etc. | 36 | rule id 212, 625, 1876, 6051 etc. |
| | GO:MF | 0.0015 | 56 | ACOX2, CTPS, PKMYT1, TTK, AURKB, RHOU, KIF2C, MCM8, LTB4R, ACAD8, RAB27B, ACAD9, RAB27A, KIF13B, NMNAT3, CDK1, MYO6, KIF11, LIMK2, KIF15, MCM4, MBD1, MCM5, CDK2 etc. | 51 | rule id 254, 327, 339, 344, 494, 639, 643 etc. |
| | | 4.95E-04 | 45 | ACOX2, FGFR2, BLM, CTPS, TTK, PKMYT1, AURKB, ADA, KIF2C, IGF1R, MCM8, STK32A, ACAD8, ACAD9, KIF13B, MYO5C, NMNAT3, CDK1, MYO6, KIF11, MKI67, LIMK2, KIF15, ATP11B etc. | 39 | rule id 254, 327, 339, 344, 494, 639, 643 etc. |
| | | 5.72E-04 | 45 | ACOX2, FGFR2, BLM, CTPS, TTK, PKMYT1, AURKB, ADA, KIF2C, IGF1R, MCM8, STK32A, ACAD8, ACAD9, KIF13B, MYO5C, NMNAT3, CDK1, MYO6, KIF11, MKI67, LIMK2, IPPK, UBE2S, ABCC5 etc. | 39 | rule id 254, 327, 339, 344, 494, 639, 643 etc. |
| DS2 | KEGG pathway | 9.79E-04 | 5 | GSTA4, FMO2, AOX1, GSTO2, MGST1 | 1 | rule id 12 |
| | GO:BP | 0.0341 | 9 | CEBPA, TNFSF4, BAX, SERPINE1, LIFR, IGF2, JAG1, CD24, PRL | 4 | rule id 78, 95, 145, 390 |
| | | 0.0039 | 5 | CEBPA, TNFSF4, SMARCB1, PSRC1, IGF2 | 3 | rule id 52, 78, 225 |
| | GO:CC | 0.0015 | 20 | TNFSF4, EGFL6, MMP9, APOC1, LIFR, GGH, IGF2, JAG1, MMP2, CHRDL1, PRRG1, PTGDS, C1QTNF4, SERPINE1, PECAM1, SERPINA3, C1QL1, GDF15, GFOD1, PRL | 6 | rule id 50, 81, 82, 87, 95, 246 |
| DS3 | KEGG pathway | 1.69E-04 | 12 | EGFR, IFNA21, CCR1, TNFSF12, IFNA1, IL23A, IL20RA, CCL3L1, INHBE, TNFRSF18, TNFSF12-TNFSF13, IFNGR2, IFNA17 | 1 | rule id 177 |
| | GO:BP | 1.88E-09 | 29 | IFNA21, S100A8, CCR1, BNIP3, HTN3, CD74, CFHR1, APOA4, REG3A, IFNA1, IL23A, SAA2, CCL3L1, SAA1, REG3G, CFHR5, IL1RL1, DEFB103A, SCUBE1, RNASE6 etc. | 12 | rule id 7, 144, 613, 617, 653, 654, 784, 822, 1067, 1182, 2293, 2342 |
| | | 8.71E-04 | 20 | FYB, IL1RL1, SLA2, CCR1, IGJ, CD300E, BNIP3, TNFSF12, C4BPA, CD74, CLEC4M, APOA4, CFHR1, CYBA, IL23A, CCL3L1, LYST, DEFA1, TNFSF12-TNFSF13, TREM1, CFHR5 | 12 | rule id 7, 349, 350, 351, 387, 654, 784, 1182, 1674, 2293, 2342, 2361 |
| | | 0.0048 | 8 | CYBA, CALD1, MYH3, SLMAP, MYH4, ACTN2, SCN5A, CASQ2 | 4 | rule id 47, 138, 333, 2296 |
| | | 0.0118 | 7 | CALD1, MYH3, SLMAP, MYH4, ACTN2, SCN5A, CASQ2 | 4 | rule id 47, 138, 333, 2296 |
| | GO:CC | 0.0090 | 67 | TEX101, STEAP4, NEURL, LHCGR, F2RL1, FCRL2, TNFSF12, KCNIP4, CALB2, FCRL3, APOB, SLMAP, ERAS, CALCRL, IFNGR2, EGFR, BSG, SLA2, SCUBE1, ACTN2, CACNG3, OR1D2, FLNA, TRPM2 etc. | 91 | rule id 126, 144, 155, 272, 321, 338, 339, 351, 385, 416 etc. |
| | | 0.0112 | 43 | PKHD1, CCR1, LHCGR, F2RL1, TRHR, PANX3, CLDN11, TNFSF12, CD74, CALB2, SORBS3, SLMAP, TEK, ERAS, CALCRL, IFNGR2, SCN5A, EGFR, TRPM2, KCNK3, CLEC4M etc. | 43 | rule id 27, 28, 126, 144, 301, 339, 351, 385, 513, 514 etc. |
| | | 4.55E-09 | 59 | IFNA21, LHCGR, MMP27, TNFSF12, HTN3, APOA4, CFHR1, CFHR2, REG3A, APOB, OLFML3, SAA2, SERPINE2, SAA1, CCL3L1, CREG1, ANGPT1, REG3G, CFHR5, EGFR, NODAL, DEFA1 etc. | 40 | rule id 151, 180, 191, 346, 349, 350, 351, 486, 515, 517 etc. |
| | GO:MF | 0.0029 | 16 | EGFR, S100A16, SCUBE1, TRHR, LHCGR, NFS1, BNIP3, DSCAML1, ACTN2, FLNA, APOA4, CYBA, APOB, BOK, TFAP2E, CRYBB2 | 3 | rule id 1040, 1176, 2358 |
| | | 0.0112 | 6 | IL1RL1, IL20RA, CCR1, TNFRSF18, IFNGR2, CD74 | 2 | rule id 177, 2342 |
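The ‘satisfiable rule’ condition from the caption of this table reduces to a subset test between the genes of a rule's antecedent and the gene set of a pathway or GO term. A minimal sketch, assuming signed items such as `FBXO2+`; the function name `is_satisfiable` is hypothetical. The gene list is the first row of the table above, and the first rule is {FBXO2+, DDB2- ⇒ class = AC} from the Dataset 1 rule table.

```python
def is_satisfiable(antecedent, term_genes):
    """True if every gene in the rule antecedent belongs to the term's gene set.

    The trailing '+'/'-' regulation sign is stripped before comparison.
    """
    genes = {item.rstrip("+-") for item in antecedent}
    return genes <= set(term_genes)


# Gene list of the first KEGG row in the table above.
pathway_genes = ["MGRN1", "FBXO2", "KLHL13", "DDB2", "RHOBTB2", "MID1", "UBE2S"]

print(is_satisfiable({"FBXO2+", "DDB2-"}, pathway_genes))   # both genes are listed
print(is_satisfiable({"FBXO2+", "KIF11-"}, pathway_genes))  # KIF11 is not listed
```

Counting the rules for which this check succeeds would give a per-term number of satisfiable rules.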
Some top important rules w.r.t. their existing KEGG pathways/GO:BPs/GO:CCs/GO:MFs in Dataset 1.
| Rule | Count | KEGG pathways/GO terms |
|---|---|---|
| KEGG pathway | | |
| {FBXO2+, DDB2- ⇒ class = AC} | 1 | hsa04120:Ubiquitin mediated proteolysis |
| GO:BP | | |
| {KIF11-, BUB1B- ⇒ class = AC } | 14 | GO:0000279 M phase, GO:0000280 nuclear division, GO:0007067 mitosis, GO:0022403 cell cycle phase, GO:0000087 M phase of mitotic cell cycle, GO:0007049 cell cycle, GO:0000278 mitotic cell cycle, GO:0048285 organelle fission, GO:0051301 cell division, GO:0022402 cell cycle process, GO:0007010 cytoskeleton organization, GO:0007017 microtubule-based process, GO:0000226 microtubule cytoskeleton organization, GO:0007051 spindle organization |
| {KIF11-, TTK- ⇒ class = AC } | 10 | GO:0000279 M phase, GO:0022403 cell cycle phase, GO:0007049 cell cycle, GO:0000278 mitotic cell cycle, GO:0022402 cell cycle process, GO:0007010 cytoskeleton organization, GO:0007017 microtubule-based process, GO:0007052 mitotic spindle organization, GO:0000226 microtubule cytoskeleton organization, GO:0007051 spindle organization |
| {KIF11-, TIMELESS- ⇒ class = AC } | 10 | GO:0000279 M phase, GO:0000280 nuclear division, GO:0007067 mitosis, GO:0022403 cell cycle phase, GO:0000087 M phase of mitotic cell cycle, GO:0007049 cell cycle, GO:0000278 mitotic cell cycle, GO:0048285 organelle fission, GO:0051301 cell division, GO:0022402 cell cycle process |
| {NCAPH+, AURKB+, KIF15+ ⇒ class = SCC } | 9 | GO:0000279 M phase, GO:0000280 nuclear division, GO:0007067 mitosis, GO:0022403 cell cycle phase, GO:0000087 M phase of mitotic cell cycle, GO:0007049 cell cycle, GO:0000278 mitotic cell cycle, GO:0048285 organelle fission, GO:0022402 cell cycle process |
| GO:CC | | |
| {CENPN-, ZWILCH- ⇒ class = AC} | 9 | GO:0000793 condensed chromosome, GO:0000779 condensed chromosome and centromeric region, GO:0000775 chromosome and centromeric region, GO:0000777 condensed chromosome kinetochore, GO:0000776 kinetochore, GO:0044427 chromosomal part, GO:0005694 chromosome, GO:0043228 non-membrane-bounded organelle, GO:0043232 intracellular non-membrane-bounded organelle |
| {CENPN-, CENPA- ⇒ class = AC} | 9 | GO:0000793 condensed chromosome, GO:0000779 condensed chromosome and centromeric region, GO:0000775 chromosome and centromeric region, GO:0000777 condensed chromosome kinetochore, GO:0000776 kinetochore, GO:0044427 chromosomal part, GO:0005694 chromosome, GO:0043228 non-membrane-bounded organelle, GO:0043232 intracellular non-membrane-bounded organelle |
| {CENPN-, CENPM- ⇒ class = AC} | 9 | GO:0000793 condensed chromosome, GO:0000779 condensed chromosome and centromeric region, GO:0000775 chromosome and centromeric region, GO:0000777 condensed chromosome kinetochore, GO:0000776 kinetochore, GO:0044427 chromosomal part, GO:0005694 chromosome, GO:0043228 non-membrane-bounded organelle, GO:0043232 intracellular non-membrane-bounded organelle |
| GO:MF | | |
| {SMC2-, TTK- ⇒ class = AC} | 9 | GO:0001883 purine nucleoside binding, GO:0001882 nucleoside binding, GO:0030554 adenyl nucleotide binding, GO:0000166 nucleotide binding, GO:0017076 purine nucleotide binding, GO:0005524 ATP binding, GO:0032559 adenyl ribonucleotide binding, GO:0032555 purine ribonucleotide binding, GO:0032553 ribonucleotide binding |
| {TTK-, KIF2C- ⇒ class = AC} | 9 | GO:0001883 purine nucleoside binding, GO:0001882 nucleoside binding, GO:0030554 adenyl nucleotide binding, GO:0000166 nucleotide binding, GO:0017076 purine nucleotide binding, GO:0005524 ATP binding, GO:0032559 adenyl ribonucleotide binding, GO:0032555 purine ribonucleotide binding, GO:0032553 ribonucleotide binding |
| {KIF2C-, IGF1R- ⇒ class = AC} | 9 | GO:0001883 purine nucleoside binding, GO:0001882 nucleoside binding, GO:0030554 adenyl nucleotide binding, GO:0000166 nucleotide binding, GO:0017076 purine nucleotide binding, GO:0005524 ATP binding, GO:0032559 adenyl ribonucleotide binding, GO:0032555 purine ribonucleotide binding, GO:0032553 ribonucleotide binding |
| {SMC2-, TTK-, KIF2C- ⇒ class = AC} | 9 | GO:0001883 purine nucleoside binding, GO:0001882 nucleoside binding, GO:0030554 adenyl nucleotide binding, GO:0000166 nucleotide binding, GO:0017076 purine nucleotide binding, GO:0005524 ATP binding, GO:0032559 adenyl ribonucleotide binding, GO:0032555 purine ribonucleotide binding, GO:0032553 ribonucleotide binding |
| {TTK-, SMC2-, CTPS- ⇒ class = AC} | 9 | GO:0001883 purine nucleoside binding, GO:0001882 nucleoside binding, GO:0030554 adenyl nucleotide binding, GO:0000166 nucleotide binding, GO:0017076 purine nucleotide binding, GO:0005524 ATP binding, GO:0032559 adenyl ribonucleotide binding, GO:0032555 purine ribonucleotide binding, GO:0032553 ribonucleotide binding |
Some top important rules w.r.t. their existing KEGG pathways/GO:BPs/GO:CCs in Dataset 2.
Here, no such significant rule was obtained w.r.t. GO:MFs for this dataset.
| Rule | Count | KEGG pathways/GO terms |
|---|---|---|
| KEGG pathway | | |
| {AOX1+, GSTA4- ⇒ class = normal} | 1 | hsa00982:Drug metabolism |
| GO:BP | | |
| {AOX1+, GSTA4- ⇒ class = normal} | 2 | GO:0032583 regulation of gene-specific transcription, GO:0042127 regulation of cell proliferation |
| {IGF2+, PRL+ ⇒ class = tumor} | 1 | GO:0042127 regulation of cell proliferation |
| {IGF2+, PRL+ ⇒ class = tumor} | 1 | GO:0042127 regulation of cell proliferation |
| {IGF2+, PRL+ ⇒ class = tumor} | 1 | GO:0032583 regulation of gene-specific transcription |
| {IGF2+, PRL+ ⇒ class = tumor} | 1 | GO:0042127 regulation of cell proliferation |
| {IGF2+, PRL+ ⇒ class = tumor} | 1 | GO:0032583 regulation of gene-specific transcription |
| GO:CC | | |
| {IGF2+, PTGDS- ⇒ class = tumor} | 3 | GO:0005576 extracellular region, GO:0031090 organelle membrane, GO:0005783 endoplasmic reticulum |
| {IGF2+, EGFL6+ ⇒ class = tumor} | 2 | GO:0005576 extracellular region, GO:0005615 extracellular space |
| {PRRG1+, SERPINE1+ ⇒ class = normal} | 1 | GO:0005576 extracellular region |
| {CHRDL1+, JAG1- ⇒ class = tumor} | 1 | GO:0005576 extracellular region |
| {IGF2+, PRL+ ⇒ class = tumor} | 1 | GO:0005576 extracellular region |
| {SERPINE1+, GFOD1+ ⇒ class = normal} | 1 | GO:0005576 extracellular region |
| {JAG1-, PECAM1- ⇒ class = tumor} | 1 | GO:0005576 extracellular region |
Top important rules w.r.t. their existing KEGG pathways/GO:BPs/GO:CCs/GO:MFs in Dataset 3.
| Rule | Count | KEGG pathways/GO terms |
|---|---|---|
| KEGG pathway | | |
| {IL20RA+, CCR1+ ⇒ class = tumor} | 1 | hsa04060:Cytokine-cytokine receptor interaction |
| GO:BP | | |
| {CYBA+, C4BPA- ⇒ class = tumor} | 5 | GO:0006952 defense response, GO:0006954 inflammatory response, GO:0009611 response to wounding, GO:0006955 immune response, GO:0045087 innate immune response |
| {LYST+, BNIP3+ ⇒ class = tumor} | 4 | GO:0006952 defense response, GO:0009615 response to virus, GO:0006955 immune response, GO:0002252 immune effector process |
| {CFHR5-, REG3A- ⇒ class = tumor} | 4 | GO:0006952 defense response, GO:0006954 inflammatory response, GO:0009611 response to wounding, GO:0002526 acute inflammatory response |
| {CCR1+, CFHR5- ⇒ class = tumor} | 4 | GO:0006952 defense response, GO:0006954 inflammatory response, GO:0009611 response to wounding, GO:0006955 immune response |
| GO:CC | | |
| {MST1R+, TNFSF12/TNFSF13+ ⇒ class = tumor} | 4 | GO:0031226 intrinsic to plasma membrane, GO:0005886 plasma membrane, GO:0005887 integral to plasma membrane, GO:0044459 plasma membrane part |
| {MST1R+, CCR1+, TNFSF12/TNFSF13+ ⇒ class = tumor} | 4 | GO:0031226 intrinsic to plasma membrane, GO:0005886 plasma membrane, GO:0005887 integral to plasma membrane, GO:0044459 plasma membrane part |
| {LHCGR+, SLMAP+ ⇒ class = tumor} | 4 | GO:0031226 intrinsic to plasma membrane, GO:0005886 plasma membrane, GO:0005887 integral to plasma membrane, GO:0044459 plasma membrane part |
| {CYBA+, MST1R+ ⇒ class = tumor} | 4 | GO:0031226 intrinsic to plasma membrane, GO:0005886 plasma membrane, GO:0005887 integral to plasma membrane, GO:0044459 plasma membrane part |
| {TRPM2-, SMPD2- ⇒ class = tumor} | 4 | GO:0031226 intrinsic to plasma membrane, GO:0005886 plasma membrane, GO:0005887 integral to plasma membrane, GO:0044459 plasma membrane part |
| {SCN4B+, TRPM2- ⇒ class = tumor} | 4 | GO:0031226 intrinsic to plasma membrane, GO:0005886 plasma membrane, GO:0005887 integral to plasma membrane, GO:0044459 plasma membrane part |
| {MST1R+, CALCRL+ ⇒ class = tumor} | 4 | GO:0031226 intrinsic to plasma membrane, GO:0005886 plasma membrane, GO:0005887 integral to plasma membrane, GO:0044459 plasma membrane part |
| {S100A16+, MTNR1A-, NODAL- ⇒ class = normal} | 4 | GO:0031226 intrinsic to plasma membrane, GO:0005886 plasma membrane, GO:0005887 integral to plasma membrane, GO:0044459 plasma membrane part |
| {TRPM2-, SMPD2-, UGT1A10-⇒ class = tumor} | 4 | GO:0031226 intrinsic to plasma membrane, GO:0005886 plasma membrane, GO:0005887 integral to plasma membrane, GO:0044459 plasma membrane part |
| {SLMAP+, CCR1+ ⇒ class = tumor} | 4 | GO:0031226 intrinsic to plasma membrane, GO:0005886 plasma membrane, GO:0005887 integral to plasma membrane, GO:0044459 plasma membrane part |
| GO:MF | | |
| {BSG+, CLEC4M- ⇒ class = tumor} | 2 | GO:0005529 sugar binding, GO:0030246 carbohydrate binding |
Fig 11. Comparison of the number of significant itemsets between StatBicRM and other existing ARM methods at different minimum supports for the two artificial datasets (viz., ArDS5 and ArDS6).
“Significant itemset” refers to MFCHOI for StatBicRM, and FI for the other methods.
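To see why counting closed or maximal itemsets gives far smaller numbers than counting all frequent itemsets (FI), consider the brute-force sketch below on a toy transaction set. It illustrates only the generic notions of frequent, closed and maximal itemsets. It does not implement the homogeneity criterion of MFCHOI or the biclustering procedure of StatBicRM, and all names and data are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Enumerate every itemset whose support count is at least min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_support:
                result[itemset] = count
    return result


def closed_itemsets(frequent):
    """Frequent itemsets with no proper superset of equal support."""
    return {s: c for s, c in frequent.items()
            if not any(s < t and c == ct for t, ct in frequent.items())}


def maximal_itemsets(frequent):
    """Frequent itemsets with no frequent proper superset."""
    return {s: c for s, c in frequent.items()
            if not any(s < t for t in frequent)}


# Toy data: each transaction is the set of signed items of one sample.
transactions = [
    {"A+", "B-", "C+"},
    {"A+", "B-", "C+"},
    {"A+", "B-"},
    {"A+"},
]
fi = frequent_itemsets(transactions, min_support=2)
print(len(fi), len(closed_itemsets(fi)), len(maximal_itemsets(fi)))
```

On this toy input there are seven frequent itemsets, three closed ones and a single maximal one. The exhaustive enumeration is exponential in the number of items, so it is suitable only for small illustrations.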