Donny Soh, Difeng Dong, Yike Guo, Limsoon Wong.
Abstract
BACKGROUND: Contemporary methods of microarray analysis are excellent tools for studying individual microarray datasets, but they tend to produce different results from different datasets of the same disease. We aim to solve this reproducibility problem by introducing a technique called SNet. SNet provides both quantitative and descriptive analysis of microarray datasets by identifying specific connected portions of pathways that are significant. We term these portions of pathways "subnetworks".
Year: 2011 PMID: 22372958 PMCID: PMC3278831 DOI: 10.1186/1471-2105-12-S13-S15
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Example of the two gene-gene relationships. Left: an activating relationship between ATM and CHK1. Right: an inhibiting relationship between MDM2 and p53.
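One way to hold the two relationship types from Figure 1 in code is as typed, directed edges. The sketch below is illustrative only: `Relation`, `Edge`, and `targets_of` are hypothetical names rather than part of SNet, and the edge direction (regulator first) is an assumption.

```python
from enum import Enum
from typing import NamedTuple


class Relation(Enum):
    ACTIVATES = "activates"
    INHIBITS = "inhibits"


class Edge(NamedTuple):
    source: str  # assumed to be the regulating gene
    target: str  # assumed to be the regulated gene
    relation: Relation


# The two relationships named in Figure 1.
EDGES = [
    Edge("ATM", "CHK1", Relation.ACTIVATES),
    Edge("MDM2", "p53", Relation.INHIBITS),
]


def targets_of(gene: str, edges: list[Edge]) -> dict[str, Relation]:
    """Map each gene directly regulated by `gene` to the relation type."""
    return {e.target: e.relation for e in edges if e.source == gene}
```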
Percentage overlap of significant subnetworks between the datasets
| Disease | Dataset 1 | Dataset 2 | SNet | GSEA |
|---|---|---|---|---|
| Leukaemia | Golub | Armstrong | 83.33% | 0% |
| ALL subtype | Ross | Yeoh | 47.63% | 23.1% |
| DMD | Haslett | Pescatori | 58.33% | 55.6% |
| Lung | Bhattacharjee | Garber | 90.90% | 0% |
Table showing the percentage overlap of significant subnetworks between the datasets. Each row refers to a separate disease (indicated in the first column). Each disease is tested on the two datasets listed in the second and third columns. The overlap percentages refer to the pathway overlaps obtained from running SNet (column 4) and GSEA (column 5).
Number of overlapping significant subnetworks between the datasets
| Disease | Dataset 1 | Dataset 2 | SNet | GSEA |
|---|---|---|---|---|
| Leukaemia | Golub | Armstrong | 20 | 0 |
| ALL subtype | Ross | Yeoh | 10 | 6 |
| DMD | Haslett | Pescatori | 7 | 10 |
| Lung | Bhattacharjee | Garber | 9 | 0 |
Table showing the number of significant subnetworks that overlap between the datasets. Each row refers to a separate disease (indicated in the first column). Each disease is tested on the two datasets listed in the second and third columns. The overlap counts refer to the pathway overlaps obtained from running SNet (column 4) and GSEA (column 5).
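The counts and percentages in the two tables above can be thought of as set operations on the significant subnetworks found in each dataset. The following is a minimal sketch under two assumptions that the record does not confirm: subnetworks are matched across datasets by identical identifier, and the percentage is taken relative to the first dataset. The function name and the identifiers are hypothetical.

```python
def subnetwork_overlap(first: set[str], second: set[str]) -> tuple[int, float]:
    """Return (count, percentage) of subnetworks shared by two datasets.

    Assumption: the percentage uses the first dataset as denominator;
    the source does not state which denominator was used.
    """
    shared = first & second
    if not first:
        return 0, 0.0
    return len(shared), 100.0 * len(shared) / len(first)


# Hypothetical subnetwork identifiers, for illustration only.
dataset_1 = {"B Cell_VAV1", "Focal Adhesion_ACTB", "IL6 Signaling_IL8", "Purine metabolism_NP"}
dataset_2 = {"B Cell_VAV1", "Focal Adhesion_ACTB", "IL6 Signaling_IL8"}

count, percent = subnetwork_overlap(dataset_1, dataset_2)
print(count, f"{percent:.2f}%")
```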
Number and percentage of overlapping genes
| Disease | | SNet | GSEA |
|---|---|---|---|
| Leukaemia | Num Genes | 84 | |
| ALL subtype | Num Genes | 75 | |
| DMD | Num Genes | 45 | |
| Lung | Num Genes | 65 | |
Table showing the number and percentage of significant overlapping genes. γ refers to the number of genes compared against and is the number of unique genes within all the significant subnetworks of the disease datasets. The gene overlap refers to the percentage gene overlap between the two datasets of a disease for SNet (column 3) and GSEA (column 4).
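To make the role of γ concrete, here is a small sketch of how it and the gene overlap could be computed from two sets of significant subnetworks. It assumes γ is the number of unique genes across the significant subnetworks of both datasets; the caption leaves this open. The function name and the gene memberships are hypothetical.

```python
def gene_overlap(
    subnets_1: dict[str, set[str]],
    subnets_2: dict[str, set[str]],
) -> tuple[int, float]:
    """Return (gamma, percentage of gamma genes found in both datasets)."""
    genes_1 = set().union(*subnets_1.values())
    genes_2 = set().union(*subnets_2.values())
    # Assumption: gamma counts unique genes over both datasets combined.
    gamma = len(genes_1 | genes_2)
    if gamma == 0:
        return 0, 0.0
    return gamma, 100.0 * len(genes_1 & genes_2) / gamma


# Hypothetical subnetwork memberships, for illustration only.
subnets_a = {"sn_1": {"VAV1", "RAC1", "PLCG2"}, "sn_2": {"RAC1", "CSK"}}
subnets_b = {"sn_1": {"VAV1", "RAC1", "BLNK"}}

gamma, percent = gene_overlap(subnets_a, subnets_b)
print(gamma, f"{percent:.1f}%")
```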
Number and percentage of significant overlapping genes with t-test
| Disease | | SNet | t-test (P ≤ 0.05) | t-test (top γ) |
|---|---|---|---|---|
| Leukaemia | Num Genes | | 1239 | 84 |
| ALL subtype | Num Genes | | 1072 | 75 |
| DMD | Num Genes | | 1319 | 45 |
| Lung | Num Genes | | 2091 | 65 |
Table showing the number and percentage of significant overlapping genes. γ refers to the number of genes compared against and is the number of unique genes within all the significant subnetworks of the disease datasets. The gene overlap refers to the percentage gene overlap between the two datasets of a disease for SNet (column 3) and t-test (column 4: for genes at P ≤ 0.05; and column 5: for top γ significant genes).
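The "top γ significant genes" selection can be sketched as ranking genes by a two-sample t statistic and keeping the first γ. The code below is an illustration under assumptions: it uses Welch's t statistic and ranks by |t| as a stand-in for ranking by p-value, since the record does not say which t-test variant was used. The function names and expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a: list[float], b: list[float]) -> float:
    """Welch's t statistic for two independent samples.

    Raises ZeroDivisionError if both samples have zero variance.
    """
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def top_genes(
    expr_a: dict[str, list[float]],
    expr_b: dict[str, list[float]],
    gamma: int,
) -> set[str]:
    """Return the `gamma` genes with the largest |t| between two classes."""
    scores = {g: abs(welch_t(expr_a[g], expr_b[g])) for g in expr_a}
    return set(sorted(scores, key=scores.get, reverse=True)[:gamma])


# Hypothetical expression values per gene for two phenotype classes.
class_a = {"G1": [5.0, 5.2, 4.9], "G2": [1.0, 1.1, 0.9], "G3": [3.0, 3.5, 2.5]}
class_b = {"G1": [1.0, 1.3, 0.8], "G2": [1.0, 1.2, 0.8], "G3": [3.1, 2.9, 3.3]}

print(top_genes(class_a, class_b, gamma=2))
```

Running `top_genes` on each of two datasets and intersecting the results gives the kind of gene overlap the table reports for the top γ column.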
Number and percentage of significant overlapping genes with SAM
| Disease | | SNet | SAM (P ≤ 0.05) | SAM (top γ) |
|---|---|---|---|---|
| Leukaemia | Num Genes | | 1305 | 84 |
| ALL subtype | Num Genes | | 464 | 75 |
| DMD | Num Genes | | 126 | 45 |
| Lung | Num Genes | | 966 | 65 |
Table showing the number and percentage of significant overlapping genes. γ refers to the number of genes compared against and is the number of unique genes within all the significant subnetworks of the disease datasets. The gene overlap refers to the percentage gene overlap between the two datasets of a disease for SNet (column 3) and SAM (column 4: for genes at P ≤ 0.05; and column 5: for top γ significant genes).
Size of largest subnetworks from t-test
| Disease | Genes used | t-test: size 2 | t-test: size 3 | t-test: size 4 | t-test: size 5 | SNet: size 5 | SNet: size 6 | SNet: size 7 | SNet: size ≥ 8 |
|---|---|---|---|---|---|---|---|---|---|
| Leukaemia | 84 | 8 | 1 | 0 | 0 | 2 | 3 | 2 | 1 |
| ALL subtype | 75 | 5 | 1 | 1 | 1 | 1 | 0 | 1 | 6 |
| DMD | 45 | 3 | 1 | 0 | 0 | 1 | 0 | 0 | 5 |
| Lung | 65 | 3 | 2 | 1 | 0 | 5 | 3 | 0 | 1 |
Table comparing the sizes of the subnetworks obtained from the t-test and from SNet. The first column shows the disease being considered and the second column shows the number of genes used to create the subnetworks. The next four columns give the number of subnetworks of each size (in genes) for the t-test, and the last four columns give the same for SNet. For instance, in the leukaemia dataset the t-test yields 8 subnetworks of 2 genes and 1 subnetwork of 3 genes, whereas SNet yields 2 subnetworks of 5 genes, 3 subnetworks of 6 genes, 2 subnetworks of 7 genes and 1 subnetwork of ≥ 8 genes.
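The size columns amount to a histogram of subnetwork sizes with the largest sizes pooled into one bin. A minimal sketch follows; `size_histogram` is a hypothetical helper, and the size list is made up so that it reproduces the SNet leukaemia example from the caption (the actual size of the subnetwork in the ≥ 8 bin is not given in the source).

```python
from collections import Counter


def size_histogram(sizes: list[int], cap: int) -> dict[int, int]:
    """Count subnetworks per size, pooling all sizes >= cap into the cap bin."""
    counts = Counter(min(size, cap) for size in sizes)
    return dict(sorted(counts.items()))


# Hypothetical sizes: two of 5 genes, three of 6, two of 7, one larger than 7.
snet_sizes = [5, 5, 6, 6, 6, 7, 7, 11]

print(size_histogram(snet_sizes, cap=8))  # key 8 stands for ">= 8"
```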
Percentage of genes from subnetworks for the leukaemia dataset which are also considered significant by t-test
| Subnetwork name | Percentage |
|---|---|
| leukaemia_B Cell_VAV1 | 81.82% |
| leukaemia_Purine metabolism_NP | 83.33% |
| leukaemia_Phosphatidylinositol signaling_PLCG2 | 100.00% |
| leukaemia_Regulation of actin cytoskeleton_RAC1 | 57.14% |
| leukaemia_Proteasome Degradation_UBC | 100.00% |
| leukaemia_Regulation of Actin Cytoskeleton_RAC1 | 57.14% |
| leukaemia_B Cell_NFKB1 | 80.00% |
| leukaemia_Regulation of actin cytoskeleton_CSK | 75.00% |
| leukaemia_B Cell Receptor Signaling_POU2F2 | 75.00% |
| leukaemia_IL6 Signaling_IL8 | 75.00% |
| leukaemia_Focal Adhesion_ACTB | 100.00% |
Table depicting the percentage of genes from subnetworks which are also significant for the t-test. The first column depicts the name of the subnetwork considered. The second column depicts the percentage of genes from that subnetwork which are also deemed significant for the t-test. (leukaemia datasets [26,27])
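Each percentage in this kind of table is the share of a subnetwork's genes that also appear in the t-test's significant gene list. The sketch below shows the arithmetic on a toy subnetwork; the function name and the gene labels are hypothetical, and the 9-of-11 split is chosen only to illustrate the calculation, not taken from the data.

```python
def percent_significant(subnetwork_genes: set[str], significant_genes: set[str]) -> float:
    """Percentage of a subnetwork's genes that are also in `significant_genes`."""
    if not subnetwork_genes:
        raise ValueError("subnetwork has no genes")
    shared = subnetwork_genes & significant_genes
    return 100.0 * len(shared) / len(subnetwork_genes)


# Toy example: a subnetwork of 11 hypothetical genes, 9 of them t-test significant.
subnetwork = {f"g{i}" for i in range(1, 12)}
t_test_hits = {f"g{i}" for i in range(1, 10)} | {"other_gene"}

print(f"{percent_significant(subnetwork, t_test_hits):.2f}%")
```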
Percentage of genes from subnetworks for the ALL subtype which are also considered significant by t-test
| Subnetwork name | Percentage |
|---|---|
| MLLBCR_Fatty acid metabolism_ACAA1 | 28.57% |
| MLLBCR_Valine, leucine and isoleucine degradation_HSD17B10 | 40.00% |
| MLLBCR_B Cell_BLNK | 72.73% |
| MLLBCR_Valine, leucine and isoleucine degradation_HSD17B10 | 33.33% |
| MLLBCR_B cell receptor signaling pathway_BLNK | 72.73% |
| MLLBCR_Acute myeloid leukaemia_FLT3 | 44.44% |
| BCR_Chronic myeloid leukaemia_ABL1 | 75.00% |
| BCR_Fc Epsilon RI Signaling_PIK3C2B | 70.00% |
| BCR_T Cell Receptor Signaling Pathway_RASA1 | 44.44% |
Table depicting the percentage of genes from subnetworks which are also significant for the t-test. The first column depicts the name of the subnetwork considered. The second column depicts the percentage of genes from that subnetwork which are also deemed significant for the t-test. (ALL Subtype datasets [28,29])
Percentage of genes from subnetworks for the DMD dataset which are also considered significant by t-test
| Subnetwork name | Percentage |
|---|---|
| DMD_Tight junction_RHOA | 87.50% |
| DMD_Integrin Signaling_TTN | 75.00% |
| DMD_ECM-receptor interaction_SDC3 | 88.89% |
| DMD_Tight junction_RHOA | 85.71% |
| DMD_Leukocyte transendothelial migration_ACTB | 83.33% |
| DMD_Actin Cytoskeleton Signaling_MYL9 | 78.57% |
| DMD_Calcium signaling pathway_CALM1 | 80.00% |
Table depicting the percentage of genes from subnetworks which are also significant for the t-test. The first column depicts the name of the subnetwork considered. The second column depicts the percentage of genes from that subnetwork which are also deemed significant for the t-test. (DMD datasets [16,17])
Percentage of genes from subnetworks for the lung dataset which are also considered significant by t-test
| Subnetwork name | Percentage |
|---|---|
| SNet_Notch signaling pathway_NOTCH3 | 100.00% |
| SNet_ECM-receptor interaction_SDC1 | 69.23% |
| SNet_Adherens junction_CTNNB1 | 100.00% |
| SNet_Tyrosine metabolism_ADH1B | 100.00% |
| SNet_Phenylalanine metabolism_ALDH3B1 | 100.00% |
| SNet_Tryptophan metabolism_WBSCR22 | 80.00% |
| SNet_Natural killer cell mediated cytotoxicity_TNFSF10 | 60.00% |
| SNet_Insulin Receptor Signaling_AKT3 | 100.00% |
| SNet_Glycogen Metabolism_PYGM | 60.00% |
Table depicting the percentage of genes from subnetworks which are also significant for the t-test. The first column depicts the name of the subnetwork considered. The second column depicts the percentage of genes from that subnetwork which are also deemed significant for the t-test. (Lung datasets [14,15])
Figure 2. A sample subnetwork from the leukaemia dataset [26,27].
Figure 3. A sample subnetwork from the DMD dataset [16,17].
Figure 4. Sample subnetwork formation. An example of how we form subnetworks from a sample pathway with its genes.
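Since subnetworks are described as connected portions of pathways, one plausible reading of this step is: restrict a pathway to a set of selected genes and take the connected components of what remains. The sketch below follows that reading and is not a statement of SNet's actual procedure. It assumes the gene selection has already been made, treats edges as undirected, and uses hypothetical names (`connected_subnetworks`, `min_size`) and toy gene labels.

```python
from collections import defaultdict, deque


def connected_subnetworks(
    edges: list[tuple[str, str]],
    selected: set[str],
    min_size: int = 2,
) -> list[set[str]]:
    """Connected components of the pathway graph induced by `selected` genes."""
    adjacency: dict[str, set[str]] = defaultdict(set)
    for a, b in edges:
        # Keep an edge only if both endpoints were selected.
        if a in selected and b in selected:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen: set[str] = set()
    components: list[set[str]] = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Breadth-first search from each unvisited gene.
        component: set[str] = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node] - seen:
                seen.add(neighbour)
                queue.append(neighbour)
        if len(component) >= min_size:
            components.append(component)
    return components


# Toy pathway: a chain A-B-C-D-E plus a separate edge F-G.
pathway = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("F", "G")]
chosen = {"A", "B", "D", "E", "F"}  # C and G were not selected

print(connected_subnetworks(pathway, chosen))
```

Dropping C splits the chain into two subnetworks, and F is discarded because its only neighbour was not selected.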
Figure 5. Sample null distribution of subnetworks according to size and score.
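A null distribution of subnetwork scores is commonly built by permuting the class labels and rescoring. The sketch below shows that generic permutation approach for a single subnetwork; it is not taken from the paper. The scoring function is a placeholder (absolute difference in mean expression between classes), and all names, parameters, and expression values are hypothetical.

```python
import random
from statistics import mean


def subnetwork_score(samples: list[dict[str, float]], labels: list[int], genes: set[str]) -> float:
    """Placeholder score: |difference in mean expression| over `genes`."""
    def class_mean(cls: int) -> float:
        return mean(s[g] for s, lab in zip(samples, labels) if lab == cls for g in genes)
    return abs(class_mean(1) - class_mean(0))


def permutation_p_value(
    samples: list[dict[str, float]],
    labels: list[int],
    genes: set[str],
    n_perm: int = 1000,
    seed: int = 0,
) -> float:
    """Empirical p-value of the observed score against label permutations."""
    rng = random.Random(seed)
    observed = subnetwork_score(samples, labels, genes)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # keeps class sizes, breaks the sample-label link
        if subnetwork_score(samples, shuffled, genes) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: four samples per class, two genes in the subnetwork.
samples = [
    {"X": 1.0, "Y": 1.2}, {"X": 0.9, "Y": 1.1}, {"X": 1.1, "Y": 0.8}, {"X": 1.0, "Y": 1.0},
    {"X": 3.0, "Y": 3.2}, {"X": 2.9, "Y": 3.1}, {"X": 3.1, "Y": 2.8}, {"X": 3.0, "Y": 3.0},
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

p_value = permutation_p_value(samples, labels, {"X", "Y"})
print(p_value)
```

To obtain a distribution by size, as the figure title suggests, the same procedure could be repeated for subnetworks grouped by their number of genes.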