Kristian Ovaska, Marko Laakso, Sampsa Hautaniemi.
Abstract
BACKGROUND: Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses.
Year: 2008 PMID: 19025591 PMCID: PMC2613876 DOI: 10.1186/1756-0381-1-11
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 2.522
Benchmark results.
| Measure | Number of GO terms | csbl.go | GOSim (vs. csbl.go) | SemSim (vs. csbl.go) |
| Resnik | 50 | 0.002 s | 4.9 s (2399 ×) | 4.5 s (2158 ×) |
| Resnik | 100 | 0.006 s | 19.5 s (3219 ×) | 17.4 s (2880 ×) |
| Resnik | 200 | 0.024 s | 77.8 s (3245 ×) | 71.0 s (2963 ×) |
| Lin | 50 | 0.002 s | 7.4 s (3612 ×) | 4.5 s (2167 ×) |
| Lin | 100 | 0.007 s | 29.5 s (4284 ×) | 17.4 s (2528 ×) |
| Lin | 200 | 0.024 s | 117.8 s (4894 ×) | 71.0 s (2950 ×) |
| Jiang-Conrath | 50 | 0.002 s | 7.4 s (3590 ×) | 4.5 s (2157 ×) |
| Jiang-Conrath | 100 | 0.007 s | 29.5 s (4274 ×) | 17.4 s (2525 ×) |
| Jiang-Conrath | 200 | 0.023 s | 117.5 s (5043 ×) | 70.9 s (3043 ×) |
| Relevance | 50 | 0.002 s | - | 4.5 s (2062 ×) |
| Relevance | 100 | 0.007 s | - | 17.4 s (2400 ×) |
| Relevance | 200 | 0.025 s | - | 71.1 s (2866 ×) |
The speed difference between csbl.go and each other package is shown in parentheses. Timings of csbl.go are shown with an accuracy of 1/1000 s.
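For readers unfamiliar with the four measures benchmarked above, the following is a minimal sketch of their commonly cited textbook definitions, not the csbl.go implementation. It assumes that the information content of a term is IC = -ln p(term), and that the IC of the most informative common ancestor (MICA) of the two terms has already been found. The function names and the example IC values are hypothetical.

```python
import math


def resnik(ic_mica: float) -> float:
    """Resnik similarity: the IC of the most informative common ancestor."""
    return ic_mica


def lin(ic_a: float, ic_b: float, ic_mica: float) -> float:
    """Lin similarity: shared IC relative to the IC of both terms."""
    denom = ic_a + ic_b
    return 2.0 * ic_mica / denom if denom > 0 else 0.0


def jiang_conrath_distance(ic_a: float, ic_b: float, ic_mica: float) -> float:
    """Jiang-Conrath distance; packages differ in how they turn it into a similarity."""
    return ic_a + ic_b - 2.0 * ic_mica


def relevance(ic_a: float, ic_b: float, ic_mica: float) -> float:
    """Relevance similarity: Lin weighted by 1 - p(MICA).

    Assumes IC = -ln p, so p(MICA) = exp(-IC(MICA)).
    """
    return lin(ic_a, ic_b, ic_mica) * (1.0 - math.exp(-ic_mica))


# Hypothetical IC values for two terms and their MICA.
ic_a, ic_b, ic_mica = 2.0, 4.0, 1.5
print(resnik(ic_mica))
print(lin(ic_a, ic_b, ic_mica))
print(jiang_conrath_distance(ic_a, ic_b, ic_mica))
print(relevance(ic_a, ic_b, ic_mica))
```

All four measures reduce to simple arithmetic once the IC values are known, so the cost of a pairwise comparison over many GO terms is likely dominated by how each package looks up ancestors and IC values.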
Figure 1. GO heat map and clustering. GO-based clustering dendrogram of the selected genes (vertical axis) is visualised along with the expression patterns that are used to cluster the samples (horizontal axis). There are nine GO-based clusters named G1, ..., G9 that contain more than one gene. The GO clusters are separated by a horizontal bar in the heat map. Genes without annotations are omitted from the heat map. Overexpressed genes are shown with white or yellow color and underexpressed genes with red color.
Genes corresponding to the most statistically significant clusters found in the case study.
| Cluster | Genes |
| G1 | PDCL3 MAGED1 (two probe sets) PRKCE PRDX1 CLIC4 MRPS23 GABARAPL3 |
| G2 | CBR3 RANBP17 NBEA FVT1 |
| G3 | LRRC47 WARS |
| G4 | NANOGP8 ZNF215 POU5F1 MYBL2 L1TD1 CITED2 TCEA2 SMARCAD1 MKI67IP CPSF4 PPFIBP2 WDSUB1 PPP3CA ISG20L1 TIPARP CEP290 DPPA4 TJP2 NLRP7 |
| G5 | CTH GLO1 |
| G6 | PLAU PPAP2A |
| G7 | TLR5 OR5R1 TMEM106C IFITM1 PCDHB5 PCDHB11 AC069513.28 PLK3 |
| G8 | SLC22A17 FLVCR1 |
| G9 | PDGFA IGSF21 GDF3 CCDC80 GAL TF |
MAGED1 has two distinct differentially expressed probe sets. Genes are ordered from bottom to top in Figure 1. For example, the genes in cluster G1 in Figure 1 are, from bottom to top, PDCL3, two probe sets for MAGED1, PRKCE, etc.
Most informative GO terms for the clusters obtained from microarray data.
| Cluster | Size | p-value | IC | GO term |
| G1 | 8 | 0.0063 | 1.835 | cytoplasm |
| G2 | 4 | 0.20 | 1.835 | cytoplasm |
| | | | 0.888 | binding |
| G3 | 2 | 0.009 | 8.751 | aminoacyl-tRNA ligase activity |
| | | | 5.841 | translation |
| | | | 0.888 | binding |
| G4 | 19 | 0.020 | 0.888 | binding |
| G5 | 2 | 0.0007 | 12.012 | carbon-sulfur lyase activity |
| | | | 1.835 | cytoplasm |
| | | | 1.753 | primary metabolic process |
| G6 | 2 | 0.21 | 4.705 | negative regulation of biological process |
| | | | 3.420 | hydrolase activity |
| | | | 3.411 | cellular protein metabolic process |
| G7 | 8 | 0.020 | 1.366 | membrane |
| G8 | 2 | 0.25 | 4.238 | transporter activity |
| | | | 3.181 | transport |
| | | | 2.866 | integral to membrane |
| G9 | 6 | < 0.0001 | 3.967 | extracellular region |
IC is the information content of the given term. P-values are derived using a permutation test with 10,000 repetitions. At most three GO terms are shown for each cluster. Only common GO terms with IC > 0 are shown.
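The record does not say which test statistic the permutation test uses. The sketch below therefore assumes one plausible choice: the mean pairwise similarity of the genes in a cluster, compared against random gene sets of the same size. The helper names, the `sim` callback, and the toy gene list are all hypothetical.

```python
import random
from itertools import combinations


def mean_pairwise(genes, sim):
    """Average similarity over all unordered gene pairs (needs at least 2 genes)."""
    pairs = list(combinations(genes, 2))
    return sum(sim(a, b) for a, b in pairs) / len(pairs)


def permutation_p_value(cluster, all_genes, sim, repetitions=10000, seed=0):
    """Fraction of random same-size gene sets scoring at least as high as the cluster.

    Assumption: the statistic is the mean pairwise similarity; the source
    only states that a permutation test with 10,000 repetitions is used.
    """
    rng = random.Random(seed)
    observed = mean_pairwise(cluster, sim)
    hits = 0
    for _ in range(repetitions):
        sample = rng.sample(all_genes, len(cluster))
        if mean_pairwise(sample, sim) >= observed:
            hits += 1
    return hits / repetitions


# Toy example: genes are similar (1.0) only if they share a name prefix.
genes = ["a1", "a2", "b1", "b2", "c1", "c2", "d1", "d2", "e1", "e2"]


def toy_sim(x, y):
    return 1.0 if x[0] == y[0] else 0.0


print(permutation_p_value(["a1", "a2"], genes, toy_sim, repetitions=2000))
```

With 10,000 repetitions the smallest non-zero estimate is 0.0001, which fits the "< 0.0001" entry reported for cluster G9.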
KEGG pathways for differentially expressed genes.
| Cluster | Gene | KEGG pathways |
| G1 | GABARAPL3 | Regulation of autophagy |
| G1 | PRKCE | Tight junction, Fc epsilon RI signaling pathway, Type II diabetes mellitus |
| G2 | FVT1 | Sphingolipid metabolism |
| G2 | CBR3 | Arachidonic acid metabolism |
| G3 | WARS | Tryptophan metabolism, Aminoacyl-tRNA biosynthesis |
| G4 | PPP3CA | MAPK signaling pathway, Calcium signaling pathway, Apoptosis, Wnt signaling pathway, Axon guidance, VEGF signaling pathway, Natural killer cell mediated cytotoxicity, T cell receptor signaling pathway, B cell receptor signaling pathway, Long-term potentiation, Amyotrophic lateral sclerosis (ALS) |
| G4 | TJP2 | Tight junction, Vibrio cholerae infection |
| G5 | CTH | Glycine, serine and threonine metabolism, Methionine metabolism, Cysteine metabolism, Selenoamino acid metabolism, Nitrogen metabolism |
| G5 | GLO1 | Pyruvate metabolism |
| G6 | PLAU | Complement and coagulation cascades |
| G6 | PPAP2A | Glycerolipid metabolism, Glycerophospholipid metabolism, Ether lipid metabolism, Sphingolipid metabolism |
| G7 | TLR5 | Toll-like receptor signaling pathway, Pathogenic Escherichia coli infection -EHEC and EPEC |
| G7 | OR5R1 | Olfactory transduction |
| G7 | IFITM1 | B cell receptor signaling pathway |
| G9 | PDGFA | MAPK signaling pathway, Focal adhesion, Gap junction, Regulation of actin cytoskeleton, Glioma, Prostate cancer, Melanoma |
Genes not shown in the table did not have any KEGG pathway annotation.