José Caldas, Nils Gehlenborg, Ali Faisal, Alvis Brazma, Samuel Kaski.
Abstract
MOTIVATION: As ArrayExpress and other repositories of genome-wide experiments are reaching a mature size, it is becoming more meaningful to search for related experiments, given a particular study. We introduce methods that allow for the search to be based upon measurement data, instead of the more customary annotation data. The goal is to retrieve experiments in which the same biological processes are activated. This can be due either to experiments targeting the same biological question, or to as yet unknown relationships.
Year: 2009 PMID: 19477980 PMCID: PMC2687969 DOI: 10.1093/bioinformatics/btp215
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Visualization of the topic model. A subset of 13 topics, 211 gene sets and 105 experiments is shown. For details and a discussion see the text.
Fig. 2. The experiment collection visualized as glyphs on a plane. Topic colors in all glyphs match topic colors in Figure 1. (A) NeRV projection of the 105 experiments, each shown as a glyph. (B) The slices of each glyph show the distribution of topics in the experiment. The experiment labels are from left to right: asthma, Barrett's esophagus and high-stage neuroblastoma. (C) Enlarged region from (A) where glyphs have additionally been scaled according to their relevance to the query with the ‘malignant melanoma’ experiment shown in the center. A detailed description of this experiment is included in Section 3.
Top five gene sets for the 13 most probable topics
| 2 | 5 | 11 |
| Cell cycle (BIOCARTA) | Purine metabolism (KEGG) | G protein signaling |
| Cell cycle (KEGG) | Pyrimidine metabolism (KEGG) | Biopeptides pathway |
| G1 to S cell cycle (REACTOME) | Purine metabolism (GENMAPP) | NFAT pathway |
| DNA replication (REACTOME) | Pyrimidine metabolism (GENMAPP) | CREB pathway |
| G2 pathway | DNA replication (REACTOME) | GPCR pathway |
| 15 | 18 | 19 |
| Gluconeogenesis | Apoptosis (GENMAPP 1) | Valine leucine and isoleucine degradation |
| Glycolysis | Apoptosis (KEGG) | Propanoate metabolism (KEGG) |
| Glycolysis and gluconeogenesis (KEGG) | Apoptosis (GENMAPP 2) | Fatty acid metabolism |
| Glycolysis and gluconeogenesis (GENMAPP) | Apoptosis (GENMAPP 3) | Propanoate metabolism (GENMAPP) |
| Fructose and mannose metabolism | Death pathway | Valine leucine and isoleucine degradation |
| 24 | 26 | 27 |
| IL2RB pathway | mTOR pathway | Hematopoietic cell lineage |
| PDGF pathway | Sphingolipid metabolism | Complement and coagulation cascades |
| EGF pathway | eIF4 pathway | Inflammation pathway |
| Gleevec pathway | RAS pathway | NKT pathway |
| IGF-1 pathway | IGF-1 mTOR pathway | Dendritic cell pathway |
| 32 | 35 | 44 |
| Epithelial cell signaling in | Integrin pathway | mRNA processing (REACTOME) |
| Cholera infection (KEGG) | Met pathway | RNA transcription (REACTOME) |
| Photosynthesis | ERK pathway | Translation factors |
| ATP synthesis | AT1R pathway | Folate biosynthesis |
| Flagellar assembly | ECM pathway | Basal transcription factors |
| 50 | | |
| Oxidative phosphorylation (KEGG) | | |
| Oxidative phosphorylation (GENMAPP) | | |
| Glycolysis and gluconeogenesis | | |
| IL-7 pathway | | |
| Gamma hexachlorocyclohexane degradation | | |
An acronym for the source of the gene set was included either to distinguish between gene sets with similar names, or when the gene set's name already includes a mention of that source [KEGG (Kanehisa and Goto, 2000), GENMAPP (Salomonis et al., 2007), BIOCARTA (http://www.biocarta.com) or REACTOME (Vastrik et al., 2007)].
Fig. 3. (A) Average precision for cancer queries for the top 10 results. Queries are sorted by the average precision given by the topic model. Error bars represent the 99% confidence interval of the random permutation results. (B) Interpolated average precision at 11 standard recall levels (given as percentages). The solid line corresponds to our method; the dashed line corresponds to the baseline.
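The two retrieval measures named in the Fig. 3 caption can be sketched as follows. This is a minimal illustration using the textbook information-retrieval definitions, not the authors' evaluation code. The function names, the boolean relevance list and the choice to normalize average precision by the number of relevant hits inside the top k are all assumptions; the record above does not state which normalization the paper used.

```python
from typing import List, Sequence


def average_precision_at_k(relevant: Sequence[bool], k: int = 10) -> float:
    """Average precision over the top-k ranked results.

    relevant[i] is True when the result at rank i + 1 is relevant.
    Assumption: the sum is divided by the number of relevant results
    found within the top k (other conventions exist).
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevant[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def interpolated_precision_11pt(
    relevant: Sequence[bool], total_relevant: int
) -> List[float]:
    """Interpolated precision at recall levels 0%, 10%, ..., 100%.

    At each level, the interpolated value is the highest precision
    observed at any rank whose recall is at least that level.
    """
    observed = []  # (hits so far, precision) after each rank
    hits = 0
    for rank, is_relevant in enumerate(relevant, start=1):
        if is_relevant:
            hits += 1
        observed.append((hits, hits / rank))

    curve = []
    for level in range(11):  # level / 10 is the recall threshold
        # Integer comparison avoids floating-point issues:
        # hits / total_relevant >= level / 10
        candidates = [
            p for h, p in observed if h * 10 >= level * total_relevant
        ]
        curve.append(max(candidates, default=0.0))
    return curve


if __name__ == "__main__":
    # Hypothetical ranking: relevant results at ranks 1 and 3.
    ranking = [True, False, True, False, False]
    print(average_precision_at_k(ranking, k=10))
    print(interpolated_precision_11pt(ranking, total_relevant=2))
```

Averaging the 11 interpolated values over all queries would give one curve per method, which is the kind of comparison panel (B) describes.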
Fig. 4. NeRV projection of the 105 experiments, portraying the outcome of querying the model with a melanoma experiment. Both glyph size and color saturation encode the relevance of each experiment to the query: the bigger the glyph and the more saturated the red, the higher the relevance. The query itself is represented by the biggest glyph.