Aaron Wise, Zoltán N Oltvai, Ziv Bar-Joseph.
Abstract
MOTIVATION: With the vast increase in the number of gene expression datasets deposited in public databases, novel techniques are required to analyze and mine this wealth of data. Similar to the way BLAST enables cross-species comparison of sequence data, tools that enable cross-species expression comparison will allow us to better utilize these datasets: cross-species expression comparison enables us to address questions in evolution and development, and further allows the identification of disease-related genes and pathways that play similar roles in humans and model organisms. Unlike sequence, which is static, expression data changes over time and under different conditions. Thus, a prerequisite for performing cross-species analysis is the ability to match experiments across species.
Year: 2012 PMID: 22689770 PMCID: PMC3371837 DOI: 10.1093/bioinformatics/bts205
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Cross-validation accuracy comparing expression analysis alone to co-training. The co-training method greatly improves between iterations, indicating that the new labeled examples contribute to the performance of the combined classifier. Error bars represent standard deviation.
Fig. 2. Overlap between textual matches and expression matches by iteration. Overlap matches (blue) were defined as microarray pairs that were in the top 5% of both most textually similar and most expression similar pairs. Random (green) pairs are pairs chosen from a random 5% of all expression pairs that were also in a random 5% sample of all textual pairs.
Fig. 3. Overlap between textual matches and expression matches by size of overlap set in iteration 5. Overlap matches (blue) were defined as microarray pairs that were in the top x% of both most textually similar and most expression similar pairs. Random (green) pairs are pairs chosen from a random x% of all expression pairs that were also in a random x% sample of all textual pairs.
Fig. 4. GO term enrichment of mouse genes with high-expression correlation to cycling genes in putative microarray similar pairs. We show a comparison between the co-training approach and expression similarity alone. All cell cycle related GO terms that are enriched in either of the two sets of genes are included in the figure.
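The overlap definition used in the captions of Figs. 2 and 3 can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the dictionary-of-scores input format, and the rounding of the top-x% cutoff are all assumptions made here for illustration. It takes one textual and one expression similarity score per cross-species microarray pair, keeps the pairs ranked in the top fraction of both, and builds the random baseline by intersecting two independent random samples of the same size.

```python
import random


def top_fraction(scores, fraction):
    """Return the set of pairs ranked in the top `fraction` by score.

    `scores` maps a (human_array, mouse_array) pair to a similarity value.
    The cutoff size is rounded to the nearest integer (an assumption).
    """
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def overlap_pairs(text_sim, expr_sim, fraction=0.05):
    """Pairs in the top `fraction` of both textual and expression similarity."""
    return top_fraction(text_sim, fraction) & top_fraction(expr_sim, fraction)


def random_overlap_pairs(pairs, fraction=0.05, seed=0):
    """Baseline: intersect two independent random samples of the same size."""
    rng = random.Random(seed)
    pairs = sorted(pairs)  # fixed order so the seed gives reproducible samples
    k = max(1, round(len(pairs) * fraction))
    return set(rng.sample(pairs, k)) & set(rng.sample(pairs, k))
```

Comparing `len(overlap_pairs(...))` with `len(random_overlap_pairs(...))` for several values of `fraction` gives the kind of contrast between the blue and green series described in the captions, under the assumptions stated above.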
Top biological process GO terms by P-value for mouse genes correlated with human genes that are known to have type 2 diabetes-related mutations
| Rank | Category name | Assigned | P-value | Corrected P-value |
|---|---|---|---|---|
| 1 | Developmental process | 55 | 2.19E-17 | <0.001 |
| 2 | Positive regulation of biological process | 49 | 2.4E-14 | <0.001 |
| 3 | Positive regulation of cellular process | 46 | 5.63E-14 | <0.001 |
| 4 | Anatomical structure development | 37 | 1.0E-13 | <0.001 |
| 5 | Anatomical structure morphogenesis | 28 | 6.6E-13 | <0.001 |
| 12 | Positive regulation of metabolic process | 29 | 1.1E-9 | <0.001 |
| 13 | Regulation of metabolic process | 51 | 1.6E-9 | <0.001 |
| 14 | Regulation of primary metabolic process | 46 | 2.3E-9 | <0.001 |
| 33 | Regulation of biosynthetic process | 36 | 3.3E-7 | <0.001 |
| 96 | Positive regulation of immune system process | 11 | 4.0E-6 | 0.003 |
| 96 | Regulation of immune system process | 13 | 1.7E-5 | 0.012 |
| 101 | Immune system process | 14 | 2.2E-5 | 0.02 |