Pedro Carmona-Saez, Monica Chagoyen, Andres Rodriguez, Oswaldo Trelles, Jose M Carazo, Alberto Pascual-Montano.
Abstract
BACKGROUND: Microarray technology is generating huge amounts of data about the expression levels of thousands of genes, or even whole genomes, across different experimental conditions. To extract biological knowledge from such datasets and to understand them fully, it is essential to include external biological information about genes and gene products in the analysis of expression data. However, most current approaches to analyzing microarray datasets focus mainly on the experimental data, and external biological information is incorporated only as a posterior process.
Year: 2006 PMID: 16464256 PMCID: PMC1386712 DOI: 10.1186/1471-2105-7-54
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Rules related to metabolic pathways. Rules extracted from the diauxic shift dataset using KEGG pathways. To facilitate visualization, the consequent elements are represented graphically by colored squares: red represents over-expression, green represents under-expression, and empty squares represent neither over-expression nor under-expression. For example, the first rule would be {Ribosome → [-]T 6, [-]T 7} in the classical representation. Only the support (supp.), confidence (conf.) and permutation-corrected p-values are shown for each rule; the remaining measures are reported in the additional files. The last column contains the number of genes covered by each association.
Figure 2. Rules related to transcriptional regulators. Rules extracted from the diauxic shift dataset using transcriptional regulators that bind to promoter regions.
Figure 3. Rules related to transcriptional regulators and metabolic pathways. Rules extracted from the diauxic shift dataset using transcriptional regulators and KEGG pathways simultaneously.
Figure 4. Rules related to the GO Biological Process category. Rules extracted from the serum stimulation dataset using the Biological Process category of Gene Ontology. Bp in brackets denotes biological process categories.
Figure 5. Rules related to annotations from the three categories of GO. Rules extracted from the serum stimulation dataset using terms from the three categories of GO. Bp: biological process, cc: cellular component, mf: molecular function. The time points at 15 min, 30 min, 1 hr and 2 hr are omitted because no significant associations were found at these time points for the thresholds used to extract association rules.
Transaction databases from gene expression data. (a) Transaction database used to extract association rules among gene attributes and expression patterns. (b) Transaction database used to extract association rules among genes.
| a | |
| gene A | [+]Exp 1, [+]Exp 2, [-]Exp 3, [+]Exp 4, [+]Exp 5, annotation |
| gene B | [+]Exp 1, [+]Exp 2, [+]Exp 4, [+]Exp 5, annotation |
| gene C | [+]Exp 1, [+]Exp 2, [-]Exp 3, [+]Exp 4, [+]Exp 5, [+]Exp 6, annotation |
| gene D | [+]Exp 4, [-]Exp 6, annotation |
| gene E | [+]Exp 1, [+]Exp 2, [-]Exp 3, annotation |
| gene F | [+]Exp 1, [+]Exp 2, [-]Exp 3, [+]Exp 6, annotation |
| ... | ... |
| b | |
| Experiment 1 | [+]gene A, [+]gene B, [+]gene C, [+]gene E, [+]gene F |
| Experiment 2 | [+]gene A, [+]gene B, [+]gene C, [+]gene E, [+]gene F |
| Experiment 3 | [-]gene A, [-]gene C, [-]gene E, [-]gene F |
| Experiment 4 | [+]gene A, [+]gene B, [+]gene C, [+]gene D |
| Experiment 5 | [+]gene A, [+]gene B, [+]gene C |
| Experiment 6 | [+]gene C, [-]gene D, [+]gene F |
| ... | ... |
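The relationship between the two transaction databases, and the support and confidence measures reported for each rule, can be illustrated with a minimal Python sketch. It uses only the six gene rows listed in part (a) and keeps the `annotation` placeholder as a literal item. The helper names (`count`, `support`, `confidence`, `transpose`) are illustrative assumptions, not part of the authors' software, and the definitions used are the standard ones for association rules (fraction of transactions containing all items; joint count divided by antecedent count).

```python
from collections import defaultdict

# Part (a) of the table, restricted to the six rows that are listed.
# "annotation" is the table's own placeholder, kept here as a plain item.
gene_transactions = {
    "gene A": {"[+]Exp 1", "[+]Exp 2", "[-]Exp 3", "[+]Exp 4", "[+]Exp 5", "annotation"},
    "gene B": {"[+]Exp 1", "[+]Exp 2", "[+]Exp 4", "[+]Exp 5", "annotation"},
    "gene C": {"[+]Exp 1", "[+]Exp 2", "[-]Exp 3", "[+]Exp 4", "[+]Exp 5", "[+]Exp 6", "annotation"},
    "gene D": {"[+]Exp 4", "[-]Exp 6", "annotation"},
    "gene E": {"[+]Exp 1", "[+]Exp 2", "[-]Exp 3", "annotation"},
    "gene F": {"[+]Exp 1", "[+]Exp 2", "[-]Exp 3", "[+]Exp 6", "annotation"},
}


def count(transactions, items):
    """Number of transactions that contain every item in `items`."""
    wanted = set(items)
    return sum(1 for t in transactions.values() if wanted <= t)


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    return count(transactions, items) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Share of transactions with the antecedent that also hold the consequent."""
    base = count(transactions, antecedent)
    if base == 0:
        return 0.0
    return count(transactions, set(antecedent) | set(consequent)) / base


def transpose(transactions):
    """Turn gene-wise transactions (a) into experiment-wise transactions (b)."""
    by_experiment = defaultdict(list)
    for gene, items in transactions.items():
        for item in items:
            if not item.startswith("["):
                continue  # skip non-expression items such as "annotation"
            sign, experiment = item[:3], item[3:]      # "[+]", "Exp 1"
            number = experiment.split()[1]             # "1"
            by_experiment["Experiment " + number].append(sign + gene)
    # Sort each list by gene name so the output order is stable.
    return {exp: sorted(genes, key=lambda s: s[3:]) for exp, genes in by_experiment.items()}


rule_support = support(gene_transactions, {"[+]Exp 1", "[-]Exp 3"})
rule_confidence = confidence(gene_transactions, {"[+]Exp 1"}, {"[-]Exp 3"})
experiment_transactions = transpose(gene_transactions)
```

For the illustrative rule {[+]Exp 1 → [-]Exp 3}, four of the six listed genes carry both items and five carry the antecedent, so the sketch gives a support of 4/6 and a confidence of 4/5. Transposing part (a) this way reproduces the rows listed in part (b).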
Figure 6. Example of a dataset containing heterogeneous information and the two rules obtained from it. (a) Example dataset in which genes (named g1, g2, ...) are annotated with different characteristics (second column). The third and subsequent columns represent experimental conditions, where a value of 1 represents over-expression, -1 under-expression, and 0 neither over-expression nor under-expression. (b) The two rules that were selected after applying the filter.
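The encoding described in this caption (annotations plus 1, -1 and 0 values per condition) maps onto transactions in a straightforward way. The sketch below is one possible conversion, assuming that a 0 value contributes no item. The gene rows, annotation labels and condition names are hypothetical placeholders and are not taken from the figure.

```python
def row_to_transaction(annotations, values, conditions):
    """Build one transaction from a gene's annotations and discretized values.

    1 becomes "[+]<condition>", -1 becomes "[-]<condition>", and 0 adds nothing.
    """
    items = set(annotations)
    for condition, value in zip(conditions, values):
        if value == 1:
            items.add(f"[+]{condition}")
        elif value == -1:
            items.add(f"[-]{condition}")
    return items


# Hypothetical example data: annotation labels and condition names are invented.
conditions = ["T1", "T2", "T3"]
dataset = {
    "g1": (["pathway X"], [1, 0, -1]),
    "g2": (["pathway X", "regulator Y"], [1, 1, 0]),
    "g3": ([], [0, 0, 0]),
}

transactions = {
    gene: row_to_transaction(annotations, values, conditions)
    for gene, (annotations, values) in dataset.items()
}
```

Mixing annotation items and expression items in the same transaction is what allows a single rule to relate several kinds of gene characteristics to an expression pattern.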