Viviane Praz, Philipp Bucher.
Abstract
The CleanEx expression database (http://www.cleanex.isb-sib.ch) provides access to public gene expression data via unique gene names as well as via the biomedical characteristics of experiments. To achieve this, a dual annotation of both sequences and experiments has been generated. First, the system links official gene symbols to any kind of sequence used for gene expression measurements (cDNA, Affymetrix, oligonucleotide arrays, SAGE or MPSS tags, Expressed Sequence Tags or other mRNA sequences, etc.). For the biomedical annotation, we re-annotate each experiment in the CleanEx database with MeSH (Medical Subject Headings) terms, which are primarily used by the NLM (National Library of Medicine) for indexing articles for the MEDLINE/PubMed database. This annotation allows fast and easy retrieval of expression data that share common biological or medical features. The numerical data can then be exported as matrix-like tab-delimited text files. Data can be extracted either from a single dataset or from heterogeneous datasets.
Year: 2009 PMID: 19073704 PMCID: PMC2686468 DOI: 10.1093/nar/gkn878
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Mapping process for SAGE and MPSS tags. Sequences belonging to Trome clusters are stored in a temporary database. The SAGE or MPSS tags are then mapped onto these sequences via the “tager” program. The corresponding gene name for the stored sequence identifiers is extracted from Unigene. A quality is assigned to each SAGE or MPSS tag according to the following criteria: if all the targeted sequences belong to one or two Unigene clusters, the quality is set to “High”. If all the targeted sequences belong to more than two but fewer than five clusters, the quality is set to “Medium”. For a larger number of clusters, the quality is “Low”. Otherwise, the quality is considered “Unknown”. The process is carried out for human and mouse.
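The quality rule in the Figure 1 caption can be sketched in a few lines of Python. This is an illustration only, not the CleanEx implementation: the function name `tag_quality` and its input format are hypothetical. The caption does not state the exact cut-off for “Low”, so the sketch assumes five or more clusters. It also assumes that “Otherwise” covers tags with no hit, or with a hit on a sequence that has no Unigene cluster.

```python
def tag_quality(cluster_ids):
    """Classify one SAGE/MPSS tag from the Unigene clusters of its target sequences.

    cluster_ids: one entry per targeted sequence; None means the sequence
    has no Unigene cluster (hypothetical input format, for illustration).
    """
    # Assumption: no target at all, or any target without a cluster -> "Unknown".
    if not cluster_ids or any(c is None for c in cluster_ids):
        return "Unknown"

    n_clusters = len(set(cluster_ids))  # count distinct clusters
    if n_clusters <= 2:
        return "High"    # one or two clusters
    if n_clusters < 5:
        return "Medium"  # more than two but fewer than five
    return "Low"         # assumption: five or more clusters


if __name__ == "__main__":
    print(tag_quality(["Hs.1", "Hs.1", "Hs.2"]))
    print(tag_quality(["Hs.1", "Hs.2", "Hs.3"]))
```

Counting distinct clusters rather than targeted sequences matters here: a tag that hits many transcripts of the same gene is still unambiguous.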
Figure 2. Over-representation of TATA-box occurrences in human genes that show up-regulation in cancer tissues. Promoter sequences for the up- and down-regulated sets of genes have been extracted via the CleanEx two-pools comparison tool, and TATA-box frequency has been analysed via the SSA server with default options.