François Moreews, Gaelle Rauffet, Patrice Dehais, Christophe Klopp.
Abstract
Expression microarrays are commonly used to study transcriptomes. Most arrays are now based on oligonucleotide probes. Because probe design is a tedious task, it often takes place once, at the beginning of the project, and the oligo set is then used for several years. During this period, the knowledge gathered by the community on the genome and the transcriptome increases and becomes more precise. Re-annotating the set is therefore essential to supply biologists with up-to-date annotations. SigReannot-mart is a query environment populated with regularly updated annotations for different oligo sets. It stores the results of the SigReannot pipeline, which has mainly been used on farm and aquaculture species. It permits easy extraction in different formats using filters. It can be used to compare probe sets on different criteria, to choose the set for a given experiment, or to mix probe sets in order to create a new one.
Year: 2011 PMID: 21930501 PMCID: PMC3263592 DOI: 10.1093/database/bar025
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Probe target specificity classes (TSCs) and subclasses
| TSCs | Description |
|---|---|
| 1 | One good hit and no noise |
| 2 | One good hit with noise |
| 3 | No hit, one noise ≥30 bp |
| 4 | No hit, one noise ≥20 and <30 bp |
| 5 | No hit many noises |
| 6 | No good hit no noises |
| 7 | Many good hits |
| 7.1 (subclasses) | Many good hits but one entity |
| MH (subclasses) | Multiple hits on one chromosome |
| MC (subclasses) | Hits on multiple chromosomes |
These quality indicators use a Blastn similarity search against the transcriptome of the studied species. As an illustration, users can decide whether or not to reject probes with many hits (category 7), using the complementary fine-grained subcategory indicators (7.1, MH, MC). Ensembl provides probe set mapping but does not provide TSCs, and it stores only probes with one genomic hit and no more than one mismatch.
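The filtering decision described above can be sketched in a few lines of Python. This is only an illustration: the `Probe` record, the `select_probes` function and the probe names are hypothetical and not part of SigReannot-mart, and the sketch assumes that the subclasses 7.1, MH and MC refine category 7 only.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Sequence


@dataclass(frozen=True)
class Probe:
    """Hypothetical probe record: a name, a TSC (1 to 7) and an optional subclass."""
    name: str
    tsc: int
    subclass: Optional[str] = None  # "7.1", "MH", "MC" or None (assumed to apply to TSC 7)


def select_probes(
    probes: Iterable[Probe],
    accepted_tscs: Sequence[int] = (1, 2),
    accepted_subclasses: Sequence[str] = ("7.1",),
) -> List[Probe]:
    """Keep probes in an accepted TSC, plus TSC 7 probes with an accepted subclass."""
    kept = []
    for probe in probes:
        if probe.tsc in accepted_tscs:
            kept.append(probe)
        elif probe.tsc == 7 and probe.subclass in accepted_subclasses:
            # Many good hits, but the subclass makes the probe acceptable.
            kept.append(probe)
    return kept


# Illustrative data only; these are not real probes.
probes = [
    Probe("probe_a", 1),
    Probe("probe_b", 5),
    Probe("probe_c", 7, "7.1"),
    Probe("probe_d", 7, "MC"),
]
print([p.name for p in select_probes(probes)])
```

Passing `accepted_subclasses=()` rejects every category 7 probe, which corresponds to the stricter of the two choices mentioned in the text.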
Summary of data currently available in the SigReannot-mart database
| Microarray | Species | Manufacturer | Data set: Ensembl 56 | Data set: Ensembl 59 + RefSeq RNA | Data set: Ensembl 61 + RefSeq RNA |
|---|---|---|---|---|---|
| 44 K | Bovine | Agilent | * | * | * |
| 24 K | Bovine | EADGENE | * | | |
| 22 K | Bovine | INRA | * | | |
| 44 K | Chicken | Agilent | * | * | * |
| 20 K | Chicken | EADGENE | * | * | |
| 44 K | Horse | Agilent | * | * | * |
| GPL2881 | Mouse | Agilent | * | | |
| GPL2877 | Rat | Agilent | * | | |
| 44 K | Pig | Agilent | * | * | * |
| 25 K | Pig | EADGENE | * | | |
| 17 K | Pig | INRA | * | | |
| 44 K | Rabbit | Agilent | * | * | * |
| 44 K | Salmon | Agilent | * | | |
| 15 K | Sheep | Agilent | * | * | * |
| 37 K | Trout | Agilent | * | * | |
| GPL884 | Human | Agilent | * | | |
Probe set annotations are updated following each Ensembl update, at least twice a year. The current probe sets are not available in Ensembl.
Asterisks indicate the annotation versions available for each probe set.
External databases referenced from SigReannot-mart
| Data source | Genes | Transcripts | Pathways | GO terms | Gene symbols | Orthologs | URL | Entities description |
|---|---|---|---|---|---|---|---|---|
| Ensembl | * | * | | | | * | | Gene, ncRNA, mRNA, putative RNA and orthologous genes |
| RefSeq | | * | | | | | | Transcript |
| Gene Ontology | | | | * | | | | GO term |
| HGNC | | | | | * | | | Gene symbol |
| KEGG | | | * | | | * | | Enzyme, pathways and ortholog groups |
Asterisks mark, for each data source, the biological entities imported into SigReannot-mart to perform the annotation process.
Figure 1. Annotation pipeline, BioMart integration and SigReannot-mart query interface. The management of the probe annotation processing pipeline and of the BioMart environment is centralized and automated, which allows efficient BioMart configuration for multiple data sets with limited human intervention. The BioMart database is created and populated directly at the end of the annotation pipeline (A); the BioMart configuration is then generated automatically (B) using an XML file created from a generic template (C) and probe set properties (D). The SigReannot-mart data set can be filtered by user queries from a web page (E). Many attributes can be used as filters, such as probe specificity, gene hits, chromosome hit location or orthologs.
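Steps (C) and (D) of the caption, a generic template filled in with the properties of each probe set, can be illustrated with a minimal Python sketch. The `ProbeSetConfig` element, its attributes and the `render_config` function are invented for this example and do not reproduce the real BioMart configuration schema or the authors' code; the property values are illustrative.

```python
from string import Template
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical generic template (C); not the actual BioMart configuration format.
GENERIC_TEMPLATE = Template(
    '<ProbeSetConfig dataset="$dataset" species="$species" '
    'manufacturer="$manufacturer" annotation="$annotation"/>'
)


def render_config(properties):
    """Fill the generic template with the properties of one probe set (D)."""
    # Escape &, <, > and double quotes so each value is safe inside an XML attribute.
    safe = {
        key: escape(str(value), {'"': "&quot;"})
        for key, value in properties.items()
    }
    # substitute() raises KeyError when a required property is missing.
    return GENERIC_TEMPLATE.substitute(safe)


# Illustrative probe set properties; the second entry is entirely made up.
PROBE_SETS = [
    {"dataset": "btaurus_agilent_44k", "species": "bos_taurus",
     "manufacturer": "Agilent", "annotation": "Ensembl 61 + RefSeq RNA"},
    {"dataset": "example_array_20k", "species": "example_species",
     "manufacturer": "Example & Co", "annotation": "Ensembl 56"},
]

for props in PROBE_SETS:
    xml_text = render_config(props)
    ET.fromstring(xml_text)  # fails loudly if the result is not well-formed XML
    print(xml_text)
```

Keeping the template generic and the per-array properties in plain data means that adding a probe set requires only a new properties entry, which matches the limited human intervention that the caption describes.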
Species covered by the data sets currently in SigReannot-mart
| Species | Category |
|---|---|
| Cow | Farm |
| Chicken | Farm |
| Horse | Farm |
| Pig | Farm |
| Rabbit | Farm |
| Sheep | Farm |
| Salmon | Fishery |
| Trout | Fishery |
| Mouse | Model |
| Rat | Model |
| Human | Model |

| Query field | Value |
|---|---|
| Database | Sigenae oligo annotation |
| Dataset | btaurus_agilent_44k (bos_taurus) |
| Filters | Probe: [ID-list specified], Category: 1, 2 |
| Attributes | Probe name, gene name, specificity category |

| Query field | Value |
|---|---|
| Database | Ensembl Gene at ensembl.org |
| Dataset | Gallus gallus genes (WASHUC2) |
| Filters | |
| Attributes | Ensembl Transcript ID |

| Query field | Value |
|---|---|
| Database | Sigenae oligo annotation at SigReannot-mart.toulouse.inra.fr |
| Datasets | ggallus_agilent_44k (sus_scrofa) and ggallus_eadgene_20k (sus_scrofa) |
| Filters | Ensembl Transcript ID ([ID-list specified]) |
| Attributes | Ensembl Transcript ID |
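The first query above (bovine Agilent 44 K data set, a probe ID list, categories 1 and 2) could also be written as a BioMart-style XML query instead of being composed in the web page. The sketch below follows the `Query` / `Dataset` / `Filter` / `Attribute` layout commonly used by BioMart web services. The internal filter and attribute names (`probe_name`, `tsc_category`, `gene_name`) and the probe IDs are assumptions made for illustration; the names actually exposed by SigReannot-mart would have to be read from its configuration. The sketch only builds the XML and sends nothing.

```python
import xml.etree.ElementTree as ET


def build_query_xml(dataset, filters, attributes):
    """Build a BioMart-style XML query for one data set and return it as a string."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        # A list of values is passed as one comma-separated string.
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Hypothetical probe IDs and internal names; only the data set name comes from the table.
xml_query = build_query_xml(
    dataset="btaurus_agilent_44k",
    filters={
        "probe_name": "PROBE_0001,PROBE_0002",  # the "[ID-list specified]" filter
        "tsc_category": "1,2",                   # keep specificity categories 1 and 2
    },
    attributes=["probe_name", "gene_name", "tsc_category"],
)
print(xml_query)
```

A linked query such as the Ensembl and Sigenae pair shown above would add a second `Dataset` element to the same `Query`.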