Rama Balakrishnan, Julie Park, Kalpana Karra, Benjamin C Hitz, Gail Binkley, Eurie L Hong, Julie Sullivan, Gos Micklem, J Michael Cherry.
Abstract
The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) provides high-quality curated genomic, genetic, and molecular information on the genes and their products of the budding yeast Saccharomyces cerevisiae. To accommodate the increasingly complex, diverse needs of researchers for searching and comparing data, SGD has implemented InterMine (http://www.InterMine.org), an open source data warehouse system with a sophisticated querying interface, to create YeastMine (http://yeastmine.yeastgenome.org). YeastMine is a multifaceted search and retrieval environment that provides access to diverse data types. Searches can be initiated with a list of genes, a list of Gene Ontology terms, or lists of many other data types. The results from queries can be combined for further analysis and saved or downloaded in customizable file formats. Queries themselves can be customized by modifying predefined templates or by creating a new template to access a combination of specific data types. YeastMine offers multiple scenarios in which it can be used such as a powerful search interface, a discovery tool, a curation aid and also a complex database presentation format. DATABASE URL: http://yeastmine.yeastgenome.org.
Year: 2012 PMID: 22434830 PMCID: PMC3308152 DOI: 10.1093/database/bar062
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Data types integrated into YeastMine, the source project of the annotations, and the means by which these data can be accessed: detailed web pages at SGD or downloadable files from http://downloads.yeastgenome.org
| Data type | Source | Web page at SGD | Downloadable files |
|---|---|---|---|
| Basic gene information (description of gene function, gene names) | SGD | Locus Summary | curation/SGD_features.tab, curation/saccharomyces_cerevisiae.gff |
| Chromosomal coordinates, sequence for chromosomal features | SGD | Locus Summary, GBrowse, Gene/Seq resources, PatMatch, BLAST | curation/SGD_features.tab, curation/saccharomyces_cerevisiae.gff |
| Gene ontology annotations | SGD | Locus Summary, GO term | curation/gene_association.sgd |
| Mutant phenotype | SGD | Locus Summary | curation/phenotype_data.tab |
| Interactions | BioGRID | Locus Summary | curation/interaction_data.tab |
| Protein properties | SGD | Locus Protein | curation/protein_properties.tab |
| Biochemical pathways | SGD | Locus Summary, YeastCyc | curation/biochemical_pathways.tab |
| Literature | SGD | Locus Literature Guide, Curated Paper, Textpresso full-text search | curation/gene_literature.tab |
| Gene expression | SPELL | Locus Expression, SPELL | published_datasets/ |
| Homologs | TreeFam | Not currently available in SGD | genomics/homology/ |
Figure 1. Example of a template search of expression data: screenshot of the ‘SpellDataSet→SpellScore→Genes’ template showing the SpellExpression Score constrained to be either ≥3 or ≤−3, and the SpellDatasetCondition name constrained to be ‘=*sorbitol*’. Switching ON the other parameters such as ‘SpellDataSet author’ or ‘SpellDataset pubmedID’ will allow constraint of those values. The ‘Show Results’ button runs the query. This template is prepopulated with certain constraints, but clicking on the ‘Edit Query’ button will bring up the Model Browser, which offers more options for query constraints and output formats.
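The two constraints in this template can be pictured as a simple filter. The sketch below is illustrative only: it does not call YeastMine, the records and gene names are hypothetical, and it assumes that the score constraint means "at least the cut-off in either direction" and that the wildcard match is case-insensitive.

```python
from fnmatch import fnmatchcase

# Hypothetical records: (gene, condition name, expression score).
# These are invented for illustration, not YeastMine data.
ROWS = [
    ("GENE_A", "1M sorbitol 15 min", 3.4),
    ("GENE_B", "1M sorbitol 15 min", -3.1),
    ("GENE_C", "1M sorbitol 15 min", 0.8),
    ("GENE_D", "heat shock 20 min", 4.2),
]


def matches_template(condition, score, pattern="*sorbitol*", cutoff=3.0):
    """Return True when the condition name matches the wildcard pattern
    and the score is at least `cutoff` in either direction."""
    # Lower-casing both sides is an assumption about how matching behaves.
    name_ok = fnmatchcase(condition.lower(), pattern.lower())
    score_ok = score >= cutoff or score <= -cutoff
    return name_ok and score_ok


def select_genes(rows, pattern="*sorbitol*", cutoff=3.0):
    """Collect the distinct genes whose records pass both constraints."""
    return sorted(
        {gene for gene, condition, score in rows
         if matches_template(condition, score, pattern, cutoff)}
    )


list_1 = select_genes(ROWS)
print(list_1)
```

Raising `cutoff` narrows the gene list, which mirrors choosing a stricter expression score threshold in the template form.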
Figure 2. An example of editing a template using the Query Builder. The Model Browser (on the left) displays the attributes for the GOAnnotation object in the Gene→GO Terms template. Clicking on the ‘CONSTRAIN→’ button next to the namespace box allows one to constrain on the ontology namespace. The Query Overview (on the right) shows the ontology namespace being restricted to the value ‘Biological Process’.
Step-by-step description of an intricate query using YeastMine to retrieve a list of protein complexes where one or more of the constituent members shows a response to osmotic stress
| Aim | Template search | Query builder edits | List operations |
|---|---|---|---|
| Step 1: retrieve genes differentially expressed in response to sorbitol (also shown in | SpellDataSet → SpellScore → Genes: ‘SpellDataSetCondition conditionname’ = *sorbitol*; ‘SpellExpressionScore score’ = [select threshold] | None | Save genes from results report as ‘List 1’ |
| Step 2: retrieve genes sensitive to osmotic stress when mutated | Phenotype → Genes: ‘Observable’ LIKE *osmotic stress | None | Save genes from results report as ‘List 2’ |
| Step 3: retrieve genes sensitive to sorbitol when mutated | Phenotype → Genes | Query Overview: remove ‘Observable’ constraint. Model Browser: constrain qualifier != normal; constrain chemical = *sorbitol* | Add genes from results report to ‘List 2’ |
| Step 4: make a list of all genes with a response to osmotic stress | None | None | Union ‘List 1’ and ‘List 2’ and Save as ‘List 3’ |
| Step 5: retrieve genes annotated with GO to a complex | GO Term name [and children of this term] → All genes in organism: GO Term name = macromolecular complex | None | Save genes from results report as ‘List 4’ |
| Step 6: make a list of genes that respond to osmotic stress that are also in a complex | None | None | Intersect ‘List 3’ and ‘List 4’ and Save as ‘List 5’ |
| Step 7: retrieve complexes where at least one member protein responds to osmotic stress | Gene → GO term: constrain to ‘osmotic genes in a complex’ list | Model Browser: constrain goAnnotations → ontologyTerm → parents → name = macromolecular complex | Save GO child terms from results report as ‘End List’ |
Genes that have altered expression under osmotic stress are retrieved in Step 1 (List 1), and genes that have a mutant phenotype under osmotic stress conditions are retrieved in Steps 2–3 (List 2). The lists created by Steps 1–3 are combined by union in Step 4 to obtain List 3. A list of genes (List 4) mapping up to the cellular component GO term ‘macromolecular complex’ is retrieved in Step 5. Intersecting List 3 and List 4 in Step 6 results in List 5, the genes that both have a response to osmotic stress and are members of a complex. Finally, in Step 7, limiting the search to genes within List 5, we retrieve the End List: the GO complex terms that have at least one member of the complex experimentally shown to be involved in osmotic stress. The results of the End List using YeastMine version 2011-10-09 and using ‘3’ as an expression score cut-off in Step 1 can be found in Supplementary Table S1.
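The list operations in Steps 4, 6 and 7 amount to ordinary set algebra. The following is a minimal sketch of that logic with plain Python sets, not the YeastMine interface itself. The function name, gene identifiers, complex names and the gene-to-complex mapping are hypothetical placeholders; in practice the input lists would come from the saved query results.

```python
def osmotic_complexes(list_1, list_2, list_4, gene_to_complexes):
    """Combine gene lists as in Steps 4, 6 and 7.

    list_1: genes with altered expression (Step 1)
    list_2: genes with a mutant phenotype (Steps 2-3)
    list_4: genes annotated to a complex (Step 5)
    gene_to_complexes: hypothetical mapping of gene -> complex GO terms
    """
    # Step 4: union of the expression and phenotype lists.
    list_3 = set(list_1) | set(list_2)
    # Step 6: keep only genes that are also members of a complex.
    list_5 = list_3 & set(list_4)
    # Step 7: collect every complex with at least one such member.
    end_list = set()
    for gene in list_5:
        end_list.update(gene_to_complexes.get(gene, ()))
    return list_3, list_5, end_list


# Toy inputs, invented for illustration only.
list_1 = ["GENE_A", "GENE_B"]
list_2 = ["GENE_B", "GENE_C"]
list_4 = ["GENE_A", "GENE_C", "GENE_X"]
gene_to_complexes = {
    "GENE_A": ["complex 1"],
    "GENE_C": ["complex 1", "complex 2"],
    "GENE_X": ["complex 3"],
}

list_3, list_5, end_list = osmotic_complexes(
    list_1, list_2, list_4, gene_to_complexes
)
print(sorted(list_3), sorted(list_5), sorted(end_list))
```

In this toy data, "complex 3" is excluded because its only member, GENE_X, appears in neither the expression list nor the phenotype list.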