Jason Bubier, David Hill, Gaurab Mukherjee, Timothy Reynolds, Erich J Baker, Alexander Berger, Jake Emerson, Judith A Blake, Elissa J Chesler.
Abstract
Genomic data interpretation often requires analyses that move from a gene-by-gene focus to a focus on sets of genes that are associated with biological phenomena such as molecular processes, phenotypes, diseases, drug interactions or environmental conditions. Curating gene sets poses unique challenges beyond those of curating individual genes. Here we highlight a literature curation workflow whereby gene sets are curated from peer-reviewed published data into GeneWeaver (GW), a data repository and analysis platform. We describe the system features that allow for a flexible yet precise curation procedure. We illustrate the value of curation by gene sets through analysis of independently curated sets that relate to the integrated stress response (ISR), showing that sets curated from independent sources all share significant Jaccard similarity. A suite of reproducible analysis tools is provided in GW as services to carry out interactive functional investigation of user-submitted gene sets within the context of over 150 000 gene sets constructed from publicly available resources and published gene lists. A curation interface lets users design and maintain curation workflows for gene sets, including assigning, reviewing and releasing gene sets within a curation project context.
Year: 2019 PMID: 30888410 PMCID: PMC6424415 DOI: 10.1093/database/baz036
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Public data resources incorporated into GW data store
| Resource | Data type | Gene sets | Reference |
|---|---|---|---|
| Allen Brain Atlas | Differential expression in adult mouse brain structures | 802 | ( |
| Comparative Toxicogenomics Database | Curated chemical–gene interactions | 21 630 | ( |
| Drug Related Gene Database | Addiction-related experiment data | 250 | ( |
| GO | Gene annotations | 85 573 | ( |
| GWAS Catalog | GWAS summary results | 3389 | ( |
| Human Phenotype Ontology | Gene annotations | 6276 | ( |
| KEGG | Gene network and pathway members | 1339 | ( |
| MeSH | Gene annotations | 12 069 | ( |
| MP | Gene annotations | 7931 | ( |
| Molecular Signatures Database | Curated gene sets | 3738 | ( |
| Online Mendelian Inheritance in Man | Curated disease–gene associations | 738 | ( |
| PC | Gene networks and pathways | 1149 | ( |
Gene identifier types and expression microarray platforms supported in GW
| Gene identifier type | Species |
|---|---|
| CGNC | |
| Ensembl gene | All |
| Ensembl protein | All |
| Ensembl transcript | All |
| FlyBase | |
| Gene symbol | All |
| HGNC | |
| MGI | |
| miRBase | All |
| NCBI gene | All |
| RGD | |
| SGD | |
| UniGene | All |
| WormBase | |
| ZFIN | |

| Microarray vendor | Platforms |
|---|---|
| Affymetrix | 42 |
| Agilent | 4 |
| Illumina | 4 |
Figure 1. Sample GeneSet from GW showing the curator-populated fields Gene Set Name, Gene Set Figure Label, Gene Set Description and Ontology Annotations. Publication data are populated by providing the PMID during the set upload.
Figure 2. (A) A workflow diagram representing the creation of a new curation group. (B) A diagram showing the management of curation tasks and their assignment to group members. (C) The creation of a new project and the subsequent assignment of gene sets to it, which then allows those gene sets to be shared among users.
Figure 3. (A) A schematic representation of two branches of the ISR. Unfolded-protein ER stress or amino acid starvation activates one of two separate kinases, EIF2AK3 or EIF2AK4, respectively, each of which phosphorylates EIF2A. Phosphorylated EIF2A then represses general translation in the cell but stimulates translation of a subset of response genes, including the transcription factor ATF4. A consequence of ATF4 activation is the downstream activation of autophagy. (B) The results of Jaccard similarity analyses using independently curated gene sets that represent different aspects of the ISR. These results show significant similarity among genes up-regulated after amino acid starvation, genes down-regulated in EIF2A kinase mutant cells, ATF4 target genes and genes annotated with the GO term ‘positive regulation of autophagy’ (GO:0010508, downloaded from a Mouse Genome Informatics query performed on 17 August 2018).
Gene sets used for analysis of the ISR
| GeneSet ID | Genes | Description | Reference |
|---|---|---|---|
| GS355023 | 102 | Positive regulation of autophagy | ( |
| GS354317 | 528 | Genes down-regulated in livers of Eif2ak4-mutant mice perfused with all amino acids minus methionine | ( |
| GS354521 | 647 | Genes down-regulated in livers of Eif2ak3-mutant mice treated with tBuHQ | ( |
| GS354534 | 83 | Genes up-regulated in amino acid-starved N25/2 cells | ( |
| GS355584 | 472 | Atf4 target genes | ( |
| GS354663 | 269 | Genes up-regulated in amino acid-starved fibroblasts | ( |