Shujing Li, Ke Shui, Ying Zhang, Yongqiang Lv, Wankun Deng, Shahid Ullah, Luoying Zhang, Yu Xue.
Abstract
We report a database of circadian genes in eukaryotes (CGDB, http://cgdb.biocuckoo.org), containing ∼73 000 circadian-related genes in 68 animals, 39 plants and 41 fungi. A circadian rhythm is a ∼24 h rhythm in behavioral and physiological processes that exists in almost all organisms on Earth. Defects in the circadian system are strongly associated with a number of diseases, such as cancers. Although several databases have been established for rhythmically expressed genes, a comprehensive database of cycling genes across phyla is still lacking. From the literature, we collected 1382 genes whose transcript-level oscillations were validated using methods such as RT-PCR, northern blot and in situ hybridization. Because many genes exhibit different oscillatory patterns in different tissues/cells within an organism, we have included the phase and amplitude of each oscillation, as well as the tissue/cells in which the oscillation was identified. Using these well-characterized cycling genes, we then conducted an orthologous search and identified ∼45 000 potential cycling genes from 148 eukaryotes. Because significant effort has been devoted to identifying cycling genes by transcriptome profiling, we have also incorporated these results, a total of over 26 000 genes, into our database.
Year: 2016 PMID: 27789706 PMCID: PMC5210527 DOI: 10.1093/nar/gkw1028
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The workflow for constructing the CGDB database. Cycling genes identified via non-high-throughput methods were collected from the published literature, and an ortholog search was subsequently conducted. These data were then combined with cycling genes identified by high-throughput methods. Finally, all cycling genes were annotated using established databases including UniProt, NCBI and Ensembl.
Figure 2. The search options of CGDB. (A) ‘Simple search’. The database can be queried with one or multiple keywords. (B) ‘Advanced search’. Users can input up to two terms to perform a precise search. (C) ‘Multiple search’. A list of keywords, one per line, can be input to search for multiple genes at once. (D) ‘BLAST search’. This option is designed for searching the database with one protein sequence in FASTA format.
Figure 3. The browse options of CGDB. We provide two methods to browse the database: (A) by species and (B) by external condition. (C) The browse results can be displayed in a tabular format with CGDB ID, UniProt/Ensembl accession, species, and protein name. (D) The detailed information of human HTR1B.
Figure 4. The insulin signaling pathway is significantly enriched in human cycling genes (hypergeometric test, p-value = 9.70E-7, E-ratio = 2.26). Part of the insulin signaling pathway, adapted from KEGG, is shown. Cycling genes are drawn as filled gray ovals, compounds as open pink ovals and vesicles as open gray ovals. Peak time points for human and mouse cycling genes are indicated in brown and green, respectively. Solid lines indicate direct protein interactions, while dashed lines indicate indirect effects (i.e. other factors are involved that are not shown in the figure). PP1 [regulatory subunit 3B (ZT22)/regulatory subunit 3C (ZT0)].
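The enrichment statistic quoted in the Figure 4 caption can be illustrated with a minimal sketch of a one-sided hypergeometric test. The function name, its parameters and the definition of the E-ratio used here (the pathway's share of the cycling genes divided by its share of the background) are assumptions made for illustration; the record does not give the gene counts behind the reported values, so the example uses toy numbers.

```python
from math import comb


def hypergeom_enrichment(N, K, n, k):
    """One-sided hypergeometric enrichment test (illustrative sketch).

    N: number of background genes (assumed: all annotated genes)
    K: background genes that belong to the pathway
    n: number of cycling genes
    k: cycling genes that belong to the pathway
    Returns (p_value, e_ratio).
    """
    if not (0 < K <= N and 0 < n <= N and 0 <= k <= min(K, n)):
        raise ValueError("inconsistent gene counts")

    # P(X >= k): probability of drawing at least k pathway genes
    # when n genes are sampled without replacement from N.
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    p_value = tail / total

    # Assumed E-ratio definition: observed proportion / background proportion.
    e_ratio = (k / n) / (K / N)
    return p_value, e_ratio


# Toy numbers only, not taken from CGDB.
p_value, e_ratio = hypergeom_enrichment(N=10, K=4, n=5, k=3)
print(f"p-value = {p_value:.4f}, E-ratio = {e_ratio:.2f}")
```

An E-ratio above 1 under this definition means the pathway is over-represented among the cycling genes relative to the background, and the p-value measures how unlikely an overlap at least that large would be by chance.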