Qingyan Li, Shuabin Lian, Zhiming Dai, Qian Xiang, Xianhua Dai.
Abstract
A bivalent gene is a gene marked with both H3K4me3 and H3K27me3 epigenetic modifications in the same region, and is proposed to play a pivotal role in pluripotency in embryonic stem (ES) cells. Identifying these bivalent genes and understanding their functions are important for further research on lineage specification and embryo development. A large amount of genome-wide histone modification data has been generated in mouse and human ES cells. These data make it possible to identify bivalent genes, but no comprehensive data repository or analysis tool is currently available for them. In this work, we develop BGDB, the database of bivalent genes. The database contains 6897 bivalent genes in human and mouse ES cells, manually collected from the scientific literature. Each entry contains curated information, including genomic context, sequences, gene ontology and other relevant information. The web services of BGDB are implemented with PHP + MySQL + JavaScript and provide diverse query functions. Database URL: http://dailab.sysu.edu.cn/bgdb/
Year: 2013 PMID: 23894186 PMCID: PMC3724367 DOI: 10.1093/database/bat057
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Data statistics of the BGDB
| Organism | Gene number | Percentage (%) |
|---|---|---|
| Homo sapiens | 3913 | 56.7 |
| Mus musculus | 2984 | 43.3 |
| Total | 6897 | 100 |
| Both (a) | 1604 | 23.3 |
(a) Genes with the same name in both Homo sapiens and Mus musculus ES cells.
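The percentages in this table can be re-derived from the gene counts. The snippet below is a minimal sketch of that arithmetic; the variable names are illustrative, and assigning 3913 to Homo sapiens is an inference that is consistent with the per-chromosome counts for human ES cells, which sum to 3913.

```python
# Gene counts as reported in the data statistics table.
# Assumption: the 3913-gene row is Homo sapiens and the 2984-gene row is Mus musculus.
counts = {"Homo sapiens": 3913, "Mus musculus": 2984}
shared = 1604  # gene names reported in both organisms

total = sum(counts.values())


def share(n, whole):
    """Return n as a percentage of whole, rounded to one decimal place."""
    return round(100 * n / whole, 1)


percentages = {name: share(n, total) for name, n in counts.items()}
shared_pct = share(shared, total)

print(total, percentages, shared_pct)
```

Note that the "Both" row is expressed relative to the 6897 total entries, not relative to either organism alone.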
Figure 1. The data generation flow of the BGDB database.
Search results in PubMed
| Key words | Article number |
|---|---|
| Bivalent gene | 820 |
| Bivalent domain | 405 |
| H3K4 H3K27 | 142 |
| H3K4me3 H3K27me3 | 204 |
The top 10 articles that contain the most bivalent genes are shown in Supplementary Table S1.
Figure 2. Number of bivalent genes found in 1, 2 and >2 references.
Description of fields used to annotate bivalent genes
| Field name | Description |
|---|---|
| ID | Unique database identifier for the bivalent gene |
| Gene symbol | Approved symbol for the bivalent gene |
| Gene full name | Approved full gene name |
| Gene type | Biotype of the bivalent gene |
| Organism | Organism of the bivalent gene |
| Gene synonym | Other gene names used for the bivalent gene |
| Summary | Descriptive text about the gene |
| Reference | Articles that reported the bivalent gene |
| HGNC/MGI ID | HGNC ID for human bivalent genes, and MGI ID for mouse bivalent genes |
| Entrez ID | External link to Entrez gene |
| Ensembl ID | External link to Ensembl |
| UniProtKB ID | External link to UniProtKB |
| UCSC link | External link to UCSC |
| Gene ontology | The specific GO terms are listed by source of the information, category and term. Each GO term supports a link to the AmiGO browser |
| Genomic location | Genomic location of the bivalent gene |
| RefSeq ID | Reference sequence ID |
| Nucleotide sequence | Nucleotide sequence of the bivalent gene |
| Protein sequence | Protein sequence of the bivalent gene |
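The field list above maps naturally onto a record type. The following is a minimal sketch of such a record, assuming Python attribute names and types chosen for illustration only; it is not the schema of the BGDB MySQL database, which this section does not describe. The example values (BGDB-002517, GRK4) are the ones shown in the BGDB screenshots.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BivalentGeneEntry:
    """One BGDB-style record; attribute names and types are hypothetical."""

    bgdb_id: str                              # ID
    gene_symbol: str                          # Gene symbol
    gene_full_name: Optional[str] = None      # Gene full name
    gene_type: Optional[str] = None           # Gene type (biotype)
    organism: Optional[str] = None            # Organism
    gene_synonyms: List[str] = field(default_factory=list)  # Gene synonym
    summary: Optional[str] = None             # Summary
    references: List[str] = field(default_factory=list)     # Reference
    nomenclature_id: Optional[str] = None     # HGNC/MGI ID
    entrez_id: Optional[str] = None           # Entrez ID
    ensembl_id: Optional[str] = None          # Ensembl ID
    uniprotkb_id: Optional[str] = None        # UniProtKB ID
    ucsc_link: Optional[str] = None           # UCSC link
    go_terms: List[str] = field(default_factory=list)       # Gene ontology
    genomic_location: Optional[str] = None    # Genomic location
    refseq_id: Optional[str] = None           # RefSeq ID
    nucleotide_sequence: Optional[str] = None # Nucleotide sequence
    protein_sequence: Optional[str] = None    # Protein sequence


# Only the identifier and symbol are filled in; everything else is left unset
# because this section does not report those values.
entry = BivalentGeneEntry(bgdb_id="BGDB-002517", gene_symbol="GRK4")
print(entry.bgdb_id, entry.gene_symbol, entry.references)
```

Fields that can hold several values (synonyms, references, GO terms) are modelled as lists so that a gene reported by more than one article keeps all of its references.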
Figure 3. Representative screenshots of BGDB. (A) Users can enter ‘GRK4’ as a query. (B) The results are shown in tabular format. Users can click the BGDB ID (BGDB-002517) to view detailed information. (C) Detailed information for the bivalent gene GRK4. The nucleotide and protein sequences are also presented.
The top five most enriched GO terms of biological processes, molecular functions and cellular components in human bivalent genes
| Description of GO term | Bivalent gene: Num. (Per.) | Genome: Num. (Per.) | E-ratio | P-value |
|---|---|---|---|---|
| The top five most enriched biological processes | | | | |
| Anterior/posterior pattern specification (GO:0009952) | 70 (1.79) | 102 (0.54) | 3.31 | 2.72E-13 |
| Neuron differentiation (GO:0030182) | 46 (1.18) | 81 (0.43) | 2.74 | 3.30E-07 |
| Negative regulation of canonical Wnt receptor signaling pathway (GO:0090090) | 46 (1.18) | 79 (0.42) | 2.81 | 1.39E-07 |
| Neuron migration (GO:0001764) | 56 (1.43) | 100 (0.53) | 2.70 | 2.25E-08 |
| Central nervous system development (GO:0007417) | 58 (1.48) | 108 (0.57) | 2.59 | 3.57E-08 |
| The top five most enriched molecular functions | | | | |
| RNA polymerase II distal enhancer sequence-specific DNA binding transcription factor activity (GO:0003705) | 62 (1.58) | 108 (0.57) | 2.77 | 2.00E-09 |
| Metal ion binding (GO:0046872) | 552 (14.11) | 1050 (5.57) | 2.53 | 8.35E-68 |
| Sequence-specific DNA binding (GO:0043565) | 260 (6.64) | 536 (2.84) | 2.34 | 2.70E-27 |
| Transcription factor binding (GO:0008134) | 116 (2.96) | 280 (1.49) | 2.00 | 2.18E-09 |
| Protein dimerization activity (GO:0046983) | 64 (1.64) | 159 (0.84) | 1.94 | 2.25E-05 |
| The top five most enriched cellular components | | | | |
| Voltage-gated potassium channel complex (GO:0008076) | 48 (1.23) | 69 (0.37) | 3.35 | 1.26E-09 |
| Axon (GO:0030424) | 77 (1.97) | 149 (0.79) | 2.49 | 7.13E-10 |
| Dendrite (GO:0030425) | 82 (2.10) | 178 (0.94) | 2.22 | 1.18E-08 |
| Neuronal cell body (GO:0043025) | 104 (2.66) | 223 (1.18) | 2.25 | 1.05E-10 |
| Postsynaptic membrane (GO:0045211) | 85 (2.17) | 187 (0.99) | 2.19 | 1.43E-08 |
Num., number of proteins annotated.
Per., percentage of proteins annotated.
E-ratio, enrichment ratio of bivalent genes.
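The E-ratio column appears to be the fraction of bivalent genes annotated with a term divided by the fraction of genome genes annotated with the same term. The sketch below reproduces three table values under two assumptions: the bivalent total is the 3913 human bivalent genes, and the genome total is about 18,850. The genome total is not stated in this section; it is back-calculated from the reported percentages (for example, 1050 genes corresponding to 5.57%). The function name and parameters are illustrative, and the P-value column is not recomputed because the test used is not described here.

```python
def enrichment_ratio(term_bivalent, total_bivalent, term_genome, total_genome):
    """Fraction of bivalent genes with the term over the fraction of genome genes with it."""
    return (term_bivalent / total_bivalent) / (term_genome / total_genome)


TOTAL_BIVALENT = 3913   # human bivalent genes in BGDB
TOTAL_GENOME = 18850    # assumption: back-calculated, not reported in this section

# (term, bivalent genes annotated, genome genes annotated) copied from the table.
terms = [
    ("GO:0009952", 70, 102),
    ("GO:0046872", 552, 1050),
    ("GO:0008076", 48, 69),
]

ratios = {
    go_id: round(enrichment_ratio(k, TOTAL_BIVALENT, n, TOTAL_GENOME), 2)
    for go_id, k, n in terms
}
print(ratios)
```

Dividing the rounded percentages directly (for example 1.23 / 0.37) gives slightly different values for some rows, which suggests the reported ratios were computed from the raw counts.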
Distribution of bivalent genes across human ESC chromosomes
| Chromosome | Bivalent gene number | Protein-coding gene number | Percentage (%) |
|---|---|---|---|
| 1 | 366 | 2080 | 17.60 |
| 2 | 303 | 1333 | 22.73 |
| 3 | 221 | 1079 | 20.48 |
| 4 | 195 | 769 | 25.36 |
| 5 | 200 | 898 | 22.27 |
| 6 | 200 | 1054 | 18.98 |
| 7 | 182 | 983 | 18.51 |
| 8 | 165 | 702 | 23.50 |
| 9 | 193 | 829 | 23.28 |
| 10 | 185 | 774 | 23.90 |
| 11 | 228 | 1317 | 17.31 |
| 12 | 191 | 1070 | 17.85 |
| 13 | 83 | 332 | 25.00 |
| 14 | 123 | 866 | 14.20 |
| 15 | 121 | 619 | 19.55 |
| 16 | 142 | 886 | 16.03 |
| 17 | 223 | 1217 | 18.32 |
| 18 | 58 | 290 | 20.00 |
| 19 | 169 | 1496 | 11.30 |
| 20 | 128 | 562 | 22.78 |
| 21 | 47 | 247 | 19.03 |
| 22 | 103 | 511 | 20.16 |
| X | 86 | 836 | 10.29 |
| Y | 1 | 56 | 1.79 |
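The last column is the number of bivalent genes divided by the number of protein-coding genes on each chromosome. As a minimal sketch (variable names are illustrative), the snippet below recomputes that column from the two count columns, checks that the per-chromosome counts add up to the human total, and ranks chromosomes by bivalent-gene density.

```python
# (bivalent genes, protein-coding genes) per chromosome, copied from the table above.
chromosomes = {
    "1": (366, 2080), "2": (303, 1333), "3": (221, 1079), "4": (195, 769),
    "5": (200, 898), "6": (200, 1054), "7": (182, 983), "8": (165, 702),
    "9": (193, 829), "10": (185, 774), "11": (228, 1317), "12": (191, 1070),
    "13": (83, 332), "14": (123, 866), "15": (121, 619), "16": (142, 886),
    "17": (223, 1217), "18": (58, 290), "19": (169, 1496), "20": (128, 562),
    "21": (47, 247), "22": (103, 511), "X": (86, 836), "Y": (1, 56),
}

# Percentage of protein-coding genes on each chromosome that are bivalent.
density = {
    chrom: round(100 * bivalent / coding, 2)
    for chrom, (bivalent, coding) in chromosomes.items()
}

total_bivalent = sum(bivalent for bivalent, _ in chromosomes.values())
ranked = sorted(density, key=density.get, reverse=True)

print(total_bivalent)
print(ranked[0], density[ranked[0]])
print(ranked[-1], density[ranked[-1]])
```

Normalising by protein-coding gene count matters here: chromosome 1 carries the most bivalent genes in absolute terms, but several smaller chromosomes have a higher density.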