Yang Hu1, Wenyang Zhou1, Jun Ren1, Lixiang Dong2, Yadong Wang3, Shuilin Jin4, Liang Cheng5.
Abstract
Increasing evidence indicates that functional annotation of the human genome at the molecular and phenotype levels is very important for the systematic analysis of genes. In this study, we present a framework named Gene2Function to annotate Gene Reference into Functions (GeneRIFs), in which each functional description in GeneRIFs is annotated with the text mining tool Open Biomedical Annotator (OBA), and each Entrez gene is mapped to a Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all GeneRIF records for human genes, we obtained 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms as a comprehensive annotation resource for the human genome. The consistency between GeneRIFs and GOA in the term frequency of individual genes (Pearson correlation = 0.6401, p = 2.2e-16) and in the gene frequency of individual terms (Pearson correlation = 0.1298, p = 3.686e-14) supports the reliability of our annotation resource.
Year: 2016 PMID: 27635398 PMCID: PMC5011202 DOI: 10.1155/2016/4130861
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
The statistical information of associations between genes and terms.
| Gene type | Number of genes | Number of terms | Number of gene-term associations |
|---|---|---|---|
| mRNA | 13,148 | 7,182 | 288,869 |
| MicroRNA | 948 | 533 | 9,496 |
| lncRNA | 139 | 297 | 901 |
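The three columns of this table can be derived from a flat list of gene-term pairs. The following is a minimal sketch of that bookkeeping, not the authors' implementation; the function names and the toy pairs are assumptions introduced for illustration, and duplicate pairs are assumed to count as a single association.

```python
from collections import defaultdict


def summarize(pairs):
    """Return (n_genes, n_terms, n_associations) for (gene, term) pairs.

    Assumption: a gene-term pair that occurs several times counts once.
    """
    unique = set(pairs)
    genes = {gene for gene, _ in unique}
    terms = {term for _, term in unique}
    return len(genes), len(terms), len(unique)


def terms_per_gene(pairs):
    """Map each gene to the number of distinct terms annotated to it."""
    by_gene = defaultdict(set)
    for gene, term in pairs:
        by_gene[gene].add(term)
    return {gene: len(terms) for gene, terms in by_gene.items()}


# Toy input for illustration only; these pairs are not taken from the resource.
toy_pairs = [
    ("TP53", "DOID:162"),
    ("TP53", "GO:0005488"),
    ("TNF", "GO:0005488"),
    ("TP53", "DOID:162"),  # repeated pair, counted once
]

n_genes, n_terms, n_assoc = summarize(toy_pairs)
print(n_genes, n_terms, n_assoc)
print(terms_per_gene(toy_pairs))
```

Running the same functions separately on the mRNA, microRNA, and lncRNA subsets would give one table row per gene type.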
Figure 1. Distribution of functional terms and genes in the annotation results. (a) Histogram of the number of genes associated with each functional term. (b) Histogram of the number of functional terms associated with each gene.
The top ten terms ordered by the number of gene annotations.
| Term ID | Term name | Number of genes |
|---|---|---|
| GO:0005623 | Cell | 7,524 |
| GO:0005488 | Binding | 5,011 |
| GO:0065007 | Biological regulation | 4,846 |
| GO:0023052 | Signaling | 4,466 |
| GO:0032502 | Developmental process | 3,521 |
| GO:0009058 | Biosynthetic process | 3,346 |
| DOID:162 | Cancer | 3,139 |
| GO:0006351 | Transcription, DNA-templated | 3,121 |
| DOID:305 | Carcinoma | 3,069 |
| GO:0040007 | Growth | 3,011 |
The top ten genes ordered by the number of term annotations.
| HGNC gene ID | Gene symbol | Number of functional terms |
|---|---|---|
| HGNC:11998 | TP53 | 828 |
| HGNC:11892 | TNF | 792 |
| HGNC:6018 | IL6 | 683 |
| HGNC:12680 | VEGFA | 669 |
| HGNC:11766 | TGFB1 | 664 |
| HGNC:3236 | EGFR | 560 |
| HGNC:7176 | MMP9 | 521 |
| HGNC:391 | AKT1 | 517 |
| HGNC:7794 | NFKB1 | 494 |
| HGNC:6025 | CXCL8 | 473 |
Figure 2. Comparison of annotations in GeneRIFs with annotations in GOA. (a) Histogram of the number of genes associated with each GO term. (b) Histogram of the number of DO terms associated with each gene. (c) The correlation between the term frequency of each gene in GeneRIFs and in GOA. (d) The correlation between the gene frequency of each term in GeneRIFs and in GOA.
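The correlations in panels (c) and (d) compare per-gene (or per-term) counts between two resources. The sketch below shows one way such a Pearson correlation could be computed; it assumes the comparison is restricted to keys present in both resources, which the source does not state, and the function names and toy counts are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def frequency_correlation(freq_a, freq_b):
    """Correlate two {key: count} maps over their shared keys.

    Assumption: keys found in only one resource are ignored.
    """
    shared = sorted(freq_a.keys() & freq_b.keys())
    return pearson([freq_a[k] for k in shared], [freq_b[k] for k in shared])


# Toy counts for illustration only; not values from GeneRIFs or GOA.
toy_generif = {"GENE_A": 1, "GENE_B": 2, "GENE_C": 3, "GENE_D": 9}
toy_goa = {"GENE_A": 2, "GENE_B": 4, "GENE_C": 6}

r = frequency_correlation(toy_generif, toy_goa)
print(round(r, 4))
```

The sketch returns only the coefficient; a p-value such as those reported in the abstract would require a separate significance test.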
Figure 3. A bipartite network demonstrating the relationship between genes and terms. Rectangles with yellow represent DO terms, three rectangles with blue in the center of the figure indicate DO terms, and other rectangles are GO terms. An edge is placed between a gene and a GO or DO term if the gene is associated with the term.
Data sources.
| Data source | Web site (date of download) |
|---|---|
| GeneRIF | |
| HGNC | |
| GO & GOA | |
| DO | |
Figure 4. A subgraph of the DAG for the BP term “Mitochondrial genome maintenance (GO:0000002)”. The arrow symbol represents an “IS_A” link of GO. For example, “Mitochondrial genome maintenance (GO:0000002)” is linked to “Mitochondrion organization (GO:0007005)” by an “IS_A” relationship.
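Because “IS_A” links form a directed acyclic graph, a gene annotated to a term is implicitly related to every ancestor of that term. The following sketch collects those ancestors by walking the links upward. Only the GO:0000002 to GO:0007005 edge comes from the caption above; the second edge uses a placeholder ID (`TOY:0000001`) that is hypothetical, and the function name is an assumption.

```python
def ancestors(term, is_a):
    """Return every term reachable from `term` by following IS_A links.

    `is_a` maps a child term ID to a list of its direct parent IDs.
    """
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Toy graph: the first edge is from the figure caption,
# the second uses a hypothetical placeholder parent.
IS_A_TOY = {
    "GO:0000002": ["GO:0007005"],
    "GO:0007005": ["TOY:0000001"],
}

print(sorted(ancestors("GO:0000002", IS_A_TOY)))
```

The `seen` set ensures that a term with several paths to the same ancestor is visited only once.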
Figure 5. Diagram of functional annotation of the human genome. (a) A framework to annotate functional descriptions of the human genome to ontologies. (b) An example of annotating a GeneRIF.