Jiajie Peng, Jin Chen, Yadong Wang.
Abstract
BACKGROUND: Gene Ontology (GO) has been widely used in biological databases, annotation projects, and computational analyses. Although the three GO categories are structured as independent ontologies, the biological relationships across the categories are not negligible for biological reasoning and knowledge integration. However, existing cross-category term similarity measures either rely on GO data alone or are based on manually curated term name similarities, ignoring the fact that GO is evolving quickly and that gene annotations are far from complete.
Year: 2013 PMID: 23368677 PMCID: PMC3549802 DOI: 10.1186/1471-2105-14-S2-S15
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. An example of two GO categories, a gene co-function network, and a gene set association. (a and b) Two GO categories, in which each node is a GO term, each edge represents a conceptual relation between two terms, and {g1...g13} is the set of genes annotated to the corresponding terms. (c) A gene co-function network, in which each node is a gene, each edge represents a functional association between two genes, and the confidence score on each edge measures the probability that the interaction represents a true functional linkage between the two genes. (d) The gene set association between two gene sets.
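The data structures described in Figure 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the CroGO scoring function: the term names, gene names, confidence values, and the `mean_cross_confidence` helper are all hypothetical, and averaging edge confidences across two gene sets is only one simple stand-in for a gene set association.

```python
from itertools import product

# Hypothetical annotations: GO term -> set of annotated genes (panels a and b).
mf_annotations = {"MF_term": {"g1", "g2", "g3"}}
bp_annotations = {"BP_term": {"g3", "g4"}}

# Hypothetical co-function network (panel c): undirected edge -> confidence score.
cofunction = {
    frozenset({"g1", "g4"}): 0.8,
    frozenset({"g2", "g4"}): 0.4,
    frozenset({"g2", "g3"}): 0.6,
}


def mean_cross_confidence(set_a, set_b, network):
    """Average edge confidence over all cross pairs of distinct genes.

    Pairs without an edge in the network count as 0. This is an
    illustrative stand-in, not the association measure from the paper.
    """
    pairs = {frozenset({a, b}) for a, b in product(set_a, set_b) if a != b}
    if not pairs:
        return 0.0
    return sum(network.get(pair, 0.0) for pair in pairs) / len(pairs)


score = mean_cross_confidence(
    mf_annotations["MF_term"], bp_annotations["BP_term"], cofunction
)
print(f"illustrative gene set association: {score:.2f}")
```

Storing each undirected edge as a `frozenset` key means the lookup does not depend on the order of the two genes.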
Figure 2. ROC curves for the experimental results on the gold-standard sets of yeast (a) and human (b), calculated with CroGO (red), VSM (blue) and ASR (green). The ASR- and VSM-based measures show very similar trends and overlap for most of the visible portion of the ROC curves.
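For readers unfamiliar with how such curves are built, the sketch below computes ROC points from scored term pairs and gold-standard labels. It is a generic illustration, not the evaluation code used in the study; the function names `roc_points` and `auc` and the toy scores and labels are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FP rate, TP rate) points obtained by lowering the score threshold.

    scores: similarity score per term pair (higher means more similar).
    labels: True if the pair belongs to the gold-standard positive set.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative pair")
    ranked = sorted(zip(scores, labels), key=lambda item: item[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all pairs tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (x2 - x1) * (y1 + y2) / 2
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


# Hypothetical toy data, not taken from the paper.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [True, True, False, True, False, False]
toy_curve = roc_points(toy_scores, toy_labels)
print(toy_curve, auc(toy_curve))
```

Two measures that rank the term pairs almost identically, as the caption reports for ASR and VSM, would produce nearly coincident point lists under this construction.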
The performance study of CroGO, VSM and ASR based measures on yeast and human gold-standard sets.
| Organism | Measure | No. of term pairs (when FP = TP) | TP rate | TP rate | TP rate |
|---|---|---|---|---|---|
| Yeast | ASR | 50 | 81% | 82% | 83% |
| Yeast | VSM | 50 | 81% | 82% | 83% |
| Yeast | CroGO | | | | |
| Human | ASR | 21 | 79% | 80% | 81% |
| Human | VSM | 21 | 79% | 80% | 81% |
| Human | CroGO | | | | |
Figure 3. The edge distribution in the genome-specific term association network of yeast (a) and human (b). The three categories are "identical", "non-overlap" and "overlap but not identical". A significant part of each network (the "non-overlap" category) can be identified only by CroGO, because it incorporates extra biological information from the co-function networks.
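One way to read the three categories is as a comparison of the gene sets annotated to the two terms of each edge. The sketch below follows that reading, which is an assumption drawn from the caption rather than a procedure stated here; the function name `overlap_category` and the example edges are hypothetical.

```python
from collections import Counter


def overlap_category(genes_mf, genes_bp):
    """Classify a term pair by how its two annotated gene sets relate.

    Assumes the categories compare annotated gene sets; this is an
    interpretation of the caption, not a documented procedure.
    """
    a, b = set(genes_mf), set(genes_bp)
    if a == b:
        return "identical"
    if a.isdisjoint(b):
        return "non-overlap"
    return "overlap but not identical"


# Hypothetical edges: (genes annotated to the MF term, genes annotated to the BP term).
example_edges = [
    ({"g1", "g2"}, {"g1", "g2"}),
    ({"g1", "g2"}, {"g3"}),
    ({"g1", "g2"}, {"g2", "g3"}),
    ({"g4"}, {"g5", "g6"}),
]

distribution = Counter(overlap_category(mf, bp) for mf, bp in example_edges)
print(distribution)
```

Under this reading, a "non-overlap" pair shares no annotated gene, so a measure that relies only on shared annotations has nothing to score, whereas a network-based measure can still link the two gene sets.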
Top 20 term associations in category "overlap but not identical" that were identified by CroGO.
| MF Name | BP Name | Evidence |
|---|---|---|
| polynucleotide adenylyltransferase activity | ncRNA polyadenylation | NEW |
| TFIIF-class binding TF activity | regulation of transcription-coupled nucleotide-excision repair | REF [ |
| TFIIF-class binding TF activity | positive regulation of transcription elongation from Pol I promoter | REF [ |
| TFIIF-class binding TF activity | regulation of transcription elongation from Pol I promoter | REF [ |
| TFIIF-class binding TF activity | positive regulation of histone H3-K36 trimethylation | NEW |
| TFIIF-class binding TF activity | regulation of histone H3-K36 trimethylation | NEW |
| TFIIF-class binding TF activity | positive regulation of histone H3-K36 methylation | NEW |
| TFIIF-class binding TF activity | regulation of nucleotide-excision repair | REF [ |
| TFIIF-class binding TF activity | regulation of histone H2B ubiquitination | REF [ |
| TFIIF-class binding TF activity | positive regulation of phosphorylation of Pol II C-terminal domain serine 2 residues | NEW |
| TFIIF-class binding TF activity | regulation of phosphorylation of Pol II C-terminal domain serine 2 residues | NEW |
| TFIIF-class binding TF activity | regulation of histone H2B conserved C-terminal lysine ubiquitination | NEW |
| IMP dehydrogenase activity | GTP biosynthetic process | REF [ |
| hydrogen ion transporting ATP synthase activity, rotational mechanism | ATP biosynthetic process | LEXICAL |
| RNA-directed RNA polymerase activity | tRNA transcription from Pol III promoter | LEXICAL |
| RNA-directed RNA polymerase activity | tRNA transcription | NEW |
| protein prenyltransferase activity | protein geranylgeranylation | REF [ |
| second spliceosomal transesterification activity | generation of catalytic spliceosome for second transesterification step | NEW |
| oxoglutarate dehydrogenase activity | 2-oxoglutarate metabolic process | LEXICAL |
| peptide alpha-N-acetyltransferase activity | N-terminal protein amino acid acetylation | REF [ |
Of these 20 term associations, 8 are supported by existing biological studies, 3 are supported by lexical matching on term definitions, and the remaining 9 are new conceptual connections that could not be found in the literature. Only 3 of the term associations can be identified by the VSM- or ASR-based measures.
Top 20 term associations in category "non-overlap" that were identified by CroGO.
| MF Name | BP Name | Evidence |
|---|---|---|
| endopeptidase activator activity | proteasome core complex assembly | NEW |
| TFIIF-class binding TF activity | regulation of histone H3 K79 methylation | NEW |
| RNA-directed RNA polymerase activity | DNA-dependent transcriptional start site selection | LEXICAL |
| RNA-directed RNA polymerase activity | transcriptional start site selection at Pol II promoter | LEXICAL |
| single base insertion or deletion binding | chiasma assembly | REF [ |
| double-strand/single-strand DNA junction binding | chiasma assembly | REF [ |
| double-stranded telomeric DNA binding | gene conversion at mating-type locus, DNA double-strand break processing | NEW |
| G-quadruplex DNA binding | gene conversion at mating-type locus, DNA double-strand break processing | NEW |
| very long-chain fatty acid-CoA ligase activity | long-chain fatty-acyl-CoA metabolic process | REF [ |
| very long-chain fatty acid-CoA ligase activity | fatty-acyl-CoA metabolic process | REF [ |
| very long-chain fatty acid-CoA ligase activity | acyl-CoA metabolic process | REF [ |
| single base insertion or deletion binding | meiotic heteroduplex formation | NEW |
| guanine/thymine mispair binding | chiasma assembly | NEW |
| TFIIE-class TF binding | negative regulation of ribosomal protein gene transcription from Pol II promoter in response to chemical stimulus | REF [ |
| TFIIE-class binding TF activity | negative regulation of ribosomal protein gene transcription from Pol II promoter in response to chemical stimulus | REF [ |
| Hsp90 protein binding | positive regulation of telomere maintenance via telomerase | NEW |
| Hsp90 protein binding | positive regulation of telomere maintenance | REF [ |
| Hsp90 protein binding | positive regulation of homeostatic process | REF [ |
| aldehyde dehydrogenase activity | beta-alanine metabolic process | REF [ |
| aldehyde dehydrogenase activity | beta-alanine biosynthetic process | REF [ |
Of these 20 term associations, 11 are supported by existing biological studies, 2 are supported by lexical matching on term definitions, and the remaining 7 are new conceptual connections that could not be found in the literature. None of the term associations can be identified by the VSM- or ASR-based measures.
Figure 4. A case study of an MF-centered topological pattern found in the yeast MF-BP association network. The yellow and white nodes are MF and BP terms, respectively.
Figure 5. An example in which five different types of DNA-binding proteins are involved in the biological processes meiotic mismatch repair and chiasma assembly in both human and yeast. The yellow and white nodes are MF and BP terms, respectively.