Paul D Thomas, Valerie Wood, Christopher J Mungall, Suzanna E Lewis, Judith A Blake.
Abstract
A recent paper (Nehrt et al., PLoS Comput. Biol. 7:e1002073, 2011) has proposed a metric for the "functional similarity" between two genes that uses only the Gene Ontology (GO) annotations directly derived from published experimental results. Applying this metric, the authors concluded that paralogous genes within the mouse genome or the human genome are more functionally similar on average than orthologous genes between these genomes, an unexpected result with broad implications if true. We suggest, based on both theoretical and empirical considerations, that this proposed metric should not be interpreted as a functional similarity, and therefore cannot be used to support any conclusions about the "ortholog conjecture" (or, more properly, the "ortholog functional conservation hypothesis"). First, we reexamine the case studies presented by Nehrt et al. as examples of orthologs with divergent functions, and come to a very different conclusion: they actually exemplify how GO annotations for orthologous genes provide complementary information about conserved biological functions. We then show that there is a global ascertainment bias in the experiment-based GO annotations for human and mouse genes: particular types of experiments tend to be performed in different model organisms. We conclude that the reported statistical differences in annotations between pairs of orthologous genes do not reflect differences in biological function, but rather complementarity in experimental approaches. 
Our results underscore two general considerations for researchers proposing novel types of analysis based on the GO: 1) that GO annotations are often incomplete, potentially in a biased manner, and subject to an "open world assumption" (absence of an annotation does not imply absence of a function), and 2) that conclusions drawn from a novel, large-scale GO analysis should whenever possible be supported by careful, in-depth examination of examples, to help ensure the conclusions have a justifiable biological basis.
Year: 2012 PMID: 22359495 PMCID: PMC3280971 DOI: 10.1371/journal.pcbi.1002386
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Experimentally-supported GO annotations for MAP4K2 and MAP4K3 genes in human, and Map4k2 gene in mouse.
| GO biological process | | | |
| GO:0065007 biological regulation | x | x | |
| GO:0050789 regulation of biological process | x | x | |
| GO:0060255 regulation of macromolecule metabolic process | x | ||
| GO:0080090 regulation of primary metabolic process | x | ||
| GO:0051716 cellular response to stimulus | x | x | |
| GO:0031323 regulation of cellular metabolic process | x | ||
| GO:0050794 regulation of cellular process | x | ||
| GO:0019222 regulation of metabolic process | x | ||
| GO:0065009 regulation of molecular function | x | ||
| GO:0051174 regulation of phosphorus metabolic process | x | ||
| GO:0051246 regulation of protein metabolic process | x | ||
| GO:0044093 positive regulation of molecular function | x | ||
| GO:0050790 regulation of catalytic activity | x | ||
| GO:0032268 regulation of cellular protein metabolic process | x | ||
| GO:0019220 regulation of phosphate metabolic process | x | ||
| GO:0009987 cellular process | x | x | x |
| GO:0035556 intracellular signal transduction | x | x | |
| GO:0008152 metabolic process | x | x | |
| GO:0043085 positive regulation of catalytic activity | x | ||
| GO:0042325 regulation of phosphorylation | x | ||
| GO:0031399 regulation of protein modification process | x | ||
| GO:0051338 regulation of transferase activity | x | ||
| GO:0006950 response to stress | x | xx | |
| GO:0007154 cell communication | x | x | |
| GO:0044237 cellular metabolic process | x | x | |
| GO:0033554 cellular response to stress | x | ||
| GO:0007243 intracellular protein kinase cascade | x | xx | |
| GO:0051347 positive regulation of transferase activity | x | ||
| GO:0044238 primary metabolic process | x | x | |
| GO:0043549 regulation of kinase activity | x | ||
| GO:0001932 regulation of protein phosphorylation | x | ||
| GO:0080134 regulation of response to stress | x | ||
| GO:0050896 response to stimulus | x | ||
| GO:0023052 signaling | x | x | |
| GO:0044260 cellular macromolecule metabolic process | x | ||
| GO:0043170 macromolecule metabolic process | x | ||
| GO:0000165 MAPKKK cascade | x | ||
| GO:0006793 phosphorus metabolic process | x | x | |
| GO:0033674 positive regulation of kinase activity | x | ||
| GO:0019538 protein metabolic process | x | x | |
| GO:0080135 regulation of cellular response to stress | x | ||
| GO:0010627 regulation of intracellular protein kinase cascade | x | ||
| GO:0045859 regulation of protein kinase activity | x | ||
| GO:0048583 regulation of response to stimulus | x | ||
| GO:0023051 regulation of signaling | x | ||
| GO:0007165 signal transduction | x | x | |
| GO:0031098 stress-activated protein kinase signaling cascade | x | ||
| GO:0044267 cellular protein metabolic process | x | x | |
| GO:0007254 JNK cascade | x | ||
| GO:0043412 macromolecule modification | x | x | |
| GO:0006796 phosphate-containing compound metabolic process | x | x | |
| GO:0045860 positive regulation of protein kinase activity | x | ||
| GO:0043408 regulation of MAPKKK cascade | x | ||
| GO:0071900 regulation of protein serine/threonine kinase activity | x | ||
| GO:0009966 regulation of signal transduction | x | ||
| GO:0070302 regulation of stress-activated protein kinase signaling cascade | x | ||
| GO:0016310 phosphorylation | x | x | |
| GO:0071902 positive regulation of protein serine/threonine kinase activity | x | ||
| GO:0006464 protein modification process | x | x | |
| GO:0046328 regulation of JNK cascade | x | ||
| GO:0043405 regulation of MAP kinase activity | x | ||
| GO:0043406 positive regulation of MAP kinase activity | x | ||
| GO:0006468 protein phosphorylation | x | xx | |
| GO:0043506 regulation of JUN kinase activity | x | ||
| GO:0000187 activation of MAPK activity | x | ||
| GO:0043507 positive regulation of JUN kinase activity | x | ||
| GO:0007257 activation of JUN kinase activity | xx | ||
| GO:0051641 cellular localization | x | ||
| GO:0051179 localization | x | ||
| GO:0051234 establishment of localization | x | ||
| GO:0051640 organelle localization | x | ||
| GO:0051649 establishment of localization in cell | x | ||
| GO:0051656 establishment of organelle localization | x | ||
| GO:0006810 transport | x | ||
| GO:0051648 vesicle localization | x | ||
| GO:0051650 establishment of vesicle localization | x | ||
| GO:0016192 vesicle-mediated transport | xx |
‘x’ means inferred annotation (direct annotation by curator was to a child term); ‘xx’ means direct annotation. The “functional similarity” (actually an annotation congruence score) as defined by Nehrt et al. includes all terms, both inferred and direct.
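The distinction between direct ('xx') and inferred ('x') annotations can be illustrated with a small sketch. The parent links below are a simplified, hypothetical fragment and not the real GO graph, and the Jaccard overlap is only a stand-in for an annotation congruence score; it is not the metric defined by Nehrt et al. The names `PARENTS`, `propagate` and `jaccard` are introduced here for illustration.

```python
# Hypothetical, simplified parent links (child term -> parent terms).
# This is NOT the real GO graph; it only mimics its shape.
PARENTS = {
    "activation of JUN kinase activity": ["JNK cascade"],
    "JNK cascade": ["signal transduction"],
    "signal transduction": ["cellular process"],
    "vesicle-mediated transport": ["transport", "cellular process"],
    "transport": ["localization"],
    "cellular process": [],
    "localization": [],
}


def propagate(direct_terms, parents):
    """Return the direct terms plus every ancestor term (the inferred annotations)."""
    seen = set()
    stack = list(direct_terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(parents.get(term, []))
    return seen


def jaccard(a, b):
    """Overlap of two term sets: an assumed stand-in for a congruence score."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Two genes, each with a single direct annotation from a different kind of experiment.
gene_a = propagate({"activation of JUN kinase activity"}, PARENTS)
gene_b = propagate({"vesicle-mediated transport"}, PARENTS)
score = jaccard(gene_a, gene_b)  # only "cellular process" is shared
```

In this toy case the two propagated sets share a single term, so the score is low. Under the open world assumption that low score says only that different experiments were annotated for the two genes, not that either gene lacks the other's function.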
Experimentally-supported GO annotations for Thra and Esr1 genes in mouse, and THRA gene in human.
| GO molecular function | | | |
| GO:0060089 molecular transducer activity | x | x | x |
| GO:0001071 nucleic acid binding transcription factor activity | x | x | x |
| GO:0004872 receptor activity | x | x | x |
| GO:0003700 sequence-specific DNA binding transcription factor activity | x | x | x |
| GO:0004871 signal transducer activity | x | x | x |
| GO:0000981 sequence-specific DNA binding RNA polymerase II transcription factor activity | x | x | x |
| GO:0038023 signaling receptor activity | x | x | x |
| GO:0004879 ligand-activated sequence-specific DNA binding RNA polymerase II transcription factor activity | xx | x | xx |
| GO:0004887 thyroid hormone receptor activity | xx | ||
| GO:0005488 binding | x | x | x |
| GO:0005515 protein binding | xx | x | xx |
| GO:0032403 protein complex binding | xx | ||
| GO:0008134 transcription factor binding | x | xx | |
| GO:0017025 TBP-class protein binding | xx | ||
| GO:0019904 protein domain specific binding | xx | ||
| GO:0003676 nucleic acid binding | x | x | |
| GO:0003723 RNA binding | x | ||
| GO:0003727 single-stranded RNA binding | x | ||
| GO:0002153 steroid receptor RNA activator RNA binding | xx | ||
| GO:0042562 hormone binding | x | ||
| GO:0070324 thyroid hormone binding | xx | ||
| GO:0003677 DNA binding | x | ||
| GO:0001067 regulatory region nucleic acid binding | x | ||
| GO:0000975 regulatory region DNA binding | xx | ||
| GO:0003682 chromatin binding | xx |
‘x’ means inferred annotation (direct annotation by curator was to a child term); ‘xx’ means direct annotation. The “functional similarity” (actually an annotation congruence score) as defined by Nehrt et al. includes all terms, both inferred and direct.
GO annotation classes overrepresented in mouse compared to human, or vice versa.
| Aspect | GO ID | GO term | # mouse annotations | # human annotations | P-value |
| molecular function | GO:0005515 | protein binding | 6151 | 12318 | <10^-100 |
| molecular function | GO:0016462 | pyrophosphatase activity | 109 | 240 | <10^-50 |
| molecular function | GO:0003682 | chromatin binding | 204 | 68 | <10^-30 |
| molecular function | GO:0005261 | cation channel activity | 187 | 75 | <10^-20 |
| molecular function | GO:0003700 | sequence-specific DNA binding transcription factor activity | 427 | 252 | <10^-10 |
| biological process | GO:0032502 | developmental process | 22114 | 3197 | <10^-100 |
| biological process | GO:0032501 | multicellular organismal process | 15070 | 2987 | <10^-100 |
| biological process | GO:0030154 | cell differentiation | 5390 | 1035 | <10^-100 |
| biological process | GO:0043412 | macromolecule modification | 1438 | 2277 | <10^-100 |
| biological process | GO:0044248 | cellular catabolic process | 523 | 904 | <10^-100 |
| biological process | GO:0051276 | chromosome organization | 338 | 634 | <10^-100 |
P-values are calculated using the hypergeometric distribution, without Bonferroni correction.
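As a rough illustration of a hypergeometric overrepresentation test, the sketch below computes an upper-tail probability with the Python standard library. The record does not state how the population and the draws were defined, and the table gives per-term counts but not the total annotation counts, so the setup here (pooled annotations as the population, one organism's annotations as the draws) and the toy numbers are assumptions. The function name `hypergeom_upper_tail` is hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    population: total annotations pooled over both organisms (assumed setup)
    successes:  pooled annotations to the GO term of interest
    draws:      annotations belonging to one organism
    k:          that organism's annotations to the GO term
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Toy numbers only: 20 pooled annotations, 8 of them to the term,
# 10 from one organism, of which 7 are to the term.
p_value = hypergeom_upper_tail(7, 20, 8, 10)
```

A small P-value here means the term is annotated in one organism more often than expected if annotations to it were spread evenly across the pooled set. Without a multiple-testing correction such as Bonferroni, the values are best read as a ranking of terms rather than as corrected significance levels.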