Maxat Kulmanov, Robert Hoehndorf.
Abstract
BACKGROUND: Ontologies are widely used as metadata in biological and biomedical datasets. Measures of semantic similarity use ontologies to determine how similar two entities annotated with ontology classes are, and semantic similarity is increasingly applied in areas ranging from disease diagnosis to the investigation of gene networks and the functions of gene products.
Keywords: Gene ontology; Ontology; Semantic similarity
Year: 2017 PMID: 28193260 PMCID: PMC5307803 DOI: 10.1186/s13326-017-0119-z
Source DB: PubMed Journal: J Biomed Semantics
Spearman and Pearson correlation coefficients between similarity value and absolute annotation size as well as between variance in similarity value and annotation size
| Similarity measure | Spearman: Yeast, average | Spearman: Yeast, variance | Spearman: Synthetic GO, average | Spearman: Synthetic GO, variance | Spearman: Synthetic HPO, average | Spearman: Synthetic HPO, variance | Pearson: Yeast, average | Pearson: Yeast, variance | Pearson: Synthetic GO, average | Pearson: Synthetic GO, variance | Pearson: Synthetic HPO, average | Pearson: Synthetic HPO, variance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GIC (Graph Information Content) | 0.929780 | 0.251586 | 0.970924 | –0.773449 | 0.953247 | –0.980159 | 0.861348 | 0.117734 | 0.831167 | –0.744321 | 0.802873 | –0.958817 |
| NTO (Normalized Term Overlap) | 0.178345 | –0.860012 | 0.248990 | –0.976335 | 0.123304 | –0.988240 | –0.014072 | –0.682683 | –0.009088 | –0.574883 | –0.158914 | –0.593458 |
| UI (Union Intersection) | 0.892631 | 0.298097 | 0.879582 | –0.934921 | 0.729942 | –0.995599 | 0.788675 | 0.030649 | 0.777515 | –0.914405 | 0.736711 | –0.935415 |
| BMA with Jiang, Conrath 1997 | 0.960133 | –0.892027 | 0.998773 | –0.993506 | 0.999351 | –0.996609 | 0.892576 | –0.812184 | 0.895020 | –0.629497 | 0.907974 | –0.692269 |
| BMA with Lin 1998 | 0.980519 | –0.800362 | 0.998918 | –0.994733 | 0.999134 | –0.998052 | 0.925181 | –0.772250 | 0.896497 | –0.638574 | 0.917599 | –0.677309 |
| BMA with Resnik 1995 | 0.980519 | –0.717457 | 0.998773 | –0.994228 | 0.998918 | –0.998124 | 0.939044 | –0.703981 | 0.895107 | –0.642652 | 0.917738 | –0.675426 |
| BMA with Schlicker 2006 | 0.980519 | –0.800362 | 0.998918 | –0.994733 | 0.999134 | –0.998052 | 0.925181 | –0.772250 | 0.896497 | –0.638574 | 0.917599 | –0.677309 |
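
As a rough illustration of how coefficients like those in the table above can be obtained, the sketch below computes Pearson and Spearman correlations between annotation size and average similarity in plain Python. The input numbers are hypothetical toy values, not data from the study, and the function names `pearson`, `ranks`, and `spearman` are chosen here for illustration only.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical toy data: annotation size per group and the mean similarity
# observed for entities of that size (illustrative values only).
annotation_sizes = [1, 2, 3, 4, 5]
mean_similarity = [0.10, 0.15, 0.30, 0.32, 0.60]

print("Pearson: ", pearson(annotation_sizes, mean_similarity))
print("Spearman:", spearman(annotation_sizes, mean_similarity))
```

Because the toy similarity values increase monotonically with size, the Spearman coefficient is 1 up to floating-point error, while the Pearson coefficient is lower because the relationship is not exactly linear.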
Fig. 1 The distribution of similarity values as a function of the annotation size (top), annotation size difference (middle) and annotation class depth (bottom) for Resnik’s measure (using the Best Match Average strategy) and the simGIC measure
Spearman and Pearson correlation coefficients between similarity value and difference in annotation size as well as between variance in similarity value and difference in annotation size
| Similarity measure | Spearman: Yeast, average | Spearman: Yeast, variance | Spearman: Synthetic GO, average | Spearman: Synthetic GO, variance | Spearman: Synthetic HPO, average | Spearman: Synthetic HPO, variance | Pearson: Yeast, average | Pearson: Yeast, variance | Pearson: Synthetic GO, average | Pearson: Synthetic GO, variance | Pearson: Synthetic HPO, average | Pearson: Synthetic HPO, variance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GIC (Graph Information Content) | –0.895310 | –0.931818 | –0.999928 | –0.999784 | –0.999784 | –0.997835 | –0.875583 | –0.503795 | –0.964250 | –0.484246 | –0.963553 | –0.496135 |
| NTO (Normalized Term Overlap) | 0.901443 | –0.233045 | 0.999784 | 0.961833 | 0.999784 | 0.959524 | 0.882986 | –0.192168 | 0.990210 | 0.848649 | 0.993038 | 0.849263 |
| UI (Union Intersection) | –0.909524 | –0.924459 | –1.000000 | –0.658658 | –1.000000 | –0.518687 | –0.906605 | –0.596963 | –0.963476 | –0.547645 | –0.963569 | –0.508495 |
| BMA with Jiang, Conrath 1997 | 0.283838 | –0.925830 | –0.902597 | –0.521861 | –0.891486 | –0.770130 | 0.074788 | –0.850654 | –0.834208 | –0.495874 | –0.848264 | –0.735985 |
| BMA with Lin 1998 | 0.462843 | –0.674892 | –0.901587 | –0.552237 | –0.891126 | –0.731530 | 0.303157 | –0.707318 | –0.836486 | –0.517670 | –0.852998 | –0.693744 |
| BMA with Resnik 1995 | 0.578211 | –0.579149 | –0.901587 | –0.537807 | –0.891126 | –0.699856 | 0.442458 | –0.487544 | –0.835991 | –0.507179 | –0.854007 | –0.670199 |
| BMA with Schlicker 2006 | 0.462843 | –0.674892 | –0.901587 | –0.552237 | –0.891126 | –0.731530 | 0.303157 | –0.707318 | –0.836486 | –0.517670 | –0.852998 | –0.693744 |
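
To make the "UI (Union Intersection)" row more concrete, here is a minimal sketch of the measure as it is commonly defined: the Jaccard index of two annotation sets after each set has been closed over its ancestors in the ontology. The toy hierarchy, the class names, and the helper names `ancestor_closure` and `sim_ui` are assumptions made for this example and do not come from the paper or from any real ontology.

```python
def ancestor_closure(classes, parents):
    """Return the given classes together with all of their ancestors."""
    closed = set()
    stack = list(classes)
    while stack:
        c = stack.pop()
        if c not in closed:
            closed.add(c)
            stack.extend(parents.get(c, ()))
    return closed


def sim_ui(a, b, parents):
    """Jaccard index of the two ancestor-closed annotation sets."""
    ca = ancestor_closure(a, parents)
    cb = ancestor_closure(b, parents)
    union = ca | cb
    return len(ca & cb) / len(union) if union else 0.0


# Hypothetical toy is-a hierarchy (child -> list of parents); "A" is the root.
parents = {
    "B": ["A"],
    "C": ["A"],
    "D": ["B"],
    "E": ["B"],
    "F": ["C"],
}

# Two entities with one annotation each, sharing the ancestors B and A.
print(sim_ui({"D"}, {"E"}, parents))       # 2 shared of 4 classes -> 0.5

# Same first entity compared against an entity with a larger annotation set.
print(sim_ui({"D"}, {"D", "F"}, parents))  # 3 shared of 5 classes -> 0.6
```

In this toy setup the score depends on how many classes the closed sets contain, so it changes as the two annotation sets differ in size.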
Fig. 2 ROC curves for protein-protein interaction prediction using random annotations and interaction data from BioGRID for yeast
Fig. 3 ROC curves for protein-protein interaction prediction using random annotations and interaction data from BioGRID for mouse and human
Fig. 4 ROC curves for gene-disease association prediction using PhenomeNet Ontology with mouse phenotype from MGI and OMIM disease phenotype annotations compared with random annotations