Ian Harrow1, Ernesto Jiménez-Ruiz2, Andrea Splendiani3, Martin Romacker4, Peter Woollard5, Scott Markel6, Yasmin Alam-Faruque7, Martin Koch8, James Malone9, Arild Waaler2.
Abstract
BACKGROUND: The disease and phenotype track was designed to evaluate the relative performance of ontology matching systems that generate mappings between source ontologies. Disease and phenotype ontologies are important for applications such as data mining, data integration and knowledge management to support translational science in drug discovery and understanding the genetics of disease.
Keywords: Biomedical ontology; Disease; Evaluation; OAEI; Ontology alignment; Phenotype
Year: 2017 PMID: 29197409 PMCID: PMC5712086 DOI: 10.1186/s13326-017-0162-9
Source DB: PubMed Journal: J Biomed Semantics
Fig. 1 Phases of the OAEI 2016 Disease and Phenotype track. Important Dates: D1 (publication of preliminary datasets), D2 (final datasets released), D3 (system registration), D4 (system submission), D5 (publication of evaluation results and presentation in the Ontology Matching workshop [1])
Metrics of the track ontologies. Source: NCBI BioPortal on 2nd June 2016
| Ontology | Number of axioms | Number of classes | Maximum depth | Avg. number of children |
|---|---|---|---|---|
| HP | 137,289 | 11,786 | 15 | 3 |
| MP | 129,036 | 11,721 | 15 | 3 |
| DOID | 124,362 | 9248 | 12 | 3 |
| ORDO | 188,991 | 12,936 | 11 | 16 |
Note that the metric “average number of children” excludes the leaf nodes
Recall against BioPortal (baseline) mappings
| System | AML | DiSMatch | FCA-Map | LYAM++ | LogMap | LogMapBio | LogMapLt | PhenoMF | PhenoMM | PhenoMP | XMap |
|---|---|---|---|---|---|---|---|---|---|---|---|
| HP-MP | 1.0 | 0.25 | 0.998 | 0.014 | 0.997 | 1.0 | 0.994 | 1.0 | 1.0 | 0.412 | 0.995 |
| DOID-ORDO | 0.993 | 0.048 | 0.984 | - | 0.942 | 0.950 | 0.943 | 0.994 | 0.0 | 0.0 | 0.967 |
Consensus alignments for the HP-MP matching task
| Min. Votes | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Mappings | 217,039 | 2308 | 1588 | 1287 | 677 | 152 | 0 |
Seven (family) system groups contributing
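The vote thresholds in the consensus tables can be read as a simple counting rule: a mapping enters the vote=k alignment when at least k system families propose it. The following is a minimal sketch of that rule, not the track's actual tooling. The function name, the family labels, and the entity identifiers are hypothetical, and it assumes that systems of the same family have already been merged so that each family votes at most once per mapping.

```python
from collections import Counter


def consensus_alignment(family_alignments, min_votes):
    """Return the mappings proposed by at least `min_votes` families.

    `family_alignments` maps a family label to an iterable of
    (entity1, entity2) pairs. This layout is an assumption of the sketch.
    """
    votes = Counter()
    for mappings in family_alignments.values():
        # set() guarantees one vote per family for any given mapping.
        votes.update(set(mappings))
    return {mapping for mapping, n in votes.items() if n >= min_votes}


# Hypothetical toy input; these identifiers are placeholders, not track data.
toy = {
    "family_a": [("a1", "b1"), ("a2", "b2")],
    "family_b": [("a1", "b1"), ("a3", "b3")],
    "family_c": [("a1", "b1"), ("a2", "b2")],
}

vote2 = consensus_alignment(toy, 2)
vote3 = consensus_alignment(toy, 3)
```

Raising the threshold can only shrink the alignment, which matches the monotonically decreasing mapping counts in the consensus tables.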
Consensus alignments for the DOID-ORDO matching task
| Min. Votes | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Mappings | 50,998 | 1883 | 1617 | 1447 | 991 | 36 |
Six (family) system groups contributing
Example mappings in the Disease and Phenotype track
| Entity 1 | Entity 2 | Rel. | Source |
|---|---|---|---|
| x-linked chondrodysplasia punctata (DOID_0060292) | Chondrodysplasia punctata (Orphanet_93442) | ≡ | (only) consensus alignment vote=2 |
| Meningeal melanomatosis (DOID_8243) | Diffuse leptomeningeal melanocytosis (Orphanet_252031) | ≡ | Consensus alignment vote=3 |
| Reactive arthritis (DOID_6196) | Reactive arthritis (Orphanet_29207) | ≡ | Consensus alignment vote=3 |
| Hypoplastic scapulae (HP_0000882) | Short scapula (MP_0004340) | ≡ | (only) consensus alignment vote=2 |
| Macrocytic anemia (HP_0001972) | Macrocytic anemia (MP_0002811) | ≡ | Consensus alignment vote=3 |
| Unerupted tooth (HP_0000706) | Failure of tooth eruption (MP_0000121) | ≡ | Consensus alignment vote=3 |
| Breast leiomyosarcoma (DOID_5285) | Rare malignant breast tumor (Orphanet_180257) | | Manually created |
| Abnormality of body weight (HP_0004323) | Abnormal body weight (MP_0001259) | ≡ | Manually created |
| Microcephaly (HP_0000252) | Decreased brain size (MP_0000774) | ≡ | AML unique mapping (correct) |
| Skeletal dysplasia (HP_0002652) | Abnormal skeletal muscle morphology (MP_0000759) | ≡ | AML unique mapping (incorrect) |
| Carbohydrate metabolism disease (DOID_0050013) | Disorder of carbohydrate metabolism (Orphanet_79161) | ≡ | LogMapBio unique mapping (correct) |
| Spinocerebellar ataxia type 35 (DOID_0050982) | Transglutaminase 6 (Orphanet_279644) | ≡ | LogMapBio unique mapping (incorrect) |
| Female hypogonadism (HP_0000134) | Small ovary (MP_0001127) | ≡ | PhenoMF unique mapping (correct) |
Results against consensus alignments with vote=2 and vote=3 in the HP-MP task
| System-mappings | Mappings | Precision-2 | F-Measure-2 | Recall-2 | Precision-3 | F-Measure-3 | Recall-3 |
|---|---|---|---|---|---|---|---|
| | 1755 | 0.93 | 0.86 | 0.80 | 0.85 | 0.90 | 0.94 |
| | 644 | 0.55 | 0.30 | 0.21 | 0.45 | 0.28 | 0.20 |
| | 1590 | 0.98 | 0.85 | 0.75 | 0.94 | 0.93 | 0.92 |
| | 381 | 0.41 | 0.12 | 0.07 | 0.17 | 0.06 | 0.04 |
| | 2011 | 0.94 | 0.92 | 0.91 | 0.77 | 0.86 | 0.97 |
| | 2151 | 0.92 | 0.92 | 0.93 | 0.75 | 0.85 | 0.98 |
| | 667 | 1.00 | 0.51 | 0.34 | 1.00 | 0.62 | 0.45 |
| | 204,089 | 0.76 | 0.83 | 0.92 | 0.63 | 0.76 | 0.95 |
| | 198,149 | 0.77 | 0.83 | 0.91 | 0.64 | 0.76 | 0.94 |
| | 169,660 | 0.78 | 0.67 | 0.58 | 0.64 | 0.57 | 0.51 |
| | 650 | 1.00 | 0.50 | 0.33 | 1.00 | 0.61 | 0.44 |
Precision and Recall represent their semantic variants
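The table does not state how the F-measure columns are derived. Assuming the usual balanced F-measure (the harmonic mean of precision and recall), the reported values can be reproduced from the other two columns up to rounding. The sketch below checks this for the row with 1755 mappings (Precision-2 = 0.93, Recall-2 = 0.80); the function name is illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Both scores are zero, so the harmonic mean is defined as zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Row with 1755 mappings: Precision-2 = 0.93, Recall-2 = 0.80.
f2 = f_measure(0.93, 0.80)
print(round(f2, 2))  # 0.86
```

Because the published precision and recall are themselves rounded to two decimals, a recomputed F-measure can differ from the table by 0.01 in some rows.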
Results against consensus alignments with vote=2 and vote=3 in the DOID-ORDO task
| System-mappings | Mappings | Precision-2 | F-Measure-2 | Recall-2 | Precision-3 | F-Measure-3 | Recall-3 |
|---|---|---|---|---|---|---|---|
| | 2098 | 0.85 | 0.91 | 0.97 | 0.78 | 0.87 | 1.00 |
| | 335 | 0.23 | 0.08 | 0.05 | 0.19 | 0.07 | 0.04 |
| | 1803 | 0.97 | 0.96 | 0.96 | 0.89 | 0.94 | 0.99 |
| | 1667 | 0.95 | 0.91 | 0.88 | 0.91 | 0.92 | 0.94 |
| | 1804 | 0.92 | 0.91 | 0.90 | 0.86 | 0.90 | 0.95 |
| | 1000 | 0.99 | 0.72 | 0.56 | 0.99 | 0.76 | 0.62 |
| | 40,612 | 0.95 | 0.89 | 0.83 | 0.95 | 0.94 | 0.92 |
| | 1030 | 0.98 | 0.72 | 0.57 | 0.98 | 0.77 | 0.63 |
Precision and Recall represent their semantic variants
Results against curated alignments
| System-mappings | HP-MP task: standard recall | HP-MP task: semantic recall | DOID-ORDO task: standard recall | DOID-ORDO task: semantic recall |
|---|---|---|---|---|
| | 0.28 | 0.76 | 0.00 | 0.00 |
| | 0.07 | 0.14 | 0.02 | 0.03 |
| | 0.21 | 0.62 | 0.00 | 0.00 |
| | 0.00 | 0.00 | - | - |
| | 0.24 | 0.66 | 0.02 | 0.12 |
| | 0.28 | 0.69 | 0.03 | 0.17 |
| | 0.17 | 0.52 | 0.00 | 0.00 |
| | 0.90 | 0.90 | 0.00 | 0.00 |
| | 0.90 | 0.90 | - | - |
| | 0.83 | 0.83 | - | - |
| | 0.17 | 0.52 | 0.00 | 0.00 |
| | 0.90 | 0.90 | 0.05 | 0.20 |
| | 0.31 | 0.79 | 0.00 | 0.00 |
| | 0.24 | 0.66 | 0.00 | 0.00 |
Manual assessment of unique mappings and estimated positive and negative contribution in the HP-MP task
| System-mappings | Unique mappings | Precision | Positive contrib. | Negative contrib. |
|---|---|---|---|---|
| | 122 | 0.87 | 8.63% | 1.33% |
| | 291 | 0.83 | 19.80% | 3.96% |
| | 26 | 0.96 | 2.04% | 0.08% |
| | 226 | 0.70 | 12.91% | 5.53% |
| | 130 | 0.93 | 9.90% | 0.71% |
| | 176 | 0.93 | 13.40% | 0.96% |
| | 0 | - | - | - |
| | 89 | 1.00 | 7.27% | 0.00% |
| | 85 | 1.00 | 6.94% | 0.00% |
| | 80 | 1.00 | 6.53% | 0.00% |
| | 0 | - | - | - |
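The contribution columns are not defined in this section. The reported figures appear consistent with dividing each system's estimated number of correct (or incorrect) unique mappings by the total number of unique mappings across all systems, which is 1225 for the HP-MP task. This is an inference from the table values, not a stated definition, and the names in the sketch below are illustrative.

```python
# (unique mappings, manually assessed precision) per row of the HP-MP table.
# Rows with 0 unique mappings have no precision and are recorded as None.
rows = [
    (122, 0.87), (291, 0.83), (26, 0.96), (226, 0.70), (130, 0.93),
    (176, 0.93), (0, None), (89, 1.00), (85, 1.00), (80, 1.00), (0, None),
]

# Assumed denominator: all unique mappings produced in the task.
total_unique = sum(n for n, _ in rows)


def contribution(n_unique, precision, total):
    """Estimated (positive, negative) contribution as fractions of `total`."""
    if precision is None:
        return None
    positive = n_unique * precision / total
    negative = n_unique * (1 - precision) / total
    return positive, negative


results = [contribution(n, p, total_unique) for n, p in rows]
for (n, _), res in zip(rows, results):
    if res is not None:
        print(f"{n:4d}  +{res[0]:.2%}  -{res[1]:.2%}")
```

Recomputed values differ slightly from the table in some rows because the published precision is rounded to two decimals.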
Manual assessment of unique mappings and estimated positive and negative contribution in the DOID-ORDO task
| System-mappings | Unique mappings | Precision | Positive contrib. | Negative contrib. |
|---|---|---|---|---|
| | 308 | 0.87 | 30.40% | 4.68% |
| | 259 | 0.40 | 11.80% | 17.70% |
| | 61 | 0.83 | 5.79% | 1.16% |
| | 80 | 0.90 | 8.20% | 0.91% |
| | 144 | 0.97 | 15.85% | 0.55% |
| | 7 | 0.50 | 0.40% | 0.40% |
| | 3 | 1.00 | 0.34% | 0.00% |
| | 16 | 0.56 | 1.03% | 0.80% |
Results in the OAEI interactive track
| Task | System | F-measure | Gain | Requests |
|---|---|---|---|---|
| HP-MP | | 0.93 | 0.03 | 388 |
| HP-MP | | 0.97 | 0.11 | 1928 |
| DOID-ORDO | | 0.96 | 0.09 | 413 |
| DOID-ORDO | | 0.99 | 0.07 | 1602 |