| Literature DB >> 35123470 |
Luke T Slater1,2,3,4, Sophie Russell5,6, Silver Makepeace5,6, Alexander Carberry5,6, Andreas Karwath5,6,7,8, John A Williams5,6,8, Hilary Fanning6,8, Simon Ball6,7,8, Robert Hoehndorf9, Georgios V Gkoutos5,6,10,11,12,7,8.
Abstract
BACKGROUND: Semantic similarity is a valuable tool for analysis in biomedicine. When applied to phenotype profiles derived from clinical text, it can enable and enhance 'patient-like me' analyses, automated coding, differential diagnosis, and outcome prediction. While a large body of work explores the use of semantic similarity for multiple tasks, including protein interaction prediction and rare disease differential diagnosis, less work explores the comparison of patient phenotype profiles for clinical tasks. Moreover, there are no experimental explorations of optimal parameters or better methods in the area.
Keywords: Differential diagnosis; MIMIC-III; Ontology; Semantic similarity; Semantic web
Year: 2022 PMID: 35123470 PMCID: PMC8818208 DOI: 10.1186/s12911-022-01770-4
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 Overall description of the experimental methodology. These processes are split into separate notebooks, forming modules that can be modified and replaced to extend the framework to explore other methods, outcomes, or settings
Fig. 2 An example output of the semantic similarity process. On the left is a semantic similarity matrix, in which every phenotype profile associated with a patient visit has been compared with every other. The result is a matrix of similarity values. To evaluate the matrices, we then convert them into a ranked list of similarity values for each patient visit, which also records whether or not the two patient visits being compared share a primary diagnosis. The latter structure is used to create our evaluation scores (e.g. AUC). In the experiment described in this article, this process is repeated once for every combination of semantic similarity measures being explored, since each will produce a separate similarity matrix
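The matrix-to-ranking evaluation structure described in the figure caption can be sketched as follows. The visit identifiers, similarity values, and diagnosis labels below are invented for illustration; they are not data from the study.

```python
# Toy similarity matrix: each patient visit's phenotype profile has been
# compared with every other (values are illustrative only).
visits = ["v1", "v2", "v3", "v4"]
sim = [
    [1.0, 0.8, 0.3, 0.5],
    [0.8, 1.0, 0.4, 0.2],
    [0.3, 0.4, 1.0, 0.7],
    [0.5, 0.2, 0.7, 1.0],
]
# Hypothetical primary-diagnosis label for each visit.
diagnosis = {"v1": "A", "v2": "A", "v3": "B", "v4": "B"}

# For each visit, rank all other visits by similarity and record whether
# the pair shares a primary diagnosis -- the structure from which scores
# such as AUC can then be computed.
ranked = {}
for i, v in enumerate(visits):
    others = [(visits[j], sim[i][j]) for j in range(len(visits)) if j != i]
    others.sort(key=lambda t: t[1], reverse=True)
    ranked[v] = [(o, s, diagnosis[o] == diagnosis[v]) for o, s in others]
```

For `v1`, for instance, this yields the other visits in descending similarity order, each flagged with whether it shares `v1`'s primary diagnosis.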
Breakdown of the different categories of semantic similarity measures available in the SML Toolkit
| Category | Type of method | Count | Used |
|---|---|---|---|
| Pairwise similarity | Structural | 7 | 7 |
| | Information content | 10 | 8 |
| Groupwise similarity | Direct | 19 | 19 |
| | Indirect (Pairwise required) | 5 | 5 |
| | Indirect (IC only required) | 1 | 1 |
| Information content | Structural | 5 | 5 |
| | Corpus | 1 | 1 |
| Total | | 48 | 46 |
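How the measure categories compose can be sketched with a minimal example: an information content (IC) measure assigns each term a specificity value, a pairwise measure (here Resnik: the IC of the most informative common ancestor) compares two terms, and an indirect groupwise measure (here best-match average, BMA) aggregates pairwise scores over two profiles. The mini-ontology and IC values below are invented for illustration, not taken from the SML Toolkit.

```python
# Hypothetical mini-ontology: each term maps to its parent terms.
parents = {"root": set(), "t1": {"root"}, "t2": {"root"},
           "t3": {"t1"}, "t4": {"t1"}}
# Hand-assigned information content values (illustrative only).
ic = {"root": 0.0, "t1": 1.0, "t2": 1.0, "t3": 2.0, "t4": 2.0}

def ancestors(t):
    # A term's ancestors, including the term itself.
    out = {t}
    for p in parents[t]:
        out |= ancestors(p)
    return out

def resnik(a, b):
    # Pairwise similarity: IC of the most informative common ancestor.
    return max(ic[c] for c in ancestors(a) & ancestors(b))

def bma(profile_a, profile_b):
    # Best-match average: for each term, take its best pairwise match in
    # the other profile, then average the two directions.
    best_a = [max(resnik(a, b) for b in profile_b) for a in profile_a]
    best_b = [max(resnik(a, b) for a in profile_a) for b in profile_b]
    return (sum(best_a) / len(best_a) + sum(best_b) / len(best_b)) / 2
```

Swapping `resnik` for another pairwise function, or `bma` for another aggregation, mirrors how the experiment enumerates combinations of measures from each category.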
Fig. 3 Distribution of scores for all measure combinations evaluated, using different performance measures. The distributions of MRR-0 and MRR-NA are the same, since these scores have a static relationship
Top algorithms by AUC, MRR, and A@10
| By | Method | AUC | MRR-0 | A@10 |
|---|---|---|---|---|
| AUC | AVG (GW) + Resnik (PW) + Resnik (IC) | 0.81 (0.79–0.83) | 0.19 | 0.34 (0.31–0.37) |
| | AVG (GW) + Jaccard (PW) + Zhou (IC) | 0.80 (0.78–0.82) | 0.19 | 0.32 (0.29–0.35) |
| | AVG (GW) + NODE_SIM (PW) + Zhou (IC) | 0.80 (0.78–0.82) | 0.18 | 0.31 (0.28–0.34) |
| MRR | GIC (DGW) + Zhou (IC) | 0.68 (0.65–0.71) | 0.24 | 0.41 (0.38–0.44) |
| | GIC (DGW) + Seco (IC) | 0.68 (0.65–0.71) | 0.24 | 0.41 (0.38–0.44) |
| | Bader (DGW) | 0.77 (0.74–0.80) | 0.24 | 0.40 (0.37–0.43) |
| A@10 | GIC (DGW) + Resnik (IC) | 0.68 (0.65–0.71) | 0.24 | 0.42 (0.39–0.45) |
| | GIC (DGW) + Sanchez (IC) | 0.67 (0.64–0.70) | 0.24 | 0.42 (0.39–0.45) |
| | GIC (DGW) + Min (IC) | 0.67 (0.64–0.70) | 0.23 | 0.42 (0.39–0.45) |
Since MRR-NA and MRR-0 are statically dependent, the top algorithms for both are equivalent, and so only ‘MRR-0’ is listed here
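One plausible reading of the static dependency between the two MRR variants can be sketched as follows; this assumes (an assumption, not stated explicitly in this record) that MRR-0 scores a visit with no shared-diagnosis match as 0, while MRR-NA excludes such visits from the mean. Under that assumption the two differ only by a fixed ratio for a given dataset, so they rank measure combinations identically.

```python
def mrr_variants(ranks, missing):
    # `ranks`: 1-based rank of the first shared-diagnosis match for each
    # visit that has one; `missing`: count of visits with no such match.
    # MRR-NA averages reciprocal ranks over matched visits only; MRR-0
    # also counts unmatched visits, contributing 0 each.
    rr = [1.0 / r for r in ranks]
    mrr_na = sum(rr) / len(rr)
    mrr_0 = sum(rr) / (len(rr) + missing)
    return mrr_na, mrr_0
```

Since `mrr_0 == mrr_na * len(ranks) / (len(ranks) + missing)`, the multiplier depends only on the dataset, not on the similarity measure: a static relationship.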
Average performance per groupwise measure
| Groupwise method | AUC | MRR-NA | MRR-0 | A@10 |
|---|---|---|---|---|
| MAX | 0.52 | 0.19 | 0.13 | 0.22 |
| MIN | 0.51 | 0.18 | 0.12 | 0.19 |
| BMA | 0.68 | 0.29 | 0.19 | 0.33 |
| BMM | 0.68 | 0.22 | 0.14 | 0.28 |
| AVG | | 0.24 | 0.16 | 0.28 |
| Direct | 0.71 | | | |
Bold indicates the greatest score in each column for the relevant evaluation metric
Average performance of information content measures versus those that did not use information content
| IC method | AUC | MRR-NA | MRR-0 | A@10 |
|---|---|---|---|---|
| Resnik | 0.62 | 0.23 | 0.15 | 0.26 |
| Zhou | 0.62 | 0.22 | 0.14 | 0.25 |
| Seco | 0.62 | 0.22 | 0.14 | 0.25 |
| Sanchez | 0.63 | 0.22 | 0.14 | 0.25 |
| Max | 0.63 | 0.22 | 0.14 | 0.25 |
| Min | 0.63 | 0.22 | 0.15 | 0.26 |
| Non-IC | | | | |
Bolded entries in each result column indicate the best-performing method by that measure