| Literature DB >> 32574362 |
Ying Wang, Enrico Coiera, Farah Magrabi.
Abstract
OBJECTIVE: The study sought to evaluate the feasibility of using Unified Medical Language System (UMLS) semantic features for automated identification of reports about patient safety incidents by type and severity.
Keywords: UMLS; incident reporting; natural language processing; patient safety; semantics; supervised machine learning; text classification
Year: 2020 PMID: 32574362 PMCID: PMC7566533 DOI: 10.1093/jamia/ocaa082
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
Figure 1. Flowchart to train, validate, and test semantic classifiers using 3 testing datasets: benchmark, original, and independent.
Composition of balanced and stratified datasets for training and testing semantic classifiers
| | Benchmark (balanced AIMS) | | Original (stratified AIMS) | | | Independent (stratified Riskman) | | |
|---|---|---|---|---|---|---|---|---|
| | N1 | N2 | N1 | N2 | % | N1 | N2 | % |
| Incident type | | | | | | | | |
| Falls | 260 | 261 | 90 | 91 | 20 | 872 | 939 | 15 |
| Medications | 260 | 304 | 68 | 74 | 15 | 1053 | 1217 | 18 |
| Pressure injury | 260 | 264 | 37 | 38 | 8 | 190 | 197 | 3 |
| Aggression | 260 | 271 | 49 | 57 | 11 | 487 | 541 | 8 |
| Documentation | 260 | 589 | 26 | 67 | 6 | 252 | 809 | 4 |
| Blood product | 260 | 273 | 5 | 6 | 1 | 59 | 70 | 1 |
| Patient identification | 260 | 337 | 7 | 8 | 2 | 86 | 117 | 1 |
| Infection | 260 | 274 | 6 | 6 | 1 | 22 | 35 | <1 |
| Clinical handover | 260 | 301 | 7 | 8 | 2 | 87 | 101 | 1 |
| Deteriorating patient | 260 | 264 | 1 | 2 | <1 | 14 | 21 | <1 |
| Others | 260 | 689 | 148 | 173 | 33 | 2878 | 4039 | 48 |
| Total | 2860 | 3827 | 444 | 530 | | 6000 | 8086 | |
| Severity level | | | | | | | | |
| SAC1 | 290 | | 25 | | <1 | 23 | | <1 |
| SAC2 | 290 | | 95 | | 2 | 105 | | 2 |
| SAC3 | 290 | | 2198 | | 45 | 2609 | | 44 |
| SAC4 | 290 | | 2519 | | 52 | 3213 | | 54 |
| Total | 1160 | | 4837 | | | 5950 | | |
The same data were used to train the bag-of-words classifiers.
N1 is the number of reports based on the primary label; N2 is the number of reports when up to 2 labels are considered; % is based on the primary label alone.
AIMS: Advanced Incident Management System; SAC: severity assessment code.
Rare incident type (ie, <2%).
The most effective classifiers for incident type using semantic features compared with BOW
| Classifier configuration | Semantics | BOW |
|---|---|---|
| Ensemble strategy | OvsA | OvsA |
| Ensemble size | 12 ECCs | 6 ECCs |
| Feature extraction | Bag of concepts | BOW |
| Feature space representation | TF-IDF + label associations | Binary count + label associations |
| Base classifier | SVM RBF kernel | SVM RBF kernel |
| Group decision making | Voting | Voting |
| Average F-score | ||
| Benchmark dataset, % | 82.6 | 69.4 |
| Original dataset, % | 77.9 | 68.8 |
| Independent dataset, % | 78.0 | 67.4 |
BOW: bag-of-words; ECCs: ensemble of binary classifier chains; OvsA: one vs all; RBF: radial basis function; SVM: support vector machine; TF-IDF: term frequency–inverse document frequency.
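As a hedged illustration of the feature-space step in the winning configuration above (bag of concepts weighted by TF-IDF), the sketch below computes TF-IDF weights over documents represented as lists of extracted concept tokens. The token values and function name are illustrative assumptions, not the authors' pipeline, which used UMLS concepts and label associations on top of this weighting.

```python
import math
from collections import Counter

def tfidf(docs):
    """Compute TF-IDF weights for each document.

    docs: list of documents, each a list of concept tokens
    (illustrative stand-ins for UMLS concepts).
    Returns one {token: weight} dict per document.
    """
    n = len(docs)
    # Document frequency: in how many documents each token appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        # Term frequency scaled by inverse document frequency.
        weights.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return weights
```

A token that appears in every report (here, "fall") gets an IDF of log(1) = 0 and so carries no weight, while rarer concepts are up-weighted.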
Overall classification performance of semantic features in identifying incident type compared with BOW
| | Benchmark | | Original | | Independent | |
|---|---|---|---|---|---|---|
| Feature representation | Semantics | BOW | Semantics | BOW | Semantics | BOW |
| Example-based measures | ||||||
| Hamming loss | 3.9 | 7.8 | 3.7 | 7.2 | 3.9 | 8.1 |
| Accuracy | 82.2 | 64.4 | 75.6 | 68.0 | 75.4 | 61.7 |
| Precision | 84.7 | 70.6 | 77.3 | 72.9 | 78.1 | 66.4 |
| Recall | 84.4 | 77.1 | 76.7 | 76.6 | 76.8 | 67.4 |
| F-score | 84.6 | 73.7 | 77.0 | 74.7 | 77.4 | 66.9 |
| Exact match | 48.9 | 39.9 | 57.9 | 44.4 | 59.5 | 34.9 |
| Label-based measures | ||||||
| Macro-precision | 86.7 | 69.7 | 77.7 | 52.4 | 70.2 | 54.2 |
| Macro-recall | 91.1 | 79.0 | 79.9 | 77.4 | 78.2 | 70.3 |
| Macro–F-score | 87.9 | 73.7 | 77.1 | 59.2 | 73.6 | 58.8 |
| Micro-precision | 82.8 | 67.1 | 78.0 | 67.1 | 78.9 | 68.7 |
| Micro-recall | 82.4 | 71.9 | 77.9 | 70.7 | 77.1 | 66.1 |
| Micro–F-score | 82.6 | 69.4 | 77.9 | 68.8 | 78.0 | 67.4 |
Values are %.
BOW: bag-of-words.
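For reference, the example-based and micro-averaged measures reported above can be computed directly from binary label matrices. A minimal pure-Python sketch (illustrative, not the authors' evaluation code):

```python
def hamming_loss(y_true, y_pred):
    """Fraction of (example, label) slots predicted incorrectly."""
    n, n_labels = len(y_true), len(y_true[0])
    wrong = sum(t != p for yt, yp in zip(y_true, y_pred)
                for t, p in zip(yt, yp))
    return wrong / (n * n_labels)

def micro_f_score(y_true, y_pred):
    """Micro-averaged F-score: pool TP/FP/FN across all labels."""
    tp = fp = fn = 0
    for yt, yp in zip(y_true, y_pred):
        for t, p in zip(yt, yp):
            tp += t and p            # correct positive
            fp += (not t) and p      # predicted but not true
            fn += t and (not p)      # true but missed
    return 2 * tp / (2 * tp + fp + fn)
```

For example, with y_true = [[1, 0, 1], [0, 1, 0]] and y_pred = [[1, 0, 0], [0, 1, 0]], one label slot of six is wrong (Hamming loss 1/6) and the micro-F-score is 0.8.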
F-score for identifying incident types and severity level using semantic features compared with BOW
| | Benchmark | | Original | | Independent | |
|---|---|---|---|---|---|---|
| Feature representation | Semantics | BOW | Semantics | BOW | Semantics | BOW |
| Incident type | ||||||
| Falls | 88.1 | 88.1 | 92.4 | 89.7 | 90.3 | 81.0 |
| Medications | 78.5 | 67.4 | 87.1 | 76.3 | 85.1 | 75.2 |
| Pressure injury | 96.4 | 91.5 | 91.6 | 85.4 | 85.7 | 81.4 |
| Aggression | 93.1 | 74.0 | 82.6 | 69.1 | 79.3 | 63.8 |
| Documentation | 71.0 | 62.2 | 78.2 | 61.2 | 75.2 | 75.0 |
| Blood products | 96.6 | 71.1 | 75.0 | 44.4 | 76.9 | 54.9 |
| Patient identification | 77.2 | 66.0 | 64.0 | 34.0 | 70.5 | 56.4 |
| Infection | 93.8 | 81.1 | 80.0 | 46.2 | 63.1 | 30.8 |
| Clinical handover | 77.0 | 62.5 | 51.6 | 32.0 | 50.4 | 34.5 |
| Deteriorating patient | 87.5 | 89.7 | 66.7 | 44.4 | 48.8 | 54.6 |
| Others | 75.8 | 60.0 | 78.6 | 67.9 | 83.4 | 69.4 |
| Severity level | ||||||
| Average F-score | 71.6 | 62.9 | 42.2 | 50.1 | 49.6 | 52.7 |
| SAC1 | 87.3 | 87.3 | 25.5 | 19.8 | 50.7 | 12.5 |
| SAC2 | 69.8 | 49.0 | 8.4 | 12.3 | 11.9 | 12.0 |
| SAC3 | 75.3 | 49.1 | 42.1 | 42.6 | 58.4 | 48.3 |
| SAC4 | 60.0 | 64.0 | 52.2 | 61.8 | 39.3 | 60.0 |
Values are %.
BOW: bag-of-words; SAC: severity assessment code.
Rare incident type (ie, <2%).
| Report format | Structured | Free text |
|---|---|---|
| Basic element | incident ID | description of incident |
| | date and time | actions taken |
| | incident type(s) | preventative steps |
| | severity assessment code | patient outcome |
| | | investigation findings and results |
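The structured versus free-text split above can be mirrored in a simple record type. The class and field names below are hypothetical, chosen only to match the table:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class IncidentReport:
    """Hypothetical container for one incident report,
    split into structured and free-text elements as in the table."""
    # Structured elements
    incident_id: str
    date_time: str
    incident_types: List[str] = field(default_factory=list)
    severity_assessment_code: int = 4  # SAC1 (most severe) .. SAC4
    # Free-text elements
    description: str = ""
    actions_taken: str = ""
    preventative_steps: str = ""
    patient_outcome: str = ""
    investigation_findings: str = ""
```

In this framing, the classifiers described above predict `incident_types` and `severity_assessment_code` from the free-text fields.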