Piotr Przybyła, Austin J Brockmeier, Sophia Ananiadou.
Abstract
OBJECTIVE: We seek to quantify the mortality risk associated with mentions of medical concepts in textual electronic health records (EHRs). Recognizing mentions of named entities of relevant types (eg, conditions, symptoms, laboratory tests, or behaviors) in text is a well-researched task. However, the level of risk associated with them depends partly on the textual context in which they appear, which may describe severity, temporal aspects, quantity, etc.
Keywords: electronic health records; machine learning; multitask learning; natural language processing; risk assessment
Year: 2019 PMID: 30840055 PMCID: PMC6515525 DOI: 10.1093/jamia/ocz004
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
Figure 1. Outline of the risk assessment workflow. The focus of this article is on the last stage (ie, risk quantification). CALM: context-aware linear modeling; CRF: Conditional Random Field; EHR: electronic health record.
Figure 2. Risk factor mention count distribution plotted on log-log scale.
Error values for baseline approaches and different CALM methods (with their regularization schemes) calculated on all mentions in the test set (all) and the subsets of mentions with risk factors present or absent in the training data (known and unknown, respectively)
| Method | Regularization Scheme | MSE: All | MSE: Known | MSE: Unknown |
|---|---|---|---|---|
| Human baseline (interannotator agreement) | – | 0.2465 | – | – |
| Naive baseline (mean risk in training data) | – | 0.5362 | 0.5335 | 0.5631 |
| Single task without embeddings | | 0.3295 | 0.3228 | 0.3978 |
| Single task with embeddings | | 0.2977 | 0.2908 | 0.3680ᵃ |
| L1 + L1 without embeddings | | 0.2533 | 0.2345 | 0.4454 |
| L1 + L1 with embeddings | | 0.2425ᵃ | 0.2289ᵃ | 0.3817 |
| Multitask feature selection | | 0.2461 | 0.2306 | 0.4052 |

CALM: context-aware linear modeling; MSE: mean squared error.
ᵃ Lowest error values in each set.
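The split into all, known, and unknown mentions described in the caption can be illustrated with a minimal sketch. It assumes each test mention is a `(risk_factor, predicted, actual)` triple and that the set of risk factors seen in training is available; the function names and the toy values in the usage note are hypothetical and are not taken from the article.

```python
def mse(pairs):
    """Mean squared error over (predicted, actual) pairs."""
    return sum((p - a) ** 2 for p, a in pairs) / len(pairs)


def mse_by_subset(mentions, training_factors):
    """Compute MSE on all mentions and on the known/unknown subsets.

    mentions: iterable of (risk_factor, predicted, actual) triples (assumed layout).
    training_factors: set of risk factors present in the training data.
    A subset with no mentions yields None instead of an error value.
    """
    mentions = list(mentions)
    subsets = {
        "all": [(p, a) for _, p, a in mentions],
        "known": [(p, a) for f, p, a in mentions if f in training_factors],
        "unknown": [(p, a) for f, p, a in mentions if f not in training_factors],
    }
    return {name: (mse(pairs) if pairs else None) for name, pairs in subsets.items()}
```

For example, `mse_by_subset([("bun", 1.0, 2.0), ("cad", 0.0, 0.0), ("xyz", 3.0, 1.0)], {"bun", "cad"})` scores the first two toy mentions as known and the third as unknown.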
Number of available coefficients in each of the elements of CALM models and number of nonzero values in 3 of the methods employed
| | | | | | Total |
|---|---|---|---|---|---|
| Available | 19 835 | 9215 | 500 | 182 779 525 | 182 809 075 |
| Single task with embeddings | 2118 | – | 94 | – | 2212 |
| L1 + L1 with embeddings | 1928 | 2000 | 77 | 7735 | 11 740 |
| Multitask feature selection | 2656 | 9215 | 37 | 63 319 | 75 227 |
CALM: context-aware linear modeling.
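The counts in the Available row are internally consistent, which the short check below makes explicit: the fourth count equals the product of the first two, and the last count is the sum of the four preceding ones. Reading the fourth count as one coefficient per pairing of the first two elements is an inference from this arithmetic, not something the table states, and the variable names are hypothetical.

```python
# Counts copied from the "Available" row of the table, in column order.
available = [19_835, 9_215, 500, 182_779_525]
reported_total = 182_809_075

# The fourth count is exactly the product of the first two counts.
product_of_first_two = available[0] * available[1]

# The final column is the sum of the four preceding counts.
computed_total = sum(available)

print(product_of_first_two == available[3])
print(computed_total == reported_total)
```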
Top 11 features with the highest importance for the model (features with at least 5 occurrences selected according to absolute coefficient value) of 2 types: common and specific to a risk factor (quoted as appearing in text)
| Risk Factor | Feature | Coefficient | Explanation |
|---|---|---|---|
| (common) | <lumbosacral | –0.7471 | Abnormalities in the lower back have a low mortality risk |
| (common) | >her2 | +0.6864 | Overexpression of HER2 associated with breast cancer |
| (common) | <fh | –0.6606 | A risk factor listed as family history (fh) carries lower risk |
| (common) | <pint | +0.6443 | Alcohol quantity influencing the risk |
| (common) | <lid | –0.6131 | Treatments or symptoms near an eyelid have a lower mortality risk |
| (common) | <critical | +0.6077 | Critical state of a condition |
| (common) | <vodka | +0.6047 | Indicating higher alcohol consumption |
| (common) | <subdiaphragmatic | –0.5761 | Increased risk for factors regarding the body cavity below the diaphragm |
| (common) | >detox | +0.5620 | Mentioned in context of alcohol or drug problems |
| (common) | <methadone | +0.5549 | An analgesic that can be administered to treat withdrawal from illicit drugs |
| (common) | <father | –0.5368 | Conditions experienced by the patient’s father |
| BUN | [40–55] | +0.9039 | A high level of blood urea nitrogen (mg/dL) |
| CAD | ConText.OTHER | –0.7943 | Risk of coronary artery disease lessened when experienced by someone other than the patient, such as family member |
| Alcohol | <due | +0.7823 | Alcohol abuse causing other condition, as in |
| BUN | [55–74] | +0.7657 | A high level of blood urea nitrogen (mg/dL) |
| EtOH | [445–601] | +0.7510 | Alcohol level measured on admission (mg/dL) |
| Smoking | >day | +0.7380 | Smoking daily (eg, |
| BP | [181–245] | +0.7375 | Increased risk from high systolic blood pressure (mm Hg) |
| Varices | >leg | –0.7134 | Varices, when located in legs, carry less risk |
| SBP | [181–245] | +0.7120 | Increased risk from high systolic blood pressure (mm Hg) |
| Triglycerides | [2.72–3.67] | +0.7119 | Increased triglyceride level (mmol/L) |
| Bnzodzp | >pos | +0.7043 | Positive results of benzodiazepines measured on admission, possible drug abuse |
Features include several types: numbers falling in specific ranges (square brackets), words in left or right neighborhood (<, >) and labels assigned by the ConText tool.
BP: blood pressure; BUN: blood urea nitrogen; CAD: coronary artery disease; EtOH: ethanol; SBP: systolic blood pressure.
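The feature notation used in the table (words to the left or right of a mention, and numbers falling into ranges) can be sketched as follows. This is a minimal illustration, not the article's extraction code: the window size, the tokenization, and the function names are assumptions, and the bin edges in the usage note are simply the BUN ranges quoted in the table.

```python
from bisect import bisect_right


def context_features(tokens, start, end, window=3):
    """Emit '<word' for tokens left of the mention and '>word' for tokens right of it.

    tokens[start:end] is the mention; window (an assumed parameter) limits
    how many neighbouring tokens are considered on each side.
    """
    left = tokens[max(0, start - window):start]
    right = tokens[end:end + window]
    return [f"<{t.lower()}" for t in left] + [f">{t.lower()}" for t in right]


def range_feature(value, edges):
    """Map a number to a '[lo–hi]' label, or None if it lies outside all bins.

    edges is a sorted list of bin boundaries; each bin includes its lower edge.
    """
    i = bisect_right(edges, value) - 1
    if i < 0 or i >= len(edges) - 1:
        return None
    return f"[{edges[i]}–{edges[i + 1]}]"
```

With the tokens of "patient drinks one pint of vodka per day" and the mention "vodka" at position 5, `context_features(tokens, 5, 6)` yields `<one`, `<pint`, `<of`, `>per`, `>day`; `range_feature(48, [40, 55, 74])` yields `[40–55]`.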
Figure 3. Empirical probability density function of the risk values predicted by the best-performing model for the risk factors belonging to the low, medium, and high category according to manual annotation.
Confusion matrices showing the disagreements in the test set (between the best CALM model’s predictions and manual annotation) and in the double-annotated part of the corpus (aggregated across all pairs of 3 annotators)
[Confusion matrix cells lost in extraction. Two panels: Predicted vs Actual (%) and Annotator A vs Annotator B (%). Surviving values, in extraction order: 70.29; 11.88, 1.33; 28.92; 39.74; 75.02; 59.03; 0.88, 13.09.]
CALM: context-aware linear modeling.
ᵃ Better value (higher agreement or lower disagreement).
Figure 4. Proportion of low-, medium-, and high-risk factors in electronic health record documents (the corners correspond to documents with risk factors all marked at the same risk level). For the test set, the actual values (black circles) are compared with the context-aware linear modeling predictions (red crosses). Pairs of human annotators are compared on a subset of the training set.
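The per-document proportions plotted in Figure 4 could be computed along the following lines. This is a sketch under the assumption that each document is represented by a list of category labels for its risk factor mentions; the label strings and the function name are hypothetical.

```python
from collections import Counter

# Assumed category labels, ordered low to high.
LEVELS = ("low", "medium", "high")


def risk_proportions(labels):
    """Return the (low, medium, high) proportions for one document.

    labels: category labels of the risk factor mentions in the document.
    Returns None for a document without any labelled risk factors.
    """
    counts = Counter(labels)
    total = sum(counts[level] for level in LEVELS)
    if total == 0:
        return None
    return tuple(counts[level] / total for level in LEVELS)
```

A document whose mentions are all labelled `high` maps to `(0.0, 0.0, 1.0)`, which corresponds to one corner of the plot.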