Jihad S Obeid, Erin R Weeda, Andrew J Matuskowitz, Kevin Gagnon, Tami Crawford, Christine M Carr, Lewis J Frey.
Abstract
BACKGROUND: Machine learning has been used extensively in clinical text classification tasks. Deep learning approaches using word embeddings have recently been gaining momentum in biomedical applications. In an effort to automate the identification of altered mental status (AMS) in emergency department provider notes for the purpose of decision support, we compare the performance of classic bag-of-words-based machine learning classifiers and novel deep learning approaches.
Keywords: Altered mental status; Decision support; Deep learning; Machine learning; Pulmonary embolism; Word embedding
Year: 2019 PMID: 31426779 PMCID: PMC6701023 DOI: 10.1186/s12911-019-0894-9
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Breakdown of the labeled notes
| HPI label | Notes | Patients |
|---|---|---|
| AMS | 493 | 459 |
| Not AMS | 637 | 422 |
| Total | 1130 | 858 (a) |

(a) 23 patients had records in both categories from different ED visits.
List of ICD-9 and ICD-10 codes considered to be evidence of AMS in the context of pulmonary embolism
| Code Set | ICD Code | Diagnosis Name |
|---|---|---|
| ICD9 | 780.0x | Alteration of consciousness |
| ICD9 | 780.2 | Syncope and collapse |
| ICD9 | 780.97 | Altered mental status |
| ICD9 | 799.5x | Signs and symptoms involving cognition |
| ICD10 | R40.x | Somnolence, stupor and coma |
| ICD10 | R41.0 | Disorientation, unspecified |
| ICD10 | R41.8x | Other symptoms and signs involving cognitive functions and awareness |
| ICD10 | R41.9 | Unspecified symptoms and signs involving cognitive functions and awareness |
| ICD10 | R55 | Syncope and collapse |
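
As an illustration of how the code list above could be applied to the diagnosis codes of a visit, here is a minimal sketch. It assumes that a trailing "x" in the table stands for any further characters (possibly none); the names `AMS_EXACT`, `AMS_PREFIX`, `is_ams_code` and `visit_has_ams_code` are hypothetical and do not come from the paper.

```python
# Exact codes and wildcard prefixes transcribed from the table above.
# Assumption: a trailing "x" matches any further characters, including none.
AMS_EXACT = {
    "ICD9": {"780.2", "780.97"},
    "ICD10": {"R41.0", "R41.9", "R55"},
}
AMS_PREFIX = {
    "ICD9": ("780.0", "799.5"),
    "ICD10": ("R40.", "R41.8"),
}


def is_ams_code(code_set: str, code: str) -> bool:
    """Return True if a single code is on the AMS list for its code set."""
    code = code.strip().upper()
    if code in AMS_EXACT.get(code_set, set()):
        return True
    # str.startswith accepts a tuple of prefixes; an empty tuple gives False.
    return code.startswith(AMS_PREFIX.get(code_set, ()))


def visit_has_ams_code(codes) -> bool:
    """codes: iterable of (code_set, code) pairs recorded for one visit."""
    return any(is_ams_code(code_set, code) for code_set, code in codes)
```

For example, `visit_has_ams_code([("ICD10", "I26.99"), ("ICD10", "R40.20")])` is true because `R40.20` falls under the `R40.x` pattern.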
Fig. 1 The deep neural network architecture consists of a word embedding layer, followed by a convolutional layer with multiple filters, followed by a merge tensor, a fully connected dense layer and a single sigmoid output node
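
The layer sequence in the caption can be sketched as a forward pass in plain NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the caption names the layers but not the details, so the ReLU activations, the max-over-time pooling before the merge, the filter widths and all layer sizes below are hypothetical, and the weights are random rather than trained.

```python
import numpy as np


def init_params(vocab_size=100, emb_dim=50, widths=(3, 4, 5),
                n_filters=8, hidden=16, seed=0):
    """Random, untrained weights; every size here is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    return {
        "embedding": rng.normal(0.0, 0.1, (vocab_size, emb_dim)),
        # One (kernel, bias) pair per filter width.
        "conv": [(rng.normal(0.0, 0.1, (w, emb_dim, n_filters)),
                  np.zeros(n_filters)) for w in widths],
        "dense_W": rng.normal(0.0, 0.1, (n_filters * len(widths), hidden)),
        "dense_b": np.zeros(hidden),
        "out_w": rng.normal(0.0, 0.1, hidden),
        "out_b": 0.0,
    }


def cnn_forward(token_ids, params):
    """Embedding -> parallel convolutions -> merge -> dense -> sigmoid."""
    emb = params["embedding"][token_ids]                 # (seq_len, emb_dim)
    pooled = []
    for kernel, bias in params["conv"]:
        width = kernel.shape[0]
        # Slide a window of `width` tokens over the note.
        windows = np.stack([emb[i:i + width]
                            for i in range(len(token_ids) - width + 1)])
        feats = np.tensordot(windows, kernel, axes=([1, 2], [0, 1])) + bias
        feats = np.maximum(feats, 0.0)                   # ReLU (assumed)
        pooled.append(feats.max(axis=0))                 # max over time (assumed)
    merged = np.concatenate(pooled)                      # the "merge tensor"
    hidden = np.maximum(merged @ params["dense_W"] + params["dense_b"], 0.0)
    logit = hidden @ params["out_w"] + params["out_b"]
    return 1.0 / (1.0 + np.exp(-logit))                  # single sigmoid output
```

Calling `cnn_forward([1, 2, 3, 4, 5, 6, 7], init_params())` returns one probability-like score for the note; the input must contain at least as many tokens as the widest filter.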
The confusion matrix for AMS ICD codes attributed to visits concurrent with the HPI notes vs. labels by clinicians (accuracy = 81.3%)
| Label by clinician | AMS ICD codes | No AMS ICD codes |
|---|---|---|
| AMS | 456 | 37 |
| Not AMS | 174 | 463 |
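
The reported accuracy can be checked directly from the four cells of the confusion matrix. The short sketch below does that arithmetic, treating the clinician label as the reference standard; sensitivity, specificity and positive predictive value are derived from the same counts and are not figures quoted in the caption. The function name is hypothetical.

```python
def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Summary metrics for a 2x2 confusion matrix (reference = clinician label)."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # AMS notes that carry an AMS ICD code
        "specificity": tn / (tn + fp),   # non-AMS notes without an AMS ICD code
        "ppv": tp / (tp + fp),           # AMS ICD codes confirmed by clinicians
    }


# Counts taken from the table above.
metrics = confusion_metrics(tp=456, fn=37, fp=174, tn=463)
print({name: round(value, 3) for name, value in metrics.items()})
# accuracy comes out at 0.813, i.e. (456 + 463) / 1130
```

Most of the disagreement comes from the 174 notes that carry an AMS ICD code but were labeled Not AMS by clinicians.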
Accuracy and area under the ROC curve (AUC) results for bag of words (BOW)-based models and the word embedding-based deep learning models along with 95% confidence intervals (CI)
| Category | Model (a) | AUC (95% CI) | Accuracy | Epochs |
|---|---|---|---|---|
| BOW models | RF | 0.975 (0.967–0.983) | 0.921 | N/A |
| | LASS | 0.973 (0.964–0.982) | 0.912 | N/A |
| | SVM | 0.967 (0.957–0.976) | 0.912 | N/A |
| | MLP | 0.947 (0.934–0.960) | 0.883 | N/A |
| | SDT | 0.934 (0.918–0.950) | 0.911 | N/A |
| | NBC | 0.924 (0.908–0.940) | 0.838 | N/A |
| Deep learning models | CNN_D200 | | | 30.8 |
| | CNN_W2V | | 0.942 | |
| | CNN_D50 | 0.984 (0.978–0.991) | 0.944 | 36.6 |

(a) Model abbreviations are described in the text
The number of epochs for training the deep learning models is based on the early stopping condition described in the methods. The entries are sorted in descending order of AUC within each category. Bolding indicates results for the best performing models
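
For readers unfamiliar with the AUC values in the table, the following minimal sketch computes the area under the ROC curve from its rank interpretation: the probability that a randomly chosen positive note receives a higher score than a randomly chosen negative one, with ties counted as one half. It uses made-up labels and scores, not the study data, and it does not reproduce the confidence intervals, whose method is not given in this excerpt.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Each correctly ordered pair counts 1, each tie counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 8 of the 9 positive/negative pairs are ordered correctly.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(roc_auc(labels, scores))
```

The pairwise loop is quadratic in the number of notes, which is acceptable for a data set of this size (1130 notes) but would be replaced by a sort-based computation for larger ones.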
Fig. 2 Area under the ROC curve (AUC) plots. a) AUC plots for the BOW-based models; b) AUC plots for the word embedding-based deep learning models. (Model abbreviations are described in the text)
Fig. 3 Variable importance plot based on the RF classifier