Sebastian Gehrmann, Franck Dernoncourt, Yeran Li, Eric T Carlson, Joy T Wu, Jonathan Welt, John Foote, Edward T Moseley, David W Grant, Patrick D Tyler, Leo A Celi.
Abstract
In secondary analysis of electronic health records, a crucial task is correctly identifying the patient cohort under investigation. In many cases, the most valuable and relevant information for an accurate classification of medical conditions exists only in clinical narratives. Therefore, it is necessary to use natural language processing (NLP) techniques to extract and evaluate these narratives. The most commonly used approach to this problem relies on extracting a number of clinician-defined medical concepts from text and using machine learning techniques to identify whether a particular patient has a certain condition. However, recent advances in deep learning and NLP enable models to learn a rich representation of (medical) language. Convolutional neural networks (CNNs) for text classification can augment the existing techniques by leveraging the representation of language to learn which phrases in a text are relevant for a given medical condition. In this work, we compare concept-extraction-based methods with CNNs and other commonly used NLP models on ten phenotyping tasks using 1,610 discharge summaries from the MIMIC-III database. We show that CNNs outperform concept-extraction-based methods in almost all of the tasks, with improvements of up to 26 percentage points in F1-score and up to 7 percentage points in area under the ROC curve (AUC). We additionally assess the interpretability of both approaches by presenting and evaluating methods that calculate and extract the most salient phrases for a prediction. The results indicate that CNNs are a valid alternative to existing approaches in patient phenotyping and cohort identification, and should be further investigated. Moreover, the deep learning approach presented in this paper can be used to assist clinicians during chart review or support the extraction of billing codes from text by identifying and highlighting relevant phrases for various medical conditions.
Year: 2018 PMID: 29447188 PMCID: PMC5813927 DOI: 10.1371/journal.pone.0192360
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
The ten different phenotypes used for this study.
The first column shows the name of the phenotype, the second column shows the number of positive examples out of the total 1,610 notes, and the third shows the κ coefficient as a measure of inter-rater agreement. The last column lists the definition that was used to identify and annotate each phenotype.
| Phenotype | #pos. | κ | Definition |
|---|---|---|---|
| Adv. / Metastatic Cancer | 161 | 0.83 | Cancers with very high or imminent mortality (pancreas, esophagus, stomach, cholangiocarcinoma, brain); mention of distant or multi-organ metastasis, where palliative care would be considered (prognosis < 6 months). |
| Adv. Heart Disease | 275 | 0.82 | Any consideration for needing a heart transplant; description of severe aortic stenosis (aortic valve area < 1.0 cm²), severe cardiomyopathy, Left Ventricular Ejection Fraction (LVEF) <= 30%. Not sufficient to have a medical history of congestive heart failure (CHF) or myocardial infarction (MI) with stent or coronary artery bypass graft (CABG) as these are too common. |
| Adv. Lung Disease | 167 | 0.81 | Severe chronic obstructive pulmonary disease (COPD) defined as Gold Stage III-IV, or with a forced expiratory volume in the first second (FEV1) < 50% of normal, or forced vital capacity (FVC) < 70%, or severe interstitial lung disease (ILD), or idiopathic pulmonary fibrosis (IPF). |
| Chronic Neurologic Dystrophies | 368 | 0.71 | Any chronic central nervous system (CNS) or spinal cord diseases, including but not limited to: multiple sclerosis (MS), amyotrophic lateral sclerosis (ALS), myasthenia gravis, Parkinson’s disease, epilepsy, history of stroke/cerebrovascular accident (CVA) with residual deficits, and various neuromuscular diseases/dystrophies. |
| Chronic Pain | 321 | 0.83 | Any etiology of chronic pain, including fibromyalgia, requiring long-term opioid/narcotic analgesic medication to control. |
| Alcohol Abuse | 196 | 0.86 | Current/recent alcohol abuse history; still an active problem at time of admission (may or may not be the cause of it). |
| Substance Abuse | 155 | 0.86 | Includes any intravenous drug abuse (IVDU) or accidental overdose of psychoactive or narcotic medications (prescribed or not). Admitting to marijuana use in history is not sufficient. |
| Obesity | 126 | 0.94 | Clinical obesity. BMI > 30. Previous history of or being considered for gastric bypass. Insufficient to have abdominal obesity mentioned in physical exam. |
| Psychiatric disorders | 295 | 0.91 | All psychiatric disorders in DSM-5 classification, including schizophrenia, bipolar and anxiety disorders, other than depression. |
| Depression | 460 | 0.95 | Diagnosis of depression; prescription of anti-depressant medication; or any description of intentional drug overdose, suicide or self-harm attempts. |
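The κ column can be illustrated with a small calculation. The sketch below assumes Cohen's κ for two annotators, which the table does not confirm; the function name `cohens_kappa` and the toy labels are hypothetical and are not taken from the study data.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumption: the kappa in the table is the two-rater Cohen's variant.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's label frequencies, summed over labels.
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n) for label in labels
    )
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy annotations (1 = phenotype present, 0 = absent); purely illustrative.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_2 = [1, 1, 0, 0, 0, 0, 1, 0, 0, 1]
kappa = cohens_kappa(annotator_1, annotator_2)
```

A value of 1 means perfect agreement and 0 means agreement no better than chance, so the values between 0.71 and 0.95 in the table indicate substantial agreement under this reading.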
Fig 1. Overview of the basic CNN architecture.
(A) Each word within a discharge note is represented by its word embedding. In this example, both instances of the word “and” have the same embedding. (B) Convolutions of different widths are used to learn filters that are applied to word sequences of the corresponding length. The convolution K2 with width 2 in the example looks at all 10 pairs of neighboring words and outputs one value for each. There can be multiple feature maps for each convolution width. (C) Each resulting vector is reduced to its highest value (the one with the most signaling power), giving one value per feature map. (D) The final prediction (“Does the phenotype apply to the patient?”) is made by computing a weighted combination of the pooled values and applying a sigmoid function, similar to a logistic regression. This figure is adapted with permission from Kim [33].
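Steps (A) to (D) of the caption can be sketched as a forward pass in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the random weights, the embedding size, and the toy note are all hypothetical, and details the caption does not mention (nonlinearities, dropout, training) are left out.

```python
import math
import random


def window_score(window, filt):
    """Dot product between a window of k word vectors and a k x d filter."""
    return sum(x * w for word, row in zip(window, filt) for x, w in zip(word, row))


def cnn_predict(token_ids, embeddings, filters, out_weights, out_bias):
    """Return (probability, pooled_features) for one note.

    filters maps a convolution width k to a list of k x d filters,
    one filter per feature map.
    """
    # (A) Look up embeddings; repeated words share the same vector.
    x = [embeddings[t] for t in token_ids]
    pooled = []
    for width in sorted(filters):
        for filt in filters[width]:
            # (B) Slide the filter over every run of `width` neighboring words.
            feature_map = [
                window_score(x[i:i + width], filt)
                for i in range(len(x) - width + 1)
            ]
            # (C) Max-over-time pooling keeps the strongest response.
            pooled.append(max(feature_map))
    # (D) Weighted combination of pooled values followed by a sigmoid.
    logit = sum(w * p for w, p in zip(out_weights, pooled)) + out_bias
    return 1.0 / (1.0 + math.exp(-logit)), pooled


# Hypothetical toy setup: 8-word vocabulary, 4-dimensional embeddings,
# three feature maps each for widths 1 and 2, untrained random weights.
random.seed(0)
dim = 4
embeddings = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(8)]
filters = {
    k: [[[random.uniform(-1, 1) for _ in range(dim)] for _ in range(k)]
        for _ in range(3)]
    for k in (1, 2)
}
out_weights = [random.uniform(-1, 1) for _ in range(6)]
note = [0, 3, 5, 1, 7, 2, 5, 4, 6, 1, 0]  # 11 tokens, so 10 windows of width 2
probability, pooled = cnn_predict(note, embeddings, filters, out_weights, 0.0)
```

Because each pooled value comes from a single window, the position of the maximum identifies the phrase that drove that feature, which is one plausible starting point for the phrase-level interpretability discussed in the abstract.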
Fig 2. Comparison of achieved F1-scores across all tested phenotypes.
The left three models classify directly from text; the right two models are concept-extraction based. The CNN outperforms the other models on most tasks.
This table shows the best performing model for each approach and phenotype.
We show precision, recall, F1-Score, and AUC.
| CNN | BoW | n-gram | cTAKES full | cTAKES filter | ||
|---|---|---|---|---|---|---|
| Adv. Cancer | 44 | 41 | 80 | 85 | ||
| 65 | 55 | 65 | 55 | |||
| 56 | 47 | 71 | 67 | |||
| 90 | 88 | 94 | 92 | |||
| Adv. Heart Disease | 74 | 70 | 71 | 73 | ||
| 32 | 42 | 49 | 59 | |||
| 44 | 55 | 58 | 65 | |||
| 85 | 85 | 88 | 89 | |||
| Adv. Lung Disease | 21 | 27 | 43 | |||
| 29 | 39 | 29 | 36 | |||
| 24 | 32 | 40 | 39 | |||
| 76 | 79 | 81 | 87 | |||
| Chronic Neuro | 69 | 47 | 49 | 75 | ||
| 46 | 54 | 55 | 62 | |||
| 69 | 46 | 51 | 64 | |||
| 84 | 72 | 71 | 87 | |||
| Chronic Pain | 33 | 42 | 66 | 66 | ||
| 45 | 54 | 46 | 41 | |||
| 41 | 44 | 51 | 56 | |||
| 73 | 68 | 67 | 78 | |||
| Alcohol Abuse | 85 | 55 | 88 | 91 | ||
| 50 | 64 | 75 | ||||
| 81 | 67 | 59 | 82 | |||
| 89 | 88 | 95 | ||||
| Substance Abuse | 83 | 62 | 83 | 87 | ||
| 50 | 33 | 47 | 67 | |||
| 56 | 48 | 62 | 75 | |||
| 90 | 86 | |||||
| Obesity | 27 | 44 | 64 | 62 | ||
| 35 | 20 | 80 | 75 | |||
| 30 | 28 | 71 | 68 | |||
| 72 | 71 | 99 | 98 | |||
| Psychiatric Disorders | 47 | 53 | 74 | 81 | ||
| 53 | 39 | 63 | 64 | |||
| 50 | 45 | 68 | 72 | |||
| 77 | 76 | 88 | 93 | |||
| Depression | 51 | 51 | 81 | 79 | ||
| 67 | 73 | 72 | 77 | |||
| 58 | 60 | 76 | 78 | |||
| 93 | 77 | 78 | 91 | |||
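The four metrics reported above can be computed as in the following sketch. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions for illustration; they are not the study's evaluation code or data.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Ties between a positive and a negative score count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: six notes with gold labels and model scores (illustrative only).
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Precision, recall, and F1 depend on the chosen threshold, while AUC depends only on how the scores rank the notes, which is why a model can rank well on one and poorly on the other.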
Fig 3. Impact of phrase length on model performance.
The figure shows the change in F1-score between a model that considers only single words and a model that considers phrases up to a length of 5.
The most salient phrases for advanced heart failure and alcohol abuse.
The salient cTAKES CUIs are extracted from the filtered RF model.
| Phenotype | cTAKES | CNN |
|---|---|---|
| Advanced heart failure | Magnesium | Wall Hypokinesis |
| | Cardiomyopathy | Port pacer |
| | Hypokinesia | Ventricular hypokinesis |
| | Heart Failure | p AVR |
| | Acetylsalicylic Acid | post ICD |
| | Atrium, Heart | status post ICD |
| | Coronary Disease | EF 20 30 |
| | Atrial Fibrillation | bifurcation aneurysm clipping |
| | Coronary Artery | CHF with EF |
| | Disease | cardiomyopathy, EF 15 |
| | Aortocoronary Bypasses | (EF 20 30 |
| | Fibrillation | coronary artery bypass graft |
| | Heart | respiratory viral infection by DFA |
| | Catheterization | severe global free wall hypokinesis |
| | Chest | Class II, EF 20 |
| | Artery | lateral CHF with EF 30 |
| | CAT Scans, X-Ray | anterior and atypical hypokinesis akinesis |
| | Hypertension | severe global left ventricular hypokinesis |
| | Creatinine Measurement | ’s cardiomyopathy, EF 15 |
| Alcohol abuse | Victim of abuse | Consciousness Alert |
| | Ethanol Measurement | Alcohol Abuse |
| | Alcohol Abuse | EtOH abuse |
| | Thiamine | Alcoholic Dilated |
| | Social and personal history | ETOH cirrhosis |
| | Family history | heavy alcohol abuse |
| | Hypertension | evening Alcohol abuse |
| | Injuries risk | Drug Reactions Attending |
| | Pain | alcohol withdrawal compartment syndrome |
| | Sodium | EtOH abuse with multiple |
| | Potassium Measurement | liver secondary to alcohol abuse |
| | Plasma Glucose Measurement | abuse crack cocaine, EtOH |