Jonathan H Chen, Mary K Goldstein, Steven M Asch, Lester Mackey, Russ B Altman.
Abstract
OBJECTIVE: Build probabilistic topic model representations of hospital admissions processes and compare the ability of such models to predict clinical order patterns against that of preconstructed order sets.
Keywords: clinical decision support systems; clinical summarization; data mining; electronic health records; order sets; probabilistic topic modeling
Year: 2017 PMID: 27655861 PMCID: PMC5391730 DOI: 10.1093/jamia/ocw136
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
Figure 1. Topic modeling as factorization of a word-document matrix. Simulated data in the top-left reflects that the word “Heart” appears 12 times in the article “Evidence Underlying AHA.” Factoring this full matrix into simpler matrices can discover a smaller number of latent dimensions that summarize the content. Topic modeling represents these latent dimensions as topics defining a categorical probability distribution of word occurrences in the topic-word matrix. This reveals the underlying statistical structure of the data, but an algorithmic process cannot itself provide meaning. By observing the most prevalent words in each topic axis, however, an underlying meaning is often interpretable (e.g., prevalence of the words “heart” and “aspirin” in the first topic axis implies a general topic of “Cardiology”).
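One way to write the factorization in Figure 1 is sketched below. The notation is assumed here rather than taken from the source: $D$ documents, $W$ words, $K$ topics, $\Theta$ for the document-topic matrix, $\Phi$ for the topic-word matrix, and $M$ for the document-word matrix (the transpose of a word-document layout) after each document's counts are normalized to frequencies.

```latex
M_{dw} \approx \sum_{k=1}^{K} \Theta_{dk}\,\Phi_{kw},
\qquad
\Theta_{dk} = P(\mathrm{Topic}_k \mid \mathrm{Document}_d),
\quad
\Phi_{kw} = P(\mathrm{Word}_w \mid \mathrm{Topic}_k),
\qquad
\sum_{k=1}^{K} \Theta_{dk} = 1,
\quad
\sum_{w=1}^{W} \Phi_{kw} = 1 .
```

With $K \ll \min(D, W)$, the two factors hold far fewer entries than $M$, which is the sense in which the topics summarize the content.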
Most commonly used human-authored inpatient order sets
| Use rate (%) | Size | Description |
|---|---|---|
| 35.1 | 51 | Anesthesia—Post-Anesthesia (Inpatient) |
| 27.2 | 161 | Medicine—General Admit |
| 23.5 | 51 | General—Pre-Admission/Pre-Operative |
| 18.4 | 17 | Insulin–Subcutaneous |
| 15.7 | 28 | General—Transfusion |
| 9.3 | 13 | General—Discharge |
| 7.6 | 150 | Surgery—General Admit |
| 6.9 | 9 | Emergency—Admit |
| 6.1 | 224 | Intensive Care—General Admit |
| 5.9 | 147 | Orthopedics—Total Joint Replacement |
| 4.9 | 46 | Pain—Regional Anesthesia Admit |
| 4.4 | 80 | Emergency—General Complaint |
| 4.1 | 40 | Anesthesia—Post-Anesthesia (Outpatient) |
| 3.9 | 135 | Orthopedics Trauma |
| 3.9 | 9 | Pain—Patient Controlled Analgesia |
| 3.4 | 168 | Psychiatry—Admit |
| 3.3 | 132 | Neurosurgery–Intensive Care |
| 3.3 | 16 | General—Heparin Protocols |
| 3.0 | 39 | Pain—Epidural Analgesia Post-Op |
| 2.9 | 11 | Insulin—Subcutaneous Adjustment |
| 2.7 | 9 | Lab—Blood Culture and Infection |
| 2.6 | 155 | Neurology—General Admit |
| 2.5 | 169 | Intensive Care—Surgery/Trauma Admit |
| 2.4 | 9 | Pharmacy—Warfarin Protocol |
| 2.4 | 14 | Insulin—Intravenous Infusion |
| … | … | … |
Use rate reflects the percentage of validation patients for whom the order set was used within the first 24 hours of hospitalization. Size reflects the number of order suggestions available in each order set. Notably, nearly all of these reflect nonspecific care processes, while scenario-specific order sets (e.g., management of asthma, heart attacks, pneumonia, sepsis, or gastrointestinal bleeds) are rarely used.
Summary statistics for human-authored order set use within the first 24 hours of hospitalization for 4820 validation patients
| Metric (per first 24 hours of each hospitalization) | Mean (std dev) | Median (interquartile range) |
|---|---|---|
| A: Order sets used | 3.0 (1.4) | 3 (2, 4) |
| B: Orders entered (including non-order set) | 32.7 (15.7) | 30 (22, 41) |
| C: Orders entered from order sets | 13.3 (7.9) | 12 (8, 18) |
| D: Orders available from used order sets | 129.0 (47.5) | 130 (102, 153) |
| E: Order set precision = (C/D) | 11% (7.2%) | 9.5% (6.3%, 13.8%) |
| F: Order set recall = (C/B) | 43% (20%) | 42% (28%, 58%) |
Metrics count only orders used in the final set of 1512 preprocessed clinical orders after normalization of medication orders and exclusion of rare orders and common process orders.
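The precision (C/D) and recall (C/B) definitions in the table can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the two hospitalizations are hypothetical, and it assumes the metrics are computed per hospitalization and then averaged.

```python
from statistics import mean


def order_set_metrics(orders_entered, orders_from_sets, orders_available):
    """Return (precision, recall) for one hospitalization.

    precision = C / D: share of available order set suggestions actually used.
    recall    = C / B: share of all entered orders that came from order sets.
    """
    precision = orders_from_sets / orders_available if orders_available else 0.0
    recall = orders_from_sets / orders_entered if orders_entered else 0.0
    return precision, recall


# Hypothetical counts for two hospitalizations (B, C, D as defined in the table).
hospitalizations = [
    {"B": 30, "C": 12, "D": 120},
    {"B": 20, "C": 10, "D": 50},
]

per_patient = [order_set_metrics(h["B"], h["C"], h["D"]) for h in hospitalizations]
mean_precision = mean(p for p, _ in per_patient)
mean_recall = mean(r for _, r in per_patient)
```

Averaging per-hospitalization ratios is not the same as dividing the column means, which may be why 13.3/129.0 (about 10.3%) and 13.3/32.7 (about 40.7%) differ slightly from the reported 11% and 43%.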
Example clinical topics generated when modeling 32 topics from training patient data
| Weight (%) | Clinical item | Weight (%) | Clinical item | Weight (%) | Clinical item |
|---|---|---|---|---|---|
| 2.37 | PoC Arterial Blood Gas | 1.56 | Insulin Lispro (Subcutaneous) | 2.98 | Culture + Gram Stain, Fluid |
| 1.39 | Team—Respiratory Tech | 1.26 | Metabolic Panel, Basic | 2.42 | Cell Count and Diff, Fluid |
| 1.38 | Lactate, Whole Blood | 1.09 | Dx—Diabetes mellitus (250) | 1.59 | Protein Total, Fluid |
| 1.32 | XRay Chest 1 View | 1.08 | Dx—DM w/o complication (250.0) | 1.51 | Albumin, Fluid |
| 1.14 | Blood Gases, Venous | 1.03 | Dx—DM not uncontrolled (250.00) | 1.24 | LDH Total, Fluid |
| 1.00 | Ventilator Settings Change | 1.01 | Hemoglobin A1c | 1.10 | Albumin (IV) |
| 0.97 | Blood Gases, Arterial | 0.99 | Diet—Low Carbohydrate | 1.09 | Glucose, Fluid |
| 0.96 | Vancomycin (IV) | 0.91 | CBC w/ Diff | 0.84 | Pathology Review |
| 0.92 | PoC Arterial Blood Gas B | 0.91 | Sodium Chloride (IV) | 0.68 | Team—Registered Nurse |
| 0.88 | Epinephrine (IV) | 0.89 | Diagnosis—Essential hypertension | 0.67 | CBC w/Diff |
| 0.88 | Norepinephrine (IV) | 0.82 | MRSA Screen | 0.61 | MRSA Screen |
| 0.87 | Central Line | 0.75 | Team—Registered Nurse | 0.60 | XRay Chest 1 View |
| 0.86 | Team—Medical ICU | 0.73 | Regular Insulin (Subcutaneous) | 0.53 | Albumin, Serum |
| 0.84 | Sodium Bicarbonate (IV) | 0.73 | XRay Chest 1 View | 0.52 | Metabolic Panel, Basic |
| 0.81 | MRSA Screen | 0.67 | Fungal Culture | 0.51 | Cytology |
| 0.79 | Hepatic Function Panel | 0.66 | Anaerobic Culture | 0.50 | Amylase, Fluid |
| 0.76 | Midazolam (IV) | 0.66 | Consult—Diabetes Team | 0.47 | Prothrombin Time (PT/INR) |
| 0.73 | Result—Lactate (High) | 0.65 | Team—Respiratory Tech | 0.47 | Midodrine (Oral) |
| 0.73 | Result—TCO2 (Low) | 0.63 | Admit—Thoracolumbar… (722.1) | 0.47 | Male Gender |
| 0.72 | NIPPVentilation | 0.63 | Admit—Lumbar Disp. (722.10) | 0.42 | Result—RBC (Low) |
| 0.69 | Result—pH (Low) | 0.62 | EKG 12-Lead | 0.41 | Sodium Chloride (IV) |
| … | … | … | … | … | … |
The most prominent clinical items (e.g., medications, imaging, laboratory orders, and results) are listed for each example topic, with corresponding $P(\mathrm{Item}_i \mid \mathrm{Topic}_j)$ weights. The bottom rows reflect the percentage of validation patients with estimated $P(\mathrm{Topic}_j \mid \mathrm{Patient}_k) > 1\%$ along with our manually ascribed labels that summarize the largely interpretable topic contents.
Abbreviations: AB: Antibody, AG: Antigen, CBC: Complete blood count, CSF: Cerebrospinal fluid, Diff: Differential, Disp: Displacement, DFA: Direct fluorescent antibody, DM: Diabetes mellitus, Dx: Diagnosis, Dz: Disease, HSV: Herpes simplex virus, ICU: Intensive care unit, Inh: Inhaled, INR: International normalized ratio, IV: Intravenous, LDH: Lactate dehydrogenase, MRSA: Methicillin resistant Staphylococcus aureus, NIPPV: Noninvasive positive pressure ventilation, PE: Pulmonary embolism, PoC: Point-of-care, PCR: Polymerase chain reaction, RBC: Red blood cells, TCO2: Total carbon dioxide, WBC: White blood cells.
Figure 2. Example of 2 generated clinical topics plotted in a 2-dimensional space. Only clinical orders are plotted, based on their prominence in each of the topics. The top left reflects clinical orders most associated with TopicY, with little association with TopicX, suggestive of a workup for diarrhea and abdominal pain. The bottom right reflects clinical orders associated with TopicX, suggestive of a workup for an intentional (medication) overdose and involuntary psychiatric hospitalization. The top right reflects common clinical orders that are associated with both topics. For legibility, items whose score is < 0.2% for both topics are omitted and only a subsample of the bottom-left items are labeled. The diagonal arrow represents a hypothetical patient inferred to have $P(\mathrm{TopicX} \mid \mathrm{Patient}_k) = 80\%$ and $P(\mathrm{TopicY} \mid \mathrm{Patient}_k) = 20\%$. The dashed lines reflect orthogonal $P(\mathrm{Item}_i \mid \mathrm{Patient}_k)$ isolines to visually illustrate how clinical order suggestions can be made from such a topic inference. In this case, orders farthest along the projected patient vector (e.g., serum acetaminophen) are predicted to be most relevant for the patient.
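The projection in Figure 2 amounts to scoring each item by a topic-weighted sum. The sketch below assumes the mixture P(Item | Patient) = Σ_j P(Item | Topic_j) · P(Topic_j | Patient). The function name, item names, and weights are hypothetical and chosen only to mirror the 80%/20% patient in the caption.

```python
def score_items(topic_item_probs, patient_topic_probs):
    """Rank items by P(item | patient) = sum_j P(item | topic_j) * P(topic_j | patient)."""
    scores = {}
    for topic, p_topic in patient_topic_probs.items():
        for item, p_item in topic_item_probs[topic].items():
            scores[item] = scores.get(item, 0.0) + p_item * p_topic
    # Highest score first: items farthest along the patient vector.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical P(item | topic) weights, not values from the paper.
topic_item_probs = {
    "TopicX": {"Acetaminophen level, serum": 0.020, "EKG 12-Lead": 0.010, "Stool culture": 0.001},
    "TopicY": {"Acetaminophen level, serum": 0.001, "EKG 12-Lead": 0.010, "Stool culture": 0.020},
}

# The hypothetical patient from the caption: 80% TopicX, 20% TopicY.
patient = {"TopicX": 0.8, "TopicY": 0.2}

ranked = score_items(topic_item_probs, patient)
```

Under these made-up weights the acetaminophen level ranks first, the shared EKG order second, and the TopicY-specific stool culture last, matching the intuition that items aligned with the dominant topic are suggested first.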
Figure 3. Topic count selection. Average discrimination accuracy (ROC AUC) when predicting additional clinical orders occurring within a followup verification time t of the invocation of a pre-authored order set during the first 24 hours of hospitalization for 4820 validation patients. Predictions are based on latent Dirichlet allocation (LDA) topic models trained on 10 655 separate training patients. The standard LDA algorithm requires external specification of a topic count parameter indicating the number of latent dimensions by which to organize the source data; this parameter varies along the X-axis from 2 to 2048. Peak performance occurs around a choice of 32 topics and degrades when modeling more than 64 topics.
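For readers unfamiliar with the ROC AUC metric used in Figure 3, it can be computed as the probability that a randomly chosen correct item is scored above a randomly chosen incorrect one. The sketch below is a generic pairwise implementation with hypothetical scores and labels. It is not the authors' evaluation code.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model scores for five candidate orders; label 1 means the
# order was actually entered within the followup window.
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
```

A value of 0.5 corresponds to random ranking and 1.0 to a ranking that places every entered order above every non-entered one. Repeating such an evaluation for each candidate topic count is one way to produce a curve like the one described.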
Figure 4. (A) Topic models vs order sets for different followup verification times. For each real use of a pre-authored order set, either that order set or a topic model (with 32 trained topics) was used to suggest clinical orders. For longer followup times, the number of subsequent possible items considered correct increases from an average of 5.4 to 20.6. The average number of correct predictions in the immediate timeframe is similar for topic models (3.2) and order sets (3.8), but increases more for topic models (9.3) than for order sets (6.7) when forecasting up to 24 hours. At the time of order set usage, physicians choose an average of 3.8 orders out of 54.8 order set suggestions, as well as 1.6 (= 5.4 - 3.8) a la carte orders. (B) Topic models vs order sets by recall at N. For longer followup verification times, more possible subsequent items are considered correct (see 4A). This results in an expected decline in recall (sensitivity). Order sets, of course, predict their own immediate use better, but lag behind topic model-based approaches when anticipating orders beyond 2 hours ($P < 10^{-20}$ for all times). (C) Topic models vs order sets by precision at N. For longer followup verification times, more subsequent items are considered correct, resulting in an expected increase in precision (positive predictive value). Again, topic model-based approaches are better at anticipating clinical orders beyond the initial 2 hours after order set usage ($P < 10^{-6}$ for all times). (D) Topic models vs order sets by ROC AUC (c-statistic), evaluating the full ranking of possible orders scored by topic models or included/excluded by order sets ($P < 10^{-100}$ for all times).
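The opposite trends in recall and precision described for panels B and C can be illustrated with a small sketch of precision at N and recall at N. The function name, the ranked suggestions, and the two sets of subsequently entered orders are hypothetical, and the ranked list is assumed to contain no duplicates.

```python
def precision_recall_at_n(ranked_items, correct_items, n):
    """Return (precision@N, recall@N) for the top-N suggestions."""
    top_n = ranked_items[:n]
    hits = sum(1 for item in top_n if item in correct_items)
    precision = hits / len(top_n) if top_n else 0.0
    recall = hits / len(correct_items) if correct_items else 0.0
    return precision, recall


# Hypothetical ranked suggestions and the orders actually entered
# within a short and a long followup verification window.
ranked = ["CBC w/ Diff", "Metabolic Panel, Basic", "XRay Chest 1 View", "EKG 12-Lead"]
ordered_short = {"CBC w/ Diff"}
ordered_long = {"CBC w/ Diff", "XRay Chest 1 View", "Hemoglobin A1c"}

p_short, r_short = precision_recall_at_n(ranked, ordered_short, 4)
p_long, r_long = precision_recall_at_n(ranked, ordered_long, 4)
```

In this toy case, widening the window raises precision from 0.25 to 0.5 because more suggestions become correct, while recall drops from 1.0 to about 0.67 because the set of correct items grows faster than the hits.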
Summary of relative tradeoffs between manually authored order sets vs algorithmically generated order suggestions
| Aspect | Order sets | Topic models |
|---|---|---|
| Production | Manual development | Automated generation |
| Construction | Preconceived concepts | Underlying data structure |
| Usability | Interruptive workflow | Passive dynamic adaption |
| Applicability | Isolated scenarios | Composite patient context |
| Interpretability | Annotated rationale | Numerical associations |
| Reliability | Clinical judgment | Statistical significance |